agent-cli dev new: agent launch reports success even when agent process fails to start

## Problem

`agent-cli dev new <branch> --agent --with-agent <agent-name>` reports success (`✓ Started <agent> in new iterm2 tab`) even when the agent process fails to actually start in the tab. This has been observed repeatedly with specific agents while others (e.g., `gemini`, `claude`) start fine in the same session.

The tab is created and the command is sent, but the agent never starts running — the tab appears empty or shows a shell prompt without the agent. The user has no indication anything went wrong.

## Root cause

In `dev/terminals/iterm2.py`, `open_new_tab()` only verifies that the AppleScript executed without error (i.e., the tab was created). It does **not** verify that:

1. The shell in the new tab is ready to receive commands
2. The command actually executed
3. The agent process started running

```python
# iterm2.py:75-84 — success = "osascript didn't crash", not "agent is running"
try:
    subprocess.run(["osascript", "-e", applescript], check=True, capture_output=True, text=True)
    return True  # ← Only means the tab was created
except subprocess.CalledProcessError:
    return False
```

The `_launch_agent()` function in `cli.py` then prints the misleading success message:

```python
if terminal.open_new_tab(path, full_cmd, tab_name=tab_name):
    _success(f"Started {agent.name} in new {terminal.name} tab")  # ← Misleading
    return
```

There may also be a race condition: the AppleScript creates the tab and immediately sends `write text` with the command, but the shell in the new tab may not be ready to process it yet.

## Suggested fix

### Option A: Post-launch process verification (recommended)

After `open_new_tab()` returns, poll for the agent process to confirm it actually started:

```python
def _verify_agent_started(agent: CodingAgent, path: Path, timeout: float = 10.0) -> bool:
    """Verify the agent process actually started in the worktree."""
    import time
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Check if agent process is running in the worktree directory
        result = subprocess.run(
            ["pgrep", "-f", f"{agent.command}.*{path.name}"],
            capture_output=True
        )
        if result.returncode == 0:
            return True
        time.sleep(1.0)
    return False
```

Then in `_launch_agent()`:

```python
if terminal.open_new_tab(path, full_cmd, tab_name=tab_name):
    if _verify_agent_started(agent, path):
        _success(f"Started {agent.name} in new {terminal.name} tab")
    else:
        _warn(f"Tab opened but {agent.name} may not have started. Check the tab manually.")
        _warn(f"To retry: cd {path} && {full_cmd}")
    return
```

### Option B: Add delay before `write text` in AppleScript

The race condition between tab creation and command writing could be mitigated:

```applescript
tell current session
    set name to "..."
    delay 0.5
    write text "..."
end tell
```

### Option C: Wrapper script with signal-back

The wrapper script could create a sentinel file on start and the launcher could poll for it:

```bash
#!/usr/bin/env bash
touch /tmp/agent-cli-started-$$
exec codex "$(cat .claude/TASK.md)"
```

## Additional suggestion: exit code

When the agent process can't be verified, `agent-cli dev new` should exit with a non-zero exit code so callers can detect the failure programmatically.

## Environment

- macOS with iTerm2
- Multiple agents tested in the same session — some start fine, one consistently fails
- The wrapper script approach (`_create_prompt_wrapper_script`) is used when `--prompt-file` is passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

agent-cli dev new: agent launch reports success even when agent process fails to start #423

Problem

Root cause

Suggested fix

Option A: Post-launch process verification (recommended)

Option B: Add delay before `write text` in AppleScript

Option C: Wrapper script with signal-back

Additional suggestion: exit code

Environment

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

agent-cli dev new: agent launch reports success even when agent process fails to start #423

Description

Problem

Root cause

Suggested fix

Option A: Post-launch process verification (recommended)

Option B: Add delay before write text in AppleScript

Option C: Wrapper script with signal-back

Additional suggestion: exit code

Environment

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Option B: Add delay before `write text` in AppleScript