code-coach review: CLI WM_COPYDATA infinite-block — 1 Should-fix (Option A landed) #9

@psmon

Source

  • RCA log: `harness/logs/code-coach/2026-05-10-07-51-cli-block-recurrence-rca.md`
  • Knowledge (newly canonicalised): `harness/knowledge/code-coach/wm-copydata-ipc-pitfalls.md`

Verdict

| Severity | Count | Status |
|---|---|---|
| Must-fix | 0 | |
| Should-fix | 1 | resolved by hotfix this commit (Option A) |
| Suggestion | 2 | open (Option B, UIA STA gap) |

Findings

1. Should-fix — `SendMessage` (no-timeout) makes CLI infinitely block on busy GUI

File: `Project/AgentZeroWpf/NativeMethods.cs:384-385`, `Project/AgentZeroWpf/CliHandler.cs:99`

```csharp
[LibraryImport("user32.dll", EntryPoint = "SendMessageW")]
public static partial IntPtr SendMessageCopyData(...);
```

`SendMessage` is synchronous and has no timeout, so any stall on the WPF UI
thread (LLM model warm-up, ConPTY pipe blocking, modal dialog, COM marshal)
leaves the CLI process blocked forever. `_timeoutMs = 5000` only governs MMF
response polling, not the send itself. This is the recurring "CLI block"
symptom.

Rewrite (landed):

```csharp
// NativeMethods.cs
[LibraryImport("user32.dll", EntryPoint = "SendMessageTimeoutW")]
public static partial IntPtr SendMessageTimeoutCopyData(
    IntPtr hWnd, uint Msg, IntPtr wParam, ref COPYDATASTRUCT lParam,
    uint fuFlags, uint uTimeout, out IntPtr lpdwResult);

// CliHandler.SendWpfCommand
var rc = NativeMethods.SendMessageTimeoutCopyData(
    agentWnd, NativeMethods.WM_COPYDATA, IntPtr.Zero, ref cds,
    NativeMethods.SMTO_ABORTIFHUNG | NativeMethods.SMTO_NORMAL,
    uTimeout: 3000, out _);
if (rc == IntPtr.Zero) { /* print error, return false */ }
```

Returning `bool` is load-bearing — every caller (`status`, `copy`,
`terminal-list`, `terminal-send`, `terminal-key`, `terminal-read`,
`bot-chat`) was updated to short-circuit.
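A minimal sketch of how the bounded send and the greppable error string could fit together. It is not the landed code: `NativeSketch`, `CliSketch`, the `[AZ-IPC-TIMEOUT]` marker and the local `COPYDATASTRUCT` declaration are hypothetical stand-ins, and it assumes .NET 7+ with `AllowUnsafeBlocks` enabled for `LibraryImport`. `SetLastError = true` is added here (the landed declaration does not show it) so the caller can tell a timeout from other failures.

```csharp
using System;
using System.Runtime.InteropServices;

internal static partial class NativeSketch
{
    public const uint WM_COPYDATA = 0x004A;
    public const uint SMTO_NORMAL = 0x0000;
    public const uint SMTO_ABORTIFHUNG = 0x0002;
    public const int ERROR_TIMEOUT = 1460;

    // Mirrors the Win32 COPYDATASTRUCT layout; declared locally so the sketch is self-contained.
    [StructLayout(LayoutKind.Sequential)]
    public struct COPYDATASTRUCT
    {
        public IntPtr dwData;
        public int cbData;
        public IntPtr lpData;
    }

    [LibraryImport("user32.dll", EntryPoint = "SendMessageTimeoutW", SetLastError = true)]
    public static partial IntPtr SendMessageTimeoutCopyData(
        IntPtr hWnd, uint Msg, IntPtr wParam, ref COPYDATASTRUCT lParam,
        uint fuFlags, uint uTimeout, out IntPtr lpdwResult);
}

internal static class CliSketch
{
    // Hypothetical marker; any unique, greppable token would do.
    private const string Marker = "[AZ-IPC-TIMEOUT]";
    private const uint TimeoutMs = 3000;

    // Returns false when the GUI did not accept the message within TimeoutMs.
    public static bool SendWpfCommand(IntPtr agentWnd, ref NativeSketch.COPYDATASTRUCT cds)
    {
        IntPtr rc = NativeSketch.SendMessageTimeoutCopyData(
            agentWnd, NativeSketch.WM_COPYDATA, IntPtr.Zero, ref cds,
            NativeSketch.SMTO_ABORTIFHUNG | NativeSketch.SMTO_NORMAL,
            TimeoutMs, out _);
        if (rc != IntPtr.Zero)
            return true;

        // A zero return covers both timeout and other failures; the last error may also be 0.
        int err = Marshal.GetLastPInvokeError();
        Console.Error.WriteLine(err == NativeSketch.ERROR_TIMEOUT
            ? $"{Marker} GUI did not answer WM_COPYDATA within {TimeoutMs} ms"
            : $"{Marker} SendMessageTimeout failed (Win32 error {err})");
        return false;
    }
}
```

A caller would then short-circuit with something like `if (!CliSketch.SendWpfCommand(agentWnd, ref cds)) return 1;` instead of polling the MMF for a reply that will never arrive.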

2. Suggestion — Heavy work inside WndProc still risks per-handler timeouts (Option B)

File: `Project/AgentZeroWpf/UI/APP/MainWindow.xaml.cs:546` (`HandleTerminalSend`)

After Option A, a hung ConPTY child no longer blocks the CLI indefinitely, but
`SendMessageTimeout` will keep hitting the 3 s timeout for as long as
`session.WriteAndSubmit` runs synchronously in WndProc. Move handler bodies
to `Task.Run` and respond from a worker thread; the CLI's MMF poll already
covers the added latency.

Apply per-handler. Start with `HandleTerminalSend` / `HandleTerminalKey`
(both write to the PTY); `HandleStatus` / `HandleTerminalList` are pure
in-process reads and don't justify the threading overhead.
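One possible shape for Option B, sketched under assumptions: `TerminalSendSketch` and its two delegates are hypothetical stand-ins for `session.WriteAndSubmit` and the MMF response writer, and the payload is assumed to be UTF-16 text. The key step is copying the payload before returning, because the `WM_COPYDATA` buffer is only valid while the message is being processed.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading.Tasks;

internal sealed class TerminalSendSketch
{
    private readonly Action<string> _writeAndSubmit; // stand-in for session.WriteAndSubmit
    private readonly Action<string> _writeResponse;  // stand-in for the MMF response writer

    public TerminalSendSketch(Action<string> writeAndSubmit, Action<string> writeResponse)
    {
        _writeAndSubmit = writeAndSubmit;
        _writeResponse = writeResponse;
    }

    // Called from WndProc with the COPYDATASTRUCT payload pointer and byte count.
    public IntPtr HandleTerminalSend(IntPtr lpData, int cbData)
    {
        // Copy into a managed string now; the sender's buffer is gone once WndProc returns.
        string text = Marshal.PtrToStringUni(lpData, cbData / sizeof(char));

        // Do the potentially blocking PTY write off the UI thread.
        _ = Task.Run(() =>
        {
            try
            {
                _writeAndSubmit(text);
                _writeResponse("ok");
            }
            catch (Exception ex)
            {
                // Report the failure through the same response channel the CLI polls.
                _writeResponse("error: " + ex.Message);
            }
        });

        // Nonzero tells the sender the message was accepted; the result arrives later.
        return new IntPtr(1);
    }
}
```

`Task.Run` does not guarantee that two back-to-back sends execute in arrival order, so a real implementation may need a per-session queue if ordering of PTY writes matters.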

3. Suggestion — `OsControlService.TextCapture` lacks STA marshal (latent)

File: `Project/AgentZeroWpf/OsControl/OsControlService.cs:158-181`

```csharp
public static string TextCapture(long hwnd, ...)
{
    var result = ElementTreeScanner.Scan((IntPtr)hwnd, ...); // no STA pin
    ...
}
```

The sibling `ElementTreeAsync` does marshal onto an STA thread (and the
comment there explicitly states "System.Windows.Automation requires an STA
thread"). `TextCapture` works today only because callers happen to be on
STA threads (CLI main, WPF dispatcher). A future LLM-toolbelt path on an
MTA thread will deadlock.
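A generic STA pin could look roughly like the helper below. This is a sketch, not the marshaling that `ElementTreeAsync` actually uses: `StaSketch.RunOnSta` is a hypothetical name, and it spins up a dedicated thread per call, which is simple but not cheap.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

internal static class StaSketch
{
    // Runs work on an STA thread and exposes the outcome as a Task.
    public static Task<T> RunOnSta<T>(Func<T> work)
    {
        // Already on an STA thread (CLI main, WPF dispatcher): run inline.
        if (Thread.CurrentThread.GetApartmentState() == ApartmentState.STA)
        {
            try { return Task.FromResult(work()); }
            catch (Exception ex) { return Task.FromException<T>(ex); }
        }

        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        var thread = new Thread(() =>
        {
            try { tcs.SetResult(work()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        thread.IsBackground = true;
        // Must be set before Start; STA is only supported on Windows.
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        return tcs.Task;
    }
}
```

`TextCapture` could wrap its `ElementTreeScanner.Scan` call in such a helper and block on the returned task, since the STA work runs on its own thread rather than on the caller's.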

Recommendation

  • Apply now — Option A (this PR/commit). Eliminates the infinite-block
    class entirely.
  • Next mission — Option B: backgrounding for `HandleTerminalSend` and
    `HandleTerminalKey` first.
  • Adjacent mission — STA pin in `OsControlService.TextCapture`. Add a
    reviewer-checklist line for "any new `System.Windows.Automation` call must
    be STA-pinned."

Closes-when

  • CLI calls use `SendMessageTimeoutW` with `SMTO_ABORTIFHUNG` and a
    bounded timeout
  • Every `SendWpfCommand` caller short-circuits on `false`
  • CLI prints a unique error string on timeout (greppable from operator's terminal)
  • Knowledge captured in `harness/knowledge/code-coach/wm-copydata-ipc-pitfalls.md`
  • Option B applied to `HandleTerminalSend` (separate ticket)
  • STA pin added to `OsControlService.TextCapture` (separate ticket)
