fix: harden dynamic tool handlers against deadlock, hangs, and runaway output by electronicBlacksmith · Pull Request #36 · ghostwright/phantom

electronicBlacksmith · 2026-04-05T01:03:59Z

Summary

Fixes four latent liveness/stability bugs in the MCP dynamic-tool execution path. None produce visible errors today — they produce hangs or silently missing tools, which is the worst kind of bug because the agent has no signal anything is wrong.

1. Pipe-buffer deadlock: executeShellHandler and executeScriptHandler drained stdout then stderr sequentially. Any handler writing >64 KB to stderr before closing stdout (curl -v, git clone, npm install, verbose loggers) blocked on its next stderr write while phantom waited for stdout EOF forever.
2. No subprocess timeout: Bun.spawn ran with no kill path. A hung handler froze the agent turn indefinitely with no recovery.
3. No stdout/stderr size cap: new Response(stream).text() slurped unbounded output, risking OOM of the 2 GB container.
4. No per-tool guard in registerAllOnServer: one tool with a bad inputSchema would throw during the loop and silently skip every subsequent tool on every agent query (MCP factory pattern recreates servers per query, so a single bad schema persists across every turn).
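
The first two problems can be illustrated together. What follows is a minimal sketch, not the PR's actual code: ChildLike and drainConcurrently are hypothetical names, with ChildLike loosely modeled on the stdout, stderr, exited, and kill members of a Bun.spawn subprocess. It reads both pipes at the same time, so neither can fill while the other is awaited, and it arms a SIGTERM timer that escalates to SIGKILL after a grace period.

```typescript
// Hypothetical shape of a child process, declared locally so the sketch is
// self-contained. It mirrors the members a Bun.spawn subprocess exposes.
interface ChildLike {
  stdout: ReadableStream<Uint8Array>;
  stderr: ReadableStream<Uint8Array>;
  exited: Promise<number>;
  kill(signal: "SIGTERM" | "SIGKILL"): void;
}

interface DrainResult {
  stdout: string;
  stderr: string;
  exitCode: number;
  timedOut: boolean;
}

// Hypothetical helper name; not the PR's drainProcessWithLimits.
async function drainConcurrently(
  child: ChildLike,
  timeoutMs: number,
  graceMs: number,
): Promise<DrainResult> {
  let timedOut = false;
  let killTimer: ReturnType<typeof setTimeout> | undefined;

  // Ask politely first, then force-kill if the child ignores SIGTERM.
  const termTimer = setTimeout(() => {
    timedOut = true;
    child.kill("SIGTERM");
    killTimer = setTimeout(() => child.kill("SIGKILL"), graceMs);
  }, timeoutMs);

  try {
    // Both pipes are read at once. Awaiting stdout to EOF before touching
    // stderr is what lets a full stderr pipe block the child forever.
    const [stdout, stderr, exitCode] = await Promise.all([
      new Response(child.stdout).text(),
      new Response(child.stderr).text(),
      child.exited,
    ]);
    return { stdout, stderr, exitCode, timedOut };
  } finally {
    clearTimeout(termTimer);
    if (killTimer !== undefined) clearTimeout(killTimer);
  }
}
```

This sketch still reads each stream without a size limit; the output cap is a separate concern.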

Fix

All four fixes land in src/mcp/dynamic-handlers.ts and src/mcp/dynamic-tools.ts. Orthogonal, small, surgical.

New readStreamWithCap helper drains with a byte cap and continues reading-to-void past the cap so the child never blocks on a full pipe buffer.
New drainProcessWithLimits helper runs concurrent drains via Promise.all, schedules SIGTERM at HANDLER_TIMEOUT_MS (default 60s, env-overridable via PHANTOM_DYNAMIC_HANDLER_TIMEOUT_MS), and escalates to SIGKILL after a 2s grace.
Size cap defaults to 1 MB (PHANTOM_DYNAMIC_HANDLER_MAX_OUTPUT_BYTES). Truncation notice matches the truncateForSlack convention.
registerAllOnServer now wraps per-tool registration in try/catch, logs a warning with the tool name, and continues. Broken tools are not auto-unregistered — the schema might be temporarily invalid, and silently deleting the row would be destructive. Log loudly, keep the row, let the operator decide.
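
The capped drain described in the first item above might look roughly like the following. This is a sketch under assumptions, not the PR's readStreamWithCap: the signature and the returned truncated flag are hypothetical, and the truncation notice is left to the caller because the truncateForSlack format is not shown here. The point it illustrates is that reading continues past the cap, with the excess discarded, so the writer never stalls on a full pipe.

```typescript
// Hypothetical signature; the real helper in the PR may differ.
async function readStreamWithCap(
  stream: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<{ text: string; truncated: boolean }> {
  const reader = stream.getReader();
  const chunks: Uint8Array[] = [];
  let kept = 0;
  let truncated = false;

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value.byteLength === 0) continue;

    const room = maxBytes - kept;
    if (room <= 0) {
      // Past the cap: keep consuming so the child never blocks, but discard.
      truncated = true;
      continue;
    }
    if (value.byteLength > room) {
      // Keep only the part of this chunk that still fits under the cap.
      chunks.push(value.subarray(0, room));
      kept += room;
      truncated = true;
    } else {
      chunks.push(value);
      kept += value.byteLength;
    }
  }

  // Join the retained bytes and decode once. A cut in the middle of a
  // multi-byte UTF-8 sequence decodes to U+FFFD rather than throwing.
  const joined = new Uint8Array(kept);
  let offset = 0;
  for (const chunk of chunks) {
    joined.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return { text: new TextDecoder().decode(joined), truncated };
}
```

A caller would append its own truncation notice when truncated is true.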

Security boundary preserved

buildSafeEnv and the --env-file= pattern in executeScriptHandler are unchanged. The subprocess environment isolation boundary documented in SECURITY.md is intact — no secrets leak into spawned processes.

Tests

Five new tests, all spawning real subprocesses (no mocks):

Pipe-drain deadlock regression — writes 200 KB to stderr then prints to stdout. Under the old sequential-drain code this hangs forever. Under the fix it completes in ~100 ms. This is the most important test — it's the one that proves the deadlock bug stays dead.
Timeout kills a hung handler — sleep 10 with a 500 ms timeout override. Expects timeout error in well under 5s.
Output cap truncates — emits ~270 KB of base64 with a 10 KB cap. Asserts truncation notice is present and total length is bounded.
Non-zero exit surfaces stderr and exit code — guards the error-path message content.
registerAllOnServer tolerates a failing tool — one bad tool logs a warning via console.warn, the other registers successfully, no exception bubbles out.
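
The behavior the last test checks can be sketched as a per-item try/catch around registration. The ToolDef shape, the register callback, and the registerAllGuarded name are all hypothetical stand-ins; the real registerAllOnServer works against an MCP server object that is not reproduced here.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
interface ToolDef {
  name: string;
  inputSchema: unknown;
}

type RegisterFn = (tool: ToolDef) => void;

function registerAllGuarded(
  tools: ToolDef[],
  register: RegisterFn,
  warn: (msg: string) => void = console.warn,
): { registered: string[]; failed: string[] } {
  const registered: string[] = [];
  const failed: string[] = [];

  for (const tool of tools) {
    try {
      register(tool);
      registered.push(tool.name);
    } catch (err) {
      // One bad tool is logged by name and skipped; the loop keeps going,
      // and nothing is deleted from storage.
      const reason = err instanceof Error ? err.message : String(err);
      warn(`failed to register dynamic tool "${tool.name}": ${reason}`);
      failed.push(tool.name);
    }
  }
  return { registered, failed };
}
```

Returning the failed names is a choice made for this sketch so a caller can surface them; the PR itself only describes logging a warning.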

A note on test env cleanup

Env-var cleanup in the new tests uses Reflect.deleteProperty(process.env, ...) rather than delete (which Biome's noDelete rule rejects) or = undefined (which coerces to the string "undefined" on process.env and does not actually unset the key). This matches the pattern you acknowledged as correct in #5 when @coe0718 raised the same issue.

The existing buildSafeEnv tests at the top of the file still use the = undefined pattern — that's out of scope for this PR but flagging it so you're aware the file will have two different patterns until the follow-up you mentioned in #5 lands. Happy to roll that into this PR if you'd prefer a single-pattern file.
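
For reference, the cleanup pattern can be wrapped in a small helper. This is only a sketch: withEnv and EnvLike are hypothetical names that do not appear in the PR, and the helper takes the env object as a parameter so it can be exercised without touching the real process.env.

```typescript
// Any env-like object works; in real tests this would be process.env.
type EnvLike = Record<string, string | undefined>;

// Hypothetical helper: apply overrides, run fn, then restore the prior state.
async function withEnv<T>(
  env: EnvLike,
  overrides: Record<string, string>,
  fn: () => T | Promise<T>,
): Promise<T> {
  const previous = new Map<string, string | undefined>();
  for (const key of Object.keys(overrides)) {
    previous.set(key, env[key]);
    env[key] = overrides[key];
  }
  try {
    return await fn();
  } finally {
    previous.forEach((old, key) => {
      if (old === undefined) {
        // Actually removes the key, unlike assigning undefined.
        Reflect.deleteProperty(env, key);
      } else {
        env[key] = old;
      }
    });
  }
}
```

A test could then run its body inside withEnv(process.env, { PHANTOM_DYNAMIC_HANDLER_TIMEOUT_MS: "500" }, ...) and leave no override behind, even when the body throws.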

Test plan

bun run lint — clean (0 errors, 0 suppressions)
bun run typecheck — clean
bun test src/mcp/__tests__/dynamic-handlers.test.ts src/mcp/__tests__/dynamic-tools.test.ts — 33 pass, 0 fail
bun test — 862 pass, 2 fail (the 2 pre-existing phantom init environmental failures unrelated to this change)
Maintainer review

…y output

Four latent liveness/stability bugs on the MCP dynamic-tool execution path would silently hang agent turns or crash the container. None surfaced visible errors, which made them the worst kind of bug: the agent just stopped.

1. Pipe-buffer deadlock: executeShellHandler and executeScriptHandler drained stdout then stderr sequentially. Any handler writing >64KB to stderr before closing stdout (curl -v, git clone, npm install, verbose loggers) blocked on its next stderr write while phantom waited for stdout EOF forever. Fix: Promise.all over both streams via a new readStreamWithCap helper.

2. No subprocess timeout: Bun.spawn ran with no kill path. A hung handler froze the agent turn indefinitely with no recovery. Fix: drainProcessWithLimits schedules SIGTERM at HANDLER_TIMEOUT_MS (default 60s, env-overridable via PHANTOM_DYNAMIC_HANDLER_TIMEOUT_MS) and escalates to SIGKILL after a 2s grace. Timeouts report partial stderr so the agent has actionable signal.

3. No stdout/stderr size cap: new Response(stream).text() slurped unbounded output, risking OOM of the 2GB container. Fix: readStreamWithCap enforces a 1MB cap by default (PHANTOM_DYNAMIC_HANDLER_MAX_OUTPUT_BYTES), appends a clear truncation notice, and continues draining-to-void so the child never blocks on a full pipe buffer.

4. DynamicToolRegistry.registerAllOnServer had no per-tool guard. One tool with a bad inputSchema would throw during the loop and silently skip every subsequent tool on every agent query (MCP factory pattern recreates servers per query). Fix: per-tool try/catch, warn with tool name, continue. Broken tools are not auto-unregistered; the operator decides.

buildSafeEnv and the --env-file= pattern in executeScriptHandler are unchanged, preserving the subprocess environment isolation boundary from SECURITY.md.

Tests spawn real subprocesses and include a 200KB-stderr regression test that would hang under the old sequential-drain code.

Env-var cleanup in the new tests uses Reflect.deleteProperty(process.env, ...) rather than `delete` (Biome noDelete) or `= undefined` (coerces to the string "undefined" on process.env and does not actually unset the key). This matches the pattern acknowledged as correct by the maintainer in #5.

electronicBlacksmith closed this Apr 5, 2026

electronicBlacksmith deleted the fix/dynamic-handler-hardening branch April 5, 2026 04:22

electronicBlacksmith wants to merge 1 commit into ghostwright:main from electronicBlacksmith:fix/dynamic-handler-hardening
