fix: harden dynamic tool handlers against deadlock, hangs, and runaway output #36
Closed
electronicBlacksmith wants to merge 1 commit into ghostwright:main from
…y output

Four latent liveness/stability bugs on the MCP dynamic-tool execution path would silently hang agent turns or crash the container. None surfaced visible errors, which made them the worst kind of bug: the agent just stopped.

1. Pipe-buffer deadlock: executeShellHandler and executeScriptHandler drained stdout then stderr sequentially. Any handler writing >64KB to stderr before closing stdout (curl -v, git clone, npm install, verbose loggers) blocked on its next stderr write while phantom waited for stdout EOF forever. Fix: Promise.all over both streams via a new readStreamWithCap helper.
2. No subprocess timeout: Bun.spawn ran with no kill path. A hung handler froze the agent turn indefinitely with no recovery. Fix: drainProcessWithLimits schedules SIGTERM at HANDLER_TIMEOUT_MS (default 60s, env-overridable via PHANTOM_DYNAMIC_HANDLER_TIMEOUT_MS) and escalates to SIGKILL after a 2s grace. Timeouts report partial stderr so the agent has actionable signal.
3. No stdout/stderr size cap: new Response(stream).text() slurped unbounded output, risking OOM of the 2GB container. Fix: readStreamWithCap enforces a 1MB cap by default (PHANTOM_DYNAMIC_HANDLER_MAX_OUTPUT_BYTES), appends a clear truncation notice, and continues draining-to-void so the child never blocks on a full pipe buffer.
4. DynamicToolRegistry.registerAllOnServer had no per-tool guard. One tool with a bad inputSchema would throw during the loop and silently skip every subsequent tool on every agent query (MCP factory pattern recreates servers per query). Fix: per-tool try/catch, warn with tool name, continue. Broken tools are not auto-unregistered; the operator decides.

buildSafeEnv and the --env-file= pattern in executeScriptHandler are unchanged, preserving the subprocess environment isolation boundary from SECURITY.md.

Tests spawn real subprocesses and include a 200KB-stderr regression test that would hang under the old sequential-drain code.
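The concurrent, capped drain described in items 1 and 3 can be sketched roughly as follows. This is a minimal illustration built on standard Web Streams, not the PR's actual code: the function bodies, the `drainBoth` name, and the truncation wording are all assumptions, and only the `readStreamWithCap` name comes from the commit message.

```typescript
// Hypothetical sketch of a capped drain; the real helper in the PR may differ.
const TRUNCATION_NOTICE = "\n[output truncated]"; // assumed wording

async function readStreamWithCap(
  stream: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<string> {
  const reader = stream.getReader();
  const kept: Uint8Array[] = [];
  let keptBytes = 0;
  let truncated = false;

  while (true) {
    const result = await reader.read();
    if (result.done) break;
    const value = result.value;
    const room = maxBytes - keptBytes;
    if (room > 0) {
      // Keep only as much of this chunk as still fits under the cap.
      const slice = value.byteLength > room ? value.subarray(0, room) : value;
      kept.push(slice);
      keptBytes += slice.byteLength;
    }
    if (value.byteLength > room) truncated = true;
    // Past the cap we keep reading and discard the bytes, so the writer
    // never blocks on a full pipe buffer.
  }

  const merged = new Uint8Array(keptBytes);
  let offset = 0;
  for (const chunk of kept) {
    merged.set(chunk, offset);
    offset += chunk.byteLength;
  }
  // A cut in the middle of a multi-byte UTF-8 sequence decodes to U+FFFD.
  const text = new TextDecoder().decode(merged);
  return truncated ? text + TRUNCATION_NOTICE : text;
}

// Drain both pipes at the same time; a sequential drain is what deadlocked.
async function drainBoth(
  stdout: ReadableStream<Uint8Array>,
  stderr: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<{ stdout: string; stderr: string }> {
  const [out, err] = await Promise.all([
    readStreamWithCap(stdout, maxBytes),
    readStreamWithCap(stderr, maxBytes),
  ]);
  return { stdout: out, stderr: err };
}
```

Because both reads are started before either is awaited, a child that fills its stderr pipe while stdout is still open keeps making progress.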
Env-var cleanup in the new tests uses Reflect.deleteProperty(process.env, ...) rather than `delete` (Biome noDelete) or `= undefined` (coerces to the string "undefined" on process.env and does not actually unset the key). This matches the pattern acknowledged as correct by the maintainer in #5.
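A small sketch of the cleanup pattern described above, assuming a Node-compatible `process.env`. The `withEnv` helper and the variable name in the usage note are hypothetical and are not taken from the PR's test files.

```typescript
// Hypothetical test helper: set an env var for the duration of `fn`, then
// restore the previous state. When the key did not exist before, it is
// removed with Reflect.deleteProperty instead of being assigned `undefined`,
// because Node documents that values assigned to process.env are converted
// to strings.
function withEnv<T>(key: string, value: string, fn: () => T): T {
  const had = key in process.env;
  const previous = process.env[key];
  process.env[key] = value;
  try {
    return fn();
  } finally {
    if (had) {
      process.env[key] = previous as string;
    } else {
      Reflect.deleteProperty(process.env, key);
    }
  }
}
```

Usage might look like `withEnv("PHANTOM_DYNAMIC_HANDLER_TIMEOUT_MS", "500", () => runHandlerUnderTest())`, where `runHandlerUnderTest` stands in for whatever the test exercises.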
Summary
Fixes four latent liveness/stability bugs in the MCP dynamic-tool execution path. None produce visible errors today — they produce hangs or silently missing tools, which is the worst kind of bug because the agent has no signal anything is wrong.
1. Pipe-buffer deadlock: `executeShellHandler` and `executeScriptHandler` drained stdout then stderr sequentially. Any handler writing >64 KB to stderr before closing stdout (`curl -v`, `git clone`, `npm install`, verbose loggers) blocked on its next stderr write while phantom waited for stdout EOF forever.
2. No subprocess timeout: `Bun.spawn` ran with no kill path. A hung handler froze the agent turn indefinitely with no recovery.
3. No stdout/stderr size cap: `new Response(stream).text()` slurped unbounded output, risking OOM of the 2 GB container.
4. No per-tool guard in `registerAllOnServer`: one tool with a bad `inputSchema` would throw during the loop and silently skip every subsequent tool on every agent query (MCP factory pattern recreates servers per query, so a single bad schema persists across every turn).

Fix
All four fixes land in `src/mcp/dynamic-handlers.ts` and `src/mcp/dynamic-tools.ts`. Orthogonal, small, surgical.

- `readStreamWithCap` helper drains with a byte cap and continues reading-to-void past the cap so the child never blocks on a full pipe buffer.
- `drainProcessWithLimits` helper runs concurrent drains via `Promise.all`, schedules `SIGTERM` at `HANDLER_TIMEOUT_MS` (default 60 s, env-overridable via `PHANTOM_DYNAMIC_HANDLER_TIMEOUT_MS`), and escalates to `SIGKILL` after a 2 s grace.
- 1 MB output cap by default (`PHANTOM_DYNAMIC_HANDLER_MAX_OUTPUT_BYTES`). Truncation notice matches the `truncateForSlack` convention.
- `registerAllOnServer` now wraps per-tool registration in try/catch, logs a warning with the tool name, and continues. Broken tools are not auto-unregistered: the schema might be temporarily invalid, and silently deleting the row would be destructive. Log loudly, keep the row, let the operator decide.

Security boundary preserved
`buildSafeEnv` and the `--env-file=` pattern in `executeScriptHandler` are unchanged. The subprocess environment isolation boundary documented in SECURITY.md is intact: no secrets leak into spawned processes.

Tests
Five new tests, all spawning real subprocesses (no mocks):
- `sleep 10` with a 500 ms timeout override. Expects a timeout error in well under 5 s.
- `registerAllOnServer` tolerates a failing tool: one bad tool logs a warning via `console.warn`, the other registers successfully, and no exception bubbles out.

A note on test env cleanup
Env-var cleanup in the new tests uses `Reflect.deleteProperty(process.env, ...)` rather than `delete` (which Biome's `noDelete` rule rejects) or `= undefined` (which coerces to the string `"undefined"` on `process.env` and does not actually unset the key). This matches the pattern you acknowledged as correct in #5 when @coe0718 raised the same issue.

The existing `buildSafeEnv` tests at the top of the file still use the `= undefined` pattern. That's out of scope for this PR, but I'm flagging it so you're aware the file will have two different patterns until the follow-up you mentioned in #5 lands. Happy to roll that into this PR if you'd prefer a single-pattern file.

Test plan
- `bun run lint`: clean (0 errors, 0 suppressions)
- `bun run typecheck`: clean
- `bun test src/mcp/__tests__/dynamic-handlers.test.ts src/mcp/__tests__/dynamic-tools.test.ts`: 33 pass, 0 fail
- `bun test`: 862 pass, 2 fail (the 2 pre-existing `phantom init` environmental failures unrelated to this change)
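For reviewers who want the shape of the per-tool guard without opening the diff, the idea can be sketched as below. This is an assumption-laden illustration: `ToolDef`, `RegisterFn`, `registerAllGuarded`, and the log prefix are hypothetical names, and the real `registerAllOnServer` signature is not shown in this description.

```typescript
// Hypothetical types standing in for the registry's real tool records.
interface ToolDef {
  name: string;
  inputSchema: unknown;
}
type RegisterFn = (tool: ToolDef) => void;

// Register every tool, isolating failures so that one bad schema cannot
// skip the tools that come after it. Returns the names that failed.
function registerAllGuarded(
  tools: ToolDef[],
  register: RegisterFn,
  warn: (message: string) => void = console.warn,
): string[] {
  const failed: string[] = [];
  for (const tool of tools) {
    try {
      register(tool);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      // Log loudly and keep going; the tool's stored row is left untouched.
      warn(`[dynamic-tools] failed to register "${tool.name}": ${reason}`);
      failed.push(tool.name);
    }
  }
  return failed;
}
```

Returning the failed names is a design choice of this sketch; it lets a caller surface the list to an operator without the loop itself deleting anything.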