fix: prevent Gemini second-turn hangs in ACP adapter#143
Conversation
Two bugs caused the "same session supports a second turn" E2E test to
fail intermittently:
1. waitForResponse data loss — when Gemini sent session/request_permission
in the same stdout chunk as the session/new response, the remaining
lines after the matched response ID were silently dropped. The adapter
never processed the permission request, Gemini blocked waiting for a
response, and session/prompt was never processed.
Fix: return { result, leftover } from waitForResponse, pass leftover
from both handshake steps as initialBuffer / preBufferedData so AcpSession
replays them on startup.
2. No process-exit detection + no process-group kill — createMessageStream
only listened on child.stdout "close", which never fires when grandchild
processes inherit the stdout pipe. If Gemini spawned subprocesses that
kept the pipe open, backendConnected stayed true indefinitely.
Fix: add child.on("exit", onChildExit) → finalize() alongside onClose
so the stream ends as soon as the main Gemini process exits regardless
of grandchildren. Also switch close() and the catch block in connect()
to process.kill(-pid, signal) with child.kill() fallback, matching the
pattern from NodeProcessManager (commit 70ad1ca).
The spawn call already had detached: true (added in prior session).
- killProcessGroup: wrap the child.kill() fallback in its own try/catch so a race where the process exits between the two kill calls doesn't surface as an unhandled exception (matches the pattern in node-process-manager.ts) - AcpSession.close(): inline the exit/timeout logic into a single Promise so the SIGKILL timer is both unref'd (won't keep the event loop alive) and cleared via clearTimeout when the child exits normally, preventing a spurious 5s delay on clean shutdown
Summary of ChangesHello @teng-lin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves intermittent hangs experienced by Gemini in the ACP adapter, particularly during multi-turn conversations. The changes focus on improving the reliability of inter-process communication by preventing data loss during response parsing and ensuring comprehensive termination of all associated backend processes. These fixes enhance the stability and responsiveness of the Gemini integration. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request addresses hangs in the ACP adapter by fixing data loss in waitForResponse and improving process termination logic. The waitForResponse function now returns leftover data to prevent message loss, and a new killProcessGroup helper is introduced to ensure child processes and their descendants are properly terminated. The changes also include more robust error handling and event listener management. Overall, these are solid improvements to the stability of the ACP adapter. I have a couple of suggestions to further refine the changes.
| const pid = child.pid; | ||
| try { | ||
| if (pid !== undefined) process.kill(-pid, signal); | ||
| else child.kill(signal); | ||
| } catch { | ||
| try { | ||
| child.kill(signal); | ||
| } catch { | ||
| // Process already exited — nothing to do. | ||
| } | ||
| } |
There was a problem hiding this comment.
The logic with the nested try-catch is a bit complex and contains a redundant path. Specifically, if pid is undefined and child.kill(signal) throws, the catch block will try to execute child.kill(signal) again. This can be simplified to improve readability and remove the redundant call.
const pid = child.pid;
if (pid !== undefined) {
try {
// Try to kill the whole process group first.
process.kill(-pid, signal);
return; // If successful, we're done.
} catch {
// Fallback to killing just the child process if group kill fails
// (e.g., on Windows, or if the process has already exited).
}
}
try {
child.kill(signal);
} catch {
// Process already exited — nothing to do.
}
trace.ndjson
Outdated
| @@ -0,0 +1,66 @@ | |||
| {"trace":true,"traceId":"t_79beddb9","layer":"backend","direction":"send","messageType":"native_outbound","ts":"2026-02-25T18:18:40.644Z","elapsed_ms":0,"sessionId":"3f4b0dcb-7d14-4397-9364-f37750f271c0","seq":1,"phase":"handshake_send","size_bytes":161,"body":{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":1,"clientCapabilities":{},"clientInfo":{"name":"beamcode","version":"0.1.0"}}}} | |||
Summary
Fixes a flaky E2E test ("same session supports a second turn") that intermittently hangs with
backendConnected=truewhen running Gemini over ACP. Two root causes were identified and fixed:Bug 1:
waitForResponsedata lossWhen Gemini sends
session/request_permissionin the same stdout chunk as thesession/newresponse, the old code would find the matching response ID, drop all remaining lines from that chunk, and never process the permission request. Gemini would block forever waiting for a response.Fix:
waitForResponsenow returns{ result, leftover }capturing any bytes after the matched response in the same chunk. TheinitLeftoverfrom the first handshake is passed asinitialBufferto the second call, andsessionLeftoveris passed toAcpSessionaspreBufferedDatafor replay during stream startup.Bug 2: No process-exit detection + missing process-group kill
createMessageStreamonly listened onstdout.on("close"), which never fires if grandchild processes inherit the stdout pipe. Additionally,close()and the catch block inconnect()usedchild.kill()instead ofprocess.kill(-pid, ...)for process-group termination, leaving orphaned descendants alive.Fix:
child.on("exit", finalize)to terminate the stream when the main process exits regardless of pipe ownershipprocess.kill(-pid, signal)withchild.kill()fallback for process-group kill (matchingNodeProcessManagerpattern)close()timer to be.unref()'d and.clearTimeout()'d on normal exitTest Results
Files Changed
src/adapters/acp/acp-adapter.ts— leftover capture, process-group kill, better error handlingsrc/adapters/acp/acp-session.ts— dual exit/close listeners, preBufferedData replay, timer cleanupsrc/adapters/acp/kill-process-group.ts— extracted helper with hardened fallback (new file)trace.ndjson— debugging artifact (can be removed)🤖 Generated with Claude Code