fix: prevent Gemini second-turn hangs in ACP adapter #143

Merged
teng-lin merged 4 commits into main from worktree-gemini-e2e-error on Feb 26, 2026

@teng-lin
Owner

Summary

Fixes a flaky E2E test ("same session supports a second turn") that intermittently hangs with backendConnected=true when running Gemini over ACP. Two root causes were identified and fixed:

Bug 1: waitForResponse data loss

When Gemini sends session/request_permission in the same stdout chunk as the session/new response, the old code would find the matching response ID, drop all remaining lines from that chunk, and never process the permission request. Gemini would block forever waiting for a response.

Fix: waitForResponse now returns { result, leftover } capturing any bytes after the matched response in the same chunk. The initLeftover from the first handshake is passed as initialBuffer to the second call, and sessionLeftover is passed to AcpSession as preBufferedData for replay during stream startup.
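
A minimal sketch of the leftover capture described above, assuming newline-delimited JSON-RPC messages on stdout. The name extractResponse and its signature are hypothetical; this is not the adapter's actual waitForResponse, only an illustration of keeping the bytes that follow the matched response instead of dropping them.

```typescript
// Hypothetical helper for illustration; names and shapes are assumptions.
interface Matched {
  result: unknown;
  leftover: string; // everything after the matched line, kept verbatim
}

function extractResponse(buffer: string, expectedId: number): Matched | null {
  let offset = 0;
  while (true) {
    const newline = buffer.indexOf("\n", offset);
    if (newline === -1) return null; // no complete line left; caller waits for more data
    const line = buffer.slice(offset, newline).trim();
    offset = newline + 1;
    if (line === "") continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // skip lines that are not JSON
    }
    if (typeof parsed !== "object" || parsed === null) continue;

    const msg = parsed as { id?: unknown; method?: unknown; result?: unknown };
    // A response carries the request id and has no "method" field.
    // Lines before the match are simply skipped in this sketch.
    if (msg.id === expectedId && msg.method === undefined) {
      return { result: msg.result, leftover: buffer.slice(offset) };
    }
  }
}
```

With the leftover returned, the caller can seed the next read with it, so a session/request_permission that shared a chunk with the session/new response is still seen.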

Bug 2: No process-exit detection + missing process-group kill

createMessageStream only listened on stdout.on("close"), which never fires if grandchild processes inherit the stdout pipe. Additionally, close() and the catch block in connect() used child.kill() instead of process.kill(-pid, ...) for process-group termination, leaving orphaned descendants alive.

Fix:

  • Added child.on("exit", finalize) to terminate the stream when the main process exits regardless of pipe ownership
  • Switched to process.kill(-pid, signal) with child.kill() fallback for process-group kill (matching NodeProcessManager pattern)
  • Hardened the fallback with its own try/catch to prevent spurious exceptions on process-already-exited races
  • Fixed the close() timer so it is unref()'d and cleared with clearTimeout() on normal exit
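
The dual-listener idea in the first bullet can be sketched as follows. This is an illustration under assumptions, not the code in acp-session.ts: onBackendGone is a hypothetical name, and both arguments are typed as plain EventEmitter so the sketch also works with test doubles (a real ChildProcess and its stdout stream are both EventEmitters).

```typescript
import { EventEmitter } from "node:events";

// Hypothetical helper: runs `callback` exactly once, on whichever fires first,
// the process "exit" event or the stdout "close" event.
function onBackendGone(
  child: EventEmitter,
  stdout: EventEmitter,
  callback: () => void,
): void {
  let done = false;
  const finalize = () => {
    if (done) return; // both events usually fire; only act on the first
    done = true;
    stdout.removeListener("close", finalize);
    child.removeListener("exit", finalize);
    callback();
  };
  // "close" on stdout can be delayed indefinitely if a grandchild holds the pipe,
  // so "exit" on the child itself is the signal that does not depend on the pipe.
  stdout.once("close", finalize);
  child.once("exit", finalize);
}
```

The done flag makes finalize idempotent, which matters because in the common case both events arrive.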

Test Results

  • All 147 ACP adapter tests pass
  • All 3123 unit/integration tests pass
  • TypeScript typecheck clean

Files Changed

  • src/adapters/acp/acp-adapter.ts — leftover capture, process-group kill, better error handling
  • src/adapters/acp/acp-session.ts — dual exit/close listeners, preBufferedData replay, timer cleanup
  • src/adapters/acp/kill-process-group.ts — extracted helper with hardened fallback (new file)
  • trace.ndjson — debugging artifact (can be removed)

🤖 Generated with Claude Code

Two bugs caused the "same session supports a second turn" E2E test to
fail intermittently:

1. waitForResponse data loss — when Gemini sent session/request_permission
   in the same stdout chunk as the session/new response, the remaining
   lines after the matched response ID were silently dropped. The adapter
   never processed the permission request, Gemini blocked waiting for a
   response, and session/prompt was never processed.

   Fix: return { result, leftover } from waitForResponse, pass leftover
   from both handshake steps as initialBuffer / preBufferedData so AcpSession
   replays them on startup.

2. No process-exit detection + no process-group kill — createMessageStream
   only listened on child.stdout "close", which never fires when grandchild
   processes inherit the stdout pipe. If Gemini spawned subprocesses that
   kept the pipe open, backendConnected stayed true indefinitely.

   Fix: add child.on("exit", onChildExit) → finalize() alongside onClose
   so the stream ends as soon as the main Gemini process exits regardless
   of grandchildren. Also switch close() and the catch block in connect()
   to process.kill(-pid, signal) with child.kill() fallback, matching the
   pattern from NodeProcessManager (commit 70ad1ca).

   The spawn call already had detached: true (added in prior session).
- killProcessGroup: wrap the child.kill() fallback in its own try/catch so
  a race where the process exits between the two kill calls doesn't surface
  as an unhandled exception (matches the pattern in node-process-manager.ts)

- AcpSession.close(): inline the exit/timeout logic into a single Promise so
  the SIGKILL timer is both unref'd (won't keep the event loop alive) and
  cleared via clearTimeout when the child exits normally, preventing a
  spurious 5s delay on clean shutdown
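
A rough sketch of the single-Promise shutdown described in the AcpSession.close() note above. The function name waitForExitOrKill, the forceKill callback, and the return values are assumptions made for illustration; only the 5s grace period, the unref'd timer, and the clearTimeout on normal exit come from the commit message.

```typescript
import { EventEmitter } from "node:events";
// Imported explicitly so the timer handle is Node's Timeout type, which has unref().
import { setTimeout, clearTimeout } from "node:timers";

// Hypothetical helper: resolves when the child exits, or force-kills after graceMs.
function waitForExitOrKill(
  child: EventEmitter,
  forceKill: () => void,
  graceMs = 5000,
): Promise<"exited" | "killed"> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => {
      child.removeListener("exit", onExit);
      forceKill(); // e.g. send SIGKILL to the process group
      resolve("killed");
    }, graceMs);
    timer.unref(); // the pending timer alone will not keep the event loop alive

    const onExit = () => {
      clearTimeout(timer); // clean exit: cancel the kill timer immediately
      resolve("exited");
    };
    child.once("exit", onExit);
  });
}
```

Clearing the timer on exit is what avoids waiting out the full grace period when the child shuts down promptly.
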
@gemini-code-assist
Contributor

Summary of Changes

Hello @teng-lin, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request resolves intermittent hangs experienced by Gemini in the ACP adapter, particularly during multi-turn conversations. The changes focus on improving the reliability of inter-process communication by preventing data loss during response parsing and ensuring comprehensive termination of all associated backend processes. These fixes enhance the stability and responsiveness of the Gemini integration.

Highlights

  • Data Loss Prevention: Implemented a mechanism in waitForResponse to capture and return any leftover data in the stdout buffer after a matching JSON-RPC response. This prevents silent data loss when multiple messages arrive in a single chunk, addressing a root cause of Gemini hangs.
  • Robust Process Termination: Enhanced process termination logic to reliably kill entire process groups, including orphaned descendant processes. This was achieved by using process.kill(-pid, signal) and introducing a new killProcessGroup helper, preventing backend processes from lingering after the main process exits.
  • Improved Disconnection Detection: Modified AcpSession to listen for both stdout.on("close") and child.on("exit") events. This ensures that the session correctly detects backend disconnection even if grandchild processes keep the stdout pipe open.
  • Pre-buffered Data Replay: Introduced preBufferedData in AcpSession to replay messages that were buffered during the ACP handshake but not processed immediately. This ensures that messages like session/request_permission are not lost.
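
The replay step in the last highlight can be sketched like this, assuming newline-delimited messages. LineReader and startLineReader are hypothetical names, not the AcpSession API; the point is only that the handshake leftover is fed through the same line parser before any new stdout data.

```typescript
// Hypothetical line splitter that carries partial lines across chunks.
class LineReader {
  private buffer = "";
  private readonly onLine: (line: string) => void;

  constructor(onLine: (line: string) => void) {
    this.onLine = onLine;
  }

  feed(data: string): void {
    this.buffer += data;
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line !== "") this.onLine(line);
      newline = this.buffer.indexOf("\n");
    }
  }
}

// Replays pre-buffered handshake data first, then returns the reader so that
// later stdout chunks continue from any partial line the leftover ended with.
function startLineReader(
  preBufferedData: string,
  onLine: (line: string) => void,
): LineReader {
  const reader = new LineReader(onLine);
  if (preBufferedData !== "") reader.feed(preBufferedData);
  return reader;
}
```

Feeding the leftover through feed() rather than parsing it separately keeps a message that straddles the handshake and the first stream chunk intact.
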
Changelog
  • src/adapters/acp/acp-adapter.ts
    • Imported killProcessGroup for robust process termination.
    • Configured child processes to run in a new, detached process group to enable proper group termination.
    • Updated waitForResponse calls to handle and pass along 'leftover' data from stdout chunks.
    • Replaced direct child.kill() calls with the new killProcessGroup helper for more reliable cleanup.
  • src/adapters/acp/acp-session.ts
    • Imported killProcessGroup for consistent process termination.
    • Added preBufferedData property to store and replay messages received during the handshake phase.
    • Modified the constructor to accept and initialize preBufferedData.
    • Updated the close method to use killProcessGroup and improved timer handling for graceful exit.
    • Configured the message stream to listen for both child.on("exit") and stdout.on("close") to detect process termination.
    • Implemented logic to replay preBufferedData when the message stream starts, ensuring no messages are missed.
  • src/adapters/acp/kill-process-group.ts
    • Added a new utility file containing killProcessGroup function.
    • Implemented killProcessGroup to send signals to an entire process group, with a fallback to child.kill() for robustness.
  • trace.ndjson
    • Added a debugging artifact, which can be removed.
Activity
  • All 147 ACP adapter tests passed.
  • All 3123 unit/integration tests passed.
  • TypeScript typecheck was clean.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses hangs in the ACP adapter by fixing data loss in waitForResponse and improving process termination logic. The waitForResponse function now returns leftover data to prevent message loss, and a new killProcessGroup helper is introduced to ensure child processes and their descendants are properly terminated. The changes also include more robust error handling and event listener management. Overall, these are solid improvements to the stability of the ACP adapter. I have a couple of suggestions to further refine the changes.

Comment on lines +11 to +21
  const pid = child.pid;
  try {
    if (pid !== undefined) process.kill(-pid, signal);
    else child.kill(signal);
  } catch {
    try {
      child.kill(signal);
    } catch {
      // Process already exited — nothing to do.
    }
  }

medium

The logic with the nested try-catch is a bit complex and contains a redundant path. Specifically, if pid is undefined and child.kill(signal) throws, the catch block will try to execute child.kill(signal) again. This can be simplified to improve readability and remove the redundant call.

  const pid = child.pid;
  if (pid !== undefined) {
    try {
      // Try to kill the whole process group first.
      process.kill(-pid, signal);
      return; // If successful, we're done.
    } catch {
      // Fallback to killing just the child process if group kill fails
      // (e.g., on Windows, or if the process has already exited).
    }
  }

  try {
    child.kill(signal);
  } catch {
    // Process already exited — nothing to do.
  }

trace.ndjson Outdated
@@ -0,0 +1,66 @@
{"trace":true,"traceId":"t_79beddb9","layer":"backend","direction":"send","messageType":"native_outbound","ts":"2026-02-25T18:18:40.644Z","elapsed_ms":0,"sessionId":"3f4b0dcb-7d14-4397-9364-f37750f271c0","seq":1,"phase":"handshake_send","size_bytes":161,"body":{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":1,"clientCapabilities":{},"clientInfo":{"name":"beamcode","version":"0.1.0"}}}}

medium

This file appears to be a debugging artifact. As noted in the pull request description, it should be removed before merging to keep the repository history clean.

@teng-lin teng-lin merged commit 5b64107 into main Feb 26, 2026
6 checks passed
@teng-lin teng-lin deleted the worktree-gemini-e2e-error branch February 26, 2026 00:07