Question
When PolyPilot calls ResumeSessionAsync on a session that has tools actively executing on the headless CLI server, does the resume command destroy the running tools?
PR #472 assumes yes and works around it with a poll-then-resume pattern (never calling resume on active sessions). But we have not definitively proven causation, only correlation.
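The poll-then-resume idea can be sketched roughly as follows. This is a Python illustration under stated assumptions, not PolyPilot's actual code: it assumes events.jsonl holds one JSON object per line with a `type` field, and the names `session_has_shut_down`, `poll_then_resume`, and the `resume` callback are hypothetical. The 5-second interval and the `session.shutdown` marker come from the workaround as described in this issue.

```python
import json
import time
from pathlib import Path
from typing import Callable


def session_has_shut_down(events_path: Path) -> bool:
    """Return True if the event log contains a session.shutdown event.

    Assumption: each line is a JSON object with a "type" field.
    """
    if not events_path.exists():
        return False
    with events_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate a partially written last line
            if isinstance(event, dict) and event.get("type") == "session.shutdown":
                return True
    return False


def poll_then_resume(
    events_path: Path,
    resume: Callable[[], None],
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the event log and call resume() only after the session has shut down.

    Returns True if resume() was called, False if the timeout expired first.
    resume() is never invoked while the session still looks active.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if session_has_shut_down(events_path):
            resume()
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The `sleep` parameter is injectable only so the loop can be exercised in a test without real waiting.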
Correlation evidence
Session 4f4f2380 (worker-1, April 1 2026):
06:32:23.027Z tool.execution_start <- tool running on CLI
06:32:50.928Z session.resume <- PolyPilot called ResumeSessionAsync
07:03:13.653Z session.shutdown <- no tool.execution_complete ever arrived
The tool started, resume was called 27s later, and the tool never completed. The session eventually shut down 30 minutes later.
Context: This happened during the ResumeOrchestrationIfPendingAsync flow, which called EnsureSessionConnectedAsync and then ResumeSessionAsync on a session that was still actively processing.
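The timeline above can be checked mechanically. The sketch below hard-codes the three logged events from session 4f4f2380 and computes the gaps between them. The tuple layout and helper names are illustrative assumptions, not the real events.jsonl schema.

```python
from datetime import datetime

# The three events quoted above, as (timestamp, event type) pairs.
EVENTS = [
    ("06:32:23.027Z", "tool.execution_start"),
    ("06:32:50.928Z", "session.resume"),
    ("07:03:13.653Z", "session.shutdown"),
]


def parse_time(stamp: str) -> datetime:
    """Parse an HH:MM:SS.mmmZ timestamp (the date part is irrelevant here)."""
    return datetime.strptime(stamp, "%H:%M:%S.%fZ")


def first_time(events, event_type):
    """Return the time of the first event of the given type, or None."""
    for stamp, name in events:
        if name == event_type:
            return parse_time(stamp)
    return None


start = first_time(EVENTS, "tool.execution_start")
resume = first_time(EVENTS, "session.resume")
shutdown = first_time(EVENTS, "session.shutdown")
completed = first_time(EVENTS, "tool.execution_complete")

start_to_resume = resume - start        # just under 28 seconds
resume_to_shutdown = shutdown - resume  # a little over 30 minutes

print("start -> resume:", start_to_resume)
print("resume -> shutdown:", resume_to_shutdown)
print("tool completed:", completed is not None)
```

This only confirms the ordering and the gaps. It shows that the resume preceded the missing completion, not that it caused it.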
What we could NOT prove
A controlled CLI-only repro was attempted (April 2) but was inconclusive:
Started a copilot CLI session, sent a prompt intended to trigger sleep 120
The agent interpreted it differently (ran read_bash with delay instead)
Stopped the first CLI, resumed with copilot --resume=<id>
The tool completed before the resume (at 81s, not 120s) -- unclear if it was interrupted or timed out naturally
The interactive CLI TUI makes controlled repros difficult. A proper test needs to drive the SDK programmatically (create a session, inject a known long-running tool, call resume, check for completion).
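Whatever harness ends up driving the SDK, its final "check for completion" step comes down to classifying the observed event order. A minimal sketch of that classification is below. The event type names are the ones quoted in this issue. The `Verdict` enum and `classify` function are hypothetical, and the input is assumed to be the ordered event types for a session that ran a single tool.

```python
from enum import Enum


class Verdict(Enum):
    SURVIVED = "tool completed after resume"
    LOST = "tool never completed after resume"
    INCONCLUSIVE = "resume did not overlap a running tool"


def classify(event_types: list[str]) -> Verdict:
    """Classify one session's ordered event types (single-tool assumption)."""
    try:
        start = event_types.index("tool.execution_start")
        resume = event_types.index("session.resume", start + 1)
    except ValueError:
        # No tool start, or no resume after it: the test did not exercise the case.
        return Verdict.INCONCLUSIVE
    if "tool.execution_complete" in event_types[start + 1:resume]:
        # The tool finished before resume was called, as in the April 2 attempt.
        return Verdict.INCONCLUSIVE
    if "tool.execution_complete" in event_types[resume + 1:]:
        return Verdict.SURVIVED
    return Verdict.LOST
```

A `LOST` verdict is only meaningful once the session has ended or a generous timeout has passed. Otherwise the tool may simply still be running.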
Current workaround (PR #472)
The poll-then-resume pattern avoids calling ResumeSessionAsync on active sessions entirely:
IsSessionStillProcessing() detects active sessions via events.jsonl
PollEventsAndResumeWhenIdleAsync polls every 5s for session.shutdown
ResumeSessionAsync is called only after the CLI finishes
What needs to happen
Wait for tool.execution_start in the event stream
Call session.resume() while the tool is running
Check whether tool.execution_complete arrives with the correct output
Related