
Fix pending tool result causing Anthropic API contract violation on runtime restart #4315

Open
alco wants to merge 2 commits into main from alco/fix-pending-tool

@alco alco commented May 12, 2026

Summary

When the agent runtime is interrupted mid-tool-execution (e.g. desktop app crash), the tool call is persisted to the timeline with a non-terminal status (started, args_complete, or executing) but no tool_result is ever written. On the next wake, defaultProjection in timeline-context.ts emits the tool_call message but skips the tool_result because it only emits one for completed/failed statuses. The resulting message history sent to Anthropic violates the API contract that every tool_use block must be followed by a matching tool_result, and the API rejects the request with:

tool_use ids were found without tool_result blocks immediately after: tc-NN

This puts the agent into a permanent error loop — every subsequent wake replays the same broken history and fails identically.
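
To make the violated contract concrete, here is a minimal sketch of a checker for the pairing rule quoted in the error above. The block shapes are a simplified subset of the Anthropic Messages API (content is assumed to always be an array of blocks), and `findUnpairedToolUseIds` is a hypothetical helper, not something in this repository.

```typescript
// Simplified subset of Anthropic Messages API content blocks.
// Assumption: content is always a block array (the real API also accepts a plain string).
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean };

interface ApiMessage {
  role: "user" | "assistant";
  content: Block[];
}

// Hypothetical helper: returns the ids of tool_use blocks that are not
// answered by a tool_result in the immediately following user message.
function findUnpairedToolUseIds(messages: ApiMessage[]): string[] {
  const unpaired: string[] = [];
  messages.forEach((msg, i) => {
    if (msg.role !== "assistant") return;

    // Collect the tool_use ids answered by the very next message, if any.
    const answered = new Set<string>();
    const next = messages[i + 1];
    if (next && next.role === "user") {
      for (const block of next.content) {
        if (block.type === "tool_result") answered.add(block.tool_use_id);
      }
    }

    for (const block of msg.content) {
      if (block.type === "tool_use" && !answered.has(block.id)) {
        unpaired.push(block.id);
      }
    }
  });
  return unpaired;
}
```

A history that ends with an assistant `tool_use` block and no following `tool_result` yields a non-empty list, which corresponds to the replayed state that fails on every wake.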

Fix

Synthesize an error tool_result for any tool call that is not in a terminal state. The result's content notes that the tool was interrupted and includes the last-known status, so the model can see what happened and proceed instead of being trapped.

This is the smallest change that restores API compliance and lets a crashed agent recover on its own.
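
The fix described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual code in timeline-context.ts: the `ToolCallRow` and `Message` shapes and the `projectToolCall` function are hypothetical, and only the status names and the synthesized message text are taken from this PR.

```typescript
// Hypothetical shapes; the real types in timeline-context.ts may differ.
type ToolStatus = "started" | "args_complete" | "executing" | "completed" | "failed";

interface ToolCallRow {
  toolCallId: string;
  toolName: string;
  status: ToolStatus;
  result?: string;
}

type Message =
  | { role: "tool_call"; toolCallId: string; toolName: string }
  | { role: "tool_result"; toolCallId: string; content: string; isError: boolean };

const TERMINAL: ReadonlySet<ToolStatus> = new Set<ToolStatus>(["completed", "failed"]);

// Hypothetical projection of one persisted tool call into timeline messages.
function projectToolCall(row: ToolCallRow): Message[] {
  const call: Message = {
    role: "tool_call",
    toolCallId: row.toolCallId,
    toolName: row.toolName,
  };

  if (TERMINAL.has(row.status)) {
    // Terminal rows carry their real result.
    return [
      call,
      {
        role: "tool_result",
        toolCallId: row.toolCallId,
        content: row.result ?? "",
        isError: row.status === "failed",
      },
    ];
  }

  // Non-terminal row: the runtime was interrupted, so synthesize an error
  // result that keeps every tool_call paired with a tool_result.
  return [
    call,
    {
      role: "tool_result",
      toolCallId: row.toolCallId,
      content: `Tool execution was interrupted before completion (status: ${row.status})`,
      isError: true,
    },
  ];
}
```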

Test plan

  • Reproduce the failure by crashing the desktop app mid-bash tool execution and confirm the agent recovers on next wake instead of looping on 400 invalid_request_error
  • Verify a fresh conversation with no interrupted tools is unaffected

🤖 Generated with Claude Code

@alco alco added the claude label May 12, 2026
codecov Bot commented May 12, 2026

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --- | --- | --- | --- |
| 690 | 1 | 689 | 39 |
View the top 1 failed test(s) by shortest run time
test/timeline-context.test.ts > timeline context > buildTimelineMessages keeps pending tool calls without emitting tool results
Stack Traces | 0.00639s run time
AssertionError: expected [ { role: 'tool_call', …(4) }, …(1) ] to deeply equal [ { role: 'tool_call', …(4) } ]

- Expected
+ Received

@@ -6,6 +6,12 @@
        "id": "user-1",
      },
      "toolCallId": "tc-pending",
      "toolName": "lookup",
    },
+   {
+     "content": "Tool execution was interrupted before completion (status: executing)",
+     "isError": true,
+     "role": "tool_result",
+     "toolCallId": "tc-pending",
+   },
  ]

 ❯ test/timeline-context.test.ts:147:7


claude Bot commented May 12, 2026

Claude Code Review

Summary

No code changes since iteration 1 — the two commits were recommitted on 2026-05-18 but the diff is identical. The original review still stands; the critical test-failure issue is now confirmed by codecov on this PR.

What's Working Well

(Unchanged from iteration 1.)

  • Minimal, targeted change at the projection layer — keeps blast radius small.
  • Synthesized message embeds the last-known status.
  • isError: true is set correctly.
  • Changeset present, scoped to @electric-ax/agents-runtime.
  • Consistent with existing pruning logic in context-assembly.ts:343-362 that already defends the tool_use/tool_result pairing contract.

Issues Found

Critical (Must Fix)

Existing test still fails on this branch

File: packages/agents-runtime/test/timeline-context.test.ts:121-156

buildTimelineMessages keeps pending tool calls without emitting tool results continues to assert the old "no tool_result for executing" behavior. CI on this PR shows this test failing exactly as predicted in iteration 1 (codecov bot, 1 failed test). The test name and assertion need to be updated to reflect the new "synthesize error result" contract.

Important (Should Fix)

No positive test for the new branch across non-terminal statuses

timeline-context.ts:156-164 is a new branch with no direct coverage. The runtime can crash in any of started / args_complete / executing, and the projection treats them uniformly, so a single parameterized test covering all three would prevent regression. Useful assertions: isError: true, content references the prior status, exactly one tool_result follows each tool_call.
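
One possible shape for that parameterized check is sketched below. It is deliberately framework-agnostic: `checkInterruptedProjection` is a hypothetical helper that takes the projection as a callback, so it makes no assumption about the real `buildTimelineMessages` signature. The `Msg` shape is likewise an assumption based on the fields visible in the failing test output.

```typescript
// Hypothetical message shape, based on the fields shown in the CI diff.
type Msg = { role: string; toolCallId?: string; content?: string; isError?: boolean };

const NON_TERMINAL = ["started", "args_complete", "executing"] as const;
type NonTerminal = (typeof NON_TERMINAL)[number];

// Runs the projection once per non-terminal status and returns a list of
// human-readable failures. An empty list means every check passed.
function checkInterruptedProjection(
  project: (status: NonTerminal) => Msg[],
): string[] {
  const failures: string[] = [];

  for (const status of NON_TERMINAL) {
    const msgs = project(status);

    msgs.forEach((msg, i) => {
      if (msg.role !== "tool_call") return;
      const next = msgs[i + 1];
      const afterNext = msgs[i + 2];

      // Exactly one tool_result must follow each tool_call.
      if (!next || next.role !== "tool_result" || next.toolCallId !== msg.toolCallId) {
        failures.push(`${status}: ${msg.toolCallId} has no tool_result immediately after`);
        return;
      }
      if (afterNext && afterNext.role === "tool_result" && afterNext.toolCallId === msg.toolCallId) {
        failures.push(`${status}: ${msg.toolCallId} has more than one tool_result`);
      }

      // The synthesized result must be an error that names the prior status.
      if (next.isError !== true) {
        failures.push(`${status}: tool_result for ${msg.toolCallId} is not marked isError`);
      }
      if (!next.content || !next.content.includes(status)) {
        failures.push(`${status}: tool_result content does not mention the status`);
      }
    });
  }

  return failures;
}
```

In a real test the callback would build a timeline containing one tool call in the given status and pass it through the projection under test; the test then asserts that the returned list is empty.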

Suggestions (Nice to Have)

Comment is slightly misleading

packages/agents-runtime/src/timeline-context.ts:157 says "crashed mid-execution," but started / args_complete mean the tool never began executing. Suggested:

// Runtime interrupted before the tool reached a terminal state.
// Synthesize an error result so Anthropic's tool_use/tool_result pairing holds.

Follow-up: converge the persisted row to a terminal state

The synthetic error is regenerated on every wake; the persisted row never moves to failed. Fine for this PR's "smallest change" framing, but worth a follow-up to flip interrupted tool calls to failed on first touch after restart so the rest of the system sees a consistent terminal status.
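
As a rough illustration of that follow-up, the sketch below flips any non-terminal row to failed in a single pass after restart. Everything here is hypothetical: the `PersistedToolCall` shape, the `error` field, and `convergeInterrupted` are invented for illustration, and the real persistence layer is not shown in this PR.

```typescript
// Hypothetical persisted row; status names are taken from the PR description.
type RowStatus = "started" | "args_complete" | "executing" | "completed" | "failed";

interface PersistedToolCall {
  toolCallId: string;
  status: RowStatus;
  error?: string;
}

// Hypothetical one-time pass on first touch after restart: any row still in
// a non-terminal state is rewritten as failed. Terminal rows are returned
// unchanged (same object reference), and the input array is not mutated.
function convergeInterrupted(rows: PersistedToolCall[]): PersistedToolCall[] {
  return rows.map((row): PersistedToolCall => {
    if (row.status === "completed" || row.status === "failed") return row;
    return {
      ...row,
      status: "failed",
      error: `Tool execution was interrupted before completion (status: ${row.status})`,
    };
  });
}
```

Recording the prior status in the error text preserves the information that the synthesized result currently derives at projection time.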

Issue Conformance

No linked issue. PR description is self-contained and accurately captures the bug, including the exact Anthropic error string and reproduction scenario.

Previous Review Status

| Iteration 1 finding | Status |
| --- | --- |
| Critical: existing test contradicts new behavior | Not addressed; confirmed failing in CI |
| Important: no test for synthesized error result | Not addressed |
| Suggestion: comment wording | Not addressed |
| Suggestion: converge persisted row to terminal state | Not addressed (acceptable as follow-up) |

Review iteration: 2 | 2026-05-18

alco and others added 2 commits May 19, 2026 01:38
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alco alco force-pushed the alco/fix-pending-tool branch from 6fd6acd to 113584d Compare May 18, 2026 23:38
@alco alco self-assigned this May 18, 2026