fix(agentic): make truncated Write recovery actionable end-to-end by nonoqing · Pull Request #653 · GCWing/BitFun

nonoqing · 2026-05-11T07:51:49Z

80cd279 introduced truncation recovery and injected a "use Edit to continue" hint into the assistant result. In practice the end-to-end loop never closed, for three reasons:

1. DeepResearchAgent did not include Edit in its allowed_tools, so the model's Edit attempts (which correctly followed the hint) were rejected at the allowed-tools check and the model fell back to Read.
2. The fallback Reads were identical (same file, same range) every round, producing a tight tool-call loop that nothing prevented.
3. The hint itself used "verify the result" wording, which nudged the model toward Read instead of Edit.
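The first failure mode can be illustrated with a minimal sketch of an allowed-tools check. The function name `check_tool_allowed` and the exact contents of the tool lists are assumptions made for illustration; they are not taken from the BitFun source.

```rust
/// Hypothetical allowed-tools gate: rejects any tool call whose name is not
/// in the agent's allowed list. The real check in BitFun may look different.
fn check_tool_allowed(allowed_tools: &[&str], tool: &str) -> Result<(), String> {
    if allowed_tools.contains(&tool) {
        Ok(())
    } else {
        Err(format!("tool '{tool}' is not in the agent's allowed tools"))
    }
}

/// Assumed tool list before the fix: Edit is missing, so a hint that says
/// "use Edit to continue" cannot be followed.
fn tools_before_fix() -> Vec<&'static str> {
    vec!["Read", "Write"]
}

/// Assumed tool list after the fix: Edit is present.
fn tools_after_fix() -> Vec<&'static str> {
    vec!["Read", "Write", "Edit"]
}
```

With lists like these, an Edit call fails the gate before the fix and passes after it, which is the invariant the added test assertion is meant to lock in.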

This PR closes all three gaps:

- Add Edit to DeepResearchAgent::default_tools so the hint is actionable, and add a test assertion to lock in the invariant.
- Tighten the recovery hint: explicitly forbid re-reading the just-written file, point the model at its own previous tool_use.input.content as the source for old_string, and give it a clean escape hatch ("tell the user the output was truncated") when no concrete continuation plan exists.
- Add a per-session tool-call loop detector on ToolPipeline. When the same tool is called with deep-equal arguments TOOL_CALL_LOOP_THRESHOLD (=3) consecutive times in a session, the (THRESHOLD+1)-th call is rejected with an error that names the loop and instructs the model to switch tools or finish. The detector counts consecutive matches at the tail of a bounded 10-entry ring buffer, so a legitimate intervening call resets the streak. clear_session_tool_call_history is exposed so callers can drop the buffer when a session ends.
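The loop detector described in the last item could be sketched roughly as follows. Only `TOOL_CALL_LOOP_THRESHOLD` and `clear_session_tool_call_history` are names from this PR; `LoopDetector`, `ToolCall`, `check_and_record`, and `HISTORY_CAPACITY` are hypothetical, arguments are modeled as a string map instead of JSON, and the choice not to record rejected calls is an assumption.

```rust
use std::collections::{BTreeMap, HashMap, VecDeque};

const TOOL_CALL_LOOP_THRESHOLD: usize = 3;
/// Hypothetical name for the bounded 10-entry history size.
const HISTORY_CAPACITY: usize = 10;

/// Simplified tool call; derived equality stands in for deep-equal arguments.
#[derive(Clone, Debug, PartialEq, Eq)]
struct ToolCall {
    tool: String,
    args: BTreeMap<String, String>,
}

impl ToolCall {
    fn new(tool: &str, args: &[(&str, &str)]) -> Self {
        Self {
            tool: tool.to_string(),
            args: args.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect(),
        }
    }
}

/// Per-session history of recent tool calls, keyed by session id.
#[derive(Default)]
struct LoopDetector {
    history: HashMap<String, VecDeque<ToolCall>>,
}

impl LoopDetector {
    /// Rejects the call if the last THRESHOLD recorded calls in this session
    /// are all equal to it; otherwise records it and returns Ok.
    fn check_and_record(&mut self, session_id: &str, call: ToolCall) -> Result<(), String> {
        let buf = self.history.entry(session_id.to_string()).or_default();
        // Count only the consecutive matches at the tail of the buffer.
        let streak = buf.iter().rev().take_while(|prev| **prev == call).count();
        if streak >= TOOL_CALL_LOOP_THRESHOLD {
            // Assumption: a rejected call is not added to the history.
            return Err(format!(
                "tool-call loop: '{}' was called {} times in a row with identical \
                 arguments; switch tools or finish",
                call.tool, streak
            ));
        }
        if buf.len() == HISTORY_CAPACITY {
            buf.pop_front(); // drop the oldest entry to keep the buffer bounded
        }
        buf.push_back(call);
        Ok(())
    }

    /// Drops the buffer for a session, for example when the session ends.
    fn clear_session_tool_call_history(&mut self, session_id: &str) {
        self.history.remove(session_id);
    }
}
```

Counting only at the tail means a different call in between resets the streak, so repeating a call after real intervening work is not penalized.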

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

nonoqing merged commit ffa8cc5 into GCWing:main May 11, 2026
4 checks passed

nonoqing merged 1 commit into GCWing:main from nonoqing:yuyiqing/dev
