Symptom
In plan mode, the planner subagent on altimate-backend/altimate-default (the gateway alias, currently routing to a GPT-5.x class model) sometimes ends its first step without ever calling a tool. Two observed sub-modes:
- Prompt-context-only plan: the model writes a plausible-sounding plan from the prompt alone, without reading any files. The resulting plan frequently references files or symbols that may not exist, or misses existing patterns.
- Outright refusal: the model returns "I'm sorry, but I cannot assist with that request." and stops, even for benign requests (one repro: "plan a feature which allows me to turn on YOLO mode from within the interface via a button" — content-policy refusal driven by the phrase "YOLO mode").
In both cases the in-product safety net (processor.ts:351-389) fires the plan_no_tool_generation warning. Until now its copy told users to "switch to a model with stronger tool-use," which is misleading: the model is tool-capable. The actual failure modes are (a) prompting that is not strong enough to force exploration and (b) the gateway content policy declining the request.
Historical context
Jira AI-5987 [internal] tracked an earlier observation of the same failure class (61.2% failure rate / 30 of 49 plan-mode sessions over 96h, 2026-03-23) but was cancelled without a remediation in altimate-code. This issue re-opens the surface from a UX angle.
Final warning copy:
⚠️ altimate-code: the plan agent on <provider>/<model> stopped without calling any tools — it neither read, searched, nor explored the codebase. Common causes: (a) the model wrote a plan from prompt context alone, (b) the model declined to engage with the request (content-policy refusal), or (c) the request was too thin to act on. To recover, try one of: reply asking it to investigate first (read/grep/glob/explore); rephrase the request more concretely; or /model to a tier that's more eager to explore (e.g. Claude Sonnet/Opus).
Trip-wire condition and plan_no_tool_generation telemetry event shape unchanged.
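As a rough illustration of what stays unchanged and what the new copy interpolates, here is a minimal sketch. The actual logic in processor.ts is not shown in this issue, so the `StepSummary` shape, the function names, and the exact trip-wire predicate are all hypothetical; the warning text is abbreviated.

```typescript
// Hypothetical shape; the real processor.ts types are not shown in this issue.
interface StepSummary {
  agent: string;         // e.g. "plan"
  toolCallCount: number; // tool calls made during the step
  provider: string;      // e.g. "altimate-backend"
  model: string;         // e.g. "altimate-default"
}

// Assumed trip-wire: a plan-agent step that finished with zero tool calls.
function shouldWarnNoToolUse(step: StepSummary): boolean {
  return step.agent === "plan" && step.toolCallCount === 0;
}

// Fills the <provider>/<model> placeholder of the warning copy (abbreviated).
function formatNoToolWarning(step: StepSummary): string {
  return (
    `altimate-code: the plan agent on ${step.provider}/${step.model} ` +
    `stopped without calling any tools.`
  );
}

const example: StepSummary = {
  agent: "plan",
  toolCallCount: 0,
  provider: "altimate-backend",
  model: "altimate-default",
};

if (shouldWarnNoToolUse(example)) {
  console.log(formatNoToolWarning(example));
}
```

The predicate does not inspect the model's text at all, which is why the same warning covers both the prompt-context-only plan and the refusal.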
Open questions / follow-ups (out of scope for this PR)
- Should we differentiate refusals from prompt-context-plans in telemetry? Would need to look at the most recent text-end content from the step. Considered and rejected for the first cut (pattern-matching is locale-specific and brittle), but worth revisiting if telemetry shows refusals are common enough to call out separately in the UI.
- Should altimate-default route to a Claude tier for plan-agent steps specifically? The gateway has the model identity; plan-agent steps could pin to a tool-eager model independent of the user's normal /model choice. Bigger change, separate proposal.
- Long term: is the AI-5987 "61% failure" observation still accurate post-fix? Worth re-querying telemetry after this fix lands and a release cycle passes.
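To make the first open question concrete, the rejected approach might look something like the sketch below. It is not part of this PR. The `classifyNoToolStep` name, the `NoToolCause` labels, and the pattern list are hypothetical, and the patterns are English-only and deliberately incomplete.

```typescript
type NoToolCause = "refusal" | "prompt_context_plan";

// English-only, illustrative patterns. Any real list would need per-locale
// variants, which is the brittleness noted in the open question above.
const REFUSAL_PATTERNS: RegExp[] = [
  /\bI['’]m sorry, but I (?:cannot|can['’]t) (?:assist|help)\b/i,
  /\bI (?:cannot|can['’]t) (?:assist|help) with (?:that|this)\b/i,
];

// Classifies the final text of a step that made no tool calls.
function classifyNoToolStep(finalText: string): NoToolCause {
  const text = finalText.trim();
  return REFUSAL_PATTERNS.some((p) => p.test(text))
    ? "refusal"
    : "prompt_context_plan";
}

const samples = [
  "I'm sorry, but I cannot assist with that request.",
  "Plan: 1. Add a button to the settings panel.",
  "Lo siento, no puedo ayudar con esa solicitud.", // Spanish refusal, missed
];

for (const s of samples) {
  console.log(classifyNoToolStep(s), "<-", s);
}
```

The third sample is a refusal in Spanish that falls through to "prompt_context_plan", which shows why simple pattern-matching would undercount refusals in telemetry.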
Verification
- 160 affected tests pass; typecheck clean.
- Preview binary built locally from fix/plan-agent-tool-use (version string 0.0.0-fix/plan-agent-tool-use-202606041209). Manual repro of the YOLO refusal scenario confirms the new warning copy is shipped and accurately describes the observed behavior.