plan agent stops without tools on altimate-default (prompt-only plans + content-policy refusals)

## Symptom

In plan mode, the planner subagent on `altimate-backend/altimate-default` (the gateway alias, currently routing to a GPT-5.x class model) sometimes ends its first step without ever calling a tool. Two observed sub-modes:

1. **Prompt-context-only plan**: the model writes a plausible-sounding plan from the prompt alone without reading any files. The resulting plan frequently references files or symbols that may not exist, or misses existing patterns.
2. **Outright refusal**: the model returns "I'm sorry, but I cannot assist with that request." and stops, even for benign requests (one repro: *"plan a feature which allows me to turn on YOLO mode from within the interface via a button"* — content-policy refusal driven by the phrase "YOLO mode").

In both cases the in-product safety net (`processor.ts:351-389`) fires the `plan_no_tool_generation` warning. Until now its copy told users to "switch to a model with stronger tool-use," which is misleading: the model **is** tool-capable. The actual failure modes are (a) prompt engineering that is not strong enough to force exploration and (b) the gateway content policy declining the request.
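For readers without the source open, the trip-wire can be pictured roughly as below. This is a minimal sketch, not the code at `processor.ts:351-389`: the `StepPart` and `StepSummary` types and the `stoppedWithoutTools` function are hypothetical names invented for illustration, and the real condition may check more than this.

```typescript
// Hypothetical shapes: the real step types in processor.ts are not shown in this issue.
type StepPart =
  | { type: "text"; text: string }
  | { type: "tool-call"; tool: string };

interface StepSummary {
  agent: string; // e.g. "plan"
  parts: StepPart[]; // everything the model emitted during the step
}

// Returns true when a plan-agent step finished without a single tool call.
// Both sub-modes (prompt-only plan and outright refusal) look identical here,
// which is why one warning has to cover both.
function stoppedWithoutTools(step: StepSummary): boolean {
  if (step.agent !== "plan") return false;
  return !step.parts.some((part) => part.type === "tool-call");
}
```

Under these assumptions a refusal and a prompt-only plan both produce a text-only step, so the condition alone cannot tell them apart.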

## Historical context

Jira **AI-5987** [internal] tracked an earlier observation of the same failure class (61.2% failure rate / 30 of 49 plan-mode sessions over 96h, 2026-03-23) but was cancelled without a remediation in altimate-code. This issue re-opens the surface from a UX angle.

Final warning copy, replacing the old "switch to a model with stronger tool-use" text:

> ⚠️ altimate-code: the `plan` agent on `<provider>/<model>` stopped without calling any tools — it neither read, searched, nor explored the codebase. Common causes: (a) the model wrote a plan from prompt context alone, (b) the model declined to engage with the request (content-policy refusal), or (c) the request was too thin to act on. To recover, try one of: reply asking it to investigate first (`read`/`grep`/`glob`/`explore`); rephrase the request more concretely; or `/model` to a tier that's more eager to explore (e.g. Claude Sonnet/Opus).

The trip-wire condition and the shape of the `plan_no_tool_generation` telemetry event are unchanged.

## Open questions / follow-ups (out of scope for this PR)

- Should we differentiate refusals from prompt-context-only plans in telemetry? Doing so would require inspecting the most recent `text-end` content from the step. This was considered and rejected for the first cut because pattern-matching is locale-specific and brittle, but it is worth revisiting if telemetry shows refusals are common enough to call out separately in the UI.
- Should `altimate-default` route to a Claude tier for plan-agent steps specifically? The gateway has the model identity; plan-agent steps could pin to a tool-eager model independent of the user's normal `/model` choice. Bigger change, separate proposal.
- Long term: is the AI-5987 "61% failure" observation still accurate post-fix? Worth re-querying telemetry after this fix lands and a release cycle passes.
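
To make the first open question concrete, here is a rough sketch of what refusal detection on the last text of a step might look like. Everything in it is an assumption for illustration: the `StopKind` labels, the `classifyNoToolStop` name, and the English-only patterns are hypothetical and not part of altimate-code.

```typescript
// Hypothetical labels for the two sub-modes described under "Symptom".
type StopKind = "refusal" | "prompt_only_plan";

// Assumed English-only refusal phrasings; a real list would need far more coverage.
const REFUSAL_PATTERNS: RegExp[] = [
  /\bI(?:'|’)m sorry, but I (?:cannot|can't|can’t) (?:assist|help) with/i,
  /\bI (?:cannot|can't|can’t) (?:assist|help) with (?:that|this) request/i,
];

// Classifies the final text of a no-tool step. Anything that does not match
// a known refusal phrasing falls through to "prompt_only_plan".
function classifyNoToolStop(lastText: string): StopKind {
  const text = lastText.trim();
  return REFUSAL_PATTERNS.some((re) => re.test(text))
    ? "refusal"
    : "prompt_only_plan";
}
```

The sketch also shows the brittleness that led to the idea being rejected for the first cut: a refusal written in another language, or in slightly different wording, would be counted as a prompt-only plan.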

## Verification

- 160 affected tests pass; typecheck clean.
- Preview binary built locally from `fix/plan-agent-tool-use` (version string `0.0.0-fix/plan-agent-tool-use-202606041209`). Manual repro of the YOLO refusal scenario confirms the new warning copy is shipped and accurately describes the observed behavior.

