Subagent spawn fails near context limit due to truncated tool-call JSON #33

## Summary
When the context cache is nearly full, Late can emit an oversized `spawn_subagent` tool call. The tool-call argument JSON appears to be truncated mid-string, and the backend (llama.cpp's OpenAI-compatible server) returns HTTP 500 with a parse error. The subagent spawn then fails outright instead of recovering gracefully.

## Environment
1. OS: Ubuntu 24.04 LTS
2. Client: Late (planning flow with subagent spawning)
3. Backend: llama.cpp / llama-swap router
4. Endpoint: `http://127.0.0.1/v1/chat/completions`
5. Context state at failure: near max (`n_tokens = 262143`, `truncated = 1`)

## Steps to Reproduce
1. Run a long planning session until the context cache is nearly full.
2. Trigger a `spawn_subagent` call with a large multi-paragraph goal payload.
3. Let the model stream tool-call arguments.
4. Observe backend response and Late behavior.

## Expected Behavior
1. Late should avoid sending oversized tool-call arguments when the context is near its limit.
2. If the tool-call JSON is truncated or otherwise invalid, Late should recover gracefully, for example by retrying with compact arguments or asking the model for a smaller payload.
3. The error surfaced to the user should be actionable.
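
To illustrate the recovery path in item 2, here is a minimal Python sketch. It assumes Late can inspect the raw argument string before dispatching the tool call or re-sending it to the backend. `parse_tool_arguments`, `spawn_with_retry`, and the `request_args` callback are hypothetical names used for illustration only, not part of Late or llama.cpp.

```python
import json
from typing import Callable, Optional


def parse_tool_arguments(raw: str) -> tuple[Optional[dict], Optional[str]]:
    """Return (args, None) on success, or (None, reason) if the JSON is unusable."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        # A string cut off mid-value (as in this report) ends up here.
        return None, f"invalid tool-call JSON at char {exc.pos}: {exc.msg}"
    if not isinstance(parsed, dict):
        return None, "tool-call arguments must be a JSON object"
    return parsed, None


def spawn_with_retry(request_args: Callable[[bool], str], max_attempts: int = 2) -> dict:
    """Ask for arguments; if they do not parse, ask again in compact mode.

    request_args(compact) is a hypothetical callback that returns the raw
    argument string produced by the model. compact=True means the re-prompt
    asked for a shorter payload.
    """
    reason: Optional[str] = "no attempts made"
    for attempt in range(max_attempts):
        compact = attempt > 0  # first try is normal, later tries are compact
        args, reason = parse_tool_arguments(request_args(compact))
        if args is not None:
            return args
    # Actionable message: names the tool, the attempt count, and the parse error.
    raise ValueError(
        f"spawn_subagent arguments unusable after {max_attempts} attempts: {reason}"
    )
```

Validating locally means a truncated payload never reaches the backend, and the final error carries the parse position and message instead of a bare HTTP 500.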

## Actual Behavior
1. Backend returns HTTP 500 due to malformed tool-call arguments JSON.
2. Subagent spawn fails.
3. Delegation step does not complete.

## Error Excerpt
LLAMA-CPP ERROR:
```
prompt eval time = 221.03 ms / 22 tokens
eval time = 88800.20 ms / 1535 tokens
total time = 89021.23 ms / 1557 tokens
slot release: n_tokens = 262143, truncated = 1
got exception: Failed to parse tool call arguments as JSON: parse error ... invalid string: missing closing quote
POST /v1/chat/completions ... 500
```

## Impact
1. Long-running sessions become unreliable right when delegation is needed most.
2. High-context workflows can fail unpredictably, depending on the size of the tool-call arguments.
3. Users lose time recovering from failed subagent spawns.

## Suggested Fix Direction
1. Add a preflight context-budget check before sending requests that include tools.
2. Enforce a maximum size for `spawn_subagent` goal payloads.
3. On malformed tool-call argument errors, retry automatically with a compact re-prompt.
4. Surface the backend error body in Late so the exact failure reason is visible.
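
Items 1 and 2 could be sketched roughly as follows. This is an illustration under stated assumptions, not Late's implementation: the 4-characters-per-token estimate, the default thresholds, and the function names are all hypothetical, and the context limit of 262144 in the usage note is inferred from the `n_tokens = 262143` log line rather than confirmed.

```python
CHARS_PER_TOKEN = 4  # rough heuristic (assumption); prefer the real tokenizer if available


def estimate_tokens(text: str) -> int:
    """Approximate the token count of text, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def preflight(
    used_tokens: int,
    context_limit: int,
    goal: str,
    reserve_tokens: int = 4096,   # hypothetical headroom needed for the tool call
    max_goal_tokens: int = 1024,  # hypothetical cap on the goal payload
) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK to send."""
    problems = []
    remaining = context_limit - used_tokens
    if remaining < reserve_tokens:
        problems.append(
            f"only {remaining} tokens of context left, need at least {reserve_tokens}"
        )
    goal_tokens = estimate_tokens(goal)
    if goal_tokens > max_goal_tokens:
        problems.append(
            f"goal is about {goal_tokens} tokens, limit is {max_goal_tokens}"
        )
    return problems
```

With the numbers from this report, `preflight(262143, 262144, goal)` would report the missing headroom before any request is sent, and each returned message can be shown to the user as is.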

## Temporary Workarounds
1. Start a fresh session before spawning subagents when context is large.
2. Keep subagent goal text short and atomic.
3. Put long implementation details in a file and reference it instead of embedding large payloads.
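
The third workaround can be sketched as a small helper. It assumes the spawned subagent can read files from a shared working directory; `externalize_goal` and the `subagent_brief.md` filename are hypothetical names chosen for this example.

```python
from pathlib import Path


def externalize_goal(summary: str, details: str, directory: Path) -> str:
    """Write long details to a file and return a short goal that points at it."""
    directory.mkdir(parents=True, exist_ok=True)
    brief = directory / "subagent_brief.md"  # hypothetical filename
    brief.write_text(details, encoding="utf-8")
    # The goal stays short no matter how long the details are.
    return f"{summary} Full details: {brief}"
```

The goal passed to `spawn_subagent` then contains only the one-line summary and a path, so its size no longer scales with the implementation notes.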
