Problem
Users hit runtime errors when an agent's effective tool count exceeds the model provider's hard cap. Surfaced today as:
Error: No output generated. Check the stream for errors.
Cause(openai/gpt-5): Invalid 'tools': array too long.
Expected an array with maximum length 128, but got an array with length 216 instead.
[400] { "error": { "message": "Invalid 'tools': array too long. Expected an array with maximum length 128, but got an array with length 216 instead.",
"type": "invalid_request_error", "param": "tools", "code": "array_above_max_length" } }
Limit varies by provider:
- OpenAI (GPT-5 family): 128 function tools per request
- Anthropic (Claude): effectively unlimited within token budget
- Other providers TBD
User sees this only at first run, after they've gone through onboard + skill selection + MCP server binding + save. By the time they hit it, multiple configuration choices need to be unwound to fit under the limit.
Proposal
Validate at agent create / update, before the configuration is persisted (or on first dry-run after save). Two layers:
- Server-side (apps/main agent CRUD endpoint) — authoritative. Reject the request with a typed error listing the offending count + limit + which buckets contributed (
builtin + skills + per-MCP-server).
- Console-side — surface a running tally in the agent form and a warning chip when it crosses the chosen model's limit before submit.
Open design questions
-
What counts as a "tool"?
- Builtin tools (bash, read, edit, web_fetch, etc.)
- Each Skill the agent has access to (Skills are usually 1 tool each, but a skill that wraps a Skill tool can introspect more)
- Each MCP server's exposed tools — requires introspection (the agent might not know the count until it actually connects)
-
MCP introspection at validate time?
- We could call MCP server's
tools/list at agent create to count, but that requires live MCP creds + a synchronous network roundtrip per server
- Or estimate based on a stored count cached from prior calls
- Or punt: count MCP servers instead of tools, and apply a heuristic multiplier (e.g. assume 20 tools/server)
-
Limit source of truth?
- Hard-code per-model in
apps/main/src/model-cards (or similar)
- Or fetch from provider's API on first use and cache
-
Failure mode when over limit?
- Hard-reject at save (forces user to drop something)
- Allow save but mark agent as
state: needs_attention, refuse to spawn sessions until fixed
- Allow save + run, let the run fail at first invocation (current behavior)
Why now
Surfaced 2026-05-18 by a hosted user running an over-stuffed agent against GPT-5. Likely will recur as users add more MCP servers — Console doesn't currently show how the count compounds.
Out of scope (separate issues)
- Per-tool token budgeting (different problem; some tools have huge schemas)
- Model-side "tools too large total bytes" — distinct from count limit
Problem
Users hit runtime errors when an agent's effective tool count exceeds the model provider's hard cap. Surfaced today as:
Limit varies by provider:
User sees this only at first run, after they've gone through onboard + skill selection + MCP server binding + save. By the time they hit it, multiple configuration choices need to be unwound to fit under the limit.
Proposal
Validate at agent create / update, before the configuration is persisted (or on first dry-run after save). Two layers:
builtin + skills + per-MCP-server).Open design questions
What counts as a "tool"?
MCP introspection at validate time?
tools/listat agent create to count, but that requires live MCP creds + a synchronous network roundtrip per serverLimit source of truth?
apps/main/src/model-cards(or similar)Failure mode when over limit?
state: needs_attention, refuse to spawn sessions until fixedWhy now
Surfaced 2026-05-18 by a hosted user running an over-stuffed agent against GPT-5. Likely will recur as users add more MCP servers — Console doesn't currently show how the count compounds.
Out of scope (separate issues)