Skip to content

Validate tool count against model provider limits at agent create/update #88

@hrhrng

Description

@hrhrng

Problem

Users hit runtime errors when an agent's effective tool count exceeds the model provider's hard cap. Surfaced today as:

Error: No output generated. Check the stream for errors.
Cause(openai/gpt-5): Invalid 'tools': array too long.
Expected an array with maximum length 128, but got an array with length 216 instead.
[400] { "error": { "message": "Invalid 'tools': array too long. Expected an array with maximum length 128, but got an array with length 216 instead.",
"type": "invalid_request_error", "param": "tools", "code": "array_above_max_length" } }

Limit varies by provider:

  • OpenAI (GPT-5 family): 128 function tools per request
  • Anthropic (Claude): effectively unlimited within token budget
  • Other providers TBD

User sees this only at first run, after they've gone through onboard + skill selection + MCP server binding + save. By the time they hit it, multiple configuration choices need to be unwound to fit under the limit.

Proposal

Validate at agent create / update, before the configuration is persisted (or on first dry-run after save). Two layers:

  1. Server-side (apps/main agent CRUD endpoint) — authoritative. Reject the request with a typed error listing the offending count + limit + which buckets contributed (builtin + skills + per-MCP-server).
  2. Console-side — surface a running tally in the agent form and a warning chip when it crosses the chosen model's limit before submit.

Open design questions

  1. What counts as a "tool"?

    • Builtin tools (bash, read, edit, web_fetch, etc.)
    • Each Skill the agent has access to (Skills are usually 1 tool each, but a skill that wraps a Skill tool can introspect more)
    • Each MCP server's exposed tools — requires introspection (the agent might not know the count until it actually connects)
  2. MCP introspection at validate time?

    • We could call MCP server's tools/list at agent create to count, but that requires live MCP creds + a synchronous network roundtrip per server
    • Or estimate based on a stored count cached from prior calls
    • Or punt: count MCP servers instead of tools, and apply a heuristic multiplier (e.g. assume 20 tools/server)
  3. Limit source of truth?

    • Hard-code per-model in apps/main/src/model-cards (or similar)
    • Or fetch from provider's API on first use and cache
  4. Failure mode when over limit?

    • Hard-reject at save (forces user to drop something)
    • Allow save but mark agent as state: needs_attention, refuse to spawn sessions until fixed
    • Allow save + run, let the run fail at first invocation (current behavior)

Why now

Surfaced 2026-05-18 by a hosted user running an over-stuffed agent against GPT-5. Likely will recur as users add more MCP servers — Console doesn't currently show how the count compounds.

Out of scope (separate issues)

  • Per-tool token budgeting (different problem; some tools have huge schemas)
  • Model-side "tools too large total bytes" — distinct from count limit

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions