Validate tool count against model provider limits at agent create/update

## Problem

Users hit runtime errors when an agent's effective tool count exceeds the model provider's hard cap. Surfaced today as:

```
Error: No output generated. Check the stream for errors.
Cause(openai/gpt-5): Invalid 'tools': array too long.
Expected an array with maximum length 128, but got an array with length 216 instead.
[400] { "error": { "message": "Invalid 'tools': array too long. Expected an array with maximum length 128, but got an array with length 216 instead.",
"type": "invalid_request_error", "param": "tools", "code": "array_above_max_length" } }
```

Limit varies by provider:
- **OpenAI** (GPT-5 family): 128 function tools per request
- **Anthropic** (Claude): effectively unlimited within token budget
- Other providers TBD

User sees this only at first run, after they've gone through onboard + skill selection + MCP server binding + save. By the time they hit it, multiple configuration choices need to be unwound to fit under the limit.

## Proposal

Validate at **agent create / update**, before the configuration is persisted (or on first dry-run after save). Two layers:

1. **Server-side** (apps/main agent CRUD endpoint) — authoritative. Reject the request with a typed error listing the offending count + limit + which buckets contributed (`builtin + skills + per-MCP-server`).
2. **Console-side** — surface a running tally in the agent form and a warning chip when it crosses the chosen model's limit before submit.

## Open design questions

1. **What counts as a "tool"?**
   - Builtin tools (bash, read, edit, web_fetch, etc.)
   - Each Skill the agent has access to (Skills are usually 1 tool each, but a skill that wraps a Skill tool can introspect more)
   - Each MCP server's exposed tools — requires introspection (the agent might not know the count until it actually connects)

2. **MCP introspection at validate time?**
   - We could call MCP server's `tools/list` at agent create to count, but that requires live MCP creds + a synchronous network roundtrip per server
   - Or estimate based on a stored count cached from prior calls
   - Or punt: count MCP **servers** instead of **tools**, and apply a heuristic multiplier (e.g. assume 20 tools/server)

3. **Limit source of truth?**
   - Hard-code per-model in `apps/main/src/model-cards` (or similar)
   - Or fetch from provider's API on first use and cache

4. **Failure mode when over limit?**
   - Hard-reject at save (forces user to drop something)
   - Allow save but mark agent as `state: needs_attention`, refuse to spawn sessions until fixed
   - Allow save + run, let the run fail at first invocation (current behavior)

## Why now

Surfaced 2026-05-18 by a hosted user running an over-stuffed agent against GPT-5. Likely will recur as users add more MCP servers — Console doesn't currently show how the count compounds.

## Out of scope (separate issues)

- Per-tool token budgeting (different problem; some tools have huge schemas)
- Model-side "tools too large total bytes" — distinct from count limit

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Validate tool count against model provider limits at agent create/update #88

Problem

Proposal

Open design questions

Why now

Out of scope (separate issues)

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Validate tool count against model provider limits at agent create/update #88

Description

Problem

Proposal

Open design questions

Why now

Out of scope (separate issues)

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions