diff --git a/src/content/docs/api/mcp-endpoint.mdx b/src/content/docs/api/mcp-endpoint.mdx
index a6584af..08f0686 100644
--- a/src/content/docs/api/mcp-endpoint.mdx
+++ b/src/content/docs/api/mcp-endpoint.mdx
@@ -97,8 +97,8 @@ Each initialize creates a new MCP session tracked server-side.
 
 | Setting | Env var | Default | Description |
 |---------|---------|---------|-------------|
-| Max concurrent sessions | `MCP_MAX_SESSIONS` | `100` | New initialize returns `429` once this is reached. |
-| Session TTL | `MCP_SESSION_TTL_SECONDS` | `28800` (8 h) | Idle timeout. Each request resets the clock — an actively-used session never expires. |
+| Max concurrent sessions | `MCP_MAX_SESSIONS` | `100` | Soft cap. A new initialize at capacity evicts the least-recently-used session and is admitted; it never fails with `429`. |
+| Session TTL | `MCP_SESSION_TTL_SECONDS` | `28800` (8 h) | Idle timeout. Each request resets the clock — an actively-used session never expires. Idle sessions are closed by a periodic sweep on the serving pod. |
 
 Sessions are also closed on `DELETE /mcp` (with a valid `mcp-session-id`), transport close, or server shutdown.
 
@@ -211,7 +211,7 @@ The external client sees a flat list of tools. NimbleBrain resolves the caller's
 - **Organization scoping** — when `organizationId` is set, only members of that WorkOS organization can authenticate.
 - **Workspace isolation** — every request is scoped to one workspace; tools cannot reach data in other workspaces.
 - **Feature and role filtering** — disabled and admin-only tools are never exposed.
-- **Session limits** — `MCP_MAX_SESSIONS` prevents resource exhaustion from runaway clients.
+- **Session limits** — `MCP_MAX_SESSIONS` bounds memory; the LRU eviction policy means a runaway client cannot deny service to others.
 - **Bundle env isolation** — bundle processes receive a filtered host environment; the MCP endpoint does not bypass that.
 
 ## Setup (operators)
diff --git a/src/content/docs/config/environment.mdx b/src/content/docs/config/environment.mdx
index a1abbb1..c097cb7 100644
--- a/src/content/docs/config/environment.mdx
+++ b/src/content/docs/config/environment.mdx
@@ -31,8 +31,8 @@ NimbleBrain reads several environment variables for authentication, server bindi
 | `NB_WEB_URL` | No | falls back to `NB_API_URL`, then the request origin | Public origin of the web app. Used to build post-OAuth redirect targets back to the SPA. Only needed when the API and web are served from different origins. |
 | `NB_HOST_URL` | No | `http://localhost:27247` | Public host URL passed to bundle subprocesses (used by automation executors and similar). Distinct from `NB_API_URL` — the OAuth callback URL is derived from `NB_API_URL`, not this. |
 | `NB_REGISTRIES` | No | — | JSON array of registry configs that overrides the persisted `registries.json` for the lifetime of the process. Adds custom `static` / `mpak` / `mcp` registries without touching pod filesystem. The locked bundled-static registry is preserved automatically. See [Connectors Catalog → Operator override](/config/connectors-catalog/#operator-override). |
-| `MCP_MAX_SESSIONS` | No | `100` | Max concurrent MCP sessions on `/mcp`. New initialize returns `429` once reached. Non-positive / non-integer values are rejected with a warning and the default is used. **Sized in tandem with `MCP_SESSION_TTL_SECONDS`:** longer TTLs hold more sessions in memory before they're swept, so a fleet of 100 connectors held open all day will keep this cap pegged. Raise the cap or shorten the TTL if you have more concurrent connectors than the default headroom. |
-| `MCP_SESSION_TTL_SECONDS` | No | `28800` (8 h) | MCP session inactivity TTL in seconds. Each request resets the clock — actively-used sessions never expire. Non-positive / non-integer values are rejected with a warning and the default is used. Also configurable via [`sessionStore.ttlSeconds`](/config/nimblebrain-json#sessionstore) in `nimblebrain.json`. |
+| `MCP_MAX_SESSIONS` | No | `100` | Soft cap on concurrent MCP sessions on `/mcp`. **A new initialize at capacity evicts the least-recently-used session** rather than returning an error, so well-formed clients always succeed. The cap is a memory-budget knob, not a rate limiter. Non-positive / non-integer values are rejected with a warning and the default is used. Watch for `[mcp] evicting transport reason=pressure` log lines: if those spike, your natural concurrency exceeds the cap and you should raise it (or shorten `MCP_SESSION_TTL_SECONDS` to release idle sessions sooner). |
+| `MCP_SESSION_TTL_SECONDS` | No | `28800` (8 h) | MCP session inactivity TTL in seconds. A periodic sweep closes any session whose last request is past this window — both the in-memory transport on the serving pod and its metadata in the session store. Each request resets the clock, so an actively-used session never expires. Non-positive / non-integer values are rejected with a warning and the default is used. Also configurable via [`sessionStore.ttlSeconds`](/config/nimblebrain-json#sessionstore) in `nimblebrain.json`. |
 
 ### Auth & identity