From 971eefed3ab2d23c57a4444c07d4a3e422e0a29e Mon Sep 17 00:00:00 2001
From: Ragnor Comerford <hello@ragnor.co>
Date: Wed, 10 Jun 2026 21:49:52 +0200
Subject: [PATCH 1/5] docs(rfc-003): canonical MCP implementation blueprint
 (omnigraph-mcp crate)

Brings RFC-003 to the canonical from-scratch spec for the in-server MCP surface:
a dedicated omnigraph-mcp transport crate + McpBackend trait (server -> mcp dep
to avoid a Cargo cycle), the verified rmcp 1.7 design (conformance MUSTs for
free), backend reuse (do_* / run_query / run_mutate / authorize / api), the
13-tool catalog + Cedar mapping, ParamKind -> JSON Schema, deny-masking +
isError split, two resources, per-graph routing (graphs_list stays REST-only),
auth + client-compat matrix + OAuth fast-follow, tests, and locked decisions.

Build the implementation on this branch from blueprint sections B.1-B.13. The
spec is informed by the #157 reference implementation.
---
 docs/dev/rfc-003-mcp-server-surface.md | 159 ++++++++++++++++++++++++-
 1 file changed, 153 insertions(+), 6 deletions(-)

diff --git a/docs/dev/rfc-003-mcp-server-surface.md b/docs/dev/rfc-003-mcp-server-surface.md
index 32fbce5a..09542d39 100644
--- a/docs/dev/rfc-003-mcp-server-surface.md
+++ b/docs/dev/rfc-003-mcp-server-surface.md
@@ -1,6 +1,6 @@
 # RFC: MCP Server Surface for `omnigraph-server` — Full Tool Parity, Stored Queries, Modular Auth
 
-**Status:** Proposed
+**Status:** Reference implementation shipped in [omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157) (proved rmcp 1.7 on edition 2024, the auth-extension passthrough, and the full tool/resource/Cedar surface). **This RFC is now the canonical spec for a clean reimplementation from `main`** that lands the surface with a dedicated `omnigraph-mcp` crate from the start — build from the **[Implementation Blueprint](#implementation-blueprint-canonical--build-the-fresh-implementation-from-this)** below, which incorporates the as-built reality and supersedes the §5 sketches where they differ. Deferred to follow-ups: per-query `invoke_query` scope (PR 0b), the OAuth/RFC-9728 layer (MR-956), and the stdio→proxy collapse.
 **Date:** 2026-06-01
 **Tickets:** MR-969 (stored queries + MCP exposure — the surface this completes), MR-956 (federated auth / WorkOS OAuth — the auth substrate this consumes), MR-971 (per-server credential resolver), MR-974 (agent setup surface — the installer that wires this), MR-668 (multi-graph server — shipped, the routing this builds on)
 **Builds on:** [omnigraph#128](https://github.com/ModernRelay/omnigraph/pull/128) (`ragnorc/stored-queries-mcp`) — the shipped stored-query registry, `GET /queries`, `POST /queries/{name}`, and the coarse `invoke_query` gate.
@@ -11,7 +11,7 @@
 
 Add a first-class **MCP (Model Context Protocol) server surface to `omnigraph-server`**, exposed over **Streamable HTTP**, that projects the server's operations as MCP tools and resources for LLM clients (Claude Code/Desktop/web, Cursor, etc.). Two populations of tools share one projection path:
 
-1. **Built-in operational tools** — parity with the existing `@modernrelay/omnigraph-mcp` stdio package's **13 tools** (`health`, `snapshot`, `read`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, `change`, `ingest`, `branches_create`, `branches_delete`, `branches_merge`, `schema_apply`) and its **2 resources** (`omnigraph://schema`, `omnigraph://branches`), plus a new server-scoped `graphs_list` tool and an `omnigraph://graphs` resource (multi-graph mode).
+1. **Built-in operational tools** — parity with the existing `@modernrelay/omnigraph-mcp` stdio package's **13 tools** (`health`, `snapshot`, `read`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, `change`, `ingest`, `branches_create`, `branches_delete`, `branches_merge`, `schema_apply`) and its **2 resources** (`omnigraph://schema`, `omnigraph://branches`). (Server-scoped graph discovery — a `graphs_list` tool / `omnigraph://graphs` resource — was considered but **dropped from MCP**; it stays REST-only via `GET /graphs`. See [B.10](#b10-routing-resolved).)
 2. **Dynamic stored-query tools** — one MCP tool per `mcp.expose: true` entry in the `queries:` registry (MR-969 / #128), with parameters typed from the `.gq` declaration via the shipped `query_catalog_entry` / `param_descriptor` projection.
 
 Every tool is **authorized by the server's existing Cedar policy engine**. The MCP layer never implements its own authentication: it consumes an **already-resolved `ResolvedActor`** from the server's bearer middleware (`require_bearer_auth` today; the `TokenVerifier` seam when MR-956 lands), so the **same MCP endpoint serves on-prem (static or customer-OIDC tokens) and our cloud (WorkOS OAuth) by configuration only**. Cloud OAuth is an additive layer (RFC 9728 protected-resource metadata) that slots in with zero MCP changes.
@@ -20,6 +20,147 @@ The end-state collapses two diverging tool implementations into one: the in-serv
 
 > **Key caveat, stated up front (see §5.9 below):** the headline "a token scoped via Cedar to a *specific set* of stored queries" requires **per-query `invoke_query` scope**, which is *designed* (rfc-001) but **not yet implemented** — the shipped action is coarse (any stored query on the graph, or none). Per-actor Cedar curation works today for *built-in vs ad-hoc vs admin* tools and for *stored-vs-ad-hoc*; sub-selecting individual stored queries per actor is gated on a prerequisite (PR 0b). Until then, stored-query curation is graph-level (registry membership + `mcp.expose`).
 
+## Implementation Blueprint (canonical — build the fresh implementation from this)
+
+> This section is the **authoritative, verified spec**. A reference implementation shipped in [omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157) and proved every choice here: rmcp **1.7.0** integrates on edition 2024, the auth-extension passthrough works, all conformance MUSTs are met, and the surface splits cleanly into a transport crate + a server backend. Build a clean implementation from `main` by following B.1–B.13 in order. Where this blueprint and the older §5 design sketches differ, **the blueprint wins** (the §5 text is retained as design rationale). Every reuse point named here exists in the engine/server today.
+
+### B.1 Crate architecture & dependency direction
+
+Split the surface into a **transport crate** plus a **server-side backend**:
+
+- **`crates/omnigraph-mcp`** (new) — the rmcp Streamable-HTTP transport, the `McpBackend` trait, and rmcp model re-exports. `rmcp` (+ `tower-http`'s `limit` feature) live **only** here.
+- **`omnigraph-server`** depends on `omnigraph-mcp` and *implements* `McpBackend`. All omnigraph-specific tool/resource/Cedar/dispatch logic lives in the server.
+
+**The dependency MUST go `omnigraph-server → omnigraph-mcp`, never the reverse.** The server binary mounts `/mcp`, so a `mcp → server` dependency cycles at the package level (`server-bin → omnigraph-mcp → server-lib`), which Cargo rejects. The trait inverts the direction: the crate defines the seam, the server fills it. This is also *why* the crate cannot reach server internals (`AppState`, `do_*`, `authorize`, `api::*`) — it abstracts over them.
+
+`omnigraph-mcp/Cargo.toml`: `edition = "2024"`, version-locked to the workspace. Deps: `rmcp = { version = "1.7", default-features = false, features = ["server", "transport-streamable-http-server"] }`, `axum` (for `Router`), `http`, `tower-http = { features = ["limit"] }`, `tokio`, `async-trait`, `serde_json`. Add `"crates/omnigraph-mcp"` to the workspace `members`; in `omnigraph-server/Cargo.toml` drop `rmcp` and the `tower-http` `limit` feature, add `omnigraph-mcp`.
+
+### B.2 The `McpBackend` seam
+
+Use `#[async_trait]` (boxed futures sidestep the async-fn-in-trait `Send`-bound friction; the cost is negligible at MCP QPS, and the server already depends on `async-trait`):
+
+```rust
+#[async_trait]
+pub trait McpBackend: Clone + Send + Sync + 'static {
+    fn server_info(&self) -> ServerInfo;
+    async fn list_tools(&self, parts: &http::request::Parts) -> Result<Vec<Tool>, McpError>;
+    async fn call_tool(&self, parts: &http::request::Parts, name: &str, args: JsonObject)
+        -> Result<CallToolResult, McpError>;
+    async fn list_resources(&self, parts: &http::request::Parts) -> Result<Vec<Resource>, McpError>;
+    async fn read_resource(&self, parts: &http::request::Parts, uri: &str)
+        -> Result<ReadResourceResult, McpError>;
+}
+```
+
+**Why `&http::request::Parts` (the load-bearing mechanism — verified in rmcp source and `#157`).** rmcp's `StreamableHttpService` injects the original request's `http::request::Parts` into `RequestContext.extensions`. The server's `require_bearer_auth` + `resolve_graph_handle` middleware run *before* the MCP service and insert `ResolvedActor` + `Arc<GraphHandle>` into the request's extensions. The crate hands `&Parts` to the backend; the backend reads its own types from `parts.extensions`. The crate never names an omnigraph type → auth stays decoupled (§5.8) and the crate is reusable. The crate re-exports the rmcp model types the backend needs: `McpError (= rmcp::ErrorData)`, `Tool`, `CallToolResult`, `Content`, `JsonObject`, `Resource`, `RawResource`, `ResourceContents`, `ReadResourceResult`, `ServerInfo`, `ServerCapabilities`, `ToolAnnotations`, `Annotated` — so the server uses rmcp types via `omnigraph_mcp::…` and carries no direct rmcp dep.
+
+### B.3 Transport (lives in `omnigraph-mcp`)
+
+- A generic `struct McpService<B>` implements rmcp's `ServerHandler`, delegating each method to `B` after extracting `&Parts` from `ctx.extensions` once (missing → `McpError::internal_error`). `get_info → backend.server_info()`; `initialize`/`ping` use rmcp's defaults.
+- `pub fn mcp_router<B: McpBackend>(backend: B, body_limit: usize) -> axum::Router`:
+  - `let svc = StreamableHttpService::new(move || Ok(McpService::new(backend.clone())), Arc::new(LocalSessionManager::default()), config)` with `config = StreamableHttpServerConfig::default().with_stateful_mode(false).with_json_response(true)` (`StreamableHttpServerConfig` is `#[non_exhaustive]` — build from `Default`, mutate via the `with_*` setters; keep the loopback `allowed_hosts` default).
+  - Return `Router::new().route_service("/mcp", svc).layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit))`. rmcp reads the body directly (not via an axum extractor), so axum's `DefaultBodyLimit` does **not** bound `/mcp`; the tower-http layer does.
+- **Stateless mode delivers these conformance MUSTs for free (verified against rmcp 1.7 source):** `GET`/`DELETE /mcp` → `405` (with `Allow`); a disallowed `Host`/`Origin` → `403` (loopback hosts by default — DNS-rebind guard); `MCP-Protocol-Version` → `400` on unsupported, default `2025-03-26` when absent. **No conformance middleware is needed** (the §5.6 "honour the header" footnote is satisfied by rmcp).
+- **Client/test obligations rmcp enforces:** the request must carry `Accept: application/json, text/event-stream` (both), `Content-Type: application/json`, and a `Host` header. rmcp negotiates `protocolVersion` (a recent client sees `2025-11-25`).
+
+### B.4 Server-side backend (lives in `omnigraph-server`)
+
+`struct OmnigraphMcpBackend { state: AppState }` (derive `Clone` — `AppState` is already `#[derive(Clone)]`, `Arc`-backed) implements `McpBackend`. Per request it resolves the actor + handle from `parts.extensions.get::<ResolvedActor>()` / `get::<Arc<GraphHandle>>()`.
+
+**Reuse, never reinvent.** First factor **10 thin `do_*` fns** out of the inline `server_*` HTTP handlers (each is `authorize_request(...) → engine call → DTO`) so REST and MCP dispatch one path: `do_snapshot`, `do_schema_get`, `do_branches_list`, `do_commits_list`, `do_commit_show`, `do_ingest`, `do_branch_create`, `do_branch_delete`, `do_branch_merge`, `do_schema_apply`. (No `do_graphs_list` — `graphs_list` is not an MCP tool, see B.10; `server_graphs_list` stays inline for `GET /graphs`.) Land that as a behavior-neutral refactor commit first (it keeps the REST handlers as thin wrappers; all server tests stay green). Then reuse as-is: `run_query` / `run_mutate` (already decoupled from request bodies), `authorize` → `Authz { Allowed, Denied(msg) }` (with `Err` reserved for operational 401/500), `api::query_catalog_entry` / `ParamKind` / `read_output`, `ApiError` (add `pub(crate) status_code()` + `message_str()` accessors for the error classifier). Mount in `build_app`: `.merge(mcp::mcp_router(state))` inside the `per_graph_protected` group, where the server's thin `mcp::mcp_router(state) = omnigraph_mcp::mcp_router(OmnigraphMcpBackend::new(state), INGEST_REQUEST_BODY_LIMIT_BYTES)`.
+
+Represent the built-ins as a `Builtin` enum (one variant per tool; `descriptor` / `gate` / `call` as match arms) — lower liability than ~14 unit structs + `dyn` + `async-trait` per tool. Stored-query tools are a sibling populator over `handle.queries`.
+
+### B.5 Tool catalog (13 built-ins) + Cedar mapping
+
+Each built-in reuses the **exact `PolicyAction` its REST route enforces**. (No `graphs_list` — server-scoped graph discovery is REST-only, see B.10.)
+
+| MCP tool | Scope | Cedar action |
+|---|---|---|
+| `health` | server | none (liveness/version) |
+| `snapshot`, `schema_get`, `branches_list`, `commits_list`, `commits_get` | graph | `Read` |
+| `query` (ad-hoc read) | graph | `Read` (`run_query` self-authorizes) |
+| `mutate` (ad-hoc write) | graph | `Change` |
+| `ingest` (NDJSON) | graph | `Change` (+ `BranchCreate` when `from` forks) |
+| `branches_create` / `branches_delete` / `branches_merge` | graph | `BranchCreate` / `BranchDelete` / `BranchMerge` |
+| `schema_apply` (`allow_data_loss`) | graph | `SchemaApply` |
+| *stored query* | graph | `InvokeQuery` (coarse) then inner `Read`/`Change` |
+
+Baked-in decisions (resolve the Open Questions):
+- **Tool ids are `query`/`mutate` only — no `read`/`change` aliases.** The server HTTP surface already deprecated `/read`,`/change`; a fresh in-server MCP has no legacy clients to keep, so it exposes only the canonical ids. [Open Q7 → resolved: no aliases.]
+- **Ad-hoc `query`/`mutate` are always exposed, Cedar-only** — no `mcp.allow_adhoc` switch. [Open Q3 → resolved: always-on + Cedar.]
+
+Annotations (rmcp defaults `destructiveHint` **and** `openWorldHint` to **true**, so set them explicitly via `ToolAnnotations::new().read_only(b).destructive(b).open_world(b)`): read tools → `read_only(true).open_world(false)`; `mutate`/`ingest`/`branches_delete`/`branches_merge`/`schema_apply` → `read_only(false).destructive(true).open_world(false)`; `branches_create` (additive) → `read_only(false).destructive(false).open_world(false)`.
+
+### B.6 Stored-query tools
+
+One MCP tool per `mcp.expose` registry entry (named by its `tool_name`), projected from the **same `api::query_catalog_entry`** the `GET /queries` catalog uses; parameters → JSON Schema per B.7. The outer gate is the **coarse `InvokeQuery`** action (all exposed queries on the graph, or none — per-query scope is deferred, see B.13); the call then runs the registry source through `run_query` / `run_mutate`, whose inner `Read` / `Change` gate the body — the **double-gate of `POST /queries/{name}`**. Skip a stored tool whose name collides with a built-in (built-ins win, so the catalog never has a duplicate tool name).
+
+### B.7 `ParamKind` → JSON Schema (stored-query params)
+
+| `ParamKind` | JSON Schema |
+|---|---|
+| String / Bool / Int / Float | `{"type":"string"}` / `{"type":"boolean"}` / `{"type":"integer"}` / `{"type":"number"}` |
+| BigInt (i64/u64) | `{"type":"string","pattern":"^-?\\d+$"}` (JSON numbers lose precision >2⁵³) |
+| Date / DateTime | `{"type":"string","format":"date"}` / `{"type":"string","format":"date-time"}` |
+| Blob | `{"type":"string","contentEncoding":"base64"}` |
+| Vector | `{"type":"array","items":{"type":"number"},"minItems":dim,"maxItems":dim}` (from `vector_dim`) |
+| List | `{"type":"array","items":<item_kind schema>}` (scalar items only) |
+
+`nullable == false` → in `required`. Add `branch` (and `snapshot` for reads) as optional invocation knobs. Fold `instruction` into the description.
+
+### B.8 `list` / `call` semantics
+
+- **`list_tools` / `list_resources`** are Cedar-filtered: for each tool/resource, evaluate `authorize(actor, policy_for(gate), { action, branch: None })`; emit only `Allowed`; an `Err` (operational, e.g. policy-engine error) propagates as a JSON-RPC error; a `Denied` simply hides. Stored-query tools list as a group iff the coarse `InvokeQuery` is allowed.
+- **`call_tool`:** an unknown tool **or** a denied tool returns the **identical** `unknown tool: <name>` (`-32602`) so the catalog can't be probed without the grant. A business/validation/engine failure (4xx/409) → `CallToolResult { isError: true }` (so the model self-corrects — the 2025-11-25 SEP-1303 split); an operational 5xx → JSON-RPC error. A missing/bad bearer is an HTTP `401` at the boundary *before* rmcp.
+- **Branch-scope caveat (R7):** list visibility evaluates with `branch: None`; the actual `do_*` / `run_*` re-authorizes against the real branch, so a branch-scoped policy may list a tool yet deny a specific-branch call. `tools/call` is authoritative.
+
+### B.9 Resources
+
+Two resources: `omnigraph://schema` (`Read` → `do_schema_get`) and `omnigraph://branches` (`Read` → `do_branches_list`, JSON text). (No `omnigraph://graphs` — server-scoped, dropped with `graphs_list`; see B.10.) `list_resources`/`read_resource` Cedar-filtered + masked exactly like tools. Advertise the `resources` capability only because both handlers are backed (don't advertise a capability whose `read` would 404).
+
+### B.10 Routing (RESOLVED)
+
+`/mcp` lives in the `per_graph_protected` route group: single mode → `POST /mcp`; multi mode → `POST /graphs/{graph_id}/mcp` (per-graph isolation; consistent with the `/graphs/{id}/...` REST cluster routing). [Open Q5 → resolved: per-graph, final.]
+
+**Decided: the MCP surface has no server-scoped tools or resources.** `graphs_list` and `omnigraph://graphs` are **dropped from MCP** — graph discovery is a REST/admin concern, served by `GET /graphs`. Every MCP tool/resource is graph-scoped, the per-graph `/mcp` is fully clean, and there is no flat server-level `/mcp`. (If a concrete need to enumerate graphs *over MCP* ever arises, add a flat server-level `POST /mcp` in the `management` group — bearer-only, no graph handle, server-scoped tools only — but do not build it speculatively.)
+
+**Do not** consolidate to a single flat `/mcp` that takes `graph_id` per call: MCP's `tools/list` cannot carry a graph, so it can't list per-graph stored-query tools; it also breaks isolation, pollutes every tool's `input_schema`, and diverges from the URL-scoped REST routing.
+
+### B.11 Auth (decoupled; OAuth is a committed fast-follow)
+
+The handler consumes an already-resolved `ResolvedActor` and **branches on nothing** about how the token was verified (§5.8). Static bearer works **today** with the developer clients; the consumer connectors need OAuth, a planned additive layer that changes zero MCP code (it only swaps the bearer middleware behind a `TokenVerifier`, and serves RFC 9728 metadata).
+
+| Integration | Static bearer (this surface) | Note |
+|---|---|---|
+| Claude Code, Cursor, VS Code | ✅ | `claude mcp add --transport http <url>/mcp --header "Authorization: Bearer <tok>"` |
+| Claude Messages API MCP connector | ✅ | caller passes `authorization_token` → `Authorization: Bearer` |
+| claude.ai web / Claude Desktop connectors | ❌ needs OAuth fast-follow | requires OAuth 2.1 + PKCE (S256) + RFC 9728 + DCR/CIMD/custom client id+secret |
+| ChatGPT developer-mode connectors | ❌ needs OAuth fast-follow | OAuth 2.1 (CIMD/DCR/PKCE) or "no auth"; no static-bearer mode |
+
+OAuth fast-follow (MR-956): serve `/.well-known/oauth-protected-resource` + `WWW-Authenticate` on 401, front a managed AS (WorkOS AuthKit by default) that supports DCR + PKCE, validate audience-bound JWTs offline → `ResolvedActor`. Keep it **config-gated/dual-mode** so a server that does not advertise OAuth lets the dev clients keep using the static `Authorization` header (avoids the Claude Code header-vs-OAuth conflict).
+
+### B.12 Tests & verification
+
+- **Protocol:** `initialize` handshake + advertised `{tools, resources}` caps; `tools/list` shape; `tools/call` happy path; JSON-RPC errors (`-32601`/`-32602`); `resources/list` + `resources/read`; `GET /mcp` → 405; `MCP-Protocol-Version` 400/default; `Origin` → 403.
+- **Cedar (coarse):** a read-only actor sees the read tools but not `mutate`/`ingest`/`branches_*`/`schema_apply`; a denied `tools/call` masks byte-identically to an unknown one; stored queries listed only with `invoke_query`; the double-gate (an `invoke_query`-only actor sees a stored tool but the call surfaces `isError` when the inner `read` denies).
+- **Dispatch:** a `mutate` call writes end-to-end (proves the actor/handle extension passthrough); a malformed query → `isError:true`, not a JSON-RPC error.
+- **Resources:** list + read of `schema`/`branches`; a denied read masks as unknown.
+- **Auth decoupling / no-bearer:** `/mcp` 401s without a bearer (before rmcp) and 200s with one; the suite is green under the static-hash verifier (and a mock `ResolvedActor` source proves verifier-agnosticism).
+- **Crate-level:** a tiny `omnigraph-mcp/tests/` with a trivial `McpBackend` impl serving `initialize` + `GET→405` proves the crate stands alone; add an rmcp surface-guard there pinning `StreamableHttpServerConfig` field names + the `ServerHandler` method shapes.
+- **Verification commands:** `cargo build --workspace --locked`; `cargo tree -p omnigraph-server -e normal | grep rmcp` shows rmcp **only** transitively under `omnigraph-mcp`; `cargo test -p omnigraph-server --test server` (incl. the `mcp_*` cases, black-box over `build_app`) + `--test openapi` (no `/mcp` leak — it carries no `#[utoipa::path]`); live smoke: run the server with a bearer + policy, `curl` `initialize`/`tools/list`/`tools/call`/`GET→405`.
+
+### B.13 Decisions locked
+
+- **rmcp 1.7** (not hand-rolled) — verified to integrate on edition 2024. [Open Q2 → resolved.]
+- **Coarse `invoke_query` only**; per-query scope deferred (PR 0b — adds a query-name dimension to `PolicyRequest` + the Cedar schema). [The headline caveat.]
+- **Ad-hoc `query`/`mutate` always exposed, Cedar-only**; no `mcp.allow_adhoc`. [Open Q3 → resolved.]
+- **`query`/`mutate` ids only**, no `read`/`change` aliases. [Open Q7 → resolved.]
+- **Per-graph `/mcp` routing**; `graphs_list`/`omnigraph://graphs` **dropped from MCP** (graph discovery via REST `GET /graphs`); no server-scoped MCP tools. [Open Q5 → resolved.]
+- **text-JSON `content` for v1**; `structuredContent`/`outputSchema` deferred. [Open Q4 → resolved.]
+- **BigInt as JSON string.** [Open Q1 → resolved.]
+- **Static bearer now, OAuth/RFC-9728 fast-follow.**
+
 ## Relationship to RFC-001
 
 [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) (MR-656 / MR-976 / MR-969) is the parent design for stored queries + the response envelope + MCP. This RFC is the **detailed MCP-transport design** that #128 left for a follow-up, and it **revises rfc-001 in three places where the shipped code or the MCP wire protocol diverged from rfc-001's sketch**:
@@ -32,9 +173,11 @@ Everything else in rfc-001 (two-paths-one-engine, per-query `invoke_query` *as t
 
 > **Numbering note:** the `TokenVerifier`/WorkOS auth design is referred to in code (`crates/omnigraph-server/src/identity.rs`) as "RFC 0001," which is a *different* document from this repo's `docs/dev/rfc-001-queries-envelope-mcp.md`. To avoid the collision this RFC cites the auth substrate as **MR-956** throughout, never "RFC 0001."
 
-## Reconciliation with shipped code (verified against `ragnorc/stored-queries-mcp` HEAD)
+## Reconciliation with shipped code (historical — pre-MCP, against #128 HEAD)
+
+> *Historical: this was the gap analysis against `#128` (the stored-query REST foundation) before the MCP surface was built. The three ❌ items below — the MCP protocol surface, and the `TokenVerifier` — were subsequently built/addressed in [#157](https://github.com/ModernRelay/omnigraph/pull/157) (transport, tools, resources) except per-query scope and OAuth, which remain deferred. For the current build instructions see the [Implementation Blueprint](#implementation-blueprint-canonical--build-the-fresh-implementation-from-this).*
 
-Verified against `crates/omnigraph-server/src/{lib.rs,api.rs}` and `crates/omnigraph-policy/src/lib.rs` at the current branch head (not the #128 PR body, and not `api.rs` alone):
+Verified against `crates/omnigraph-server/src/{lib.rs,api.rs}` and `crates/omnigraph-policy/src/lib.rs` at the `#128` branch head:
 
 - ✅ `GET /queries` returns the `mcp.expose == true` subset as `QueriesCatalogOutput { queries: [QueryCatalogEntry] }`, each with typed `ParamDescriptor`s, `tool_name`, `description`, `instruction`, and a `mutation` flag. **MCP-ready projection, but exposed as bespoke REST/JSON — not the MCP wire protocol.**
 - ✅ `POST /queries/{name}` route exists (`server_invoke_query`, `lib.rs`).
@@ -79,6 +222,8 @@ Stack (verified `Cargo.toml`): Axum + utoipa (OpenAPI) + `omnigraph-policy` (Ced
 
 ## Design
 
+> *§5 is the original design sketch (design rationale). Where it differs from the [Implementation Blueprint](#implementation-blueprint-canonical--build-the-fresh-implementation-from-this) above, the Blueprint is authoritative. Notable divergences proven out by [#157](https://github.com/ModernRelay/omnigraph/pull/157): the §5.1 per-tool `McpTool` trait became a `Builtin` enum + an `McpBackend` crate trait (B.1–B.4); §5.6's rmcp-vs-hand-roll is resolved to rmcp 1.7 (B.3); §5.7's "server tools on a per-graph endpoint" is resolved in B.10; the §5.2 `read`/`change` aliases are dropped (B.5).*
+
 ### 5.1 One tool model: a `McpTool` trait, two populators
 
 Both built-in and stored-query tools implement one trait so `tools/list` / `tools/call` never special-case:
@@ -222,9 +367,9 @@ With ad-hoc `query`/`mutate`/`schema_apply` present as tools, the **only** thing
 
 ## Relationship to the `@modernrelay/omnigraph-mcp` stdio package
 
-Verified surface of the package (`omnigraph-ts`, pkg version `0.3.0`, `@modelcontextprotocol/sdk@^1.29.0`, **stdio only**): **13 tools** (`health`, `snapshot`, `read`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, `change`, `ingest`, `branches_create`, `branches_delete`, `branches_merge`, `schema_apply`) and **2 resources** (`omnigraph://schema`, `omnigraph://branches`). It is a thin client over the SDK → HTTP routes and **forwards the caller's bearer verbatim** (no inspection).
+Surface of the package (`omnigraph-ts`, `@modelcontextprotocol/sdk@^1.29.0`, **stdio only**). *Figures refreshed 2026-06: the package re-synced to the engine in [omnigraph-ts#11](https://github.com/ModernRelay/omnigraph-ts/pull/11) and is now at `0.6.1` — not the `0.3.0` this RFC was first drafted against.* It exposes **16 tools** (`health`, `snapshot`, `query`, `read`, `schema_get`, `branches_list`, `graphs_list`, `commits_list`, `commits_get`, `mutate`, `change`, `ingest`, `branches_create`, `branches_delete`, `branches_merge`, `schema_apply` — note it already canonicalized `query`/`mutate` with `read`/`change` as deprecated aliases, and added `graphs_list`) and **~9 resources** (`omnigraph://schema`, `omnigraph://branches`, `omnigraph://graphs`, plus a vendored `omnigraph://best-practices/*` cookbook). It is a thin client over the SDK → HTTP routes and **forwards the caller's bearer verbatim** (no inspection).
 
-Once parity lands, **collapse to one implementation**: the in-server MCP is canonical (Cedar-gated, remote-capable, the path that becomes a Claude-web connector via MR-956). The stdio package degrades to a **thin stdio↔HTTP proxy** forwarding JSON-RPC (and the incoming `Authorization`) to `/mcp` — staying the local on-ramp for Claude Code/Desktop while sharing one tool set, one Cedar gate. Transition: keep the current independent stdio package on its `0.3.x`/`0.6.x` line; ship proxy mode in a later TS minor once the server endpoint is GA. (Note: the package is currently several minors behind the server — its vendored `spec/openapi.json` predates the stored-query routes — so it needs the standard re-sync regardless of MCP work.)
+Once parity lands, **collapse to one implementation**: the in-server MCP is canonical (Cedar-gated, remote-capable, the path that becomes a Claude-web connector via MR-956). The stdio package degrades to a **thin stdio↔HTTP proxy** forwarding JSON-RPC (and the incoming `Authorization`) to `/mcp` — staying the local on-ramp for Claude Code/Desktop while sharing one tool set, one Cedar gate. Transition: keep the current independent stdio package on its `0.6.x` line; ship proxy mode in a later TS minor once the server endpoint is GA. (The package already re-synced to `0.6.1` in [omnigraph-ts#11](https://github.com/ModernRelay/omnigraph-ts/pull/11); its client-side stored-query-tools attempt, [omnigraph-ts#7](https://github.com/ModernRelay/omnigraph-ts/pull/7), was **closed** in favor of this server-side surface.)
 
 ## Testing
 
@@ -260,6 +405,8 @@ Once parity lands, **collapse to one implementation**: the in-server MCP is cano
 
 ## Open Questions
 
+> *Resolved during [#157](https://github.com/ModernRelay/omnigraph/pull/157) — see [B.13](#b13-decisions-locked) for the locked decisions. Q1→string, Q2→rmcp 1.7, Q3→always-on Cedar-only, Q4→text-JSON v1, Q5→per-graph routing (graphs_list per B.10), Q6→stateless POST confirmed, Q7→no `read`/`change` aliases. Q8 (PR 0b shape: Cedar resource vs context attribute) remains open, gated on the deferred per-query scope work. The items below are kept as the original decision context.*
+
 1. **BigInt/u64 as JSON string** (recommended, precision-safe) vs number.
 2. **`rmcp` vs hand-rolled** JSON-RPC (spike `rmcp` on edition 2024; default to hand-roll on friction).
 3. **Default-off `mcp.allow_adhoc`** for ad-hoc `query`/`mutate` (recommended) vs always-on + Cedar-only.

From 6fc92cab0669d0a313596c3e372a9bbb3012e9e8 Mon Sep 17 00:00:00 2001
From: Ragnor Comerford <hello@ragnor.co>
Date: Thu, 11 Jun 2026 10:46:21 +0200
Subject: [PATCH 2/5] docs(rfc-003): correct-by-construction fixes from PR
 review

Fold the valid #157 review findings into the blueprint as by-design fixes so a
from-scratch build cannot reintroduce them:

- B.3.1: host/origin policy derived from the server bind + config (loopback ->
  loopback hosts; non-loopback -> configured public host, else Host-allowlist
  off with bearer + Origin as the controls). Closes the loopback-only-default
  footgun that 403s every remote client before bearer auth.
- B.6/B.7: stored-query params nested under a `params` object (mirrors
  POST /queries/{name}), so the branch/snapshot knobs cannot collide with a
  query param; mutation tools omit `snapshot` + additionalProperties:false, so
  mutation-against-a-snapshot is unrepresentable. Vector `dim` made intrinsic to
  the kind, killing the unwrap_or(0) zero-length-array schema.
- B.8: one canonical classify(Result<_, ApiError>) for tool errors (5xx ->
  JSON-RPC, 4xx/409 -> isError), no second mapper to drift; list visibility uses
  the call-path default-branch authorization, not a branch:None probe that hid
  branch-scoped-grant tools.
- Reconciled the stale Summary count and the section-5-era Rollout/Testing
  sections (banners pointing to B.12/B.4/B.13).
---
 docs/dev/rfc-003-mcp-server-surface.md | 31 +++++++++++++++++---------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/docs/dev/rfc-003-mcp-server-surface.md b/docs/dev/rfc-003-mcp-server-surface.md
index 09542d39..34c9243b 100644
--- a/docs/dev/rfc-003-mcp-server-surface.md
+++ b/docs/dev/rfc-003-mcp-server-surface.md
@@ -11,7 +11,7 @@
 
 Add a first-class **MCP (Model Context Protocol) server surface to `omnigraph-server`**, exposed over **Streamable HTTP**, that projects the server's operations as MCP tools and resources for LLM clients (Claude Code/Desktop/web, Cursor, etc.). Two populations of tools share one projection path:
 
-1. **Built-in operational tools** — parity with the existing `@modernrelay/omnigraph-mcp` stdio package's **13 tools** (`health`, `snapshot`, `read`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, `change`, `ingest`, `branches_create`, `branches_delete`, `branches_merge`, `schema_apply`) and its **2 resources** (`omnigraph://schema`, `omnigraph://branches`). (Server-scoped graph discovery — a `graphs_list` tool / `omnigraph://graphs` resource — was considered but **dropped from MCP**; it stays REST-only via `GET /graphs`. See [B.10](#b10-routing-resolved).)
+1. **Built-in operational tools** — parity with the existing `@modernrelay/omnigraph-mcp` stdio package's operational tool set (`health`, `snapshot`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, ad-hoc read/write, `ingest`, branch ops, `schema_apply`) and its core resources (`omnigraph://schema`, `omnigraph://branches`). (Exact counts moved over time — the package is now at 16 tools / ~9 resources; see [Relationship](#relationship-to-the-modernrelayomnigraph-mcp-stdio-package). The authoritative as-built catalog is [B.5](#b5-tool-catalog-13-built-ins--cedar-mapping): 13 tools.) (Server-scoped graph discovery — a `graphs_list` tool / `omnigraph://graphs` resource — was considered but **dropped from MCP**; it stays REST-only via `GET /graphs`. See [B.10](#b10-routing-resolved).)
 2. **Dynamic stored-query tools** — one MCP tool per `mcp.expose: true` entry in the `queries:` registry (MR-969 / #128), with parameters typed from the `.gq` declaration via the shipped `query_catalog_entry` / `param_descriptor` projection.
 
 Every tool is **authorized by the server's existing Cedar policy engine**. The MCP layer never implements its own authentication: it consumes an **already-resolved `ResolvedActor`** from the server's bearer middleware (`require_bearer_auth` today; the `TokenVerifier` seam when MR-956 lands), so the **same MCP endpoint serves on-prem (static or customer-OIDC tokens) and our cloud (WorkOS OAuth) by configuration only**. Cloud OAuth is an additive layer (RFC 9728 protected-resource metadata) that slots in with zero MCP changes.
@@ -57,10 +57,15 @@ pub trait McpBackend: Clone + Send + Sync + 'static {
 ### B.3 Transport (lives in `omnigraph-mcp`)
 
 - A generic `struct McpService<B>` implements rmcp's `ServerHandler`, delegating each method to `B` after extracting `&Parts` from `ctx.extensions` once (missing → `McpError::internal_error`). `get_info → backend.server_info()`; `initialize`/`ping` use rmcp's defaults.
-- `pub fn mcp_router<B: McpBackend>(backend: B, body_limit: usize) -> axum::Router`:
-  - `let svc = StreamableHttpService::new(move || Ok(McpService::new(backend.clone())), Arc::new(LocalSessionManager::default()), config)` with `config = StreamableHttpServerConfig::default().with_stateful_mode(false).with_json_response(true)` (`StreamableHttpServerConfig` is `#[non_exhaustive]` — build from `Default`, mutate via the `with_*` setters; keep the loopback `allowed_hosts` default).
-  - Return `Router::new().route_service("/mcp", svc).layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit))`. rmcp reads the body directly (not via an axum extractor), so axum's `DefaultBodyLimit` does **not** bound `/mcp`; the tower-http layer does.
-- **Stateless mode delivers these conformance MUSTs for free (verified against rmcp 1.7 source):** `GET`/`DELETE /mcp` → `405` (with `Allow`); a disallowed `Host`/`Origin` → `403` (loopback hosts by default — DNS-rebind guard); `MCP-Protocol-Version` → `400` on unsupported, default `2025-03-26` when absent. **No conformance middleware is needed** (the §5.6 "honour the header" footnote is satisfied by rmcp).
+- `pub fn mcp_router<B: McpBackend>(backend: B, body_limit: usize, hosts: McpHostPolicy) -> axum::Router`:
+  - `config = StreamableHttpServerConfig::default().with_stateful_mode(false).with_json_response(true).with_allowed_hosts(hosts.allowed_hosts).with_allowed_origins(hosts.allowed_origins)` (`StreamableHttpServerConfig` is `#[non_exhaustive]` — build from `Default`, mutate via the `with_*` setters). **Do not keep rmcp's fixed loopback `allowed_hosts` default** — it `403`s every non-loopback client *before* bearer auth (a deploy footgun); derive the policy per B.3.1.
+  - `let svc = StreamableHttpService::new(move || Ok(McpService::new(backend.clone())), Arc::new(LocalSessionManager::default()), config)`; return `Router::new().route_service("/mcp", svc).layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit))`. rmcp reads the body directly (not via an axum extractor), so axum's `DefaultBodyLimit` does **not** bound `/mcp`; the tower-http layer does.
+- **Stateless mode delivers these conformance MUSTs for free (verified against rmcp 1.7 source):** `GET`/`DELETE /mcp` → `405` (with `Allow`); a disallowed `Host`/`Origin` → `403` (per the `McpHostPolicy`, B.3.1); `MCP-Protocol-Version` → `400` on unsupported, default `2025-03-26` when absent. **No conformance middleware is needed** (the §5.6 "honour the header" footnote is satisfied by rmcp).
+
+**B.3.1 Host/Origin policy — derived from deployment, never a fixed default.** The `Host` header is not a security boundary for a bearer-authed server (the token is); rmcp's loopback `allowed_hosts` default exists to protect *local* dev servers from browser DNS-rebinding, and for any non-loopback deploy it just rejects legitimate clients. So `omnigraph-server` computes `McpHostPolicy` once at startup from what it already knows (the `--bind` address + any configured public host) and passes it in:
+- **Loopback bind** (`127.0.0.1` / `::1` / `localhost`) → loopback `allowed_hosts` (rebind protection genuinely matters locally).
+- **Non-loopback bind** (`0.0.0.0` / a public IP) → the operator chose remote reachability: allow the configured public host(s) if set, else **disable Host-allowlisting** (accept any `Host`) and log it — bearer auth + `Origin` validation are the real controls.
+- `allowed_origins` is the actual browser-attack control: default empty (off — non-browser MCP clients send no `Origin`), configurable for browser-based clients. The class "one fixed default that is wrong for some deployments" is closed by making the policy a function of the deployment.
 - **Client/test obligations rmcp enforces:** the request must carry `Accept: application/json, text/event-stream` (both), `Content-Type: application/json`, and a `Host` header. rmcp negotiates `protocolVersion` (a recent client sees `2025-11-25`).
 
 ### B.4 Server-side backend (lives in `omnigraph-server`)
@@ -94,7 +99,7 @@ Annotations (rmcp defaults `destructiveHint` **and** `openWorldHint` to **true**
 
 ### B.6 Stored-query tools
 
-One MCP tool per `mcp.expose` registry entry (named by its `tool_name`), projected from the **same `api::query_catalog_entry`** the `GET /queries` catalog uses; parameters → JSON Schema per B.7. The outer gate is the **coarse `InvokeQuery`** action (all exposed queries on the graph, or none — per-query scope is deferred, see B.13); the call then runs the registry source through `run_query` / `run_mutate`, whose inner `Read` / `Change` gate the body — the **double-gate of `POST /queries/{name}`**. Skip a stored tool whose name collides with a built-in (built-ins win, so the catalog never has a duplicate tool name).
+One MCP tool per `mcp.expose` registry entry (named by its `tool_name`), projected from the **same `api::query_catalog_entry`** the `GET /queries` catalog uses; parameters → JSON Schema per B.7. The outer gate is the **coarse `InvokeQuery`** action (all exposed queries on the graph, or none — per-query scope is deferred, see B.13); the call then runs the registry source through `run_query` / `run_mutate`, whose inner `Read` / `Change` gate the body — the **double-gate of `POST /queries/{name}`**. Skip a stored tool whose name collides with a built-in (built-ins win, so the catalog never has a duplicate tool name). Dispatch reads the query params from `args.params` and the knobs from `args.branch` / `args.snapshot` (B.7's nested envelope), so a query parameter named `branch`/`snapshot` cannot be shadowed.
 
 ### B.7 `ParamKind` → JSON Schema (stored-query params)
 
@@ -104,16 +109,16 @@ One MCP tool per `mcp.expose` registry entry (named by its `tool_name`), project
 | BigInt (i64/u64) | `{"type":"string","pattern":"^-?\\d+$"}` (JSON numbers lose precision >2⁵³) |
 | Date / DateTime | `{"type":"string","format":"date"}` / `{"type":"string","format":"date-time"}` |
 | Blob | `{"type":"string","contentEncoding":"base64"}` |
-| Vector | `{"type":"array","items":{"type":"number"},"minItems":dim,"maxItems":dim}` (from `vector_dim`) |
+| Vector | `{"type":"array","items":{"type":"number"},"minItems":dim,"maxItems":dim}` — **`dim` is intrinsic to the kind** (`ParamKind::Vector { dim: u32 }`, never an `Option`), so a dimensionless vector is unrepresentable and the `unwrap_or(0)` footgun (a 0-length-array schema that rejects all input) cannot exist; if a `dim` is ever genuinely unknown, **omit** `minItems`/`maxItems` rather than emit `0`. |
 | List | `{"type":"array","items":<item_kind schema>}` (scalar items only) |
 
-`nullable == false` → in `required`. Add `branch` (and `snapshot` for reads) as optional invocation knobs. Fold `instruction` into the description.
+**Envelope (correct by construction).** The tool's `input_schema` **nests the query params under a `params` object**, mirroring the `POST /queries/{name}` request: `{ "params": { <the query's typed params> }, "branch"?: string, "snapshot"?: string }`. The invocation knobs and the query's own parameters then live in **separate namespaces and cannot collide** — a query param named `branch`/`snapshot` is harmless. `nullable == false` params go in the inner `params.required`. For a **mutation** tool, omit `snapshot` from the schema and set `additionalProperties: false`, so "mutation against a snapshot" is *unrepresentable* (the REST path `400`s it; the MCP schema makes it impossible). Fold `instruction` into the description.
 
 ### B.8 `list` / `call` semantics
 
-- **`list_tools` / `list_resources`** are Cedar-filtered: for each tool/resource, evaluate `authorize(actor, policy_for(gate), { action, branch: None })`; emit only `Allowed`; an `Err` (operational, e.g. policy-engine error) propagates as a JSON-RPC error; a `Denied` simply hides. Stored-query tools list as a group iff the coarse `InvokeQuery` is allowed.
-- **`call_tool`:** an unknown tool **or** a denied tool returns the **identical** `unknown tool: <name>` (`-32602`) so the catalog can't be probed without the grant. A business/validation/engine failure (4xx/409) → `CallToolResult { isError: true }` (so the model self-corrects — the 2025-11-25 SEP-1303 split); an operational 5xx → JSON-RPC error. A missing/bad bearer is an HTTP `401` at the boundary *before* rmcp.
-- **Branch-scope caveat (R7):** list visibility evaluates with `branch: None`; the actual `do_*` / `run_*` re-authorizes against the real branch, so a branch-scoped policy may list a tool yet deny a specific-branch call. `tools/call` is authoritative.
+- **`list_tools` / `list_resources`** are Cedar-filtered. For each tool/resource, run the **same authorization the call path runs, with default args** (the default branch `main`) — **not** a separate `branch: None` probe (a `branch: None` request matches no `branch_scope` rule, so it would *hide* tools the actor can actually call on a scoped branch). Sharing the authorization input makes list and call agree by construction for the common case. Emit only `Allowed`; an `Err` (operational) propagates as a JSON-RPC error; a `Denied` hides. Stored-query tools list as a group iff the coarse `InvokeQuery` is allowed.
+- **`call_tool`:** an unknown tool **or** a denied tool returns the **identical** `unknown tool: <name>` (`-32602`) so the catalog can't be probed without the grant. Route **every** `do_*`/`run_*` result through **one canonical `classify(Result<_, ApiError>) -> Result<CallToolResult, McpError>`** — the single source of truth, used at every dispatch site so the mapping can't drift: `Ok` → success JSON content; `Err` 4xx/409 → `CallToolResult { isError: true }` (the model self-corrects, per the 2025-11-25 SEP-1303 split); `Err` operational 5xx → JSON-RPC error. (Do not keep two mappers — a single `classify` replaces the `api_error_to_tool` / `api_operational_error` split that drifted in the reference impl.) A missing/bad bearer is an HTTP `401` at the boundary *before* rmcp.
+- **Branch-scope residual (R7):** because visibility uses the default branch, a grant scoped to a *non-default* branch is an **over-show** — the tool lists and `call_tool` is the authoritative gate. Over-showing is the safe direction (the agent sees the tool, tries it, gets a clear deny); under-showing a callable tool is the failure mode the default-branch probe avoids.
 
 ### B.9 Resources
 
@@ -373,6 +378,8 @@ Once parity lands, **collapse to one implementation**: the in-server MCP is cano
 
 ## Testing
 
+> *Superseded by [B.12](#b12-tests--verification), which is the authoritative test plan. The list below is the original §5-era sketch; ignore its references to `graphs_list`, `read`/`change` aliases, and the per-query `invoke_query` test (all resolved/dropped — see B.13).*
+
 - **Protocol conformance:** `initialize` handshake + advertised capabilities; `tools/list` shape; `tools/call` happy path; JSON-RPC error envelopes (`-32601` unknown method, `-32602` invalid params / unknown tool); `resources/list` + `resources/read`.
 - **Cedar filtering (coarse, today):** an actor with `allow InvokeQuery` + `deny Read/Change` sees *all* exposed stored queries but **not** `query`/`mutate`/`schema_get`; `tools/call query` returns masked "unknown tool"; an admin sees the full catalog.
 - **Cedar filtering (per-query, gated on PR 0b):** actor scoped to `InvokeQuery [find_user]` sees *only* `find_user`; `tools/call list_orders` masks. **This test ships with PR 0b**, not PR 1 — it cannot pass against the coarse action.
@@ -388,6 +395,8 @@ Once parity lands, **collapse to one implementation**: the in-server MCP is cano
 
 ## Rollout — phased by risk
 
+> *Superseded by the Blueprint's build order ([B.4](#b4-server-side-backend-lives-in-omnigraph-server) → crate ([B.1](#b1-crate-architecture--dependency-direction)–[B.3](#b3-transport-lives-in-omnigraph-mcp)) → backend → tools/stored/resources). The PR phases below are the original §5-era plan; they still mention `graphs_list`, `read`/`change` aliases, and the `mcp.allow_adhoc` switch, all of which the locked decisions ([B.13](#b13-decisions-locked)) drop. PR 0b (per-query `invoke_query` scope) remains the one deferred follow-up.*
+
 - **PR 0a — extract the reusable invoke path (small).** The coarse `invoke_query` gate + 404 denial-masking are **already shipped** in `server_invoke_query`. Extract the read/mutate dispatch into `invoke_stored_query(handle, name, params, branch/snapshot, actor)` so MCP `tools/call` and the HTTP route share one path. No behaviour change. *(Replaces the previous draft's "PR 0 — wire the gate", which was already done.)*
 - **PR 0b — per-query `invoke_query` scope (the safety prerequisite).** Add a query-name dimension to `PolicyRequest` + the Cedar schema (rfc-001's intended design), wire it at `POST /queries/{name}` and in the stored-query `McpTool::authorization`. Independently useful (the `allow InvokeQuery [find_user]` policy). **Gates the per-query Cedar-filtering test and §5.9's recommended agent policy.**
 - **PR 1 — MCP transport + read-only parity + stored-query reads.** Endpoint(s), `initialize`/`tools/list`/`tools/call`/`resources/*`, the `McpTool` registry, Cedar-filtered listing, the read-only built-ins (`health`, `graphs_list`, `snapshot`, `read`/`query`, `schema_get`, `branches_list`, `commits_*`) + resources + stored-query *reads*. All auth-agnostic.

From 61c18b67ba4bcf5172d9b77551761ed41c241afc Mon Sep 17 00:00:00 2001
From: Ragnor Comerford <hello@ragnor.co>
Date: Sat, 13 Jun 2026 11:53:09 +0200
Subject: [PATCH 3/5] docs(rfc-003): validate against main; fold the
 implementation spec in
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Rebased onto current main (0.7.0-dev) and reconciled the MCP blueprint
with the code it builds on, replacing the earlier blueprint sketch with a
spec grounded in real interfaces (each EXISTING snippet cited to file:line,
re-read from source). Corrections (delta table in §0):

- ParamKind::Vector is a unit variant; the dimension is
  ParamDescriptor.vector_dim: Option<u32>, not an intrinsic struct field.
  The JSON-Schema builder now omits minItems/maxItems when the dim is
  absent instead of emitting 0.
- Server handlers, auth helpers, and run_query/run_mutate moved from
  lib.rs to handlers.rs; stored-query config in config.rs; DTOs in api.rs.
- ingest/load unified: server_ingest calls load_as; a missing branch with
  no `from` is a 404, never an implicit fork. The MCP tool is named `load`.
- Client credentials follow the shipped RFC-007 model (OperatorServer { url },
  env -> credentials-file chain, no keychain step, no nested auth: block).
- Test references updated for the split server suites (mcp.rs / stored_queries.rs,
  not --test server).
- Stored-query declaration source noted as cluster.yaml file-discovery with
  omnigraph.yaml as the RFC-008-deprecated legacy surface.

Folds the standalone implementation-spec draft into this file so there is one
MCP RFC; cross-links rfc-005/007/009 now that they are in the corpus.
---
 docs/dev/rfc-003-mcp-server-surface.md | 1116 +++++++++++++++++-------
 1 file changed, 819 insertions(+), 297 deletions(-)

diff --git a/docs/dev/rfc-003-mcp-server-surface.md b/docs/dev/rfc-003-mcp-server-surface.md
index 34c9243b..cff392b3 100644
--- a/docs/dev/rfc-003-mcp-server-surface.md
+++ b/docs/dev/rfc-003-mcp-server-surface.md
@@ -1,426 +1,948 @@
-# RFC: MCP Server Surface for `omnigraph-server` — Full Tool Parity, Stored Queries, Modular Auth
-
-**Status:** Reference implementation shipped in [omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157) (proved rmcp 1.7 on edition 2024, the auth-extension passthrough, and the full tool/resource/Cedar surface). **This RFC is now the canonical spec for a clean reimplementation from `main`** that lands the surface with a dedicated `omnigraph-mcp` crate from the start — build from the **[Implementation Blueprint](#implementation-blueprint-canonical--build-the-fresh-implementation-from-this)** below, which incorporates the as-built reality and supersedes the §5 sketches where they differ. Deferred to follow-ups: per-query `invoke_query` scope (PR 0b), the OAuth/RFC-9728 layer (MR-956), and the stdio→proxy collapse.
-**Date:** 2026-06-01
-**Tickets:** MR-969 (stored queries + MCP exposure — the surface this completes), MR-956 (federated auth / WorkOS OAuth — the auth substrate this consumes), MR-971 (per-server credential resolver), MR-974 (agent setup surface — the installer that wires this), MR-668 (multi-graph server — shipped, the routing this builds on)
-**Builds on:** [omnigraph#128](https://github.com/ModernRelay/omnigraph/pull/128) (`ragnorc/stored-queries-mcp`) — the shipped stored-query registry, `GET /queries`, `POST /queries/{name}`, and the coarse `invoke_query` gate.
-**Supersedes:** the MCP-transport portion of [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) (`/mcp/tools` + `/mcp/invoke`). See [Relationship to RFC-001](#relationship-to-rfc-001).
-**Target release:** v0.8.x (phased — see Rollout)
+# RFC-003: MCP Server Surface for `omnigraph-server` — Full Tool Parity, Stored Queries, Modular Auth
+
+**Status:** Proposed — buildable implementation spec, **validated against `main`
+at the 0.7.0 development head** (post-v0.6.2; workspace bumped to 0.7.0 in
+`dedd647`). A reference implementation shipped in
+[omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157) (proved rmcp
+1.7 on edition 2024, the auth-extension passthrough, and the full
+tool/resource/Cedar surface); this RFC is the canonical spec for a clean
+reimplementation from `main` with a dedicated `omnigraph-mcp` crate.
+**Date:** 2026-06-13
+**Tickets:** MR-969 (stored queries + MCP exposure), MR-956 (OAuth/RFC-9728
+fast-follow — the auth substrate), MR-971 (per-server credential resolver — landed
+as RFC-007), MR-974 (`omnigraph mcp install`), MR-668/RFC-005 (multi-graph server — shipped).
+**Builds on:** [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md)
+(stored queries + response envelope + the parent MCP design),
+[rfc-005-server-cluster-boot.md](rfc-005-server-cluster-boot.md) (multi-graph boot),
+[rfc-007-operator-config.md](rfc-007-operator-config.md) (the landed client
+credential model), [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md)
+(the embedded/remote `GraphClient` unification this surface is a sibling of), and
+[omnigraph#128](https://github.com/ModernRelay/omnigraph/pull/128) (the shipped
+stored-query registry, `GET /queries`, `POST /queries/{name}`, the coarse
+`invoke_query` gate).
+**Supersedes:** the MCP-transport portion of `rfc-001` (`GET /mcp/tools` + `POST
+/mcp/invoke`). See [Relationship to RFC-001](#relationship-to-rfc-001).
+**Target release:** v0.8.x (per RFC-009, MCP-adjacent work is post-v0.7.0).
+
+> **Provenance.** Every snippet labelled **`EXISTING (reuse)`** is copied verbatim
+> from `origin/main` with a `file:line` citation, re-read from source on
+> 2026-06-13 — verify with `git show origin/main:<file>`. Snippets labelled
+> **`NEW (build this)`** are the interfaces to add. The one behavior this repo
+> cannot verify — that rmcp 1.7 forwards the request's `http::request::Parts`
+> into `RequestContext.extensions` — was proven by the reference impl
+> [omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157); it is
+> carried forward as a cited assumption, pinned by a crate-level surface guard
+> (§11). This document absorbed the standalone implementation-spec draft
+> (formerly `rfc-010`); there is one MCP RFC.
 
 ## Summary
 
-Add a first-class **MCP (Model Context Protocol) server surface to `omnigraph-server`**, exposed over **Streamable HTTP**, that projects the server's operations as MCP tools and resources for LLM clients (Claude Code/Desktop/web, Cursor, etc.). Two populations of tools share one projection path:
-
-1. **Built-in operational tools** — parity with the existing `@modernrelay/omnigraph-mcp` stdio package's operational tool set (`health`, `snapshot`, `schema_get`, `branches_list`, `commits_list`, `commits_get`, ad-hoc read/write, `ingest`, branch ops, `schema_apply`) and its core resources (`omnigraph://schema`, `omnigraph://branches`). (Exact counts moved over time — the package is now at 16 tools / ~9 resources; see [Relationship](#relationship-to-the-modernrelayomnigraph-mcp-stdio-package). The authoritative as-built catalog is [B.5](#b5-tool-catalog-13-built-ins--cedar-mapping): 13 tools.) (Server-scoped graph discovery — a `graphs_list` tool / `omnigraph://graphs` resource — was considered but **dropped from MCP**; it stays REST-only via `GET /graphs`. See [B.10](#b10-routing-resolved).)
-2. **Dynamic stored-query tools** — one MCP tool per `mcp.expose: true` entry in the `queries:` registry (MR-969 / #128), with parameters typed from the `.gq` declaration via the shipped `query_catalog_entry` / `param_descriptor` projection.
-
-Every tool is **authorized by the server's existing Cedar policy engine**. The MCP layer never implements its own authentication: it consumes an **already-resolved `ResolvedActor`** from the server's bearer middleware (`require_bearer_auth` today; the `TokenVerifier` seam when MR-956 lands), so the **same MCP endpoint serves on-prem (static or customer-OIDC tokens) and our cloud (WorkOS OAuth) by configuration only**. Cloud OAuth is an additive layer (RFC 9728 protected-resource metadata) that slots in with zero MCP changes.
-
-The end-state collapses two diverging tool implementations into one: the in-server MCP is the canonical, Cedar-gated, remotely-reachable surface; the stdio package becomes a thin stdio↔HTTP proxy (local on-ramp) over it.
+Add a first-class **MCP (Model Context Protocol) server surface to
+`omnigraph-server`**, exposed over **Streamable HTTP**, projecting the server's
+operations as MCP tools and resources for LLM clients (Claude Code/Desktop/web,
+Cursor, ChatGPT dev-mode, etc.). Two tool populations share one projection path:
+
+1. **Built-in operational tools** — `health`, `snapshot`, `schema_get`,
+   `branches_list`, `commits_list`, `commits_get`, ad-hoc `query`/`mutate`,
+   `load` (NDJSON), `branches_create`/`branches_delete`/`branches_merge`,
+   `schema_apply` — plus resources `omnigraph://schema` and `omnigraph://branches`.
+   The authoritative catalog is [§6](#6-tool-catalog-13-built-ins--cedar-mapping)
+   (13 tools). Server-scoped graph discovery (`graphs_list` / `omnigraph://graphs`)
+   is **dropped from MCP** — it stays REST-only via `GET /graphs` ([§9](#9-routing--existing-reuse-validated)).
+2. **Dynamic stored-query tools** — one MCP tool per `expose: true` entry in the
+   graph's loaded stored-query registry, typed from the `.gq` declaration via the
+   shipped `query_catalog_entry` / `ParamDescriptor` projection ([§7](#7-stored-query-projection--existing-reuse--the-corrected-schema)).
+
+Every tool is **authorized by the server's existing Cedar policy engine**. The MCP
+layer never implements its own authentication: it consumes an **already-resolved
+`ResolvedActor`** from the server's bearer middleware, so the **same MCP endpoint
+serves on-prem (static or customer-OIDC tokens) and cloud (WorkOS OAuth) by
+configuration only** — cloud OAuth is an additive RFC-9728 layer that slots in with
+zero MCP changes. The end-state collapses two diverging tool implementations into
+one: the in-server MCP is canonical; the stdio package degrades to a thin
+stdio↔HTTP proxy over it.
+
+> **Key caveat, up front:** the headline "a token Cedar-scoped to a *specific set*
+> of stored queries" requires **per-query `invoke_query` scope**, which is *designed*
+> (rfc-001) but **not yet implemented** — the shipped action is coarse (any stored
+> query on the graph, or none; `policy/src/lib.rs:59-73`). Per-actor Cedar curation
+> works today for *built-in vs ad-hoc vs admin* tools and for *stored-vs-ad-hoc*;
+> sub-selecting individual stored queries per actor is gated on PR 0b ([§12](#12-decisions--locked--open)).
+
+---
+
+## 0. What changed since the first blueprint draft (validation deltas)
+
+The first blueprint of this RFC was authored ~79 commits behind the current head.
+Building from it verbatim would miss these; each is corrected in the noted section.
+
+| # | The first draft said | `main` reality | §  |
+|---|---|---|---|
+| D1 | `ParamKind::Vector { dim: u32 }`, "dim intrinsic, never an `Option`" | `ParamKind::Vector` is a **unit** variant; dim is `ParamDescriptor.vector_dim: Option<u32>` | §7 |
+| D2 | handlers / `server_invoke_query` / auth in `lib.rs`; "verified against `{lib.rs,api.rs}`" | moved to **`handlers.rs`**; stored-query config in **`config.rs`**; DTOs in **`api.rs`** | §4–§8 |
+| D3 | `ingest` tool wraps `ingest_as`; "(`+BranchCreate` when `from` forks)" | `server_ingest` calls **`load_as`**; missing branch **without `from` → 404**, no implicit fork; `ingest_as` shim retired server-side; `ingest` is now a **deprecated CLI alias of `load`** | §8 |
+| D4 | client creds: `servers: { onprem: { endpoint, auth: { token\|oauth } } }`; chain env→**keychain**→file | `OperatorServer { url }` (field is `url`, **no `auth:` block**, "no tokens in this file, ever"); chain is **env → credentials file** (no keychain step) | §10 |
+| D5 | tests: `cargo test -p omnigraph-server --test server` (incl. `mcp_*`), `--test server server_opens_s3…` | `server.rs` monolith split; suites are `auth_policy.rs / data_routes.rs / schema_routes.rs / stored_queries.rs / multi_graph.rs / boot_settings.rs / s3.rs / openapi.rs`. S3 → `--test s3` | §11 |
+| D6 | "`queries.<name>.mcp.expose` in `omnigraph.yaml`" | field survives (default `true`); multi-graph declaration is **`cluster.yaml graphs.<id>.queries`** with file-discovery (`677320c`); RFC-008 deprecates `omnigraph.yaml` | §7.3 |
 
-> **Key caveat, stated up front (see §5.9 below):** the headline "a token scoped via Cedar to a *specific set* of stored queries" requires **per-query `invoke_query` scope**, which is *designed* (rfc-001) but **not yet implemented** — the shipped action is coarse (any stored query on the graph, or none). Per-actor Cedar curation works today for *built-in vs ad-hoc vs admin* tools and for *stored-vs-ad-hoc*; sub-selecting individual stored queries per actor is gated on a prerequisite (PR 0b). Until then, stored-query curation is graph-level (registry membership + `mcp.expose`).
+Everything else from the blueprint (the crate split, the `McpBackend` seam,
+stateless Streamable-HTTP transport, the per-graph routing decision, the
+coarse-`invoke_query` caveat, "no `read`/`change` aliases", "no `graphs_list` in
+MCP") **validated against code and carries forward unchanged.**
 
-## Implementation Blueprint (canonical — build the fresh implementation from this)
+---
 
-> This section is the **authoritative, verified spec**. A reference implementation shipped in [omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157) and proved every choice here: rmcp **1.7.0** integrates on edition 2024, the auth-extension passthrough works, all conformance MUSTs are met, and the surface splits cleanly into a transport crate + a server backend. Build a clean implementation from `main` by following B.1–B.13 in order. Where this blueprint and the older §5 design sketches differ, **the blueprint wins** (the §5 text is retained as design rationale). Every reuse point named here exists in the engine/server today.
+## 1. Crate architecture & dependency direction
 
-### B.1 Crate architecture & dependency direction
+Two crates. `rmcp` lives **only** in `omnigraph-mcp`; `omnigraph-server` carries
+no direct rmcp dep.
 
-Split the surface into a **transport crate** plus a **server-side backend**:
+```
+omnigraph-server (impl McpBackend, all omnigraph logic)
+        │   depends on
+        ▼
+omnigraph-mcp    (rmcp transport, McpBackend trait, rmcp model re-exports)
+        │   depends on
+        ▼
+rmcp 1.7 + tower-http(limit) + axum + http
+```
 
-- **`crates/omnigraph-mcp`** (new) — the rmcp Streamable-HTTP transport, the `McpBackend` trait, and rmcp model re-exports. `rmcp` (+ `tower-http`'s `limit` feature) live **only** here.
-- **`omnigraph-server`** depends on `omnigraph-mcp` and *implements* `McpBackend`. All omnigraph-specific tool/resource/Cedar/dispatch logic lives in the server.
+The dependency **must** go `server → mcp`, never the reverse: the server binary
+mounts `/mcp`, so `mcp → server` would cycle at the package level
+(`server-bin → omnigraph-mcp → server-lib`). The trait inverts it — the crate
+defines the seam, the server fills it — which is also why the crate can never
+name `AppState`, `GraphHandle`, `do_*`, or `api::*`.
+
+`omnigraph-mcp/Cargo.toml`:
+
+```toml
+[package]
+name = "omnigraph-mcp"
+edition = "2024"           # matches the workspace (omnigraph-server is edition 2024)
+version.workspace = true
+
+[dependencies]
+rmcp = { version = "1.7", default-features = false, features = ["server", "transport-streamable-http-server"] }
+axum = { workspace = true }
+http = "1"
+tower-http = { workspace = true, features = ["limit"] }
+tokio = { workspace = true }
+async-trait = { workspace = true }
+serde_json = { workspace = true }
+```
 
-**The dependency MUST go `omnigraph-server → omnigraph-mcp`, never the reverse.** The server binary mounts `/mcp`, so a `mcp → server` dependency cycles at the package level (`server-bin → omnigraph-mcp → server-lib`), which Cargo rejects. The trait inverts the direction: the crate defines the seam, the server fills it. This is also *why* the crate cannot reach server internals (`AppState`, `do_*`, `authorize`, `api::*`) — it abstracts over them.
+Workspace edit — add the member (current `members`, `Cargo.toml`):
+
+```toml
+members = [
+    "crates/omnigraph-compiler",
+    "crates/omnigraph",
+    "crates/omnigraph-cli",
+    "crates/omnigraph-cluster",
+    "crates/omnigraph-policy",
+    "crates/omnigraph-server",
+    "crates/omnigraph-mcp",      # NEW
+]
+```
 
-`omnigraph-mcp/Cargo.toml`: `edition = "2024"`, version-locked to the workspace. Deps: `rmcp = { version = "1.7", default-features = false, features = ["server", "transport-streamable-http-server"] }`, `axum` (for `Router`), `http`, `tower-http = { features = ["limit"] }`, `tokio`, `async-trait`, `serde_json`. Add `"crates/omnigraph-mcp"` to the workspace `members`; in `omnigraph-server/Cargo.toml` drop `rmcp` and the `tower-http` `limit` feature, add `omnigraph-mcp`.
+In `omnigraph-server/Cargo.toml`: add `omnigraph-mcp = { workspace = true }`;
+do **not** add `rmcp`. (Confirmed absent today: `git grep rmcp origin/main --
+'**/Cargo.toml'` is empty.)
 
-### B.2 The `McpBackend` seam
+---
 
-Use `#[async_trait]` (boxed futures sidestep the async-fn-in-trait `Send`-bound friction; the cost is negligible at MCP QPS, and the server already depends on `async-trait`):
+## 2. The `McpBackend` seam — `NEW (build this)` in `omnigraph-mcp`
 
 ```rust
+// crates/omnigraph-mcp/src/lib.rs
+use async_trait::async_trait;
+
+// rmcp model types re-exported so the server uses them via `omnigraph_mcp::…`
+// and carries no direct rmcp dependency.
+pub use rmcp::model::{
+    Annotated, CallToolResult, Content, RawResource, ReadResourceResult, Resource,
+    ResourceContents, ServerCapabilities, ServerInfo, Tool, ToolAnnotations,
+};
+pub use rmcp::ErrorData as McpError;
+pub type JsonObject = serde_json::Map<String, serde_json::Value>;
+
 #[async_trait]
 pub trait McpBackend: Clone + Send + Sync + 'static {
     fn server_info(&self) -> ServerInfo;
-    async fn list_tools(&self, parts: &http::request::Parts) -> Result<Vec<Tool>, McpError>;
+
+    async fn list_tools(&self, parts: &http::request::Parts)
+        -> Result<Vec<Tool>, McpError>;
+
     async fn call_tool(&self, parts: &http::request::Parts, name: &str, args: JsonObject)
         -> Result<CallToolResult, McpError>;
-    async fn list_resources(&self, parts: &http::request::Parts) -> Result<Vec<Resource>, McpError>;
+
+    async fn list_resources(&self, parts: &http::request::Parts)
+        -> Result<Vec<Resource>, McpError>;
+
     async fn read_resource(&self, parts: &http::request::Parts, uri: &str)
         -> Result<ReadResourceResult, McpError>;
 }
 ```
 
-**Why `&http::request::Parts` (the load-bearing mechanism — verified in rmcp source and `#157`).** rmcp's `StreamableHttpService` injects the original request's `http::request::Parts` into `RequestContext.extensions`. The server's `require_bearer_auth` + `resolve_graph_handle` middleware run *before* the MCP service and insert `ResolvedActor` + `Arc<GraphHandle>` into the request's extensions. The crate hands `&Parts` to the backend; the backend reads its own types from `parts.extensions`. The crate never names an omnigraph type → auth stays decoupled (§5.8) and the crate is reusable. The crate re-exports the rmcp model types the backend needs: `McpError (= rmcp::ErrorData)`, `Tool`, `CallToolResult`, `Content`, `JsonObject`, `Resource`, `RawResource`, `ResourceContents`, `ReadResourceResult`, `ServerInfo`, `ServerCapabilities`, `ToolAnnotations`, `Annotated` — so the server uses rmcp types via `omnigraph_mcp::…` and carries no direct rmcp dep.
+`&http::request::Parts` is the whole decoupling mechanism. The crate hands the
+backend the request parts; the backend reads **its own** types out of
+`parts.extensions` (§4). The crate never names an omnigraph type. `#[async_trait]`
+boxes the futures — negligible at MCP QPS and it sidesteps the
+async-fn-in-trait `Send`-bound friction; the server already depends on
+`async-trait`.
 
-### B.3 Transport (lives in `omnigraph-mcp`)
+---
 
-- A generic `struct McpService<B>` implements rmcp's `ServerHandler`, delegating each method to `B` after extracting `&Parts` from `ctx.extensions` once (missing → `McpError::internal_error`). `get_info → backend.server_info()`; `initialize`/`ping` use rmcp's defaults.
-- `pub fn mcp_router<B: McpBackend>(backend: B, body_limit: usize, hosts: McpHostPolicy) -> axum::Router`:
-  - `config = StreamableHttpServerConfig::default().with_stateful_mode(false).with_json_response(true).with_allowed_hosts(hosts.allowed_hosts).with_allowed_origins(hosts.allowed_origins)` (`StreamableHttpServerConfig` is `#[non_exhaustive]` — build from `Default`, mutate via the `with_*` setters). **Do not keep rmcp's fixed loopback `allowed_hosts` default** — it `403`s every non-loopback client *before* bearer auth (a deploy footgun); derive the policy per B.3.1.
-  - `let svc = StreamableHttpService::new(move || Ok(McpService::new(backend.clone())), Arc::new(LocalSessionManager::default()), config)`; return `Router::new().route_service("/mcp", svc).layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit))`. rmcp reads the body directly (not via an axum extractor), so axum's `DefaultBodyLimit` does **not** bound `/mcp`; the tower-http layer does.
-- **Stateless mode delivers these conformance MUSTs for free (verified against rmcp 1.7 source):** `GET`/`DELETE /mcp` → `405` (with `Allow`); a disallowed `Host`/`Origin` → `403` (per the `McpHostPolicy`, B.3.1); `MCP-Protocol-Version` → `400` on unsupported, default `2025-03-26` when absent. **No conformance middleware is needed** (the §5.6 "honour the header" footnote is satisfied by rmcp).
+## 3. Transport — `NEW (build this)` in `omnigraph-mcp`
 
-**B.3.1 Host/Origin policy — derived from deployment, never a fixed default.** The `Host` header is not a security boundary for a bearer-authed server (the token is); rmcp's loopback `allowed_hosts` default exists to protect *local* dev servers from browser DNS-rebinding, and for any non-loopback deploy it just rejects legitimate clients. So `omnigraph-server` computes `McpHostPolicy` once at startup from what it already knows (the `--bind` address + any configured public host) and passes it in:
-- **Loopback bind** (`127.0.0.1` / `::1` / `localhost`) → loopback `allowed_hosts` (rebind protection genuinely matters locally).
-- **Non-loopback bind** (`0.0.0.0` / a public IP) → the operator chose remote reachability: allow the configured public host(s) if set, else **disable Host-allowlisting** (accept any `Host`) and log it — bearer auth + `Origin` validation are the real controls.
-- `allowed_origins` is the actual browser-attack control: default empty (off — non-browser MCP clients send no `Origin`), configurable for browser-based clients. The class "one fixed default that is wrong for some deployments" is closed by making the policy a function of the deployment.
-- **Client/test obligations rmcp enforces:** the request must carry `Accept: application/json, text/event-stream` (both), `Content-Type: application/json`, and a `Host` header. rmcp negotiates `protocolVersion` (a recent client sees `2025-11-25`).
+```rust
+// crates/omnigraph-mcp/src/transport.rs
+use std::sync::Arc;
+use rmcp::transport::streamable_http_server::{
+    StreamableHttpServerConfig, StreamableHttpService, session::local::LocalSessionManager,
+};
+
+pub struct McpHostPolicy {
+    pub allowed_hosts: Option<Vec<String>>,    // None ⇒ accept any Host
+    pub allowed_origins: Option<Vec<String>>,  // None/empty ⇒ off (non-browser clients send no Origin)
+}
 
-### B.4 Server-side backend (lives in `omnigraph-server`)
+pub fn mcp_router<B: McpBackend>(backend: B, body_limit: usize, hosts: McpHostPolicy) -> axum::Router {
+    // StreamableHttpServerConfig is #[non_exhaustive] — build from Default, mutate via with_*.
+    let config = StreamableHttpServerConfig::default()
+        .with_stateful_mode(false)
+        .with_json_response(true)
+        .with_allowed_hosts(hosts.allowed_hosts)
+        .with_allowed_origins(hosts.allowed_origins);
+
+    let svc = StreamableHttpService::new(
+        move || Ok(McpService::new(backend.clone())),
+        Arc::new(LocalSessionManager::default()),
+        config,
+    );
+
+    axum::Router::new()
+        .route_service("/mcp", svc)
+        // rmcp reads the body directly (NOT via an axum extractor), so axum's
+        // DefaultBodyLimit does NOT bound /mcp — the tower-http layer does.
+        .layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit))
+}
+```
 
-`struct OmnigraphMcpBackend { state: AppState }` (derive `Clone` — `AppState` is already `#[derive(Clone)]`, `Arc`-backed) implements `McpBackend`. Per request it resolves the actor + handle from `parts.extensions.get::<ResolvedActor>()` / `get::<Arc<GraphHandle>>()`.
+`McpService<B>` implements rmcp's `ServerHandler`, extracts `&Parts` from
+`ctx.extensions` once (missing → `McpError::internal_error`), and delegates each
+method to `B`. `get_info → backend.server_info()`; `initialize`/`ping` use rmcp
+defaults.
 
-**Reuse, never reinvent.** First factor **10 thin `do_*` fns** out of the inline `server_*` HTTP handlers (each is `authorize_request(...) → engine call → DTO`) so REST and MCP dispatch one path: `do_snapshot`, `do_schema_get`, `do_branches_list`, `do_commits_list`, `do_commit_show`, `do_ingest`, `do_branch_create`, `do_branch_delete`, `do_branch_merge`, `do_schema_apply`. (No `do_graphs_list` — `graphs_list` is not an MCP tool, see B.10; `server_graphs_list` stays inline for `GET /graphs`.) Land that as a behavior-neutral refactor commit first (it keeps the REST handlers as thin wrappers; all server tests stay green). Then reuse as-is: `run_query` / `run_mutate` (already decoupled from request bodies), `authorize` → `Authz { Allowed, Denied(msg) }` (with `Err` reserved for operational 401/500), `api::query_catalog_entry` / `ParamKind` / `read_output`, `ApiError` (add `pub(crate) status_code()` + `message_str()` accessors for the error classifier). Mount in `build_app`: `.merge(mcp::mcp_router(state))` inside the `per_graph_protected` group, where the server's thin `mcp::mcp_router(state) = omnigraph_mcp::mcp_router(OmnigraphMcpBackend::new(state), INGEST_REQUEST_BODY_LIMIT_BYTES)`.
+**Stateless mode gives these conformance MUSTs for free** (rmcp 1.7): `GET`/`DELETE
+/mcp → 405` with `Allow`; disallowed `Host`/`Origin → 403`; `MCP-Protocol-Version
+→ 400` on unsupported, default `2025-03-26` when absent. No conformance
+middleware needed.
 
-Represent the built-ins as a `Builtin` enum (one variant per tool; `descriptor` / `gate` / `call` as match arms) — lower liability than ~14 unit structs + `dyn` + `async-trait` per tool. Stored-query tools are a sibling populator over `handle.queries`.
+**Host/Origin policy is derived from the deployment, never a fixed default.**
+rmcp's loopback `allowed_hosts` default would `403` every non-loopback client
+before bearer auth. The server computes `McpHostPolicy` from its `--bind`: loopback
+bind → loopback allow-list (rebind protection matters locally); non-loopback bind
+→ allow the configured public host(s) or disable Host-allowlisting and log it
+(bearer + `Origin` are the real controls). `allowed_origins` is the browser-attack
+control: default off; configurable for browser-based clients.
 
-### B.5 Tool catalog (13 built-ins) + Cedar mapping
+---
 
-Each built-in reuses the **exact `PolicyAction` its REST route enforces**. (No `graphs_list` — server-scoped graph discovery is REST-only, see B.10.)
+## 4. The extension passthrough — `EXISTING (reuse)` types, `NEW` backend
 
-| MCP tool | Scope | Cedar action |
-|---|---|---|
-| `health` | server | none (liveness/version) |
-| `snapshot`, `schema_get`, `branches_list`, `commits_list`, `commits_get` | graph | `Read` |
-| `query` (ad-hoc read) | graph | `Read` (`run_query` self-authorizes) |
-| `mutate` (ad-hoc write) | graph | `Change` |
-| `ingest` (NDJSON) | graph | `Change` (+ `BranchCreate` when `from` forks) |
-| `branches_create` / `branches_delete` / `branches_merge` | graph | `BranchCreate` / `BranchDelete` / `BranchMerge` |
-| `schema_apply` (`allow_data_loss`) | graph | `SchemaApply` |
-| *stored query* | graph | `InvokeQuery` (coarse) then inner `Read`/`Change` |
+The two values the backend needs are injected into the request extensions by
+middleware that runs **before** the MCP service. Both are real types:
 
-Baked-in decisions (resolve the Open Questions):
-- **Tool ids are `query`/`mutate` only — no `read`/`change` aliases.** The server HTTP surface already deprecated `/read`,`/change`; a fresh in-server MCP has no legacy clients to keep, so it exposes only the canonical ids. [Open Q7 → resolved: no aliases.]
-- **Ad-hoc `query`/`mutate` are always exposed, Cedar-only** — no `mcp.allow_adhoc` switch. [Open Q3 → resolved: always-on + Cedar.]
+```rust
+// EXISTING (reuse) — crates/omnigraph-server/src/identity.rs:187
+pub struct ResolvedActor {
+    pub actor_id: Arc<str>,
+    pub tenant_id: Option<TenantId>,
+    pub scopes: Vec<Scope>,
+    pub source: AuthSource,        // Static today; Oidc when MR-956 lands
+}
+```
 
-Annotations (rmcp defaults `destructiveHint` **and** `openWorldHint` to **true**, so set them explicitly via `ToolAnnotations::new().read_only(b).destructive(b).open_world(b)`): read tools → `read_only(true).open_world(false)`; `mutate`/`ingest`/`branches_delete`/`branches_merge`/`schema_apply` → `read_only(false).destructive(true).open_world(false)`; `branches_create` (additive) → `read_only(false).destructive(false).open_world(false)`.
+```rust
+// EXISTING (reuse) — crates/omnigraph-server/src/registry.rs:37
+pub struct GraphHandle {
+    pub key: GraphKey,
+    pub uri: String,
+    pub engine: Arc<Omnigraph>,
+    pub policy: Option<Arc<PolicyEngine>>,        // None ⇒ no per-graph Cedar gate
+    pub queries: Option<Arc<QueryRegistry>>,      // None ⇒ no stored queries for this graph
+}
+```
 
-### B.6 Stored-query tools
+The middleware order is fixed in `build_app` (`lib.rs:909-956`, see §9): the outer
+layer `require_bearer_auth` injects `Extension<ResolvedActor>` (or 401); the inner
+layer `resolve_graph_handle` injects `Extension<Arc<GraphHandle>>`. Both land in
+`request.extensions()`, which rmcp forwards into `RequestContext.extensions`.
 
-One MCP tool per `mcp.expose` registry entry (named by its `tool_name`), projected from the **same `api::query_catalog_entry`** the `GET /queries` catalog uses; parameters → JSON Schema per B.7. The outer gate is the **coarse `InvokeQuery`** action (all exposed queries on the graph, or none — per-query scope is deferred, see B.13); the call then runs the registry source through `run_query` / `run_mutate`, whose inner `Read` / `Change` gate the body — the **double-gate of `POST /queries/{name}`**. Skip a stored tool whose name collides with a built-in (built-ins win, so the catalog never has a duplicate tool name). Dispatch reads the query params from `args.params` and the knobs from `args.branch` / `args.snapshot` (B.7's nested envelope), so a query parameter named `branch`/`snapshot` cannot be shadowed.
+```rust
+// NEW (build this) — crates/omnigraph-server/src/mcp/mod.rs
+#[derive(Clone)]
+pub struct OmnigraphMcpBackend { state: AppState }   // AppState is #[derive(Clone)], Arc-backed
+
+impl OmnigraphMcpBackend {
+    fn ctx<'a>(&self, parts: &'a http::request::Parts)
+        -> Result<(&'a ResolvedActor, &'a Arc<GraphHandle>), McpError>
+    {
+        let actor = parts.extensions.get::<ResolvedActor>()
+            .ok_or_else(|| McpError::internal_error("actor missing from request extensions", None))?;
+        let handle = parts.extensions.get::<Arc<GraphHandle>>()
+            .ok_or_else(|| McpError::internal_error("graph handle missing from request extensions", None))?;
+        Ok((actor, handle))
+    }
+}
 
-### B.7 `ParamKind` → JSON Schema (stored-query params)
+// The server's thin wrapper — the only place the two crates meet:
+pub fn mcp_router(state: AppState) -> axum::Router {
+    omnigraph_mcp::mcp_router(
+        OmnigraphMcpBackend { state },
+        INGEST_REQUEST_BODY_LIMIT_BYTES,        // lib.rs:137 — 32 MiB, reused
+        host_policy_from_bind(/* … */),
+    )
+}
+```
 
-| `ParamKind` | JSON Schema |
-|---|---|
-| String / Bool / Int / Float | `{"type":"string"}` / `{"type":"boolean"}` / `{"type":"integer"}` / `{"type":"number"}` |
-| BigInt (i64/u64) | `{"type":"string","pattern":"^-?\\d+$"}` (JSON numbers lose precision >2⁵³) |
-| Date / DateTime | `{"type":"string","format":"date"}` / `{"type":"string","format":"date-time"}` |
-| Blob | `{"type":"string","contentEncoding":"base64"}` |
-| Vector | `{"type":"array","items":{"type":"number"},"minItems":dim,"maxItems":dim}` — **`dim` is intrinsic to the kind** (`ParamKind::Vector { dim: u32 }`, never an `Option`), so a dimensionless vector is unrepresentable and the `unwrap_or(0)` footgun (a 0-length-array schema that rejects all input) cannot exist; if a `dim` is ever genuinely unknown, **omit** `minItems`/`maxItems` rather than emit `0`. |
-| List | `{"type":"array","items":<item_kind schema>}` (scalar items only) |
+---
 
-**Envelope (correct by construction).** The tool's `input_schema` **nests the query params under a `params` object**, mirroring the `POST /queries/{name}` request: `{ "params": { <the query's typed params> }, "branch"?: string, "snapshot"?: string }`. The invocation knobs and the query's own parameters then live in **separate namespaces and cannot collide** — a query param named `branch`/`snapshot` is harmless. `nullable == false` params go in the inner `params.required`. For a **mutation** tool, omit `snapshot` from the schema and set `additionalProperties: false`, so "mutation against a snapshot" is *unrepresentable* (the REST path `400`s it; the MCP schema makes it impossible). Fold `instruction` into the description.
+## 5. Reuse points for dispatch — `EXISTING (reuse)`, all in `handlers.rs`
 
-### B.8 `list` / `call` semantics
+The backend adds **no business logic**. `call_tool` routes to the same functions
+the REST handlers call.
 
-- **`list_tools` / `list_resources`** are Cedar-filtered. For each tool/resource, run the **same authorization the call path runs, with default args** (the default branch `main`) — **not** a separate `branch: None` probe (a `branch: None` request matches no `branch_scope` rule, so it would *hide* tools the actor can actually call on a scoped branch). Sharing the authorization input makes list and call agree by construction for the common case. Emit only `Allowed`; an `Err` (operational) propagates as a JSON-RPC error; a `Denied` hides. Stored-query tools list as a group iff the coarse `InvokeQuery` is allowed.
-- **`call_tool`:** an unknown tool **or** a denied tool returns the **identical** `unknown tool: <name>` (`-32602`) so the catalog can't be probed without the grant. Route **every** `do_*`/`run_*` result through **one canonical `classify(Result<_, ApiError>) -> Result<CallToolResult, McpError>`** — the single source of truth, used at every dispatch site so the mapping can't drift: `Ok` → success JSON content; `Err` 4xx/409 → `CallToolResult { isError: true }` (the model self-corrects, per the 2025-11-25 SEP-1303 split); `Err` operational 5xx → JSON-RPC error. (Do not keep two mappers — a single `classify` replaces the `api_error_to_tool` / `api_operational_error` split that drifted in the reference impl.) A missing/bad bearer is an HTTP `401` at the boundary *before* rmcp.
-- **Branch-scope residual (R7):** because visibility uses the default branch, a grant scoped to a *non-default* branch is an **over-show** — the tool lists and `call_tool` is the authoritative gate. Over-showing is the safe direction (the agent sees the tool, tries it, gets a clear deny); under-showing a callable tool is the failure mode the default-branch probe avoids.
+### 5.1 The authorization gate
 
-### B.9 Resources
+```rust
+// crates/omnigraph-server/src/handlers.rs:336
+pub(crate) enum Authz { Allowed, Denied(String) }
+
+// crates/omnigraph-server/src/handlers.rs:357 — Err reserved for operational 401/500
+pub(crate) fn authorize(
+    actor: Option<&ResolvedActor>,
+    policy: Option<&PolicyEngine>,
+    request: PolicyRequest,
+) -> Result<Authz, ApiError>;
+
+// crates/omnigraph-server/src/handlers.rs:438 — denial ⇒ 403
+pub(crate) fn authorize_request(
+    actor: Option<&ResolvedActor>,
+    policy: Option<&PolicyEngine>,
+    request: PolicyRequest,
+) -> Result<(), ApiError>;
+```
 
-Two resources: `omnigraph://schema` (`Read` → `do_schema_get`) and `omnigraph://branches` (`Read` → `do_branches_list`, JSON text). (No `omnigraph://graphs` — server-scoped, dropped with `graphs_list`; see B.10.) `list_resources`/`read_resource` Cedar-filtered + masked exactly like tools. Advertise the `resources` capability only because both handlers are backed (don't advertise a capability whose `read` would 404).
+`PolicyRequest` carries **no actor identity** (dropped from the type — the MR-731
+invariant) and **no query-name dimension** (the coarse-`invoke_query` caveat):
 
-### B.10 Routing (RESOLVED)
+```rust
+// crates/omnigraph-policy/src/lib.rs:252
+pub struct PolicyRequest {
+    pub action: PolicyAction,
+    pub branch: Option<String>,
+    pub target_branch: Option<String>,
+}
+// crates/omnigraph-policy/src/lib.rs:259
+pub struct PolicyDecision { pub allowed: bool, pub matched_rule_id: Option<String>, pub message: String }
+// crates/omnigraph-policy/src/lib.rs:509
+impl PolicyEngine { pub fn authorize(&self, actor_id: &str, request: &PolicyRequest) -> Result<PolicyDecision>; }
+```
 
-`/mcp` lives in the `per_graph_protected` route group: single mode → `POST /mcp`; multi mode → `POST /graphs/{graph_id}/mcp` (per-graph isolation; consistent with the `/graphs/{id}/...` REST cluster routing). [Open Q5 → resolved: per-graph, final.]
+### 5.2 The read / mutate runners (already decoupled from request bodies)
 
-**Decided: the MCP surface has no server-scoped tools or resources.** `graphs_list` and `omnigraph://graphs` are **dropped from MCP** — graph discovery is a REST/admin concern, served by `GET /graphs`. Every MCP tool/resource is graph-scoped, the per-graph `/mcp` is fully clean, and there is no flat server-level `/mcp`. (If a concrete need to enumerate graphs *over MCP* ever arises, add a flat server-level `POST /mcp` in the `management` group — bearer-only, no graph handle, server-scoped tools only — but do not build it speculatively.)
+```rust
+// crates/omnigraph-server/src/handlers.rs:731 — self-authorizes Read
+pub(crate) async fn run_query(
+    handle: Arc<GraphHandle>,
+    actor: Option<&ResolvedActor>,
+    query: &str,
+    name: Option<&str>,
+    params_json: Option<&Value>,
+    branch: Option<String>,
+    snapshot: Option<String>,
+    reject_mutations: bool,
+) -> Result<(String, ReadTarget, omnigraph_compiler::result::QueryResult), ApiError>;
+
+// crates/omnigraph-server/src/handlers.rs:665 — self-authorizes Change + admission
+pub(crate) async fn run_mutate(
+    state: AppState,
+    handle: Arc<GraphHandle>,
+    actor: Option<&ResolvedActor>,
+    query: &str,
+    name: Option<&str>,
+    params_json: Option<&Value>,
+    branch: String,
+) -> Result<ChangeOutput, ApiError>;
+```
 
-**Do not** consolidate to a single flat `/mcp` that takes `graph_id` per call: MCP's `tools/list` cannot carry a graph, so it can't list per-graph stored-query tools; it also breaks isolation, pollutes every tool's `input_schema`, and diverges from the URL-scoped REST routing.
+`run_query` returns the raw `QueryResult`; project it for the wire with the
+existing `read_output(query_name, &target, result) -> ReadOutput` (`api.rs:634`).
 
-### B.11 Auth (decoupled; OAuth is a committed fast-follow)
+### 5.3 The canonical double-gate + deny-as-404 pattern to mirror
 
-The handler consumes an already-resolved `ResolvedActor` and **branches on nothing** about how the token was verified (§5.8). Static bearer works **today** with the developer clients; the consumer connectors need OAuth, a planned additive layer that changes zero MCP code (it only swaps the bearer middleware behind a `TokenVerifier`, and serves RFC 9728 metadata).
+`server_invoke_query` (`handlers.rs:933`) is exactly the shape `call_tool` must
+reproduce for stored-query tools — reproduced here as the contract:
 
-| Integration | Static bearer (this surface) | Note |
-|---|---|---|
-| Claude Code, Cursor, VS Code | ✅ | `claude mcp add --transport http <url>/mcp --header "Authorization: Bearer <tok>"` |
-| Claude Messages API MCP connector | ✅ | caller passes `authorization_token` → `Authorization: Bearer` |
-| claude.ai web / Claude Desktop connectors | ❌ needs OAuth fast-follow | requires OAuth 2.1 + PKCE (S256) + RFC 9728 + DCR/CIMD/custom client id+secret |
-| ChatGPT developer-mode connectors | ❌ needs OAuth fast-follow | OAuth 2.1 (CIMD/DCR/PKCE) or "no auth"; no static-bearer mode |
+```rust
+// EXISTING (reuse pattern) — handlers.rs:953  (outer InvokeQuery gate, deny == missing)
+match authorize(actor_ref, handle.policy.as_deref(), PolicyRequest {
+    action: PolicyAction::InvokeQuery,
+    branch: None,            // graph-scoped: NO branch dimension on the outer gate
+    target_branch: None,
+})? {
+    Authz::Allowed => {}
+    Authz::Denied(_) => return Err(ApiError::not_found(NOT_FOUND)),  // "stored query not found"
+}
+let stored = handle.queries.as_ref()
+    .and_then(|r| r.lookup(&name))
+    .ok_or_else(|| ApiError::not_found(NOT_FOUND))?;                 // same 404 on a miss
+// … then the inner gate runs inside run_mutate (Change) / run_query (Read).
+if is_mutation {
+    if req.snapshot.is_some() {                                      // mutation-against-snapshot unrepresentable
+        return Err(ApiError::bad_request("stored mutation cannot target a snapshot"));
+    }
+    // run_mutate(…)  ← inner Change gate
+} else {
+    // run_query(…)   ← inner Read gate
+}
+```
 
-OAuth fast-follow (MR-956): serve `/.well-known/oauth-protected-resource` + `WWW-Authenticate` on 401, front a managed AS (WorkOS AuthKit by default) that supports DCR + PKCE, validate audience-bound JWTs offline → `ResolvedActor`. Keep it **config-gated/dual-mode** so a server that does not advertise OAuth lets the dev clients keep using the static `Authorization` header (avoids the Claude Code header-vs-OAuth conflict).
+For MCP, the same masking principle applies but the wire encoding differs: an
+**unknown tool OR a denied tool** returns the identical `unknown tool: <name>`
+JSON-RPC error (`-32602`) so the catalog can't be probed without the grant (§6).
 
-### B.12 Tests & verification
+### 5.4 Error classification — `NEW (build this)`, one mapper
 
-- **Protocol:** `initialize` handshake + advertised `{tools, resources}` caps; `tools/list` shape; `tools/call` happy path; JSON-RPC errors (`-32601`/`-32602`); `resources/list` + `resources/read`; `GET /mcp` → 405; `MCP-Protocol-Version` 400/default; `Origin` → 403.
-- **Cedar (coarse):** a read-only actor sees the read tools but not `mutate`/`ingest`/`branches_*`/`schema_apply`; a denied `tools/call` masks byte-identically to an unknown one; stored queries listed only with `invoke_query`; the double-gate (an `invoke_query`-only actor sees a stored tool but the call surfaces `isError` when the inner `read` denies).
-- **Dispatch:** a `mutate` call writes end-to-end (proves the actor/handle extension passthrough); a malformed query → `isError:true`, not a JSON-RPC error.
-- **Resources:** list + read of `schema`/`branches`; a denied read masks as unknown.
-- **Auth decoupling / no-bearer:** `/mcp` 401s without a bearer (before rmcp) and 200s with one; the suite is green under the static-hash verifier (and a mock `ResolvedActor` source proves verifier-agnosticism).
-- **Crate-level:** a tiny `omnigraph-mcp/tests/` with a trivial `McpBackend` impl serving `initialize` + `GET→405` proves the crate stands alone; add an rmcp surface-guard there pinning `StreamableHttpServerConfig` field names + the `ServerHandler` method shapes.
-- **Verification commands:** `cargo build --workspace --locked`; `cargo tree -p omnigraph-server -e normal | grep rmcp` shows rmcp **only** transitively under `omnigraph-mcp`; `cargo test -p omnigraph-server --test server` (incl. the `mcp_*` cases, black-box over `build_app`) + `--test openapi` (no `/mcp` leak — it carries no `#[utoipa::path]`); live smoke: run the server with a bearer + policy, `curl` `initialize`/`tools/list`/`tools/call`/`GET→405`.
+`ApiError`'s fields are **private** (`lib.rs:296`), so add two accessors:
 
-### B.13 Decisions locked
+```rust
+// EXISTING shape — crates/omnigraph-server/src/lib.rs:296
+pub struct ApiError {
+    status: StatusCode, code: ErrorCode, message: String,
+    merge_conflicts: Vec<api::MergeConflictOutput>,
+    manifest_conflict: Option<api::ManifestConflictOutput>,
+}
 
-- **rmcp 1.7** (not hand-rolled) — verified to integrate on edition 2024. [Open Q2 → resolved.]
-- **Coarse `invoke_query` only**; per-query scope deferred (PR 0b — adds a query-name dimension to `PolicyRequest` + the Cedar schema). [The headline caveat.]
-- **Ad-hoc `query`/`mutate` always exposed, Cedar-only**; no `mcp.allow_adhoc`. [Open Q3 → resolved.]
-- **`query`/`mutate` ids only**, no `read`/`change` aliases. [Open Q7 → resolved.]
-- **Per-graph `/mcp` routing**; `graphs_list`/`omnigraph://graphs` **dropped from MCP** (graph discovery via REST `GET /graphs`); no server-scoped MCP tools. [Open Q5 → resolved.]
-- **text-JSON `content` for v1**; `structuredContent`/`outputSchema` deferred. [Open Q4 → resolved.]
-- **BigInt as JSON string.** [Open Q1 → resolved.]
-- **Static bearer now, OAuth/RFC-9728 fast-follow.**
+// NEW: add accessors so the MCP classifier can read status/message.
+impl ApiError {
+    pub(crate) fn status_code(&self) -> StatusCode { self.status }
+    pub(crate) fn message_str(&self) -> &str { &self.message }
+}
+```
 
-## Relationship to RFC-001
+```rust
+// NEW (build this) — the single source of truth, used at EVERY dispatch site
+fn classify(r: Result<CallToolResult, ApiError>) -> Result<CallToolResult, McpError> {
+    match r {
+        Ok(out) => Ok(out),
+        Err(e) if e.status_code().is_client_error() =>          // 4xx / 409 → model self-corrects
+            Ok(CallToolResult::error(vec![Content::text(e.message_str())])),  // isError: true
+        Err(e) => Err(McpError::internal_error(e.message_str().to_owned(), None)),  // 5xx → JSON-RPC error
+    }
+}
+```
 
-[rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) (MR-656 / MR-976 / MR-969) is the parent design for stored queries + the response envelope + MCP. This RFC is the **detailed MCP-transport design** that #128 left for a follow-up, and it **revises rfc-001 in three places where the shipped code or the MCP wire protocol diverged from rfc-001's sketch**:
+One `classify`, not the `api_error_to_tool` / `api_operational_error` split that
+drifted in the reference impl. A missing/bad bearer is an HTTP 401 at the boundary
+*before* rmcp.
 
-1. **Transport shape.** rfc-001 sketched `GET /mcp/tools` + `POST /mcp/invoke` (a bespoke REST pair). **That is not the MCP wire protocol — real MCP clients cannot connect to it.** This RFC implements actual MCP JSON-RPC over Streamable HTTP and reuses `query_catalog_entry` as a *projection source*, not a parallel surface. (rfc-001's own Open Question already leaned toward Streamable HTTP.)
-2. **Exposure config.** rfc-001 specified inline `.gq` pragmas (`@mcp(expose=…)`, default `expose=false`). **#128 shipped a different mechanism:** YAML `queries.<name>.mcp.expose` in `omnigraph.yaml`, **default `true`** (declaring a query in the manifest *is* the opt-in). This RFC builds on the shipped YAML form; the `.gq`-pragma design in rfc-001 is superseded for exposure.
-3. **Schema introspection.** rfc-001 lists "Schema introspection through MCP" as a **non-goal** ("agents see types through declared return shapes"). This RFC **revises that**: the operational-parity tools include `schema_get` and `omnigraph://schema` — *because the shipped stdio package already exposes both*. The non-goal is achieved by *policy*, not omission: `schema_get`/`omnigraph://schema` are Cedar-gated by `Read`, and the recommended locked-down agent policy denies `Read`, so a curated agent still never sees the schema. (rfc-001's intent is preserved; the mechanism moves from "don't build it" to "build it, gate it.")
+---
 
-Everything else in rfc-001 (two-paths-one-engine, per-query `invoke_query` *as the intended scope*, the response envelope, multi-graph per-graph endpoints) this RFC consumes unchanged.
+## 6. Tool catalog (13 built-ins) + Cedar mapping
 
-> **Numbering note:** the `TokenVerifier`/WorkOS auth design is referred to in code (`crates/omnigraph-server/src/identity.rs`) as "RFC 0001," which is a *different* document from this repo's `docs/dev/rfc-001-queries-envelope-mcp.md`. To avoid the collision this RFC cites the auth substrate as **MR-956** throughout, never "RFC 0001."
+Each built-in reuses the **exact `PolicyAction` its REST route enforces**. The full
+action vocabulary on `main`:
 
-## Reconciliation with shipped code (historical — pre-MCP, against #128 HEAD)
+```rust
+// EXISTING — crates/omnigraph-policy/src/lib.rs:18
+pub enum PolicyAction {
+    Read, Export, Change, SchemaApply,
+    BranchCreate, BranchDelete, BranchMerge,
+    Admin,        // reserved, no call site yet
+    GraphList,    // server-scoped (resource_kind == Server)
+    InvokeQuery,  // graph-scoped, coarse (no per-query dimension yet)
+}
+```
 
-> *Historical: this was the gap analysis against `#128` (the stored-query REST foundation) before the MCP surface was built. The three ❌ items below — the MCP protocol surface, and the `TokenVerifier` — were subsequently built/addressed in [#157](https://github.com/ModernRelay/omnigraph/pull/157) (transport, tools, resources) except per-query scope and OAuth, which remain deferred. For the current build instructions see the [Implementation Blueprint](#implementation-blueprint-canonical--build-the-fresh-implementation-from-this).*
+| MCP tool | Scope | Cedar action |
+|---|---|---|
+| `health` | server | none (liveness/version) |
+| `snapshot`, `schema_get`, `branches_list`, `commits_list`, `commits_get` | graph | `Read` |
+| `query` (ad-hoc read) | graph | `Read` (`run_query` self-authorizes) |
+| `mutate` (ad-hoc write) | graph | `Change` |
+| `load` (NDJSON; **renamed from `ingest`** — see §8) | graph | `Change` (+ `BranchCreate` **iff `from` present**) |
+| `branches_create` / `branches_delete` / `branches_merge` | graph | `BranchCreate` / `BranchDelete` / `BranchMerge` |
+| `schema_apply` (`allow_data_loss`) | graph | `SchemaApply` |
+| *stored query* | graph | `InvokeQuery` (coarse) then inner `Read`/`Change` |
 
-Verified against `crates/omnigraph-server/src/{lib.rs,api.rs}` and `crates/omnigraph-policy/src/lib.rs` at the `#128` branch head:
+Locked decisions (validated): **tool ids are `query`/`mutate` only** — no
+`read`/`change` aliases (the REST surface already deprecated `/read`, `/change`; a
+fresh MCP has no legacy clients). **Ad-hoc `query`/`mutate` are always exposed,
+Cedar-only** — no `mcp.allow_adhoc`. **No `graphs_list` / `omnigraph://graphs`** —
+graph discovery stays REST-only via `GET /graphs` (server-scoped `GraphList`, §9).
 
-- ✅ `GET /queries` returns the `mcp.expose == true` subset as `QueriesCatalogOutput { queries: [QueryCatalogEntry] }`, each with typed `ParamDescriptor`s, `tool_name`, `description`, `instruction`, and a `mutation` flag. **MCP-ready projection, but exposed as bespoke REST/JSON — not the MCP wire protocol.**
-- ✅ `POST /queries/{name}` route exists (`server_invoke_query`, `lib.rs`).
-- ✅ `query_catalog_entry()` / `param_descriptor()` with an exhaustive `ScalarType → ParamKind` map (a new scalar is a compile error).
-- ✅ `InvokeQuery` Cedar action defined in `omnigraph-policy`.
-- ✅ **`InvokeQuery` IS enforced** at `POST /queries/{name}`: `server_invoke_query` calls `authorize(PolicyAction::InvokeQuery)` and **masks a denial to a 404 identical to "unknown query"** so the catalog isn't probeable (the denial-masking the previous draft of this RFC reported as missing is shipped — it lives in `lib.rs`, not `api.rs`). The stored-mutation path is already double-gated: `InvokeQuery` outer, then `Change` inside `run_mutate`.
-- ✅ **Reuse path exists:** `run_query` / `run_mutate` are already decoupled from their HTTP request bodies and take registry-supplied `(source, name, params, branch/snapshot)`. MCP `tools/call` for both stored and ad-hoc tools delegates to these — no new business logic.
-- ❌ **Per-query (`invoke_query[name]`) scope is NOT implemented.** `PolicyRequest` carries only `{action, branch, target_branch}` — **no query-name dimension** — and the action is documented coarse ("permits *any* stored query on the graph"). rfc-001 *designed* per-name scope; it is unbuilt. This RFC's per-query Cedar filtering (§5.4) and recommended agent policy (§5.9) depend on it → tracked as **PR 0b**.
-- ❌ No MCP protocol surface (`initialize`/`tools/list`/`tools/call`, JSON-RPC, transport).
-- ❌ No `TokenVerifier` trait yet — `require_bearer_auth` resolves a `ResolvedActor` inline (static-hash). The trait/`OidcJwtVerifier` are MR-956 (draft). The MCP layer's only requirement — *consume `ResolvedActor`* — is satisfiable today.
+Annotations (rmcp defaults `destructiveHint` **and** `openWorldHint` to true — set
+explicitly): read tools → `read_only(true).open_world(false)`;
+`mutate`/`load`/`branches_delete`/`branches_merge`/`schema_apply` →
+`read_only(false).destructive(true).open_world(false)`; `branches_create`
+(additive) → `destructive(false)`.
 
-Stack (verified `Cargo.toml`): Axum + utoipa (OpenAPI) + `omnigraph-policy` (Cedar) + `futures` + `tokio`. **No MCP crate present.** `edition = "2024"`.
+Represent built-ins as a `Builtin` enum (one variant per tool; `descriptor`/`gate`/`call`
+as match arms) — lower liability than ~13 unit structs + `dyn` + `async-trait`.
+Stored-query tools are a sibling populator over `handle.queries`.
 
-## Motivation
+`list_tools`/`list_resources` are Cedar-filtered by running the **same**
+authorization the call path runs, with **default args (branch `main`)** — not a
+`branch: None` probe (which matches no `branch_scope` rule and would hide callable
+tools). Over-showing a branch-scoped grant is the safe direction; `call_tool` is
+the authoritative gate. (The contract: list never shows a tool the actor can't ever
+call; for branch-scoped tools it may show one callable only on some branches.)
 
-- **One curated, safe, remotely-reachable tool surface.** MR-969's thesis: hand an LLM a token Cedar-scoped to a set of tools and it sees exactly those typed tools — cannot construct ad-hoc queries it isn't permitted, cannot read the schema it isn't permitted, cannot reach other graphs. Today the only MCP is the stdio package: local-only, full surface, ungated.
-- **Parity, so the in-server MCP can be the single implementation.** Operators/agents already depend on the operational tools. Supporting them server-side behind one Cedar gate lets the stdio package degrade to a proxy and removes two diverging tool sets.
-- **On-prem and cloud from one endpoint.** A managed cloud (WorkOS OAuth) and an on-prem/air-gapped deploy (static or customer-OIDC tokens) must serve the same MCP without forks or MCP-specific auth.
-- **Foundation for the agent on-ramp (MR-974).** `omnigraph mcp install --agent <tool>` needs a decided transport + a stable endpoint.
+---
 
-## Goals
+## 7. Stored-query projection — `EXISTING (reuse)` + the **corrected** schema
 
-- Project built-in tools + stored queries as MCP tools through **one** registry abstraction.
-- `tools/list` and the callable set are **identical for argument-independent authorization**, both driven by Cedar (see §5.4 for the branch-scoped caveat).
-- The MCP layer is **auth-method-agnostic**: it consumes `ResolvedActor`, never a raw token, never branches on how auth happened.
-- The same endpoint works on-prem (static/OIDC) and cloud (WorkOS OAuth), switched by config; cloud OAuth is additive (RFC 9728).
-- No new business logic: MCP tools delegate to the same `run_query`/`run_mutate`/branch/schema functions the HTTP routes call.
-- Behaviour-neutral when unused: no MCP traffic = no change.
+The projection source is the same `query_catalog_entry` the `GET /queries` catalog
+uses. The real types (note `Vector` is a **unit** variant; the dim is a separate
+`Option<u32>` field — this is delta **D1**):
 
-## Non-Goals
+```rust
+// EXISTING — crates/omnigraph-server/src/api.rs:346
+pub enum ParamKind { String, Bool, Int, BigInt, Float, Date, DateTime, Blob, Vector, List }
+
+// crates/omnigraph-server/src/api.rs:362
+pub struct ParamDescriptor {
+    pub name: String,
+    pub kind: ParamKind,
+    pub item_kind: Option<ParamKind>,   // Some(scalar) when kind == List
+    pub vector_dim: Option<u32>,        // Some(dim) when kind == Vector  ← dim lives HERE, not in the kind
+    pub nullable: bool,
+}
 
-- **Building/hosting an OAuth authorization server.** The server is a Resource Server; WorkOS AuthKit+Connect is the AS (MR-956). The MCP endpoint validates tokens, never issues them, never holds client secrets.
-- **OAuth/WorkOS implementation itself** — MR-956's work. This RFC leaves a clean RFC-9728 hook and consumes `ResolvedActor`.
-- **MCP prompts, elicitation, `tools/list_changed`, resource subscriptions, server-initiated messages.** None needed → enables a stateless POST-only transport (§5.6).
-- **stdio transport inside the server.** stdio stays in the TS package (now a proxy).
-- **Cross-graph tool listing.** Per-graph catalogs only (MR-969 + RFC-002 non-goal).
-- **Hot reload of the query registry.** Restart-only (MR-969).
+// crates/omnigraph-server/src/api.rs:453
+pub fn query_catalog_entry(query: &StoredQuery) -> QueryCatalogEntry;  // → { name, tool_name, description, instruction, mutation, params }
+```
 
-## Background
+```rust
+// EXISTING — crates/omnigraph-server/src/queries.rs:32
+pub struct StoredQuery {
+    pub name: String,
+    pub source: Arc<str>,
+    pub decl: QueryDecl,
+    pub expose: bool,             // default true
+    pub tool_name: Option<String>,
+}
+impl StoredQuery {
+    pub fn is_mutation(&self) -> bool;           // queries.rs:51
+    pub fn effective_tool_name(&self) -> &str;   // queries.rs:60 — tool_name override else name
+}
+pub struct QueryRegistry { /* by_name: BTreeMap<String, StoredQuery> */ }   // queries.rs:67; .lookup(&name)
+```
 
-`omnigraph-server` (Axum) already implements every operation this RFC exposes as an authenticated HTTP route; each authorizes via a `PolicyAction` against the Cedar policy for a server-resolved actor and calls into the engine. The existing stdio MCP package is a *client* of these routes (it owns no business logic). MR-956 will introduce a `TokenVerifier` trait (`StaticHashTokenVerifier` today inline, `OidcJwtVerifier` for OIDC/WorkOS) producing the `ResolvedActor { actor_id, tenant_id: Option, scopes: Vec<Scope>, source }` that already exists in `identity.rs` and is consumed by Cedar — token *validation* is offline (cached JWKS), so on-prem/air-gapped has no request-path dependency on the cloud.
+### 7.1 `ParamDescriptor → JSON Schema` — `NEW (build this)`, corrected for the real types
 
-## Design
+```rust
+// NEW (build this) — corrects the blueprint's nonexistent `ParamKind::Vector { dim }`.
+use serde_json::{json, Value};
+
+fn scalar_schema(kind: ParamKind) -> Value {
+    match kind {
+        ParamKind::String   => json!({ "type": "string" }),
+        ParamKind::Bool     => json!({ "type": "boolean" }),
+        ParamKind::Int      => json!({ "type": "integer" }),
+        ParamKind::BigInt   => json!({ "type": "string", "pattern": r"^-?\d+$" }), // i64/u64 lose precision >2^53 as JSON numbers
+        ParamKind::Float    => json!({ "type": "number" }),
+        ParamKind::Date     => json!({ "type": "string", "format": "date" }),
+        ParamKind::DateTime => json!({ "type": "string", "format": "date-time" }),
+        ParamKind::Blob     => json!({ "type": "string", "contentEncoding": "base64" }),
+        // Vector/List handled by param_schema (they carry sub-structure).
+        ParamKind::Vector | ParamKind::List => unreachable!("composite kinds handled in param_schema"),
+    }
+}
 
-> *§5 is the original design sketch (design rationale). Where it differs from the [Implementation Blueprint](#implementation-blueprint-canonical--build-the-fresh-implementation-from-this) above, the Blueprint is authoritative. Notable divergences proven out by [#157](https://github.com/ModernRelay/omnigraph/pull/157): the §5.1 per-tool `McpTool` trait became a `Builtin` enum + an `McpBackend` crate trait (B.1–B.4); §5.6's rmcp-vs-hand-roll is resolved to rmcp 1.7 (B.3); §5.7's "server tools on a per-graph endpoint" is resolved in B.10; the §5.2 `read`/`change` aliases are dropped (B.5).*
+fn param_schema(p: &ParamDescriptor) -> Value {
+    match p.kind {
+        ParamKind::Vector => {
+            let mut s = json!({ "type": "array", "items": { "type": "number" } });
+            // dim is Option<u32> on the descriptor. Present ⇒ constrain length.
+            // Absent ⇒ OMIT minItems/maxItems — never emit 0 (a 0-length-array
+            // schema rejects all input; this is the footgun the first draft
+            // misattributed to a type that doesn't exist).
+            if let Some(dim) = p.vector_dim {
+                s["minItems"] = json!(dim);
+                s["maxItems"] = json!(dim);
+            }
+            s
+        }
+        ParamKind::List => {
+            let item = p.item_kind.map(scalar_schema).unwrap_or_else(|| json!({ "type": "string" }));
+            json!({ "type": "array", "items": item })  // grammar guarantees scalar items
+        }
+        scalar => scalar_schema(scalar),
+    }
+}
+```
 
-### 5.1 One tool model: a `McpTool` trait, two populators
+### 7.2 Envelope (correct by construction)
 
-Both built-in and stored-query tools implement one trait so `tools/list` / `tools/call` never special-case:
+The tool's `input_schema` **nests query params under `params`**, mirroring
+`POST /queries/{name}`, so the invocation knobs and the query's own params live in
+separate namespaces and cannot collide:
 
-```rust
-trait McpTool: Send + Sync {
-    fn name(&self) -> &str;                       // MCP tool id (stable)
-    fn title(&self) -> Option<&str>;
-    fn description(&self) -> &str;
-    fn input_schema(&self) -> serde_json::Value;  // JSON Schema (draft 2020-12)
-    fn annotations(&self) -> ToolAnnotations;     // readOnlyHint / destructiveHint / idempotentHint
-    /// The Cedar request(s) this call requires, given parsed args. Used BOTH at
-    /// list-time (dry-run filter, default args) and call-time (enforce, real args).
-    fn authorization(&self, args: &ToolArgs) -> Vec<PolicyRequest>;
-    async fn call(&self, ctx: &GraphCtx, args: ToolArgs) -> Result<ToolOutput, ToolError>;
-}
+```jsonc
+{ "type": "object",
+  "properties": {
+    "params": { "type": "object", "properties": { /* per-param param_schema(...) */ }, "required": [ /* nullable==false */ ] },
+    "branch":   { "type": "string" },
+    "snapshot": { "type": "string" }   // OMIT for mutation tools (mutation-against-snapshot is unrepresentable)
+  },
+  "additionalProperties": false }
 ```
 
-- **Built-ins**: ~14 static impls, each delegating to the *same* function its HTTP route calls (`run_query`, `run_mutate`, branch ops, `apply_schema_as`, …). `input_schema` authored once (or derived from each route's existing `utoipa`/`ToSchema` DTO).
-- **Stored queries**: generated `McpTool` instances, one per `mcp.expose` entry; `input_schema` from `param_descriptor` (§5.3); `authorization` → `InvokeQuery` (coarse today; `InvokeQuery{name}` after PR 0b) then the inner `Read`/`Change`.
+Dispatch reads query params from `args.params` and knobs from `args.branch` /
+`args.snapshot`, so a query parameter literally named `branch`/`snapshot` is
+harmless. Skip a stored tool whose `effective_tool_name()` collides with a
+built-in (built-ins win). The outer gate is coarse `InvokeQuery`; the inner gate
+is the query body's `Read`/`Change` — the same double-gate as §5.3.
 
-`ToolRegistry` for a graph = the static built-ins + the dynamic stored-query tools resolved from that graph's `GraphHandle` registry.
+### 7.3 Where stored queries are declared (delta D6)
 
-### 5.2 Tool catalog (parity) and Cedar mapping
+`expose` + `tool_name` survive with default `expose: true` (`config.rs:132`,
+`queries.rs:43`). The **declaration source** moved: multi-graph servers declare
+queries in **`cluster.yaml graphs.<id>.queries`** with Terraform-shaped file
+discovery (`queries: queries/` discovers top-level `*.gq`, loud on parse/dup
+errors — `677320c`); the single-graph `omnigraph.yaml queries:` map still parses
+but is the **legacy** surface RFC-008 deprecates. The MCP projection is agnostic
+to the source — it reads the loaded `QueryRegistry` on `handle.queries` — but the
+blueprint's "`in omnigraph.yaml`" phrasing is corrected to "the graph's loaded
+stored-query registry (`cluster.yaml` discovery, or legacy `omnigraph.yaml`)."
 
-Each built-in **reuses the exact `PolicyAction` its HTTP route already enforces** — verified against the handlers in `lib.rs`, not invented:
+---
 
-| MCP tool | Scope | Read/Mutate | Cedar action (verified from route) |
-|---|---|---|---|
-| `health` | server | read | none (liveness/version) |
-| `graphs_list` *(new)* | server | read | `GraphList` |
-| `snapshot` | graph | read | `Read` |
-| `schema_get` | graph | read | `Read` |
-| `branches_list` | graph | read | `Read` |
-| `commits_list`, `commits_get` | graph | read | `Read` |
-| `read` (ad-hoc `.gq`) / `query` *(alias)* | graph | read | `Read` |
-| `change` (ad-hoc `.gq`) / `mutate` *(alias)* | graph | mutate | `Change` |
-| `ingest` (NDJSON) | graph | mutate | `Change` (+ `BranchCreate` when forking a new branch) |
-| `branches_create` | graph | mutate | `BranchCreate` |
-| `branches_delete` | graph | mutate | `BranchDelete` |
-| `branches_merge` | graph | mutate | `BranchMerge` |
-| `schema_apply` (`allow_data_loss`) | graph | mutate | `SchemaApply` |
-| **stored query** (`find_user`, …) | graph | inferred | `InvokeQuery` (coarse; `InvokeQuery{name}` after PR 0b) + inner `Read`/`Change` |
-
-There is **no `Ingest` and no separate `snapshot`/`Export` action** — `ingest` enforces `Change`, `snapshot` enforces `Read`. (`Export` exists but maps to the `/export` route, which this RFC does not expose as a tool.)
-
-**Tool id parity vs. canonicalization.** The shipped stdio package uses tool ids **`read`/`change`** (and calls the deprecated `/read`,`/change` routes). The server HTTP surface canonicalized to `/query`,`/mutate` with `/read`,`/change` deprecated (MR-656). To keep existing package clients working *and* align with the server, the MCP exposes **`query`/`mutate` as canonical with `read`/`change` retained as deprecated-but-live aliases** (both dispatch to the same handler). Open Q7 asks whether to drop the aliases later.
-
-Resources (§5.5): `omnigraph://schema`, `omnigraph://branches` (parity), plus `omnigraph://graphs` *(new)* — each gated by the same action as its list/get route (`Read`, `Read`, `GraphList`).
-
-### 5.3 `ParamDescriptor → JSON Schema` (stored-query tools)
-
-| `ParamKind` | JSON Schema | Notes |
-|---|---|---|
-| String | `{"type":"string"}` | |
-| Bool | `{"type":"boolean"}` | |
-| Int (i32/u32) | `{"type":"integer"}` | |
-| BigInt (i64/u64) | `{"type":"string","pattern":"^-?\\d+$"}` | JSON numbers lose precision >2⁵³ → string (matches the shipped `api.rs` rationale). (Open Q1) |
-| Float (f32/f64) | `{"type":"number"}` | |
-| Date | `{"type":"string","format":"date"}` | |
-| DateTime | `{"type":"string","format":"date-time"}` | |
-| Blob | `{"type":"string","contentEncoding":"base64"}` | |
-| Vector | `{"type":"array","items":{"type":"number"},"minItems":dim,"maxItems":dim}` | uses `vector_dim` |
-| List | `{"type":"array","items":<item_kind schema>}` | scalar items only (grammar guarantees) |
+## 8. The `load` tool — `EXISTING (reuse)`, ingest/load unified (delta D3)
 
-`nullable == false` → param is in `required`. Annotations: `mutation` → `{readOnlyHint:false, destructiveHint:true}`; else `{readOnlyHint:true}`. `description` → tool description; `instruction` → appended to description (or `_meta`). (The shipped `check()` already warns when an `mcp.expose` query declares a `Vector` param an LLM can't supply.)
+The first draft described an `ingest` tool wrapping `ingest_as` with implicit
+fork-from-`main`. **That is gone.** `server_ingest` now calls `load_as` directly and
+fork is opt-in by `from`:
 
-For built-in tools the schema is hand-authored from the route DTO; e.g. `query` → `{source: string, branch?: string, params?: object}`; `schema_apply` → `{schema: string, allow_data_loss?: boolean}`; `ingest` → `{ndjson: string, mode?: "merge"|"append"|"overwrite", branch?: string}`.
-
-### 5.4 `tools/list` (Cedar-filtered) and `tools/call` (dispatch + masking)
+```rust
+// EXISTING — crates/omnigraph-server/src/handlers.rs:1210  (abridged to the load-bearing logic)
+let branch = request.branch.unwrap_or_else(|| "main".to_string());
+let from = request.from;                                  // Option<String>
+let mode = request.mode.unwrap_or(LoadMode::Merge);
+if !branch_exists {
+    match from.as_deref() {
+        None => return Err(ApiError::not_found(           // ← no implicit fork; a typo'd branch is a 404
+            format!("branch '{branch}' not found; pass `from` to create it"))),
+        Some(from) => authorize_request(actor, handle.policy.as_deref(), PolicyRequest {
+            action: PolicyAction::BranchCreate,           // ← BranchCreate consulted ONLY when a fork happens
+            branch: Some(from.to_string()), target_branch: Some(branch.clone()),
+        })?,
+    }
+}
+authorize_request(actor, handle.policy.as_deref(), PolicyRequest {
+    action: PolicyAction::Change, branch: Some(branch.clone()), target_branch: None,
+})?;
+let result = handle.engine.load_as(&branch, from.as_deref(), &request.data, mode, actor_id).await?;  // unified load_as
+```
 
-- **`tools/list`**: build the `ToolRegistry`; for each tool evaluate `authorization(default_args)` against the actor's Cedar policy; **emit only tools that authorize**. Authz decisions memoized per request. Stored-query tools additionally require `mcp.expose: true`.
-  - **Exactness caveat (R7 is conditional):** the listed set equals the callable set **only for tools whose authorization is argument-independent** (`health`, `graphs_list`, `snapshot`, `schema_get`, `branches_list`, `commits_*`, ad-hoc `query`/`mutate`, and stored queries under the *coarse* action). For **branch-scoped tools** (`branches_create`/`merge` with `target_branch_scope`, and any branch-scoped `Read`/`Change` rule), list-time uses `default_args` (e.g. branch `main`) and cannot know the real target, so the listed set is a *best-effort approximation* of callability — a call may still be denied (or, rarely, a hidden tool would have been allowed). `tools/call` is always the authoritative gate. The contract is: **list never shows a tool the actor can't ever call; for branch-scoped tools it may show one the actor can call only on some branches.**
-- **`tools/call`**: resolve `name` → `McpTool` (masked-404 if unknown *or* `mcp.expose:false`); parse+validate args against `input_schema`; enforce `authorization(args)` (mutations stay double-gated: `InvokeQuery` then `Change`); on success `call`. **Denial masking** lives in one place (the dispatcher): an authz denial is returned identically to "unknown tool" (§5.10), reusing the same deny≡missing principle already shipped at `POST /queries/{name}`.
+DTOs (`api.rs:497` / `:208`):
 
-### 5.5 Resources
+```rust
+pub struct IngestRequest {                 // request body shape unchanged on the wire
+    pub branch: Option<String>,            // default "main"; without `from`, must already exist (else 404)
+    pub from:   Option<String>,            // parent branch; presence = opt into fork-if-missing
+    pub mode:   Option<LoadMode>,          // default Merge
+    pub data:   String,                    // NDJSON: {"type":"<T>","data":{…}} per line
+}
+pub struct IngestOutput {
+    pub uri: String, pub branch: String,
+    pub base_branch: Option<String>,       // echoes `from`; NULL when absent (was non-null in the blueprint era)
+    pub branch_created: bool, pub mode: LoadMode,
+    pub tables: Vec<IngestTableOutput>, pub actor_id: Option<String>,
+}
+```
 
-Advertise `resources` capability (`subscribe:false, listChanged:false`). `resources/list` → the URIs the actor may read; `resources/read` → schema `.pg` text / branches JSON / (multi-graph) graphs JSON, each gated by the corresponding action (`Read`, `Read`, `GraphList`). A locked-down agent denied `Read` simply never sees `omnigraph://schema` or `omnigraph://branches` — this is how rfc-001's "agents don't introspect schema" intent is met *by policy* (§Relationship-to-RFC-001).
+**Spec decisions for the MCP tool:**
 
-### 5.6 Transport: Streamable HTTP, stateless, POST-only
+1. **Name the tool `load`, not `ingest`.** The CLI now ships `ingest` only as a
+   *"Deprecated alias of `load --from <base>`"* (`cli.rs:111`); `load` is the
+   unified command. A fresh MCP with no legacy clients should expose the canonical
+   id — consistent with §6's "no `read`/`change` aliases" decision. (The HTTP route
+   stays `POST /ingest` for back-compat; only the MCP tool id is canonicalized.)
+2. **`do_load` wraps `load_as`** via the same body as `server_ingest`.
+3. **Input schema carries `from?`** and documents the 404-on-missing-branch:
+   `{ data: string, branch?: string, from?: string, mode?: "merge"|"append"|"overwrite" }`,
+   `additionalProperties: false`. There is no silent fork-from-main to represent.
 
-- **Streamable HTTP** (MCP's current standard; we're already an HTTP server). One endpoint per scope (§5.7).
-- Because the server emits **no** server-initiated messages, implement the **minimal conformant** shape: client `POST`s JSON-RPC, server replies `application/json`. **No SSE channel, no `Mcp-Session-Id`, stateless** — each request authenticated independently via the bearer middleware. Honour the `MCP-Protocol-Version` header. SSE/sessions can be added later if subscriptions land.
-- **JSON-RPC methods:** `initialize` (advertise `{tools:{listChanged:false}, resources:{listChanged:false, subscribe:false}}` + serverInfo/version), `notifications/initialized` (no-op ack), `ping`, `tools/list`, `tools/call`, `resources/list`, `resources/read`. `prompts/list` returns empty if probed.
-- **Library decision (Open Q2):** spike `rmcp` (official Rust MCP SDK) for conformance + Streamable-HTTP/Axum on edition 2024; **fall back to a hand-rolled ~150 LOC JSON-RPC-over-POST** (only the methods above) on friction. Given the tiny surface, hand-roll is an acceptable default.
+---
 
-### 5.7 Endpoint routing (server- vs graph-scoped)
+## 9. Routing — `EXISTING (reuse)`, validated
 
-- **Single-graph mode:** `POST /mcp` — graph tools + server tools (`health`, `graphs_list`).
-- **Multi-graph mode (MR-668):** `POST /graphs/{graph_id}/mcp` — graph-scoped tools for that graph; plus a server-level `POST /mcp` exposing only server-scoped tools (`health`, `graphs_list`). A per-graph endpoint never lists another graph's tools (isolation, tested). Mirrors the shipped `/graphs/{graph_id}/…` cluster routing. (Open Q5: confirm naming + whether server tools also appear on the per-graph endpoint.)
+`/mcp` is added **into `per_graph_protected`**, so it is flat in single mode and
+nested under `/graphs/{graph_id}` in multi mode automatically — no separate
+wiring. The real `build_app` (`lib.rs:907`):
 
-### 5.8 Modular / decoupled auth (the cross-cutting requirement)
+```rust
+// EXISTING — crates/omnigraph-server/src/lib.rs:915 (abridged)
+let per_graph_protected = Router::new()
+    .route("/snapshot", get(server_snapshot))
+    // … /query /mutate /queries /queries/{name} /schema /schema/apply /ingest /branches /commits …
+    .route("/mcp", post(server_mcp))                          // ← ADD HERE (or .merge(mcp_router(state)))
+    .route_layer(middleware::from_fn_with_state(state.clone(), resolve_graph_handle))  // inner: injects Arc<GraphHandle>
+    .route_layer(middleware::from_fn_with_state(state.clone(), require_bearer_auth));  // outer: injects ResolvedActor / 401
+
+let management = Router::new()
+    .route("/graphs", get(server_graphs_list))               // GraphList — server-scoped, NOT in MCP
+    .route_layer(middleware::from_fn_with_state(state.clone(), require_bearer_auth));
+
+let protected = match state.routing() {
+    GraphRouting::Single { .. } => per_graph_protected.merge(management),                  // → POST /mcp
+    GraphRouting::Multi  { .. } => Router::new()
+        .nest("/graphs/{graph_id}", per_graph_protected).merge(management),                // → POST /graphs/{id}/mcp
+};
+```
 
-**Invariant (load-bearing, satisfiable today):** the MCP handler receives an **already-resolved `ResolvedActor`** and **branches on nothing** about how the token was verified. No token parsing, no method check, no OAuth inside the MCP module. Today that actor comes from `require_bearer_auth`; when MR-956 lands it comes from a `TokenVerifier` — the MCP code is identical either way.
+Because `mcp_router` returns its own `Router` with a tower-http body-limit layer,
+prefer `.merge(mcp::mcp_router(state.clone()))` into `per_graph_protected` over a
+bare `.route("/mcp", …)` so the `/mcp`-specific limit doesn't leak onto the other
+routes. **No server-scoped MCP** — `graphs_list` / `omnigraph://graphs` are dropped;
+graph discovery is the `management` group's `GET /graphs`. A future server-level
+flat `/mcp` (bearer-only, no handle) would live in `management`, but is not built
+speculatively. **Do not** consolidate to a single flat `/mcp` that takes `graph_id`
+per call: MCP's `tools/list` can't carry a graph, so it couldn't list per-graph
+stored-query tools, and it would break isolation.
+
+---
+
+## 10. Auth & the client on-ramp — `EXISTING (reuse)` RFC-007 (delta D4)
+
+**Server side** (unchanged from the blueprint and correct): the backend consumes an
+already-resolved `ResolvedActor` and **branches on nothing** about how the token was
+verified. Static bearer works today; OAuth/RFC-9728 (MR-956) is an additive layer
+that swaps the bearer middleware behind a `TokenVerifier` and serves
+`/.well-known/oauth-protected-resource` — **zero MCP changes**. Token validation is
+offline (cached JWKS), so on-prem/air-gapped keeps working with no request-path
+cloud dependency; the endpoint is a Resource Server only (never issues tokens,
+never holds a client secret). Multi-user identity passes through: the *caller's*
+token (a per-tenant audience-bound JWT) reaches the server so Cedar enforces
+per-user policy — never a shared service token.
 
 ```
-request → [auth middleware: ResolvedActor] → [MCP route] → Cedar → McpTool
+request → [require_bearer_auth: ResolvedActor] → [resolve_graph_handle: GraphHandle] → [/mcp] → Cedar → tool
 ```
 
-**Server side — auth is config, not code:**
-
-| Deployment | Verifier | MCP change |
+| Integration | Static bearer | Note |
 |---|---|---|
-| On-prem, static bearer | `require_bearer_auth` / `StaticHashTokenVerifier` | none |
-| On-prem, customer IdP | `OidcJwtVerifier` → customer issuer (MR-956) | none |
-| Our cloud | `OidcJwtVerifier` → WorkOS, `tenant_id = Some(org_id)` (MR-956) | none |
-
-Token validation is offline (cached JWKS) — on-prem/air-gapped keeps working with no request-path cloud dependency. The MCP endpoint never terminates OAuth and never holds a client secret (Resource Server only).
-
-**Cloud client negotiation — additive, no MCP changes:** when MR-956 lands, the server publishes RFC 9728 `/.well-known/oauth-protected-resource` and returns `WWW-Authenticate: Bearer ..., resource_metadata="..."` on 401. A compliant MCP client (Claude) then auto-negotiates: static bearer to an on-prem endpoint; on a cloud 401 it discovers the WorkOS AS and runs OAuth/PKCE itself — **same endpoint URL, zero client-side branching.** This RFC only requires that MCP routes flow through the standard 401 path so that hook can be added later without touching MCP.
+| Claude Code, Cursor, VS Code | ✅ | `claude mcp add --transport http <url>/mcp --header "Authorization: Bearer <tok>"` |
+| Claude Messages API MCP connector | ✅ | caller passes `authorization_token` |
+| claude.ai web / Desktop, ChatGPT dev-mode | ❌ needs OAuth fast-follow | OAuth 2.1 + PKCE + RFC 9728 |
 
-**Multi-user identity pass-through (cloud):** the *caller's* token (a WorkOS JWT, audience-bound per-tenant) must reach the server so Cedar enforces per-user/per-tenant policy — never a shared service token. The MCP endpoint validates it offline and maps `org_id → tenant_id`. This is why the **remote path is the in-server HTTP MCP that Claude connects to directly** (its token flows through), not a stdio bridge impersonating a user.
+**Client side — the *shipped* RFC-007 model (delta D4).** The landed operator config
+is:
 
-**Client-side credential acquisition (CLI/SDK/proxy) — pluggable `CredentialSource`** (RFC-002 §5, MR-971), keyed by server name, so OAuth is a future *sibling key*, not a re-key:
+```rust
+// EXISTING — crates/omnigraph-cli/src/operator.rs:74
+pub(crate) struct OperatorServer { pub(crate) url: String }   // field is `url`, NOT `endpoint`; NO `auth:` block
+```
 
 ```yaml
+# ~/.omnigraph/config.yaml  (operator config — "No tokens in this file, ever")
 servers:
-  onprem: { endpoint: https://og.internal:8080, auth: { token: { env: OG_TOKEN } } }
-  edge:   { endpoint: https://og-edge,          auth: { token: { command: [vault, read, -field=token, secret/og] } } }
-  cloud:  { endpoint: https://api.omnigraph.cloud, auth: { oauth: { issuer: workos } } }   # future sibling
+  prod:    { url: https://og.internal:8080 }
+  edge:    { url: https://og-edge:8080 }
+  cloud:   { url: https://api.omnigraph.cloud }     # OAuth is resolved by the client, not declared here
 ```
 
-Implicit chain when `auth:` omitted: `OMNIGRAPH_TOKEN_<NAME>` → keychain `omnigraph:<name>` → `[<name>]` in `~/.omnigraph/credentials`; legacy `bearer_token_env` honoured. Secrets never inlined.
+Token resolution is a **two-step keyed chain** (`operator.rs:224`), then the legacy
+`bearer_token_env` fallback — **there is no keychain step** (the blueprint's middle
+`omnigraph:<name>` keychain link does not exist):
 
-### 5.9 Safety model — Cedar is the gate, default-deny is the floor
+```rust
+// EXISTING — crates/omnigraph-cli/src/operator.rs:229
+// 1. OMNIGRAPH_TOKEN_<NAME>  (env; `intel-dev` → OMNIGRAPH_TOKEN_INTEL_DEV)
+// 2. [<name>] token = …      in ~/.omnigraph/credentials  (0600; over-permissive ⇒ loud error)
+pub(crate) fn resolve_keyed_token(server: &str) -> Result<Option<String>>;
+```
 
-With ad-hoc `query`/`mutate`/`schema_apply` present as tools, the **only** thing protecting an untrusted agent is the Cedar policy. Therefore:
+The `omnigraph mcp install --agent <tool>` on-ramp (MR-974) writes the agent's MCP
+client config pointing at `<server url>/mcp` with the resolved bearer; it reuses
+this chain, so OAuth becomes a future *sibling* of the token chain, not a re-key.
+
+---
+
+## 11. Tests & verification (corrected suite names — delta D5)
+
+The `server.rs` monolith is gone. MCP tests land in a **new
+`crates/omnigraph-server/tests/mcp.rs`** suite (black-box over `build_app`);
+stored-query *projection* tests extend the existing **`stored_queries.rs`**. Cover:
+
+- **Protocol:** `initialize` + advertised `{tools, resources}`; `tools/list` shape;
+  `tools/call` happy path; JSON-RPC errors (`-32601`/`-32602`); `resources/list` +
+  `resources/read`; `GET /mcp → 405`; `MCP-Protocol-Version` 400/default; `Origin → 403`.
+- **Cedar (coarse):** read-only actor sees read tools but not `mutate`/`load`/`branches_*`/`schema_apply`;
+  a denied `tools/call` masks byte-identically to an unknown tool; stored queries
+  list only with `invoke_query`; the double-gate (an `invoke_query`-only actor sees
+  a stored tool but the call surfaces `isError` when the inner `Read` denies).
+- **Dispatch:** a `mutate` call writes end-to-end (proves the actor/handle extension
+  passthrough); a malformed query → `isError:true`, not a JSON-RPC error. A `load`
+  with a missing branch and **no `from` → `isError`** (404 classified); with `from`
+  → forks.
+- **Resources:** list + read of `schema`/`branches`; a denied read masks as unknown.
+- **Schema generation:** table-driven over every `ParamKind` incl. nullable / list /
+  `vector` **with and without `vector_dim`** (the absent-dim path must omit
+  `minItems`/`maxItems`, never emit 0 — the D1 regression guard).
+- **Auth decoupling / no-bearer:** `/mcp` 401s without a bearer (before rmcp) and
+  200s with one; green under the static-hash verifier and a mock `ResolvedActor`.
+- **Crate-level:** `omnigraph-mcp/tests/` with a trivial `McpBackend` proving the
+  crate stands alone (`initialize` + `GET → 405`), plus an rmcp **surface guard**
+  pinning `StreamableHttpServerConfig`'s `with_*` setters and the `ServerHandler`
+  method shapes (the only place the unverifiable rmcp-`Parts`-passthrough assumption
+  is anchored).
+
+Verification commands (corrected):
+
+```bash
+cargo build --workspace --locked
+cargo tree -p omnigraph-server -e normal | grep rmcp     # rmcp ONLY under omnigraph-mcp, never direct
+cargo test -p omnigraph-server --test mcp                # NEW suite (NOT --test server)
+cargo test -p omnigraph-server --test stored_queries     # projection tests
+cargo test -p omnigraph-server --test openapi            # no /mcp leak (it carries no #[utoipa::path])
+OMNIGRAPH_UPDATE_OPENAPI=1 cargo test -p omnigraph-server --test openapi   # if the REST surface intentionally changed
+```
 
-- **Default-deny when tokens are configured** (MR-723, shipped) is the floor — an actor with no grants sees an empty tool list.
-- **What works today (coarse action):** a policy can hide all ad-hoc tools and admin tools per-actor (`deny Read, Change, SchemaApply, Branch*`) while allowing stored queries (`allow InvokeQuery`). That already reproduces "can't run ad-hoc, can't read schema, can only call stored queries" — the agent sees *every* exposed stored query plus nothing else.
-- **What needs PR 0b (per-query scope):** selecting *which* stored queries an actor may call (`allow InvokeQuery [find_user, list_orders]`, deny the rest). The shipped `invoke_query` is coarse (all stored queries or none). Until PR 0b adds a query-name dimension to `PolicyRequest` + the Cedar schema (rfc-001's intended design), per-actor sub-selection of stored queries is **not expressible**; curation is graph-level (which `.gq` files are registered + `mcp.expose`).
-- `schema_apply`, `branches_delete`, ad-hoc `mutate` require an explicit admin-tier grant; never in a default agent policy.
-- (Open Q3) Optional `mcp.allow_adhoc` server switch defaulting **off** for the ad-hoc `query`/`mutate` tools — defence-in-depth independent of Cedar, and independent of PR 0b.
+---
 
-### 5.10 Result shaping and error mapping
+## 12. Decisions — locked & open
 
-- **Success:** `tools/call` returns `content: [{type:"text", text:<json>}]` where `<json>` is the route's existing output envelope (read rows / mutation summary, i.e. `ReadOutput` / `ChangeOutput`). (Open Q4: also emit `structuredContent` + `outputSchema` — defer; text-JSON for v1.)
-- **Tool execution error** (bad params after schema validation, engine error): result with `isError:true` + a text content block.
-- **Authorization denial / unknown tool / `mcp.expose:false`:** a single JSON-RPC error (`-32602`, message `"unknown tool"`) — identical for all three so policy isn't probeable (same principle as the shipped `POST /queries/{name}` 404 masking).
-- **Auth failure** (bad/absent bearer): HTTP 401 from the middleware *before* MCP — carries `WWW-Authenticate` (the RFC 9728 hook), never masked as a tool error. (This is exactly the path the shipped `authorize`/`authorize_request` split preserves: operational failures keep their status; only *denials* are masked.)
+**Locked** (validated against code): rmcp 1.7; stateless POST-only Streamable
+HTTP; `McpBackend` crate-trait seam with `&Parts` passthrough; `Builtin` enum +
+stored-query populator; coarse `InvokeQuery` only; ad-hoc `query`/`mutate`
+always-on Cedar-only; `query`/`mutate` ids (no `read`/`change`); **`load` id (not
+`ingest`)**; per-graph `/mcp` routing; no server-scoped MCP / no `graphs_list`;
+text-JSON content for v1; BigInt as JSON string; `vector_dim: Option<u32>` handled
+with omit-on-absent.
 
-## Relationship to the `@modernrelay/omnigraph-mcp` stdio package
+**Open (deferred follow-ups):**
+- **PR 0b — per-query `invoke_query` scope.** Add a query-name dimension to
+  `PolicyRequest` + the Cedar schema (rfc-001's intended design). `InvokeQuery` is
+  documented coarse today (`policy/src/lib.rs:59-73`) and `branch_scope` on it is
+  rejected by `validate()`. Until PR 0b, stored-query curation is graph-level
+  (registry membership + `expose`), not per-actor sub-selection — the headline caveat.
+- **MR-956 — OAuth/RFC-9728** layer (additive, zero MCP changes).
+- **stdio → proxy collapse** (see below).
+- **`structuredContent` / `outputSchema`** beyond text-JSON.
 
-Surface of the package (`omnigraph-ts`, `@modelcontextprotocol/sdk@^1.29.0`, **stdio only**). *Figures refreshed 2026-06: the package re-synced to the engine in [omnigraph-ts#11](https://github.com/ModernRelay/omnigraph-ts/pull/11) and is now at `0.6.1` — not the `0.3.0` this RFC was first drafted against.* It exposes **16 tools** (`health`, `snapshot`, `query`, `read`, `schema_get`, `branches_list`, `graphs_list`, `commits_list`, `commits_get`, `mutate`, `change`, `ingest`, `branches_create`, `branches_delete`, `branches_merge`, `schema_apply` — note it already canonicalized `query`/`mutate` with `read`/`change` as deprecated aliases, and added `graphs_list`) and **~9 resources** (`omnigraph://schema`, `omnigraph://branches`, `omnigraph://graphs`, plus a vendored `omnigraph://best-practices/*` cookbook). It is a thin client over the SDK → HTTP routes and **forwards the caller's bearer verbatim** (no inspection).
+---
 
-Once parity lands, **collapse to one implementation**: the in-server MCP is canonical (Cedar-gated, remote-capable, the path that becomes a Claude-web connector via MR-956). The stdio package degrades to a **thin stdio↔HTTP proxy** forwarding JSON-RPC (and the incoming `Authorization`) to `/mcp` — staying the local on-ramp for Claude Code/Desktop while sharing one tool set, one Cedar gate. Transition: keep the current independent stdio package on its `0.6.x` line; ship proxy mode in a later TS minor once the server endpoint is GA. (The package already re-synced to `0.6.1` in [omnigraph-ts#11](https://github.com/ModernRelay/omnigraph-ts/pull/11); its client-side stored-query-tools attempt, [omnigraph-ts#7](https://github.com/ModernRelay/omnigraph-ts/pull/7), was **closed** in favor of this server-side surface.)
+## Relationship to RFC-001
 
-## Testing
+[rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) (MR-656 / MR-976
+/ MR-969) is the parent design for stored queries + the response envelope + MCP.
+This RFC is the **detailed MCP-transport design** that #128 left for a follow-up,
+and it **revises rfc-001 in three places where the shipped code or the MCP wire
+protocol diverged from rfc-001's sketch**:
+
+1. **Transport shape.** rfc-001 sketched `GET /mcp/tools` + `POST /mcp/invoke` (a
+   bespoke REST pair). **That is not the MCP wire protocol — real MCP clients
+   cannot connect to it.** This RFC implements actual MCP JSON-RPC over Streamable
+   HTTP and reuses `query_catalog_entry` as a *projection source*, not a parallel
+   surface.
+2. **Exposure config.** rfc-001 specified inline `.gq` pragmas (`@mcp(expose=…)`,
+   default `expose=false`). **#128 shipped a different mechanism:** the
+   `expose`/`tool_name` metadata in YAML, **default `true`** (declaring a query *is*
+   the opt-in). This RFC builds on the shipped form; the `.gq`-pragma design is
+   superseded. (The declaration *file* has since moved to `cluster.yaml` discovery
+   for multi-graph servers — see [§7.3](#73-where-stored-queries-are-declared-delta-d6).)
+3. **Schema introspection.** rfc-001 lists "Schema introspection through MCP" as a
+   **non-goal**. This RFC **revises that**: the operational-parity tools include
+   `schema_get` and `omnigraph://schema` — *because the shipped stdio package
+   already exposes both*. The non-goal is achieved by *policy*, not omission: both
+   are Cedar-gated by `Read`, and the recommended locked-down agent policy denies
+   `Read`, so a curated agent still never sees the schema.
+
+Everything else in rfc-001 (two-paths-one-engine, per-query `invoke_query` *as the
+intended scope*, the response envelope, multi-graph per-graph endpoints) this RFC
+consumes unchanged.
+
+> **Numbering note:** the `TokenVerifier`/WorkOS auth design is referred to in code
+> (`crates/omnigraph-server/src/identity.rs`) as "RFC 0001," a *different* document
+> from this repo's `docs/dev/rfc-001-queries-envelope-mcp.md`. To avoid the
+> collision this RFC cites the auth substrate as **MR-956** throughout.
+
+---
 
-> *Superseded by [B.12](#b12-tests--verification), which is the authoritative test plan. The list below is the original §5-era sketch; ignore its references to `graphs_list`, `read`/`change` aliases, and the per-query `invoke_query` test (all resolved/dropped — see B.13).*
+## Motivation
 
-- **Protocol conformance:** `initialize` handshake + advertised capabilities; `tools/list` shape; `tools/call` happy path; JSON-RPC error envelopes (`-32601` unknown method, `-32602` invalid params / unknown tool); `resources/list` + `resources/read`.
-- **Cedar filtering (coarse, today):** an actor with `allow InvokeQuery` + `deny Read/Change` sees *all* exposed stored queries but **not** `query`/`mutate`/`schema_get`; `tools/call query` returns masked "unknown tool"; an admin sees the full catalog.
-- **Cedar filtering (per-query, gated on PR 0b):** actor scoped to `InvokeQuery [find_user]` sees *only* `find_user`; `tools/call list_orders` masks. **This test ships with PR 0b**, not PR 1 — it cannot pass against the coarse action.
-- **Parity per built-in:** each tool round-trips against the same expectations as its HTTP route (reuse route tests); `read`/`change` aliases dispatch identically to `query`/`mutate`.
-- **Double-gating:** a stored mutation requires both `InvokeQuery` and `Change`; `schema_apply` requires `SchemaApply`.
-- **`mcp.expose:false`:** absent from `GET /queries` and MCP `tools/list`; still service-callable by name through `POST /queries/{name}` when the actor has `invoke_query`, but not MCP-callable.
-- **Schema generation:** table-driven over every `ParamKind` incl. nullable / list / vector(dim).
-- **Branch-scoped list approximation:** assert the documented R7 caveat — a branch-scoped policy lists `branches_create`, and `tools/call` is the authoritative gate (a denied target still 403s/masks).
-- **Multi-graph isolation:** `/graphs/a/mcp` never lists graph `b`'s tools; server `/mcp` exposes only server tools.
-- **Auth decoupling:** the MCP suite is green under the current `require_bearer_auth` and under a mock OIDC `ResolvedActor` source — proving verifier-agnosticism. A 401 carries `WWW-Authenticate`.
-- **OpenAPI:** the JSON-RPC endpoint is not REST — document only the envelope in utoipa (or exclude); keep `openapi.json` drift test green (`OMNIGRAPH_UPDATE_OPENAPI=1` to regenerate on intentional change).
-- **Cross-repo smoke (optional):** point `@modelcontextprotocol/sdk` (TS) at the HTTP endpoint in an `omnigraph-ts` integration test.
+- **One curated, safe, remotely-reachable tool surface.** MR-969's thesis: hand an
+  LLM a token Cedar-scoped to a set of tools and it sees exactly those typed tools —
+  cannot construct ad-hoc queries it isn't permitted, cannot read the schema it
+  isn't permitted, cannot reach other graphs. Today the only MCP is the stdio
+  package: local-only, full surface, ungated.
+- **Parity, so the in-server MCP can be the single implementation.** Operators/agents
+  already depend on the operational tools. Serving them server-side behind one Cedar
+  gate lets the stdio package degrade to a proxy and removes two diverging tool sets.
+- **On-prem and cloud from one endpoint.** A managed cloud (WorkOS OAuth) and an
+  on-prem/air-gapped deploy (static or customer-OIDC tokens) must serve the same MCP
+  without forks or MCP-specific auth.
+- **Foundation for the agent on-ramp (MR-974).** `omnigraph mcp install --agent
+  <tool>` needs a decided transport + a stable endpoint.
 
-## Rollout — phased by risk
+## Goals
 
-> *Superseded by the Blueprint's build order ([B.4](#b4-server-side-backend-lives-in-omnigraph-server) → crate ([B.1](#b1-crate-architecture--dependency-direction)–[B.3](#b3-transport-lives-in-omnigraph-mcp)) → backend → tools/stored/resources). The PR phases below are the original §5-era plan; they still mention `graphs_list`, `read`/`change` aliases, and the `mcp.allow_adhoc` switch, all of which the locked decisions ([B.13](#b13-decisions-locked)) drop. PR 0b (per-query `invoke_query` scope) remains the one deferred follow-up.*
+- Project built-in tools + stored queries as MCP tools through **one** registry
+  abstraction.
+- `tools/list` and the callable set are **identical for argument-independent
+  authorization**, both driven by Cedar (see §6 for the branch-scoped caveat).
+- The MCP layer is **auth-method-agnostic**: it consumes `ResolvedActor`, never a
+  raw token, never branches on how auth happened.
+- The same endpoint works on-prem (static/OIDC) and cloud (WorkOS OAuth), switched
+  by config; cloud OAuth is additive (RFC 9728).
+- No new business logic: MCP tools delegate to the same `run_query`/`run_mutate`/
+  branch/schema functions the HTTP routes call.
+- Behaviour-neutral when unused: no MCP traffic = no change.
 
-- **PR 0a — extract the reusable invoke path (small).** The coarse `invoke_query` gate + 404 denial-masking are **already shipped** in `server_invoke_query`. Extract the read/mutate dispatch into `invoke_stored_query(handle, name, params, branch/snapshot, actor)` so MCP `tools/call` and the HTTP route share one path. No behaviour change. *(Replaces the previous draft's "PR 0 — wire the gate", which was already done.)*
-- **PR 0b — per-query `invoke_query` scope (the safety prerequisite).** Add a query-name dimension to `PolicyRequest` + the Cedar schema (rfc-001's intended design), wire it at `POST /queries/{name}` and in the stored-query `McpTool::authorization`. Independently useful (the `allow InvokeQuery [find_user]` policy). **Gates the per-query Cedar-filtering test and §5.9's recommended agent policy.**
-- **PR 1 — MCP transport + read-only parity + stored-query reads.** Endpoint(s), `initialize`/`tools/list`/`tools/call`/`resources/*`, the `McpTool` registry, Cedar-filtered listing, the read-only built-ins (`health`, `graphs_list`, `snapshot`, `read`/`query`, `schema_get`, `branches_list`, `commits_*`) + resources + stored-query *reads*. All auth-agnostic.
-- **PR 2 — mutating parity + stored-query mutations.** `change`/`mutate`, `ingest`, `branches_create/delete/merge`, `schema_apply`, stored-query mutations + the `mcp.allow_adhoc` switch.
-- **PR 3 — docs + agent on-ramp hook.** `docs/user/server.md` MCP section (incl. the recommended agent policy + the coarse-vs-per-query caveat), `openapi.json` sync, the `omnigraph mcp install` config target (MR-974), and the downstream `omnigraph-ts` re-sync/proxy follow-up.
-- **Later (separate, MR-956):** RFC 9728 protected-resource metadata + WorkOS — slots in with zero MCP changes.
-- **Later (TS minor):** stdio package → proxy mode.
+## Non-Goals
 
-## Migration / backwards compatibility
+- **Building/hosting an OAuth authorization server.** The server is a Resource
+  Server; WorkOS AuthKit+Connect is the AS (MR-956). The MCP endpoint validates
+  tokens, never issues them, never holds client secrets.
+- **OAuth/WorkOS implementation itself** — MR-956's work. This RFC leaves a clean
+  RFC-9728 hook and consumes `ResolvedActor`.
+- **MCP prompts, elicitation, `tools/list_changed`, resource subscriptions,
+  server-initiated messages.** None needed → enables the stateless POST-only
+  transport (§3).
+- **stdio transport inside the server.** stdio stays in the TS package (now a proxy).
+- **Cross-graph tool listing.** Per-graph catalogs only.
+- **Hot reload of the query registry.** Restart-only (MR-969).
 
-- **Additive.** No `queries:` and no MCP traffic → today's behaviour unchanged. New endpoints are new routes.
-- **Cedar default-deny** (when tokens configured) means MCP exposes nothing until an actor is granted — safe by default.
-- The stdio package keeps working unchanged; proxy mode is opt-in later.
-- `openapi.json` only gains the documented MCP envelope; existing REST routes untouched.
+## Background
 
-## Open Questions
+`omnigraph-server` (Axum) already implements every operation this RFC exposes as an
+authenticated HTTP route; each authorizes via a `PolicyAction` against the Cedar
+policy for a server-resolved actor and calls into the engine. The existing stdio MCP
+package is a *client* of these routes (it owns no business logic). MR-956 will
+introduce a `TokenVerifier` trait (`StaticHashTokenVerifier` today inline,
+`OidcJwtVerifier` for OIDC/WorkOS) producing the `ResolvedActor` that already exists
+in `identity.rs` (§4) and is consumed by Cedar — token *validation* is offline
+(cached JWKS), so on-prem/air-gapped has no request-path dependency on the cloud.
 
-> *Resolved during [#157](https://github.com/ModernRelay/omnigraph/pull/157) — see [B.13](#b13-decisions-locked) for the locked decisions. Q1→string, Q2→rmcp 1.7, Q3→always-on Cedar-only, Q4→text-JSON v1, Q5→per-graph routing (graphs_list per B.10), Q6→stateless POST confirmed, Q7→no `read`/`change` aliases. Q8 (PR 0b shape: Cedar resource vs context attribute) remains open, gated on the deferred per-query scope work. The items below are kept as the original decision context.*
+## Relationship to the `@modernrelay/omnigraph-mcp` stdio package
 
-1. **BigInt/u64 as JSON string** (recommended, precision-safe) vs number.
-2. **`rmcp` vs hand-rolled** JSON-RPC (spike `rmcp` on edition 2024; default to hand-roll on friction).
-3. **Default-off `mcp.allow_adhoc`** for ad-hoc `query`/`mutate` (recommended) vs always-on + Cedar-only.
-4. **`structuredContent` + `outputSchema`** now vs text-JSON v1 (recommend v1 text-JSON).
-5. **Endpoint paths:** `/mcp` + `/graphs/{id}/mcp` — confirm naming and whether server-scoped tools also appear on the per-graph endpoint.
-6. **Stateless POST-only** confirmed (no near-term server-initiated messages) — revisit only if subscriptions land.
-7. **Legacy alias tools** (`read`/`change`): keep for client compat (the shipped package uses them), or drop and rely on `query`/`mutate`?
-8. **PR 0b shape:** per-query scope as a Cedar *resource* (`StoredQuery::"find_user"`) vs a `query_name` *context attribute* + policy condition — affects how `allow InvokeQuery [list]` is authored.
+The package (`omnigraph-ts`, `@modelcontextprotocol/sdk`, **stdio only**) is a thin
+client over the SDK → HTTP routes and **forwards the caller's bearer verbatim** (no
+inspection). It exposes the operational tools and the `omnigraph://schema` /
+`omnigraph://branches` resources (plus a vendored best-practices cookbook). *Its
+exact tool/resource counts and version live in the separate `omnigraph-ts` repo and
+are not re-verified here.*
+
+Once parity lands, **collapse to one implementation**: the in-server MCP is canonical
+(Cedar-gated, remote-capable, the path that becomes a Claude-web connector via
+MR-956). The stdio package degrades to a **thin stdio↔HTTP proxy** forwarding
+JSON-RPC (and the incoming `Authorization`) to `/mcp` — staying the local on-ramp
+for Claude Code/Desktop while sharing one tool set and one Cedar gate. This is the
+same "one contract, two implementations only where semantics genuinely differ"
+posture as [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md).

From 20233a34ae81d40df2bc1abea580b6b4a12a5896 Mon Sep 17 00:00:00 2001
From: Ragnor Comerford <hello@ragnor.co>
Date: Sat, 13 Jun 2026 16:08:59 +0200
Subject: [PATCH 4/5] docs(rfc-003): standalone MCP surface spec, validated
 against upstream
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Rewrites the MCP RFC as a single self-contained specification, validated
against the live upstream surfaces it builds on:

- MCP protocol revision 2025-11-25 (Streamable HTTP transport requirements,
  authorization model, tool/resource shapes, SEP-1303 input-validation-as-
  tool-error, JSON Schema 2020-12 default).
- rmcp 1.7.0 (StreamableHttpServerConfig real defaults — stateful_mode
  defaults true and must be flipped; NeverSessionManager for stateless mode;
  the `local` feature must stay off; RequestContext.extensions carries the
  http::request::Parts passthrough; RPITIT ServerHandler).
- Tool/server best practice: domain-qualified tool names, explicit
  annotations, structured output + text mirror, and progressive disclosure
  (per_query vs meta projection modes) for large stored-query catalogs.

Adds a provider compatibility matrix (Claude Code/Desktop/web, Messages API
connector, OpenAI Responses API/ChatGPT, Cursor, VS Code, OpenCode) with a
phased static-bearer -> OAuth 2.1 + RFC 9728 auth plan and the Claude Code
PRM header-override caveat, and a code-mode compatibility section describing
the server-side properties that make the surface code-execution-friendly
without building client-side machinery.
---
 docs/dev/rfc-003-mcp-server-surface.md | 1305 ++++++++++--------------
 1 file changed, 547 insertions(+), 758 deletions(-)

diff --git a/docs/dev/rfc-003-mcp-server-surface.md b/docs/dev/rfc-003-mcp-server-surface.md
index cff392b3..77eefdd5 100644
--- a/docs/dev/rfc-003-mcp-server-surface.md
+++ b/docs/dev/rfc-003-mcp-server-surface.md
@@ -1,551 +1,350 @@
-# RFC-003: MCP Server Surface for `omnigraph-server` — Full Tool Parity, Stored Queries, Modular Auth
-
-**Status:** Proposed — buildable implementation spec, **validated against `main`
-at the 0.7.0 development head** (post-v0.6.2; workspace bumped to 0.7.0 in
-`dedd647`). A reference implementation shipped in
-[omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157) (proved rmcp
-1.7 on edition 2024, the auth-extension passthrough, and the full
-tool/resource/Cedar surface); this RFC is the canonical spec for a clean
-reimplementation from `main` with a dedicated `omnigraph-mcp` crate.
+# RFC-003: MCP Server Surface for `omnigraph-server`
+
+**Status:** Proposed — buildable implementation spec.
 **Date:** 2026-06-13
-**Tickets:** MR-969 (stored queries + MCP exposure), MR-956 (OAuth/RFC-9728
-fast-follow — the auth substrate), MR-971 (per-server credential resolver — landed
-as RFC-007), MR-974 (`omnigraph mcp install`), MR-668/RFC-005 (multi-graph server — shipped).
+**Audience:** server/engine maintainers.
+**Tickets:** MR-969 (stored queries + MCP exposure), MR-956 (OAuth/RFC-9728 layer),
+MR-971 (per-server credential resolver — landed as RFC-007), MR-974 (`omnigraph mcp install`).
 **Builds on:** [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md)
-(stored queries + response envelope + the parent MCP design),
-[rfc-005-server-cluster-boot.md](rfc-005-server-cluster-boot.md) (multi-graph boot),
-[rfc-007-operator-config.md](rfc-007-operator-config.md) (the landed client
-credential model), [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md)
-(the embedded/remote `GraphClient` unification this surface is a sibling of), and
-[omnigraph#128](https://github.com/ModernRelay/omnigraph/pull/128) (the shipped
-stored-query registry, `GET /queries`, `POST /queries/{name}`, the coarse
-`invoke_query` gate).
-**Supersedes:** the MCP-transport portion of `rfc-001` (`GET /mcp/tools` + `POST
-/mcp/invoke`). See [Relationship to RFC-001](#relationship-to-rfc-001).
-**Target release:** v0.8.x (per RFC-009, MCP-adjacent work is post-v0.7.0).
-
-> **Provenance.** Every snippet labelled **`EXISTING (reuse)`** is copied verbatim
-> from `origin/main` with a `file:line` citation, re-read from source on
-> 2026-06-13 — verify with `git show origin/main:<file>`. Snippets labelled
-> **`NEW (build this)`** are the interfaces to add. The one behavior this repo
-> cannot verify — that rmcp 1.7 forwards the request's `http::request::Parts`
-> into `RequestContext.extensions` — was proven by the reference impl
-> [omnigraph#157](https://github.com/ModernRelay/omnigraph/pull/157); it is
-> carried forward as a cited assumption, pinned by a crate-level surface guard
-> (§11). This document absorbed the standalone implementation-spec draft
-> (formerly `rfc-010`); there is one MCP RFC.
-
-## Summary
-
-Add a first-class **MCP (Model Context Protocol) server surface to
-`omnigraph-server`**, exposed over **Streamable HTTP**, projecting the server's
-operations as MCP tools and resources for LLM clients (Claude Code/Desktop/web,
-Cursor, ChatGPT dev-mode, etc.). Two tool populations share one projection path:
-
-1. **Built-in operational tools** — `health`, `snapshot`, `schema_get`,
-   `branches_list`, `commits_list`, `commits_get`, ad-hoc `query`/`mutate`,
-   `load` (NDJSON), `branches_create`/`branches_delete`/`branches_merge`,
-   `schema_apply` — plus resources `omnigraph://schema` and `omnigraph://branches`.
-   The authoritative catalog is [§6](#6-tool-catalog-13-built-ins--cedar-mapping)
-   (13 tools). Server-scoped graph discovery (`graphs_list` / `omnigraph://graphs`)
-   is **dropped from MCP** — it stays REST-only via `GET /graphs` ([§9](#9-routing--existing-reuse-validated)).
-2. **Dynamic stored-query tools** — one MCP tool per `expose: true` entry in the
-   graph's loaded stored-query registry, typed from the `.gq` declaration via the
-   shipped `query_catalog_entry` / `ParamDescriptor` projection ([§7](#7-stored-query-projection--existing-reuse--the-corrected-schema)).
-
-Every tool is **authorized by the server's existing Cedar policy engine**. The MCP
-layer never implements its own authentication: it consumes an **already-resolved
-`ResolvedActor`** from the server's bearer middleware, so the **same MCP endpoint
-serves on-prem (static or customer-OIDC tokens) and cloud (WorkOS OAuth) by
-configuration only** — cloud OAuth is an additive RFC-9728 layer that slots in with
-zero MCP changes. The end-state collapses two diverging tool implementations into
-one: the in-server MCP is canonical; the stdio package degrades to a thin
-stdio↔HTTP proxy over it.
-
-> **Key caveat, up front:** the headline "a token Cedar-scoped to a *specific set*
-> of stored queries" requires **per-query `invoke_query` scope**, which is *designed*
-> (rfc-001) but **not yet implemented** — the shipped action is coarse (any stored
-> query on the graph, or none; `policy/src/lib.rs:59-73`). Per-actor Cedar curation
-> works today for *built-in vs ad-hoc vs admin* tools and for *stored-vs-ad-hoc*;
-> sub-selecting individual stored queries per actor is gated on PR 0b ([§12](#12-decisions--locked--open)).
+(stored queries + the response envelope), [rfc-005-server-cluster-boot.md](rfc-005-server-cluster-boot.md)
+(multi-graph boot), [rfc-007-operator-config.md](rfc-007-operator-config.md)
+(client credential model), [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md)
+(one-contract/two-implementations posture).
+**Target release:** v0.8.x.
+
+**Validated against** (re-checked 2026-06-13): MCP protocol revision **`2025-11-25`**
+(modelcontextprotocol.io), the official Rust SDK **`rmcp 1.7.0`** (crates.io /
+github.com/modelcontextprotocol/rust-sdk), and current tool/security best practice
+(Anthropic engineering, MCP spec security pages). Provider compatibility was
+checked against the live docs of Claude Code/Desktop/web, the Claude Messages API
+MCP connector, OpenAI's Responses API + ChatGPT connectors, Cursor, VS Code Copilot,
+and OpenCode. Code snippets marked **`Reuses`** are present in `omnigraph-server`
+today, cited to `file:line`; snippets marked **`New`** are the code to add.
 
 ---
 
-## 0. What changed since the first blueprint draft (validation deltas)
-
-The first blueprint of this RFC was authored ~79 commits behind the current head.
-Building from it verbatim would miss these; each is corrected in the noted section.
+## 1. Summary
 
-| # | The first draft said | `main` reality | §  |
-|---|---|---|---|
-| D1 | `ParamKind::Vector { dim: u32 }`, "dim intrinsic, never an `Option`" | `ParamKind::Vector` is a **unit** variant; dim is `ParamDescriptor.vector_dim: Option<u32>` | §7 |
-| D2 | handlers / `server_invoke_query` / auth in `lib.rs`; "verified against `{lib.rs,api.rs}`" | moved to **`handlers.rs`**; stored-query config in **`config.rs`**; DTOs in **`api.rs`** | §4–§8 |
-| D3 | `ingest` tool wraps `ingest_as`; "(`+BranchCreate` when `from` forks)" | `server_ingest` calls **`load_as`**; missing branch **without `from` → 404**, no implicit fork; `ingest_as` shim retired server-side; `ingest` is now a **deprecated CLI alias of `load`** | §8 |
-| D4 | client creds: `servers: { onprem: { endpoint, auth: { token\|oauth } } }`; chain env→**keychain**→file | `OperatorServer { url }` (field is `url`, **no `auth:` block**, "no tokens in this file, ever"); chain is **env → credentials file** (no keychain step) | §10 |
-| D5 | tests: `cargo test -p omnigraph-server --test server` (incl. `mcp_*`), `--test server server_opens_s3…` | `server.rs` monolith split; suites are `auth_policy.rs / data_routes.rs / schema_routes.rs / stored_queries.rs / multi_graph.rs / boot_settings.rs / s3.rs / openapi.rs`. S3 → `--test s3` | §11 |
-| D6 | "`queries.<name>.mcp.expose` in `omnigraph.yaml`" | field survives (default `true`); multi-graph declaration is **`cluster.yaml graphs.<id>.queries`** with file-discovery (`677320c`); RFC-008 deprecates `omnigraph.yaml` | §7.3 |
-
-Everything else from the blueprint (the crate split, the `McpBackend` seam,
-stateless Streamable-HTTP transport, the per-graph routing decision, the
-coarse-`invoke_query` caveat, "no `read`/`change` aliases", "no `graphs_list` in
-MCP") **validated against code and carries forward unchanged.**
-
----
+Add a first-class **MCP (Model Context Protocol) server surface to
+`omnigraph-server`**, served over **Streamable HTTP**, that projects the server's
+operations as MCP **tools** and **resources** for LLM clients. Two tool populations
+share one projection path:
 
-## 1. Crate architecture & dependency direction
+1. **Built-in operational tools** — graph read/mutate, schema get/apply, branch
+   create/delete/merge/list, commit list/get, NDJSON load, and a `health` liveness
+   tool, plus resources `omnigraph://schema` and `omnigraph://branches`.
+2. **Dynamic stored-query tools** — projected from the graph's loaded stored-query
+   registry: either one typed tool per query (small catalogs) or a
+   discovery + execute meta-tool pair (large catalogs) — see [§9](#9-stored-query-projection).
 
-Two crates. `rmcp` lives **only** in `omnigraph-mcp`; `omnigraph-server` carries
-no direct rmcp dep.
+Every tool is **authorized by the server's existing Cedar policy engine**. The MCP
+layer performs **no authentication of its own**: it consumes an already-resolved
+actor identity from the server's bearer/OAuth middleware, so the **same endpoint
+serves on-prem (static or customer-OIDC tokens) and cloud (OAuth 2.1) by
+configuration only**. The transport is **stateless JSON over a single POST
+endpoint** — the minimal conformant Streamable-HTTP shape, since the server emits no
+server-initiated messages.
+
+The surface is built so the existing local stdio MCP package can later collapse into
+a thin stdio↔HTTP proxy over it, leaving one Cedar-gated, remotely-reachable tool
+set ([§13](#13-provider-compatibility)).
+
+## 2. Goals
+
+- Project built-in tools **and** stored queries through **one** registry abstraction,
+  so `tools/list` / `tools/call` never special-case a population.
+- Make `tools/list` and the callable set agree for argument-independent authorization,
+  both driven by Cedar; `tools/call` is always the authoritative gate.
+- Keep the MCP layer **auth-method-agnostic**: it consumes a resolved actor, never a
+  raw token, and never branches on how authentication happened.
+- Add **no business logic**: tools delegate to the same engine functions the HTTP
+  routes call.
+- Be **code-mode-friendly** (typed schemas, structured output, stable names,
+  progressive disclosure) and **maximally client-compatible** (Streamable HTTP +
+  bearer today, OAuth 2.1 + RFC 9728 as an additive layer).
+- Behaviour-neutral when unused: no MCP traffic ⇒ no change.
+
+## 3. Non-Goals
+
+- **Hosting an OAuth authorization server.** The server is a **Resource Server** only
+  (validates tokens, never issues them, never holds client secrets). The AS is a
+  separate concern (MR-956).
+- **MCP prompts, elicitation, sampling, tasks, `tools/list_changed` subscriptions,
+  resource subscriptions, server-initiated messages** — none required, which is what
+  permits the stateless POST-only transport. (`tools/list_changed` is reconsidered
+  only if the registry gains runtime reload.)
+- **stdio transport inside the server** — stdio stays in the TS package (later a proxy).
+- **Client-side "code mode" machinery** (TS wrapper generation, sandboxes, tool
+  search/deferral) — those are client/runtime concerns; see [§12](#12-code-mode-compatibility)
+  for what the server does to support them and what it deliberately does not build.
+- **Cross-graph tool listing** — per-graph catalogs only.
+
+## 4. Protocol target
+
+Target MCP revision **`2025-11-25`** (current). `rmcp 1.7.0` advertises this as its
+latest and negotiates down to any of `2024-11-05 / 2025-03-26 / 2025-06-18 /
+2025-11-25`; an absent `MCP-Protocol-Version` header defaults to `2025-03-26` and an
+unsupported one is a `400`. Revision `2025-06-18` is the floor we rely on for two
+features: **structured tool output** (`outputSchema` + `structuredContent`) and the
+**OAuth Resource-Server** model. From `2025-11-25` we adopt: **input-validation
+errors as tool-execution errors** (SEP-1303), **JSON Schema 2020-12** as the default
+dialect, **`403` on invalid `Origin`**, and **`WWW-Authenticate` made optional** with
+a `.well-known` fallback.
+
+**Transport shape (stateless Streamable HTTP).** The server exposes one endpoint that
+accepts `POST` (and answers `GET`/`DELETE` with `405 + Allow: POST`). For a JSON-RPC
+*request* it returns one `application/json` object; it opens no SSE stream, assigns no
+`Mcp-Session-Id`, and treats every request independently — a fully conformant
+stateless server. It **MUST** validate `Origin` (`403` on mismatch) and honor
+`MCP-Protocol-Version`. `rmcp` delivers all of these in stateless mode (§7).
+
+## 5. Crate architecture
+
+Two crates; `rmcp` is contained to one of them.
 
 ```
-omnigraph-server (impl McpBackend, all omnigraph logic)
-        │   depends on
+omnigraph-server  (implements McpBackend; all omnigraph tool/Cedar/dispatch logic)
+        │ depends on
         ▼
-omnigraph-mcp    (rmcp transport, McpBackend trait, rmcp model re-exports)
-        │   depends on
+omnigraph-mcp     (rmcp Streamable-HTTP transport, the McpBackend trait, rmcp model re-exports)
+        │ depends on
         ▼
 rmcp 1.7 + tower-http(limit) + axum + http
 ```
 
-The dependency **must** go `server → mcp`, never the reverse: the server binary
-mounts `/mcp`, so `mcp → server` would cycle at the package level
-(`server-bin → omnigraph-mcp → server-lib`). The trait inverts it — the crate
-defines the seam, the server fills it — which is also why the crate can never
-name `AppState`, `GraphHandle`, `do_*`, or `api::*`.
+The dependency **must** go `server → mcp`. The server binary mounts `/mcp`, so a
+`mcp → server` edge cycles at the package level (`server-bin → omnigraph-mcp →
+server-lib`), which Cargo rejects. The trait inverts the direction — the crate
+defines the seam, the server fills it — which is also why the crate can never name an
+omnigraph type (`AppState`, `GraphHandle`, the handlers); it abstracts over them.
 
-`omnigraph-mcp/Cargo.toml`:
+`crates/omnigraph-mcp/Cargo.toml`:
 
 ```toml
 [package]
 name = "omnigraph-mcp"
-edition = "2024"           # matches the workspace (omnigraph-server is edition 2024)
+edition = "2024"                 # rmcp 1.7 is itself edition 2024 — no friction
 version.workspace = true
 
 [dependencies]
+# `server` is on by rmcp's default features; `transport-streamable-http-server`
+# pulls in the tower service + http stack. Do NOT enable rmcp's `local` feature —
+# it cfg's the StreamableHttpService tower wiring out.
 rmcp = { version = "1.7", default-features = false, features = ["server", "transport-streamable-http-server"] }
-axum = { workspace = true }
-http = "1"
+axum       = { workspace = true }
+http       = "1"
 tower-http = { workspace = true, features = ["limit"] }
-tokio = { workspace = true }
+tokio      = { workspace = true }
 async-trait = { workspace = true }
 serde_json = { workspace = true }
 ```
 
-Workspace edit — add the member (current `members`, `Cargo.toml`):
+Add `"crates/omnigraph-mcp"` to the workspace `members`; in
+`omnigraph-server/Cargo.toml` add `omnigraph-mcp` and **no direct `rmcp` dep**
+(verified absent today). The verification gate is `cargo tree -p omnigraph-server -e
+normal | grep rmcp` showing rmcp only transitively under `omnigraph-mcp`.
 
-```toml
-members = [
-    "crates/omnigraph-compiler",
-    "crates/omnigraph",
-    "crates/omnigraph-cli",
-    "crates/omnigraph-cluster",
-    "crates/omnigraph-policy",
-    "crates/omnigraph-server",
-    "crates/omnigraph-mcp",      # NEW
-]
-```
-
-In `omnigraph-server/Cargo.toml`: add `omnigraph-mcp = { workspace = true }`;
-do **not** add `rmcp`. (Confirmed absent today: `git grep rmcp origin/main --
-'**/Cargo.toml'` is empty.)
-
----
-
-## 2. The `McpBackend` seam — `NEW (build this)` in `omnigraph-mcp`
+## 6. The `McpBackend` seam — `New` in `omnigraph-mcp`
 
 ```rust
 // crates/omnigraph-mcp/src/lib.rs
 use async_trait::async_trait;
 
-// rmcp model types re-exported so the server uses them via `omnigraph_mcp::…`
+// rmcp model types re-exported so the server speaks rmcp via `omnigraph_mcp::…`
 // and carries no direct rmcp dependency.
 pub use rmcp::model::{
-    Annotated, CallToolResult, Content, RawResource, ReadResourceResult, Resource,
+    CallToolResult, Content, RawResource, ReadResourceResult, Resource,
     ResourceContents, ServerCapabilities, ServerInfo, Tool, ToolAnnotations,
 };
-pub use rmcp::ErrorData as McpError;
+pub use rmcp::ErrorData as McpError;        // JSON-RPC error type (method_not_found=-32601, invalid_params=-32602, internal_error=-32603)
 pub type JsonObject = serde_json::Map<String, serde_json::Value>;
 
 #[async_trait]
 pub trait McpBackend: Clone + Send + Sync + 'static {
     fn server_info(&self) -> ServerInfo;
-
-    async fn list_tools(&self, parts: &http::request::Parts)
-        -> Result<Vec<Tool>, McpError>;
-
-    async fn call_tool(&self, parts: &http::request::Parts, name: &str, args: JsonObject)
-        -> Result<CallToolResult, McpError>;
-
-    async fn list_resources(&self, parts: &http::request::Parts)
-        -> Result<Vec<Resource>, McpError>;
-
-    async fn read_resource(&self, parts: &http::request::Parts, uri: &str)
-        -> Result<ReadResourceResult, McpError>;
+    async fn list_tools(&self, parts: &http::request::Parts) -> Result<Vec<Tool>, McpError>;
+    async fn call_tool(&self, parts: &http::request::Parts, name: &str, args: JsonObject) -> Result<CallToolResult, McpError>;
+    async fn list_resources(&self, parts: &http::request::Parts) -> Result<Vec<Resource>, McpError>;
+    async fn read_resource(&self, parts: &http::request::Parts, uri: &str) -> Result<ReadResourceResult, McpError>;
 }
 ```
 
-`&http::request::Parts` is the whole decoupling mechanism. The crate hands the
-backend the request parts; the backend reads **its own** types out of
-`parts.extensions` (§4). The crate never names an omnigraph type. `#[async_trait]`
-boxes the futures — negligible at MCP QPS and it sidesteps the
-async-fn-in-trait `Send`-bound friction; the server already depends on
-`async-trait`.
+`&http::request::Parts` is the decoupling mechanism. The crate hands the backend the
+request parts; the backend reads **its own** types out of `parts.extensions`. The
+crate never names an omnigraph type, so it is reusable and auth stays decoupled (§8).
 
----
+> `rmcp`'s own `ServerHandler` trait uses RPITIT (`-> impl Future + …`), not
+> `async-trait`. Our `McpBackend` deliberately uses `#[async_trait]`: it is
+> implemented once by the server, the boxed future is negligible at MCP QPS, and the
+> server already depends on `async-trait`. Either style compiles on edition 2024.
 
-## 3. Transport — `NEW (build this)` in `omnigraph-mcp`
+## 7. Transport — `New` in `omnigraph-mcp`
 
 ```rust
 // crates/omnigraph-mcp/src/transport.rs
 use std::sync::Arc;
 use rmcp::transport::streamable_http_server::{
-    StreamableHttpServerConfig, StreamableHttpService, session::local::LocalSessionManager,
+    StreamableHttpServerConfig, StreamableHttpService,
+    session::never::NeverSessionManager,     // stateless ⇒ reject all session ops
 };
 
 pub struct McpHostPolicy {
-    pub allowed_hosts: Option<Vec<String>>,    // None ⇒ accept any Host
-    pub allowed_origins: Option<Vec<String>>,  // None/empty ⇒ off (non-browser clients send no Origin)
+    pub allowed_hosts:   Option<Vec<String>>,   // None ⇒ accept any Host (rely on bearer + Origin)
+    pub allowed_origins: Option<Vec<String>>,   // None/empty ⇒ Origin check off (non-browser clients send none)
 }
 
 pub fn mcp_router<B: McpBackend>(backend: B, body_limit: usize, hosts: McpHostPolicy) -> axum::Router {
-    // StreamableHttpServerConfig is #[non_exhaustive] — build from Default, mutate via with_*.
-    let config = StreamableHttpServerConfig::default()
+    // StreamableHttpServerConfig is #[non_exhaustive]; its Default is stateful_mode=true,
+    // json_response=false, allowed_hosts=loopback. ALL THREE must be overridden for a
+    // remote stateless JSON server — build from Default and flip via the with_* setters.
+    let mut config = StreamableHttpServerConfig::default()
         .with_stateful_mode(false)
-        .with_json_response(true)
-        .with_allowed_hosts(hosts.allowed_hosts)
-        .with_allowed_origins(hosts.allowed_origins);
-
+        .with_json_response(true);
+    config = match hosts.allowed_hosts {
+        Some(list) => config.with_allowed_hosts(list),
+        None        => config.disable_allowed_hosts(),     // accept any Host
+    };
+    if let Some(origins) = hosts.allowed_origins { config = config.with_allowed_origins(origins); }
+
+    // service_factory returns Result<S, io::Error>; NeverSessionManager pairs with stateless mode.
     let svc = StreamableHttpService::new(
         move || Ok(McpService::new(backend.clone())),
-        Arc::new(LocalSessionManager::default()),
+        Arc::new(NeverSessionManager::default()),
         config,
     );
 
     axum::Router::new()
         .route_service("/mcp", svc)
-        // rmcp reads the body directly (NOT via an axum extractor), so axum's
+        // rmcp reads the body directly (not via an axum extractor), so axum's
         // DefaultBodyLimit does NOT bound /mcp — the tower-http layer does.
         .layer(tower_http::limit::RequestBodyLimitLayer::new(body_limit))
 }
 ```
 
-`McpService<B>` implements rmcp's `ServerHandler`, extracts `&Parts` from
-`ctx.extensions` once (missing → `McpError::internal_error`), and delegates each
-method to `B`. `get_info → backend.server_info()`; `initialize`/`ping` use rmcp
-defaults.
-
-**Stateless mode gives these conformance MUSTs for free** (rmcp 1.7): `GET`/`DELETE
-/mcp → 405` with `Allow`; disallowed `Host`/`Origin → 403`; `MCP-Protocol-Version
-→ 400` on unsupported, default `2025-03-26` when absent. No conformance
-middleware needed.
-
-**Host/Origin policy is derived from the deployment, never a fixed default.**
-rmcp's loopback `allowed_hosts` default would `403` every non-loopback client
-before bearer auth. The server computes `McpHostPolicy` from its `--bind`: loopback
-bind → loopback allow-list (rebind protection matters locally); non-loopback bind
-→ allow the configured public host(s) or disable Host-allowlisting and log it
-(bearer + `Origin` are the real controls). `allowed_origins` is the browser-attack
-control: default off; configurable for browser-based clients.
-
----
-
-## 4. The extension passthrough — `EXISTING (reuse)` types, `NEW` backend
-
-The two values the backend needs are injected into the request extensions by
-middleware that runs **before** the MCP service. Both are real types:
+`McpService<B>` implements rmcp's `ServerHandler`, pulls `&Parts` out of the request
+context once, and delegates each method to `B`. rmcp's `StreamableHttpService`
+**consumes the body and injects the remaining `http::request::Parts` into
+`RequestContext.extensions`** (this is documented and load-bearing — see §8); inside
+the handler, `ctx.extensions.get::<http::request::Parts>()` returns those parts.
+
+**Conformance the stateless transport gives for free** (verified in rmcp 1.7
+`tower.rs`): `GET`/`DELETE /mcp → 405` with `Allow: POST`; a disallowed `Host` →
+`403`; a present-but-invalid `Origin` → `403` (Origin checked only when
+`allowed_origins` is non-empty); `MCP-Protocol-Version` → `400` on unsupported,
+default `2025-03-26` when absent. **No conformance middleware is needed.**
+
+**Host/Origin policy is derived from the deployment, never a fixed default.** rmcp's
+default `allowed_hosts` is loopback-only — correct for local dev (DNS-rebinding
+defense) but it would `403` every remote client. `omnigraph-server` computes
+`McpHostPolicy` once at startup from `--bind`: loopback bind → keep the loopback
+allow-list; non-loopback bind → allow the configured public host(s), else disable
+Host-allowlisting and log it (bearer + `Origin` are the real controls). `Origin`
+validation defaults off (non-browser MCP clients send no `Origin`) and is enabled for
+browser-based clients.
+
+## 8. Auth & identity — `Reuses` the server's middleware
+
+The backend consumes an already-resolved actor and **branches on nothing** about how
+the token was verified. Two values are injected into the request extensions by
+middleware that runs **before** the MCP service:
 
 ```rust
-// EXISTING (reuse) — crates/omnigraph-server/src/identity.rs:187
-pub struct ResolvedActor {
-    pub actor_id: Arc<str>,
-    pub tenant_id: Option<TenantId>,
-    pub scopes: Vec<Scope>,
-    pub source: AuthSource,        // Static today; Oidc when MR-956 lands
-}
-```
+// Reuses — crates/omnigraph-server/src/identity.rs:187
+pub struct ResolvedActor { pub actor_id: Arc<str>, pub tenant_id: Option<TenantId>, pub scopes: Vec<Scope>, pub source: AuthSource }
 
-```rust
-// EXISTING (reuse) — crates/omnigraph-server/src/registry.rs:37
+// Reuses — crates/omnigraph-server/src/registry.rs:37
 pub struct GraphHandle {
-    pub key: GraphKey,
-    pub uri: String,
+    pub key: GraphKey, pub uri: String,
     pub engine: Arc<Omnigraph>,
-    pub policy: Option<Arc<PolicyEngine>>,        // None ⇒ no per-graph Cedar gate
-    pub queries: Option<Arc<QueryRegistry>>,      // None ⇒ no stored queries for this graph
+    pub policy:  Option<Arc<PolicyEngine>>,   // None ⇒ no per-graph Cedar gate
+    pub queries: Option<Arc<QueryRegistry>>,  // None ⇒ no stored queries for this graph
 }
 ```
 
-The middleware order is fixed in `build_app` (`lib.rs:909-956`, see §9): the outer
-layer `require_bearer_auth` injects `Extension<ResolvedActor>` (or 401); the inner
+The middleware order is fixed in `build_app` (`lib.rs:909`): the **outer** layer
+`require_bearer_auth` injects `Extension<ResolvedActor>` (or `401`); the **inner**
 layer `resolve_graph_handle` injects `Extension<Arc<GraphHandle>>`. Both land in
-`request.extensions()`, which rmcp forwards into `RequestContext.extensions`.
+`request.extensions()`, which rmcp copies into `RequestContext.extensions`.
 
 ```rust
-// NEW (build this) — crates/omnigraph-server/src/mcp/mod.rs
+// New — crates/omnigraph-server/src/mcp/mod.rs
 #[derive(Clone)]
-pub struct OmnigraphMcpBackend { state: AppState }   // AppState is #[derive(Clone)], Arc-backed
+pub struct OmnigraphMcpBackend { state: AppState }   // AppState is Arc-backed #[derive(Clone)]
 
 impl OmnigraphMcpBackend {
-    fn ctx<'a>(&self, parts: &'a http::request::Parts)
-        -> Result<(&'a ResolvedActor, &'a Arc<GraphHandle>), McpError>
-    {
-        let actor = parts.extensions.get::<ResolvedActor>()
+    fn ctx<'a>(&self, parts: &'a http::request::Parts) -> Result<(&'a ResolvedActor, &'a Arc<GraphHandle>), McpError> {
+        let actor  = parts.extensions.get::<ResolvedActor>()
             .ok_or_else(|| McpError::internal_error("actor missing from request extensions", None))?;
         let handle = parts.extensions.get::<Arc<GraphHandle>>()
             .ok_or_else(|| McpError::internal_error("graph handle missing from request extensions", None))?;
         Ok((actor, handle))
     }
 }
-
-// The server's thin wrapper — the only place the two crates meet:
-pub fn mcp_router(state: AppState) -> axum::Router {
-    omnigraph_mcp::mcp_router(
-        OmnigraphMcpBackend { state },
-        INGEST_REQUEST_BODY_LIMIT_BYTES,        // lib.rs:137 — 32 MiB, reused
-        host_policy_from_bind(/* … */),
-    )
-}
-```
-
----
-
-## 5. Reuse points for dispatch — `EXISTING (reuse)`, all in `handlers.rs`
-
-The backend adds **no business logic**. `call_tool` routes to the same functions
-the REST handlers call.
-
-### 5.1 The authorization gate
-
-```rust
-// crates/omnigraph-server/src/handlers.rs:336
-pub(crate) enum Authz { Allowed, Denied(String) }
-
-// crates/omnigraph-server/src/handlers.rs:357 — Err reserved for operational 401/500
-pub(crate) fn authorize(
-    actor: Option<&ResolvedActor>,
-    policy: Option<&PolicyEngine>,
-    request: PolicyRequest,
-) -> Result<Authz, ApiError>;
-
-// crates/omnigraph-server/src/handlers.rs:438 — denial ⇒ 403
-pub(crate) fn authorize_request(
-    actor: Option<&ResolvedActor>,
-    policy: Option<&PolicyEngine>,
-    request: PolicyRequest,
-) -> Result<(), ApiError>;
-```
-
-`PolicyRequest` carries **no actor identity** (dropped from the type — the MR-731
-invariant) and **no query-name dimension** (the coarse-`invoke_query` caveat):
-
-```rust
-// crates/omnigraph-policy/src/lib.rs:252
-pub struct PolicyRequest {
-    pub action: PolicyAction,
-    pub branch: Option<String>,
-    pub target_branch: Option<String>,
-}
-// crates/omnigraph-policy/src/lib.rs:259
-pub struct PolicyDecision { pub allowed: bool, pub matched_rule_id: Option<String>, pub message: String }
-// crates/omnigraph-policy/src/lib.rs:509
-impl PolicyEngine { pub fn authorize(&self, actor_id: &str, request: &PolicyRequest) -> Result<PolicyDecision>; }
-```
-
-### 5.2 The read / mutate runners (already decoupled from request bodies)
-
-```rust
-// crates/omnigraph-server/src/handlers.rs:731 — self-authorizes Read
-pub(crate) async fn run_query(
-    handle: Arc<GraphHandle>,
-    actor: Option<&ResolvedActor>,
-    query: &str,
-    name: Option<&str>,
-    params_json: Option<&Value>,
-    branch: Option<String>,
-    snapshot: Option<String>,
-    reject_mutations: bool,
-) -> Result<(String, ReadTarget, omnigraph_compiler::result::QueryResult), ApiError>;
-
-// crates/omnigraph-server/src/handlers.rs:665 — self-authorizes Change + admission
-pub(crate) async fn run_mutate(
-    state: AppState,
-    handle: Arc<GraphHandle>,
-    actor: Option<&ResolvedActor>,
-    query: &str,
-    name: Option<&str>,
-    params_json: Option<&Value>,
-    branch: String,
-) -> Result<ChangeOutput, ApiError>;
-```
-
-`run_query` returns the raw `QueryResult`; project it for the wire with the
-existing `read_output(query_name, &target, result) -> ReadOutput` (`api.rs:634`).
-
-### 5.3 The canonical double-gate + deny-as-404 pattern to mirror
-
-`server_invoke_query` (`handlers.rs:933`) is exactly the shape `call_tool` must
-reproduce for stored-query tools — reproduced here as the contract:
-
-```rust
-// EXISTING (reuse pattern) — handlers.rs:953  (outer InvokeQuery gate, deny == missing)
-match authorize(actor_ref, handle.policy.as_deref(), PolicyRequest {
-    action: PolicyAction::InvokeQuery,
-    branch: None,            // graph-scoped: NO branch dimension on the outer gate
-    target_branch: None,
-})? {
-    Authz::Allowed => {}
-    Authz::Denied(_) => return Err(ApiError::not_found(NOT_FOUND)),  // "stored query not found"
-}
-let stored = handle.queries.as_ref()
-    .and_then(|r| r.lookup(&name))
-    .ok_or_else(|| ApiError::not_found(NOT_FOUND))?;                 // same 404 on a miss
-// … then the inner gate runs inside run_mutate (Change) / run_query (Read).
-if is_mutation {
-    if req.snapshot.is_some() {                                      // mutation-against-snapshot unrepresentable
-        return Err(ApiError::bad_request("stored mutation cannot target a snapshot"));
-    }
-    // run_mutate(…)  ← inner Change gate
-} else {
-    // run_query(…)   ← inner Read gate
-}
-```
-
-For MCP, the same masking principle applies but the wire encoding differs: an
-**unknown tool OR a denied tool** returns the identical `unknown tool: <name>`
-JSON-RPC error (`-32602`) so the catalog can't be probed without the grant (§6).
-
-### 5.4 Error classification — `NEW (build this)`, one mapper
-
-`ApiError`'s fields are **private** (`lib.rs:296`), so add two accessors:
-
-```rust
-// EXISTING shape — crates/omnigraph-server/src/lib.rs:296
-pub struct ApiError {
-    status: StatusCode, code: ErrorCode, message: String,
-    merge_conflicts: Vec<api::MergeConflictOutput>,
-    manifest_conflict: Option<api::ManifestConflictOutput>,
-}
-
-// NEW: add accessors so the MCP classifier can read status/message.
-impl ApiError {
-    pub(crate) fn status_code(&self) -> StatusCode { self.status }
-    pub(crate) fn message_str(&self) -> &str { &self.message }
-}
-```
-
-```rust
-// NEW (build this) — the single source of truth, used at EVERY dispatch site
-fn classify(r: Result<CallToolResult, ApiError>) -> Result<CallToolResult, McpError> {
-    match r {
-        Ok(out) => Ok(out),
-        Err(e) if e.status_code().is_client_error() =>          // 4xx / 409 → model self-corrects
-            Ok(CallToolResult::error(vec![Content::text(e.message_str())])),  // isError: true
-        Err(e) => Err(McpError::internal_error(e.message_str().to_owned(), None)),  // 5xx → JSON-RPC error
-    }
-}
-```
-
-One `classify`, not the `api_error_to_tool` / `api_operational_error` split that
-drifted in the reference impl. A missing/bad bearer is an HTTP 401 at the boundary
-*before* rmcp.
-
----
-
-## 6. Tool catalog (13 built-ins) + Cedar mapping
-
-Each built-in reuses the **exact `PolicyAction` its REST route enforces**. The full
-action vocabulary on `main`:
-
-```rust
-// EXISTING — crates/omnigraph-policy/src/lib.rs:18
-pub enum PolicyAction {
-    Read, Export, Change, SchemaApply,
-    BranchCreate, BranchDelete, BranchMerge,
-    Admin,        // reserved, no call site yet
-    GraphList,    // server-scoped (resource_kind == Server)
-    InvokeQuery,  // graph-scoped, coarse (no per-query dimension yet)
-}
 ```
 
-| MCP tool | Scope | Cedar action |
-|---|---|---|
-| `health` | server | none (liveness/version) |
-| `snapshot`, `schema_get`, `branches_list`, `commits_list`, `commits_get` | graph | `Read` |
-| `query` (ad-hoc read) | graph | `Read` (`run_query` self-authorizes) |
-| `mutate` (ad-hoc write) | graph | `Change` |
-| `load` (NDJSON; **renamed from `ingest`** — see §8) | graph | `Change` (+ `BranchCreate` **iff `from` present**) |
-| `branches_create` / `branches_delete` / `branches_merge` | graph | `BranchCreate` / `BranchDelete` / `BranchMerge` |
-| `schema_apply` (`allow_data_loss`) | graph | `SchemaApply` |
-| *stored query* | graph | `InvokeQuery` (coarse) then inner `Read`/`Change` |
-
-Locked decisions (validated): **tool ids are `query`/`mutate` only** — no
-`read`/`change` aliases (the REST surface already deprecated `/read`, `/change`; a
-fresh MCP has no legacy clients). **Ad-hoc `query`/`mutate` are always exposed,
-Cedar-only** — no `mcp.allow_adhoc`. **No `graphs_list` / `omnigraph://graphs`** —
-graph discovery stays REST-only via `GET /graphs` (server-scoped `GraphList`, §9).
-
-Annotations (rmcp defaults `destructiveHint` **and** `openWorldHint` to true — set
-explicitly): read tools → `read_only(true).open_world(false)`;
-`mutate`/`load`/`branches_delete`/`branches_merge`/`schema_apply` →
-`read_only(false).destructive(true).open_world(false)`; `branches_create`
-(additive) → `destructive(false)`.
-
-Represent built-ins as a `Builtin` enum (one variant per tool; `descriptor`/`gate`/`call`
-as match arms) — lower liability than ~13 unit structs + `dyn` + `async-trait`.
-Stored-query tools are a sibling populator over `handle.queries`.
-
-`list_tools`/`list_resources` are Cedar-filtered by running the **same**
-authorization the call path runs, with **default args (branch `main`)** — not a
-`branch: None` probe (which matches no `branch_scope` rule and would hide callable
-tools). Over-showing a branch-scoped grant is the safe direction; `call_tool` is
-the authoritative gate. (The contract: list never shows a tool the actor can't ever
-call; for branch-scoped tools it may show one callable only on some branches.)
-
----
-
-## 7. Stored-query projection — `EXISTING (reuse)` + the **corrected** schema
+**Auth posture (spec-aligned, MCP 2025-11-25 authorization).** The server is a
+Resource Server. Per-request validation only — **sessions are never used for
+authentication** (the transport is stateless, which makes this structural). Token
+**audience must be validated** and **token passthrough is prohibited**: if a tool
+later needs to reach an upstream API, the server acts as a separate OAuth client and
+must not forward the client's token.
+
+- **Static bearer (today).** `require_bearer_auth` resolves a `ResolvedActor` from a
+  SHA-256 hash match. Works for the developer/agent clients (§13).
+- **OAuth 2.1 + RFC 9728 (additive, MR-956).** Serve
+  `/.well-known/oauth-protected-resource`; on `401`, optionally add
+  `WWW-Authenticate: Bearer resource_metadata="…"` (header is optional in 2025-11-25
+  given the well-known fallback). Clients run OAuth 2.1 + PKCE + RFC 8707 resource
+  indicators themselves; the server validates audience-bound JWTs offline (cached
+  JWKS), so on-prem/air-gapped keeps working. This swaps the bearer middleware behind
+  a `TokenVerifier` and changes **zero** MCP code.
+
+> **Compatibility caveat to honor (Claude Code issue #59467).** Advertising RFC-9728
+> Protected-Resource-Metadata can cause some clients (Claude Code today) to **ignore
+> a static `Authorization` header and force the OAuth flow**. So PRM advertisement
+> must be **config-gated**: a deployment serving developer clients over static bearer
+> does not advertise OAuth; a deployment targeting consumer connectors does. The MCP
+> routes only need to flow through the standard `401` path so the hook can be added
+> without touching MCP code.
+
+## 9. Stored-query projection
 
 The projection source is the same `query_catalog_entry` the `GET /queries` catalog
-uses. The real types (note `Vector` is a **unit** variant; the dim is a separate
-`Option<u32>` field — this is delta **D1**):
+uses. Real types:
 
 ```rust
-// EXISTING — crates/omnigraph-server/src/api.rs:346
+// Reuses — crates/omnigraph-server/src/api.rs:346
 pub enum ParamKind { String, Bool, Int, BigInt, Float, Date, DateTime, Blob, Vector, List }
 
-// crates/omnigraph-server/src/api.rs:362
+// Reuses — crates/omnigraph-server/src/api.rs:362
 pub struct ParamDescriptor {
     pub name: String,
     pub kind: ParamKind,
     pub item_kind: Option<ParamKind>,   // Some(scalar) when kind == List
-    pub vector_dim: Option<u32>,        // Some(dim) when kind == Vector  ← dim lives HERE, not in the kind
+    pub vector_dim: Option<u32>,        // Some(dim) when kind == Vector — the dimension lives here, not in the kind
     pub nullable: bool,
 }
 
-// crates/omnigraph-server/src/api.rs:453
-pub fn query_catalog_entry(query: &StoredQuery) -> QueryCatalogEntry;  // → { name, tool_name, description, instruction, mutation, params }
+// Reuses — crates/omnigraph-server/src/queries.rs:32
+pub struct StoredQuery { pub name: String, pub source: Arc<str>, pub decl: QueryDecl, pub expose: bool, pub tool_name: Option<String> }
+impl StoredQuery { pub fn is_mutation(&self) -> bool; pub fn effective_tool_name(&self) -> &str; }   // queries.rs:51,60
+pub struct QueryRegistry { /* by_name: BTreeMap<String, StoredQuery>; .lookup(&name) */ }            // queries.rs:67
 ```
 
-```rust
-// EXISTING — crates/omnigraph-server/src/queries.rs:32
-pub struct StoredQuery {
-    pub name: String,
-    pub source: Arc<str>,
-    pub decl: QueryDecl,
-    pub expose: bool,             // default true
-    pub tool_name: Option<String>,
-}
-impl StoredQuery {
-    pub fn is_mutation(&self) -> bool;           // queries.rs:51
-    pub fn effective_tool_name(&self) -> &str;   // queries.rs:60 — tool_name override else name
-}
-pub struct QueryRegistry { /* by_name: BTreeMap<String, StoredQuery> */ }   // queries.rs:67; .lookup(&name)
-```
+A query is declared in the graph's stored-query registry — `cluster.yaml
+graphs.<id>.queries` with file discovery for cluster/multi-graph servers, or the
+legacy single-graph `omnigraph.yaml queries:` map. Each entry carries MCP metadata
+(`expose: bool`, default `true`; optional `tool_name`); the projection reads the
+loaded `QueryRegistry` on `handle.queries` and is agnostic to the declaration source.
+
+### 9.1 `ParamDescriptor → JSON Schema` (`New`)
 
-### 7.1 `ParamDescriptor → JSON Schema` — `NEW (build this)`, corrected for the real types
+JSON Schema 2020-12. The `Vector` dimension is an `Option<u32>` on the descriptor:
+when present, constrain the array length; when absent, **omit** `minItems`/`maxItems`
+(never emit `0`, which would reject all input).
 
 ```rust
-// NEW (build this) — corrects the blueprint's nonexistent `ParamKind::Vector { dim }`.
+// New — crates/omnigraph-server/src/mcp/schema.rs
 use serde_json::{json, Value};
 
 fn scalar_schema(kind: ParamKind) -> Value {
@@ -558,7 +357,6 @@ fn scalar_schema(kind: ParamKind) -> Value {
         ParamKind::Date     => json!({ "type": "string", "format": "date" }),
         ParamKind::DateTime => json!({ "type": "string", "format": "date-time" }),
         ParamKind::Blob     => json!({ "type": "string", "contentEncoding": "base64" }),
-        // Vector/List handled by param_schema (they carry sub-structure).
         ParamKind::Vector | ParamKind::List => unreachable!("composite kinds handled in param_schema"),
     }
 }
@@ -567,382 +365,373 @@ fn param_schema(p: &ParamDescriptor) -> Value {
     match p.kind {
         ParamKind::Vector => {
             let mut s = json!({ "type": "array", "items": { "type": "number" } });
-            // dim is Option<u32> on the descriptor. Present ⇒ constrain length.
-            // Absent ⇒ OMIT minItems/maxItems — never emit 0 (a 0-length-array
-            // schema rejects all input; this is the footgun the first draft
-            // misattributed to a type that doesn't exist).
-            if let Some(dim) = p.vector_dim {
-                s["minItems"] = json!(dim);
-                s["maxItems"] = json!(dim);
-            }
+            if let Some(dim) = p.vector_dim { s["minItems"] = json!(dim); s["maxItems"] = json!(dim); }
             s
         }
-        ParamKind::List => {
-            let item = p.item_kind.map(scalar_schema).unwrap_or_else(|| json!({ "type": "string" }));
-            json!({ "type": "array", "items": item })  // grammar guarantees scalar items
-        }
+        ParamKind::List => json!({ "type": "array", "items": p.item_kind.map(scalar_schema).unwrap_or_else(|| json!({"type":"string"})) }),
         scalar => scalar_schema(scalar),
     }
 }
 ```
 
-### 7.2 Envelope (correct by construction)
+### 9.2 Two projection modes (small vs large catalogs)
+
+Tool-overload is real: model accuracy degrades sharply as a single client's tool
+count climbs past a few dozen, and clients that don't defer tool loading (e.g.
+OpenCode) pay the full `tools/list` token cost. So the projection has two modes,
+selected per graph by a `mcp.stored_query_mode` setting (default `auto`):
 
-The tool's `input_schema` **nests query params under `params`**, mirroring
-`POST /queries/{name}`, so the invocation knobs and the query's own params live in
-separate namespaces and cannot collide:
+- **`per_query` (small/stable catalogs).** One tool per `expose: true` query, named by
+  `effective_tool_name()`, with a fully typed `input_schema`. This is the richest
+  surface — each query is a first-class typed tool, ideal for code-mode runtimes that
+  compile tools into a typed API.
+- **`meta` (large/dynamic catalogs).** Two tools instead of N: `stored_query_list(filter?,
+  detail_level?)` (returns names + descriptions; full param schema only at higher
+  detail) and `stored_query_run(name, params, branch?, snapshot?)`. This keeps
+  `tools/list` small and mirrors the progressive-disclosure shape (`search` + `execute`)
+  that scales to hundreds of queries.
+- **`auto`** picks `per_query` below a threshold (default 24 exposed queries) and `meta`
+  at or above it; the threshold is configurable. The boundary and count are logged so
+  a deployment never silently flips modes.
+
+### 9.3 Envelope (collision-free by construction)
+
+In `per_query` mode the tool's `input_schema` **nests query params under `params`**,
+mirroring `POST /queries/{name}`:
 
 ```jsonc
 { "type": "object",
   "properties": {
-    "params": { "type": "object", "properties": { /* per-param param_schema(...) */ }, "required": [ /* nullable==false */ ] },
+    "params":   { "type": "object", "properties": { /* per-param param_schema(...) */ }, "required": [ /* nullable == false */ ] },
     "branch":   { "type": "string" },
-    "snapshot": { "type": "string" }   // OMIT for mutation tools (mutation-against-snapshot is unrepresentable)
+    "snapshot": { "type": "string" }      // omit for mutation tools — mutation-against-snapshot is unrepresentable
   },
   "additionalProperties": false }
 ```
 
-Dispatch reads query params from `args.params` and knobs from `args.branch` /
-`args.snapshot`, so a query parameter literally named `branch`/`snapshot` is
-harmless. Skip a stored tool whose `effective_tool_name()` collides with a
-built-in (built-ins win). The outer gate is coarse `InvokeQuery`; the inner gate
-is the query body's `Read`/`Change` — the same double-gate as §5.3.
+Knobs (`branch`/`snapshot`) and the query's own params live in separate namespaces, so
+a query parameter literally named `branch`/`snapshot` cannot collide. A stored tool
+whose `effective_tool_name()` collides with a built-in is skipped (built-ins win).
 
-### 7.3 Where stored queries are declared (delta D6)
+## 10. Tool catalog + Cedar mapping — `Reuses` `PolicyAction`
 
-`expose` + `tool_name` survive with default `expose: true` (`config.rs:132`,
-`queries.rs:43`). The **declaration source** moved: multi-graph servers declare
-queries in **`cluster.yaml graphs.<id>.queries`** with Terraform-shaped file
-discovery (`queries: queries/` discovers top-level `*.gq`, loud on parse/dup
-errors — `677320c`); the single-graph `omnigraph.yaml queries:` map still parses
-but is the **legacy** surface RFC-008 deprecates. The MCP projection is agnostic
-to the source — it reads the loaded `QueryRegistry` on `handle.queries` — but the
-blueprint's "`in omnigraph.yaml`" phrasing is corrected to "the graph's loaded
-stored-query registry (`cluster.yaml` discovery, or legacy `omnigraph.yaml`)."
+Each built-in reuses the **exact `PolicyAction` its REST route enforces**:
 
----
+```rust
+// Reuses — crates/omnigraph-policy/src/lib.rs:18
+pub enum PolicyAction {
+    Read, Export, Change, SchemaApply,
+    BranchCreate, BranchDelete, BranchMerge,
+    Admin,        // reserved, no call site yet
+    GraphList,    // server-scoped (resource_kind == Server)
+    InvokeQuery,  // graph-scoped, coarse (no per-query dimension yet)
+}
+```
 
-## 8. The `load` tool — `EXISTING (reuse)`, ingest/load unified (delta D3)
+| MCP tool | Scope | Cedar action |
+|---|---|---|
+| `health` | server | none (liveness/version) |
+| `graph_snapshot`, `schema_get`, `branch_list`, `commit_list`, `commit_get` | graph | `Read` |
+| `graph_query` (ad-hoc read) | graph | `Read` (`run_query` self-authorizes) |
+| `graph_mutate` (ad-hoc write) | graph | `Change` |
+| `graph_load` (NDJSON) | graph | `Change` (+ `BranchCreate` **iff** `from` is present — see §11) |
+| `branch_create` / `branch_delete` / `branch_merge` | graph | `BranchCreate` / `BranchDelete` / `BranchMerge` |
+| `schema_apply` (`allow_data_loss`) | graph | `SchemaApply` |
+| stored query (`per_query`) / `stored_query_run` (`meta`) | graph | `InvokeQuery` (coarse) then inner `Read`/`Change` |
+
+**Naming.** Tool ids are **domain-qualified `snake_case`** (`graph_query`,
+`branch_merge`, `schema_apply`, …) within the spec's `[A-Za-z0-9_.-]`, 1–128-char
+constraint. Domain qualification (rather than bare `query`/`mutate`) reduces
+cross-server collisions when a client loads omnigraph alongside other MCP servers;
+clients that auto-prefix by connection name (e.g. OpenCode → `omnigraph_graph_query`)
+compose cleanly. Names are a stability contract (Hyrum's Law) — don't churn them.
+
+**Annotations (set explicitly).** MCP annotation defaults are pessimistic
+(`readOnlyHint=false`, `destructiveHint=true`, `idempotentHint=false`,
+`openWorldHint=true`), so an unannotated read tool is mistaken for a destructive
+open-world writer. Set them via rmcp's `ToolAnnotations` (`read_only_hint`,
+`destructive_hint`, `idempotent_hint`, `open_world_hint`):
+
+- read tools (`graph_query`, `graph_snapshot`, `schema_get`, `branch_list`,
+  `commit_*`, stored *reads*) → `read_only_hint = true`, `open_world_hint = false`.
+- writers (`graph_mutate`, `graph_load`, `branch_delete`, `branch_merge`,
+  `schema_apply`) → `read_only_hint = false`, `destructive_hint = true`,
+  `open_world_hint = false`. Clients use `destructiveHint` to drive human-confirmation
+  prompts.
+- `branch_create` (additive) → `destructive_hint = false`.
+
+Annotations are **advisory hints, not a security boundary** (clients may ignore them);
+**Cedar is the enforcement boundary.**
+
+Represent built-ins as a `Builtin` enum (one variant per tool; `descriptor` / `gate` /
+`call` as match arms) — lower liability than ~13 unit structs + `dyn`. Stored-query
+tools are a sibling populator over `handle.queries`.
+
+**`list_tools` / `list_resources` are Cedar-filtered** by running the *same*
+authorization the call path runs, with **default args (branch `main`)** — not a
+`branch: None` probe (which matches no `branch_scope` rule and would hide tools the
+actor can call on a scoped branch). Over-showing a branch-scoped grant is the safe
+direction; `call_tool` is the authoritative gate.
+
+## 11. Dispatch reuse + error classification
 
-The first draft described an `ingest` tool wrapping `ingest_as` with implicit
-fork-from-`main`. **That is gone.** `server_ingest` now calls `load_as` directly and
-fork is opt-in by `from`:
+`call_tool` adds no business logic. Reuse points (all in `handlers.rs`):
 
 ```rust
-// EXISTING — crates/omnigraph-server/src/handlers.rs:1210  (abridged to the load-bearing logic)
-let branch = request.branch.unwrap_or_else(|| "main".to_string());
-let from = request.from;                                  // Option<String>
-let mode = request.mode.unwrap_or(LoadMode::Merge);
-if !branch_exists {
-    match from.as_deref() {
-        None => return Err(ApiError::not_found(           // ← no implicit fork; a typo'd branch is a 404
-            format!("branch '{branch}' not found; pass `from` to create it"))),
-        Some(from) => authorize_request(actor, handle.policy.as_deref(), PolicyRequest {
-            action: PolicyAction::BranchCreate,           // ← BranchCreate consulted ONLY when a fork happens
-            branch: Some(from.to_string()), target_branch: Some(branch.clone()),
-        })?,
-    }
-}
-authorize_request(actor, handle.policy.as_deref(), PolicyRequest {
-    action: PolicyAction::Change, branch: Some(branch.clone()), target_branch: None,
-})?;
-let result = handle.engine.load_as(&branch, from.as_deref(), &request.data, mode, actor_id).await?;  // unified load_as
+pub(crate) enum Authz { Allowed, Denied(String) }                                  // handlers.rs:336
+pub(crate) fn authorize(actor: Option<&ResolvedActor>, policy: Option<&PolicyEngine>, request: PolicyRequest) -> Result<Authz, ApiError>; // :357 — Err = operational 401/500
+pub(crate) async fn run_query(handle: Arc<GraphHandle>, actor: Option<&ResolvedActor>, query: &str, name: Option<&str>, params_json: Option<&Value>, branch: Option<String>, snapshot: Option<String>, reject_mutations: bool) -> Result<(String, ReadTarget, QueryResult), ApiError>; // :731
+pub(crate) async fn run_mutate(state: AppState, handle: Arc<GraphHandle>, actor: Option<&ResolvedActor>, query: &str, name: Option<&str>, params_json: Option<&Value>, branch: String) -> Result<ChangeOutput, ApiError>; // :665
 ```
 
-DTOs (`api.rs:497` / `:208`):
+`PolicyRequest` carries `{ action, branch, target_branch }` only — **no actor
+identity** (server-resolved, supplied separately) and **no query-name dimension**
+(the coarse-`invoke_query` caveat):
 
 ```rust
-pub struct IngestRequest {                 // request body shape unchanged on the wire
-    pub branch: Option<String>,            // default "main"; without `from`, must already exist (else 404)
-    pub from:   Option<String>,            // parent branch; presence = opt into fork-if-missing
-    pub mode:   Option<LoadMode>,          // default Merge
-    pub data:   String,                    // NDJSON: {"type":"<T>","data":{…}} per line
-}
-pub struct IngestOutput {
-    pub uri: String, pub branch: String,
-    pub base_branch: Option<String>,       // echoes `from`; NULL when absent (was non-null in the blueprint era)
-    pub branch_created: bool, pub mode: LoadMode,
-    pub tables: Vec<IngestTableOutput>, pub actor_id: Option<String>,
-}
+// Reuses — crates/omnigraph-policy/src/lib.rs:252
+pub struct PolicyRequest { pub action: PolicyAction, pub branch: Option<String>, pub target_branch: Option<String> }
 ```
 
-**Spec decisions for the MCP tool:**
+**The stored-query double-gate + deny-masking pattern** (`handlers.rs:933`,
+`server_invoke_query`) is the contract `call_tool` mirrors for stored queries:
 
-1. **Name the tool `load`, not `ingest`.** The CLI now ships `ingest` only as a
-   *"Deprecated alias of `load --from <base>`"* (`cli.rs:111`); `load` is the
-   unified command. A fresh MCP with no legacy clients should expose the canonical
-   id — consistent with §6's "no `read`/`change` aliases" decision. (The HTTP route
-   stays `POST /ingest` for back-compat; only the MCP tool id is canonicalized.)
-2. **`do_load` wraps `load_as`** via the same body as `server_ingest`.
-3. **Input schema carries `from?`** and documents the 404-on-missing-branch:
-   `{ data: string, branch?: string, from?: string, mode?: "merge"|"append"|"overwrite" }`,
-   `additionalProperties: false`. There is no silent fork-from-main to represent.
+```rust
+// Reuses (pattern) — outer InvokeQuery gate; deny == missing so the catalog can't be probed
+match authorize(actor, handle.policy.as_deref(), PolicyRequest {
+    action: PolicyAction::InvokeQuery, branch: None, target_branch: None,   // graph-scoped: NO branch dimension
+})? {
+    Authz::Allowed => {}
+    Authz::Denied(_) => return Err(ApiError::not_found("stored query not found")),
+}
+let stored = handle.queries.as_ref().and_then(|r| r.lookup(&name)).ok_or_else(|| ApiError::not_found("stored query not found"))?;
+// inner gate runs in run_mutate (Change) / run_query (Read); a stored mutation is double-gated.
+```
 
----
+**`graph_load` (NDJSON)** wraps the unified `load_as` (`server_ingest`,
+`handlers.rs:1210`): a missing branch with **no `from` is a `404`, never an implicit
+fork**; `BranchCreate` is consulted only when `from` is present, then `Change` for the
+load. The tool's `input_schema` is `{ data: string, branch?: string, from?: string,
+mode?: "merge"|"append"|"overwrite" }`, `additionalProperties: false`.
 
-## 9. Routing — `EXISTING (reuse)`, validated
+**Error classification (`New`, one mapper, SEP-1303-aligned).** `ApiError`'s fields are
+private (`lib.rs:296`), so add `pub(crate) fn status_code(&self)`/`message_str(&self)`
+accessors. Then one `classify` is used at every dispatch site:
 
-`/mcp` is added **into `per_graph_protected`**, so it is flat in single mode and
-nested under `/graphs/{graph_id}` in multi mode automatically — no separate
-wiring. The real `build_app` (`lib.rs:907`):
+```rust
+// New — the single source of truth
+fn classify(r: Result<CallToolResult, ApiError>) -> Result<CallToolResult, McpError> {
+    match r {
+        Ok(out) => Ok(out),
+        // Semantic failures (bad params, validation, business 4xx/409) → isError result,
+        // fed back to the model so it self-corrects (MCP 2025-11-25 SEP-1303).
+        Err(e) if e.status_code().is_client_error() => Ok(CallToolResult::error(vec![Content::text(e.message_str())])),
+        // Operational failures (5xx) → JSON-RPC protocol error.
+        Err(e) => Err(McpError::internal_error(e.message_str().to_owned(), None)),
+    }
+}
+```
+
+Two cases are protocol errors, not `isError`, so the catalog isn't probeable and
+malformed calls are unambiguous: an **unknown OR denied tool** returns an identical
+`McpError::invalid_params("unknown tool: <name>")` (`-32602`), and a structurally
+malformed call (failing the `tools/call` shape) is a protocol error. A missing/bad
+bearer is an HTTP `401` at the boundary, before rmcp.
+
+## 12. Code-mode compatibility
+
+"Code mode" (Anthropic's *Code execution with MCP*; Cloudflare's *Code Mode*) is a
+**client/runtime** technique: the client compiles a server's tools into a typed code
+API (TS modules / a sandbox), the model writes code against it, and intermediate
+results are filtered in the sandbox instead of round-tripping through the model
+context (reported ~98% context savings on large workflows). It runs over **standard
+`tools/list` + `tools/call`** and **requires no new server endpoints**; credentials
+stay in the transport and the runtime holds them (the sandbox never sees the bearer).
+
+The server's job is to be a *good source* for that compilation. Concrete server-side
+choices this RFC adopts:
+
+1. **Strict, fully-typed `input_schema`** (§9.1) with `additionalProperties: false`,
+   enums for `mode`/`format`, explicit `required` — these compile into precise TS
+   input types.
+2. **Structured output** — see §13.1: declare `outputSchema` and return
+   `structuredContent` so generated code gets typed *returns*, not `any`.
+3. **Stable, descriptive tool names + rich descriptions** (§10) — names become
+   function names; descriptions become doc comments.
+4. **Progressive disclosure for large catalogs** — the `meta` projection mode (§9.2)
+   keeps `tools/list` small (`stored_query_list` + `stored_query_run`), the same
+   `search` + `execute` shape code-mode runtimes prefer.
+5. **Pagination** on every list-style result (cursor-based) so a code-mode run fetches
+   and filters incrementally without context blow-up.
+6. **Schemas as resources** (§14) — expose the graph schema (and per-query param
+   schemas) as MCP resources, the on-demand channel code-mode clients pull from.
+7. **Auth in the transport only** — never require secrets as tool *arguments* (that
+   would put them in model context / generated code and break the sandbox's credential
+   isolation).
+
+The server deliberately does **not** build TS-wrapper generation, sandboxes, tool
+search/deferral, or PII tokenization — those are client/runtime concerns, and there is
+no ratified "tools-as-code" MCP spec to target.
+
+## 13. Provider compatibility
+
+**Transport: Streamable HTTP is the universal target** — every current client below
+supports it for remote servers, and it is the recommended transport over deprecated
+HTTP+SSE.
+
+**Auth splits the ecosystem into two tiers:**
+
+| Client | Remote transport | Auth that works | Notes |
+|---|---|---|---|
+| **Claude Code** (CLI) | Streamable HTTP | static bearer header **and** OAuth 2.1 | `claude mcp add --transport http <url>/mcp --header "Authorization: Bearer …"`. Advertising RFC-9728 can override the static header (issue #59467) — gate PRM. |
+| **Cursor** | Streamable HTTP | static header **and** OAuth 2.1 | `"headers": {"Authorization": "Bearer ${env:…}"}` in `mcp.json`. |
+| **VS Code** (Copilot agent) | Streamable HTTP | static header **and** OAuth | needs VS Code ≥ 1.101 for remote + OAuth; auto-detects `401` → sign-in. |
+| **OpenCode** | remote HTTP | static header **and** OAuth (auto, DCR) | `mcp` block in `opencode.json`; auto-prefixes tools `omnigraph_…`; **no progressive disclosure** → keep the static surface tight (favors `meta` mode at scale). |
+| **Claude Messages API** (`mcp_servers`) | Streamable HTTP (+SSE) | pre-acquired token via `authorization_token` | forwards a token; never runs OAuth. Static bearer fits directly. Pin the beta header you target. |
+| **OpenAI Responses API** (`mcp` tool) | Streamable HTTP (+SSE) | arbitrary `headers` (static bearer) or pre-acquired `authorization` | forwards a token; never runs OAuth. `require_approval` gates tool calls. |
+| **ChatGPT** (developer mode/connectors) | Streamable HTTP (+SSE) | OAuth, **No-Auth**, or Mixed | beta; OAuth is the clean path. |
+| **Claude Desktop** (custom connectors) | Streamable HTTP (+SSE) | **OAuth 2.1 or authless** | no static-header field — bearer-only deployments are unreachable without a gateway. |
+| **Claude.ai web** (custom connectors) | Streamable HTTP (+SSE) | **OAuth 2.1 + RFC 9728** (or authless) | server **must** serve RFC-9728 PRM; no static-header field. |
+
+**Phased auth recommendation:**
+
+- **Phase 1 — static bearer (this RFC).** Reaches Claude Code, Cursor, VS Code
+  Copilot, OpenCode, the Claude Messages API connector, and the OpenAI Responses API —
+  the entire developer/agent/API tier. This is the correct launch posture.
+- **Phase 2 — OAuth 2.1 + RFC 9728 (MR-956, additive).** Required to reach **claude.ai
+  web** and **Claude Desktop** custom connectors and the clean ChatGPT path. The same
+  endpoint accepts validated OAuth access tokens and (still) static bearers; PRM
+  advertisement stays config-gated because of the #59467 header-override behavior.
+
+Because the resource server validates whatever token arrives on `Authorization`,
+both tiers hit one endpoint with no MCP-layer branching.
+
+### 13.1 Result shaping & structured output
+
+For typed, machine-consumable results (`graph_query`, stored-query reads,
+`branch_list`, `commit_*`, `schema_get`) the tool declares an **`outputSchema`** and
+returns **`structuredContent`** (the route's existing `ReadOutput` / listing DTOs,
+which already derive `ToSchema`), **and also** mirrors the JSON in a text `Content`
+block for clients that don't parse structured content. Plain text-JSON is used where a
+fixed schema is awkward. (Some clients still mishandle `structuredContent: null` —
+emit an empty object, never `null`, when there is no structured payload.)
+
+## 14. Resources
+
+Two resources: `omnigraph://schema` (`Read` → schema `.pg` text) and
+`omnigraph://branches` (`Read` → branches JSON). Both are Cedar-filtered and
+deny-masked exactly like tools — a locked-down agent denied `Read` never sees them,
+which is how the "agents don't introspect schema" intent is met by *policy*, not
+omission. Advertise the `resources` capability with `subscribe:false,
+listChanged:false` (both handlers are backed — don't advertise a capability whose
+`read` would 404). Exposing the schema as a resource is also the on-demand channel
+code-mode clients pull from (§12).
+
+No `omnigraph://graphs` resource and no `graphs_list` tool — server-scoped graph
+discovery stays REST-only via `GET /graphs` (§15).
+
+## 15. Routing — `Reuses` `build_app`
+
+`/mcp` is mounted **inside `per_graph_protected`**, so it is flat in single mode and
+nested under `/graphs/{graph_id}` in multi mode automatically:
 
 ```rust
-// EXISTING — crates/omnigraph-server/src/lib.rs:915 (abridged)
+// Reuses — crates/omnigraph-server/src/lib.rs:915 (abridged)
 let per_graph_protected = Router::new()
     .route("/snapshot", get(server_snapshot))
     // … /query /mutate /queries /queries/{name} /schema /schema/apply /ingest /branches /commits …
-    .route("/mcp", post(server_mcp))                          // ← ADD HERE (or .merge(mcp_router(state)))
+    .merge(mcp::mcp_router(state.clone()))                       // ← ADD: brings its own tower-http body-limit layer
     .route_layer(middleware::from_fn_with_state(state.clone(), resolve_graph_handle))  // inner: injects Arc<GraphHandle>
     .route_layer(middleware::from_fn_with_state(state.clone(), require_bearer_auth));  // outer: injects ResolvedActor / 401
 
 let management = Router::new()
-    .route("/graphs", get(server_graphs_list))               // GraphList — server-scoped, NOT in MCP
+    .route("/graphs", get(server_graphs_list))                  // GraphList — server-scoped, REST-only
     .route_layer(middleware::from_fn_with_state(state.clone(), require_bearer_auth));
 
 let protected = match state.routing() {
-    GraphRouting::Single { .. } => per_graph_protected.merge(management),                  // → POST /mcp
+    GraphRouting::Single { .. } => per_graph_protected.merge(management),                 // → POST /mcp
     GraphRouting::Multi  { .. } => Router::new()
-        .nest("/graphs/{graph_id}", per_graph_protected).merge(management),                // → POST /graphs/{id}/mcp
+        .nest("/graphs/{graph_id}", per_graph_protected).merge(management),               // → POST /graphs/{id}/mcp
 };
 ```
 
-Because `mcp_router` returns its own `Router` with a tower-http body-limit layer,
-prefer `.merge(mcp::mcp_router(state.clone()))` into `per_graph_protected` over a
-bare `.route("/mcp", …)` so the `/mcp`-specific limit doesn't leak onto the other
-routes. **No server-scoped MCP** — `graphs_list` / `omnigraph://graphs` are dropped;
-graph discovery is the `management` group's `GET /graphs`. A future server-level
-flat `/mcp` (bearer-only, no handle) would live in `management`, but is not built
-speculatively. **Do not** consolidate to a single flat `/mcp` that takes `graph_id`
-per call: MCP's `tools/list` can't carry a graph, so it couldn't list per-graph
-stored-query tools, and it would break isolation.
-
----
-
-## 10. Auth & the client on-ramp — `EXISTING (reuse)` RFC-007 (delta D4)
-
-**Server side** (unchanged from the blueprint and correct): the backend consumes an
-already-resolved `ResolvedActor` and **branches on nothing** about how the token was
-verified. Static bearer works today; OAuth/RFC-9728 (MR-956) is an additive layer
-that swaps the bearer middleware behind a `TokenVerifier` and serves
-`/.well-known/oauth-protected-resource` — **zero MCP changes**. Token validation is
-offline (cached JWKS), so on-prem/air-gapped keeps working with no request-path
-cloud dependency; the endpoint is a Resource Server only (never issues tokens,
-never holds a client secret). Multi-user identity passes through: the *caller's*
-token (a per-tenant audience-bound JWT) reaches the server so Cedar enforces
-per-user policy — never a shared service token.
-
-```
-request → [require_bearer_auth: ResolvedActor] → [resolve_graph_handle: GraphHandle] → [/mcp] → Cedar → tool
-```
-
-| Integration | Static bearer | Note |
-|---|---|---|
-| Claude Code, Cursor, VS Code | ✅ | `claude mcp add --transport http <url>/mcp --header "Authorization: Bearer <tok>"` |
-| Claude Messages API MCP connector | ✅ | caller passes `authorization_token` |
-| claude.ai web / Desktop, ChatGPT dev-mode | ❌ needs OAuth fast-follow | OAuth 2.1 + PKCE + RFC 9728 |
-
-**Client side — the *shipped* RFC-007 model (delta D4).** The landed operator config
-is:
-
-```rust
-// EXISTING — crates/omnigraph-cli/src/operator.rs:74
-pub(crate) struct OperatorServer { pub(crate) url: String }   // field is `url`, NOT `endpoint`; NO `auth:` block
-```
-
-```yaml
-# ~/.omnigraph/config.yaml  (operator config — "No tokens in this file, ever")
-servers:
-  prod:    { url: https://og.internal:8080 }
-  edge:    { url: https://og-edge:8080 }
-  cloud:   { url: https://api.omnigraph.cloud }     # OAuth is resolved by the client, not declared here
-```
-
-Token resolution is a **two-step keyed chain** (`operator.rs:224`), then the legacy
-`bearer_token_env` fallback — **there is no keychain step** (the blueprint's middle
-`omnigraph:<name>` keychain link does not exist):
-
-```rust
-// EXISTING — crates/omnigraph-cli/src/operator.rs:229
-// 1. OMNIGRAPH_TOKEN_<NAME>  (env; `intel-dev` → OMNIGRAPH_TOKEN_INTEL_DEV)
-// 2. [<name>] token = …      in ~/.omnigraph/credentials  (0600; over-permissive ⇒ loud error)
-pub(crate) fn resolve_keyed_token(server: &str) -> Result<Option<String>>;
-```
-
-The `omnigraph mcp install --agent <tool>` on-ramp (MR-974) writes the agent's MCP
-client config pointing at `<server url>/mcp` with the resolved bearer; it reuses
-this chain, so OAuth becomes a future *sibling* of the token chain, not a re-key.
-
----
-
-## 11. Tests & verification (corrected suite names — delta D5)
-
-The `server.rs` monolith is gone. MCP tests land in a **new
-`crates/omnigraph-server/tests/mcp.rs`** suite (black-box over `build_app`);
-stored-query *projection* tests extend the existing **`stored_queries.rs`**. Cover:
-
-- **Protocol:** `initialize` + advertised `{tools, resources}`; `tools/list` shape;
-  `tools/call` happy path; JSON-RPC errors (`-32601`/`-32602`); `resources/list` +
-  `resources/read`; `GET /mcp → 405`; `MCP-Protocol-Version` 400/default; `Origin → 403`.
-- **Cedar (coarse):** read-only actor sees read tools but not `mutate`/`load`/`branches_*`/`schema_apply`;
-  a denied `tools/call` masks byte-identically to an unknown tool; stored queries
-  list only with `invoke_query`; the double-gate (an `invoke_query`-only actor sees
-  a stored tool but the call surfaces `isError` when the inner `Read` denies).
-- **Dispatch:** a `mutate` call writes end-to-end (proves the actor/handle extension
-  passthrough); a malformed query → `isError:true`, not a JSON-RPC error. A `load`
-  with a missing branch and **no `from` → `isError`** (404 classified); with `from`
-  → forks.
-- **Resources:** list + read of `schema`/`branches`; a denied read masks as unknown.
-- **Schema generation:** table-driven over every `ParamKind` incl. nullable / list /
-  `vector` **with and without `vector_dim`** (the absent-dim path must omit
-  `minItems`/`maxItems`, never emit 0 — the D1 regression guard).
-- **Auth decoupling / no-bearer:** `/mcp` 401s without a bearer (before rmcp) and
-  200s with one; green under the static-hash verifier and a mock `ResolvedActor`.
+`mcp::mcp_router(state)` is the server's thin wrapper:
+`omnigraph_mcp::mcp_router(OmnigraphMcpBackend::new(state), INGEST_REQUEST_BODY_LIMIT_BYTES /* lib.rs:137, 32 MiB */, host_policy_from_bind(…))`.
+Merging the router (rather than `.route("/mcp", …)`) keeps the `/mcp`-specific body
+limit from leaking onto the other routes.
+
+**No server-scoped MCP.** Every MCP tool/resource is graph-scoped. `tools/list` can't
+carry a graph id, so a single flat `/mcp` taking `graph_id` per call couldn't list
+per-graph stored-query tools and would break isolation — hence per-graph routing. A
+future server-level flat `/mcp` (bearer-only, no handle, server-scoped tools only)
+would live in the `management` group, but is not built speculatively.
+
+## 16. Tests & verification
+
+MCP tests land in a new `crates/omnigraph-server/tests/mcp.rs` suite (black-box over
+`build_app`); stored-query *projection* tests extend `stored_queries.rs`.
+
+- **Protocol:** `initialize` + advertised `{tools, resources}` caps; `tools/list`
+  shape + pagination; `tools/call` happy path; `GET /mcp → 405`;
+  `MCP-Protocol-Version` 400/default; `Origin → 403`; unknown/denied tool →
+  identical `-32602`.
+- **Cedar:** a read-only actor sees read tools but not writers; a denied call masks
+  byte-identically to an unknown one; stored queries appear only with `invoke_query`;
+  the double-gate (an `invoke_query`-only actor sees a stored tool but the call
+  surfaces `isError` when the inner `Read` denies).
+- **Dispatch:** a `graph_mutate` writes end-to-end (proves the actor/handle extension
+  passthrough); a malformed query → `isError:true`, not a protocol error; `graph_load`
+  with a missing branch and no `from` → `isError` (404), with `from` → forks.
+- **Schema generation:** table-driven over every `ParamKind` incl. nullable, list, and
+  `vector` **with and without `vector_dim`** (the absent-dim path omits
+  `minItems`/`maxItems`).
+- **Structured output:** `outputSchema` present and `structuredContent` validates
+  against it; the text mirror is present; never emits `structuredContent: null`.
+- **Projection modes:** `per_query` below the threshold, `meta` at/above it, with the
+  switch logged.
+- **Auth decoupling:** `/mcp` `401`s without a bearer (before rmcp) and `200`s with
+  one; green under the static-hash verifier and a mock OIDC `ResolvedActor`.
 - **Crate-level:** `omnigraph-mcp/tests/` with a trivial `McpBackend` proving the
-  crate stands alone (`initialize` + `GET → 405`), plus an rmcp **surface guard**
-  pinning `StreamableHttpServerConfig`'s `with_*` setters and the `ServerHandler`
-  method shapes (the only place the unverifiable rmcp-`Parts`-passthrough assumption
-  is anchored).
+  crate stands alone (`initialize` + `GET → 405`), plus an **rmcp surface guard**
+  pinning `StreamableHttpServerConfig`'s `with_*` setters, `NeverSessionManager`, the
+  `ServerHandler` method shapes, and the `RequestContext.extensions →
+  http::request::Parts` passthrough — the smoke check on any rmcp bump.
 
-Verification commands (corrected):
+Verification commands:
 
 ```bash
 cargo build --workspace --locked
-cargo tree -p omnigraph-server -e normal | grep rmcp     # rmcp ONLY under omnigraph-mcp, never direct
-cargo test -p omnigraph-server --test mcp                # NEW suite (NOT --test server)
-cargo test -p omnigraph-server --test stored_queries     # projection tests
-cargo test -p omnigraph-server --test openapi            # no /mcp leak (it carries no #[utoipa::path])
-OMNIGRAPH_UPDATE_OPENAPI=1 cargo test -p omnigraph-server --test openapi   # if the REST surface intentionally changed
+cargo tree -p omnigraph-server -e normal | grep rmcp     # rmcp only transitively under omnigraph-mcp
+cargo test -p omnigraph-server --test mcp
+cargo test -p omnigraph-server --test stored_queries
+cargo test -p omnigraph-server --test openapi            # /mcp carries no #[utoipa::path]; no REST drift
 ```
 
----
-
-## 12. Decisions — locked & open
-
-**Locked** (validated against code): rmcp 1.7; stateless POST-only Streamable
-HTTP; `McpBackend` crate-trait seam with `&Parts` passthrough; `Builtin` enum +
-stored-query populator; coarse `InvokeQuery` only; ad-hoc `query`/`mutate`
-always-on Cedar-only; `query`/`mutate` ids (no `read`/`change`); **`load` id (not
-`ingest`)**; per-graph `/mcp` routing; no server-scoped MCP / no `graphs_list`;
-text-JSON content for v1; BigInt as JSON string; `vector_dim: Option<u32>` handled
-with omit-on-absent.
-
-**Open (deferred follow-ups):**
-- **PR 0b — per-query `invoke_query` scope.** Add a query-name dimension to
-  `PolicyRequest` + the Cedar schema (rfc-001's intended design). `InvokeQuery` is
-  documented coarse today (`policy/src/lib.rs:59-73`) and `branch_scope` on it is
-  rejected by `validate()`. Until PR 0b, stored-query curation is graph-level
-  (registry membership + `expose`), not per-actor sub-selection — the headline caveat.
-- **MR-956 — OAuth/RFC-9728** layer (additive, zero MCP changes).
-- **stdio → proxy collapse** (see below).
-- **`structuredContent` / `outputSchema`** beyond text-JSON.
-
----
-
-## Relationship to RFC-001
-
-[rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) (MR-656 / MR-976
-/ MR-969) is the parent design for stored queries + the response envelope + MCP.
-This RFC is the **detailed MCP-transport design** that #128 left for a follow-up,
-and it **revises rfc-001 in three places where the shipped code or the MCP wire
-protocol diverged from rfc-001's sketch**:
-
-1. **Transport shape.** rfc-001 sketched `GET /mcp/tools` + `POST /mcp/invoke` (a
-   bespoke REST pair). **That is not the MCP wire protocol — real MCP clients
-   cannot connect to it.** This RFC implements actual MCP JSON-RPC over Streamable
-   HTTP and reuses `query_catalog_entry` as a *projection source*, not a parallel
-   surface.
-2. **Exposure config.** rfc-001 specified inline `.gq` pragmas (`@mcp(expose=…)`,
-   default `expose=false`). **#128 shipped a different mechanism:** the
-   `expose`/`tool_name` metadata in YAML, **default `true`** (declaring a query *is*
-   the opt-in). This RFC builds on the shipped form; the `.gq`-pragma design is
-   superseded. (The declaration *file* has since moved to `cluster.yaml` discovery
-   for multi-graph servers — see [§7.3](#73-where-stored-queries-are-declared-delta-d6).)
-3. **Schema introspection.** rfc-001 lists "Schema introspection through MCP" as a
-   **non-goal**. This RFC **revises that**: the operational-parity tools include
-   `schema_get` and `omnigraph://schema` — *because the shipped stdio package
-   already exposes both*. The non-goal is achieved by *policy*, not omission: both
-   are Cedar-gated by `Read`, and the recommended locked-down agent policy denies
-   `Read`, so a curated agent still never sees the schema.
-
-Everything else in rfc-001 (two-paths-one-engine, per-query `invoke_query` *as the
-intended scope*, the response envelope, multi-graph per-graph endpoints) this RFC
-consumes unchanged.
-
-> **Numbering note:** the `TokenVerifier`/WorkOS auth design is referred to in code
-> (`crates/omnigraph-server/src/identity.rs`) as "RFC 0001," a *different* document
-> from this repo's `docs/dev/rfc-001-queries-envelope-mcp.md`. To avoid the
-> collision this RFC cites the auth substrate as **MR-956** throughout.
-
----
-
-## Motivation
-
-- **One curated, safe, remotely-reachable tool surface.** MR-969's thesis: hand an
-  LLM a token Cedar-scoped to a set of tools and it sees exactly those typed tools —
-  cannot construct ad-hoc queries it isn't permitted, cannot read the schema it
-  isn't permitted, cannot reach other graphs. Today the only MCP is the stdio
-  package: local-only, full surface, ungated.
-- **Parity, so the in-server MCP can be the single implementation.** Operators/agents
-  already depend on the operational tools. Serving them server-side behind one Cedar
-  gate lets the stdio package degrade to a proxy and removes two diverging tool sets.
-- **On-prem and cloud from one endpoint.** A managed cloud (WorkOS OAuth) and an
-  on-prem/air-gapped deploy (static or customer-OIDC tokens) must serve the same MCP
-  without forks or MCP-specific auth.
-- **Foundation for the agent on-ramp (MR-974).** `omnigraph mcp install --agent
-  <tool>` needs a decided transport + a stable endpoint.
-
-## Goals
-
-- Project built-in tools + stored queries as MCP tools through **one** registry
-  abstraction.
-- `tools/list` and the callable set are **identical for argument-independent
-  authorization**, both driven by Cedar (see §6 for the branch-scoped caveat).
-- The MCP layer is **auth-method-agnostic**: it consumes `ResolvedActor`, never a
-  raw token, never branches on how auth happened.
-- The same endpoint works on-prem (static/OIDC) and cloud (WorkOS OAuth), switched
-  by config; cloud OAuth is additive (RFC 9728).
-- No new business logic: MCP tools delegate to the same `run_query`/`run_mutate`/
-  branch/schema functions the HTTP routes call.
-- Behaviour-neutral when unused: no MCP traffic = no change.
-
-## Non-Goals
-
-- **Building/hosting an OAuth authorization server.** The server is a Resource
-  Server; WorkOS AuthKit+Connect is the AS (MR-956). The MCP endpoint validates
-  tokens, never issues them, never holds client secrets.
-- **OAuth/WorkOS implementation itself** — MR-956's work. This RFC leaves a clean
-  RFC-9728 hook and consumes `ResolvedActor`.
-- **MCP prompts, elicitation, `tools/list_changed`, resource subscriptions,
-  server-initiated messages.** None needed → enables the stateless POST-only
-  transport (§3).
-- **stdio transport inside the server.** stdio stays in the TS package (now a proxy).
-- **Cross-graph tool listing.** Per-graph catalogs only.
-- **Hot reload of the query registry.** Restart-only (MR-969).
-
-## Background
-
-`omnigraph-server` (Axum) already implements every operation this RFC exposes as an
-authenticated HTTP route; each authorizes via a `PolicyAction` against the Cedar
-policy for a server-resolved actor and calls into the engine. The existing stdio MCP
-package is a *client* of these routes (it owns no business logic). MR-956 will
-introduce a `TokenVerifier` trait (`StaticHashTokenVerifier` today inline,
-`OidcJwtVerifier` for OIDC/WorkOS) producing the `ResolvedActor` that already exists
-in `identity.rs` (§4) and is consumed by Cedar — token *validation* is offline
-(cached JWKS), so on-prem/air-gapped has no request-path dependency on the cloud.
-
-## Relationship to the `@modernrelay/omnigraph-mcp` stdio package
-
-The package (`omnigraph-ts`, `@modelcontextprotocol/sdk`, **stdio only**) is a thin
-client over the SDK → HTTP routes and **forwards the caller's bearer verbatim** (no
-inspection). It exposes the operational tools and the `omnigraph://schema` /
-`omnigraph://branches` resources (plus a vendored best-practices cookbook). *Its
-exact tool/resource counts and version live in the separate `omnigraph-ts` repo and
-are not re-verified here.*
-
-Once parity lands, **collapse to one implementation**: the in-server MCP is canonical
-(Cedar-gated, remote-capable, the path that becomes a Claude-web connector via
-MR-956). The stdio package degrades to a **thin stdio↔HTTP proxy** forwarding
-JSON-RPC (and the incoming `Authorization`) to `/mcp` — staying the local on-ramp
-for Claude Code/Desktop while sharing one tool set and one Cedar gate. This is the
-same "one contract, two implementations only where semantics genuinely differ"
-posture as [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md).
+## 17. Decisions & rollout
+
+**Locked:** rmcp 1.7 (official SDK); MCP target `2025-11-25`; stateless JSON over a
+single `/mcp` POST (`NeverSessionManager`, `stateful_mode=false`, `json_response=true`);
+`McpBackend` crate-trait seam with `&Parts` passthrough; `Builtin` enum + stored-query
+populator; domain-qualified `snake_case` tool ids; annotations set explicitly; coarse
+`InvokeQuery` with the double-gate; per-graph `/mcp` routing, no server-scoped MCP;
+`structuredContent` + `outputSchema` (with a text mirror) for typed results;
+`vector_dim: Option<u32>` handled with omit-on-absent; auth consumed as a resolved
+actor, validated per-request, never passed through.
+
+**Open / deferred:**
+- **OAuth 2.1 + RFC 9728 (MR-956)** — additive Phase 2; PRM advertisement config-gated
+  (issue #59467).
+- **Per-query `invoke_query` scope (PR 0b)** — add a query-name dimension to
+  `PolicyRequest` + the Cedar schema so an actor can be scoped to *specific* stored
+  queries. Until then curation is graph-level (registry membership + `expose`).
+- **`tools/list_changed`** — only if the registry gains runtime reload.
+- **stdio → proxy collapse** — the local stdio package degrades to a stdio↔HTTP proxy
+  over `/mcp` once this surface is GA, leaving one tool set and one Cedar gate, the
+  same one-contract posture as [rfc-009-unify-access-paths.md](rfc-009-unify-access-paths.md).
+
+**Rollout:** (1) the `omnigraph-mcp` crate + transport + a trivial backend (crate
+stands alone); (2) the server backend — extension passthrough, `Builtin` enum,
+read-only tools + resources, Cedar-filtered listing, the `classify` mapper; (3)
+mutating tools + stored-query projection (both modes) + structured output; (4) docs +
+the `omnigraph mcp install` on-ramp (MR-974); (5) OAuth/RFC-9728 (MR-956) and the
+stdio proxy as separate follow-ups.

From de9e28ecf6ba5d3bf0be36d5a331675af90e922b Mon Sep 17 00:00:00 2001
From: Ragnor Comerford <hello@ragnor.co>
Date: Sat, 13 Jun 2026 17:29:32 +0200
Subject: [PATCH 5/5] docs(rfc-003): document the per-graph multi-graph MCP
 model
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add §15.1 making the multi-graph model explicit (it was implied across
§9/§14/§15): one isolated MCP server per graph at /graphs/{id}/mcp with the
graph id in the URL path; cross-graph discovery is REST-only via GET /graphs
(server-scoped GraphList, default-denied without a server policy); stored-query
registries, tool-name uniqueness, projection mode, and InvokeQuery are all
per-graph; clients configure one connection per graph and the connection name
namespaces the otherwise-identical tool ids.
---
 docs/dev/rfc-003-mcp-server-surface.md | 45 ++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/docs/dev/rfc-003-mcp-server-surface.md b/docs/dev/rfc-003-mcp-server-surface.md
index 77eefdd5..877b65b7 100644
--- a/docs/dev/rfc-003-mcp-server-surface.md
+++ b/docs/dev/rfc-003-mcp-server-surface.md
@@ -666,6 +666,51 @@ per-graph stored-query tools and would break isolation — hence per-graph routi
 future server-level flat `/mcp` (bearer-only, no handle, server-scoped tools only)
 would live in the `management` group, but is not built speculatively.
 
+### 15.1 Multi-graph model
+
+omnigraph's MCP is **per-graph**: one isolated MCP server per graph, with the graph
+identity in the **URL path**, never in tool arguments or output. In multi-graph mode
+the router nests the whole protected group under `/graphs/{graph_id}` (`lib.rs:978`),
+so each `/graphs/{id}/mcp` endpoint's `initialize` / `tools/list` / `tools/call` /
+`resources/*` operate only on that graph and can never list or touch another graph's
+tools.
+
+- **Discovery is REST-only, not an MCP tool.** `graphs_list` / `omnigraph://graphs`
+  are deliberately absent from MCP. Which graphs exist is answered by `GET /graphs`
+  (multi-mode only) → `GraphListResponse { graphs: [{ graph_id, uri }] }`
+  (`api.rs:703`), gated by the server-scoped `GraphList` Cedar action and
+  **default-denied without a server policy** (the registry — graph ids + storage URIs
+  — is never leaked until an operator authorizes it). An operator discovers graphs via
+  REST, then points each MCP client connection at the relevant `/graphs/{id}/mcp`; no
+  single MCP connection ever sees the full graph list.
+
+- **Clients configure one connection per graph.** Tool ids are identical across graphs
+  (each is its own server), so the **connection name is the namespace**: a client that
+  auto-prefixes yields `og-sales_graph_query` vs `og-hr_graph_query`.
+
+  ```bash
+  claude mcp add og-sales --transport http https://host/graphs/sales/mcp --header "Authorization: Bearer …"
+  claude mcp add og-hr    --transport http https://host/graphs/hr/mcp    --header "Authorization: Bearer …"
+  ```
+
+- **Stored queries are per-graph state.** Each graph owns its registry
+  (`GraphHandle.queries`, `registry.rs:55`), loaded from that graph's declaration
+  (`cluster.yaml graphs.<id>.queries`). So a query is exposed only on its own graph's
+  endpoint; the same query *name* may exist on multiple graphs with different
+  definitions (no cross-graph collision — different servers). `effective_tool_name()`
+  uniqueness is enforced **per graph** at registry load (`duplicate_tool_name`), not
+  across graphs. The projection mode (`per_query` vs `meta`, §9.2) is chosen from
+  *that graph's* exposed-query count, so a small graph can show one typed tool per
+  query while a large graph on the same server uses the `stored_query_list` +
+  `stored_query_run` meta pair. `InvokeQuery` is evaluated against *that graph's*
+  `handle.policy`, so an actor can be allowed stored queries on one graph and denied
+  on another, independently. The per-graph catalog is also discoverable over REST at
+  `GET /graphs/{id}/queries`.
+
+So `tools/list` on `/graphs/sales/mcp` returns sales' built-ins + sales' stored
+queries; the same call on `/graphs/hr/mcp` returns hr's — two disjoint catalogs, each
+Cedar-filtered to the actor.
+
 ## 16. Tests & verification
 
 MCP tests land in a new `crates/omnigraph-server/tests/mcp.rs` suite (black-box over