DRAFT: feat: add archi MCP server#519

Draft
hassan11196 wants to merge 29 commits into archi-physics:main from hassan11196:claude/archi-mcp-server-Rjy9q

@hassan11196 hassan11196 commented Mar 15, 2026

Built-in MCP Server for Archi

Adds a Model Context Protocol (MCP) server directly into the archi chat service. Any MCP-compatible AI assistant — Claude Desktop, VS Code, Cursor, or Claude Code — can query your archi knowledge base without installing anything extra.


Archi config setup

Add the following to your archi config YAML (the base-config.yaml template defaults enabled to false, so set it explicitly):

services:
  mcp_server:
    enabled: true
    # Defaults to the chat app's hostname + external port
    url: "http://localhost:7869"
    timeout: 120  # seconds

The MCP endpoint will be available at <url>/mcp/sse once the chat service starts.


Client setup

Visit <archi-url>/mcp/auth after logging in — it shows ready-to-paste config snippets for each client and handles login automatically on first use.

VS Code (.vscode/mcp.json in your project root):

{
  "servers": {
    "archi": {
      "type": "sse",
      "url": "http://localhost:7869/mcp/sse"
    }
  }
}

Cursor (~/.cursor/mcp.json):

{
  "mcpServers": {
    "archi": {
      "url": "http://localhost:7869/mcp/sse"
    }
  }
}

Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "archi": {
      "command": "npx",
      "args": ["mcp-remote", "http://localhost:7869/mcp/sse"]
    }
  }
}

Claude Code — one command:

claude mcp add --transport sse archi http://localhost:7869/mcp/sse

VS Code, Cursor, and Claude Desktop will open a browser to log in on first use. No token copy-paste needed.


Available tools

Tool                          Description
archi_query                   Ask a question via archi's active RAG pipeline
archi_list_documents          Browse the indexed knowledge base
archi_get_document_content    Read the full text of an indexed document
archi_get_deployment_info     Show active pipeline, model, and retrieval config
archi_list_agents             List available agent specs
archi_health                  Verify the deployment is reachable

@hassan11196 hassan11196 changed the title feat: add archi MCP server DRAFT: feat: add archi MCP server Mar 15, 2026
@hassan11196 hassan11196 marked this pull request as draft March 15, 2026 22:19
@hassan11196 hassan11196 force-pushed the claude/archi-mcp-server-Rjy9q branch 7 times, most recently from d0c02ca to 82dd004 Compare March 16, 2026 02:21
Introduces archi_mcp/, a standalone Model Context Protocol server that
exposes archi's RAG capabilities as MCP tools for VS Code, Cursor, and
other MCP-compatible AI assistants.

Tools exposed:
  - archi_query             ask a question via the active RAG pipeline
  - archi_list_documents    browse the indexed knowledge base
  - archi_get_document_content  read a specific indexed document
  - archi_get_deployment_info   show active pipeline/model/retrieval config
  - archi_list_agents       list available agent specs
  - archi_health            verify the deployment is reachable

The server talks to MCP clients over the stdio transport and connects to a
running archi chat service over HTTP; no archi internals are imported.
Configuration is via the ARCHI_URL, ARCHI_API_KEY, and ARCHI_TIMEOUT
environment variables.

pyproject.toml:
  - adds [project.optional-dependencies] mcp = ["mcp>=1.0.0", ...]
  - registers archi-mcp CLI entry point
  - includes archi_mcp package in setuptools find

archi_mcp/README.md covers VS Code (.vscode/mcp.json), Cursor
(~/.cursor/mcp.json), and generic stdio client setup.
- Add services.mcp_server block to base-config.yaml template
  (url, api_key, timeout; url defaults to chat_app hostname+port)
- Add --config flag to archi-mcp CLI to read settings from a
  rendered archi config file; env vars still take precedence
- Rewrite archi_mcp/README.md with server setup, archi config
  snippet, and VS Code / Cursor client setup instructions
Adds /mcp/sse and /mcp/messages routes directly to the Flask chat app
so MCP clients can connect with just a URL — no local archi-mcp command
or pip install required.

  VS Code (.vscode/mcp.json):
    { "servers": { "archi": { "type": "sse", "url": "http://localhost:7861/mcp/sse" } } }

  Cursor (~/.cursor/mcp.json):
    { "mcpServers": { "archi": { "url": "http://localhost:7861/mcp/sse" } } }

The SSE transport (JSON-RPC 2.0 over Server-Sent Events) is implemented
natively in Flask using thread-safe queues — no Starlette or extra
dependencies needed. Tool handlers call archi internals directly inside
the same process (chat wrapper, data viewer, agent spec loader).

Files changed:
- src/interfaces/chat_app/mcp_sse.py  (new) — SSE transport + 6 tools
- src/interfaces/chat_app/app.py      — register_mcp_sse() call in FlaskAppWrapper
- archi_mcp/README.md                 — document HTTP+SSE as the recommended option
The built-in /mcp/sse endpoint on the chat service makes the separate
archi-mcp CLI redundant. Clients now connect with just a URL.

- Delete archi_mcp/ package (server, client, README, entry point)
- Remove [project.optional-dependencies].mcp from pyproject.toml
- Remove archi-mcp entry point script from pyproject.toml
- Remove archi_mcp* from package discovery
Users now visit /mcp/auth (browser) after SSO login to get a long-lived
bearer token.  The token is stored in the new mcp_tokens PostgreSQL table.

VS Code / Cursor MCP configs must include the token as an Authorization
header.  The /mcp/sse and /mcp/messages endpoints enforce token validation
when auth is enabled; unauthenticated clients receive a 401 JSON response
with a login_url pointing to /mcp/auth.
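
As a rough sketch of the token check described above, the function below validates an Authorization header against an in-memory stand-in for the mcp_tokens table. The function name, the dict layout, and the exact 401 body (beyond the login_url field) are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def check_bearer(auth_header: Optional[str], tokens: dict, now: datetime, login_url: str):
    """Return (status, body). `tokens` maps token -> {"user_id", "expires_at"}."""
    unauthorized = (401, {"error": "unauthorized", "login_url": login_url})
    if not auth_header or not auth_header.startswith("Bearer "):
        return unauthorized
    row = tokens.get(auth_header[len("Bearer "):].strip())
    if row is None:
        return unauthorized
    expires_at = row.get("expires_at")
    if expires_at is not None and expires_at <= now:
        return unauthorized  # expired tokens are treated like missing ones
    row["last_used_at"] = now  # mirrors the last_used_at column
    return 200, {"user_id": row["user_id"]}
```

A caller would turn the 401 tuple into a JSON response so that the client can send the user to /mcp/auth.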

Changes:
- init.sql: add mcp_tokens table (token, user_id, last_used_at, expires_at)
- mcp_sse.py: bearer-token validation (_validate_mcp_token), auth guard on
  both SSE and messages endpoints, updated session registry to carry user_id
- app.py: register /mcp/auth (GET) and /mcp/auth/regenerate (POST) routes,
  token DB helpers (_get_mcp_token, _create_mcp_token, _rotate_mcp_token),
  sso_callback now honours session['sso_next'] for post-login redirects
- templates/mcp_auth.html: token display page with VS Code / Cursor snippets
  and token rotation UI
Implements the standard OAuth2 authorization code flow with PKCE so that
MCP clients (Claude Desktop, VS Code, etc.) can authenticate automatically
without manual token copy-paste.

New endpoints:
  GET  /.well-known/oauth-authorization-server  – RFC 8414 discovery
  GET  /authorize                               – PKCE authorization (redirects to SSO if needed)
  POST /token                                   – code → bearer token exchange

New DB table: mcp_auth_codes (short-lived, single-use PKCE codes).
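
For readers unfamiliar with PKCE, the check that /token performs can be sketched as follows, assuming the S256 challenge method from RFC 7636. The helper names are hypothetical and do not come from app.py.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    """Client side: a random URL-safe verifier (43 characters for 32 bytes)."""
    return secrets.token_urlsafe(32)


def s256_challenge(verifier: str) -> str:
    """BASE64URL(SHA256(verifier)) without padding, per RFC 7636 section 4.2."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(verifier: str, stored_challenge: str) -> bool:
    """Server side: compare the recomputed challenge with the one saved at /authorize."""
    return hmac.compare_digest(s256_challenge(verifier), stored_challenge)
```

The client sends the challenge to the authorization endpoint and the verifier to the token endpoint; the code is exchanged for a bearer token only if the two match.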
- Security: use atomic UPDATE...RETURNING to prevent auth-code replay attacks
- Security: validate redirect_uri in /token matches the one from /authorize
- Efficiency: inline token fetch/create inside existing DB connection (1 conn instead of 2-3)
- Efficiency: opportunistically delete expired mcp_auth_codes rows on each token exchange
- Bug fix: use urlparse/urlunparse for redirect_uri assembly (handles trailing ? edge case)
- Cleanup: move secrets/hashlib/base64/urlencode to module-level imports
- Cleanup: remove dead variable challenge_method
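
The redirect_uri bug fix can be illustrated with a small helper built on urllib.parse. This is a sketch of the general technique rather than the code in app.py; build_redirect is a hypothetical name.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def build_redirect(redirect_uri: str, code: str, state: Optional[str] = None) -> str:
    """Append code (and state) to redirect_uri, preserving any existing query."""
    parts = urlparse(redirect_uri)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("code", code))
    if state is not None:
        query.append(("state", state))
    # Rebuilding from parsed parts avoids "??" or "?&" when the URI
    # already ends in "?" or already carries parameters.
    return urlunparse(parts._replace(query=urlencode(query)))
```

Naive string concatenation such as redirect_uri + "?code=" + code would produce a malformed URL for a redirect_uri that already ends in "?".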
…in page

When an MCP client hits /authorize and the user isn't logged in, directly
invoke self.oauth.sso.authorize_redirect() — the same call the login handler
uses — instead of redirecting to /login?method=sso as an intermediate step.
The existing sso_next session key still brings the user back to /authorize
after the SSO callback completes, so the rest of the PKCE flow is unchanged.
ChatWrapper.__call__ expects message as [["User", content]] (matching the
JS client's history.slice(-1) format). _tool_query was passing a bare
string, causing `sender, content = tuple(message[0])` to fail with
"not enough values to unpack" since message[0] was a single character.
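
The failure mode is easy to reproduce in isolation. The two helpers below are hypothetical; only the unpacking line is taken from the commit message.

```python
def unpack_last_message(message):
    """Mirrors the unpacking line quoted in the commit message."""
    sender, content = tuple(message[0])
    return sender, content


def as_chat_message(text: str) -> list:
    """Wrap a bare question in the [["User", content]] shape the wrapper expects."""
    return [["User", text]]


# Correct shape: message[0] is the two-element pair ["User", "What is archi?"].
print(unpack_last_message(as_chat_message("What is archi?")))

# Bare string: message[0] is the single character "W", so unpacking fails.
try:
    unpack_last_message("What is archi?")
except ValueError as exc:
    print("ValueError:", exc)
```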
- mcp_auth.html: add two new tabs alongside VS Code and Cursor
  - Claude Desktop: shows claude_desktop_config.json snippet for macOS/Windows
  - Claude Code: shows `claude mcp add` CLI command + .mcp.json project config
- mcp_sse.py: update module docstring with Claude Desktop and Claude Code examples
Auth is now handled via SSO-issued bearer tokens (mcp_tokens table) and
the OAuth2 PKCE flow. The static api_key field was never read by any code.
…on_metadata

user_id was extracted from the bearer token and stored in the session, but
was never passed through _dispatch → _call_tool → _tool_query → wrapper.chat,
causing conversation_metadata.user_id to always be NULL for MCP requests.

Thread user_id from session_entry through the full call chain so it reaches
create_conversation() and is written to the DB.
…gistration

- Relocate /authorize → /mcp/oauth/authorize and /token → /mcp/oauth/token
- Add /mcp/oauth/register (RFC 7591 dynamic client registration)
- Update /.well-known/oauth-authorization-server metadata to point to new paths
  and advertise registration_endpoint
- Add mcp_oauth_clients table to init.sql to persist registered clients
- Update mcp_auth.html: IDE config snippets no longer include hardcoded tokens;
  clients discover OAuth via well-known and handle auth automatically on first use.
  Manual bearer token moved to an Advanced collapsible section for legacy clients.
@hassan11196 hassan11196 force-pushed the claude/archi-mcp-server-Rjy9q branch 2 times, most recently from 88c63ce to 1f349f8 Compare March 16, 2026 03:01
- Default services.mcp_server.enabled to false in base-config.yaml
- Read the flag in ChatApp and only register /mcp/* routes when enabled
- Move /mcp/auth and OAuth endpoints inside the mcp_enabled guard

https://claude.ai/code/session_01XTALCGRDaVpNPmqFbRD8My
@hassan11196 hassan11196 force-pushed the claude/archi-mcp-server-Rjy9q branch from 1f349f8 to e290b4d Compare March 16, 2026 03:07
When an MCP client includes _meta.progressToken in a tools/call request
for archi_query, the server now streams intermediate status events over
the existing SSE connection using the MCP notifications/progress protocol:

  - thinking_start / thinking_end → "Thinking…" / "Thought: <preview>"
  - tool_start → "Calling <tool>(<args>)"
  - tool_output → "Got result from <tool>"
  - chunk → "Generating answer…"

This lets MCP hosts (VS Code, Cursor, Claude Desktop, Claude Code) show
live status while archi is working instead of blocking silently.

Clients that omit progressToken continue to use the existing single
blocking wrapper.chat() call, so backwards compatibility is preserved.
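
A minimal sketch of the event-to-notification mapping follows. The status strings are the ones listed above; the payload keys (preview, tool, args), the function names, and the use of a simple counter for the progress value are assumptions.

```python
import itertools
from typing import Callable, Optional


def status_text(event_type: str, data: dict) -> Optional[str]:
    """Translate a pipeline event into the status line shown by the MCP host."""
    if event_type == "thinking_start":
        return "Thinking…"
    if event_type == "thinking_end":
        return f"Thought: {data.get('preview', '')}"
    if event_type == "tool_start":
        return f"Calling {data.get('tool')}({data.get('args', '')})"
    if event_type == "tool_output":
        return f"Got result from {data.get('tool')}"
    if event_type == "chunk":
        return "Generating answer…"
    return None


def make_notifier(progress_token, send: Callable[[dict], None]):
    """Return notify(); it is a no-op when the client sent no progressToken."""
    counter = itertools.count(1)  # progress values must increase over time

    def notify(event_type: str, data: dict) -> None:
        if progress_token is None:
            return
        text = status_text(event_type, data)
        if text is None:
            return
        send({
            "jsonrpc": "2.0",
            "method": "notifications/progress",
            "params": {
                "progressToken": progress_token,
                "progress": next(counter),
                "message": text,
            },
        })

    return notify
```

Here send would be whatever pushes a JSON-RPC message onto the session's SSE stream.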
@hassan11196 hassan11196 force-pushed the claude/archi-mcp-server-Rjy9q branch from 258a182 to 08b6e0d Compare March 16, 2026 03:25
Replace the two-path approach (stream vs invoke depending on progressToken)
with a single path that always calls wrapper.chat.stream(). notify() calls
are gated on whether a progressToken was provided, so progress events are
still only sent when the client requests them.

This ensures MCP responses use the same pipeline.stream() code path as the
web app, giving identical tool-call behaviour and model/provider selection.
@hassan11196 hassan11196 force-pushed the claude/archi-mcp-server-Rjy9q branch from 622fb94 to 8b5253f Compare March 16, 2026 04:04
chunk events carry the full text so far, not a new token. Appending each
one caused the response to repeat itself for every chunk emitted. Fix by
overwriting a single string on each chunk and preferring
final.response.answer (clean PipelineOutput) as the canonical answer.
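
The difference between appending and overwriting can be shown with a toy event stream. The event dictionaries here are an assumed shape, not the pipeline's real objects.

```python
def collect_answer(events) -> str:
    """Keep only the latest cumulative chunk; prefer the final answer if present."""
    latest = ""
    final_answer = None
    for event in events:
        if event["type"] == "chunk":
            latest = event["content"]  # overwrite: each chunk holds the full text so far
        elif event["type"] == "final":
            final_answer = event.get("answer")
    return final_answer if final_answer else latest


events = [
    {"type": "chunk", "content": "The"},
    {"type": "chunk", "content": "The answer"},
    {"type": "chunk", "content": "The answer is 42."},
]
print(collect_answer(events))
# Appending instead would have produced "TheThe answerThe answer is 42."
```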
VS Code MCP extension (2025-03-26 spec) fetches this endpoint first to
discover which authorization server protects the resource. Without it the
client logs 'Failed to fetch resource metadata from all attempted URLs'
and cannot complete the OAuth PKCE flow, resulting in a 401 on /mcp/sse.

The new endpoint returns the resource URI and points to the same-origin
authorization server, matching what .well-known/oauth-authorization-server
already advertises.
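
The two metadata documents might look roughly like the dictionaries below, assuming the endpoint follows RFC 9728 (protected resource metadata) and RFC 8414 (authorization server metadata). The function names and the choice of resource URI are assumptions; the /mcp/oauth/* paths are those introduced earlier in this PR.

```python
def protected_resource_metadata(base_url: str) -> dict:
    """Point clients at the same-origin authorization server."""
    base = base_url.rstrip("/")
    return {
        "resource": base,  # assumption: the deployment origin is the resource URI
        "authorization_servers": [base],
    }


def authorization_server_metadata(base_url: str) -> dict:
    """RFC 8414-style discovery document for the same origin."""
    base = base_url.rstrip("/")
    return {
        "issuer": base,
        "authorization_endpoint": f"{base}/mcp/oauth/authorize",
        "token_endpoint": f"{base}/mcp/oauth/token",
        "registration_endpoint": f"{base}/mcp/oauth/register",
        "response_types_supported": ["code"],
        "code_challenge_methods_supported": ["S256"],
    }
```

The value listed in authorization_servers has to match the issuer in the second document for the discovery chain to resolve.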
@hassan11196 hassan11196 force-pushed the claude/archi-mcp-server-Rjy9q branch from cf5edce to 4b5aabb Compare March 16, 2026 12:15
The MCP spec requires an absolute URL in the 'endpoint' SSE event.
Sending a relative path (/mcp/messages?...) caused VS Code's MCP client
to fail resolving it, so the POST for 'initialize' never reached the
server — resulting in the 'Waiting for initialize' loop.

Also captures request.host_url before the generator (generators run
outside request context) and adds INFO/WARNING logs so session mismatches
and incoming method calls are visible in server logs.
…event

Two bugs caused 'Waiting for initialize' on authenticated deployments but not localhost:

1. /mcp/messages re-checked the Bearer token (auth_enabled=True).
   VS Code sends the token for the initial SSE connection but not for
   subsequent POST messages to the dynamically-discovered endpoint URL.
   The session_id is already sufficient proof of identity — it was only
   issued to the client that passed auth on /mcp/sse.

2. Behind a reverse proxy, request.host_url returns the internal address
   (e.g. http://127.0.0.1:PORT/) so the absolute endpoint URL pointed
   somewhere unreachable. Now uses X-Forwarded-Proto / X-Forwarded-Host
   headers when present, falling back to request.scheme / request.host.
…oint event

Behind a reverse proxy, X-Forwarded-Proto is often not set, so the
endpoint SSE event was advertising http:// instead of https://, causing
VS Code to POST to the wrong URL.

services.mcp_server.url is already in the config template for exactly
this purpose ('Public URL of the chat service that MCP clients will
connect to'). Pass it through register_mcp_sse(public_url=...) and use
it as the base for the /mcp/messages?session_id=... endpoint URL.

Priority order for URL resolution:
  1. public_url from config  (explicit, most reliable)
  2. X-Forwarded-Proto / X-Forwarded-Host headers  (proxy sets these)
  3. request.scheme / request.host  (direct / localhost fallback)
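
The priority order above can be expressed as a small pure function. This is a sketch with hypothetical names; in Flask the header lookup would be case-insensitive, whereas the plain mapping used here is not.

```python
from typing import Mapping, Optional


def resolve_public_base_url(
    public_url: Optional[str],
    headers: Mapping[str, str],
    scheme: str,
    host: str,
) -> str:
    """Pick the base URL to advertise in the SSE endpoint event."""
    if public_url:  # 1. explicit services.mcp_server.url from config
        return public_url.rstrip("/")
    proto = headers.get("X-Forwarded-Proto") or scheme  # 2. proxy headers, if set
    fwd_host = headers.get("X-Forwarded-Host") or host  # 3. otherwise the request itself
    return f"{proto}://{fwd_host}"


def messages_endpoint(base_url: str, session_id: str) -> str:
    """Absolute URL the client will POST JSON-RPC messages to."""
    return f"{base_url}/mcp/messages?session_id={session_id}"
```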
…e comments

- /mcp/auth, OAuth metadata, and OAuth protected-resource endpoints now
  use _mcp_public_base_url() which reads services.mcp_server.url from
  config, falling back to X-Forwarded-* headers then request.host.
  This fixes https → http downgrade behind a reverse proxy.

- Removed redundant/verbose comments throughout mcp_sse.py; trimmed
  docstrings to essentials. No behavioral changes beyond the URL fix.

https://claude.ai/code/session_01XTALCGRDaVpNPmqFbRD8My
@hassan11196 hassan11196 force-pushed the claude/archi-mcp-server-Rjy9q branch from c9a6607 to 5b55ab5 Compare March 16, 2026 14:21
…render

get_active_banner_alerts() opened a new psycopg2 connection on every
Flask template render (every page load), adding measurable latency.
Cache the result for 30 seconds; invalidate immediately on create/delete
so alert managers still see changes right away.
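
One way to implement this kind of cache is a small time-to-live wrapper around the loader function. The class below is an illustrative sketch, not the code from this PR; the injectable clock exists only to make it testable.

```python
import threading
import time


class TTLCache:
    """Cache a single loader result for `ttl` seconds, with explicit invalidation."""

    def __init__(self, loader, ttl: float = 30.0, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl
        self._clock = clock
        self._lock = threading.Lock()
        self._value = None
        self._expires_at = None  # None means "nothing cached"

    def get(self):
        with self._lock:
            now = self._clock()
            if self._expires_at is None or now >= self._expires_at:
                self._value = self._loader()  # e.g. the DB query for active alerts
                self._expires_at = now + self._ttl
            return self._value

    def invalidate(self) -> None:
        """Call after creating or deleting an alert so the next get() reloads."""
        with self._lock:
            self._expires_at = None
```

Holding the lock while loading serializes concurrent refreshes, which keeps a burst of page loads from opening several connections at once.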
@hassan11196 hassan11196 force-pushed the claude/archi-mcp-server-Rjy9q branch from ba7e6d6 to 4c4b586 Compare March 17, 2026 04:45
- Add `config_name` and `client_timeout` as optional input parameters to
  the `archi_query` tool schema, matching what the UI sends
- Default `client_timeout` to 18000000ms (5 hours) instead of 120s so
  long-running queries don't time out prematurely
- Convert `client_timeout` from milliseconds (UI convention) to seconds
  before passing to wrapper.chat(), consistent with how app.py handles it
- Pass `config_name` (e.g. 'comp_ops') through to the chat call instead
  of always using None (active config)
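
The unit handling amounts to the following, shown as a sketch. The argument names client_timeout and config_name come from the commit message; the function name and return shape are assumptions.

```python
DEFAULT_CLIENT_TIMEOUT_MS = 18_000_000  # 5 hours, expressed in milliseconds


def chat_kwargs(arguments: dict) -> dict:
    """Derive the timeout (seconds) and config name from archi_query arguments."""
    timeout_ms = arguments.get("client_timeout", DEFAULT_CLIENT_TIMEOUT_MS)
    return {
        "timeout": float(timeout_ms) / 1000.0,  # UI sends milliseconds
        "config_name": arguments.get("config_name"),  # None selects the active config
    }
```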

https://claude.ai/code/session_01XTALCGRDaVpNPmqFbRD8My
@hassan11196 hassan11196 force-pushed the claude/archi-mcp-server-Rjy9q branch from 4c4b586 to 3a79e0d Compare March 17, 2026 04:51
…tests for MCP SSE tools

- Updated `_parse_metadata_query` to handle malformed queries gracefully by using fallback tokenization.
- Removed unnecessary list conversion in `iter_files` calls for performance optimization.
- Introduced comprehensive unit tests for MCP SSE tools, covering various functionalities including document listing, metadata searching, and agent specifications.