Skip to content

feat(mcp_registry): #201 PR 1 of 5. Protocol scaffolding + FilesystemMCPServerRegistryBackend + spec/36 DRAFT + CLI#330

Merged
dep0we merged 5 commits into
mainfrom
feat/mcp-server-registry-backend-pr1
Jun 3, 2026
Merged

feat(mcp_registry): #201 PR 1 of 5. Protocol scaffolding + FilesystemMCPServerRegistryBackend + spec/36 DRAFT + CLI#330
dep0we merged 5 commits into
mainfrom
feat/mcp-server-registry-backend-pr1

Conversation

@dep0we
Copy link
Copy Markdown
Owner

@dep0we dep0we commented Jun 3, 2026

Summary

PR 1 of 5 of #201 (MCPServerRegistryBackend Protocol — the v1.0 closer arc). Lands the Protocol scaffolding + filesystem reference impl + DRAFT spec/36 + read-only CLI subcommands. After PR 4 of 5 ships, atomic-agents-stack hits v1.0 with twelve of twelve backend protocols.

Commits (bisectable):

  • ead8ea3 refactor(mcp): additive resolve_env parameter on parse_mcp_md_text + _build_spec; _resolve_env_vars helper extracted; inline comments on the 2 hidden callers in profile/{types,filesystem}.py documenting the resolve_env=True invariant
  • a2695c0 feat(mcp_registry): Protocol scaffolding (__init__.py, types.py, backend.py, filesystem.py) + 74 tests across 2 conformance + filesystem files
  • 2eec37d feat(cli): atomic-agents mcp-registry subcommand group with 4 read-only subcommands (list, show, validate, refresh-capabilities) + 16 CLI tests including the secret-leak regression test
  • 6888235 docs(mcp_registry): DRAFT spec/36 at 696 lines, 10 normative MUSTs + CHANGELOG [Unreleased] entry
  • 0c57a41 docs: post-ship sync (README test counts, spec list +spec/36, CLAUDE.md test count, spec/19 cross-link)

Test Coverage

COVERAGE: ~85%+ post-fix-all-Sonnet | Tests: 3090 → 3101 passing (+11), 52 skipped, 0 regressions

Protocol surface coverage:
  ├── list_mcp_servers           ★★★ TESTED (empty/missing → []; lex order; multi-section)
  ├── load_mcp_server            ★★★ TESTED (name charset; absent → MCPServerNotInRegistry;
  │                                          malformed → MCPRegistryDescriptorInvalid;
  │                                          PermissionError → MCPRegistryUnavailable;
  │                                          $VAR resolution at load time)
  ├── load_all_mcp_servers       ★★★ TESTED (default_load_all delegation; ordering parity;
  │                                          consistency with list+load per MUST 10)
  ├── validate                   ★★★ TESTED (all 7 branches via gap-fix Sonnet)
  ├── install / uninstall        ⏭  DEFERRED (PR 3; capability flag supports_install=False
  │                                          per capability honesty MUST 3)
  ├── capabilities (@property)   ★★★ TESTED (5 flags; static-vs-dynamic distinction)
  ├── refresh_capabilities       ★★★ TESTED (no-op on filesystem; HTTP re-probes in PR 4)
  └── close                      ★★★ TESTED (idempotent)

CLI surface coverage:
  ├── mcp-registry list          ★★★ TESTED
  ├── mcp-registry show          ★★★ TESTED (P0 regression: no secret leak verified)
  ├── mcp-registry validate      ★★★ TESTED
  ├── mcp-registry refresh-...   ★★★ TESTED
  ├── _resolve_agent_root (3 br) ★★★ TESTED (--agent-root flag > env var > cwd)
  └── exception handlers         ★★★ TESTED (MCPServerConnectFailed, all registered backend ids,
                                              MCPRegistryDescriptorInvalid, malformed mcp.md)

100 new tests in mcp_registry suite (42 conformance + 32 filesystem-specific + 2 skipped for PR 3 + 16 CLI + 8 regression tests from the cross-model review pass).

Pre-Landing Review

5 specialists + Claude adversarial subagent + Codex adversarial dispatched in parallel during /ship. 30 findings surfaced; all P0 + 5 P1 + 8 critical P2s addressed inline before push.

Cross-model-confirmed P0 (Codex + Security specialist + Claude adversarial all independently flagged): mcp-registry show printed resolved $VAR env values to stdout, leaking real secrets to terminal scrollback / CI logs / shell capture. Fix: re-parse mcp.md with resolve_env=False to print the unresolved $VAR refs from the file, NOT the resolved values returned by load_mcp_server. Regression test: test_cli_show_does_not_leak_resolved_env_values asserts the resolved secret does NOT appear in stdout while the $VAR ref DOES.

5 P1s addressed:

  1. MUST 7 violation: load_mcp_server wrapped all OSError as MCPServerNotInRegistry. Now distinguishes FileNotFoundError/ENOENT (permanent → MCPServerNotInRegistry) from other OSError including PermissionError (transient → MCPRegistryUnavailable). 2 new regression tests.
  2. Malformed-section silent drop: when mcp.md has ## foo but no command:, _build_spec silently returns None and load_mcp_server raised confusing MCPServerNotInRegistry. Now scans H2 headers via regex post-parse; if section header exists but spec is missing, raises MCPRegistryDescriptorInvalid per MUST 7 + 8.
  3. CLI read_paths=[] skip documented: _cmd_mcp_registry passes read_paths=[]. Inline comment now documents the read-only inspection use case + reserves non-empty read_paths from tools.md for PR 3 install/uninstall.
  4. MCPServerConnectFailed escape: _cmd_mcp_registry exception chain now catches MCPServerConnectFailed (raised when $VAR is unset); prints clean Error: env var not resolved: ... + exit 1 instead of a Python traceback.
  5. backend_id missing from Protocol: class implemented it but Protocol surface didn't declare it. Added @property def backend_id(self) -> str: ... to the Protocol body; future backends now fail conformance at static analysis if absent.

8 critical P2 AUTO-FIXes: section name validation in list_mcp_servers before returning refs; _redact_for_error_message DSN heuristic catches non-scheme user:password@host connection strings; vacuous isinstance(result, object) assertion removed; zero-assertion test_resolve_agent_root_falls_back_to_cwd removed; hardcoded /tmp/dummy-agent replaced with tmp_path fixture; 6 late imports consolidated to top of file (filesystem.py + 2 test files); unused MCPServerRegistryBackend import removed; CLI refresh-capabilities help text rewritten to be honest about no-op-on-filesystem behavior.

15 remaining P3 findings filed as a single follow-up tracking issue (DRY violations across _redact_for_error_message copies, behavior coverage gap for load_all_mcp_servers partial failure, trailing-dot server name acceptance design question, refresh-capabilities CLI help text concerns, custom backend env-var-driven selection routing, several test redundancies).

Plan Completion

Status Items
DONE 22 / 22 deliverables from prep notes (the 1 NOT DONE — test_malformed_mcp_md_raises_descriptor_invalid — was closed by the fix-all pass when the malformed-section P1 fix made the code path reachable; skip marker removed; test now asserts the real raise)
DEFERRED 0
UNVERIFIABLE 0

Per the feedback_atomic_agents_best_not_cheapest rule + the project's CLAUDE.md §11 (adversarial review in rounds), all P0/P1/P2 findings addressed inline before push rather than filed as follow-ups.

Linked Spec

Closes #201 (partial delivery: PR 1 of 5; the issue stays open until PR 5 of 5 lands the LOCK + v1.0.0 release).

This PR delivers the design at ~/.gstack/projects/dep0we-atomic-agents-stack/dep0we-main-design-20260601-165020.md (the office-hours design doc; APPROVED through 3 adversarial rounds + plan-eng-review + 5-stream plan-subagent prep pass).

Documentation

README.md — 4 factual updates: test count 2937 → 3153 (3 locations); spec list gains spec/35 (init wizard RFC) and spec/36 (MCPServerRegistryBackend DRAFT); RFC/DRAFT count "31 locked docs + 2 RFCs" → "31 locked docs + 4 RFCs/DRAFTs" (3 locations); "remaining two protocols" corrected to "remaining protocol" (CorpusBackend already shipped).

CLAUDE.md — Test count refreshed from 2,937 (2026-06-01) to 3,153 (2026-06-03) in 2 locations.

docs/spec/19-mcp.md — Forward-reference to spec/36 added in the Cross-links header line (matches how spec/25 D3 references #201 + how locked specs cross-link successor arcs).

Not touched (per PR 1 of 5 gate):

Test plan

  • Full test suite passes: 3101 passed, 52 skipped, 0 failures on Python 3.11
  • No regressions in existing tests (mcp.py refactor preserves byte-identical behavior for default callers)
  • Cross-model triple-confirmed P0 secret leak fixed with regression test
  • All 5 P1 findings closed with regression tests
  • 8 critical P2 mechanical AUTO-FIXes applied
  • 0 em-dashes in any PR 1 code, test, or doc file (verified per project Voice rule)
  • /document-release sync committed + pushed before PR creation

🤖 Generated with Claude Code

Dan Powers and others added 5 commits June 3, 2026 08:53
…per extraction

parse_mcp_md_text and _build_spec gain a keyword-only resolve_env: bool = True
parameter. The default preserves byte-identical behavior for all existing
callers (parse_mcp_md, doctor.py:678, profile/types.py:336,
profile/filesystem.py:374). The _flush() closure inside parse_mcp_md_text
threads resolve_env through to _build_spec.

A new module-level _resolve_env_vars(env, server_name) helper is extracted
from _build_spec's inline resolution block. The helper raises the canonical
MCPServerConnectFailed message shape documented in spec/19 so the future
FilesystemMCPServerRegistryBackend can call it on already-built specs at
load_mcp_server time without divergent error semantics.

profile/types.py:336 and profile/filesystem.py:374 gain inline comments
documenting the resolve_env=True invariant their AgentProfile.mcp_servers
consumers depend on (the spec/24 D1 mcp_md_raw preservation discipline
requires resolved env vars at snapshot time).

The public parse_mcp_md function does NOT receive the parameter. Its
docstring documents the API asymmetry so callers needing lazy resolution
use parse_mcp_md_text(..., resolve_env=False) directly.
…yBackend reference impl

New atomic_agents/mcp_registry/ package with 4 files:

- __init__.py: register_mcp_server_registry_backend /
  get_mcp_server_registry_backend / list_mcp_server_registry_backends
  registry functions + get_default_mcp_server_registry_backend factory
  + _redact_for_error_message helper (catches both ://-scheme URLs and
  non-scheme DSN-style user:password@host connection strings).
- types.py: MCPServerRef, MCPServerRegistryCapabilities, ValidationResult
  frozen dataclasses. MCPServerRef.to_dict / from_dict roundtrip is
  byte-shape preserving.
- backend.py: @runtime_checkable MCPServerRegistryBackend Protocol with
  8 methods + @Property backend_id + @Property capabilities (matching
  the CorpusBackend spec/34 precedent at corpus/backend.py:70-71).
  6 exception classes: MCPServerNotInRegistry, MCPServerAlreadyInstalled,
  MCPRegistryUnavailable, MCPRegistryAuthRequired,
  MCPRegistryDescriptorInvalid, BackendNotRegistered.
  _default_load_all(backend) module-level helper that backends MAY
  override for performance.
- filesystem.py: FilesystemMCPServerRegistryBackend(agent_root,
  read_paths, *, lock_backend=None) reading <agent_root>/mcp.md.
  Read paths only at PR 1; install/uninstall raise NotImplementedError
  per the capability-honesty contract. supports_install=False static
  class default; flips True at PR 3 when methods land. list_mcp_servers
  calls parse_mcp_md_text(content, resolve_env=False, read_paths=None)
  per Decision 8 (validation at load_mcp_server boundary); section
  names are validated via _validate_server_name and skipped with a
  logged warning if invalid (prevents load_all_mcp_servers from raising
  uncaught ValueError on tampered mcp.md). load_mcp_server distinguishes
  FileNotFoundError (raises MCPServerNotInRegistry per spec) from other
  OSError including PermissionError (raises MCPRegistryUnavailable per
  MUST 7 transient-vs-permanent honesty), detects malformed sections
  (section header exists in mcp.md but parse returned no spec for that
  name) and raises MCPRegistryDescriptorInvalid. load_all_mcp_servers
  delegates to _default_load_all(self) per MUST 10 consistency.

Tests across two new files:

- tests/test_mcp_server_registry_conformance.py: 42 tests parametrized
  across registered backends (filesystem only at this PR; HTTP joins at
  PR 4). Covers MUSTs 1-3, 5-8, 10 from the Implementer Contract; MUSTs
  4 and 9 deferred to PR 3+ (install/uninstall + URL credential
  redaction integration tests).
- tests/test_mcp_server_registry_filesystem_backend.py: 32 tests + 2
  skipped (install/uninstall stubs reserved for PR 3) covering
  path-traversal at API boundary, env-var resolution at load time with
  resolve_env=False, mcp.md parse semantics (multi-section, malformed,
  source field), load_all_mcp_servers delegation, validate() across
  all 7 branches, PermissionError/FileNotFoundError distinction,
  malformed-section MCPRegistryDescriptorInvalid raising, section-name
  charset filtering at list time.

Test suite: 3090 to 3101 passing, 52 skipped, zero regressions.
Adds four read-only subcommands at PR 1 (install/uninstall ship at PR 3
alongside LockBackend integration):

- mcp-registry list: prints a table of mounted MCP servers (name,
  transport, description).
- mcp-registry show <name>: prints the MCPServerSpec for a single
  server. CRITICAL: this command re-parses mcp.md with resolve_env=False
  to print the UNRESOLVED $VAR refs from the file, NOT the resolved
  values returned by load_mcp_server. Printing resolved values would
  leak secrets ($GITHUB_PAT becomes a real token in terminal scrollback,
  CI logs, shell capture). Operator-debugging utility is preserved
  (operator sees env: SECRET=$GITHUB_PAT) without the leak.
- mcp-registry validate <name>: runs the static descriptor check;
  prints ValidationResult.
- mcp-registry refresh-capabilities: prints the current capability
  view. No-op on filesystem (no remote dependency); HTTP backend at
  PR 4 re-probes the catalog server.

_cmd_mcp_registry catches MCPServerNotInRegistry, MCPRegistryAuthRequired,
MCPRegistryUnavailable, MCPRegistryDescriptorInvalid, MCPServerConnectFailed
(unset $VAR resolution), ValueError (invalid name), and the (OSError,
PermissionError) tuple. Every exception path prints a clean
"Error: ..." to stderr and returns exit code 1; no Python tracebacks
escape to the operator.

_resolve_mcp_registry_agent_root has three branches: --agent-root flag
takes priority; ATOMIC_AGENTS_AGENT_ROOT env var second; Path.cwd() as
fallback (matches the corpus subcommand precedent).

CLI passes read_paths=[] to FilesystemMCPServerRegistryBackend with an
inline comment documenting the intent: CLI is a read-only inspection
surface and intentionally skips path-traversal validation. PR 3 install
will require non-empty read_paths from agent runtime context (resolved
from tools.md).

tests/test_mcp_registry_cli.py: 16 tests covering all four subcommands
+ 3 agent_root resolver branches + 3 exception handler paths. The
regression test test_cli_show_does_not_leak_resolved_env_values asserts
the actual resolved env value (the secret) does NOT appear in stdout
while the $VAR reference DOES (operator-debugging visibility preserved).
docs/spec/36-mcp-server-registry-backend.md ships as DRAFT at 696 lines
with 10 normative MUSTs in the Implementer Contract (placed before the
Shipping plan section per spec/34 ordering):

1. Name charset validation at API boundary ([a-zA-Z0-9_.+@-]+ minus
   leading-dot per the cross-Protocol path-traversal pattern).
2. Side-effect-free construction (no network, subprocess, or file open
   at __init__).
3. Capability honesty including the static-vs-dynamic distinction novel
   to this Protocol (filesystem static; HTTP dynamic per tier 1/2/3
   capability negotiation at PR 4).
4. URL credential redaction across operator-facing error paths;
   _redact_for_error_message catches both ://-scheme URLs and
   DSN-style user:password@host connection strings without scheme.
5. Cross-agent isolation at storage layer (filesystem per agent_root;
   HTTP per agent_scope query param; lock file path distinct from the
   agent's main .lock).
6. backend_id stability + close() idempotency.
7. Transient-vs-permanent failure honesty (MCPRegistryUnavailable for
   PermissionError + I/O errors; MCPServerNotInRegistry for
   FileNotFoundError + ENOENT).
8. Env-var resolution semantics on MCPServerSpec at load_mcp_server
   time (not at parse time); the helper is module-level in
   atomic_agents/mcp.py and shared with the filesystem backend.
9. Install/uninstall atomicity (filesystem uses LockBackend lease per
   spec/21; HTTP relies on catalog server transactional storage).
10. load_all_mcp_servers consistency (output semantically equivalent
    to [load_mcp_server(ref.name) for ref in list_mcp_servers()]).

The spec re-languages "catalog of available servers" to "mounted server
source" so the per-agent_scope filtering model is unambiguous. HTTP
backend tier-1/2/3 capability negotiation via GET /capabilities and
OPTIONS /mcp-servers + the AgentProfile.mcp_servers_resolved sibling
field for substrate-agnostic audit (spec/24 D1 addendum) ship at
PR 2-4.

5-PR shipping plan in the spec body: PR 1 Protocol scaffolding + DRAFT
spec + filesystem read paths + CLI read-only subcommands. PR 2 agent.py
wiring + AgentProfile sibling field + IRON RULE regression suite +
doctor coherence check. PR 3 filesystem install/uninstall + LockBackend
integration + CLI install/uninstall. PR 4 HTTPMCPServerRegistryBackend
read paths + tier capability negotiation + httpx via new [http]
optional extra in pyproject.toml. PR 5 HTTP install/uninstall + spec/36
LOCK at 10 MUSTs + v1.0.0 release tag (twelve of twelve backend
protocols shipped).

CHANGELOG [Unreleased] entry covers the PR 1 deliverables + review army
findings + fixes.

Closes the office-hours / plan-eng-review / plan-subagent / /ship review
pipeline for PR 1. Cross-model triple-confirmed P0 secret leak in
mcp-registry show fixed before push.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…1 of 5

README.md: test count 2937→3153 (3101 passing + 52 skipped); spec list gains
spec/35 (RFC) + spec/36 (DRAFT, PR 1 of 5); RFC/DRAFT count 2→4; "remaining
two protocols" corrected to "remaining protocol" (CorpusBackend shipped).
CLAUDE.md: test count refresh to 3,153 (2026-06-03).
docs/spec/19-mcp.md: cross-link to spec/36 added in the Cross-links header.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dep0we dep0we merged commit 482bf21 into main Jun 3, 2026
5 checks passed
@dep0we dep0we deleted the feat/mcp-server-registry-backend-pr1 branch June 3, 2026 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[backend] MCPServerRegistryBackend — unify MCP server discovery across agents

1 participant