dep0we · Jun 3, 2026
@@ -207,7 +207,7 @@ uv run pytest                            # full suite
 uv run pytest tests/test_<module>.py -v  # one module
 ```
 
-Run `uv run pytest --collect-only -q | tail -1` for the live test count (last refresh: 3,153 tests collected, 2026-06-03). New backend protocols add ~25 conformance + ~10 impl-specific tests. New features ship with tests. Migration-shaped PRs need parameterized fixture tests across the backend protocol — the conformance suite is what keeps the protocol honest.
+Run `uv run pytest --collect-only -q | tail -1` for the live test count (last refresh: 3,199 tests collected, 2026-06-03). New backend protocols add ~25 conformance + ~10 impl-specific tests. New features ship with tests. Migration-shaped PRs need parameterized fixture tests across the backend protocol — the conformance suite is what keeps the protocol honest.
 
 ### Releases + SemVer
 
@@ -341,7 +341,7 @@ These are not forbidden forever — they're explicitly deferred with rationale.
 
 ## Status
 
-**v0.13.0, alpha, PUBLIC.** Core runtime stable. Test suite: run `uv run pytest --collect-only -q | tail -1` for the live count (last refresh: 3,153 tests collected, 2026-06-03). Capability-gated skips fall into four buckets — ToolRegistry conformance (filesystem-shape + `supports_uninstall=False` variants), AgentProfile (skill-content + filesystem-shape on SQLite), cross-process Redis (require real Redis instead of fakeredis), and judge-conformance dispatch (LLM-only + PolicyJudge concurrent-evaluate). Full CI runs against `uv sync --extra dev --extra openai --extra validation --extra redis`. **Eleven backend protocols shipped**:
+**v0.13.0, alpha, PUBLIC.** Core runtime stable. Test suite: run `uv run pytest --collect-only -q | tail -1` for the live count (last refresh: 3,199 tests collected, 2026-06-03). Capability-gated skips fall into four buckets — ToolRegistry conformance (filesystem-shape + `supports_uninstall=False` variants), AgentProfile (skill-content + filesystem-shape on SQLite), cross-process Redis (require real Redis instead of fakeredis), and judge-conformance dispatch (LLM-only + PolicyJudge concurrent-evaluate). Full CI runs against `uv sync --extra dev --extra openai --extra validation --extra redis`. **Eleven backend protocols shipped**:
 
 - **MemoryBackend** (PR #57) — filesystem reference impl + conformance suite.
 - **LLMBackend** (#87) — Anthropic + OpenAI + Moonshot reference impls, registered at framework import; conformance suite parametrizes across all three.

@@ -282,7 +282,7 @@ Same pattern for OpenAI (`atomic-agents-openai`) and Moonshot (`atomic-agents-mo
 ## Repository structure
 
 - `atomic_agents/` — the Python package (runtime in `agent.py`; backend protocols in `memory/`, `_llm.py`, `_locks.py`, `_costs.py`, etc.; CLI in `cli.py`; preflight in `doctor.py`)
-- `tests/` 3153 tests collected (3101 passing + 52 skipped), Python 3.11 + 3.12 matrix
+- `tests/` 3199 tests collected (3141 passing + 58 skipped), Python 3.11 + 3.12 matrix
 - `docs/` — [spec entry point](docs/README.md), [`architecture.md`](docs/architecture.md), [`spec/`](docs/spec/) (31 locked docs + 4 RFCs/DRAFTs), [`deployment/`](docs/deployment/) (8 operator runbooks), [`samples/caldwell/`](docs/samples/caldwell/) (complete worked example), [`GOVERNANCE.md`](docs/GOVERNANCE.md), [`TENSIONS.md`](docs/TENSIONS.md), [`methodology.md`](docs/methodology.md)
 - `extras/` — operational templates (Claude Code skill wrappers, macOS LaunchAgent plists, cron examples)
 
@@ -313,4 +313,4 @@ Before opening a PR, read [`CLAUDE.md`](CLAUDE.md) (the project's design ethos a
 
 ## Status
 
-**v0.13.0, alpha.** Core runtime stable. 3153 tests collected (3101 passing + 52 skipped) on Python 3.11 / 3.12. Eleven of twelve backend protocols shipped (see the backend protocols table above); `MCPServerRegistryBackend` planned. The surface stabilizes at v1.0. Pre-1.0 — Minor releases may contain breaking changes (see [`docs/deployment/versioning.md`](docs/deployment/versioning.md)). Single-maintainer project; reference implementation anyone can use, fork, or extend.
+**v0.13.0, alpha.** Core runtime stable. 3199 tests collected (3141 passing + 58 skipped) on Python 3.11 / 3.12. Eleven of twelve backend protocols shipped (see the backend protocols table above); `MCPServerRegistryBackend` planned. The surface stabilizes at v1.0. Pre-1.0 — Minor releases may contain breaking changes (see [`docs/deployment/versioning.md`](docs/deployment/versioning.md)). Single-maintainer project; reference implementation anyone can use, fork, or extend.
@@ -73,6 +73,13 @@
     CorpusBackend,
     get_default_corpus_backend,
 )
+from .mcp_registry import (
+    MCPRegistryError,
+    MCPRegistryUnavailable,
+    MCPServerRegistryBackend,
+    _redact_for_error_message as _redact_mcp_registry_url,
+    get_default_mcp_server_registry_backend,
+)
 from .logs.types import (
     PRIMITIVE_AGENT_CALL,
     PRIMITIVE_CAPTURE,
@@ -266,6 +273,13 @@ class AtomicAgent:
     # ``CorpusBackend`` Protocol implementer -- breaking the
     # operator-pinned-SQLite/pgvector case PR 3 forward.
     corpus_backend: CorpusBackend
+    # Same class-level annotation rationale for ``mcp_server_registry_backend``
+    # (#201 PR 2). Without this, static analysis would narrow
+    # ``agent.mcp_server_registry_backend`` to the concrete
+    # ``FilesystemMCPServerRegistryBackend`` default rather than treating
+    # it as any ``MCPServerRegistryBackend`` Protocol implementer --
+    # breaking the operator-pinned-HTTP/SaaS case from PR 4 onward.
+    mcp_server_registry_backend: MCPServerRegistryBackend
     """The main agent runtime.
 
     Responsible for:
@@ -293,6 +307,7 @@ def __init__(
         policy_backend: PolicyBackend | None = None,
         persona_backend: PersonaBackend | None = None,
         corpus_backend: CorpusBackend | None = None,
+        mcp_server_registry_backend: MCPServerRegistryBackend | None = None,
     ):
         self.name = name
         self.trigger = trigger
@@ -528,6 +543,65 @@ def __init__(
                 agent_mode=parse_agent_mode_text(_persona.identity),  # re-derive
             )
 
+        # ── MCPServerRegistryBackend resolution (#201 PR 2 of 5) ──────────────
+        # Mirrors PersonaBackend's _persona_backend_was_explicit pattern at
+        # agent.py:443-450 and CorpusBackend's at agent.py:458-465. MCP catalog
+        # is per-agent semantic context (per spec/36 Decision 1); delegate
+        # threading is explicit-only.
+        #
+        # Unlike other backends, the default-resolution factory needs read_paths
+        # from self._profile.tool_config['read_paths'], which is only available
+        # after profile load. The resolution therefore happens here in __init__
+        # AFTER profile load and BEFORE _load_config() is called, rather than
+        # inside _load_config() (which is a pure reader of self._profile).
+        # This corrects spec/36 line 599: the spec text says _load_config(),
+        # but the resolution belongs in __init__; the spec doc gets a
+        # one-sentence amendment in this same PR.
+        _mcp_server_registry_backend_was_explicit = (
+            mcp_server_registry_backend is not None
+        )
+        read_paths_for_mcp_registry = self._profile.tool_config.get("read_paths", [])
+        if mcp_server_registry_backend is None:
+            self.mcp_server_registry_backend = get_default_mcp_server_registry_backend(
+                self.agent_root,
+                read_paths_for_mcp_registry,
+            )
+        else:
+            self.mcp_server_registry_backend = mcp_server_registry_backend
+        # Saved on self so delegate() can consult it without re-checking the
+        # constructor kwarg (the kwarg is no longer in scope there).
+        self._mcp_server_registry_backend_was_explicit = (
+            _mcp_server_registry_backend_was_explicit
+        )
+
+        # Probe + augment profile per spec/36 framework-level invariant (lines
+        # 520-522). Fail-closed: errors from load_all_mcp_servers are never
+        # swallowed; MCPRegistryUnavailable propagates. The try/except below
+        # only re-raises with backend_id + redacted URL context for
+        # operator-facing messages per spec/36 MUST 4 + line 522.
+        try:
+            _materialized_mcp_specs = (
+                self.mcp_server_registry_backend.load_all_mcp_servers()
+            )
+        except MCPRegistryError as exc:
+            # Catch MCPRegistryError broadly (covers MCPRegistryUnavailable,
+            # MCPRegistryDescriptorInvalid, MCPRegistryAuthRequired). Re-raise
+            # preserving the original exception type so callers can distinguish
+            # transient (Unavailable) from permanent (DescriptorInvalid).
+            _safe_backend_id = getattr(
+                self.mcp_server_registry_backend, "backend_id", "unknown"
+            )
+            raise type(exc)(
+                f"[{_safe_backend_id}] catalog probe failed at agent "
+                f"construction: {_redact_mcp_registry_url(str(exc))}"
+            ) from exc
+        # Populate mcp_servers_resolved on the profile via replace().
+        # Stream 2 adds the mcp_servers_resolved field to AgentProfile; this
+        # replace() call is a no-op on the field until Stream 2 merges.
+        self._profile = self._profile.replace(
+            mcp_servers_resolved=_materialized_mcp_specs,
+        )
+
         # Per-agent target extractor registry (spec/29 §"Target extraction",
         # #124 PR 3a). MUST initialize BEFORE tool_registry loading below so
         # ToolDefinitions that declare a target_extractor_id can be validated
@@ -3291,15 +3365,40 @@ def call(
             # Only spin up when mcp.md declares servers and pool not yet live.
             # Discover tools and register them into the tool registry before
             # the first LLM call so the model sees the full tool list.
-            if self.config.mcp_servers and self.mcp_pool is None:
+            #
+            # Per spec/36 framework invariant (line 520): MCPClientPool consumes
+            # mcp_servers_resolved (the materialized list from the registry
+            # backend, populated in __init__ via replace()). This is the
+            # substrate-agnostic spec list. AgentConfig.mcp_servers stays as
+            # self._profile.mcp_servers (the filesystem-parse path) for backward
+            # compat on existing log/audit consumers.
+            #
+            # IMPORTANT: an empty resolved list is AUTHORITATIVE, not a
+            # missing-field signal. If the registry backend genuinely returns
+            # [] (e.g., operator pinned an HTTP catalog that lists zero MCP
+            # servers for this agent_scope), we MUST NOT fall back to
+            # config.mcp_servers (which may carry stale mcp.md specs).
+            # Cross-model reviews (Codex + Claude adversarial + plan-subagent
+            # prep pass) all flagged the `... or self.config.mcp_servers`
+            # fallback as the highest-priority issue: it lets the framework
+            # launch subprocesses the backend explicitly removed. The check
+            # below uses `hasattr` to distinguish "field missing entirely"
+            # from "field present but empty". The field is added in this same
+            # PR's Stream 2, so post-merge this always takes the resolved
+            # path.
+            if hasattr(self._profile, "mcp_servers_resolved"):
+                _resolved_mcp_specs = list(self._profile.mcp_servers_resolved)
+            else:
+                _resolved_mcp_specs = list(self.config.mcp_servers)
+            if _resolved_mcp_specs and self.mcp_pool is None:
                 # ── #89 PR 3b: Policy MCP-allowlist consultation ────────
                 # Consult Policy on each declared server. Emit a
                 # policy_decision event (axis=mcp_allowlist) per denied
                 # server. In log-only mode (enforce_noncap=False, PR 3b
                 # default) all configured servers still connect; in
                 # enforcement mode denied servers are filtered out before
                 # the pool spins up so we don't pay the subprocess cost.
-                effective_mcp_specs = self.config.mcp_servers
+                effective_mcp_specs = _resolved_mcp_specs
                 pol_snap = self._policy_snapshot_this_call
                 if pol_snap is not None and pol_snap.mcp_allow_fn is not None:
                     from .policy.types import (
@@ -3308,7 +3407,7 @@ def call(
                     )
 
                     allowed_specs = []
-                    for _spec in self.config.mcp_servers:
+                    for _spec in _resolved_mcp_specs:
                         if pol_snap.mcp_allow_fn(_spec.name):
                             allowed_specs.append(_spec)
                             continue
@@ -4649,6 +4748,10 @@ def delegate(
             _delegate_kwargs["persona_backend"] = self.persona_backend
         if self._corpus_backend_was_explicit:
             _delegate_kwargs["corpus_backend"] = self.corpus_backend
+        if self._mcp_server_registry_backend_was_explicit:
+            _delegate_kwargs["mcp_server_registry_backend"] = (
+                self.mcp_server_registry_backend
+            )
         target_agent = AtomicAgent(**_delegate_kwargs)
 
         start = time.time()
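
Review note: the `call()` hunk above treats an empty `mcp_servers_resolved` list as authoritative. The sketch below illustrates why the `hasattr` check behaves differently from the rejected `... or self.config.mcp_servers` fallback. It is a minimal, standalone approximation, not the project's code: `LegacyProfile`, `ResolvedProfile`, `resolve_with_or`, and `resolve_with_hasattr` are hypothetical names, and plain strings stand in for the real MCP server spec objects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LegacyProfile:
    """Hypothetical profile that predates the resolved field."""

    mcp_servers: list = field(default_factory=list)


@dataclass(frozen=True)
class ResolvedProfile:
    """Hypothetical profile that carries the backend-resolved list."""

    mcp_servers: list = field(default_factory=list)
    mcp_servers_resolved: list = field(default_factory=list)


def resolve_with_or(profile):
    """Rejected shape: an empty resolved list silently falls back."""
    resolved = getattr(profile, "mcp_servers_resolved", [])
    return list(resolved or profile.mcp_servers)


def resolve_with_hasattr(profile):
    """Shape used in the diff: fall back only when the field is absent."""
    if hasattr(profile, "mcp_servers_resolved"):
        return list(profile.mcp_servers_resolved)
    return list(profile.mcp_servers)


# The backend resolved zero servers, while a stale filesystem spec remains.
pinned_empty = ResolvedProfile(
    mcp_servers=["stale-server"], mcp_servers_resolved=[]
)
# A profile without the field at all (the pre-Stream-2 situation).
legacy = LegacyProfile(mcp_servers=["stale-server"])

print(resolve_with_or(pinned_empty))       # stale spec leaks through
print(resolve_with_hasattr(pinned_empty))  # empty list is respected
print(resolve_with_hasattr(legacy))        # fallback only when field is missing
```

With the `or` form, a backend that deliberately returns no servers is indistinguishable from a missing field, so the stale spec would still be launched. The `hasattr` form keeps the fallback for the missing-field case only.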

@@ -24,6 +24,7 @@
     NotInRoster,
     SelfDelegationError,
 )
+from .mcp_registry import MCPRegistryError
 
 
 def main(argv: list[str] | None = None) -> int:
@@ -35,10 +36,13 @@ def main(argv: list[str] | None = None) -> int:
     parser.add_argument("--target", required=True, help="target agent name")
     parser.add_argument("--work-item", required=True, help="work item text")
     parser.add_argument(
-        "--critical", action="store_true",
+        "--critical",
+        action="store_true",
         help="bypass cost guardrails (still logged)",
     )
-    parser.add_argument("--agents-root", default=None, help="override ATOMIC_AGENTS_ROOT")
+    parser.add_argument(
+        "--agents-root", default=None, help="override ATOMIC_AGENTS_ROOT"
+    )
 
     args = parser.parse_args(argv)
     agents_root = (
@@ -58,7 +62,12 @@ def main(argv: list[str] | None = None) -> int:
             work_item=args.work_item,
             critical=args.critical,
         )
-    except (NotInRoster, SelfDelegationError, CostGuardrailBlocked) as e:
+    except (
+        NotInRoster,
+        SelfDelegationError,
+        CostGuardrailBlocked,
+        MCPRegistryError,
+    ) as e:
         print(f"Error: {e}", file=sys.stderr)
         return 1
     except AtomicAgentsError as e: