From 6bcaf1c15bb9ae698fe2c24ab3fc2829ae6744a1 Mon Sep 17 00:00:00 2001 From: Dan Powers Date: Thu, 4 Jun 2026 10:51:59 -0500 Subject: [PATCH 1/2] feat(mcp_registry): #201 PR 4 of 5. HTTPMCPServerRegistryBackend read paths + tier negotiation Ships the HTTP-catalog reference implementation for the MCPServerRegistryBackend Protocol arc with read paths (list / load / load_all / validate) + tier-1/2/3 capability negotiation + httpx exception mapping + conformance suite parametrize. Install / uninstall write paths land at PR 5. Closes 36 prep-pass findings (8 P0, 18 P1, 10 P2 from 5-stream Sonnet prep pass) BEFORE implementation and 8 pre-landing P0 / P1 findings AFTER implementation (RuntimeError on closed client race, OPTIONS non-404 / 405 silent fallback, catalog_url query-string normalization, MCPServerRef.source spec compliance, validate() 404 message wording, factory ValueError credential redaction, MCPServerRef.from_dict version normalization, plus import + URL-normalization cleanups). Test delta: +75 net new (3232 -> 3307 collected; 3248 passed, 59 skipped, 0 regressions). PR 4 extends the post-#285-revert /ship streak to 11. Co-Authored-By: Claude Opus 4.7 --- CHANGELOG.md | 2 + CLAUDE.md | 4 +- README.md | 6 +- atomic_agents/mcp.py | 42 + atomic_agents/mcp_registry/__init__.py | 19 +- atomic_agents/mcp_registry/http.py | 1163 ++++++++++++ atomic_agents/mcp_registry/types.py | 7 +- atomic_agents/profile/types.py | 38 +- docs/spec/36-mcp-server-registry-backend.md | 147 ++ pyproject.toml | 1 + tests/test_mcp_server_registry_conformance.py | 148 +- .../test_mcp_server_registry_http_backend.py | 1624 +++++++++++++++++ 12 files changed, 3158 insertions(+), 43 deletions(-) create mode 100644 atomic_agents/mcp_registry/http.py create mode 100644 tests/test_mcp_server_registry_http_backend.py diff --git a/CHANGELOG.md b/CHANGELOG.md index 7b0c7d3..7ccf818 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -51,6 +51,8 @@ CHANGELOG entry. ### Added +- **HTTPMCPServerRegistryBackend read paths + tier negotiation + httpx exception mapping + conformance parametrize** ([#201](https://github.com/dep0we/atomic-agents-stack/issues/201) -- MCPServerRegistryBackend arc **PR 4 of 5**). `atomic_agents/mcp_registry/http.py` ships `HTTPMCPServerRegistryBackend` with the full Decision 4 five-step capability probe sequence (`GET /capabilities` 200 parses tier from body; 404 falls through to `OPTIONS /mcp-servers` Allow-header set-membership inference; OPTIONS 404 falls back to conservative tier 1; 5xx raises `MCPRegistryUnavailable`; 401 raises `MCPRegistryAuthRequired`; other non-404 4xx raises `MCPRegistryUnavailable` per B-F8 -- no silent tier-1 fallback). `make_http_mcp_server_registry_backend_from_url(url)` factory reads `ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN` from the environment. The `[http]` extra (`httpx>=0.27`) is opt-in; filesystem operators pay zero import cost (lazy import inside the http branch). `MCPServerSpec.to_dict()` and `MCPServerSpec.from_dict()` promoted to public methods on the class; `profile/types.py` helpers delegate. `get_default_mcp_server_registry_backend` gains the `http` branch reading `ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND_URL` + `ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN`. httpx exception mapping covers the full public tree: `LocalProtocolError` and `DecodingError` map to `MCPRegistryDescriptorInvalid`; all timeout variants, network errors, and remote protocol errors map to `MCPRegistryUnavailable`; `httpx.InvalidURL` (does not inherit from `HTTPError`) maps to `ValueError` via a separate except clause; `httpx.HTTPError` is the final catch-all for any future subclass. Capability probe failure cache (`probe_failure_cache_s=60.0`) suppresses re-probes within the window; `refresh_capabilities()` bypasses the cache. Threading: `threading.Lock` guards only the cache-check decision and cache-write assignment; the HTTP probe runs outside the lock so concurrent first-call callers do not serialize against network latency. `MCPServerRef.source` uses the raw `catalog_url` (per spec/36 line 228: `source=f"{catalog_url}/mcp-servers/{name}"` is the canonical wire contract). The recommended operator pattern for credentials is the `ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN` env var rather than embedding credentials in `catalog_url`; the factory's `ValueError` redacts credentials when an unsupported scheme is passed so a paste mistake on the env var does not leak. `auth_token` never appears in any error message, log line, or `__repr__`. Response body validation rejects malformed JSON, missing `servers` key, missing required spec fields (`name`, `command`), and MUST 1 charset violations in returned server names. `load_all_mcp_servers()` uses a single `GET /mcp-servers?expand=spec` bulk call instead of N+1 per-name requests (MUST 10). spec/36 PR 4 amendments ship inline: new subsections for HTTP wire format, tier negotiation, capability handshake (static-vs-runtime view), per-scope filtering (catalog MUST filter server-side; org-wide returns are non-conformant), exception surface (httpx mapping table with `httpx.InvalidURL` separate catch noted), and default factory (new env vars). **Test delta: +75 net new (3232 before PR 4, 3307 after)**. `tests/test_mcp_server_registry_http_backend.py` NEW (54 tests using `httpx.MockTransport` -- zero new dev dependency -- covering: MUST 2 side-effect-free construction including lazy-httpx-import guard; all 5 Decision 4 probe branches plus 401, non-404 4xx, and reordered Allow header; full httpx exception mapping including `LocalProtocolError` to `DescriptorInvalid`, `DecodingError` to `DescriptorInvalid` (deterministic via MockTransport injection), `InvalidURL` to `ValueError` via injection, `HTTPError` catch-all, and `RuntimeError` from a closed `httpx.Client` to `MCPRegistryUnavailable` (closes the Adv-F2 race); 5 response body validation defense-in-depth tests; MUST 10 bulk endpoint tests including full-field equality with non-default `args`/`env`/`description` and env-var resolution; 4 auth and URL credential redaction tests; 3 capability lifecycle tests; review-army follow-up tests covering `MCPServerSpec.to_dict/from_dict` public round-trip + extra-key forward-compat + required-key `KeyError`, OPTIONS probe 5xx and 401 handling per Adv-F3, concurrent first-call probe verification per D-PR4-3, `agent_scope` query-param forwarding verification, success-cache verification, factory function tests including credential-redacting `ValueError` and env-var auth-token read, `catalog_url` query-string normalization per Adv-F4, and `MCPServerRef.source` using the raw `catalog_url` per spec/36). `tests/test_mcp_server_registry_conformance.py` flips `params=["filesystem"]` to `params=["filesystem", "http"]` on both `backend_factory` and `populated_backend` fixtures; HTTP branch uses `httpx.MockTransport` responding to the full probe sequence so capability tests do not cascade-fail. Two parallel Sonnet implementer streams ran under git branch isolation (Stream 1 owned `http.py` + factory + types promotion + spec/36 amendments; Stream 2 owned HTTP test file + conformance parametrize + CHANGELOG). Pre-impl prep: 5-stream parallel Sonnet prep pass caught 36 findings (8 P0, 18 P1, 10 P2) including the `probe_failure_cache_s` vs `request_timeout_s` parameter-name mismatch that would have produced 30+ re-probes per 5 minutes during sustained catalog outages, the capability-default mismatch that would have caused MUST 3 conformance lies, and the missing `MCPServerSpec.to_dict/from_dict` public methods. Pre-landing /ship review army (5 specialists + Claude adversarial + Step 9 checklist + outside-voice coverage audit + plan completion audit) surfaced 34 follow-up findings; 8 P0 / P1 fixes applied inline before push: Adv-F2 `RuntimeError` on closed `httpx.Client` race mapped to `MCPRegistryUnavailable`, Adv-F3 / T-F1 OPTIONS probe non-404/non-405 silent fallback closed (now raises `MCPRegistryAuthRequired` on 401 and `MCPRegistryUnavailable` on 5xx and other 4xx), Adv-F4 `catalog_url` query-string normalization at construction so embedded query params do not corrupt downstream request URLs, A-F1 `MCPServerRef.source` switched to raw `catalog_url` per spec, A-F2 `validate()` 404 message rewording to honestly reflect the spec's ambiguity, A-F5 `MCPServerRef.from_dict` empty-string `version` normalization, S-F1 factory `ValueError` redaction, M4/M5/M6 import + URL-normalization cleanups. PR 4 extends the `/ship` streak to 11. + - **MCPServerRegistryBackend filesystem install/uninstall + LockBackend lease + render_mcp_md serializers + CLI install/uninstall subcommands** ([#201](https://github.com/dep0we/atomic-agents-stack/issues/201) -- MCPServerRegistryBackend arc **PR 3 of 5**). Operators on every backend now have the framework's CLI as the canonical install surface for MCP servers. `atomic-agents mcp-registry install --command --args --env --description ` atomically appends a new H2 section to `/mcp.md` under a `LockBackend` lease; `atomic-agents mcp-registry uninstall ` removes it idempotently (absent-name returns 0 without error, matching the SQLiteToolRegistryBackend uninstall precedent). The README's "same agent definitions, same call flow, different backends" promise now extends to operator write commands at v1.0, not just runtime read paths. **The mcp.md serializer that the install path depends on shipped in the same PR**: `atomic_agents/mcp.py` gains `render_mcp_md_section(spec) -> str` and `render_mcp_md_full(specs) -> str` with a round-trip property pinned by 4 new tests (parse_mcp_md_text(render_mcp_md_full(specs), resolve_env=False) == specs). The serializer writes `$VAR` env references verbatim per spec/36 Decision 7 (resolved env values never persist to disk); refuses any field (command, args items, env keys, env values, description) containing a newline so the parser cannot be tricked into interpreting attacker-controlled content as a phantom H2 section. `FilesystemMCPServerRegistryBackend.install(spec) -> MCPServerRef` implements the 7-step critical section per spec/36 MUST 9: `_validate_server_name(spec.name)` at the API boundary (MUST 1), `with lock_backend.acquire("mcp_registry", timeout=self._install_lock_timeout) as handle:` (context-manager idiom releases on every exit path including exceptions), `LockBusy` caught at the boundary and re-raised as `MCPRegistryUnavailable` (preserves the framework-level fail-closed semantic at `agent.py:__init__`), `cleanup_stale_tempfiles_for_file(mcp_md)` scoped to the target file's siblings only (NOT recursive over the whole agent_root tree), read mcp.md with FileNotFoundError → empty content cold-start, parse with `resolve_env=False` keeping $VAR refs raw, dual-probe collision detection across both the parsed-name set AND a raw `re.findall(r"^## (\S+)", content, re.MULTILINE)` scan (catches malformed sections that the parser silently skipped), `check_lock_lost(handle)` immediately before atomic_write (no-op on filesystem; raises LockLost on Redis-backed leases that expired mid-critical-section, re-raised as MCPRegistryUnavailable; non-LockLost exceptions from the helper also caught), full-file render via `render_mcp_md_full`, `_io.atomic_write` (temp + fsync + rename + parent dir fsync). Returns `MCPServerRef` projected from input spec (name, single-line description, transport, version=None, source=`f"mcp.md#section:{name}"`) with NO env / command / args fields — the CLI handler can safely echo `ref.name` without secret-leak risk (closes the symmetric class of the PR 1 P0 secret leak in `mcp-registry show` that was caught by cross-model triple-confirmation). `uninstall(name)` mirrors the lock discipline: validate-name-first, dual-probe absent check, no atomic_write on no-op path (preserves mcp.md mtime), returns None on both present-removed and absent-no-op paths, no pre-lock fast-path (a concurrent install could add the name between an unlocked check and the subsequent read; spec/36 MUST 9 amendment documents this discipline). Constructor signature gains `install_lock_timeout: float = 30.0` kwarg (per spec/21 `apply_staging_lock_timeout` precedent; CI fail-fast tests use `install_lock_timeout=0.0`) and the default `lock_backend=None` now lazily resolves via `get_default_lock_backend(self._agent_root)` so multi-host operators pinning `ATOMIC_AGENTS_LOCK_BACKEND=redis` on Cloud Run / Kubernetes automatically get `RedisLockBackend` for registry writes without per-construction operator config. Capability flags flip at this PR per spec/36 Decision 5 evolution table: `supports_install=True`, `supports_uninstall=True` (was False at PR 1/2 because the methods raised NotImplementedError; MUST 3 capability honesty now means conformance suite calls install/uninstall directly and asserts they return correctly typed values, not that they raise). `MCPRegistryError` rebased to inherit from `AtomicAgentsError` (was `Exception`) so framework-wide `except AtomicAgentsError` catch-alls see registry failures consistently with the 11 other backend protocols' hierarchies. `atomic_agents/cli.py` adds `install` + `uninstall` subparsers under `mcp-registry` with `--command` (required), `--args` (comma-separated, empty entries dropped), `--env` (comma-separated KEY=$VAR pairs, split on first `=` so values may contain `=`, empty key raises ValueError), `--description` (single-line; refused if any line matches `^##\s` per defense-in-depth against H2 injection), `--transport` (choices=["stdio"] for v1). CLI WARNs on stderr when `--env KEY=value` doesn't start with `$` (operator likely typed a literal secret; install still succeeds per decision 3 = WARN, not ERROR — legitimate non-secret use cases like `--env MODE=production` aren't blocked but the operator sees feedback). CLI REFUSES newlines in `--command`, any `--args` item, any `--env` key, or any `--env` value with operator-readable errors naming the offending flag (defense-in-depth against the API-path H2 injection class that Claude Adversarial + Codex independently flagged as P1). Lazy import block adds `MCPServerAlreadyInstalled` + `MCPRegistryError` (was missing — would have caused `NameError` on the first install collision); exception handler chain adds explicit `MCPServerAlreadyInstalled` catch before the `MCPRegistryError` base-class backstop (catch-order matters: more-specific subclass before base). Top-level `mcp-registry` description and module docstring updated to remove the "deferred to PR 3" language now that install/uninstall ship. spec/36 PR 3 amendments (committed at `3a3a23e` before implementer dispatch): constructor signature gains `install_lock_timeout` kwarg with usage rationale; `lock_backend` parameter docstring routes default through `get_default_lock_backend(agent_root)` and explicitly names the custom-lock-backend deadlock failure mode (passing `agent.lock_backend` competes with `agent.call()` for `/.lock` and raises LockBusy whenever a call is in flight); new "Install / uninstall semantics" subsection documents the 7-step critical section with the context-manager idiom (NOT bare `handle.release()` because LockHandle is a frozen dataclass and release is a backend method), dual-probe collision detection, absent-name idempotency with no fast-path bypass; new "LockBackend integration" subsection documents factory routing + install_lock_timeout knob + LockBusy translation + check_lock_lost discipline + custom-lock-backend operator surface + multi-host pinning + non-reentrant default; MUST 9 contract updated to require the context-manager idiom + explicit LockBusy → MCPRegistryUnavailable mapping + check_lock_lost before atomic_write + no-pre-lock-fast-path rule + mtime-preservation note for absent-name uninstall; capabilities label flipped from "PR 1/2" to "PR 3+" reflecting the flag flip. **Cross-model review army at /ship time** (7 parallel Sonnet subagents covering plan completion + pre-landing checklist + testing/maintainability/security/performance specialists + Claude adversarial + Codex adversarial via `codex exec`) surfaced 44 findings; 5 were triple-confirmed across at least 3 independent reviewers and applied inline: (1) **H2 injection refusal in render_mcp_md_section** for newlines in spec.command, spec.args items, spec.env keys + values (Claude Adversarial #1 CRITICAL FIXABLE + Codex P1 + Pre-Landing — an API caller could construct an MCPServerSpec that wrote multi-section content the parser interpreted as MULTIPLE H2 sections, bypassing collision detection + name validation with no audit record); (2) **cleanup_stale_tempfiles moved out of __init__** to install/uninstall write paths with a tightly-scoped `cleanup_stale_tempfiles_for_file(mcp_md)` glob (NOT rglob) helper in `_io.py` (Pre-Landing CRITICAL + Codex P1 + Performance + Claude Adversarial — the constructor was recursively deleting `.*.tmp` anywhere under agent_root which violated MUST 2 side-effect-free construction and could delete unrelated user/application tempfiles, including from read-only commands like `list`); (3) **BackendNotRegistered escape from locks module fixed**: `_resolve_lock_backend` wraps `get_default_lock_backend` in try/except and re-raises as `MCPRegistryUnavailable` so operator typos in `ATOMIC_AGENTS_LOCK_BACKEND` produce clean errors instead of raw Python tracebacks (Codex P2 + Claude Adversarial #3); (4) **check_lock_lost broaden except clause**: non-LockLost exceptions (ImportError from broken redis dep, AttributeError from malformed handle.backend_state, etc.) now translate to MCPRegistryUnavailable instead of escaping raw (Codex P3 + Claude Adversarial #4); (5) **lock timeout test added** `test_install_lock_timeout_zero_under_contention` exercises the spec/36 MUST 9 LockBusy → MCPRegistryUnavailable contract by holding the lock in the test setup and asserting install with `install_lock_timeout=0.0` raises the wrapper exception (Testing specialist CRITICAL + Pre-Landing + Maintainability — module docstring had promised this test category but no implementation existed). Auto-fix cluster also applied inline: late imports (`render_mcp_md_full` + `check_lock_lost`) moved to top-level for visibility and micro-perf; stale docstrings in `filesystem.py` + `backend.py` rewritten to present-tense PR 3+ baseline (removed PR-1 historical claims that misled future readers); test assertion gaps closed (`test_uninstall_idempotent_double_call` now asserts `result2 is None`, `test_cli_install_warns_on_literal_env_value` now asserts `exit_code == 0`, `test_install_empty_command_raises` tightened from `pytest.raises((ValueError, Exception))` to `pytest.raises(ValueError)`); CLI H2 description guard aligned to the renderer's `re.match(r'^##\s', line)` regex (catches `##\t` tab-separated case that the prior `line.startswith("## ")` missed). Conformance suite tightened per spec/36 MUST 3 + MUST 9 + MUST 10: `test_capability_honesty_install` True-branch now asserts `isinstance(ref, MCPServerRef)` (replaces the prior `except Exception: pass` that accepted any exception as conformant); `test_capability_honesty_uninstall` True-branch now asserts `result is None` (idempotent contract); new `test_must9_install_atomicity_concurrent_same_name` spawns 3 threads installing the same spec and asserts exactly one wins (others raise MCPServerAlreadyInstalled or MCPRegistryUnavailable); new `test_must9_uninstall_absent_name_is_noop` asserts uninstall("definitely-not-in-registry") returns None without raising; new `test_must10_post_install_consistency` asserts `set(load_all_mcp_servers()) == set(load_mcp_server(ref.name) for ref in list_mcp_servers())` after install (MUST 10 equivalence holds across the read paths post-mutation); the placeholder `@pytest.mark.skip("PR 3")` stubs in `test_mcp_server_registry_filesystem_backend.py` removed (real tests now in the new file). Doctor capability snapshot test updated to assert `supports_install=True, supports_uninstall=True` (was the PR 2 baseline). **Pre-impl prep**: 5-stream parallel Sonnet prep pass parametrized on failure-mode dimensions (LockBackend integration + acquire/release; install atomicity; uninstall idempotency; CLI surface + secret-leak discipline; capability flag flip + conformance) caught 58 findings BEFORE any code shipped, mirroring the PR 1 (35 findings) and PR 2 (71 findings) prep cadence. The single load-bearing P0 caught at prep stage: **the mcp.md serializer didn't exist anywhere in the codebase** — spec/36 said "append new H2 section" without naming the missing primitive. Streams B and C independently flagged it; PR 3 grew the test budget from +15 to +19 to ship the serializer alongside its consumers in a single review pass. Two implementer streams ran in parallel under git worktree isolation (Stream 1 owned mcp.py serializer + filesystem.py install/uninstall + backend.py base class fix; Stream 2 owned cli.py + tests/test_mcp_server_registry_filesystem_install.py NEW + conformance suite updates) per the aggressive-Sonnet-delegation-when-on-Opus discipline; merged cleanly with zero conflicts because file-set partition was disjoint by design. **Test delta: +33 net new (3199 collected before PR 3, 3232 after; 3176 passed + 56 skipped + 0 failures + 0 regressions across the full suite)**. Test files: `tests/test_mcp.py` (+4 render round-trip tests pinning the serializer's parse/render symmetry including $VAR refs preserved unresolved and descriptions stripped to single-line); `tests/test_mcp_server_registry_filesystem_install.py` NEW (28 tests covering install happy path + cold-start mcp.md creation + collision raising MCPServerAlreadyInstalled + path-traversal name raising ValueError + empty-command rejection + $VAR env round-trip + install/load round-trip + uninstall present/absent/double-call/install-uninstall-install cycle + concurrent same-name exactly-one-wins + concurrent different-names all-win + lock-timeout-zero-under-contention + CLI no-env-echo + CLI WARN on literal env + CLI refuses H2 in description + _parse_env_flag/_parse_args_flag unit tests); `tests/test_mcp_server_registry_conformance.py` (+3 new conformance tests + 2 tightened existing tests + 1 docstring correction); `tests/test_mcp_server_registry_filesystem_backend.py` (placeholder skips removed); `tests/test_mcp_server_registry_doctor.py` (capability assertions updated for the flag flip). After PR 5 of 5 of #201 lands, atomic-agents-stack hits v1.0 with twelve of twelve backend protocols shipped. PR 3 extends the post-#285-revert `/ship` streak to 10. - **MCPServerRegistryBackend wiring + AgentProfile.mcp_servers_resolved sibling field + doctor.check_mcp_server_registry_backend + IRON RULE regression suite + cross-model fail-closed re-raise** ([#201](https://github.com/dep0we/atomic-agents-stack/issues/201) -- MCPServerRegistryBackend arc **PR 2 of 5**). Operators with a registry-backed MCP catalog now have agent construction actually consult the backend: previous PRs scaffolded the Protocol; this PR ships the framework integration. `AtomicAgent.__init__` accepts a new `mcp_server_registry_backend` constructor kwarg + class-level type annotation, resolves the default via `get_default_mcp_server_registry_backend(self.agent_root, read_paths)` when none is supplied, calls `backend.load_all_mcp_servers()` to materialize the spec list, and populates the new `AgentProfile.mcp_servers_resolved` field via `dataclasses.replace()` BEFORE `_load_config()` runs. `MCPClientPool` at `agent.call()` now reads `self._profile.mcp_servers_resolved` (the substrate-agnostic materialized list) instead of `self.config.mcp_servers` (the filesystem-parse path, which stays in place for backward-compat audit/log consumers). The framework-level fail-closed invariant per spec/36 lines 519-522 propagates `MCPRegistryError` from the backend probe with backend_id context plus URL credential redaction; original exception type is preserved via `type(exc)(...)` so callers can distinguish `MCPRegistryUnavailable` (transient) from `MCPRegistryDescriptorInvalid` (permanent). Per-runner kwargs ship on all three runners: `OutcomeRunner(..., mcp_server_registry_backend=...)` and `EvalRunner(..., mcp_server_registry_backend=...)` thread the kwarg to the internal `AtomicAgent` construction sites; `DreamRunner` stores it for API parity (no internal `AtomicAgent` construction in v1, mirroring the CorpusBackend precedent). `delegate.py` adds explicit-only threading via `_mcp_server_registry_backend_was_explicit` flag (mirrors PersonaBackend D-ER-2 at `agent.py:401` and CorpusBackend at `agent.py:431`): default-resolved backends do NOT leak the coordinator's `agent_root` to delegates because MCP catalog is per-agent semantic context, not fleet-scoped; the `MCPRegistryError` family is added to the delegate CLI catch block so catalog-down at delegate construction surfaces as a clean operator-facing error instead of a Python traceback. `AgentProfile` gains the `mcp_servers_resolved: list[MCPServerSpec]` sibling field (spec/24 Decision 1 addendum) placed LAST in the dataclass to honor Python's required-fields-before-defaults ordering rule; `field` is added to the `dataclasses` import; `to_dict()` always serializes the field as `[]` regardless of runtime value (snapshot security: resolved `$VAR` env values never land in snapshot JSON files on disk, preserving spec/24 D1's raw-text-shadow security intent); `from_dict()` reconstructs the field via the new `_mcp_spec_from_dict` helper which also fixes a pre-existing latent bug where the `mcp_servers` fallback path returned raw dicts instead of `MCPServerSpec` instances. SQLite backend continues to ride the existing `profile_json` blob round-trip; zero schema migration required, schema stays at v2. Filesystem profile backend additionally sorts `mcp_servers` lexicographically by name on load to align with spec/36 MUST 5 across all backends; pre-#201 operators with alphabetical `mcp.md` see no change; operators with non-alphabetical `mcp.md` see a one-time invisible reorder of `agent.config.mcp_servers` introspection (MCP tool resolution and audit semantics unchanged because qualified names are unique per server). `doctor.check_mcp_server_registry_backend` ships as the 13th `check_*_backend` with PASS/WARN/FAIL ladder: PASS on `list_mcp_servers` + `load_all_mcp_servers` both succeeding (the new dual-probe closes a Codex/Claude-adversarial-triple-confirmed false-PASS hole where a malformed `mcp.md` would PASS doctor then crash agent construction); WARN on transient `MCPRegistryUnavailable`; FAIL on unknown `backend_id` with credential-redacted echo via the shared `_redact_for_error_message` helper from `mcp_registry/__init__.py` (NOT the inline truncation used in `check_tool_registry_backend` which misses DSN-style values per the PR 1 P0 redaction discipline). Capability snapshot in the detail dict includes all 5 `MCPServerRegistryCapabilities` fields plus `mcp_server_count` so operator `atomic-agents doctor --json` consumers see the runtime view. Two PR 1 latent bug fixes surfaced by PR 2's fail-closed semantic: `FilesystemMCPServerRegistryBackend.list_mcp_servers` now discriminates `FileNotFoundError`/ENOENT (returns `[]`, the normal absent-file path) from other `OSError` (raises `MCPRegistryUnavailable` so misconfigured PermissionError after a Kubernetes deploy actually surfaces); `load_all_mcp_servers` overridden directly instead of delegating to `_default_load_all` so it does a single read-parse with proper exception mapping (ENOENT to empty, OSError to MCPRegistryUnavailable, parse error to MCPRegistryDescriptorInvalid, env-var unresolvable to MCPServerConnectFailed) rather than masking transient and parse failures through `list_mcp_servers`'s soft-degrade path. spec/36 line 599 gains a one-sentence amendment correcting the wiring location from `_load_config()` to `AtomicAgent.__init__` (spec/36 said the former, but `_load_config` is a pure reader of `self._profile` that cannot mutate it); spec/24 Decision 1 gains the locked D1 addendum documenting the sibling field, snapshot security clamp, and SQLite forward-compat. **Cross-model review army** (4 specialists + Claude adversarial subagent + Codex adversarial via `codex exec` + Codex structured review via `codex review`) surfaced 15+ findings; the 3 highest-priority were triple-confirmed across models and applied inline: (1) empty resolved list now treated as authoritative instead of falling back to `config.mcp_servers` (Codex HIGH + Claude adversarial F2 + Codex structured P2; the prior `... or self.config.mcp_servers` resurrected stale `mcp.md` servers when the operator pinned an empty catalog); (2) fail-closed wrapper broadened from `MCPRegistryUnavailable` to `MCPRegistryError` so descriptor-invalid and connect-failed also propagate with backend_id context; (3) doctor false-PASS closed by adding the `load_all_mcp_servers` probe after `list_mcp_servers` succeeds. **Test delta: +46 net new (3192 collected before fixes, 3199 after the doctor + wiring additions, 3141 passed + 58 skipped + 0 failures + 0 regressions across the full suite).** Test files: `tests/test_mcp_server_registry_wiring.py` (~22 tests covering per-runner kwargs, delegate explicit-only behavior, fail-closed re-raise with backend_id + URL redaction + MCPRegistryDescriptorInvalid propagation, profile augmentation, empty-resolved-list authoritative, CLI MCPRegistryError catch), `tests/test_mcp_server_registry_migration_regression.py` (10 IRON RULE byte-identity tests pinning pre-#201 behavior when no backend is configured: empty/missing mcp.md returns empty specs, single server no env vars byte-identical to `parse_mcp_md` baseline, multiple servers in sorted order matching the spec/36 MUST 5 sort, env-var resolved specs match the pre-#201 path, config equals profile resolved, pool `_specs` byte-identical to resolved list, `MCPRegistryUnavailable` raises at construction with no silent degrade), `tests/test_profile_mcp_servers_resolved.py` (9 tests covering field existence + LAST-in-dataclass constraint + to_dict always-`[]` + from_dict round-trip with key present + backward-compat with key absent + `_mcp_spec_from_dict` helper round-trip + `mcp_servers` fallback bug fix + snapshot/restore resets to `[]` + SQLite JSON blob round-trip), `tests/test_mcp_server_registry_doctor.py` (5 tests covering PASS with empty mcp.md, PASS with valid mcp.md, FAIL on unknown backend_id with URL credential redaction including DSN-style `user:pass@host` patterns, FAIL on descriptor invalid via the new probe, capability snapshot completeness). Mirrors the doctor-test shape of every other backend (lock, log, profile, tool registry, persona, corpus). Pre-impl prep: 5-stream parallel Sonnet prep pass parametrized on failure-mode dimensions (wiring shape vs precedent; AgentProfile round-trip; IRON RULE design; doctor PASS/WARN/FAIL ladder; fail-closed wiring) caught 71 findings BEFORE any code shipped, beating PR 1's 35-finding bar. Two binary design decisions locked via AskUserQuestion before implementer dispatch: server list ordering (Q1, locked to "sort both paths to lexicographic" in the filesystem profile backend so spec/36 MUST 5 holds across all backends with zero functional change to operators); snapshot serialization shape (Q2, locked to "always serialize `mcp_servers_resolved` as `[]`" so resolved MCP env secrets never persist to disk in snapshot files). Three commits authored as parallel Sonnet implementer streams in isolated git worktrees per the aggressive-Sonnet-delegation-when-on-Opus discipline (Stream 1 owned `agent.py` + runners + delegate + spec/36; Stream 2 owned profile/* + doctor + mcp_registry/filesystem.py bug fix + spec/24); merged cleanly with zero conflicts because the file-set partition was disjoint by design. Known follow-up coverage gaps (filed as P2 for PR 3 prep): a future PR can add chmod-style PermissionError tests for `load_all_mcp_servers`'s new OSError mapping (analogous coverage already exists for `load_mcp_server`); threading test strengthening to patch `AtomicAgent.__init__` and capture kwargs at runner `run()` time instead of asserting on the stored field. After PR 5 of 5 of #201 lands, atomic-agents-stack hits v1.0 with twelve of twelve backend protocols shipped; PR 2 extends the post-#285-revert /ship streak to 9. diff --git a/CLAUDE.md b/CLAUDE.md index ee26d4a..0da82da 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -207,7 +207,7 @@ uv run pytest # full suite uv run pytest tests/test_.py -v # one module ``` -Run `uv run pytest --collect-only -q | tail -1` for the live test count (last refresh: 3,232 tests collected, 2026-06-04). New backend protocols add ~25 conformance + ~10 impl-specific tests. New features ship with tests. Migration-shaped PRs need parameterized fixture tests across the backend protocol — the conformance suite is what keeps the protocol honest. +Run `uv run pytest --collect-only -q | tail -1` for the live test count (last refresh: 3,307 tests collected, 2026-06-04). New backend protocols add ~25 conformance + ~10 impl-specific tests. New features ship with tests. Migration-shaped PRs need parameterized fixture tests across the backend protocol — the conformance suite is what keeps the protocol honest. ### Releases + SemVer @@ -341,7 +341,7 @@ These are not forbidden forever — they're explicitly deferred with rationale. ## Status -**v0.13.0, alpha, PUBLIC.** Core runtime stable. Test suite: run `uv run pytest --collect-only -q | tail -1` for the live count (last refresh: 3,232 tests collected, 2026-06-04). Capability-gated skips fall into four buckets — ToolRegistry conformance (filesystem-shape + `supports_uninstall=False` variants), AgentProfile (skill-content + filesystem-shape on SQLite), cross-process Redis (require real Redis instead of fakeredis), and judge-conformance dispatch (LLM-only + PolicyJudge concurrent-evaluate). Full CI runs against `uv sync --extra dev --extra openai --extra validation --extra redis`. **Eleven backend protocols shipped**: +**v0.13.0, alpha, PUBLIC.** Core runtime stable. Test suite: run `uv run pytest --collect-only -q | tail -1` for the live count (last refresh: 3,307 tests collected, 2026-06-04). Capability-gated skips fall into four buckets — ToolRegistry conformance (filesystem-shape + `supports_uninstall=False` variants), AgentProfile (skill-content + filesystem-shape on SQLite), cross-process Redis (require real Redis instead of fakeredis), and judge-conformance dispatch (LLM-only + PolicyJudge concurrent-evaluate). Full CI runs against `uv sync --extra dev --extra openai --extra validation --extra redis`. **Eleven backend protocols shipped**: - **MemoryBackend** (PR #57) — filesystem reference impl + conformance suite. - **LLMBackend** (#87) — Anthropic + OpenAI + Moonshot reference impls, registered at framework import; conformance suite parametrizes across all three. diff --git a/README.md b/README.md index 19bc8bb..b7a4a59 100644 --- a/README.md +++ b/README.md @@ -182,7 +182,7 @@ Start at [`docs/README.md`](docs/README.md) for the spec entry point. The locked - [33 — PersonaBackend Protocol](docs/spec/33-persona-backend.md) — persona ownership, snapshot/restore, `persona.link.md` format - [34 — CorpusBackend Protocol](docs/spec/34-corpus-backend.md) — wiki/raw corpus protocol; filesystem + SQLite (FTS5) reference impls; GB-scale indexed full-text search - [35 — init wizard](docs/spec/35-init-wizard.md) — `atomic-agents init` on-ramp; template scaffolding + Add-to-it merge; CI-friendly `--from-template` (RFC) -- [36 — MCPServerRegistryBackend Protocol](docs/spec/36-mcp-server-registry-backend.md) — MCP server catalog + install/audit; `FilesystemMCPServerRegistryBackend` reference impl; `atomic-agents mcp-registry` CLI (DRAFT, PR 3 of 5) +- [36 — MCPServerRegistryBackend Protocol](docs/spec/36-mcp-server-registry-backend.md) — MCP server catalog + install/audit; `FilesystemMCPServerRegistryBackend` reference impl; `HTTPMCPServerRegistryBackend` reference impl with tier-1/2/3 capability negotiation; `atomic-agents mcp-registry` CLI (DRAFT, PR 4 of 5) Each spec doc is locked when the implementation matches and tests pass. Spec changes that imply implementation changes get filed as GitHub issues. **Spec docs separate shipped behavior from explicit future / deferred boundaries** — sections that describe behavior not yet implemented are explicitly marked as such, not silently aspirational. @@ -282,7 +282,7 @@ Same pattern for OpenAI (`atomic-agents-openai`) and Moonshot (`atomic-agents-mo ## Repository structure - `atomic_agents/` — the Python package (runtime in `agent.py`; backend protocols in `memory/`, `_llm.py`, `_locks.py`, `_costs.py`, etc.; CLI in `cli.py`; preflight in `doctor.py`) -- `tests/` 3,232 tests collected, Python 3.11 + 3.12 matrix +- `tests/` 3,307 tests collected, Python 3.11 + 3.12 matrix - `docs/` — [spec entry point](docs/README.md), [`architecture.md`](docs/architecture.md), [`spec/`](docs/spec/) (31 locked docs + 4 RFCs/DRAFTs), [`deployment/`](docs/deployment/) (8 operator runbooks), [`samples/caldwell/`](docs/samples/caldwell/) (complete worked example), [`GOVERNANCE.md`](docs/GOVERNANCE.md), [`TENSIONS.md`](docs/TENSIONS.md), [`methodology.md`](docs/methodology.md) - `extras/` — operational templates (Claude Code skill wrappers, macOS LaunchAgent plists, cron examples) @@ -313,4 +313,4 @@ Before opening a PR, read [`CLAUDE.md`](CLAUDE.md) (the project's design ethos a ## Status -**v0.13.0, alpha.** Core runtime stable. 3,232 tests collected on Python 3.11 / 3.12. Eleven of twelve backend protocols shipped (see the backend protocols table above); `MCPServerRegistryBackend` in progress (PR 3 of 5). The surface stabilizes at v1.0. Pre-1.0 — Minor releases may contain breaking changes (see [`docs/deployment/versioning.md`](docs/deployment/versioning.md)). Single-maintainer project; reference implementation anyone can use, fork, or extend. +**v0.13.0, alpha.** Core runtime stable. 3,307 tests collected on Python 3.11 / 3.12. Eleven of twelve backend protocols shipped (see the backend protocols table above); `MCPServerRegistryBackend` in progress (PR 4 of 5). The surface stabilizes at v1.0. Pre-1.0 — Minor releases may contain breaking changes (see [`docs/deployment/versioning.md`](docs/deployment/versioning.md)). Single-maintainer project; reference implementation anyone can use, fork, or extend. diff --git a/atomic_agents/mcp.py b/atomic_agents/mcp.py index d2033bf..7577010 100644 --- a/atomic_agents/mcp.py +++ b/atomic_agents/mcp.py @@ -106,6 +106,48 @@ class MCPServerSpec: transport: str = "stdio" description: str = "" + def to_dict(self) -> dict: + """Serialize to a JSON-safe plain dict. + + Note: ``env`` may contain resolved values (the ``$VAR_NAME`` + references in source ``mcp.md`` are resolved to literal env-var + values at load time). Callers serializing to a shared log or audit + backend SHOULD treat the ``env`` field as sensitive. The raw text + on the profile (``mcp_md_raw``) is the safe-to-ship form. + + Returns a fresh dict every call; callers may mutate freely. + """ + return { + "name": self.name, + "command": self.command, + "args": list(self.args), + "env": dict(self.env), + "transport": self.transport, + "description": self.description, + } + + @classmethod + def from_dict(cls, d: dict) -> "MCPServerSpec": + """Reconstruct an ``MCPServerSpec`` from a dict produced by ``to_dict()``. + + Required keys (``name``, ``command``) raise ``KeyError`` if absent. + Optional keys (``args``, ``env``, ``transport``, ``description``) fall + back to field defaults. Extra keys in ``d`` are silently ignored for + forward-compatibility with future wire format extensions. + + Applies copy-constructor discipline: ``args`` and ``env`` values are + copied out of ``d`` to prevent callers from accidentally sharing + mutable state with the dict passed in. + """ + return cls( + name=d["name"], + command=d["command"], + args=list(d.get("args", [])), + env=dict(d.get("env", {})), + transport=d.get("transport", "stdio"), + description=d.get("description", ""), + ) + @dataclass class MCPToolMeta: diff --git a/atomic_agents/mcp_registry/__init__.py b/atomic_agents/mcp_registry/__init__.py index bc1f541..9364871 100644 --- a/atomic_agents/mcp_registry/__init__.py +++ b/atomic_agents/mcp_registry/__init__.py @@ -200,13 +200,30 @@ def get_default_mcp_server_registry_backend( if raw_backend_id == "filesystem": return FilesystemMCPServerRegistryBackend(agent_root, read_paths) + elif raw_backend_id == "http": + # Lazy import: filesystem operators do not pay the httpx import cost. + from .http import make_http_mcp_server_registry_backend_from_url + + url = os.environ.get( + "ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND_URL", "" + ).strip() + if not url: + raise BackendNotRegistered( + "ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND=http requires " + "ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND_URL to be set. " + "Expected format: https://[:port]/?agent_scope=" + ) + return make_http_mcp_server_registry_backend_from_url(url) + # Unknown backend_id. Sanitize before echoing in the error message to # prevent credential leaks when operators accidentally paste a URL into # ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND instead of the _URL variable. + known_ids = {"filesystem", "http"} safe_backend_id = _redact_for_error_message(raw_backend_id) raise BackendNotRegistered( f"ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND={safe_backend_id!r} is not a " - f"known backend. Available: {list_mcp_server_registry_backends()}. " + f"known backend. Known: {sorted(known_ids)}. " + f"Available registered: {list_mcp_server_registry_backends()}. " f"Unset the env var to use the filesystem default." ) diff --git a/atomic_agents/mcp_registry/http.py b/atomic_agents/mcp_registry/http.py new file mode 100644 index 0000000..d824cc1 --- /dev/null +++ b/atomic_agents/mcp_registry/http.py @@ -0,0 +1,1163 @@ +"""HTTPMCPServerRegistryBackend -- HTTP-catalog reference implementation. + +Implements the full MCPServerRegistryBackend read paths (list, load, load_all, +validate, capabilities, refresh_capabilities, close) against a JSON-over-HTTPS +catalog server conforming to spec/36 Decision 4's three-tier wire contract. + +Install/uninstall stubs raise ``NotImplementedError`` at PR 4; write paths +ship at PR 5. + +Wire format (spec/36 PR 4 amendments): + GET /mcp-servers?agent_scope= + GET /mcp-servers?agent_scope=&expand=spec + GET /mcp-servers/?agent_scope= + GET /mcp-servers//validate?agent_scope= + GET /capabilities (optional; tier-3 probe) + +HTTP optional extra: ``pip install 'atomic-agents-stack[http]'``. +The ``httpx`` library is NOT imported at module level -- it loads lazily +at first method call so operators using only the filesystem backend do +not pay the import cost. + +Registration: importing this module registers ``"http"`` in the +``MCPServerRegistryBackend`` registry (per ``register_backend_placement_convention`` +and the ``locks/redis.py`` precedent). The registration call at the bottom +of this file fires at import time. +""" + +from __future__ import annotations + +import json +import logging +import os +import re +import threading +import time +from dataclasses import replace +from typing import Any +from urllib.parse import urlencode + +from .backend import ( + MCPRegistryAuthRequired, + MCPRegistryDescriptorInvalid, + MCPRegistryUnavailable, + MCPServerNotInRegistry, +) +from .types import MCPServerRef, MCPServerRegistryCapabilities, ValidationResult +from ..mcp import MCPServerSpec, _resolve_env_vars + +_logger = logging.getLogger(__name__) + +# Charset rule from MUST 1; mirrors filesystem.py and CorpusBackend. +_NAME_RE = re.compile(r"^[a-zA-Z0-9_.+@-]+$") +_PATH_TRAVERSAL_TOKENS = frozenset({"..", "."}) + +# Module-level lazy reference to the httpx package. +# Set to the actual ``httpx`` module on first successful import by ``_get_httpx()``. +_httpx: Any = None + + +def _get_httpx() -> Any: + """Return the ``httpx`` module, importing it lazily on first call. + + Raises ``ImportError`` with an operator-readable install instruction when + ``httpx`` is not installed. Mirrors the ``[redis]`` pattern in + ``locks/redis.py:481-487``. + """ + global _httpx + if _httpx is None: + try: + import httpx as _httpx_pkg + + _httpx = _httpx_pkg + except ImportError as exc: + raise ImportError( + "HTTPMCPServerRegistryBackend requires the 'httpx' extra. " + "Install via: pip install 'atomic-agents-stack[http]'" + ) from exc + return _httpx + + +def _validate_server_name(name: str) -> None: + """Raise ``ValueError`` when ``name`` fails the MUST 1 charset rule. + + Mirrors ``filesystem.py:_validate_server_name`` exactly. + """ + if not name: + raise ValueError("MCP server name must not be empty.") + if name.startswith("."): + raise ValueError( + f"MCP server name {name!r} must not start with '.'; " + f"leading-dot names are reserved for hidden files." + ) + if not _NAME_RE.match(name): + raise ValueError( + f"MCP server name {name!r} contains invalid characters. " + f"Allowed: [a-zA-Z0-9_.+@-]" + ) + if name in _PATH_TRAVERSAL_TOKENS: + raise ValueError( + f"MCP server name {name!r} is a path-traversal token and is not allowed." + ) + + +def _redact_url_for_error(url: str) -> str: + """Strip embedded credentials from a URL for safe operator-facing output. + + Handles ``https://user:pass@host/`` credential embedding by returning + ``https://...`` when ``://`` is present in the URL. Uses the same + ``_redact_for_error_message`` helper from ``mcp_registry/__init__.py`` + (per D-PR4-4 -- that module has the DSN heuristic the other modules lack). + """ + from . import _redact_for_error_message + + return _redact_for_error_message(url) + + +# ────────────────────────────────────────────────────────────────────────────── +# Wire-format parse helpers + + +def _parse_mcp_server_spec_from_dict(d: Any, *, url: str) -> MCPServerSpec: + """Parse and validate a single MCPServerSpec from a wire-format dict. + + Full shape validation per C-F3 (prep notes). Raises + ``MCPRegistryDescriptorInvalid`` on any malformed response: non-dict + input, missing required fields, wrong field types, name failing the + MUST 1 charset rule, or injection tokens in the ``name`` field. + + Args: + d: The value from the catalog server's JSON response expected to + be a server spec dict. + url: The redacted catalog URL to include in error messages. + """ + if not isinstance(d, dict): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server entry that is not a " + f"dict (got {type(d).__name__!r})" + ) + try: + name = d["name"] + command = d["command"] + except KeyError as exc: + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server entry missing required " + f"field {exc.args[0]!r}" + ) from exc + + if not isinstance(name, str) or not name: + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server entry with invalid " + f"'name' value: {name!r}" + ) + # Charset + injection defense: refuse names with path-traversal / newlines. + if not _NAME_RE.match(name) or name in _PATH_TRAVERSAL_TOKENS: + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server entry with non-conformant " + f"'name' field {name!r} (failed MUST 1 charset rule). " + f"Possible catalog-server injection; refusing response." + ) + + if not isinstance(command, str): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server entry with invalid " + f"'command' value (must be a string, got {type(command).__name__!r})" + ) + + args_raw = d.get("args", []) + if not isinstance(args_raw, list): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server entry where 'args' is " + f"not a list (got {type(args_raw).__name__!r})" + ) + for i, a in enumerate(args_raw): + if not isinstance(a, str): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server entry where " + f"'args[{i}]' is not a string (got {type(a).__name__!r})" + ) + + env_raw = d.get("env", {}) + if not isinstance(env_raw, dict): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server entry where 'env' is " + f"not a dict (got {type(env_raw).__name__!r})" + ) + for k, v in env_raw.items(): + if not isinstance(k, str) or not isinstance(v, str): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server entry where " + f"'env' contains a non-string-to-string mapping" + ) + + try: + return MCPServerSpec( + name=name, + command=command, + args=list(args_raw), + env=dict(env_raw), + transport=str(d.get("transport", "stdio")), + description=str(d.get("description", "")), + ) + except (TypeError, ValueError) as exc: + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a malformed server entry: {exc}" + ) from exc + + +def _parse_servers_list(data: Any, *, url: str) -> list[MCPServerSpec]: + """Parse a bulk server list (spec form) from a catalog response. + + Expects ``data`` to be a dict with a ``"servers"`` key whose value is a + list of MCPServerSpec-shaped dicts. Each entry is validated via + ``_parse_mcp_server_spec_from_dict``. + """ + if not isinstance(data, dict): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a non-dict response " + f"(got {type(data).__name__!r})" + ) + servers_raw = data.get("servers") + if servers_raw is None: + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} response is missing 'servers' key" + ) + if not isinstance(servers_raw, list): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned 'servers' that is not a list " + f"(got {type(servers_raw).__name__!r})" + ) + return [_parse_mcp_server_spec_from_dict(entry, url=url) for entry in servers_raw] + + +def _parse_servers_list_to_refs( + data: Any, *, url: str, source_url: str +) -> list[MCPServerRef]: + """Parse a server listing response into lightweight ``MCPServerRef`` objects. + + Expects the same ``{"servers": [...]}`` envelope as ``_parse_servers_list``, + but only extracts metadata fields (name, description, transport, version, + source). + + Args: + data: The catalog response body parsed as a Python dict. + url: The redacted catalog URL for inclusion in error messages + (operator-facing; credentials stripped). + source_url: The catalog URL used to build ``MCPServerRef.source`` + per spec/36 line 228 (raw URL). The spec requires the source + field be a usable URL for downstream navigation. If the + operator embeds credentials in ``catalog_url`` (instead of + using the auth_token env var), the credentials surface here. + The recommended operator pattern is the auth_token env var + (spec/36 §Operator surface). + """ + if not isinstance(data, dict): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a non-dict response " + f"(got {type(data).__name__!r})" + ) + servers_raw = data.get("servers") + if servers_raw is None: + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} response is missing 'servers' key" + ) + if not isinstance(servers_raw, list): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned 'servers' that is not a list " + f"(got {type(servers_raw).__name__!r})" + ) + refs = [] + for entry in servers_raw: + if not isinstance(entry, dict): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server list entry that is " + f"not a dict (got {type(entry).__name__!r})" + ) + name = entry.get("name") + if not isinstance(name, str) or not name: + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server list entry with " + f"invalid 'name' value: {name!r}" + ) + # Charset + injection defense. + if not _NAME_RE.match(name) or name in _PATH_TRAVERSAL_TOKENS: + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a server list entry with " + f"non-conformant 'name' field {name!r} (failed MUST 1 charset). " + f"Possible catalog-server injection; refusing response." + ) + refs.append( + MCPServerRef( + name=name, + description=str(entry.get("description", "")), + transport=str(entry.get("transport", "stdio")), + version=entry.get("version") or None, + source=f"{source_url}/mcp-servers/{name}", + ) + ) + return refs + + +def _parse_validation_result(data: Any, *, url: str) -> ValidationResult: + """Parse a ``/validate`` endpoint response into a ``ValidationResult``. + + Expects ``{"ok": bool, "errors": [...], "warnings": [...]}``. + """ + if not isinstance(data, dict): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a non-dict /validate response " + f"(got {type(data).__name__!r})" + ) + ok = data.get("ok") + if not isinstance(ok, bool): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} /validate response missing or non-bool 'ok' field" + ) + errors_raw = data.get("errors", []) + warnings_raw = data.get("warnings", []) + if not isinstance(errors_raw, list) or not isinstance(warnings_raw, list): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} /validate response has non-list 'errors' or " + f"'warnings' field" + ) + errors = [str(e) for e in errors_raw] + warnings = [str(w) for w in warnings_raw] + return ValidationResult(ok=ok, errors=errors, warnings=warnings) + + +def _parse_capabilities_response( + data: Any, *, url: str +) -> MCPServerRegistryCapabilities: + """Parse a ``GET /capabilities`` response into ``MCPServerRegistryCapabilities``. + + At PR 4, only read-path capabilities matter (``supports_install`` and + ``supports_uninstall`` may be True if the server reports tier 2+, but the + HTTP backend stubs those methods with ``NotImplementedError`` at PR 4). + The parsed values are stored for reporting; callers still get + ``NotImplementedError`` on the write paths until PR 5. + """ + if not isinstance(data, dict): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a non-dict /capabilities response" + ) + # Per spec/36 Decision 4 step 1: authoritative if 200. + # Parse with best-effort defaults; missing fields fall back to tier-1 safe values. + supports_install = bool(data.get("supports_install", False)) + supports_uninstall = bool(data.get("supports_uninstall", False)) + supports_audit = bool(data.get("supports_audit", False)) + return MCPServerRegistryCapabilities( + supports_install=supports_install, + supports_uninstall=supports_uninstall, + supports_capability_handshake=True, + supports_audit=supports_audit, + durable=True, + ) + + +def _materialize_spec(raw: MCPServerSpec, url: str) -> MCPServerSpec: + """Apply env-var resolution to a parsed spec (MUST 8). + + Returns a new MCPServerSpec with ``$VAR`` references in ``env`` resolved + against the client process environment. Both ``load_mcp_server`` and + ``load_all_mcp_servers`` route through this helper to guarantee structural + equivalence (MUST 10 / C-F4). + + Args: + raw: MCPServerSpec as parsed from wire; may contain unresolved ``$VAR`` + strings in ``env``. + url: Redacted catalog URL used only in exception messages from + ``_resolve_env_vars``. + """ + if not raw.env: + return raw + resolved_env = _resolve_env_vars(raw.env, raw.name) + return replace(raw, env=resolved_env) + + +def _handle_http_error( + exc: Any, + *, + url: str, + expect_404_means_not_found_for_name: str | None = None, +) -> None: + """Translate an ``httpx`` exception into the appropriate MCPRegistry exception. + + Central exception mapper per the prep notes table (D-PR4-5). Raises the + mapped exception; never returns normally. + + Args: + exc: The caught exception. + url: The redacted catalog URL for operator-facing messages. + expect_404_means_not_found_for_name: When set to a server name string, + a 404 ``HTTPStatusError`` raises ``MCPServerNotInRegistry`` for that + name instead of ``MCPRegistryUnavailable``. + """ + httpx = _get_httpx() + + if isinstance(exc, httpx.HTTPStatusError): + status = exc.response.status_code + if status == 401: + raise MCPRegistryAuthRequired( + f"catalog server at {url} returned 401 Unauthorized. " + f"Set ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN or pass " + f"auth_token= to the constructor." + ) from exc + if status == 404: + if expect_404_means_not_found_for_name is not None: + raise MCPServerNotInRegistry( + f"MCP server {expect_404_means_not_found_for_name!r} not " + f"found in catalog at {url} (HTTP 404)." + ) from exc + raise MCPRegistryUnavailable( + f"catalog server at {url} returned unexpected 404; " + f"the catalog server may be misconfigured (status={status})." + ) from exc + if status >= 500: + raise MCPRegistryUnavailable( + f"catalog server at {url} returned HTTP {status} (server error)." + ) from exc + # Other 4xx (400, 403, 409, 422, etc.) surface as Unavailable + # per prep notes B-F8: do NOT silently fall back on non-404 4xx. + raise MCPRegistryUnavailable( + f"catalog server at {url} returned unexpected HTTP {status}." + ) from exc + + if isinstance(exc, httpx.LocalProtocolError): + raise MCPRegistryDescriptorInvalid( + f"HTTP client sent an invalid request to catalog at {url}: {exc}" + ) from exc + + if isinstance(exc, httpx.DecodingError): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} response could not be decoded: {exc}" + ) from exc + + if isinstance(exc, httpx.TimeoutException): + raise MCPRegistryUnavailable( + f"request to catalog server at {url} timed out: {exc}" + ) from exc + + if isinstance(exc, httpx.NetworkError): + raise MCPRegistryUnavailable( + f"network error reaching catalog server at {url}: {exc}" + ) from exc + + if isinstance(exc, httpx.ProtocolError): + raise MCPRegistryUnavailable( + f"HTTP protocol error communicating with catalog at {url}: {exc}" + ) from exc + + # Final catch-all: any remaining httpx.HTTPError subclass. + if isinstance(exc, httpx.HTTPError): + raise MCPRegistryUnavailable( + f"HTTP error communicating with catalog at {url}: " + f"{type(exc).__name__}: {exc}" + ) from exc + + # httpx.InvalidURL does NOT inherit from HTTPError. + if isinstance(exc, httpx.InvalidURL): + raise ValueError(f"Invalid catalog URL {url!r}: {exc}") from exc + + # RuntimeError: raised by httpx.Client.send() when the client has been + # closed. Adversarial F2: a thread reading self._real_client outside the + # client_lock can hold a reference that gets closed by a concurrent + # close() call. Map to MCPRegistryUnavailable so the framework's + # MCPRegistry* catch-alls see a coherent exception. + if isinstance(exc, RuntimeError): + raise MCPRegistryUnavailable( + f"HTTP client communicating with catalog at {url} was closed " + f"mid-request: {exc}" + ) from exc + + # json.JSONDecodeError + if isinstance(exc, json.JSONDecodeError): + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} returned a response that is not valid JSON: {exc}" + ) from exc + + # Fallback: re-raise unknown exceptions unchanged. + raise exc + + +# ────────────────────────────────────────────────────────────────────────────── +# HTTPMCPServerRegistryBackend + + +class HTTPMCPServerRegistryBackend: + """HTTP-catalog implementation of ``MCPServerRegistryBackend`` (spec/36). + + Reads from a JSON-over-HTTPS catalog server conforming to spec/36 Decision 4. + Supports the full read path: ``list_mcp_servers``, ``load_mcp_server``, + ``load_all_mcp_servers``, ``validate``, ``capabilities``, + ``refresh_capabilities``, ``close``. Install/uninstall ship at PR 5. + + Tier negotiation (lazy probe on first non-construction call): + Step 1: ``GET /capabilities`` -> 200 parses tier from body. + Step 2: ``GET /capabilities`` -> 404 falls through to OPTIONS. + Step 3: ``OPTIONS /mcp-servers`` -> parses Allow header for tier. + Step 4: ``OPTIONS /mcp-servers`` -> 404 or 405, defaults to tier 1. + Step 5: Any 401/5xx/network error -> ``MCPRegistryUnavailable`` + or ``MCPRegistryAuthRequired``. + + Constructor parameters: + + ``catalog_url``: base URL of the catalog server, e.g. + ``https://catalog.example.com``. MUST NOT have embedded credentials + in the stored ``_catalog_url`` attribute (redacted version stored + in ``_safe_catalog_url``). + + ``agent_scope``: per-agent scope string appended as ``?agent_scope=`` + to every request. The catalog server filters its response to servers + mounted for this scope. + + ``auth_token``: optional bearer token. When set, every request includes + ``Authorization: Bearer ``. Catalog servers requiring auth respond + 401 if absent; backend raises ``MCPRegistryAuthRequired``. + + ``request_timeout_s``: per-request HTTP timeout in seconds (default 10.0). + Distinct from ``probe_failure_cache_s`` (failure cache window). + + ``probe_failure_cache_s``: seconds to cache a probe failure before re-probing + (default 60.0). Prevents thrashing when the catalog server is unreachable. + + ``_http_client``: test-only injectable seam (D-PR4-1). When set, the backend + uses this client instead of constructing a real ``httpx.Client``. Production + callers MUST NOT pass this parameter. + """ + + def __init__( + self, + catalog_url: str, + agent_scope: str, + *, + auth_token: str | None = None, + request_timeout_s: float = 10.0, + probe_failure_cache_s: float = 60.0, + _http_client: Any = None, + ) -> None: + # MUST 2: side-effect-free construction. No httpx import, no network + # call, no file open. All initialization is pure attribute assignment. + # + # Normalize catalog_url ONCE: strip trailing slashes AND any query + # string the operator may have appended (e.g., "https://host/?foo=bar" + # or "https://host/api?debug=1"). Without this, every URL-building site + # would produce malformed paths like + # "https://host/api?debug=1/mcp-servers?agent_scope=..." which httpx + # rejects or misroutes (Adversarial F4). The query-string-strip also + # prevents operator-provided URL params from leaking into the + # framework's wire format requests. Maintainability M4: single + # normalization site eliminates the six-site `.rstrip('/')` pattern. + # + # URL parse failures (e.g., invalid IPv6 brackets) MUST NOT raise at + # construction time per MUST 2. Fall back to storing the raw URL; the + # malformed URL will surface as a ValueError from httpx.InvalidURL at + # first network use, mapped through ``_handle_http_error``. + from urllib.parse import urlparse, urlunparse + + try: + _parsed = urlparse(catalog_url) + self._catalog_url = urlunparse( + (_parsed.scheme, _parsed.netloc, _parsed.path.rstrip("/"), "", "", "") + ) + except (ValueError, TypeError): + # Defer URL validation to first method call; preserve MUST 2. + self._catalog_url = catalog_url + self._safe_catalog_url = _redact_url_for_error(self._catalog_url) + self._agent_scope = agent_scope + self._auth_token = auth_token + self._request_timeout_s = request_timeout_s + self._probe_failure_cache_s = probe_failure_cache_s + self._http_client_override = _http_client + + # Capability cache state. + self._cached_capabilities: MCPServerRegistryCapabilities | None = None + self._probe_failure_cached_until: float = 0.0 + + # Lock guards the capability cache check+write. HTTP probe (slow I/O) + # runs OUTSIDE the lock per D-PR4-3 (avoid serializing concurrent + # callers against network latency). + self._capabilities_lock = threading.Lock() + + # Lock guards lazy httpx.Client construction (race-free on first call). + self._client_lock = threading.Lock() + + # The lazily-constructed real httpx.Client (None until first use). + self._real_client: Any = None + + # ─── Backend identity ───────────────────────────────────────────────── + + @property + def backend_id(self) -> str: + """Stable identifier matching the registry key (MUST 6a).""" + return "http" + + # ─── HTTP client management ─────────────────────────────────────────── + + def _get_client(self) -> Any: + """Return the ``httpx.Client`` to use for requests. + + If a test-only override was injected at construction, returns it. + Otherwise, lazily constructs a real ``httpx.Client`` on first call. + The construction is guarded by ``_client_lock`` to prevent a race + when multiple threads call an HTTP method concurrently for the first + time. + """ + if self._http_client_override is not None: + return self._http_client_override + + if self._real_client is not None: + return self._real_client + + with self._client_lock: + if self._real_client is not None: + return self._real_client + httpx = _get_httpx() + # Auth headers are applied per-request via ``_auth_headers`` so a + # test-injected ``_http_client`` (vanilla httpx.Client without the + # default Authorization header) still carries auth on every call. + self._real_client = httpx.Client(timeout=self._request_timeout_s) + return self._real_client + + def _auth_headers(self) -> dict[str, str]: + """Return per-request auth headers when ``auth_token`` is set. + + Returns ``{"Authorization": "Bearer "}`` when an auth_token is + configured; otherwise returns an empty dict. Applied on every HTTP + call rather than as ``httpx.Client`` default headers so that test + fixtures injecting a vanilla ``_http_client`` still carry auth + (Stream D D-F2; per spec MUST 4 the token never appears in error + messages or repr, only in outgoing request headers). + """ + if self._auth_token: + return {"Authorization": f"Bearer {self._auth_token}"} + return {} + + # ─── Capability negotiation ──────────────────────────────────────────── + + def _probe_capabilities(self) -> MCPServerRegistryCapabilities: + """Run the tier-negotiation probe sequence (Decision 4 steps 1-5). + + This method performs network I/O. It is called OUTSIDE any lock; + the caller acquires ``_capabilities_lock`` to update the cache after + this returns. + + Returns ``MCPServerRegistryCapabilities`` reflecting the catalog + server's tier. Raises ``MCPRegistryUnavailable`` or + ``MCPRegistryAuthRequired`` on failure. + """ + httpx = _get_httpx() + client = self._get_client() + url = self._safe_catalog_url + caps_url = f"{self._catalog_url}/capabilities" + + # Step 1: GET /capabilities (authoritative if 200). + try: + resp = client.get(caps_url, headers=self._auth_headers()) + if resp.status_code == 200: + try: + data = resp.json() + except json.JSONDecodeError as exc: + raise MCPRegistryDescriptorInvalid( + f"catalog server at {url} /capabilities response is " + f"not valid JSON: {exc}" + ) from exc + return _parse_capabilities_response(data, url=url) + + # Step 2: 404 on /capabilities means server doesn't implement it; + # fall through to OPTIONS probe. Other non-200 codes are errors. + if resp.status_code != 404: + # B-F8: non-404 4xx and 5xx must NOT silently fall back. + try: + resp.raise_for_status() + except httpx.HTTPStatusError as exc: + _handle_http_error(exc, url=url) + + except ( + httpx.LocalProtocolError, + httpx.DecodingError, + httpx.TimeoutException, + httpx.NetworkError, + httpx.ProtocolError, + httpx.HTTPError, + httpx.InvalidURL, + RuntimeError, + ) as exc: + _handle_http_error(exc, url=url) + + # Step 3: OPTIONS /mcp-servers to infer tier from Allow header. + servers_url = f"{self._catalog_url}/mcp-servers" + try: + resp = client.request("OPTIONS", servers_url, headers=self._auth_headers()) + + if resp.status_code in (404, 405): + # Step 4: fallback to tier 1. + return MCPServerRegistryCapabilities( + supports_install=False, + supports_uninstall=False, + supports_capability_handshake=True, + supports_audit=False, + durable=True, + ) + + # B-F8 / Adversarial F3 / Testing T-F1: non-200/non-404/non-405 + # responses on the OPTIONS step must NOT silently fall back to + # tier 1. A misconfigured token returning 401 or a server-side + # outage returning 5xx would otherwise produce a silent capability + # downgrade: a tier-3 server appearing as tier 1 with no audit + # signal. Mirror the GET /capabilities branch's discipline. + if resp.status_code == 401: + raise MCPRegistryAuthRequired( + f"catalog server at {url} returned 401 Unauthorized on " + f"OPTIONS /mcp-servers. Set " + f"ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN or pass " + f"auth_token= to the constructor." + ) + if resp.status_code >= 300 and resp.status_code != 200: + # 3xx redirects (with follow_redirects=False, httpx returns the + # redirect response directly; Allow header would be absent and + # tier inference would silently return tier 1). 4xx other than + # 401/404/405. 5xx server errors. All must surface. + try: + resp.raise_for_status() + except httpx.HTTPStatusError as exc: + _handle_http_error(exc, url=url) + # Defensive: if status >= 300 but raise_for_status doesn't + # raise (e.g., 3xx is treated as success by httpx in some + # configurations), still refuse the tier inference. + raise MCPRegistryUnavailable( + f"catalog server at {url} returned unexpected status " + f"{resp.status_code} on OPTIONS /mcp-servers; cannot " + f"infer tier." + ) + + # Parse Allow header per B-F6 (set-membership, not string equality). + allow_header = resp.headers.get("allow", "") + allowed = {m.strip().upper() for m in allow_header.split(",") if m.strip()} + + # Tier 2 if both POST and DELETE are allowed. + if {"GET", "POST", "DELETE"}.issubset(allowed): + supports_install = True + supports_uninstall = True + else: + supports_install = False + supports_uninstall = False + + return MCPServerRegistryCapabilities( + supports_install=supports_install, + supports_uninstall=supports_uninstall, + supports_capability_handshake=True, + supports_audit=False, + durable=True, + ) + + except ( + httpx.LocalProtocolError, + httpx.DecodingError, + httpx.TimeoutException, + httpx.NetworkError, + httpx.ProtocolError, + httpx.HTTPError, + httpx.InvalidURL, + RuntimeError, + ) as exc: + _handle_http_error(exc, url=url) + + # Should be unreachable; satisfy the type checker. + raise MCPRegistryUnavailable( # pragma: no cover + f"capability probe for catalog at {url} ended unexpectedly" + ) + + def _ensure_probed(self) -> None: + """Ensure the capability cache is populated, probing if necessary. + + Thread-safe: acquires ``_capabilities_lock`` to check the cache and + to write after a successful probe. The probe itself runs outside the + lock (per D-PR4-3) so concurrent callers don't serialize against + network latency. + + **Concurrent first-call contention.** Two threads that both observe + ``_cached_capabilities is None`` may both run ``_probe_capabilities`` + in parallel and each make one HTTP round trip. This is intentional: + serializing first-callers behind the lock would tie every concurrent + request to the latency of one probe. Last writer wins; the cache + stabilizes after one successful probe lands. The trade-off accepts + a small burst of N probes at startup in exchange for never blocking + on network I/O while holding the lock. + + After a probe failure, caches the failure for ``probe_failure_cache_s`` + seconds. Explicit ``refresh_capabilities()`` always bypasses the cache. + """ + with self._capabilities_lock: + if self._cached_capabilities is not None: + return + now = time.monotonic() + if now < self._probe_failure_cached_until: + raise MCPRegistryUnavailable( + f"catalog server at {self._safe_catalog_url} probe failed; " + f"failure cached for {self._probe_failure_cache_s:.0f}s. " + f"Call refresh_capabilities() to retry immediately." + ) + # Probe runs outside the lock (D-PR4-3). + try: + new_caps = self._probe_capabilities() + except (MCPRegistryUnavailable, MCPRegistryAuthRequired): + with self._capabilities_lock: + self._probe_failure_cached_until = ( + time.monotonic() + self._probe_failure_cache_s + ) + raise + + with self._capabilities_lock: + self._cached_capabilities = new_caps + self._probe_failure_cached_until = 0.0 + + # ─── Capability advertisement ───────────────────────────────────────── + + @property + def capabilities(self) -> MCPServerRegistryCapabilities: + """Return the current capability view. + + Before the first successful probe, returns a conservative default + (all write/audit capabilities False) per B-F11. This allows callers + to introspect before the first real method call without triggering a + network probe. The probe fires lazily on the first ``list``, + ``load``, ``load_all``, or ``validate`` call. + + After a successful probe, returns the runtime-negotiated view + reflecting the catalog server's actual tier. + """ + with self._capabilities_lock: + if self._cached_capabilities is not None: + return self._cached_capabilities + # Conservative pre-probe default per B-F11. + return MCPServerRegistryCapabilities( + supports_install=False, + supports_uninstall=False, + supports_capability_handshake=True, + supports_audit=False, + durable=True, + ) + + def refresh_capabilities(self) -> MCPServerRegistryCapabilities: + """Re-probe the catalog server and update the capability cache. + + Bypasses the failure cache. Operators call this after upgrading a + catalog server to a higher tier or after a transient outage clears. + + Per B-F5: the new capability construction runs outside the lock; the + cache assignment happens inside. + """ + new_caps = self._probe_capabilities() + with self._capabilities_lock: + self._cached_capabilities = new_caps + self._probe_failure_cached_until = 0.0 + return new_caps + + # ─── Core discovery ─────────────────────────────────────────────────── + + def list_mcp_servers(self) -> list[MCPServerRef]: + """Return lightweight server refs for this agent scope. + + Calls ``GET /mcp-servers?agent_scope=``. The catalog server + MUST filter to the mounted subset for this scope server-side + (spec/36 MUST 5; returning the org-wide catalog is non-conformant). + + Returns lexicographically sorted list (MUST 5). + """ + self._ensure_probed() + httpx = _get_httpx() + client = self._get_client() + url = self._safe_catalog_url + + query = urlencode({"agent_scope": self._agent_scope}) + request_url = f"{self._catalog_url}/mcp-servers?{query}" + + try: + resp = client.get(request_url, headers=self._auth_headers()) + resp.raise_for_status() + data = resp.json() + except httpx.HTTPStatusError as exc: + _handle_http_error(exc, url=url) + except ( + httpx.LocalProtocolError, + httpx.DecodingError, + httpx.TimeoutException, + httpx.NetworkError, + httpx.ProtocolError, + httpx.HTTPError, + httpx.InvalidURL, + RuntimeError, + ) as exc: + _handle_http_error(exc, url=url) + except json.JSONDecodeError as exc: + _handle_http_error(exc, url=url) + + refs = _parse_servers_list_to_refs(data, url=url, source_url=self._catalog_url) + return sorted(refs, key=lambda r: r.name) + + def load_mcp_server(self, name: str) -> MCPServerSpec: + """Return the fully-populated ``MCPServerSpec`` for the named server. + + Validates ``name`` charset BEFORE any network call (MUST 1). + Raises ``MCPServerNotInRegistry`` on HTTP 404. + Resolves ``$VAR`` env-var references at call time (MUST 8). + """ + _validate_server_name(name) + self._ensure_probed() + httpx = _get_httpx() + client = self._get_client() + url = self._safe_catalog_url + + query = urlencode({"agent_scope": self._agent_scope}) + request_url = f"{self._catalog_url}/mcp-servers/{name}?{query}" + + try: + resp = client.get(request_url, headers=self._auth_headers()) + resp.raise_for_status() + data = resp.json() + except httpx.HTTPStatusError as exc: + _handle_http_error( + exc, + url=url, + expect_404_means_not_found_for_name=name, + ) + except ( + httpx.LocalProtocolError, + httpx.DecodingError, + httpx.TimeoutException, + httpx.NetworkError, + httpx.ProtocolError, + httpx.HTTPError, + httpx.InvalidURL, + RuntimeError, + ) as exc: + _handle_http_error(exc, url=url) + except json.JSONDecodeError as exc: + _handle_http_error(exc, url=url) + + raw = _parse_mcp_server_spec_from_dict(data, url=url) + return _materialize_spec(raw, url) + + def load_all_mcp_servers(self) -> list[MCPServerSpec]: + """Return all mounted MCPServerSpec instances in bulk. + + Uses the ``?expand=spec`` bulk endpoint (single round trip; eliminates + N+1 HTTP cost at agent construction). Routes every entry through + ``_materialize_spec`` to apply env-var resolution, guaranteeing MUST 10 + structural equivalence with repeated ``load_mcp_server`` calls (C-F4). + + Returns lexicographically sorted list (MUST 5). + """ + self._ensure_probed() + httpx = _get_httpx() + client = self._get_client() + url = self._safe_catalog_url + + query = urlencode({"agent_scope": self._agent_scope, "expand": "spec"}) + request_url = f"{self._catalog_url}/mcp-servers?{query}" + + try: + resp = client.get(request_url, headers=self._auth_headers()) + resp.raise_for_status() + data = resp.json() + except httpx.HTTPStatusError as exc: + _handle_http_error(exc, url=url) + except ( + httpx.LocalProtocolError, + httpx.DecodingError, + httpx.TimeoutException, + httpx.NetworkError, + httpx.ProtocolError, + httpx.HTTPError, + httpx.InvalidURL, + RuntimeError, + ) as exc: + _handle_http_error(exc, url=url) + except json.JSONDecodeError as exc: + _handle_http_error(exc, url=url) + + raw_specs = _parse_servers_list(data, url=url) + materialized = [_materialize_spec(s, url) for s in raw_specs] + return sorted(materialized, key=lambda s: s.name) + + def validate(self, name: str) -> ValidationResult: + """Static check of the named server descriptor via the catalog server. + + Validates ``name`` charset BEFORE any network call (MUST 1). + + If the catalog server returns 404 on ``/validate`` (tier-1 servers + may not implement this endpoint), returns a + ``ValidationResult(ok=False, ...)`` that names the limitation rather + than raising an exception. + """ + try: + _validate_server_name(name) + except ValueError as exc: + return ValidationResult(ok=False, errors=[str(exc)], warnings=[]) + + self._ensure_probed() + httpx = _get_httpx() + client = self._get_client() + url = self._safe_catalog_url + + query = urlencode({"agent_scope": self._agent_scope}) + request_url = f"{self._catalog_url}/mcp-servers/{name}/validate?{query}" + + try: + resp = client.get(request_url, headers=self._auth_headers()) + if resp.status_code == 404: + # A-F2: 404 on /validate is ambiguous. The spec marks /validate + # OPTIONAL across all tiers, so the catalog may not implement + # it, OR the server name may simply be absent from the catalog. + # Honest message rather than asserting one interpretation. + return ValidationResult( + ok=False, + errors=[ + f"catalog server at {url} returned 404 for " + f"/mcp-servers/{name}/validate; either {name!r} is not " + f"in the catalog or the server does not implement " + f"/validate (the endpoint is OPTIONAL per spec/36)." + ], + warnings=[], + ) + resp.raise_for_status() + data = resp.json() + except httpx.HTTPStatusError as exc: + _handle_http_error(exc, url=url) + except ( + httpx.LocalProtocolError, + httpx.DecodingError, + httpx.TimeoutException, + httpx.NetworkError, + httpx.ProtocolError, + httpx.HTTPError, + httpx.InvalidURL, + RuntimeError, + ) as exc: + _handle_http_error(exc, url=url) + except json.JSONDecodeError as exc: + _handle_http_error(exc, url=url) + + return _parse_validation_result(data, url=url) + + # ─── Capability-gated write stubs (PR 5) ───────────────────────────── + + def install(self, spec: MCPServerSpec) -> MCPServerRef: + """Not implemented at PR 4. Ships at PR 5. + + Raises ``NotImplementedError`` unconditionally at this PR. + """ + # TODO(PR5): wire HTTP POST /mcp-servers write path with tier gating. + raise NotImplementedError("HTTP install/uninstall ships at PR 5") + + def uninstall(self, name: str) -> None: + """Not implemented at PR 4. Ships at PR 5. + + Raises ``NotImplementedError`` unconditionally at this PR. + """ + # TODO(PR5): wire HTTP DELETE /mcp-servers/ write path with tier gating. + raise NotImplementedError("HTTP install/uninstall ships at PR 5") + + # ─── Lifecycle ──────────────────────────────────────────────────────── + + def close(self) -> None: + """Close the underlying ``httpx.Client`` and release its resources. + + Idempotent (MUST 6b): calling ``close()`` twice does not raise. + Does NOT close the injected ``_http_client_override`` (the test caller + owns its lifecycle). + """ + if self._http_client_override is not None: + # Test-injected client; caller manages lifecycle. + return + with self._client_lock: + if self._real_client is not None: + try: + self._real_client.close() + except Exception: + _logger.debug( + "HTTPMCPServerRegistryBackend.close(): error closing " + "httpx.Client (ignored)", + exc_info=True, + ) + self._real_client = None + + +# ────────────────────────────────────────────────────────────────────────────── +# Factory function + + +def make_http_mcp_server_registry_backend_from_url( + url: str, +) -> HTTPMCPServerRegistryBackend: + """Construct an ``HTTPMCPServerRegistryBackend`` from a catalog URL. + + URL family: ``https://[:port]/?agent_scope=`` + + Reads ``agent_scope`` from the URL query parameter ``agent_scope`` + (default ``"default"`` when absent). Reads the optional bearer token + from ``ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN`` in the environment. + + Args: + url: Catalog server base URL. The ``agent_scope`` query parameter, if + present, is extracted and stripped from the base URL before passing + to the constructor. + + Returns: + Configured ``HTTPMCPServerRegistryBackend`` instance. + + Raises: + ValueError: if the URL is empty or has an unsupported scheme. + """ + from urllib.parse import urlparse, urlunparse, parse_qs + + if not url or not url.strip(): + raise ValueError("HTTPMCPServerRegistryBackend catalog URL must not be empty.") + + parsed = urlparse(url) + if parsed.scheme not in ("http", "https"): + # Redact embedded credentials per S-F1: if the operator pastes a URL + # like ftp://user:pass@host/ the raw URL must not appear in the error + # message. Mirrors the PR 1 P0 redaction discipline. + raise ValueError( + f"HTTPMCPServerRegistryBackend catalog URL must start with " + f"http:// or https://, got {_redact_url_for_error(url)!r}" + ) + + # Extract agent_scope from query params. + query_params = parse_qs(parsed.query, keep_blank_values=False) + agent_scope_values = query_params.pop("agent_scope", None) + agent_scope = agent_scope_values[0] if agent_scope_values else "default" + + # Rebuild the base URL without the agent_scope query param. + remaining_query = urlencode( + {k: v[0] for k, v in query_params.items()}, + doseq=False, + ) + catalog_url = urlunparse( + ( + parsed.scheme, + parsed.netloc, + parsed.path, + parsed.params, + remaining_query, + "", # strip fragment + ) + ) + + auth_token = os.environ.get("ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN") or None + + return HTTPMCPServerRegistryBackend( + catalog_url=catalog_url, + agent_scope=agent_scope, + auth_token=auth_token, + ) + + +# ────────────────────────────────────────────────────────────────────────────── +# Self-registration at import time. +# Mirrors ``FilesystemMCPServerRegistryBackend`` registration in ``__init__.py``. + +from . import register_mcp_server_registry_backend # noqa: E402 + +register_mcp_server_registry_backend("http", HTTPMCPServerRegistryBackend) diff --git a/atomic_agents/mcp_registry/types.py b/atomic_agents/mcp_registry/types.py index b1728c0..77e633c 100644 --- a/atomic_agents/mcp_registry/types.py +++ b/atomic_agents/mcp_registry/types.py @@ -79,12 +79,17 @@ def from_dict(cls, d: dict) -> MCPServerRef: Extra keys in ``d`` are silently ignored for forward-compatibility. Missing optional keys fall back to field defaults. + + Version normalization (A-F5): the wire format treats empty string and + absent key as equivalent to ``None`` for the ``version`` field. This + keeps round-trips byte-identical when catalog servers omit the field + or send an explicit ``""``. """ return cls( name=d["name"], description=d.get("description", ""), transport=d.get("transport", "stdio"), - version=d.get("version"), + version=d.get("version") or None, source=d.get("source", ""), ) diff --git a/atomic_agents/profile/types.py b/atomic_agents/profile/types.py index 8ef1054..2f70bc4 100644 --- a/atomic_agents/profile/types.py +++ b/atomic_agents/profile/types.py @@ -596,42 +596,32 @@ def _escalation_to_dict(e: Any) -> dict[str, Any]: def _mcp_spec_to_dict(spec: MCPServerSpec) -> dict[str, Any]: - """``MCPServerSpec`` → JSON-safe dict. + """``MCPServerSpec`` -> JSON-safe dict. + + Thin wrapper around ``MCPServerSpec.to_dict()`` (promoted to a public + class method at PR 4, D-PR4-2). Kept here for backward compatibility + with any internal callers; delegates to the class method so logic lives + in one place. Note: ``env`` contains parser-resolved values (the ``$VAR_NAME`` references in the source ``mcp.md`` are resolved to their literal env-var values at parse time). Database backends serializing this - SHOULD treat the ``env`` field as sensitive — do NOT log it in - cleartext, do NOT include it in audit shipped to third-party log - backends. The raw text on the profile (``mcp_md_raw``) is the - safe-to-ship form. + SHOULD treat the ``env`` field as sensitive. The raw text on the + profile (``mcp_md_raw``) is the safe-to-ship form. """ - return { - "name": spec.name, - "command": spec.command, - "args": list(spec.args), - "env": dict(spec.env), - "transport": spec.transport, - "description": spec.description, - } + return spec.to_dict() def _mcp_spec_from_dict(d: dict) -> MCPServerSpec: """Reconstruct ``MCPServerSpec`` from a dict produced by ``_mcp_spec_to_dict``. + Thin wrapper around ``MCPServerSpec.from_dict()`` (promoted to a public + class method at PR 4, D-PR4-2). Kept here for backward compatibility + with any internal callers. + Used by ``AgentProfile.from_dict`` for both ``mcp_servers`` and ``mcp_servers_resolved``. Closes a pre-existing latent bug where the ``mcp_servers`` fallback path returned raw dicts instead of ``MCPServerSpec`` instances. - - Extra keys in ``d`` are silently ignored for forward-compatibility. - Missing optional keys fall back to ``MCPServerSpec`` field defaults. """ - return MCPServerSpec( - name=d["name"], - command=d["command"], - args=list(d.get("args", [])), - env=dict(d.get("env", {})), - transport=d.get("transport", "stdio"), - description=d.get("description", ""), - ) + return MCPServerSpec.from_dict(d) diff --git a/docs/spec/36-mcp-server-registry-backend.md b/docs/spec/36-mcp-server-registry-backend.md index 6aad17e..9b775e7 100644 --- a/docs/spec/36-mcp-server-registry-backend.md +++ b/docs/spec/36-mcp-server-registry-backend.md @@ -485,6 +485,127 @@ Constructed from the URL family via `make_http_mcp_server_registry_backend_from_ --- +### HTTP wire format (PR 4) + +All HTTP endpoints accept and return JSON with snake_case keys. Every request includes `?agent_scope=` to identify the per-agent catalog partition. + +**MCPServerSpec field contract (catalog server authors).** Each server entry in the wire format has the following fields: + +| Field | Required | Type | Default | Notes | +|---|---|---|---|---| +| `name` | YES | string | (none) | MUST match charset `[a-zA-Z0-9_.+@-]+`; framework refuses path-traversal / newlines / control chars at parse boundary (MUST 1 + defense-in-depth against catalog injection) | +| `command` | YES | string | (none) | The executable for the MCP server subprocess | +| `args` | optional | list of strings | `[]` | Command-line arguments | +| `env` | optional | object mapping string to string | `{}` | Environment variables; `$VAR` references resolved client-side at materialization (MUST 8) | +| `transport` | optional | string | `"stdio"` | Only `"stdio"` supported at v1.0 | +| `description` | optional | string | `""` | Operator-readable note | + +**MCPServerRef field contract** (lightweight listing form returned by `GET /mcp-servers`): + +| Field | Required | Type | Default | Notes | +|---|---|---|---|---| +| `name` | YES | string | (none) | Same charset rule as MCPServerSpec | +| `description` | optional | string | `""` | Same as MCPServerSpec | +| `transport` | optional | string | `"stdio"` | Same as MCPServerSpec | +| `version` | optional | string OR null | `null` | Reserved; empty string is normalized to `null` for round-trip stability | +| `source` | optional | string | `""` | Backend-specific origin marker; for HTTP responses the framework sets this from the raw catalog URL on receipt (operators should not populate it server-side) | + +Extra keys on either shape are silently ignored for forward-compatibility with future wire format extensions. + +**GET /mcp-servers?agent_scope=scope** + +Returns the set of servers mounted for the given scope. Response shape: + +```json +{ + "servers": [ + {"name": "github", "description": "GitHub repo access", "transport": "stdio", "version": null, "source": ""}, + {"name": "filesystem-tools", "description": "", "transport": "stdio", "version": null, "source": ""} + ] +} +``` + +**GET /mcp-servers?agent_scope=scope&expand=spec** + +Returns the full `MCPServerSpec` shape for every mounted server. Used by `load_all_mcp_servers()` to eliminate N+1 round trips. Response shape: + +```json +{ + "servers": [ + {"name": "github", "command": "npx", "args": ["-y", "@mcp/github"], "env": {"GITHUB_TOKEN": "$GITHUB_PAT"}, "transport": "stdio", "description": ""}, + {"name": "filesystem-tools", "command": "npx", "args": ["-y", "@mcp/fs", "/data"], "env": {}, "transport": "stdio", "description": ""} + ] +} +``` + +**GET /mcp-servers/name?agent_scope=scope** + +Returns the full `MCPServerSpec` for a single mounted server. Response shape is the same as one entry from the `expand=spec` list above. Returns 404 when the name is not mounted for the scope. + +**GET /mcp-servers/name/validate?agent_scope=scope** + +Returns a static validation result. Response shape: + +```json +{"ok": true, "errors": [], "warnings": ["command 'npx' not found on catalog server PATH"]} +``` + +Returns 404 when the catalog server does not implement this endpoint (tier-1 servers are not required to). The HTTP backend returns `ValidationResult(ok=False, errors=["...does not implement /validate..."])` on 404 rather than raising. + +**GET /capabilities** + +Optional. Returns the catalog server's tier advertisement. Response shape: + +```json +{"tier": 2, "supports_install": true, "supports_uninstall": true, "supports_audit": false, "wire_version": "1.0"} +``` + +Returns 404 when the catalog server does not implement capability advertisement. Absence implies at most tier 2 (tier 3 is only detectable via this endpoint). + +--- + +### Tier negotiation (PR 4) + +The HTTP backend runs the following deterministic probe sequence on the first non-construction call. The probe fires outside any lock; the cache write happens inside `_capabilities_lock`. + +**Step 1:** `GET /capabilities`. If 200, parse the response body; the server's reported tier is authoritative. Values from this endpoint win over any inference. + +**Step 2:** If `GET /capabilities` returns 404 (endpoint absent), fall through to step 3. If it returns any other non-200 status, handle per the exception table below (401 raises `MCPRegistryAuthRequired`; 5xx raises `MCPRegistryUnavailable`; other 4xx raises `MCPRegistryUnavailable`). Non-404 4xx responses MUST NOT silently fall back to OPTIONS. + +**Step 3:** `OPTIONS /mcp-servers`. Parse the `Allow` response header using set-membership: `allowed = {m.strip().upper() for m in allow_header.split(",")}`. Tier inference: `{"GET", "POST", "DELETE"}.issubset(allowed)` implies tier 2 (read-write). GET-only implies tier 1 (read-only). Extra methods (HEAD, OPTIONS) do not affect inference. + +**Step 4:** If `OPTIONS /mcp-servers` returns 404 or 405, default to tier 1 (read-only). This is the conservative fallback; operators can call `refresh_capabilities()` after confirming their catalog server supports more. + +**Step 5:** Any network error, timeout, or 5xx at any probe step raises `MCPRegistryUnavailable`. The failure is cached for `probe_failure_cache_s` seconds; subsequent calls within the cache window raise immediately without re-probing. Explicit `refresh_capabilities()` always bypasses the failure cache. + +**Probe failure cache:** prevents thrashing when the catalog server is unreachable. With the 60-second default, sustained outage produces roughly 5 probes per 5 minutes rather than 30+. `request_timeout_s` (default 10s) is the per-request timeout; `probe_failure_cache_s` (default 60s) is the failure-window timeout. The two are deliberately separate. + +--- + +### Capability handshake (PR 4) + +The `supports_capability_handshake` flag distinguishes the HTTP backend from the filesystem backend. It is `True` on `HTTPMCPServerRegistryBackend` and `False` on `FilesystemMCPServerRegistryBackend`. + +Two levels of capability: + +**Static (class-level):** what the backend's class is capable of. `HTTPMCPServerRegistryBackend` supports capability negotiation by definition (`supports_capability_handshake=True`); this is constant regardless of probe state. + +**Runtime (probe-dependent):** what the connected catalog server actually allows. `supports_install` and `supports_uninstall` reflect the tier negotiation result after the first successful probe. Before the first probe, the `capabilities` property returns a conservative pre-probe default (all write capabilities `False`). + +The `capabilities` property returns the runtime view. Callers should use `backend.capabilities.supports_install` (property, not method call) to check before invoking write operations. Conformance tests assert claim-vs-behavior parity (MUST 3) so the runtime value is trustworthy. + +--- + +### Per-scope filtering (PR 4) + +Catalog servers MUST filter by `agent_scope` server-side. The `agent_scope` query parameter is included on every HTTP request: `?agent_scope=`. A catalog server that returns the org-wide catalog from `GET /mcp-servers?agent_scope=` (ignoring the scope parameter) is non-conformant with MUST 5. + +The scope is hardcoded from the `HTTPMCPServerRegistryBackend` constructor; no API method accepts a scope override. This mirrors the `SQLiteToolRegistryBackend` pattern where the scope is a constructor argument, not a per-call parameter. + +Scope isolation is the catalog server's responsibility at the storage layer. The framework's HTTP backend does not perform any client-side filtering; it passes the scope and trusts the catalog server to filter correctly. + +--- + ## Exception surface All exceptions live in `atomic_agents/exceptions.py` and are re-exported from `atomic_agents.mcp_registry` for ergonomic access (per the PersonaBackend D-PI-1 precedent). @@ -499,6 +620,24 @@ All exceptions live in `atomic_agents/exceptions.py` and are re-exported from `a - `NotImplementedError`: capability-gated method on a backend that doesn't support it at runtime. - `MCPServerConnectFailed`: re-raised from `load_mcp_server` when env-var resolution fails (matches existing spec/19 exception; not a new exception class). +**HTTP backend exception mapping (PR 4).** Every `httpx` exception is caught and mapped before escaping `http.py`. `httpx.InvalidURL` requires a separate `except` clause because it does NOT inherit from `httpx.HTTPError`. + +| httpx exception | Maps to | Condition | +|---|---|---| +| `httpx.HTTPStatusError` (401) | `MCPRegistryAuthRequired` | Auth required; no token provided. | +| `httpx.HTTPStatusError` (404 on /mcp-servers/name) | `MCPServerNotInRegistry` | Named server absent from catalog. | +| `httpx.HTTPStatusError` (404 on /mcp-servers collection) | `MCPRegistryUnavailable` | Tier-1 server must implement GET /mcp-servers. | +| `httpx.HTTPStatusError` (5xx) | `MCPRegistryUnavailable` | Server-side transient failure. | +| `httpx.HTTPStatusError` (other 4xx) | `MCPRegistryUnavailable` | Conservative; non-404 4xx MUST NOT silently fall back. | +| `httpx.LocalProtocolError` | `MCPRegistryDescriptorInvalid` | Client sent invalid HTTP (framework bug). | +| `httpx.DecodingError` | `MCPRegistryDescriptorInvalid` | Response body cannot be decoded. | +| `httpx.TimeoutException` (all variants) | `MCPRegistryUnavailable` | Connection or read timeout. | +| `httpx.NetworkError` (all variants) | `MCPRegistryUnavailable` | DNS failure, connection refused, dropped mid-response. | +| `httpx.ProtocolError` (all variants) | `MCPRegistryUnavailable` | Protocol-level HTTP error. | +| `httpx.HTTPError` (catch-all) | `MCPRegistryUnavailable` | Any other httpx subclass. | +| `httpx.InvalidURL` (separate clause) | `ValueError` | Operator config error; not a transient catalog failure. | +| `json.JSONDecodeError` | `MCPRegistryDescriptorInvalid` | Response body is not valid JSON. | + --- ## Registry @@ -536,6 +675,14 @@ The operator surface exposes the choice via TWO paths (parallel to `LogBackend` - `ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND_URL`: connection string for non-filesystem backends. HTTP shape: `https://catalog.example.com/?agent_scope=`. - `ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN`: bearer token for HTTP (optional; absence means unauthenticated requests). +**Default factory `http` branch (PR 4).** When `ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND=http`, `get_default_mcp_server_registry_backend` lazily imports `make_http_mcp_server_registry_backend_from_url` from `atomic_agents.mcp_registry.http` and reads `ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND_URL` (required). If that env var is absent or empty, raises `BackendNotRegistered` with an operator-readable message naming the required variable and expected URL format. The lazy import means filesystem operators never pay the `httpx` import cost. + +```sh +export ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND=http +export ATOMIC_AGENTS_MCP_SERVER_REGISTRY_BACKEND_URL="https://catalog.example.com/?agent_scope=my-agent" +export ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN="token123" # optional +``` + **Credential safety:** `get_default_mcp_server_registry_backend` sanitizes the BACKEND env var before echoing in error messages (strips anything following `://` and truncates at 32 chars). Mirrors the redaction helper from `logs/__init__.py:316` and `profile/__init__.py:_redact_for_error_message`. The constructor kwarg ALWAYS wins. Operator-config layering: env vars are deployment-level; the kwarg is per-agent-construction. diff --git a/pyproject.toml b/pyproject.toml index c7bd6c7..46dac83 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -33,6 +33,7 @@ dependencies = [ openai = ["openai>=1.30"] validation = ["jsonschema>=4.0"] redis = ["redis>=5.0,<8.0"] +http = ["httpx>=0.27"] dev = [ "pytest>=8.0", ] diff --git a/tests/test_mcp_server_registry_conformance.py b/tests/test_mcp_server_registry_conformance.py index d01d60a..6c10276 100644 --- a/tests/test_mcp_server_registry_conformance.py +++ b/tests/test_mcp_server_registry_conformance.py @@ -18,6 +18,11 @@ MUST 10 -- load_all_mcp_servers consistency (~5 tests) Per spec/36 and prep notes B-F4, B-F5, B-F6, B-F7, B-F8. + +HTTP backend parametrize added at PR 4 (prep notes E-F1, E-F3). The +backend_factory and populated_backend fixtures each grow an "http" branch +that constructs HTTPMCPServerRegistryBackend with an httpx.MockTransport +so capability tests do not cascade-fail due to real network calls. """ from __future__ import annotations @@ -26,6 +31,7 @@ from textwrap import dedent from unittest.mock import patch +import httpx import pytest from atomic_agents.mcp_registry import ( @@ -36,6 +42,7 @@ MCPServerRegistryCapabilities, ) from atomic_agents.mcp_registry.backend import _default_load_all +from atomic_agents.mcp_registry.http import HTTPMCPServerRegistryBackend from atomic_agents.mcp_registry.types import MCPServerRef from atomic_agents.mcp import MCPServerSpec, parse_mcp_md_text from atomic_agents.exceptions import MCPServerConnectFailed @@ -89,6 +96,85 @@ def make_mcp_md(agent_root: Path, specs: list[MCPServerSpec]) -> Path: return mcp_md +# ────────────────────────────────────────────────────────────────────────────── +# HTTP MockTransport helpers (prep notes E-F9) + + +def _default_mock_transport( + extra_servers: list[dict] | None = None, +) -> httpx.MockTransport: + """Return an httpx.MockTransport that responds successfully to the full probe + sequence and returns an optionally populated server catalog. + + Used by ``backend_factory`` and ``populated_backend`` HTTP branches so that + capability probe tests do not cascade-fail with MCPRegistryUnavailable. + + ``extra_servers`` is a list of wire-format server dicts (same shape as what + a catalog server returns in ``{"servers": [...]}``) that the transport will + serve on the list and bulk endpoints. + """ + servers = extra_servers or [] + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path.rstrip("/") + query = str(request.url.query) + method = request.method + + if method == "OPTIONS": + # OPTIONS probe for tier negotiation (Decision 4 step 2). + return httpx.Response(200, headers={"Allow": "GET"}) + + if path.endswith("/capabilities"): + return httpx.Response( + 200, + json={ + "tier": 1, + "supports_install": False, + "supports_uninstall": False, + "supports_audit": False, + "wire_version": "1.0.0", + }, + ) + + # Exact /mcp-servers collection endpoint (list or bulk; expand=spec query toggles). + if path.endswith("/mcp-servers"): + if "expand" in query: + # Bulk endpoint: GET /mcp-servers?expand=spec returns full specs. + return httpx.Response(200, json={"servers": servers}) + # Plain list endpoint: GET /mcp-servers returns refs (metadata only). + refs = [ + { + "name": s["name"], + "description": s.get("description", ""), + "transport": s.get("transport", "stdio"), + } + for s in servers + ] + return httpx.Response(200, json={"servers": refs}) + + # Validate endpoint: GET /mcp-servers//validate + if path.endswith("/validate") and "/mcp-servers/" in path: + name = path.rsplit("/", 2)[-2] + for s in servers: + if s["name"] == name: + return httpx.Response( + 200, json={"ok": True, "errors": [], "warnings": []} + ) + return httpx.Response(404, json={"error": "not found"}) + + # Per-name endpoint: GET /mcp-servers/ returns the single spec. + if "/mcp-servers/" in path: + name = path.rsplit("/", 1)[-1] + for s in servers: + if s["name"] == name: + return httpx.Response(200, json=s) + return httpx.Response(404, json={"error": "not found"}) + + return httpx.Response(404, json={"error": "not found"}) + + return httpx.MockTransport(_handler) + + # ────────────────────────────────────────────────────────────────────────────── # Fixtures @@ -101,38 +187,76 @@ def tmp_agent_root(tmp_path: Path) -> Path: return root -@pytest.fixture(params=["filesystem"]) +@pytest.fixture(params=["filesystem", "http"]) def backend_factory(request, tmp_path: Path): """Parametrized fixture that returns a constructed backend instance. - HTTP backend joins at PR 4 via an additional params entry: - params=["filesystem", "http"] - with an ``elif request.param == "http":`` branch here. + Parametrized across "filesystem" and "http" backends. The HTTP branch + uses an httpx.MockTransport that responds successfully to the full + Decision 4 probe sequence so capability tests do not cascade-fail. + Per prep notes E-F3. """ if request.param == "filesystem": agent_root = tmp_path / "agent-for-backend" agent_root.mkdir() return FilesystemMCPServerRegistryBackend(agent_root, []) + elif request.param == "http": + transport = _default_mock_transport() + client = httpx.Client(transport=transport) + return HTTPMCPServerRegistryBackend( + catalog_url="http://catalog.example.invalid", + agent_scope="test-scope", + _http_client=client, + ) raise ValueError(f"Unknown backend param: {request.param!r}") -@pytest.fixture -def populated_backend(tmp_path: Path): - """Backend pre-populated with 3 specs in mcp.md. +@pytest.fixture(params=["filesystem", "http"]) +def populated_backend(request, tmp_path: Path): + """Backend pre-populated with 3 specs. Used for MUST 10 consistency tests. Returns (backend, specs) so tests can reference the specs used for population. + + Parametrized across "filesystem" and "http" backends (prep notes E-F1). + The HTTP branch provides a MockTransport serving the same 3 servers + consistently across GET /mcp-servers, GET /mcp-servers?expand=spec, + and GET /mcp-servers/ endpoints. """ - agent_root = tmp_path / "populated-agent" - agent_root.mkdir() specs = [ _make_mcp_spec("alpha-server", description="First server"), _make_mcp_spec("beta-server", description="Second server"), _make_mcp_spec("gamma-server", description="Third server"), ] - make_mcp_md(agent_root, specs) - backend = FilesystemMCPServerRegistryBackend(agent_root, []) - return backend, specs + + if request.param == "filesystem": + agent_root = tmp_path / "populated-agent" + agent_root.mkdir() + make_mcp_md(agent_root, specs) + backend = FilesystemMCPServerRegistryBackend(agent_root, []) + return backend, specs + elif request.param == "http": + # Wire format: each spec becomes a dict compatible with the HTTP wire shape. + wire_servers = [ + { + "name": s.name, + "command": s.command, + "args": s.args, + "env": s.env, + "transport": s.transport, + "description": s.description, + } + for s in specs + ] + transport = _default_mock_transport(extra_servers=wire_servers) + client = httpx.Client(transport=transport) + backend = HTTPMCPServerRegistryBackend( + catalog_url="http://catalog.example.invalid", + agent_scope="test-scope", + _http_client=client, + ) + return backend, specs + raise ValueError(f"Unknown backend param: {request.param!r}") # ────────────────────────────────────────────────────────────────────────────── diff --git a/tests/test_mcp_server_registry_http_backend.py b/tests/test_mcp_server_registry_http_backend.py new file mode 100644 index 0000000..4267ca1 --- /dev/null +++ b/tests/test_mcp_server_registry_http_backend.py @@ -0,0 +1,1624 @@ +"""HTTP backend tests for HTTPMCPServerRegistryBackend (spec/36 PR 4 of 5). + +Covers the read-path contract for the HTTP backend: capability tier negotiation, +MUST 2 side-effect-free construction, httpx exception mapping, response body +validation, MUST 10 bulk endpoint consistency, auth header injection, and URL +credential redaction. + +All network I/O is intercepted via ``httpx.MockTransport`` -- no real HTTP +requests are made and no extra dev dependency is needed (httpx arrives +transitively via the ``mcp>=1.0.0`` dependency). + +The ``_http_client`` constructor kwarg on ``HTTPMCPServerRegistryBackend`` is the +injection seam: tests pass ``httpx.Client(transport=MockTransport(handler))``; +production callers pass ``None`` (lazy real client constructed at first use). + +Test categories: + a) MUST 2 side-effect-free construction (3 tests) + b) Capability probe branches (8 tests, Decision 4 steps 1-5 + B-F8 additions + + reordered Allow header) + c) httpx exception mapping (7 tests, full table from prep notes D) + d) Response body validation (5 tests, defense-in-depth per E-F7 / C-F3) + e) Bulk endpoint MUST 10 consistency (3 tests) + f) Auth + URL credential redaction (4 tests) + g) Capability lifecycle + properties (3 tests) + h) Lazy httpx import guard (1 test) + i) Review-army follow-up tests (added during /ship review): + - MCPServerSpec.to_dict/from_dict public round-trip (T-F4) + - OPTIONS probe non-404/405 status handling (T-F1 / Adv-F3) + - InvalidURL strict mapping via injection (T-F2) + - MUST 10 full field equality (T-F3 / C-F4) + - httpx.DecodingError mapping (T-F5) + - Concurrent first-call probe (T-F6 / Adv-F7) + - agent_scope query param forwarding (T-F7) + - Successful probe cached (T-F8) + - Factory function tests (T-F9) + - RuntimeError on closed client (Adv-F2) + - catalog_url query string normalization (Adv-F4) + - MCPServerRef.source uses raw catalog_url per spec (A-F1) + - Factory ValueError redacts credentials (S-F1) +""" + +from __future__ import annotations + +import sys +from typing import Callable + +import httpx +import pytest + +from atomic_agents.mcp import MCPServerSpec +from atomic_agents.mcp_registry import ( + MCPRegistryAuthRequired, + MCPRegistryDescriptorInvalid, + MCPRegistryUnavailable, +) +from atomic_agents.mcp_registry.http import ( + HTTPMCPServerRegistryBackend, +) + + +# ────────────────────────────────────────────────────────────────────────────── +# Test scaffold helpers + + +def _capabilities_response( + tier: int = 1, + *, + supports_install: bool = False, + supports_uninstall: bool = False, + supports_audit: bool = False, + wire_version: str = "1.0.0", + **overrides, +) -> dict: + """Build a /capabilities response body for the given tier. + + Callers may override any field via keyword args. + """ + body = { + "tier": tier, + "supports_install": supports_install, + "supports_uninstall": supports_uninstall, + "supports_audit": supports_audit, + "wire_version": wire_version, + } + body.update(overrides) + return body + + +def _spec_to_wire_json(spec: MCPServerSpec) -> dict: + """Convert an MCPServerSpec to its expected wire JSON shape (to_dict).""" + return spec.to_dict() + + +def _make_mock_transport( + routes: dict[ + tuple[str, str], httpx.Response | Callable[[httpx.Request], httpx.Response] + ], +) -> httpx.MockTransport: + """Build an httpx.MockTransport from a dispatch table. + + ``routes`` is keyed on ``(METHOD, path_prefix)`` tuples. The handler + checks each key as a string-prefix match on ``request.url.path``. + + If no route matches, the transport returns 404 with an empty JSON body to + simulate a minimal tier-1 server that doesn't implement optional endpoints. + + Default routes included when not overridden: + - ``GET /capabilities`` -> 200 with tier-1 body + - ``GET /mcp-servers`` -> 200 with ``{"servers": []}`` + """ + defaults: dict[tuple[str, str], httpx.Response] = { + ("GET", "/capabilities"): httpx.Response( + 200, json=_capabilities_response(tier=1) + ), + ("GET", "/mcp-servers"): httpx.Response(200, json={"servers": []}), + } + # Caller-supplied routes override defaults. + merged = {**defaults, **routes} + + def _handler(request: httpx.Request) -> httpx.Response: + method = request.method + path = request.url.path + # Exact match first, then prefix match. + for (route_method, route_path), response_or_fn in merged.items(): + if route_method != method: + continue + if ( + path == route_path + or path.startswith(route_path + "?") + or path.startswith(route_path + "/") + ): + if callable(response_or_fn) and not isinstance( + response_or_fn, httpx.Response + ): + return response_or_fn(request) + return response_or_fn # type: ignore[return-value] + # No match: 404 + return httpx.Response(404, json={"error": "not found"}) + + return httpx.MockTransport(_handler) + + +def _make_server_wire_entry( + name: str, + command: str = "echo", + args: list[str] | None = None, + env: dict[str, str] | None = None, + transport: str = "stdio", + description: str = "", +) -> dict: + """Build a wire-format server entry dict as a catalog server would return.""" + return { + "name": name, + "command": command, + "args": args or [], + "env": env or {}, + "transport": transport, + "description": description, + } + + +def _make_backend( + catalog_url: str = "http://catalog.example.invalid", + agent_scope: str = "test-scope", + auth_token: str | None = None, + probe_failure_cache_s: float = 0.5, + transport: httpx.MockTransport | None = None, +) -> HTTPMCPServerRegistryBackend: + """Construct an HTTPMCPServerRegistryBackend with an injected MockTransport. + + Uses ``probe_failure_cache_s=0.5`` by default so failure-cache tests run + quickly without sleeping more than a second. + """ + if transport is None: + transport = _make_mock_transport({}) + client = httpx.Client(transport=transport) + return HTTPMCPServerRegistryBackend( + catalog_url=catalog_url, + agent_scope=agent_scope, + auth_token=auth_token, + probe_failure_cache_s=probe_failure_cache_s, + _http_client=client, + ) + + +# ────────────────────────────────────────────────────────────────────────────── +# a) MUST 2 -- side-effect-free construction + + +def test_http_construction_does_not_call_network() -> None: + """Constructor must not make any network calls (spec/36 MUST 2 HTTP edition). + + Build a transport that counts all calls. Construct the backend. Assert + zero calls were made. + """ + call_count: list[int] = [0] + + def _counting_handler(request: httpx.Request) -> httpx.Response: + call_count[0] += 1 + return httpx.Response(200, json=_capabilities_response()) + + transport = httpx.MockTransport(_counting_handler) + client = httpx.Client(transport=transport) + + HTTPMCPServerRegistryBackend( + catalog_url="http://catalog.example.invalid", + agent_scope="test-scope", + _http_client=client, + ) + + assert call_count[0] == 0, ( + f"Constructor must not make network calls; got {call_count[0]} call(s)." + ) + + +def test_http_probe_fires_on_first_method_call_not_construction() -> None: + """Probe fires on first list_mcp_servers(), not on construction. + + After construction: call count == 0. + After list_mcp_servers(): call count >= 1 (at least the probe fired). + """ + call_count: list[int] = [0] + + def _counting_handler(request: httpx.Request) -> httpx.Response: + call_count[0] += 1 + path = request.url.path + if path == "/capabilities": + return httpx.Response(200, json=_capabilities_response(tier=1)) + if "/mcp-servers" in path: + return httpx.Response(200, json={"servers": []}) + return httpx.Response(404, json={}) + + transport = httpx.MockTransport(_counting_handler) + client = httpx.Client(transport=transport) + + backend = HTTPMCPServerRegistryBackend( + catalog_url="http://catalog.example.invalid", + agent_scope="test-scope", + _http_client=client, + ) + + assert call_count[0] == 0, "No calls expected after construction." + + backend.list_mcp_servers() + + assert call_count[0] >= 1, ( + f"Probe must have fired after first method call; got {call_count[0]} call(s)." + ) + + +def test_http_construction_with_invalid_url_does_not_raise() -> None: + """Construction with an invalid URL must not raise (validation is deferred to first call). + + spec/36 MUST 2 -- side-effect-free construction. The URL is validated lazily + when the first network operation is attempted. This matches the filesystem + backend's pattern of deferring I/O to first use. + """ + # This must not raise. + backend = HTTPMCPServerRegistryBackend( + catalog_url="not-a-url", + agent_scope="test-scope", + ) + assert backend is not None + + +# ────────────────────────────────────────────────────────────────────────────── +# b) Capability probe branches + + +def test_probe_branch_1_get_capabilities_200_returns_tier_from_body() -> None: + """Probe step 1: GET /capabilities 200 -> parse tier from response body. + + spec/36 Decision 4 step 1. After list_mcp_servers(), capabilities reflect + the body's supports_audit field. + """ + transport = _make_mock_transport( + { + ("GET", "/capabilities"): httpx.Response( + 200, + json=_capabilities_response( + tier=3, + supports_install=True, + supports_uninstall=True, + supports_audit=True, + ), + ), + } + ) + backend = _make_backend(transport=transport) + backend.list_mcp_servers() + + assert backend.capabilities.supports_audit is True, ( + "Probe step 1: supports_audit must be True when catalog server reports tier 3." + ) + + +def test_probe_branch_2a_options_allow_get_only_tier_1() -> None: + """Probe step 2a: GET /capabilities 404, OPTIONS /mcp-servers Allow: GET -> tier 1. + + spec/36 Decision 4 step 2 / B-F6. Only GET in Allow header -> conservative + tier 1 (supports_install=False). + """ + transport = _make_mock_transport( + { + ("GET", "/capabilities"): httpx.Response(404, json={}), + ("OPTIONS", "/mcp-servers"): httpx.Response(200, headers={"Allow": "GET"}), + } + ) + backend = _make_backend(transport=transport) + backend.list_mcp_servers() + + assert backend.capabilities.supports_install is False, ( + "Probe step 2a: GET-only Allow header must produce tier 1 (supports_install=False)." + ) + + +def test_probe_branch_2b_options_allow_get_post_delete_tier_2() -> None: + """Probe step 2b: GET /capabilities 404, OPTIONS Allow: GET, POST, DELETE -> tier 2. + + spec/36 Decision 4 step 2 / B-F6. Presence of GET+POST+DELETE in Allow header + indicates write capability (supports_install=True). + """ + transport = _make_mock_transport( + { + ("GET", "/capabilities"): httpx.Response(404, json={}), + ("OPTIONS", "/mcp-servers"): httpx.Response( + 200, headers={"Allow": "GET, POST, DELETE"} + ), + } + ) + backend = _make_backend(transport=transport) + backend.list_mcp_servers() + + assert backend.capabilities.supports_install is True, ( + "Probe step 2b: GET+POST+DELETE Allow header must produce tier 2 (supports_install=True)." + ) + + +def test_probe_branch_2b_options_allow_reordered_extra_methods_still_tier_2() -> None: + """Probe step 2b: reordered or extra methods in Allow still produce tier 2. + + spec/36 B-F6 -- Allow header parsing uses set-membership, not string-equality. + Extra methods (HEAD, OPTIONS) do not affect tier inference. + """ + transport = _make_mock_transport( + { + ("GET", "/capabilities"): httpx.Response(404, json={}), + ("OPTIONS", "/mcp-servers"): httpx.Response( + 200, headers={"Allow": "DELETE, GET, HEAD, POST, OPTIONS"} + ), + } + ) + backend = _make_backend(transport=transport) + backend.list_mcp_servers() + + assert backend.capabilities.supports_install is True, ( + "Probe step 2b: set-membership check must handle reordered / extra Allow methods." + ) + + +def test_probe_branch_3_options_404_default_tier_1() -> None: + """Probe step 3: GET /capabilities 404, OPTIONS /mcp-servers 404 -> tier 1 default. + + spec/36 Decision 4 step 3. Conservative fallback when both probe steps fail. + """ + transport = _make_mock_transport( + { + ("GET", "/capabilities"): httpx.Response(404, json={}), + ("OPTIONS", "/mcp-servers"): httpx.Response(404, json={}), + } + ) + backend = _make_backend(transport=transport) + backend.list_mcp_servers() + + assert backend.capabilities.supports_install is False, ( + "Probe step 3: double-404 must fall back to conservative tier 1." + ) + + +def test_probe_branch_4_5xx_raises_unavailable_and_caches_failure() -> None: + """Probe step 4: GET /capabilities 500 -> MCPRegistryUnavailable; failure is cached. + + spec/36 Decision 4 step 4. A second call within probe_failure_cache_s must NOT + make a second HTTP request (call count stays at 1) and must still raise. + """ + call_count: list[int] = [0] + + def _handler(request: httpx.Request) -> httpx.Response: + call_count[0] += 1 + if "/capabilities" in request.url.path: + return httpx.Response(500, json={"error": "server error"}) + return httpx.Response(200, json={"servers": []}) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport, probe_failure_cache_s=0.5) + + # First call: probe fires, raises. + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + count_after_first = call_count[0] + assert count_after_first >= 1 + + # Second call within cache window: must NOT re-probe; must still raise. + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + assert call_count[0] == count_after_first, ( + "Second call within failure cache window must not make additional HTTP calls." + ) + + +def test_probe_branch_5_401_raises_auth_required() -> None: + """Probe step 5: GET /capabilities 401 -> MCPRegistryAuthRequired. + + spec/36 Decision 4 step 5 / MUST 7. + """ + transport = _make_mock_transport( + { + ("GET", "/capabilities"): httpx.Response( + 401, json={"error": "unauthorized"} + ), + } + ) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryAuthRequired): + backend.list_mcp_servers() + + +def test_probe_branch_403_does_not_silent_fallback_tier_1() -> None: + """Probe: GET /capabilities 403 must raise, NOT silently fall back to tier 1. + + spec/36 B-F8. Non-404 4xx status codes on /capabilities MUST raise + MCPRegistryUnavailable. Silent tier-1 fallback would mask misconfiguration. + """ + transport = _make_mock_transport( + { + ("GET", "/capabilities"): httpx.Response(403, json={"error": "forbidden"}), + } + ) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + +# ────────────────────────────────────────────────────────────────────────────── +# c) httpx exception mapping + + +def test_httpx_connect_error_maps_to_unavailable() -> None: + """httpx.ConnectError -> MCPRegistryUnavailable. + + spec/36 D exception mapping table. DNS failure / connection refused. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + raise httpx.ConnectError("connection refused", request=request) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + +def test_httpx_read_timeout_maps_to_unavailable() -> None: + """httpx.ReadTimeout -> MCPRegistryUnavailable. + + spec/36 D exception mapping table. Response body too slow. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + raise httpx.ReadTimeout("read timeout", request=request) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + +def test_httpx_pool_timeout_maps_to_unavailable() -> None: + """httpx.PoolTimeout -> MCPRegistryUnavailable. + + spec/36 D exception mapping table. Connection pool exhausted. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + raise httpx.PoolTimeout("pool exhausted", request=request) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + +def test_httpx_local_protocol_error_maps_to_descriptor_invalid() -> None: + """httpx.LocalProtocolError -> MCPRegistryDescriptorInvalid. + + spec/36 D exception mapping table. Client sent invalid HTTP (framework bug). + Mapping to DescriptorInvalid rather than Unavailable surfaces the bug instead + of masking it as a transient failure. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + raise httpx.LocalProtocolError("local protocol error") + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryDescriptorInvalid): + backend.list_mcp_servers() + + +def test_httpx_invalid_url_maps_to_value_error() -> None: + """httpx.InvalidURL -> ValueError. + + spec/36 D exception mapping table. httpx.InvalidURL does NOT inherit from + httpx.HTTPError; it requires a separate except clause. Surfaces as ValueError + so operators can distinguish URL misconfiguration from transient failures. + """ + # Construct backend with a malformed URL to trigger InvalidURL on first call. + backend = HTTPMCPServerRegistryBackend( + catalog_url="not-a-valid-url://[::invalid", + agent_scope="test-scope", + ) + + with pytest.raises((ValueError, MCPRegistryUnavailable)): + # Accept either ValueError (InvalidURL mapped) or MCPRegistryUnavailable + # (any other network error from httpx). The key is it must NOT propagate + # a bare httpx.InvalidURL. + backend.list_mcp_servers() + + +def test_httpx_unknown_subclass_maps_to_unavailable_via_http_error_catchall() -> None: + """An unknown httpx.HTTPError subclass -> MCPRegistryUnavailable via catch-all. + + spec/36 D exception mapping table last row. The final catch-all for any future + httpx subclass must route to MCPRegistryUnavailable so the framework does not + expose raw httpx types to callers. + """ + + class _UnknownHTTPError(httpx.HTTPError): + """Synthetic future httpx exception class.""" + + def __init__(self) -> None: + super().__init__("unknown future error") + + def _handler(request: httpx.Request) -> httpx.Response: + raise _UnknownHTTPError() + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + +# ────────────────────────────────────────────────────────────────────────────── +# d) Response body validation + + +def test_list_malformed_json_raises_descriptor_invalid() -> None: + """GET /mcp-servers returning non-JSON body -> MCPRegistryDescriptorInvalid. + + spec/36 E-F7 / C-F3 defense-in-depth. The parser-level failure (body cannot + be decoded as JSON at all) must raise DescriptorInvalid rather than leaking + a json.JSONDecodeError or httpx.DecodingError. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + # Return non-JSON body for the mcp-servers list. + return httpx.Response( + 200, + content=b"not json at all", + headers={"content-type": "application/json"}, + ) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryDescriptorInvalid): + backend.list_mcp_servers() + + +def test_list_missing_servers_key_raises_descriptor_invalid() -> None: + """GET /mcp-servers returning JSON without 'servers' key -> MCPRegistryDescriptorInvalid. + + spec/36 E-F7 / C-F3 shape-level validation. The body parsed but has the wrong + structure. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + return httpx.Response(200, json={"data": []}) # 'servers' key is absent + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryDescriptorInvalid): + backend.list_mcp_servers() + + +def test_load_missing_required_field_raises_descriptor_invalid() -> None: + """GET /mcp-servers/ returning spec without 'command' -> MCPRegistryDescriptorInvalid. + + spec/36 E-F7 / C-F3. The spec body has the right shape but is missing a required + field. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + if path.endswith("/mcp-servers") or "/mcp-servers?" in path: + return httpx.Response( + 200, + json={ + "servers": [ + {"name": "incomplete-server", "description": "missing command"} + ] + }, + ) + if "/mcp-servers/" in path: + # Return a spec without the required 'command' field. + return httpx.Response(200, json={"name": "incomplete-server"}) + return httpx.Response(404, json={}) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryDescriptorInvalid): + backend.load_mcp_server("incomplete-server") + + +def test_load_injection_in_name_field_rejected() -> None: + """Server returned with injection in 'name' field -> MCPRegistryDescriptorInvalid. + + spec/36 E-F7. The name field must satisfy MUST 1 charset. A name containing + newlines or '## ' is an injection attempt and must be rejected. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + if path.endswith("/mcp-servers") or "/mcp-servers?" in path: + return httpx.Response( + 200, + json={ + "servers": [ + { + "name": "evil\n## injection", + "command": "echo", + } + ] + }, + ) + return httpx.Response(404, json={}) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + # The backend must refuse to return a server with an injected name. + with pytest.raises((MCPRegistryDescriptorInvalid, ValueError)): + backend.list_mcp_servers() + + +def test_validation_result_missing_ok_raises_descriptor_invalid() -> None: + """GET /mcp-servers//validate returning body without 'ok' -> MCPRegistryDescriptorInvalid. + + spec/36 E-F7. The ValidationResult wire shape requires 'ok' as a bool. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + if path.endswith("/validate"): + # Return a validation result missing the required 'ok' field. + return httpx.Response(200, json={"errors": [], "warnings": []}) + if path.endswith("/mcp-servers") or "/mcp-servers?" in path: + return httpx.Response( + 200, + json={"servers": [{"name": "test-server", "command": "echo"}]}, + ) + if "/mcp-servers/" in path: + return httpx.Response( + 200, + json={ + "name": "test-server", + "command": "echo", + "args": [], + "env": {}, + "transport": "stdio", + }, + ) + return httpx.Response(404, json={}) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + with pytest.raises(MCPRegistryDescriptorInvalid): + backend.validate("test-server") + + +# ────────────────────────────────────────────────────────────────────────────── +# e) Bulk endpoint MUST 10 consistency + + +def test_load_all_uses_bulk_endpoint_one_call() -> None: + """load_all_mcp_servers() makes exactly ONE network call for the bulk fetch. + + spec/36 MUST 10. The HTTP backend must use GET /mcp-servers?expand=spec (or + equivalent bulk endpoint) instead of N separate per-name requests. + """ + bulk_calls: list[str] = [] + per_name_calls: list[str] = [] + + servers = [ + _make_server_wire_entry(f"server-{i}", description=f"Server {i}") + for i in range(3) + ] + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path + query = str(request.url.query) + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + if "/mcp-servers" in path and "expand" in query: + bulk_calls.append(path) + return httpx.Response(200, json={"servers": servers}) + if "/mcp-servers" in path and not any( + c in path.split("/mcp-servers")[-1] + for c in ["/server-0", "/server-1", "/server-2"] + ): + # Plain list endpoint. + refs = [ + { + "name": s["name"], + "description": s["description"], + "transport": s["transport"], + } + for s in servers + ] + return httpx.Response(200, json={"servers": refs}) + # Per-name endpoint. + per_name_calls.append(path) + for s in servers: + if path.endswith(f"/{s['name']}"): + return httpx.Response(200, json=s) + return httpx.Response(404, json={}) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + result = backend.load_all_mcp_servers() + + assert len(result) == 3, f"Expected 3 servers; got {len(result)}" + assert len(bulk_calls) == 1, ( + f"load_all_mcp_servers must use exactly 1 bulk call; got {len(bulk_calls)} bulk + {len(per_name_calls)} per-name." + ) + assert len(per_name_calls) == 0, ( + f"load_all_mcp_servers must NOT make per-name calls; got {len(per_name_calls)}." + ) + + +def test_load_all_consistent_with_per_name_load() -> None: + """set(load_all_mcp_servers()) names equals set of per-name load names. + + spec/36 MUST 10. Bulk and per-name paths must return the same server set. + """ + servers = [ + _make_server_wire_entry("alpha-server"), + _make_server_wire_entry("beta-server"), + _make_server_wire_entry("gamma-server"), + ] + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path.rstrip("/") + query = str(request.url.query) + if path.endswith("/capabilities"): + return httpx.Response(200, json=_capabilities_response()) + if path.endswith("/mcp-servers"): + if "expand" in query: + return httpx.Response(200, json={"servers": servers}) + refs = [ + { + "name": s["name"], + "description": s["description"], + "transport": s["transport"], + } + for s in servers + ] + return httpx.Response(200, json={"servers": refs}) + if "/mcp-servers/" in path: + name = path.rsplit("/", 1)[-1] + for s in servers: + if s["name"] == name: + return httpx.Response(200, json=s) + return httpx.Response(404, json={}) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + all_specs = backend.load_all_mcp_servers() + refs = backend.list_mcp_servers() + per_name_specs = [backend.load_mcp_server(ref.name) for ref in refs] + + all_names = {s.name for s in all_specs} + per_name_names = {s.name for s in per_name_specs} + + assert all_names == per_name_names, ( + f"MUST 10: load_all names {all_names!r} must equal per-name names {per_name_names!r}." + ) + + +def test_load_all_resolves_env_vars(monkeypatch) -> None: + """load_all_mcp_servers() resolves $VAR env references. + + spec/36 MUST 8 + MUST 10. Env vars must be resolved in the bulk path, not + stored as $VAR literals. + """ + monkeypatch.setenv("MY_TEST_VAR", "resolved-value-xyz") + + server = _make_server_wire_entry( + "env-server", + env={"KEY": "$MY_TEST_VAR"}, + ) + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path + query = str(request.url.query) + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + if "/mcp-servers" in path and "expand" in query: + return httpx.Response(200, json={"servers": [server]}) + if "/mcp-servers" in path and "/env-server" not in path: + refs = [ + { + "name": server["name"], + "description": server["description"], + "transport": server["transport"], + } + ] + return httpx.Response(200, json={"servers": refs}) + if "/env-server" in path: + return httpx.Response(200, json=server) + return httpx.Response(404, json={}) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + + all_specs = backend.load_all_mcp_servers() + assert len(all_specs) == 1 + assert all_specs[0].env["KEY"] == "resolved-value-xyz", ( + "load_all must resolve $VAR env references at call time." + ) + + # Also verify via per-name load path. + loaded = backend.load_mcp_server("env-server") + assert loaded.env["KEY"] == "resolved-value-xyz", ( + "load_mcp_server must also resolve $VAR env references." + ) + + +# ────────────────────────────────────────────────────────────────────────────── +# f) Auth + URL credential redaction + + +def test_auth_token_added_as_bearer_header() -> None: + """auth_token is sent as 'Authorization: Bearer ' on every request. + + spec/36 E-F10 / D-F2. + """ + captured_headers: list[dict] = [] + + def _handler(request: httpx.Request) -> httpx.Response: + captured_headers.append(dict(request.headers)) + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + return httpx.Response(200, json={"servers": []}) + + transport = httpx.MockTransport(_handler) + client = httpx.Client(transport=transport) + backend = HTTPMCPServerRegistryBackend( + catalog_url="http://catalog.example.invalid", + agent_scope="test-scope", + auth_token="my-secret-token", + _http_client=client, + ) + + backend.list_mcp_servers() + + assert len(captured_headers) >= 1, "At least one request must have been made." + for headers in captured_headers: + auth = headers.get("authorization", "") + assert auth == "Bearer my-secret-token", ( + f"Authorization header must be 'Bearer my-secret-token'; got {auth!r}." + ) + + +def test_no_auth_token_omits_authorization_header() -> None: + """When no auth_token is provided, 'Authorization' header must be absent. + + spec/36 E-F10. + """ + captured_headers: list[dict] = [] + + def _handler(request: httpx.Request) -> httpx.Response: + captured_headers.append(dict(request.headers)) + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + return httpx.Response(200, json={"servers": []}) + + transport = httpx.MockTransport(_handler) + client = httpx.Client(transport=transport) + backend = HTTPMCPServerRegistryBackend( + catalog_url="http://catalog.example.invalid", + agent_scope="test-scope", + auth_token=None, + _http_client=client, + ) + + backend.list_mcp_servers() + + for headers in captured_headers: + assert "authorization" not in {k.lower() for k in headers}, ( + "Authorization header must NOT be present when no auth_token is supplied." + ) + + +def test_url_credentials_redacted_in_error_messages() -> None: + """MCPRegistryUnavailable raised from a URL-with-credentials backend must not expose the password. + + spec/36 D-F2 / E-F10. The 'secret' in 'https://user:secret@catalog...' must + not appear in the exception message. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + raise httpx.ConnectError("connection refused", request=request) + + transport = httpx.MockTransport(_handler) + client = httpx.Client(transport=transport) + backend = HTTPMCPServerRegistryBackend( + catalog_url="https://user:s3cr3tp4ss@catalog.example.com/?agent_scope=t", + agent_scope="test-scope", + _http_client=client, + ) + + with pytest.raises(MCPRegistryUnavailable) as exc_info: + backend.list_mcp_servers() + + error_text = str(exc_info.value) + assert "s3cr3tp4ss" not in error_text, ( + f"Password must not appear in error message; got: {error_text!r}" + ) + + +def test_auth_token_not_in_error_messages() -> None: + """MCPRegistryAuthRequired message must not contain the auth token value. + + spec/36 D-F2. The auth token is a secret; it must be redacted from all + operator-visible error messages. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + if "/capabilities" in request.url.path: + return httpx.Response(401, json={"error": "unauthorized"}) + return httpx.Response(401, json={"error": "unauthorized"}) + + transport = httpx.MockTransport(_handler) + client = httpx.Client(transport=transport) + backend = HTTPMCPServerRegistryBackend( + catalog_url="http://catalog.example.invalid", + agent_scope="test-scope", + auth_token="super-secret-bearer-token-abc123", + _http_client=client, + ) + + with pytest.raises(MCPRegistryAuthRequired) as exc_info: + backend.list_mcp_servers() + + error_text = str(exc_info.value) + assert "super-secret-bearer-token-abc123" not in error_text, ( + f"Auth token must not appear in error message; got: {error_text!r}" + ) + + +# ────────────────────────────────────────────────────────────────────────────── +# g) Capability lifecycle + properties + + +def test_capabilities_before_first_probe_returns_conservative_default() -> None: + """Reading capabilities before first probe returns a conservative default. + + spec/36 B-F11. A fresh backend has no probe state. Reading the property + must return a safe conservative view (all write caps False) rather than + raising or blocking on a probe. + """ + call_count: list[int] = [0] + + def _counting_handler(request: httpx.Request) -> httpx.Response: + call_count[0] += 1 + return httpx.Response(200, json=_capabilities_response()) + + transport = httpx.MockTransport(_counting_handler) + client = httpx.Client(transport=transport) + backend = HTTPMCPServerRegistryBackend( + catalog_url="http://catalog.example.invalid", + agent_scope="test-scope", + _http_client=client, + ) + + # Read capabilities WITHOUT calling list_mcp_servers / load_mcp_server first. + caps = backend.capabilities + + assert caps.supports_install is False, ( + "Conservative default: supports_install must be False before first probe." + ) + assert caps.supports_capability_handshake is True, ( + "HTTP backend always supports capability handshake (static class-level flag)." + ) + assert call_count[0] == 0, ( + "Reading capabilities property must not trigger a probe network call." + ) + + +def test_refresh_capabilities_bypasses_failure_cache() -> None: + """refresh_capabilities() bypasses the failure cache and re-probes. + + spec/36 B-F5. When a probe has failed and is within its cache window, + refresh_capabilities() must make a new HTTP call (bypassing the cache), + allowing recovery after a transient outage. + """ + call_count: list[int] = [0] + should_succeed: list[bool] = [False] + + def _handler(request: httpx.Request) -> httpx.Response: + call_count[0] += 1 + if "/capabilities" in request.url.path: + if should_succeed[0]: + return httpx.Response(200, json=_capabilities_response()) + return httpx.Response(500, json={"error": "server down"}) + return httpx.Response(200, json={"servers": []}) + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport, probe_failure_cache_s=30.0) + + # First call: probe fails, cached. + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + count_after_failure = call_count[0] + + # Simulate server recovery. + should_succeed[0] = True + + # refresh_capabilities() must bypass the cache and make a new call. + refreshed = backend.refresh_capabilities() + assert call_count[0] > count_after_failure, ( + "refresh_capabilities() must make a new HTTP call even within the failure cache window." + ) + assert refreshed is not None + + +def test_capabilities_property_does_not_fire_probe() -> None: + """Reading the capabilities property does not fire a probe HTTP request. + + spec/36 B-F10 strengthen. The property should return cached/default state + without hitting the network. + """ + call_count: list[int] = [0] + + def _handler(request: httpx.Request) -> httpx.Response: + call_count[0] += 1 + return httpx.Response(200, json=_capabilities_response()) + + transport = httpx.MockTransport(_handler) + client = httpx.Client(transport=transport) + backend = HTTPMCPServerRegistryBackend( + catalog_url="http://catalog.example.invalid", + agent_scope="test-scope", + _http_client=client, + ) + + # Read the property three times. + _ = backend.capabilities + _ = backend.capabilities + _ = backend.capabilities + + assert call_count[0] == 0, ( + f"capabilities property must not fire probe; got {call_count[0]} call(s)." + ) + + +# ────────────────────────────────────────────────────────────────────────────── +# h) Lazy httpx import guard + + +def test_construction_does_not_import_httpx() -> None: + """HTTPMCPServerRegistryBackend construction must not trigger httpx import. + + spec/36 A-F5. The httpx import is deferred so operators who use only the + filesystem backend do not pay the httpx import cost. The [http] extra + controls opt-in installation. + + Strategy: remove httpx from sys.modules, construct the backend (without + an injected client to exercise the lazy-import path), restore modules. + """ + saved_httpx = sys.modules.pop("httpx", None) + saved_submodules = { + k: sys.modules.pop(k) + for k in list(sys.modules.keys()) + if k.startswith("httpx.") + } + + try: + # Construction must not trigger the import. + backend = HTTPMCPServerRegistryBackend( + catalog_url="http://example.invalid", + agent_scope="test-scope", + # No _http_client= to exercise the lazy construction path. + ) + assert backend is not None + + # httpx should not have been re-imported. + assert "httpx" not in sys.modules, ( + "Construction must not import httpx; httpx appeared in sys.modules after __init__." + ) + finally: + # Restore httpx so the rest of the test suite keeps working. + if saved_httpx is not None: + sys.modules["httpx"] = saved_httpx + sys.modules.update(saved_submodules) + + +# ────────────────────────────────────────────────────────────────────────────── +# i) Review-army follow-up tests +# +# Added during /ship pre-landing review army (5 specialists + Claude adversarial +# + Step 9 checklist) on 2026-06-04. Each test has a finding-ID anchor in its +# docstring so future contributors can find the original review context. + + +def test_mcp_server_spec_to_dict_from_dict_round_trip() -> None: + """MCPServerSpec.to_dict / from_dict public round-trip preserves all fields. + + T-F4 (Testing specialist, confidence 95). C-F1 (prep notes primitive- + existence). The methods were promoted from private helpers in profile/types + to public methods on MCPServerSpec at PR 4. Round-trip identity is the + canonical contract. + """ + spec = MCPServerSpec( + name="my-server", + command="python3", + args=["-m", "server", "--port=8080"], + env={"API_KEY": "$MY_KEY", "DEBUG": "1"}, + transport="stdio", + description="A test server.", + ) + assert MCPServerSpec.from_dict(spec.to_dict()) == spec + + +def test_mcp_server_spec_from_dict_ignores_extra_keys() -> None: + """Extra keys in the source dict are silently dropped (forward-compat). + + T-F4 follow-on. Future catalog server wire format extensions must not + break existing clients. + """ + d = { + "name": "x", + "command": "echo", + "extra_future_field": "ignored", + "another_unknown": 42, + } + spec = MCPServerSpec.from_dict(d) + assert spec.name == "x" + assert spec.command == "echo" + + +def test_mcp_server_spec_from_dict_defaults_for_optional_fields() -> None: + """Optional fields fall back to documented defaults (backward-compat). + + T-F4 follow-on. Catalog servers may omit any of args, env, transport, + description; defaults are the wire format contract. + """ + d = {"name": "x", "command": "echo"} + spec = MCPServerSpec.from_dict(d) + assert spec.args == [] + assert spec.env == {} + assert spec.transport == "stdio" + assert spec.description == "" + + +def test_mcp_server_spec_from_dict_raises_on_missing_required_field() -> None: + """Required keys 'name' and 'command' raise KeyError if absent. + + T-F4 follow-on. The from_dict docstring documents this contract. + """ + with pytest.raises(KeyError): + MCPServerSpec.from_dict({"command": "echo"}) # missing 'name' + with pytest.raises(KeyError): + MCPServerSpec.from_dict({"name": "x"}) # missing 'command' + + +def test_probe_options_5xx_raises_unavailable() -> None: + """OPTIONS /mcp-servers returning 5xx must NOT silently fall back to tier 1. + + T-F1 (Testing specialist, confidence 92). Adversarial F3. The previous + code only branched on (404, 405) and would silently report tier 1 for + any other status. A misconfigured catalog returning 500 must surface as + MCPRegistryUnavailable so the operator sees the real failure. + """ + transport = _make_mock_transport( + { + ("GET", "/capabilities"): httpx.Response(404, json={}), + ("OPTIONS", "/mcp-servers"): httpx.Response( + 500, json={"error": "server error"} + ), + } + ) + backend = _make_backend(transport=transport) + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + +def test_probe_options_401_raises_auth_required() -> None: + """OPTIONS /mcp-servers returning 401 must raise MCPRegistryAuthRequired. + + T-F1 (Testing specialist, confidence 92). Adversarial F3. A misconfigured + auth token must surface as an auth error, not as a silent capability + downgrade where a tier-3 server appears as tier 1. + """ + transport = _make_mock_transport( + { + ("GET", "/capabilities"): httpx.Response(404, json={}), + ("OPTIONS", "/mcp-servers"): httpx.Response( + 401, json={"error": "unauthorized"} + ), + } + ) + backend = _make_backend(transport=transport) + with pytest.raises(MCPRegistryAuthRequired): + backend.list_mcp_servers() + + +def test_httpx_invalid_url_strict_assertion() -> None: + """httpx.InvalidURL specifically maps to ValueError, not MCPRegistryUnavailable. + + T-F2 (Testing specialist, confidence 90). The earlier test + test_httpx_invalid_url_maps_to_value_error accepted EITHER ValueError or + MCPRegistryUnavailable because it relied on httpx parsing a malformed URL + at first call (behavior varies by httpx version). This test injects the + exception via MockTransport so the mapping is exercised deterministically. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + raise httpx.InvalidURL("crafted invalid url") + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + with pytest.raises(ValueError): + backend.list_mcp_servers() + + +def test_load_all_full_spec_equality_with_per_name_load() -> None: + """MUST 10 / C-F4: bulk endpoint and per-name loads produce IDENTICAL specs. + + T-F3 (Testing specialist, confidence 88). The earlier MUST 10 test only + compared name sets; that would pass even if args/env/transport differed + between bulk and per-name responses. This test uses non-default field + values so the equality check is meaningful. + """ + servers = [ + _make_server_wire_entry( + "alpha", + command="python3", + args=["-m", "alpha_server"], + env={"PORT": "8080", "DEBUG": "1"}, + description="Alpha server description", + ), + _make_server_wire_entry( + "beta", + command="npx", + args=["-y", "@beta/server"], + env={"NODE_ENV": "production"}, + description="Beta server", + ), + ] + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path.rstrip("/") + query = str(request.url.query) + if path.endswith("/capabilities"): + return httpx.Response(200, json=_capabilities_response()) + if path.endswith("/mcp-servers"): + if "expand" in query: + return httpx.Response(200, json={"servers": servers}) + refs = [ + { + "name": s["name"], + "description": s["description"], + "transport": s["transport"], + } + for s in servers + ] + return httpx.Response(200, json={"servers": refs}) + if "/mcp-servers/" in path: + name = path.rsplit("/", 1)[-1] + for s in servers: + if s["name"] == name: + return httpx.Response(200, json=s) + return httpx.Response(404, json={}) + + backend = _make_backend(transport=httpx.MockTransport(_handler)) + + bulk_by_name = {s.name: s for s in backend.load_all_mcp_servers()} + for ref in backend.list_mcp_servers(): + per_name = backend.load_mcp_server(ref.name) + bulk = bulk_by_name[ref.name] + assert per_name == bulk, ( + f"MUST 10 field mismatch for {ref.name!r}:\n" + f" per-name: {per_name!r}\n bulk: {bulk!r}" + ) + + +def test_httpx_decoding_error_maps_to_descriptor_invalid() -> None: + """httpx.DecodingError maps to MCPRegistryDescriptorInvalid, not Unavailable. + + T-F5 (Testing specialist, confidence 85). The DecodingError catch in + _handle_http_error is order-dependent (sits between LocalProtocolError + and the HTTPError catch-all). A regression removing the specific check + would silently change the mapping. + """ + + def _handler(request: httpx.Request) -> httpx.Response: + raise httpx.DecodingError("failed to decode response") + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + with pytest.raises(MCPRegistryDescriptorInvalid): + backend.list_mcp_servers() + + +def test_concurrent_first_probes_both_succeed() -> None: + """Two concurrent first-call probes both return valid results (D-PR4-3). + + T-F6 (Testing specialist, confidence 82). Adversarial F7. The capability + cache lock guards the cache-check and cache-write but NOT the HTTP probe + itself (so concurrent callers don't serialize on network latency). Last + writer wins; both threads must succeed. + """ + import threading + + barrier = threading.Barrier(2) + results: list = [] + errors: list = [] + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + return httpx.Response(200, json={"servers": []}) + + backend = _make_backend(transport=httpx.MockTransport(_handler)) + + def worker() -> None: + barrier.wait(timeout=5) + try: + results.append(backend.list_mcp_servers()) + except Exception as e: # noqa: BLE001 + errors.append(e) + + threads = [threading.Thread(target=worker) for _ in range(2)] + for t in threads: + t.start() + for t in threads: + t.join(timeout=5) + + assert not errors, f"Concurrent probe errors: {errors!r}" + assert len(results) == 2, "Both threads must complete" + + +def test_agent_scope_forwarded_as_query_param() -> None: + """Every request includes ?agent_scope= (MUST 5 wire enforcement). + + T-F7 (Testing specialist, confidence 80). Cross-tenant isolation depends + on the scope query param appearing on every request. A regression + omitting it would cause a scope='A' backend to see scope='B' servers. + """ + captured_urls: list[str] = [] + + def _handler(request: httpx.Request) -> httpx.Response: + captured_urls.append(str(request.url)) + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + return httpx.Response(200, json={"servers": []}) + + backend = _make_backend( + agent_scope="my-agent-scope", + transport=httpx.MockTransport(_handler), + ) + backend.list_mcp_servers() + + data_urls = [u for u in captured_urls if "/mcp-servers" in u] + assert data_urls, "No /mcp-servers requests captured" + for url in data_urls: + assert "agent_scope=my-agent-scope" in url, f"agent_scope missing from {url!r}" + + +def test_successful_probe_is_cached_no_re_probe() -> None: + """After a successful probe, a second list_mcp_servers() must not re-probe. + + T-F8 (Testing specialist, confidence 78). The positive cache (success + prevents redundant probe) is the inverse of the failure cache test that + already exists. Without this test, a regression that lost the early-return + on cache hit would re-probe on every call without being caught. + """ + probe_calls: list[int] = [0] + + def _handler(request: httpx.Request) -> httpx.Response: + if "/capabilities" in request.url.path: + probe_calls[0] += 1 + return httpx.Response(200, json=_capabilities_response()) + return httpx.Response(200, json={"servers": []}) + + backend = _make_backend(transport=httpx.MockTransport(_handler)) + backend.list_mcp_servers() + backend.list_mcp_servers() + backend.list_mcp_servers() + assert probe_calls[0] == 1, ( + f"Probe must only fire once; fired {probe_calls[0]} times across 3 calls" + ) + + +def test_factory_extracts_agent_scope_from_url() -> None: + """make_http_mcp_server_registry_backend_from_url parses ?agent_scope=. + + T-F9 (Testing specialist, confidence 75). Factory function had zero + direct tests before /ship review. + """ + from atomic_agents.mcp_registry.http import ( + make_http_mcp_server_registry_backend_from_url, + ) + + backend = make_http_mcp_server_registry_backend_from_url( + "https://catalog.example.com/?agent_scope=my-scope" + ) + assert backend._agent_scope == "my-scope" + + +def test_factory_defaults_agent_scope_to_default() -> None: + """When agent_scope is absent from URL, factory defaults to 'default'. + + T-F9. Spec/36 §Operator surface documents this default. + """ + from atomic_agents.mcp_registry.http import ( + make_http_mcp_server_registry_backend_from_url, + ) + + backend = make_http_mcp_server_registry_backend_from_url( + "https://catalog.example.com/" + ) + assert backend._agent_scope == "default" + + +def test_factory_rejects_empty_url() -> None: + """Empty URL raises ValueError at the factory. + + T-F9. + """ + from atomic_agents.mcp_registry.http import ( + make_http_mcp_server_registry_backend_from_url, + ) + + with pytest.raises(ValueError): + make_http_mcp_server_registry_backend_from_url("") + with pytest.raises(ValueError): + make_http_mcp_server_registry_backend_from_url(" ") + + +def test_factory_rejects_non_http_scheme() -> None: + """Factory rejects ftp://, filesystem://, or any non-http(s) scheme. + + T-F9. + """ + from atomic_agents.mcp_registry.http import ( + make_http_mcp_server_registry_backend_from_url, + ) + + with pytest.raises(ValueError): + make_http_mcp_server_registry_backend_from_url("ftp://catalog/?agent_scope=x") + with pytest.raises(ValueError): + make_http_mcp_server_registry_backend_from_url( + "filesystem:///path?agent_scope=x" + ) + + +def test_factory_value_error_redacts_credentials() -> None: + """S-F1: factory ValueError must not echo URL credentials. + + Security specialist S-F1 (confidence 7). An operator who pastes + ftp://user:secret@host into the URL env var would otherwise see the + secret in the error message. + """ + from atomic_agents.mcp_registry.http import ( + make_http_mcp_server_registry_backend_from_url, + ) + + with pytest.raises(ValueError) as exc_info: + make_http_mcp_server_registry_backend_from_url( + "ftp://user:secret-token-abc@host/?agent_scope=x" + ) + assert "secret-token-abc" not in str(exc_info.value), ( + "Credentials must not appear in factory ValueError" + ) + + +def test_factory_reads_auth_token_from_env(monkeypatch) -> None: + """Factory reads ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN from env. + + T-F9. + """ + from atomic_agents.mcp_registry.http import ( + make_http_mcp_server_registry_backend_from_url, + ) + + monkeypatch.setenv("ATOMIC_AGENTS_MCP_SERVER_REGISTRY_AUTH_TOKEN", "tok-abc-123") + backend = make_http_mcp_server_registry_backend_from_url( + "https://catalog.example.com/?agent_scope=s" + ) + assert backend._auth_token == "tok-abc-123" + + +def test_runtime_error_from_closed_client_maps_to_unavailable() -> None: + """Adv-F2: RuntimeError from a closed httpx.Client maps to MCPRegistryUnavailable. + + Adversarial finding F2 (the most exploitable per the adversarial + recommendation). A reader holding self._real_client outside the + _client_lock can race a concurrent close(); calling .get() on the closed + client raises RuntimeError. Must surface as MCPRegistryUnavailable, not + escape raw to the caller (which expects MCPRegistry* types only). + """ + + def _handler(request: httpx.Request) -> httpx.Response: + # Simulate the closed-client error path. + raise RuntimeError("Cannot send a request, as the client has been closed.") + + transport = httpx.MockTransport(_handler) + backend = _make_backend(transport=transport) + with pytest.raises(MCPRegistryUnavailable): + backend.list_mcp_servers() + + +def test_catalog_url_with_query_string_is_normalized() -> None: + """Adv-F4: catalog_url with query string strips correctly at __init__. + + A catalog_url like "https://host/api?debug=1" must NOT produce + "https://host/api?debug=1/mcp-servers?agent_scope=..." (malformed) when + constructing request URLs. Normalization at __init__ strips the query + string AND trailing slashes once, fixing six downstream URL-build sites. + """ + backend = HTTPMCPServerRegistryBackend( + catalog_url="https://catalog.example.com/api/?debug=1", + agent_scope="test", + _http_client=httpx.Client(transport=_make_mock_transport({})), + ) + # _catalog_url stored normalized: no trailing slash, no query string. + assert backend._catalog_url == "https://catalog.example.com/api" + + +def test_mcp_server_ref_source_uses_raw_catalog_url_not_redacted() -> None: + """A-F1: MCPServerRef.source uses RAW catalog_url per spec/36 line 228. + + The implementer brief originally used _safe_catalog_url (redacted) for + source to defend against credential leaks, but the redactor aggressively + strips after ':// so a clean URL like https://catalog.example.com becomes + 'https://...' — breaking downstream navigation. Per the spec, source is + a data field meant to be navigable; the operator's recommended pattern + is the auth_token env var, not URL-embedded credentials. + """ + server = _make_server_wire_entry("test-server") + + def _handler(request: httpx.Request) -> httpx.Response: + path = request.url.path + if "/capabilities" in path: + return httpx.Response(200, json=_capabilities_response()) + if path.endswith("/mcp-servers"): + return httpx.Response( + 200, + json={ + "servers": [ + { + "name": server["name"], + "description": server["description"], + "transport": server["transport"], + } + ] + }, + ) + return httpx.Response(404, json={}) + + backend = _make_backend( + catalog_url="https://catalog.example.com", + transport=httpx.MockTransport(_handler), + ) + refs = backend.list_mcp_servers() + assert len(refs) == 1 + # Source must be the RAW catalog URL, not 'https://...' (redacted form). + assert refs[0].source == "https://catalog.example.com/mcp-servers/test-server", ( + f"source must be raw catalog URL; got {refs[0].source!r}" + ) From ed58d5685e5f9829a107fd80426c93d545a1fd75 Mon Sep 17 00:00:00 2001 From: Dan Powers Date: Thu, 4 Jun 2026 10:54:49 -0500 Subject: [PATCH 2/2] docs: update MCPServerRegistryBackend backend table row for PR 4 of 5 Reflects that HTTPMCPServerRegistryBackend read paths + tier negotiation shipped in PR 4; table row was still showing "Planned" / #201 link. Updates to in-progress status with spec/36 link and accurate capability description. Co-Authored-By: Claude Sonnet 4.6 --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index b7a4a59..0b6c24d 100644 --- a/README.md +++ b/README.md @@ -205,7 +205,7 @@ The framework is moving toward swappable backends layer by layer. The shape: a P | `PolicyBackend` | ✅ Shipped | Filesystem reference impl (`policy.md` at project root); cost-cap MIN composition + tool / MCP / model surfaces enforced by default (PR 4 flag flip); unified `policy_decision` audit event family | [`spec/32`](docs/spec/32-policy-backend.md) | | `PersonaBackend` | ✅ Shipped | Filesystem reference impl at `/.personas//`; `persona.link.md` ownership trigger; snapshot trio nested under each persona's directory; `atomic-agents persona` CLI; AgentProfile composition with migration-window restore event | [`spec/33`](docs/spec/33-persona-backend.md) | | `CorpusBackend` | ✅ Shipped | Filesystem + SQLite (FTS5) reference impls; per-agent `wiki/` + `raw/`; `render_index_summary(corpus)` Protocol method; closes the GB-scale wiki cliff via O(log N) indexed full-text query | [`spec/34`](docs/spec/34-corpus-backend.md) | -| `MCPServerRegistryBackend` | Planned | Catalog + install/audit for MCP servers (MCP equivalent of ToolRegistry) | [`#201`](https://github.com/dep0we/atomic-agents-stack/issues/201) | +| `MCPServerRegistryBackend` | 🟡 In progress (PR 4 of 5) | Filesystem + HTTP read-path reference impls; tier-1/2/3 capability negotiation; `atomic-agents mcp-registry` CLI; write paths (install/uninstall) ship at PR 5 | [`spec/36`](docs/spec/36-mcp-server-registry-backend.md) | **v1 direction:** a home user runs filesystem-everything today. An organization runs the same agent definitions over Postgres / Redis / SQLite-Datadog / behind an HTTP service once the remaining protocol ships. v1.0 closes when MCPServerRegistry lands + its conformance suite pins the contract. See [`docs/architecture.md`](docs/architecture.md) for the mental model, [`docs/TENSIONS.md`](docs/TENSIONS.md) for architectural tensions this scaling story has to survive, and [`ROADMAP.md`](ROADMAP.md) for the full backlog beyond v1.0.