Skip to content

feat(adapters): MemPalaceDaemonAdapter — read-side adapter for production palaces#5

Merged
M0nkeyFl0wer merged 17 commits into
M0nkeyFl0wer:mainfrom
techempower-org:feat/mempalace-daemon-adapter
May 1, 2026
Merged

feat(adapters): MemPalaceDaemonAdapter — read-side adapter for production palaces#5
M0nkeyFl0wer merged 17 commits into
M0nkeyFl0wer:mainfrom
techempower-org:feat/mempalace-daemon-adapter

Conversation

@jphein
Copy link
Copy Markdown
Contributor

@jphein jphein commented Apr 30, 2026

First in a three-PR stack pulling in the adapters and corpus work that's been accumulating on my fork. Stack order:

  1. This PRMemPalaceDaemonAdapter (read-side, hits palace-daemon's HTTP API)
  2. feat/familiar-adapter-and-jp-realm-corpusFamiliarAdapter + jp-realm-v0.1 30-question corpus + first live readings + lessons doc
  3. feat/rlm-adapterRlmAdapter + Qwen-7B / Llama-70B baselines (the data behind #3)

Each PR is built on top of the previous; the diffs against main will collapse to the new commits as earlier PRs merge.

Summary

Adds MemPalaceDaemonAdapter, a read-side adapter that talks to a running palace-daemon HTTP API rather than opening the underlying ChromaDB directly. Useful when:

  • The palace is hosted on a remote machine (typical production setup) and the eval framework can't reach the DB files.
  • You want the eval to exercise the same access path the orchestrator uses in production (auth, kind filtering, daemon-side query handling).
  • You want a /graph fast-path snapshot for Cat 8 without round-tripping every drawer through the MCP layer.

Design + plan docs in the PR

  • docs/superpowers/specs/2026-04-25-mempalace-daemon-adapter-design.md
  • docs/superpowers/plans/2026-04-25-mempalace-daemon-adapter.md

What's exposed

  • query() — happy path against /search, --kind filter, --api-key auth resolution from arg → env → config.
  • get_graph_snapshot() — fast-path against /graph, MCP fallback when /graph is unavailable.
  • ingest_corpus() — explicit error message rather than silent no-op (palace-daemon is read-side).

Testing

  • Mocked HTTP via fake_urlopen_factory fixture (tests/conftest.py, tests/test_mempalace_daemon_adapter.py) — 28 tests covering query happy/error paths, warning contracts, kind filter behavioural invariants, graph snapshot fast-path + MCP fallback, ontology/ingest errors.
  • Gated live smoke (tests/test_mempalace_daemon_integration.py) — runs against an actual running daemon when PALACE_DAEMON_URL and PALACE_API_KEY are set; skipped in CI.

Test plan

  • python -m pytest tests/test_mempalace_daemon_adapter.py -v — 28 mocked tests pass
  • sme-eval retrieve --adapter mempalace-daemon --questions tests/fixtures/tiny_questions.yaml --daemon-url http://localhost:8085 --api-key $KEY — CLI smoke against a running daemon
  • PALACE_DAEMON_URL=... PALACE_API_KEY=... python -m pytest tests/test_mempalace_daemon_integration.py — live smoke if you have a daemon to point at

Happy to walk through any specific commit if useful — the 17 commits split into design/plan, scaffold + auth, query, graph snapshot, ontology + CLI wiring, tests/docs/ruff, and one Copilot-review fix commit on top (env-file resilience + tighter live-test gate). The unused-import lint fix from the same Copilot review only applies after the shared _graph_mapping module lands, so it ships with PR #6 instead.

jphein and others added 16 commits April 30, 2026 10:02
The shipped sme/adapters/mempalace.py opens ChromaDB directly via
chromadb.PersistentClient. That's correct for upstream MemPalace
single-process installs but violates the daemon-strict invariant of
the jphein/palace-daemon + jphein/mempalace fork architecture: two
PersistentClient instances against the same palace = the
multi-writer corruption scenario the daemon was built to prevent.

Documenting a planned mempalace-daemon adapter that:
- talks HTTP/MCP only (no filesystem access, no chromadb import)
- assembles the graph snapshot from /mcp tool calls
  (mempalace_list_wings, mempalace_list_rooms, mempalace_list_tunnels,
   mempalace_kg_query)
- routes query() through palace-daemon's GET /search with the new
  kind= parameter (default kind=content excludes Stop-hook auto-save
  checkpoints — validated 2026-04-25 against the canonical 151K
  palace, filter excludes ~637 of 151,448 drawers but ~80%+ of
  search results before filter)

Why it matters: SME's Cat 9 (The Handshake) measures
integration-under-production-model — exactly the slice engram-2's
"0.984 R@5 vs 17% E2E QA" critique targets. Running Cat 9 through
the daemon adapter, A/B before/after applying the kind= filter, would
let SME's verdict replace our hand-rolled benchmark and generate
publishable data on whether the fix moves the needle.

Multi-hour build, not yet started; this entry is the placeholder so
the work doesn't drift. Lives on main only — kept off the
feat/cat9-9b-scaffolding upstream PR branch per the
"fork README handling" pattern (PR branches must carry upstream's
README so fork-only content doesn't leak upstream).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Design for a fork-side SMEAdapter that talks HTTP/MCP to a running
palace-daemon, plus a planned /graph endpoint to make graph snapshots fast.
Adapter ships independently against daemon 1.5.1 via MCP-tool fallback;
/graph is coordinated daemon-side work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
12 tasks following TDD: scaffolding, adapter (auth → query happy
path → query errors → /graph fast path → MCP fallback → ontology),
CLI plumbing, gated integration smoke, end-to-end CLI smoke, README.
Daemon-side /graph endpoint explicitly out-of-scope; coordinated
separately with JP's active palace-daemon work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three-source auth resolution: kwargs > env file > process env. Raises
ValueError at construction if nothing resolves. query() and
get_graph_snapshot() raise NotImplementedError pending follow-on tasks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GET /search builds context_string in the same format as the existing
MemPalaceAdapter so tiktoken counts stay comparable across adapters.
retrieval_path captures kind + scope counts for downstream scoring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers WARN-with-results soft signal, NO_RESULTS, AUTH, HTTP 5xx,
connection failures, and X-API-Key header propagation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GET /graph in one call; projects to wings, rooms, member_of, tunnel,
and KG layers. Mirrors the existing direct adapter's projection minus
drawer-level surface (impractical to fetch at scale over HTTP).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When /graph is unavailable (older daemons or prefer_graph_endpoint=False),
walk mempalace_list_wings, mempalace_list_rooms per wing, and
mempalace_list_tunnels via POST /mcp. Per-wing failures degrade to a
partial snapshot rather than aborting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Returns the same readme-typed ontology as the existing MemPalaceAdapter so
Cat 8 readings are comparable across the direct-ChromaDB and HTTP paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the mempalace-daemon branch to _load_adapter and threads
--api-key / --kind through _load_adapter_from_args, _add_db_or_api_args,
and the retrieve subparser. Existing mempalace and flat adapters drop
the new kwargs cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Skipped when PALACE_DAEMON_URL is unset so CI without a daemon stays
green. Asserts query() returns a QueryResult, snapshot returns >=1
wing, and kind=all >= kind=content for total_before_filter (validates
the README's filter claim). Runs in ~29s against the 151K-drawer
palace via the MCP fallback path; will drop to <5s once the daemon
ships GET /graph.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the fork-roadmap planning subsection with the wired-endpoint
description, auth resolution rules, and invocation example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moved conftest imports to the top of the module, dropped unused
os/Path/urllib.error imports.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier total_before_filter comparison conflated metadata math
with filter behaviour. The reliable signal is whether 'CHECKPOINT:'
strings appear in the result text — kind='content' must produce
strictly fewer than kind='all'. Skips when the live palace has no
checkpoints to test against.

Discovered against palace-daemon 1.6.0 with 151K drawers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new read-side adapter (MemPalaceDaemonAdapter) so SME can query and snapshot MemPalace via the palace-daemon HTTP API (rather than direct Chroma/SQLite access), with CLI wiring plus mocked and live-gated tests and supporting docs.

Changes:

  • Introduce MemPalaceDaemonAdapter with /search querying, /graph fast-path snapshot + MCP fallback, env-file/env-var auth resolution, and explicit read-only ingest behavior.
  • Wire the adapter into the CLI (mempalace-daemon name, --api-key, --kind) and update user-facing docs/README.
  • Add comprehensive mocked HTTP tests, a gated live-daemon smoke test, and a tiny questions YAML fixture for CLI smoke runs.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
sme/adapters/mempalace_daemon.py New HTTP adapter implementation (query + graph snapshot)
sme/cli.py CLI wiring for mempalace-daemon, plus --api-key / --kind plumbing
tests/conftest.py Adds fake_urlopen_factory for urllib-level HTTP mocking
tests/test_mempalace_daemon_adapter.py Mocked unit tests covering auth/query/snapshot/MCP fallback/ontology/ingest
tests/test_mempalace_daemon_integration.py Live-daemon smoke tests gated by env
tests/fixtures/tiny_questions.yaml Minimal question set for end-to-end CLI smoke
docs/superpowers/specs/2026-04-25-mempalace-daemon-adapter-design.md Design/spec documentation for the adapter
docs/superpowers/plans/2026-04-25-mempalace-daemon-adapter.md Implementation plan documentation
README.md Documents the shipped adapter and usage patterns

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread sme/adapters/mempalace_daemon.py Outdated
out: dict[str, str] = {}
if not path.exists():
return out
for line in path.read_text().splitlines():
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_parse_env_file calls path.read_text() without handling I/O errors. If the env file exists but is unreadable (permissions, broken symlink, etc.), adapter construction will raise and bypass the intended fallback to process env vars. Catch OSError/UnicodeDecodeError around the read and return {} (optionally logging at debug/warn) so resolution can proceed.

Suggested change
for line in path.read_text().splitlines():
try:
text = path.read_text()
except (OSError, UnicodeDecodeError) as exc:
log.warning("Failed to read env file %s: %s", path, exc)
return out
for line in text.splitlines():

Copilot uses AI. Check for mistakes.
Comment on lines +229 to +240
# _http_get returns QueryResult on error; treat 404 specifically
if isinstance(body, QueryResult):
if body.error and body.error.startswith("HTTP 404"):
log.info(
"/graph endpoint not present (404); falling back to MCP"
)
else:
log.warning(
"/graph fetch failed: %s; falling back to MCP",
body.error,
)
return self._snapshot_via_mcp()
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_graph_snapshot() falls back to the MCP path on any /graph error (including AUTH/403, DNS, etc.). In those cases MCP will typically fail the same way and the method ends up returning an empty snapshot, which can mask a real configuration/auth outage as “empty graph”. Consider only falling back on HTTP 404 (endpoint absent) and otherwise surfacing the failure more explicitly (e.g., raise RuntimeError/SystemExit, or at least log at error level and return empty without retrying).

Suggested change
# _http_get returns QueryResult on error; treat 404 specifically
if isinstance(body, QueryResult):
if body.error and body.error.startswith("HTTP 404"):
log.info(
"/graph endpoint not present (404); falling back to MCP"
)
else:
log.warning(
"/graph fetch failed: %s; falling back to MCP",
body.error,
)
return self._snapshot_via_mcp()
# _http_get returns QueryResult on error; only treat 404 as
# "endpoint absent" and fall back to MCP in that case.
if isinstance(body, QueryResult):
if body.error and body.error.startswith("HTTP 404"):
log.info(
"/graph endpoint not present (404); falling back to MCP"
)
return self._snapshot_via_mcp()
error_msg = body.error or "unknown /graph fetch error"
log.error("/graph fetch failed: %s", error_msg)
raise RuntimeError(
f"Failed to fetch graph snapshot from /graph: {error_msg}"
)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deferred for now — the empty-on-error shape may be load-bearing for existing users who rely on get_graph_snapshot returning ([], []) on transient daemon issues rather than raising. The behavior change merits reviewer signal first. Happy to ship a follow-up that distinguishes by status code (404 → MCP fallback; 401/403/5xx/network → empty + log.error) once you've signaled direction. Tracking in the commit body of bb50736.

Comment thread sme/cli.py
Comment on lines +1118 to +1119
"--api-url is supplied (ladybugdb adapter in API-only mode, or "
"the mempalace-daemon adapter which never takes a path).",
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The retrieve --db help text says it’s only optional when --api-url is supplied, but MemPalaceDaemonAdapter can also resolve api_url/api_key from ~/.config/palace-daemon/env (or process env), so users can omit both --db and --api-url. Update the help text to mention env-file/env-var resolution to avoid implying --api-url is required for mempalace-daemon.

Suggested change
"--api-url is supplied (ladybugdb adapter in API-only mode, or "
"the mempalace-daemon adapter which never takes a path).",
"--api-url is supplied, or when the mempalace-daemon adapter "
"can resolve API settings from ~/.config/palace-daemon/env or "
"the process environment.",

Copilot uses AI. Check for mistakes.
Comment on lines +23 to +26
pytestmark = pytest.mark.skipif(
not os.environ.get("PALACE_DAEMON_URL"),
reason="needs a running palace-daemon; set PALACE_DAEMON_URL to enable",
)
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description says the live integration suite is gated by MEMPALACE_DAEMON_LIVE=1, but this test file is gated on PALACE_DAEMON_URL being set. Either the PR description or the gate condition/docstring here should be updated so contributors know which env var actually enables the live run.

Copilot uses AI. Check for mistakes.
jphein referenced this pull request in techempower-org/multipass-structural-memory-eval Apr 30, 2026
README:
- Status: corrected adapter count (3 -> 6: flat-baseline, mempalace,
  mempalace-daemon, familiar, rlm, ladybugdb) and noted that Cat 9b
  scaffolding shipped upstream as PR #1
- Added "Shipped: rlm adapter" section under Fork roadmap with the
  RLM_BASE_URL/RLM_MODEL/RLM_API_KEY env-var override, the table of
  first two live readings (Qwen 7B and Llama 70B both at 46.67% vs
  familiar's 78.33%), and the link to the 9a issue the data motivated
- Added "Upstream contributions in flight" section listing PRs #5/6/7
  and issues #3/4 currently open against M0nkeyFl0wer

docs/ideas.md:
- RlmAdapter section: the live-benchmark question the section closed
  with ("recover, plateau, or regress?") was answered today by the
  Qwen-7B and Llama-70B runs. Added a "Live benchmark answers" subsection
  with the per-run table, the four findings (plateau across model sizes,
  regress vs familiar, full-vs-partial trade at 70B, 2-hop doubles at
  70B), the unchanged v0.4 implication, and the link to issue #3
- Planned refinements #4 (Cat 9 MVP): added a 2026-04-30 status note
  that 9b shipped via upstream PR #1, 9a was proposed via issue #3,
  and 9c/9e/9f are paced
jphein referenced this pull request in techempower-org/multipass-structural-memory-eval Apr 30, 2026
Addresses Copilot review on PR #5:

- Drop unused `from collections import defaultdict` import (graph
  projection moved into sme.adapters._graph_mapping in the familiar
  PR; this import was left orphaned). Was failing ruff lint silently.
- Wrap `_parse_env_file`'s `path.read_text()` in a try/except for
  OSError + UnicodeDecodeError so adapter construction falls through
  to env-var resolution instead of raising on permissions issues,
  broken symlinks, or non-UTF-8 contents.
- Tighten the live-integration skip gate to require both
  PALACE_DAEMON_URL and PALACE_API_KEY (it previously checked only
  the URL, so a developer with the URL but not the key would get a
  hard test failure instead of a skip).

Deferred (commented in PR review): only-fall-back-on-404 for
get_graph_snapshot. That's a behavior change and the empty-on-error
shape may be load-bearing for existing users; will revisit once the
PR has reviewer signal on direction.
…test gate

Addresses Copilot review on PR #5:

- Wrap `_parse_env_file`'s `path.read_text()` in a try/except for
  OSError + UnicodeDecodeError so adapter construction falls through
  to env-var resolution instead of raising on permissions issues,
  broken symlinks, or non-UTF-8 contents.
- Tighten the live-integration skip gate to require both
  PALACE_DAEMON_URL and PALACE_API_KEY (it previously checked only
  the URL, so a developer with the URL but not the key would get a
  hard test failure instead of a skip).

(The unused-defaultdict-import fix from the same Copilot review only
applies after the shared _graph_mapping module lands; it ships with
the familiar adapter PR where the extraction happens.)

Deferred (commented in PR review): only-fall-back-on-404 for
get_graph_snapshot. That's a behavior change and the empty-on-error
shape may be load-bearing for existing users; will revisit once the
PR has reviewer signal on direction.
jphein referenced this pull request in techempower-org/multipass-structural-memory-eval Apr 30, 2026
Addresses Copilot review on PR #5:

- Drop unused `from collections import defaultdict` import (graph
  projection moved into sme.adapters._graph_mapping in the familiar
  PR; this import was left orphaned). Was failing ruff lint silently.
- Wrap `_parse_env_file`'s `path.read_text()` in a try/except for
  OSError + UnicodeDecodeError so adapter construction falls through
  to env-var resolution instead of raising on permissions issues,
  broken symlinks, or non-UTF-8 contents.
- Tighten the live-integration skip gate to require both
  PALACE_DAEMON_URL and PALACE_API_KEY (it previously checked only
  the URL, so a developer with the URL but not the key would get a
  hard test failure instead of a skip).

Deferred (commented in PR review): only-fall-back-on-404 for
get_graph_snapshot. That's a behavior change and the empty-on-error
shape may be load-bearing for existing users; will revisit once the
PR has reviewer signal on direction.
jphein referenced this pull request in techempower-org/multipass-structural-memory-eval Apr 30, 2026
Addresses Copilot review on PR #5:

- Drop unused `from collections import defaultdict` import (graph
  projection moved into sme.adapters._graph_mapping in the familiar
  PR; this import was left orphaned). Was failing ruff lint silently.
- Wrap `_parse_env_file`'s `path.read_text()` in a try/except for
  OSError + UnicodeDecodeError so adapter construction falls through
  to env-var resolution instead of raising on permissions issues,
  broken symlinks, or non-UTF-8 contents.
- Tighten the live-integration skip gate to require both
  PALACE_DAEMON_URL and PALACE_API_KEY (it previously checked only
  the URL, so a developer with the URL but not the key would get a
  hard test failure instead of a skip).

Deferred (commented in PR review): only-fall-back-on-404 for
get_graph_snapshot. That's a behavior change and the empty-on-error
shape may be load-bearing for existing users; will revisit once the
PR has reviewer signal on direction.
Copy link
Copy Markdown
Owner

@M0nkeyFl0wer M0nkeyFl0wer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. Merging this clears the way for #6 and #7.

Local verification on pr-5:

  • pytest tests/test_mempalace_daemon_adapter.py -v — 28/28 pass
  • Full suite — 64 passed, 3 skipped (live-gated)
  • ruff check — clean

What I liked:

  • Auth resolution chain is the right shape and the tests verify the priority order (kwarg → env_file → process env → ValueError). That's the kind of thing that silently swaps the wrong key in production if the order isn't pinned by tests.
  • /graph fast-path with MCP fallback distinguishes 404 (endpoint not present) from other failure modes before falling back, with the right log levels for each. The _project_graph projection mirrors MemPalaceAdapter.get_graph_snapshot structurally and the operational reasoning for skipping drawer-level surface ("impractical at 151K-drawer scale through HTTP") is honest and worth keeping in the docstring.
  • ingest_corpus raises NotImplementedError with a helpful pointer to the daemon's /memory endpoint instead of silent no-op. Right call.
  • Operational comments earn their place — "180s timeout, list_wings on 151K-drawer palace ~30s" is exactly the kind of comment that saves a future debugger an hour.
  • The 3946-line diff is mostly design docs + tests; the adapter itself is 524 LOC, healthy ratio.

No requested changes. The unused-import lint that ships with PR #6 is appropriately deferred.

Once the stack is in I'll wire MemPalaceDaemonAdapter into the check test runner alongside MemPalaceAdapter so the CLI path is symmetric.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants