Fix AtomicMemory thread scope and add meta-fact filter #2
Merged
## Summary

Fix the AtomicMemory provider dropping thread scope (session ID) on memory operations, and introduce a `MetaFactFilter` that removes extraction artifacts (facts the extractor emits about the conversation itself rather than about the user) before they can pollute embedding search results.

## Changes

- **Thread scope fix**: `scope-mapper.ts` and `handle-impl.ts` now correctly carry thread/session scope through list, search, and upsert paths so memories are isolated to the right conversation thread.
- **`MetaFactFilter`** (`src/memory/meta-fact-filter.ts`): pattern-based filter that rejects statements matching extraction artifacts (`The user asked for…`, `As of <date>, X is a term mentioned…`, `A name was mentioned…`). Exported from `src/memory/index.ts`.
- **Provider types** (`types.ts`): add `threadId` / session scope fields to internal request/response shapes.
- **Mapper updates** (`mappers.ts`): propagate new scope fields through wire-format mapping.
- **Fixture refresh**: `search.raw.json`, `search-fast.raw.json`, `list.mapped.json`, and related mapped fixtures updated to reflect the new scope fields in provider responses.
- **Test coverage**: `atomicmemory-provider.test.ts` extended for thread scope round-trips; `meta-fact-filter.test.ts` added with pattern coverage across all rejection categories; `namespace-base-routes.test.ts` added for scoped route construction.
- **AlignBench benchmark suite** (`benchmarks/alignbench/`): controlled 60-query / 55-fact recall benchmark with a standalone runner, six-model embedding ablation, and full results. Validates the meta-fact filter lift and falsifies the pronoun-rewrite hypothesis against a pre-registered threshold.

## Why

Thread scope was being silently dropped in the scope mapper, causing all memory reads and writes to fall through to the namespace root rather than the per-session partition. This meant thread-isolated memories were readable across unrelated sessions.
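To illustrate the failure mode described above, here is a minimal sketch. It is not the actual `scope-mapper.ts` code: the `MemoryScope` and `WireScope` shapes, the `thread_id` wire field, and the `partitionKey` helper are hypothetical names chosen for the example, and the real provider may partition storage differently.

```typescript
// Hypothetical shapes for illustration; the real types in types.ts may differ.
interface MemoryScope {
  namespace: string;
  threadId?: string;
}

interface WireScope {
  namespace: string;
  thread_id?: string;
}

// Buggy variant: the thread scope is silently dropped, so every
// operation resolves to the namespace root.
function mapScopeDroppingThread(scope: MemoryScope): WireScope {
  return { namespace: scope.namespace };
}

// Fixed variant: the thread scope is carried through when present.
function mapScope(scope: MemoryScope): WireScope {
  const wire: WireScope = { namespace: scope.namespace };
  if (scope.threadId !== undefined) {
    wire.thread_id = scope.threadId;
  }
  return wire;
}

// Assumed partitioning rule: namespace root when no thread is set,
// otherwise a per-thread partition under the namespace.
function partitionKey(wire: WireScope): string {
  return wire.thread_id === undefined
    ? wire.namespace
    : `${wire.namespace}/${wire.thread_id}`;
}
```

Under these assumptions, two sessions `t1` and `t2` in the same namespace map to the same partition with the buggy mapper and to distinct partitions with the fixed one, which is the isolation property the new round-trip tests are meant to cover.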
Separately, production extraction pipelines emit low-signal meta-facts (`The user asked for the user's name.`, `As of May 14, the user is a term mentioned in the conversation.`) that occupy the same embedding neighborhood as real user facts, causing a retrieval margin collapse: the correct fact loses top-1 to an extraction artifact. The AlignBench benchmark quantified this: dropping extraction meta-facts yields the largest single recall lift (+0.03–0.05 r@1) of any algorithmic retrieval patch tested.

## Validation

- `pnpm test`: all provider, meta-fact-filter, and namespace-base-routes tests pass.
- `pnpm typecheck`: no new type errors; new `threadId` fields are fully typed.
- Fixture contract tests confirm the updated wire shapes match the refreshed recorded responses.
- AlignBench run results committed in `benchmarks/alignbench/runs/` for reproducibility; the clean-pool variant (meta-facts removed) shows r@5 0.950 vs 0.933 baseline with distractor-top1 dropping to zero.
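For reviewers unfamiliar with the filter, a pattern-based rejection step of the kind described under Changes can be sketched roughly as follows. The regexes are illustrative guesses built from the three example statements in this description; they are not the patterns shipped in `src/memory/meta-fact-filter.ts`, and `isMetaFact` / `filterMetaFacts` are hypothetical helper names rather than the `MetaFactFilter` API.

```typescript
// Illustrative patterns only; the real list lives in
// src/memory/meta-fact-filter.ts and may be broader or stricter.
const META_FACT_PATTERNS: RegExp[] = [
  /^the user asked (for|about)\b/i,          // "The user asked for…"
  /^as of [^,]+, .+ is a term mentioned\b/i, // "As of <date>, X is a term mentioned…"
  /^a name was mentioned\b/i,                // "A name was mentioned…"
];

// Returns true when a statement looks like an extraction artifact,
// i.e. a fact about the conversation rather than about the user.
function isMetaFact(statement: string): boolean {
  const text = statement.trim();
  return META_FACT_PATTERNS.some((pattern) => pattern.test(text));
}

// Drops meta-facts before the remaining statements are embedded and stored.
function filterMetaFacts(statements: string[]): string[] {
  return statements.filter((statement) => !isMetaFact(statement));
}
```

Applied to the two examples quoted in the Why section, this sketch would reject both while keeping an ordinary statement such as `The user lives in Berlin.`. Anchoring each pattern at the start of the statement is a deliberate choice in the sketch to reduce the chance of rejecting real user facts that merely contain similar wording.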