Fix AtomicMemory thread scope and add meta-fact filter #2

Merged
ethanj merged 1 commit into main from sync/sdk-d3abc7d on May 17, 2026
Contributor

@ethanj ethanj commented May 17, 2026

## Summary

Fix the AtomicMemory provider dropping thread scope (session ID) on memory operations, and introduce a `MetaFactFilter` that removes extraction artifacts — facts the extractor emits about the conversation itself rather than about the user — before they can pollute embedding search results.

## Changes

- **Thread scope fix** — `scope-mapper.ts` and `handle-impl.ts` now correctly carry thread/session scope through list, search, and upsert paths so memories are isolated to the right conversation thread.
- **`MetaFactFilter`** (`src/memory/meta-fact-filter.ts`) — pattern-based filter that rejects statements matching extraction artifacts (`The user asked for…`, `As of <date>, X is a term mentioned…`, `A name was mentioned…`). Exported from `src/memory/index.ts`.
- **Provider types** (`types.ts`) — add `threadId` / session scope fields to internal request/response shapes.
- **Mapper updates** (`mappers.ts`) — propagate new scope fields through wire-format mapping.
- **Fixture refresh** — `search.raw.json`, `search-fast.raw.json`, `list.mapped.json`, and related mapped fixtures updated to reflect the new scope fields in provider responses.
- **Test coverage** — `atomicmemory-provider.test.ts` extended for thread scope round-trips; `meta-fact-filter.test.ts` added with pattern coverage across all rejection categories; `namespace-base-routes.test.ts` added for scoped route construction.
- **AlignBench benchmark suite** (`benchmarks/alignbench/`) — controlled 60-query / 55-fact recall benchmark with a standalone runner, six-model embedding ablation, and full results. Validates the meta-fact filter lift and falsifies the pronoun-rewrite hypothesis against a pre-registered threshold.

## Why

Thread scope was being silently dropped in the scope mapper, causing all memory reads and writes to fall through to the namespace root rather than the per-session partition. As a result, memories that were meant to be isolated to one thread were readable from unrelated sessions in the same namespace.
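
To illustrate the failure mode described above, here is a minimal sketch of a scope mapper that drops the thread and one that carries it through. All names here (`MemoryScope`, `WireScope`, `toWireScope`, `partitionKey`, the `thread_id` wire field) are hypothetical and chosen for illustration; the actual shapes in `scope-mapper.ts` and `types.ts` may differ.

```typescript
// Hypothetical internal scope shape; the real types live in types.ts.
interface MemoryScope {
  namespace: string;
  threadId?: string; // thread/session identifier
}

// Hypothetical wire-format scope; the field name is an assumption.
interface WireScope {
  namespace: string;
  thread_id?: string;
}

// Shape of the bug: the thread is never copied, so every request
// resolves to the namespace root.
function toWireScopeDroppingThread(scope: MemoryScope): WireScope {
  return { namespace: scope.namespace };
}

// Shape of the fix: carry the thread through whenever it is present.
function toWireScope(scope: MemoryScope): WireScope {
  const wire: WireScope = { namespace: scope.namespace };
  if (scope.threadId !== undefined) {
    wire.thread_id = scope.threadId;
  }
  return wire;
}

// Illustrative partition key: namespace root when no thread is set,
// otherwise a per-thread partition under the namespace.
function partitionKey(wire: WireScope): string {
  return wire.thread_id === undefined
    ? wire.namespace
    : `${wire.namespace}/${wire.thread_id}`;
}
```

With the dropping mapper, two scopes that differ only in `threadId` produce the same partition key, which is the cross-session leak described above. With the fixed mapper they resolve to distinct keys.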

Separately, production extraction pipelines emit low-signal meta-facts (`The user asked for the user's name.`, `As of May 14, the user is a term mentioned in the conversation.`) that occupy the same embedding neighborhood as real user facts, causing a retrieval margin collapse — the correct fact loses top-1 to an extraction artifact. The AlignBench benchmark quantified this: dropping extraction meta-facts yields the largest single recall lift (+0.03–0.05 r@1) of any algorithmic retrieval patch tested.
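
As a rough sketch of what a pattern-based meta-fact filter can look like, the example below rejects statements shaped like the artifacts quoted in this description. The regular expressions and the names `META_FACT_PATTERNS`, `isMetaFact`, and `filterMetaFacts` are assumptions made for illustration; they are not the patterns or API of the `MetaFactFilter` in `src/memory/meta-fact-filter.ts`.

```typescript
// Illustrative patterns only, modeled on the example artifacts quoted
// in this PR description. The real filter's patterns may differ.
const META_FACT_PATTERNS: RegExp[] = [
  /^the user asked (for|about)\b/i,
  /^as of [^,]+, .+ is a term mentioned\b/i,
  /^a name was mentioned\b/i,
];

// True when the statement describes the conversation itself rather
// than a fact about the user.
function isMetaFact(statement: string): boolean {
  const text = statement.trim();
  return META_FACT_PATTERNS.some((pattern) => pattern.test(text));
}

// Drops meta-facts so they never reach the embedding index.
function filterMetaFacts(statements: string[]): string[] {
  return statements.filter((statement) => !isMetaFact(statement));
}

const extracted = [
  "The user asked for the user's name.",
  "As of May 14, the user is a term mentioned in the conversation.",
  "The user lives in Lisbon.",
];
const kept = filterMetaFacts(extracted);
```

In this sketch `kept` contains only the third statement. Anchoring each pattern at the start of the statement keeps the filter conservative: a real user fact that merely contains the word "mentioned" later in the sentence is not rejected.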

## Validation

- `pnpm test` — all provider, meta-fact-filter, and namespace-base-routes tests pass.
- `pnpm typecheck` — no new type errors; new `threadId` fields are fully typed.
- Fixture contract tests confirm the updated wire shapes match the refreshed recorded responses.
- AlignBench run results committed in `benchmarks/alignbench/runs/` for reproducibility; the clean-pool variant (meta-facts removed) shows r@5 0.950 vs 0.933 baseline with distractor-top1 dropping to zero.
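
For readers unfamiliar with the metrics quoted above, the following sketch shows one common way to compute recall@k and a distractor-top1 rate. It assumes one gold fact per query and defines distractor-top1 as the fraction of queries whose top-ranked result is a known distractor; both are assumptions, and the AlignBench runner may define or compute these differently. The names `QueryResult`, `recallAtK`, and `distractorTop1Rate` are hypothetical.

```typescript
// Hypothetical per-query record: one gold fact and a ranked result list.
interface QueryResult {
  relevantId: string;  // id of the single correct fact for this query
  rankedIds: string[]; // retrieved fact ids, best match first
}

// Fraction of queries whose gold fact appears in the top k results.
function recallAtK(results: QueryResult[], k: number): number {
  if (results.length === 0) return 0;
  const hits = results.filter(
    (r) => r.rankedIds.slice(0, k).indexOf(r.relevantId) !== -1
  ).length;
  return hits / results.length;
}

// Fraction of queries whose top-ranked result is a known distractor
// (for example, an extraction meta-fact).
function distractorTop1Rate(results: QueryResult[], distractorIds: string[]): number {
  if (results.length === 0) return 0;
  const hits = results.filter((r) => {
    const top = r.rankedIds[0];
    return top !== undefined && distractorIds.indexOf(top) !== -1;
  }).length;
  return hits / results.length;
}
```

Under these definitions, and if all 60 queries are scored, r@5 values of 0.950 and 0.933 would correspond to 57 and 56 hits respectively, a difference of one query.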
@ethanj ethanj merged commit c345cc2 into main May 17, 2026
2 checks passed
@ethanj ethanj deleted the sync/sdk-d3abc7d branch May 17, 2026 08:35