cafitac · May 13, 2026
diff --git a/.dev/roadmap/memory-consolidation/current-progress-and-next-steps.md b/.dev/roadmap/memory-consolidation/current-progress-and-next-steps.md
@@ -1,11 +1,11 @@
 # Memory Consolidation Current Progress and Next Steps
 
 Status: AI-authored draft. Not yet human-approved.
-Last updated: 2026-05-13 15:48 KST
+Last updated: 2026-05-13 16:32 KST
 
 ## v0.1.152 released runtime checkpoint and next runway
 
-This document is the restartable checkpoint after the v0.1.152 release/runtime rollout: 50-task expanded retrieval fixture gate, 75 checked-in retrieval eval tasks across the fixture directory, per-candidate collapse proof artifact persistence/replay with supersession-chain evidence, one fresh non-idempotent narrow live reviewed-candidate promotion, copy/live-safe explicit approval corridor evidence, v0.1.152 `personal-oss` Hermes hook rollout, released named ranking policy/shadow-compare diagnostics, approval-gated config-only default-ranking migrate/rollback mechanics, and 50-task live-Hermes-DB representative shadow corpus evidence while keeping `conservative_legacy` as the live default.
+This document is the restartable checkpoint after the v0.1.152 release/runtime rollout: 50-task expanded retrieval fixture gate, 75 checked-in retrieval eval tasks across the fixture directory, per-candidate collapse proof artifact persistence/replay with supersession-chain evidence, one fresh non-idempotent narrow live reviewed-candidate fact promotion, one guarded live reviewed procedure/episode promotion pair, copy/live-safe explicit approval corridor evidence, v0.1.152 `personal-oss` Hermes hook rollout, released named ranking policy/shadow-compare diagnostics, approval-gated config-only default-ranking migrate/rollback mechanics, and 50-task live-Hermes-DB representative fact plus mixed fact/procedure/episode shadow corpus evidence while keeping `conservative_legacy` as the live default.
 
 Current verified release state:
 
@@ -20,28 +20,28 @@ Current verified release state:
 Fresh diagnostics:
 
 - `g4-linkage-gap-diagnose-v0138-fresh.json`: decision `fresh_trace_linkage_gap_not_detected`.
-- `/Users/reddit/.agent-memory/reports/default-ranking-v0152-shadow/fresh-epoch-since-v0152.json`: still blocks epoch-wide automation on `epoch_empty_retrieval_outcome_metadata_gap_classified`.
+- `/Users/reddit/.agent-memory/reports/default-ranking-v0152-shadow/fresh-epoch-since-v0152-with-metadata-gap-diagnostic.json`: still blocks epoch-wide automation on `low_epoch_observation_trace_coverage` and `epoch_empty_retrieval_outcome_metadata_gap_classified`; metadata-gap drilldown reports `dominant_blocker=classified_legacy_missing_outcome`, `classified_missing_outcome_count=6`, and `unresolved_adapter_payload_gap_count=0`.
 - `/tmp/agent-memory-apply-corridor-v0150/`: copy/live-safe explicit approval corridor smoke passed without unintended durable-memory mutation; live apply was idempotent.
 - `/tmp/agent-memory-telemetry-reset-decision/copy-apply.json`: copy telemetry reset passed with protected durable memory tables unchanged; live telemetry reset remains blocked.
-- 50-task expanded retrieval source fixture gate exists, the checked-in fixture directory evaluates at 75/75 pass, and the live-Hermes-DB representative 50-task fact corpus passes with zero shadow regressions/no durable mutation. The checked-in expanded fixture is still not directly replayable against the tiny live DB because project-M1 references are absent; default ranking remains unchanged until a separate explicit default-rollout decision.
+- 50-task expanded retrieval source fixture gate exists, the checked-in fixture directory evaluates at 75/75 pass, and live-Hermes-DB representative 50-task fact and mixed fact/procedure/episode corpora pass with zero shadow regressions/no durable ranking mutation. The checked-in expanded fixture is still not directly replayable against the tiny live DB because project-M1 references are absent; default ranking remains unchanged until a separate explicit default-rollout decision.
 - Collapse proof artifacts can be persisted/replayed and can reach `satisfied` with reviewed supersession-chain/relation evidence, but collapse/delete apply remains disabled.
 
 Progress estimate:
 
-- Overall north-star: 76-78%.
-- Substrate/evidence plumbing: about 86%.
-- Safe automatic mutation/promotion: about 64-68%.
-- Remaining work: about 22-24% overall.
+- Overall north-star: 78-80%.
+- Substrate/evidence plumbing: about 87%.
+- Safe automatic mutation/promotion: about 66-70%.
+- Remaining work: about 20-22% overall.
 
 Current interpretation:
 
-Fresh v0.1.152 evidence and merged G5a-G5i plus default-ranking migration mechanics are healthy enough to continue the brain-like reviewed-candidate runway. The current runway has completed the expanded retrieval source fixture gate, stronger read-only opt-in ranking comparison, supersession-chain collapse proof evidence, one fresh guarded live reviewed-candidate promotion, the explicit default-ranking opt-in-to-default migration design, released named ranking policy diagnostics plus approval-gated config-only migrate/rollback mechanics, and representative live-Hermes-DB shadow evidence preserving `conservative_legacy`. Broad G4/background apply remains blocked. Current next work is to broaden live shadow fixture coverage beyond facts into procedure/episode surfaces, continue telemetry/fresh-epoch reconciliation, and only then consider explicit operator-approved default ranking migration.
+Fresh v0.1.152 evidence and merged G5a-G5i plus default-ranking migration mechanics are healthy enough to continue the brain-like reviewed-candidate runway. The current runway has completed the expanded retrieval source fixture gate, stronger read-only opt-in ranking comparison, supersession-chain collapse proof evidence, one fresh guarded live reviewed-candidate fact promotion, one guarded live reviewed procedure/episode promotion pair, the explicit default-ranking opt-in-to-default migration design, released named ranking policy diagnostics plus approval-gated config-only migrate/rollback mechanics, and representative live-Hermes-DB fact plus mixed shadow evidence preserving `conservative_legacy`. Broad G4/background apply remains blocked. Current next work is to improve fresh-epoch telemetry coverage and reduce classified legacy missing-outcome rows through metadata-rich dogfooding before any explicit operator-approved default ranking migration.
 
 Recommended sequence from here:
 
 1. Keep live default ranking on `conservative_legacy`; do not run live `retrieval-ranking-migrate-default` until the operator gives the exact approval phrase and fresh-epoch telemetry is green.
-2. Broaden live shadow fixture coverage beyond the current 50 approved-fact tasks by seeding/approving representative procedure and episode memories through guarded review corridors.
-3. Continue telemetry/fresh-epoch reconciliation; current post-v0.1.152 telemetry-only reconciliation is green, but fresh-epoch still blocks on `epoch_empty_retrieval_outcome_metadata_gap_classified`.
+2. Continue metadata-rich dogfooding to lift fresh-epoch observation/trace linkage coverage above threshold and replace classified legacy missing-outcome rows.
+3. Keep live mixed fact/procedure/episode corpus work in read-only shadow comparison unless additional representative memories are promoted through guarded review corridors with backup/hash/actor/reason/approval evidence.
 4. Keep collapse proof evidence-driven: `satisfied` requires supersession-chain/relation evidence, and collapse/delete apply remains disabled.
 5. Keep fresh reviewed candidate promotion limited to the explicit guarded corridor with backup/hash/actor/reason/approval evidence; do not use broad apply.
 6. Preserve broad G4/background apply as blocked until ranking, rollback replay, telemetry reconciliation/fresh epoch, and reviewed queue approvals all pass on real runtime evidence.
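
The gating rule in step 1 of the recommended sequence can be sketched as a small check. This is a minimal illustration, not the project's implementation: `FreshEpochReport`, `may_migrate_default`, and the `required_phrase` parameter are hypothetical names, and the real approval phrase is deliberately not reproduced here. Only the policy name and the two blocker strings come from the checkpoint text above.

```python
from dataclasses import dataclass, field

# Policy name taken from the checkpoint; everything else here is hypothetical.
LIVE_DEFAULT_POLICY = "conservative_legacy"


@dataclass
class FreshEpochReport:
    """Illustrative stand-in for a fresh-epoch diagnostic report."""

    blockers: list[str] = field(default_factory=list)

    @property
    def is_green(self) -> bool:
        # "Green" is assumed to mean that no blockers are reported.
        return not self.blockers


def may_migrate_default(
    report: FreshEpochReport,
    operator_phrase: str,
    required_phrase: str,
) -> tuple[bool, list[str]]:
    """Return (allowed, reasons); migration is allowed only with no reasons."""
    reasons = [f"fresh-epoch blocker: {b}" for b in report.blockers]
    if operator_phrase != required_phrase:
        reasons.append("exact approval phrase not supplied")
    return (not reasons, reasons)


if __name__ == "__main__":
    # Blockers as listed in the fresh diagnostics of this checkpoint.
    report = FreshEpochReport(
        blockers=[
            "low_epoch_observation_trace_coverage",
            "epoch_empty_retrieval_outcome_metadata_gap_classified",
        ]
    )
    allowed, reasons = may_migrate_default(report, "", "<approval phrase>")
    print(LIVE_DEFAULT_POLICY if not allowed else "migration permitted", reasons)
```

With the blockers currently reported, the check refuses migration and the live default stays on `conservative_legacy`; both conditions must hold before the migrate command is even considered.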

diff --git a/.dev/status/current-handoff.md b/.dev/status/current-handoff.md
@@ -1,7 +1,7 @@
 # agent-memory current handoff
 
 Status: AI-authored draft. Not yet human-approved.
-Last updated: 2026-05-13 15:48 KST
+Last updated: 2026-05-13 16:32 KST
 
 ## v0.1.152 released runtime checkpoint
 
@@ -16,23 +16,23 @@ Current verified state:
 - Hermes hook doctor is green for `personal-oss` after `--accept-hooks` smoke on the v0.1.152 runtime.
 - Fresh G4 report directory retained: `/Users/reddit/.agent-memory/reports/g4-v0138-20260512-132253/`.
 - Fresh linkage diagnosis retained from G4 diagnostics: `g4-linkage-gap-diagnose-v0138-fresh.json` passed with decision `fresh_trace_linkage_gap_not_detected`.
-- Current v0.1.152 source/runtime runway now includes a 50-task expanded retrieval fixture gate (`live-compatible-50-gate.json`), 75 checked-in retrieval eval tasks across the fixture directory, persisted/replayed per-candidate collapse proof artifacts with relation-equivalence/supersession-chain evidence, one fresh live G5 reviewed-candidate promotion (`candidate:29db0390b2f81bdb` -> `fact:4`) with backup/hash evidence, idempotent live G4 queue apply evidence, the explicit default-ranking opt-in-to-default migration plan at `.dev/roadmap/memory-consolidation/default-ranking-opt-in-to-default-migration.md`, and the released default-ranking migration mechanics.
-- Default-ranking migration mechanics are now released in v0.1.152: named `conservative_legacy`/`graph_reinforced_v1`/`shadow_compare` policy diagnostics, shadow compare on `retrieval-ranking-experiment`, and approval-gated config-only `retrieval-ranking-migrate-default` with protected table hash proof plus rollback metadata. Live Hermes remains on `conservative_legacy`; live shadow reports under `/Users/reddit/.agent-memory/reports/default-ranking-v0152-shadow/` include a 50-task representative live-Hermes-DB fact corpus with 50/50 pass, zero baseline regressions, protected default order, and no durable mutation. The checked-in expanded 50-task source fixture still fails against the tiny live DB because project-M1 references are absent; the gap artifact is `checked-in-expanded-50-live-gap.stderr.txt`.
+- Current v0.1.152 source/runtime runway now includes a 50-task expanded retrieval fixture gate (`live-compatible-50-gate.json`), 75 checked-in retrieval eval tasks across the fixture directory, persisted/replayed per-candidate collapse proof artifacts with relation-equivalence/supersession-chain evidence, one fresh live G5 reviewed-candidate fact promotion (`candidate:29db0390b2f81bdb` -> `fact:4`) with backup/hash evidence, one guarded live reviewed procedure/episode promotion pair (`candidate:3435fe1db562aaf2` -> `procedure:1`, `candidate:4a35c03e7130fdec` -> `episode:1`) with backup/hash evidence, idempotent live G4 queue apply evidence, the explicit default-ranking opt-in-to-default migration plan at `.dev/roadmap/memory-consolidation/default-ranking-opt-in-to-default-migration.md`, and the released default-ranking migration mechanics.
+- Default-ranking migration mechanics are now released in v0.1.152: named `conservative_legacy`/`graph_reinforced_v1`/`shadow_compare` policy diagnostics, shadow compare on `retrieval-ranking-experiment`, and approval-gated config-only `retrieval-ranking-migrate-default` with protected table hash proof plus rollback metadata. Live Hermes remains on `conservative_legacy`; live shadow reports under `/Users/reddit/.agent-memory/reports/default-ranking-v0152-shadow/` include a 50-task representative live-Hermes-DB fact corpus and a 50-task mixed fact/procedure/episode corpus, both with 50/50 pass, zero baseline regressions, protected default order, and no durable ranking mutation. The checked-in expanded 50-task source fixture still fails against the tiny live DB because project-M1 references are absent; the gap artifact is `checked-in-expanded-50-live-gap.stderr.txt`.
 - Broad G4/background apply remains blocked; default retrieval ranking changes, collapse/delete apply, live telemetry reset, and ordinary conversation auto-approval remain blocked. The new fact `fact:4` also records this guardrail in the live memory DB.
 
 Progress estimate:
 
-- Overall north-star: 76-78%.
-- Substrate/evidence plumbing: about 86%.
-- Safe automatic mutation/promotion: about 64-68%.
-- Remaining work: about 22-24% overall.
+- Overall north-star: 78-80%.
+- Substrate/evidence plumbing: about 87%.
+- Safe automatic mutation/promotion: about 66-70%.
+- Remaining work: about 20-22% overall.
 
 Current interpretation:
 
 - The trace/retrieval/candidate/proof substrate is healthy enough for the next safety runway.
-- Completed in the current runway: expanded retrieval gate to 50 tasks, proved the checked-in fixture directory at 75/75 pass, moved collapse proof to `satisfied` with supersession-chain evidence while keeping collapse/delete disabled, ran one fresh non-idempotent narrow live reviewed-candidate promotion with backup/hash verification, released/runtime-smoked v0.1.151, documented the explicit default-ranking opt-in-to-default migration plan, implemented and released the named-policy/shadow-compare/config-only migrate/rollback command path in v0.1.152, and smoke-tested live shadow comparison plus a 50-task representative live fact corpus without changing the live default.
+- Completed in the current runway: expanded retrieval gate to 50 tasks, proved the checked-in fixture directory at 75/75 pass, moved collapse proof to `satisfied` with supersession-chain evidence while keeping collapse/delete disabled, ran one fresh non-idempotent narrow live reviewed-candidate fact promotion plus one guarded reviewed procedure/episode promotion pair with backup/hash verification, released/runtime-smoked v0.1.151, documented the explicit default-ranking opt-in-to-default migration plan, implemented and released the named-policy/shadow-compare/config-only migrate/rollback command path in v0.1.152, and smoke-tested live shadow comparison plus both 50-task representative live fact and mixed corpora without changing the live default.
 - Broad G4/background apply remains blocked; existing docs/RED-test-only broad-G4 baseline must not be advertised as ready.
-- Retrieval ranking changes remain opt-in experiments only; the expanded 50-task source experiment and the representative 50-task live-Hermes-DB fact corpus both passed as read-only comparisons with no durable mutation. v0.1.152 adds released migration mechanics, but live default enablement still requires broader live fixture coverage, fresh-epoch telemetry green, the exact approval phrase, and explicit operator approval.
+- Retrieval ranking changes remain opt-in experiments only; the expanded 50-task source experiment, the representative 50-task live-Hermes-DB fact corpus, and the representative 50-task mixed fact/procedure/episode corpus all passed as read-only comparisons with no durable ranking mutation. v0.1.152 adds released migration mechanics, but live default enablement still requires fresh-epoch telemetry green, the exact approval phrase, and explicit operator approval.
 
 Current safe mutation boundaries: