memory/kb: widen FTS5 snippet window 24 → 60 + nudge agent to follow up with fs_read by FMXExpress · Pull Request #309 · FMXExpress/PasClaw

FMXExpress · 2026-06-18T13:59:20Z

Summary

Addresses the actionable finding from PR #308's LOCOMO bench results:

FTS5-bounded snippet window often clips the actual answer line (snippet R@10 ≈ 0.75). That's an actionable finding — either widen snippet windows in PasClaw.Memory.Index or train the agent to follow up with fs_read on retrieved citations more aggressively.

Does both — the two fixes are complementary, not alternatives.

Bench delta

Re-running the PR #308 harness against the bundled alice_synthetic persona with this branch applied:

| metric | before (width=24) | after (width=60) |
|---|---|---|
| snippet R@10 | 0.75 | 1.0 |
| snippet R@1 | 0.375 | 0.875 |
| doc R@10 | 1.0 | 1.0 (unchanged) |

Doc-level recall was already optimal: the layer found the right file, it just wasn't showing the right line in the snippet. Widening the window closes that gap on this synthetic persona. The remaining snippet R@1 = 0.875 (1 out of 8 missed) is a BM25 ranking artifact, not a snippet-width issue; it is left for a follow-up that tunes the hybrid ranker.
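To make the gap between the two metrics concrete, here is a minimal sketch of how doc-level and snippet-level recall@k could be computed. It assumes a doc hit means the gold file appears in the top k results, and a snippet hit additionally requires the gold answer text to be visible in that result's snippet. The `Hit` and `Case` types, the paths, and the toy data are hypothetical and are not taken from the PR #308 harness.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    path: str      # file cited by the search result
    snippet: str   # bounded snippet text returned with the hit


@dataclass
class Case:
    gold_path: str    # file that actually contains the answer
    gold_answer: str  # text that must be visible for a snippet-level hit
    hits: list        # ranked search results, best first


def doc_recall_at_k(cases, k):
    """Fraction of cases whose gold file appears among the top k hits."""
    found = sum(any(h.path == c.gold_path for h in c.hits[:k]) for c in cases)
    return found / len(cases)


def snippet_recall_at_k(cases, k):
    """Fraction of cases where a top-k hit on the gold file also shows the answer."""
    found = sum(
        any(h.path == c.gold_path and c.gold_answer in h.snippet for h in c.hits[:k])
        for c in cases
    )
    return found / len(cases)


# Hypothetical toy data: both files are found, but one snippet clips the answer.
cases = [
    Case("notes/a.md", "cbor", [Hit("notes/a.md", "... Final decision: cbor.")]),
    Case("notes/b.md", "Lisbon", [Hit("notes/b.md", "travel plans for the spring ...")]),
]

print(doc_recall_at_k(cases, 10))      # 1.0
print(snippet_recall_at_k(cases, 10))  # 0.5
```

Under these assumed definitions, doc recall can sit at 1.0 while snippet recall lags, which is the pattern the table above reports for width=24.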

Changes

1. `PasClaw.Memory.Index` — lift the magic number, widen to 60

Adds FTS5_SNIPPET_TOKENS = 60 as an interface-section const so PasClaw.KB.Index can reuse it (memory_search and kb_search should agree on snippet width — they share a tokenizer and they share the bench). 64 is FTS5's hard ceiling per call; 60 leaves a sliver of slack for the «...» highlight markup. Both Search() overloads (FPC sqldb and Delphi FireDAC) now interpolate IntToStr(FTS5_SNIPPET_TOKENS).
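The effect of the width argument can be reproduced outside PasClaw. The sketch below uses Python's built-in `sqlite3` module (assuming the linked SQLite was built with FTS5) rather than the project's Pascal units. The table `notes_fts`, its columns, the note text, and the helper `snippet_for` are hypothetical; only the `snippet()` call shape, the 24 and 60 widths, and the constant name come from the description above.

```python
import sqlite3

FTS5_SNIPPET_TOKENS = 60  # new width from this PR (FTS5 accepts 1..64)
OLD_SNIPPET_TOKENS = 24   # previous hard-coded width

# Hypothetical note: query terms sit in the first sentence, the answer in the last.
NOTE = (
    "We compared serialization format options for storage this week. "
    "JSON was easy to debug but large on disk. "
    "MessagePack was compact but the tooling felt thin. "
    "Protobuf needed schemas that nobody wanted to maintain. "
    "After two benchmark rounds the team agreed. "
    "Final decision: cbor."
)


def snippet_for(conn, query, width):
    """Return the snippet of the best-ranked hit, using `width` tokens."""
    # The width is interpolated as an integer literal, mirroring the
    # IntToStr(FTS5_SNIPPET_TOKENS) interpolation described in the PR.
    sql = (
        "SELECT snippet(notes_fts, 1, '«', '»', '...', " + str(int(width)) + ") "
        "FROM notes_fts WHERE notes_fts MATCH ? ORDER BY rank LIMIT 1"
    )
    row = conn.execute(sql, (query,)).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE notes_fts USING fts5(path, body)")
conn.execute(
    "INSERT INTO notes_fts(path, body) VALUES (?, ?)",
    ("notes/decision.md", NOTE),
)

query = "serialization format storage"
narrow = snippet_for(conn, query, OLD_SNIPPET_TOKENS)
wide = snippet_for(conn, query, FTS5_SNIPPET_TOKENS)

print("cbor" in narrow, "cbor" in wide)
```

On a build with FTS5 this should print `False True`: the note is about 44 tokens long, so a 24-token window anchored on the query terms ends well before the final sentence, while a 60-token window covers the whole note.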

2. `PasClaw.KB.Index` — reuse the same const

Two snippet(kb_fts, 2, ...) queries now reference FTS5_SNIPPET_TOKENS (picked up from the existing uses PasClaw.Memory.Index). The uses comment is updated so the cross-unit dependency is grep-able.

3. `PasClaw.Agent.Prompt.BuildRulesSection` — nudge for citation follow-up

Rule #5 (the memory rule) now explicitly tells the model that memory_search / kb_search return bounded snippets, and that when the cited file is right but the answer line might fall just outside the window, it should follow up with fs_read (or kb_get for the KB) on the cited path. The rule frames a snippet that shows the right file but not quite the right line as a hit, not a miss, which drives the agent to iterate on citations instead of giving up.

Why both fixes

- Wider window lifts the common case: short answer lines that used to fall just outside a 24-token window now show up in the first hit. Cheap and zero-token cost to the model.
- Prompt nudge handles long-context documents where even 60 tokens can't centre on the right line (e.g. a multi-paragraph note where the relevant fact is several sentences from the search term that anchored the hit). The model now knows to read the underlying file instead of treating the snippet as the final answer.
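The follow-up behaviour the prompt nudge aims for can be sketched as a small fallback: use the snippet when it already shows the needed line, otherwise treat the hit as a pointer and read the cited file. In the real agent the model itself makes that judgement and calls fs_read; here the `is_sufficient` predicate and the `read_file` callable are hypothetical stand-ins, as is the `answer_text` helper.

```python
import tempfile
from pathlib import Path


def answer_text(hit_path, snippet, is_sufficient, read_file):
    """Return (source, text) for a search hit.

    is_sufficient: stand-in for the model judging whether the snippet answers
                   the question (hypothetical).
    read_file:     stand-in for the fs_read tool (hypothetical).
    """
    if is_sufficient(snippet):
        # The snippet already shows the answer line: no extra tool call.
        return "snippet", snippet
    # Right file, wrong window: follow the citation instead of giving up.
    return "file", read_file(hit_path)


with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "decision.md"
    note.write_text(
        "Options were compared at length.\n\nFinal decision: cbor.\n",
        encoding="utf-8",
    )
    clipped = "Options were compared ..."  # snippet that stops before the answer
    source, text = answer_text(
        str(note),
        clipped,
        lambda s: "Final decision" in s,
        lambda p: Path(p).read_text(encoding="utf-8"),
    )

print(source)  # file
```

The point of the design is that a clipped snippet costs one extra read rather than a failed recall.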

Test plan

- `make clean`: no new warnings/errors introduced
- `make test-kb-index` passes
- `make test-agents-md` passes
- Re-ran PR #308's harness (bench: LOCOMO-shaped memory-retrieval harness for memory_search); numbers above
- Real LOCOMO run (gated on PR #308 plus a shape adapter; separate follow-up)

Generated by Claude Code

… up with fs_read

The historical snippet width of 24 tokens routinely clipped the actual answer line out of FTS5 snippets returned by memory_search / kb_search. LOCOMO bench numbers on the workspace memory layer:

- snippet R@10: 0.75 -> 1.0
- snippet R@1: 0.375 -> 0.875
- doc R@10: 1.0 (unchanged, was already optimal)

Two changes:

1. Lift the magic 24 into a named interface-section const FTS5_SNIPPET_TOKENS = 60 in PasClaw.Memory.Index, and reuse it from PasClaw.KB.Index so memory_search and kb_search stay aligned. 64 is FTS5's hard ceiling; 60 leaves slack for the highlight markers.
2. Extend rule #5 in PasClaw.Agent.Prompt.BuildRulesSection to tell the model that bounded snippets are an index hit, not a final answer -- if the cited file is right but the answer line might fall just outside the window, follow up with fs_read (or kb_get) on the cited path before giving up.

The two fixes are complementary: a wider window lifts recall on the common case, and the prompt nudge handles long-context documents where even 60 tokens may not centre on the right line.

…ntical, validates PR #309

After fixing the fixture-side bug (commit 01dac9f -- changed staged prior-session log from .ndjson to .md since PasClaw's SyncDir only indexes Markdown), re-ran the prior-session shootout. 4 cells: baseline + lean-edit + stock + max-build.

Result
======

| profile | turns | tools | trajectory |
|---|---|---|---|
| baseline | 2 | 8 (no memory_search) | fs_write only* |
| lean-edit | 4 | 9 (has memory_search) | search -> read -> write |
| stock | 4 | 13 | search -> read -> write |
| max-build | 4 | 17 | search -> read -> write |

* driver artifact -- the subagent read the staged .md file with its own (Claude Code) Read tool and short-circuited the turn loop. Fair baseline would be ~5-6 turns.

Three real findings
===================

1. memory_search works on .md files when SyncDir's lazy indexing path runs. No `pasclaw memory provision` needed -- the first search call triggers the index build automatically. The earlier "memory search returns nothing" finding was a fixture file-extension bug, not a PasClaw bug.
2. PR #309 (FTS5 snippet width 24 -> 60 + Rule 5 fs_read follow-up) is doing its job in the wild. Even at 60-token snippets, the snippet on this file truncated before reaching the "Final decision: cbor" line -- the query terms ("serialization format storage") matched earlier paragraphs and 60 tokens didn't extend to the decision sentence. EVERY agent (lean-edit, stock, max-build) followed Rule 5 correctly: when the snippet shows the right file but not the right line, follow up with fs_read on the cited path. Exact behavior the rule trains for. PR #309 wasn't "fix the symptom"; the snippet-widening helps and the rule handles the residual cases.
3. With memory_search present, profile differences disappear on recall-shaped tasks. lean-edit, stock, and max-build all picked the same tools in the same order. The 2895-byte/turn max-build premium buys ZERO recall-task advantage over lean-edit. Honest memory_search savings vs no-memory_search is roughly 1-2 turns (fair baseline 5-6 turns vs all-memory-equipped 4 turns).

Cumulative verdict for memory_search
====================================

PRESENT in all of: lean-edit, lean-stock, lean-build, stock, low-token, max-build, all-on

ABSENT in:

- baseline (vector_search_enabled=false strips it)
- security (same)

If you're choosing between lean-edit and max-build, memory_search is NOT a differentiator -- they both have it. If you're stripping all the way to baseline (or security in some configs), losing memory_search costs you 1-2 turns on recall tasks.

FMXExpress merged commit 36c63a2 into main Jun 18, 2026

FMXExpress mentioned this pull request Jun 21, 2026

docs: changelog catch-up for PRs #292 → #318 #319

Merged

FMXExpress merged 1 commit into main from claude/memory-snippet-window
