
Commit 373b802

Author: StackMemory Bot (CLI)
feat(search): Stage D.5 — retrieval signals, GC, hybrid scoring, multi-provider, cold storage, multi-repo
10 features implemented in parallel:

1. Retrieval quality signals: retrieval_log table, query instrumentation
2. Cache expiry & LRU fix: timestamp-based expiry, proper LRU eviction
3. FTS5 query sanitization: escape special chars, prefix search, implicit AND
4. Incremental GC: retention policies, TTL-based deletion, cascade cleanup
5. Embedding backfill resumability: cursor-based pagination, checkpoint tracking
6. Hybrid score normalization: min-max normalization, Reciprocal Rank Fusion (RRF)
7. Remote storage S3/GCS: StorageTierManager with cold-tier archival
8. Performance optimization: composite indexes, mmap_size, cache_size pragmas
9. Multi-repo support: project_registry table, scoped search, ProjectRegistryManager
10. Model routing factory: EmbeddingProviderFactory with transformers/ollama/openai + fallback chain

+137 new tests (630 total passing), 13 new files, ~1000 lines added.
1 parent 854b70a commit 373b802

24 files changed

Lines changed: 4861 additions & 41 deletions

docs/specs/PROMPT_PLAN.md

Lines changed: 77 additions & 3 deletions
@@ -1,6 +1,6 @@
11
# StackMemory — Prompt Plan
22

3-
> Generated from ONE_PAGER.md, DEV_SPEC.md
3+
> Generated from ONE_PAGER.md, DEV_SPEC.md, vision.md, SPEC.md
44
55
## Stage A: Foundation (Complete)
66
- [x] Initialize repository and tooling
@@ -31,7 +31,74 @@
3131
- [x] Agent prompt consolidation (structured templates, latest models)
3232
- [x] Workflow integration (hooks, skill-rules, CLI)
3333

34-
## Stage E: Team Collaboration (Next)
34+
## Stage D.5: Search & Intelligence (v0.6.x — Current)
35+
36+
> FTS5, sqlite-vec, and @xenova/transformers are shipped (v0.6.3). This stage hardens retrieval quality, adds missing infrastructure, and fills gaps identified in SPEC.md Phase 4 and vision.md roadmap.
37+
38+
### 1. Retrieval quality signals & acceptance criteria
39+
- [ ] Add `retrieval_log` table: query, strategy, results returned, latency_ms, timestamp
40+
- [ ] Instrument `ContextRetriever.retrieveContext()` to log every query + results
41+
- [ ] CLI command `stackmemory search:stats` — hit rate, avg latency, strategy distribution
42+
- [ ] Add precision proxy: track whether returned frames are referenced in subsequent tool calls
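
As a rough illustration of what `search:stats` could compute from `retrieval_log` rows, the sketch below aggregates hit rate, average latency, and strategy distribution in memory. The row shape (`RetrievalLogRow`), the `summarize` helper, and the definition of a hit as a query that returned at least one result are assumptions made for this example; the plan does not define them.

```typescript
// Hypothetical row shape mirroring the retrieval_log columns listed above.
interface RetrievalLogRow {
  query: string;
  strategy: string;
  resultsReturned: number;
  latencyMs: number;
  timestamp: number;
}

interface SearchStats {
  hitRate: number;
  avgLatencyMs: number;
  strategyCounts: Record<string, number>;
}

// Assumption: a "hit" is any query that returned at least one result.
function summarize(rows: RetrievalLogRow[]): SearchStats {
  const strategyCounts: Record<string, number> = {};
  let hits = 0;
  let totalLatency = 0;
  for (const row of rows) {
    if (row.resultsReturned > 0) hits += 1;
    totalLatency += row.latencyMs;
    strategyCounts[row.strategy] = (strategyCounts[row.strategy] ?? 0) + 1;
  }
  const n = rows.length;
  return {
    hitRate: n === 0 ? 0 : hits / n,
    avgLatencyMs: n === 0 ? 0 : totalLatency / n,
    strategyCounts,
  };
}
```

In a real implementation the same figures would more likely come from SQL aggregates over the table than from rows loaded into memory.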
43+
44+
### 2. Cache expiry & LRU correctness
45+
- [ ] Fix `getCachedResult()` — currently never expires (no timestamp check)
46+
- [ ] Add `cachedAt` timestamp to cache entries; evict when > `cacheExpiryMs`
47+
- [ ] Replace Map-based LRU with proper bounded LRU (or use Map insertion-order delete)
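
The two fixes above can be combined in one small structure. This is a minimal sketch, not the project's cache: `TtlLruCache` and its injectable `now` clock are hypothetical, and only `cachedAt` and `cacheExpiryMs` are names taken from the plan. It relies on the fact that a JavaScript `Map` iterates in insertion order, so deleting and re-inserting a key moves it to the most-recent position.

```typescript
interface CacheEntry<V> {
  value: V;
  cachedAt: number; // ms timestamp recorded when the entry was stored
}

// Hypothetical bounded cache: entries expire by age and are evicted in LRU order.
class TtlLruCache<K, V> {
  private readonly entries = new Map<K, CacheEntry<V>>();
  private readonly maxEntries: number;
  private readonly cacheExpiryMs: number;
  private readonly now: () => number;

  constructor(maxEntries: number, cacheExpiryMs: number, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.cacheExpiryMs = cacheExpiryMs;
    this.now = now;
  }

  get(key: K): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() - entry.cachedAt > this.cacheExpiryMs) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, cachedAt: this.now() });
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
  }
}
```

Passing the clock in as a parameter is only there to make expiry testable without waiting on real time.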
48+
49+
### 3. FTS5 query sanitization
50+
- [ ] Sanitize MATCH input: escape special characters (`"`, `*`) and quote bare operator keywords (`OR`, `AND`, `NOT`, `NEAR`) so they match as literal terms
51+
- [ ] Add prefix search support: `term*` for partial matches
52+
- [ ] Handle multi-word queries with implicit AND (currently raw pass-through)
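
One way to approach the three items above is to quote every whitespace-separated term. In FTS5 query syntax a double-quoted string is treated as a literal (embedded quotes are escaped by doubling them), adjacent phrases are implicitly ANDed, and a `*` directly after a string marks a prefix match. The helper below is a sketch under those rules; `toFts5Match` and its `prefixLast` flag are hypothetical names, not the project's API.

```typescript
// Hypothetical helper: turns raw user input into a safe FTS5 MATCH expression.
function toFts5Match(raw: string, prefixLast: boolean = false): string {
  const terms = raw.split(/\s+/).filter((term) => term.length > 0);
  // Quoting each term neutralizes operators such as OR, AND, NOT and NEAR.
  const quoted = terms.map((term) => `"${term.replace(/"/g, '""')}"`);
  if (prefixLast && quoted.length > 0) {
    // A trailing * after a quoted string requests a prefix match on that term.
    quoted[quoted.length - 1] += "*";
  }
  // Space-separated phrases are combined with implicit AND.
  return quoted.join(" ");
}
```

The result should still be passed to SQLite as a bound parameter rather than concatenated into the SQL text, and a caller would probably want to skip the query entirely when the function returns an empty string.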
53+
54+
### 4. Incremental garbage collection
55+
- [ ] Add `retention_policy` column to frames (keep_forever, ttl_days, archive)
56+
- [ ] `MaintenanceService.runGC()`: delete/archive frames past TTL
57+
- [ ] Cascade: delete orphaned events, anchors, embeddings, FTS entries
58+
- [ ] CLI `stackmemory gc --dry-run` for preview
59+
- [ ] Configurable in `daemon-config.ts`: `gcRetentionDays`, `gcBatchSize`
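
The selection step of a GC pass can be kept separate from the deletion itself, which also makes `--dry-run` straightforward: print the plan instead of applying it. The sketch below assumes a frame shape (`FrameMeta`) and assumes particular semantics for the three `retention_policy` values; neither is specified in the plan, and `planGc` is a hypothetical name.

```typescript
// Only the three policy values come from the plan; the rest is assumed.
type RetentionPolicy = "keep_forever" | "ttl_days" | "archive";

interface FrameMeta {
  id: string;
  createdAtMs: number;
  retentionPolicy: RetentionPolicy;
}

interface GcPlan {
  toDelete: string[];
  toArchive: string[];
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed semantics: once older than gcRetentionDays, "ttl_days" frames are
// deleted and "archive" frames are archived; "keep_forever" is never touched.
function planGc(
  frames: FrameMeta[],
  nowMs: number,
  gcRetentionDays: number,
  gcBatchSize: number,
): GcPlan {
  const cutoff = nowMs - gcRetentionDays * DAY_MS;
  const plan: GcPlan = { toDelete: [], toArchive: [] };
  for (const frame of frames) {
    // Stop once the batch is full so each pass does a bounded amount of work.
    if (plan.toDelete.length + plan.toArchive.length >= gcBatchSize) break;
    if (frame.retentionPolicy === "keep_forever" || frame.createdAtMs >= cutoff) continue;
    if (frame.retentionPolicy === "archive") plan.toArchive.push(frame.id);
    else plan.toDelete.push(frame.id);
  }
  return plan;
}
```

The cascade over events, anchors, embeddings, and FTS entries would then run against `toDelete`, ideally inside one transaction per batch.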
60+
61+
### 5. Embedding backfill progress & resumability
62+
- [ ] Track backfill progress in `maintenance_state` table (last_frame_id, total, completed)
63+
- [ ] Resume from last checkpoint on daemon restart (not re-scan full table)
64+
- [ ] Add `--force-reembed` flag to re-generate embeddings for changed frames
65+
- [ ] Report backfill % in `stackmemory daemon status`
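
Checkpointed backfill amounts to keyset pagination: process frames with an ID greater than the last checkpoint, in ID order, and persist the new checkpoint after each batch. The sketch below models that with an in-memory list standing in for the `frames` table. It assumes monotonically increasing numeric frame IDs, which the plan does not state, and `runBackfillBatch` and `backfillPercent` are hypothetical names.

```typescript
// Mirrors the maintenance_state fields named in the plan.
interface BackfillState {
  lastFrameId: number;
  completed: number;
  total: number;
}

// Processes at most batchSize frames after the checkpoint and returns the new state.
// Assumption: frame IDs are numeric and increase over time.
function runBackfillBatch(
  frameIds: number[],
  state: BackfillState,
  batchSize: number,
  embed: (frameId: number) => void,
): BackfillState {
  // In SQL this would be roughly: WHERE id > ? ORDER BY id LIMIT ?
  const batch = frameIds
    .filter((id) => id > state.lastFrameId)
    .sort((a, b) => a - b)
    .slice(0, batchSize);
  let next = state;
  for (const id of batch) {
    embed(id);
    next = { ...next, lastFrameId: id, completed: next.completed + 1 };
  }
  return next;
}

// Percentage suitable for a status line; an empty table counts as done.
function backfillPercent(state: BackfillState): number {
  return state.total === 0 ? 100 : Math.round((state.completed / state.total) * 100);
}
```

Because the returned state is the only thing needed to continue, a daemon restart can reload it and call `runBackfillBatch` again without rescanning frames that were already embedded.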
66+
67+
### 6. Hybrid search score normalization
68+
- [ ] Normalize BM25 scores to 0-1 range using min-max within result set
69+
- [ ] Normalize vector distances to 0-1 similarity using max distance
70+
- [ ] Apply Reciprocal Rank Fusion (RRF) as alternative to weighted sum
71+
- [ ] A/B compare weighted-sum vs RRF in retrieval_log
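
The two scoring approaches named above can be sketched side by side. Min-max normalization maps a result set's raw scores onto 0-1; the `lowerIsBetter` flag covers both FTS5's `bm25()` values (where a numerically lower score is a better match) and vector distances. Reciprocal Rank Fusion ignores raw scores and sums `1 / (k + rank)` per result list, with `k = 60` as the commonly used constant. Function names here are hypothetical, and using min-max for distances is a simplification of the "max distance" item in the plan.

```typescript
// Maps scores in one result set onto [0, 1], where 1 is always the best match.
function minMaxNormalize(values: number[], lowerIsBetter: boolean = false): number[] {
  if (values.length === 0) return [];
  const min = Math.min(...values);
  const max = Math.max(...values);
  // With no spread there is nothing to rank, so treat every result as equally good.
  if (max === min) return values.map(() => 1);
  return values.map((v) =>
    lowerIsBetter ? (max - v) / (max - min) : (v - min) / (max - min),
  );
}

// Reciprocal Rank Fusion: each ranking lists result IDs from best to worst.
function reciprocalRankFusion(rankings: string[][], k: number = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}
```

RRF needs no score calibration between the lexical and vector sides, which is the main reason to compare it against the weighted sum.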
72+
73+
### 7. Remote infinite storage (S3/GCS cold tier)
74+
- [ ] `StorageTierManager`: hot (SQLite) → warm (compressed SQLite) → cold (S3/GCS)
75+
- [ ] Background migration: frames older than N days with no recent access → cold
76+
- [ ] On-demand rehydration: transparent fetch from cold tier on access
77+
- [ ] Config: `storage.coldTier.provider`, `storage.coldTier.bucket`, `storage.coldTier.migrationDays`
78+
- [ ] CLI `stackmemory storage stats` — per-tier frame counts and sizes
79+
80+
### 8. Performance optimization (<100ms p50 retrieval)
81+
- [ ] Add composite index on `frames(project_id, created_at DESC)` if missing
82+
- [ ] Profile FTS5 + vec queries with `EXPLAIN QUERY PLAN`
83+
- [ ] Benchmark: p50/p95/p99 retrieval latency with 1k/10k/100k frames
84+
- [ ] Add `PRAGMA mmap_size` for memory-mapped I/O on large DBs
85+
- [ ] Connection pooling for concurrent reads (WAL mode allows parallel readers)
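
For the latency benchmark, p50/p95/p99 can be computed from recorded samples with the nearest-rank method. This is a small sketch; `percentile` is a hypothetical helper and the project may prefer an interpolating definition.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("percentile requires at least one sample");
  if (p <= 0 || p > 100) throw new Error("p must be in the range (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}
```

With only a handful of samples p95 and p99 collapse onto the maximum, so the 1k/10k/100k runs would need enough queries per run for the tail figures to mean anything.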
86+
87+
### 9. Multi-repository support
88+
- [ ] `project_registry` table: project_id, repo_path, display_name, created_at
89+
- [ ] `stackmemory projects list/add/remove` CLI commands
90+
- [ ] Scoped search: `--project <name>` flag on all search/context commands
91+
- [ ] Cross-project search: `stackmemory search --all-projects "query"`
92+
- [ ] MCP tool: `switch_project` to change active project context
93+
94+
### 10. Model routing & cost optimization
95+
- [ ] Move embedding model selection to config: `embedding.model`, `embedding.dimension`
96+
- [ ] Support multiple providers: `@xenova/transformers` (local), Ollama, OpenAI API
97+
- [ ] `EmbeddingProviderFactory.create(config)` — factory with fallback chain
98+
- [ ] Cost tracking: log token/compute usage per embedding call
99+
- [ ] CLI `stackmemory config set embedding.provider ollama` for runtime switching
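
A fallback chain can be as simple as trying the configured provider first and then the remaining ones in order. The sketch below is not `EmbeddingProviderFactory` itself: the `EmbeddingProvider` contract, its `isAvailable()` check, and `pickProvider` are all assumptions made for illustration, and only the provider names come from the plan.

```typescript
// Hypothetical provider contract.
interface EmbeddingProvider {
  name: string;
  dimension: number;
  isAvailable(): boolean; // e.g. API key present, local model or server reachable
}

// Returns the preferred provider if it is available, otherwise the first
// available provider among the rest, in the order given.
function pickProvider(preferred: string, providers: EmbeddingProvider[]): EmbeddingProvider {
  const ordered = [
    ...providers.filter((p) => p.name === preferred),
    ...providers.filter((p) => p.name !== preferred),
  ];
  const chosen = ordered.find((p) => p.isAvailable());
  if (chosen === undefined) {
    throw new Error("no embedding provider is available");
  }
  return chosen;
}
```

One design point worth settling early: vectors from providers with different dimensions are not comparable, so a fallback that changes `dimension` would need either a separate index or a re-embed of existing frames.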
100+
101+
## Stage E: Team Collaboration
35102
- [ ] Shared frame stacks across team members
36103
- [ ] Conflict resolution for concurrent frame edits
37104
- [ ] Team activity feed and notifications
@@ -45,6 +112,13 @@
45112

46113
## Stage G: Polish & Scale
47114
- [ ] Browser extension for context capture
48-
- [ ] Performance optimization (frame indexing, lazy loading)
49115
- [ ] Telemetry and usage analytics
50116
- [ ] Plugin marketplace for custom skills
117+
118+
## Stage H: Enterprise & Ecosystem
119+
- [ ] SSO (SAML/OIDC) and audit logs
120+
- [ ] Multi-org support with tenant isolation
121+
- [ ] PostgreSQL production adapter (pgvector, LISTEN/NOTIFY)
122+
- [ ] Specish API — OpenAPI-based tool bridge
123+
- [ ] Linear Chrome extension (ticket → subagent pipeline)
124+
- [ ] SMS/WhatsApp notification system

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
{
22
"name": "@stackmemoryai/stackmemory",
3-
"version": "0.6.3",
3+
"version": "0.7.0",
44
"description": "Project-scoped memory for AI coding tools. Durable context across sessions with MCP integration, frames, smart retrieval, Claude Code skills, and automatic hooks.",
55
"engines": {
66
"node": ">=20.0.0",
