diff --git a/docs/specs/PROMPT_PLAN.md b/docs/specs/PROMPT_PLAN.md
Lines changed: 77 additions & 3 deletions
diff --git a/package.json b/package.json
Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # StackMemory — Prompt Plan
 
-> Generated from ONE_PAGER.md, DEV_SPEC.md
+> Generated from ONE_PAGER.md, DEV_SPEC.md, vision.md, SPEC.md
 
 ## Stage A: Foundation (Complete)
 - [x] Initialize repository and tooling
@@ -31,7 +31,74 @@
 - [x] Agent prompt consolidation (structured templates, latest models)
 - [x] Workflow integration (hooks, skill-rules, CLI)
 
-## Stage E: Team Collaboration (Next)
+## Stage D.5: Search & Intelligence (v0.6.x — Current)
+
+> FTS5, sqlite-vec, and @xenova/transformers are shipped (v0.6.3). This stage hardens retrieval quality, adds missing infrastructure, and fills gaps identified in SPEC.md Phase 4 and vision.md roadmap.
+
+### 1. Retrieval quality signals & acceptance criteria
+- [ ] Add `retrieval_log` table: query, strategy, results returned, latency_ms, timestamp
+- [ ] Instrument `ContextRetriever.retrieveContext()` to log every query + results
+- [ ] CLI command `stackmemory search:stats` — hit rate, avg latency, strategy distribution
+- [ ] Add precision proxy: track whether returned frames are referenced in subsequent tool calls
+
+### 2. Cache expiry & LRU correctness
+- [ ] Fix `getCachedResult()` — currently never expires (no timestamp check)
+- [ ] Add `cachedAt` timestamp to cache entries; evict when > `cacheExpiryMs`
+- [ ] Replace Map-based LRU with proper bounded LRU (or use Map insertion-order delete)
+
+### 3. FTS5 query sanitization
+- [ ] Sanitize MATCH input: escape special chars (`"`, `*`, `OR`, `AND`, `NOT`, `NEAR`)
+- [ ] Add prefix search support: `term*` for partial matches
+- [ ] Handle multi-word queries with implicit AND (currently raw pass-through)
+
+### 4. Incremental garbage collection
+- [ ] Add `retention_policy` column to frames (keep_forever, ttl_days, archive)
+- [ ] `MaintenanceService.runGC()`: delete/archive frames past TTL
+- [ ] Cascade: delete orphaned events, anchors, embeddings, FTS entries
+- [ ] CLI `stackmemory gc --dry-run` for preview
+- [ ] Configurable in `daemon-config.ts`: `gcRetentionDays`, `gcBatchSize`
+
+### 5. Embedding backfill progress & resumability
+- [ ] Track backfill progress in `maintenance_state` table (last_frame_id, total, completed)
+- [ ] Resume from last checkpoint on daemon restart (not re-scan full table)
+- [ ] Add `--force-reembed` flag to re-generate embeddings for changed frames
+- [ ] Report backfill % in `stackmemory daemon status`
+
+### 6. Hybrid search score normalization
+- [ ] Normalize BM25 scores to 0-1 range using min-max within result set
+- [ ] Normalize vector distances to 0-1 similarity using max distance
+- [ ] Apply Reciprocal Rank Fusion (RRF) as alternative to weighted sum
+- [ ] A/B compare weighted-sum vs RRF in retrieval_log
+
+### 7. Remote infinite storage (S3/GCS cold tier)
+- [ ] `StorageTierManager`: hot (SQLite) → warm (compressed SQLite) → cold (S3/GCS)
+- [ ] Background migration: frames older than N days with no recent access → cold
+- [ ] On-demand rehydration: transparent fetch from cold tier on access
+- [ ] Config: `storage.coldTier.provider`, `storage.coldTier.bucket`, `storage.coldTier.migrationDays`
+- [ ] CLI `stackmemory storage stats` — per-tier frame counts and sizes
+
+### 8. Performance optimization (<100ms p50 retrieval)
+- [ ] Add composite index on `frames(project_id, created_at DESC)` if missing
+- [ ] Profile FTS5 + vec queries with `EXPLAIN QUERY PLAN`
+- [ ] Benchmark: p50/p95/p99 retrieval latency with 1k/10k/100k frames
+- [ ] Add `PRAGMA mmap_size` for memory-mapped I/O on large DBs
+- [ ] Connection pooling for concurrent reads (WAL mode allows parallel readers)
+
+### 9. Multi-repository support
+- [ ] `project_registry` table: project_id, repo_path, display_name, created_at
+- [ ] `stackmemory projects list/add/remove` CLI commands
+- [ ] Scoped search: `--project <name>` flag on all search/context commands
+- [ ] Cross-project search: `stackmemory search --all-projects "query"`
+- [ ] MCP tool: `switch_project` to change active project context
+
+### 10. Model routing & cost optimization
+- [ ] Move embedding model selection to config: `embedding.model`, `embedding.dimension`
+- [ ] Support multiple providers: `@xenova/transformers` (local), Ollama, OpenAI API
+- [ ] `EmbeddingProviderFactory.create(config)` — factory with fallback chain
+- [ ] Cost tracking: log token/compute usage per embedding call
+- [ ] CLI `stackmemory config set embedding.provider ollama` for runtime switching
+
+## Stage E: Team Collaboration
 - [ ] Shared frame stacks across team members
 - [ ] Conflict resolution for concurrent frame edits
 - [ ] Team activity feed and notifications
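
Note on item 6 of Stage D.5 (hybrid search score normalization): the hunk above lists min-max normalization of BM25 scores, distance-to-similarity conversion using the maximum distance, and Reciprocal Rank Fusion (RRF) as an alternative to a weighted sum. The following is a minimal sketch of those three pieces, not code from this repository. The `Scored` shape, the function names, and the default `k = 60` are assumptions made for illustration. It also assumes BM25 scores come from SQLite FTS5's `bm25()`, where a smaller value means a better match.

```typescript
// Hypothetical result shape for illustration; not taken from the StackMemory codebase.
interface Scored {
  id: string;
  score: number;
}

// Min-max normalize BM25 scores within one result set to the 0-1 range.
// Assumes FTS5 bm25() semantics (smaller is better), so the scale is
// inverted: the best match maps to 1 and the worst to 0.
function normalizeBm25(results: Scored[]): Map<string, number> {
  const out = new Map<string, number>();
  if (results.length === 0) return out;
  const scores = results.map((r) => r.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  results.forEach((r) => {
    out.set(r.id, max === min ? 1 : (max - r.score) / (max - min));
  });
  return out;
}

// Convert non-negative vector distances to a 0-1 similarity by dividing
// by the largest distance in the result set.
function normalizeDistances(results: Scored[]): Map<string, number> {
  const out = new Map<string, number>();
  const maxDistance = Math.max(0, ...results.map((r) => r.score));
  results.forEach((r) => {
    out.set(r.id, maxDistance === 0 ? 1 : 1 - r.score / maxDistance);
  });
  return out;
}

// Reciprocal Rank Fusion: each ranking contributes 1 / (k + rank) per id,
// with 1-based ranks. Only the order of each list matters, not raw scores.
function reciprocalRankFusion(rankings: string[][], k = 60): Map<string, number> {
  const fused = new Map<string, number>();
  rankings.forEach((ranking) => {
    ranking.forEach((id, index) => {
      fused.set(id, (fused.get(id) ?? 0) + 1 / (k + index + 1));
    });
  });
  return fused;
}

// Helper: ids sorted by fused score, best first.
function topIds(fused: Map<string, number>): string[] {
  return Array.from(fused.entries())
    .sort((a, b) => b[1] - a[1])
    .map((entry) => entry[0]);
}
```

RRF sidesteps normalization entirely because it uses ranks only, which is one reason to compare it against the weighted sum as the checklist proposes.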
@@ -45,6 +112,13 @@
 
 ## Stage G: Polish & Scale
 - [ ] Browser extension for context capture
-- [ ] Performance optimization (frame indexing, lazy loading)
 - [ ] Telemetry and usage analytics
 - [ ] Plugin marketplace for custom skills
+
+## Stage H: Enterprise & Ecosystem
+- [ ] SSO (SAML/OIDC) and audit logs
+- [ ] Multi-org support with tenant isolation
+- [ ] PostgreSQL production adapter (pgvector, LISTEN/NOTIFY)
+- [ ] Specish API — OpenAPI-based tool bridge
+- [ ] Linear Chrome extension (ticket → subagent pipeline)
+- [ ] SMS/WhatsApp notification system
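
Note on item 3 of Stage D.5 (FTS5 query sanitization): one common way to neutralize FTS5 operators in user input is to wrap each whitespace-separated token in double quotes, doubling any embedded quote, so that `OR`, `AND`, `NOT`, and `NEAR` are treated as plain strings. The sketch below takes that approach and is an illustration under stated assumptions, not the project's implementation. The function name `toFts5Match` is hypothetical, a trailing `*` typed by the user is interpreted as a prefix request, and tokens without any letter or digit are dropped.

```typescript
// Hypothetical helper; the name and behavior are assumptions for illustration.
// Builds an FTS5 MATCH expression from raw user input:
//  - every token becomes a quoted FTS5 string, so operators lose their meaning
//  - embedded double quotes are escaped by doubling them
//  - a token typed as `term*` becomes the prefix query "term"*
//  - tokens are joined with an explicit AND
// Returns null when nothing searchable remains.
function toFts5Match(raw: string): string | null {
  const parts: string[] = [];
  for (const token of raw.split(/\s+/)) {
    const wantsPrefix = token.length > 1 && token.endsWith("*");
    const body = token.replace(/\*+$/, "");
    // Skip tokens with no letter or digit (pure punctuation, lone "*").
    if (!/[0-9A-Za-z\u00C0-\uFFFF]/.test(body)) continue;
    const quoted = '"' + body.replace(/"/g, '""') + '"';
    parts.push(wantsPrefix ? quoted + "*" : quoted);
  }
  return parts.length === 0 ? null : parts.join(" AND ");
}
```

The returned string is meant to be passed as a bound parameter to `MATCH`, never interpolated into the SQL text. Callers would need to handle the `null` case, for example by skipping the full-text branch of the search.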
@@ -1,6 +1,6 @@
 {
   "name": "@stackmemoryai/stackmemory",
-  "version": "0.6.3",
+  "version": "0.7.0",
   "description": "Project-scoped memory for AI coding tools. Durable context across sessions with MCP integration, frames, smart retrieval, Claude Code skills, and automatic hooks.",
   "engines": {
     "node": ">=20.0.0",