AppSprout-dev · CalebisGross · Mar 30, 2026
diff --git a/.claude/rules/mnemonic-usage.md b/.claude/rules/mnemonic-usage.md
@@ -1,67 +1,59 @@
-# Mnemonic MCP Tool Usage — Mandatory
+# Mnemonic MCP Tool Usage
+
+## Available Tools (7)
+
+| Tool | Purpose |
+|------|---------|
+| `remember` | Store decisions, errors, insights, learnings |
+| `recall` | Semantic search with spread activation |
+| `recall_project` | Project context + recent activity (use at session start) |
+| `batch_recall` | Multiple recall queries in one round-trip |
+| `feedback` | Rate recall quality (drives Hebbian learning) |
+| `status` | System health check |
+| `amend` | Update a stale memory in place |
 
 ## Session Start
 
-For tasks involving code changes, decisions, or multi-step work:
 1. Call `recall_project` to load project context
-2. Call `recall` with keywords relevant to the user's first request
-3. If either call returns useful context, use it to inform your work
-4. If a call fails (FTS error, timeout), note it and move on — don't block the session
+2. Call `recall` with keywords relevant to the user's request
+3. If useful context is found, use it. If not, move on.
 
-Alternative: Use `batch_recall` to combine multiple queries into one round-trip.
+Alternative: `batch_recall` to combine project context + task-specific queries.
 
-For trivial tasks (typo fix, single-line change, quick question): skip recall and just do the work.
+For trivial tasks: skip recall, just do the work.
 
-## During Work (MUST)
+## During Work
 
-### Remember
+### Remember (be selective)
 
-- **Decisions**: Architectural/design choices — `type: "decision"`
-- **Errors**: Bugs encountered and resolved — `type: "error"`
-- **Insights**: Non-obvious discoveries about the codebase — `type: "insight"`
-- **Learnings**: Library, API, or framework behavior — `type: "learning"`
-- **Experiment results**: HP sweep findings, benchmark baselines, training outcomes — `type: "insight"` or `type: "decision"` depending on whether it's an observation or a choice made from it
+Only store things a future session would need:
+- **Decisions**: "chose X because Y" — `type: "decision"`
+- **Errors**: bugs found and how they were fixed — `type: "error"`
+- **Insights**: non-obvious discoveries — `type: "insight"`
+- **Learnings**: API/framework behavior — `type: "learning"`
 
-Use judgment — remember things a future session would need. Don't remember trivial actions, file paths, or things derivable from git history.
+Do NOT remember: file paths, trivial changes, things derivable from git history or code.
 
 ### Recall mid-session
 
-Don't only recall at session start. When entering new territory (new subsystem, unfamiliar pattern, making claims about prior work), call `recall` with specific keywords first. Example: before suggesting HP ranges, recall prior training findings. Before claiming something works a certain way, check if there's a stored decision or learning about it.
+When entering unfamiliar territory, recall before assuming. Check if there's a prior decision or known issue.
 
 ### Amend stale memories
 
-If a recall returns a memory that's outdated or partially wrong, use `amend` to update it in place rather than creating a new memory. This preserves associations and history.
-
-## After Recalls (MUST)
-
-- After using `recall` and acting on the results, call `feedback`:
-  - `helpful` — memories were relevant and informed your work
-  - `partial` — some relevant, some noise
-  - `irrelevant` — memories didn't help
-- If recall returned 0 results, no feedback needed — but consider whether your query was too broad or too specific
-- This trains the retrieval system — skipping it degrades future recall quality
-
-## Between Phases / Major Tasks (MUST)
-
-When working through multi-phase plans (epics, milestones, sequential issues):
-- `remember` key decisions, strategy changes, or gotchas from the completed phase before starting the next
-- `recall` relevant context before entering a new phase — prior phase decisions may affect the current one
-- This ensures continuity across long sessions and prevents rediscovering the same issues
-
-## Reducing Noise
+If recall returns outdated info, use `amend` to fix it in place. This preserves associations.
 
-- Use `include_patterns: false` and `include_abstractions: false` on `recall` when you only need memories, not patterns/principles
-- Use `types: ["decision", "error"]` to filter recall to actionable memory types
-- Use `dismiss_pattern` and `dismiss_abstraction` to archive noise that keeps surfacing
+## After Recalls
 
-## Before Committing (SHOULD)
+Call `feedback` after acting on recall results:
+- `helpful` — memories informed your work
+- `partial` — some useful, some noise
+- `irrelevant` — didn't help
 
-- Review the session's work and `remember` any decisions or insights that haven't been stored yet
-- Call `session_summary` if the session involved significant work
+This trains retrieval. Skipping it degrades future recall quality.
 
-## General
+## What NOT to Do
 
-- Prefer specific `recall` queries over broad ones — "SQLite FTS5 migration" not "database stuff"
-- Set the `type` field on every `remember` call — never use the default "general" when a specific type fits
-- When a recall returns irrelevant noise, say so via `feedback` — this is how the system improves
-- Don't remember things that belong in experiment docs — training results go in `training/docs/`, not just in mnemonic memory. Memory is for cross-session context, not a substitute for proper documentation
+- Don't use `include_patterns` or `include_abstractions` — these produce noise
+- Don't store experiment results in memory — those go in `training/docs/`
+- Don't remember things that belong in code comments or commit messages
+- Don't create memories about file structure or architecture — read the code instead
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -1,6 +1,6 @@
 # Mnemonic — Development Guide
 
-Mnemonic is a local-first, air-gapped semantic memory system built in Go. It uses 8 cognitive agents + orchestrator + reactor, SQLite with FTS5 + vector search, and LLMs (LM Studio locally or cloud APIs like Gemini) for semantic understanding.
+Mnemonic is a local-first, air-gapped semantic memory daemon for AI agents. Built in Go, it provides persistent long-term memory via SQLite with FTS5 + vector search, heuristic encoding, and spread activation retrieval. No LLM required.
 
 ## Build & Test
 
@@ -15,159 +15,104 @@ golangci-lint run             # Lint (uses .golangci.yml config)
 
 **Version** is injected via ldflags from `Makefile` (managed by release-please). The binary var is in `cmd/mnemonic/main.go`.
 
+## Architecture
+
+### Embedding Pipeline (no LLM)
+
+All encoding uses heuristic Go code — no generative LLM calls anywhere:
+
+```
+MCP remember → raw memory → heuristic encoding (RAKE concepts + salience) → hugot embedding (384-dim MiniLM) → SQLite + FTS5
+MCP recall   → FTS5 + embedding search → spread activation → rank → return
+```
+
+Three embedding providers available via `config.yaml`:
+- `bow` — 128-dim bag-of-words (instant, zero dependencies)
+- `hugot` — 384-dim MiniLM-L6-v2 via pure Go (no CGo, no shared library)
+- `api` — OpenAI-compatible endpoint (for cloud embeddings)
+
+### Cognitive Agents
+
+Agents communicate via event bus, never direct calls. Their value is in **side effects** (association strengthening, salience decay, clustering), not text output:
+
+- **Encoding** — Raw events → memories with concepts + embeddings
+- **Retrieval** — FTS5 + vector search + spread activation
+- **Consolidation** — Decay salience, merge related memories, prune dead associations
+- **Dreaming** — Replay memories, strengthen associations, cross-pollinate
+- **Orchestrator** — Schedule agent cycles, health monitoring
+
+Perception watchers (filesystem, git, terminal, clipboard) are **disabled by default** — agents have direct codebase access and watcher-sourced memories create retrieval noise.
+
 ## Project Layout
 
 ```
 cmd/mnemonic/          CLI + daemon entry point
 cmd/benchmark/         End-to-end benchmark
 cmd/benchmark-quality/ Memory quality IR benchmark
-cmd/lifecycle-test/    Full lifecycle simulation (install → 3 months)
+cmd/lifecycle-test/    Full lifecycle simulation
 internal/
-  agent/               8 cognitive agents + orchestrator + reactor + forum
-    perception/        Watch filesystem/terminal/clipboard, heuristic filter
-    encoding/          LLM compression, concept extraction, association linking
-    episoding/         Temporal episode clustering
-    consolidation/     Decay, merge, prune (sleep cycle)
-    retrieval/         Spread activation + LLM synthesis with tool-use
-    metacognition/     Self-reflection, feedback processing, audit
-    dreaming/          Memory replay, cross-pollination, insight generation
-    abstraction/       Patterns → principles → axioms
-    orchestrator/      Autonomous scheduler, health monitoring
-    reactor/           Event-driven rule engine
-    forum/             Agent personality system for forum communication
+  agent/               Cognitive agents + orchestrator + reactor
   api/                 REST API server + routes
-  web/                 Embedded dashboard (forum-style, modular ES modules + CSS)
-  mcp/                 MCP server (24 tools for Claude Code)
+  web/                 Embedded dashboard
+  mcp/                 MCP server (7 core tools)
+  embedding/           Embedding providers (bow, hugot, api) + RAKE + TurboQuant
   store/               Store interface + SQLite implementation
-  llm/                 LLM provider interface + implementations (LM Studio, Gemini/cloud API)
-    llamacpp/          Optional embedded llama.cpp backend (CGo, build-tagged)
-  ingest/              Project ingestion engine
-  watcher/             Filesystem (FSEvents/fsnotify), terminal, clipboard
-  daemon/              Service management (macOS launchd, Linux systemd, Windows Services)
-  updater/             Self-update via GitHub Releases
+  llm/                 Legacy LLM provider interface (kept for MCP server compat)
+  watcher/             Filesystem, terminal, clipboard watchers (disabled by default)
+  daemon/              Service management (launchd, systemd, Windows Services)
   events/              Event bus (in-memory pub/sub)
   config/              Config loading (config.yaml)
   logger/              Structured logging (slog)
-  concepts/            Shared concept extraction (paths, commands, event types)
-  backup/              Export/import
-  testutil/            Shared test infrastructure (stub LLM provider)
-sdk/                   Python agent SDK (self-evolving assistant)
-  agent/evolution/     Agent evolution data (created at runtime, gitignored)
-  agent/evolution/examples/  Example evolution data for reference
-training/              Mnemonic-LM training infrastructure
-  scripts/             Training, sweep, bisection, data download scripts
-  configs/             Data mix config (pretrain_mix.yaml)
-  docs/                Experiment registry, analysis docs
-  data/                Tokenized pretraining shards (gitignored)
-  sweep_results.tsv    HP sweep results log
-  probe_results.tsv    Short probe results from LR bisection
-third_party/           llama.cpp submodule (for embedded LLM builds)
+sdk/                   Python agent SDK
+training/              Training infrastructure (historical, not active)
 migrations/            SQLite schema migrations
-scripts/               Utility scripts
 ```
 
 ## Conventions
 
-- **Event bus architecture:** Agents communicate via events, never direct calls. To add behavior, subscribe to events in the bus.
-- **Store interface:** All data access goes through `store.Store` interface. The SQLite implementation is in `internal/store/sqlite/`.
+- **Event bus architecture:** Agents communicate via events, never direct calls.
+- **Store interface:** All data access goes through `store.Store` interface.
 - **Error handling:** Wrap errors with context: `fmt.Errorf("encoding memory %s: %w", id, err)`
-- **Platform-specific code:** Use Go build tags (`//go:build darwin`, `//go:build !darwin`). See `internal/watcher/filesystem/` for examples.
-- **Config:** All tunables live in `config.yaml`. Add new fields to `internal/config/config.go` struct.
-
-## Adding Things
-
-- **New agent:** Implement `agent.Agent` interface, register in `cmd/mnemonic/main.go` serve pipeline.
-- **New CLI command:** Add case to the command switch in `cmd/mnemonic/main.go`.
-- **New API route:** Add handler in `internal/api/routes/`, register in `internal/api/server.go`. Existing routes include `/api/v1/activity` (watcher concept tracker for MCP sync).
-- **New MCP tool:** Add to `internal/mcp/server.go` tool registration.
+- **Platform-specific code:** Use Go build tags (`//go:build darwin`, `//go:build !darwin`).
+- **Config:** All tunables live in `config.yaml`. Add new fields to `internal/config/config.go`.
 
 ## Platform Support
 
 | Platform | Status |
 |----------|--------|
-| macOS ARM | Full support (primary dev platform) |
-| Linux x86_64 | Supported — `serve`, `install`, `start`, `stop`, `uninstall` all work via systemd |
-| Windows x86_64 | Supported — `serve`, `install`, `start`, `stop`, `uninstall` work via Windows Services |
-
-## Training (Mnemonic-LM)
-
-Training scripts live in `training/scripts/` and require the **Felix-LM venv**:
-
-```bash
-source ~/Projects/felixlm/.venv/bin/activate
-```
-
-Key scripts:
-
-- `train_mnemonic_lm.py` — Main training script (imports Felix-LM v3 from `~/Projects/felixlm`)
-- `run_sweep.sh` — Run HP sweep configs sequentially with auto-logging to TSV
-- `bisect_lr.sh` — Binary search for optimal LR using short probes + full confirmation
-- `validate.py` — Quality gate pipeline for fine-tuning data
-
-All experiments must be pre-registered in `training/docs/experiment_registry.md` before running. See `.claude/rules/scientific-method.md` and `.claude/rules/experiment-logging.md`.
-
-## Known Issues
-
-See [GitHub Issues](https://github.com/appsprout-dev/mnemonic/issues) for tracked bugs.
-
----
-
-## MCP Tools Available
-
-You have 24 tools via the `mnemonic` MCP server:
-
-| Tool | When to Use |
-|------|-------------|
-| `remember` | Store decisions, errors, insights, learnings (returns raw ID + salience) |
-| `recall` | Semantic search with spread activation (`explain`, `include_associations`, `format`, `type`, `types`, `include_patterns`, `include_abstractions`, `synthesize` params) |
-| `batch_recall` | Run multiple recall queries in parallel — ideal for session start |
-| `get_context` | Proactive suggestions based on recent daemon activity — call at natural breakpoints |
-| `forget` | Archive irrelevant memories |
-| `amend` | Update a memory's content in place (preserves associations, history, salience) |
-| `check_memory` | Inspect a memory's encoding status, concepts, and associations |
-| `status` | System health, encoding pipeline status, source distribution |
-| `recall_project` | Get project-specific context and patterns |
-| `recall_timeline` | See what happened in a time range |
-| `recall_session` | Retrieve all memories from a specific MCP session |
-| `list_sessions` | List recent sessions with time range and memory count |
-| `session_summary` | Summarize current/recent session |
-| `get_patterns` | View discovered recurring patterns (returns IDs for dismissal, supports `min_strength`) |
-| `get_insights` | View metacognition observations and abstractions (returns IDs for dismissal) |
-| `feedback` | Report recall quality (drives ranking, can auto-suppress noisy memories) |
-| `audit_encodings` | Review recent encoding quality and suggest improvements |
-| `coach_local_llm` | Write coaching guidance to improve local LLM prompts |
-| `ingest_project` | Bulk-ingest a project directory into memory |
-| `exclude_path` | Add a watcher exclusion pattern at runtime |
-| `list_exclusions` | List all runtime watcher exclusion patterns |
-| `dismiss_pattern` | Archive a stale or irrelevant pattern to stop it surfacing in recall |
-| `dismiss_abstraction` | Archive a stale or irrelevant principle/axiom to stop it surfacing in recall |
-| `create_handoff` | Store structured session handoff notes (high salience, surfaced by recall_project) |
+| macOS ARM | Full support |
+| Linux x86_64 | Full support (systemd) |
+| Windows x86_64 | Full support (Windows Services) |
+
+## MCP Tools (7)
+
+| Tool | Purpose |
+|------|---------|
+| `remember` | Store decisions, errors, insights, learnings |
+| `recall` | Semantic search with spread activation |
+| `recall_project` | Project context + recent activity |
+| `batch_recall` | Multiple recall queries in parallel |
+| `feedback` | Rate recall quality (drives Hebbian learning) |
+| `status` | System health |
+| `amend` | Update a memory in place |
 
 ### At Session Start
 
-- Use `recall_project` to load context for the current project
-- Use `recall` with relevant keywords to find prior decisions
+- `recall_project` — project context
+- `recall` or `batch_recall` — task-specific context
 
 ### During Work
 
-- `remember` decisions with `type: "decision"` — e.g., "chose SQLite over Postgres for simplicity"
-- `remember` errors with `type: "error"` — e.g., "nil pointer in auth middleware, fixed with guard clause"
-- `remember` insights with `type: "insight"` — e.g., "spread activation works best with 3 hops max"
-- `remember` learnings with `type: "learning"` — e.g., "Go's sql.NullString needed for nullable columns"
+- `remember` decisions, errors, insights, learnings
+- `recall` before entering unfamiliar territory
+- `amend` stale memories instead of creating new ones
 
 ### After Recalls
 
-- Use `feedback` to rate recall quality — this helps the system improve
-- `helpful` = memories were relevant and useful
-- `partial` = some relevant, some not
-- `irrelevant` = memories didn't help
+- `feedback` — rate quality (helpful/partial/irrelevant)
 
-### Memory Types
+## Known Issues
 
-When using `remember`, set the `type` field:
+See [GitHub Issues](https://github.com/appsprout-dev/mnemonic/issues) for tracked bugs.
 
-- `decision` — architectural choices, tradeoffs, "we chose X because Y"
-- `error` — bugs found, error patterns, debugging insights
-- `insight` — realizations about code, architecture, or process
-- `learning` — new knowledge, API behaviors, framework quirks
-- `general` — everything else (default)
+**Active branch:** `feat/heuristic-pipeline` (PR #374), a major refactor removing all LLM dependencies.
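
Note on the recall path described in the new Architecture section (`FTS5 + embedding search → spread activation → rank`): the spread activation step can be illustrated with a small Go sketch. Everything here is an assumption for illustration, not Mnemonic's actual implementation: the `edge` type, the `spread` and `rank` functions, the hop limit, the decay factor, and the "keep the highest activation" rule are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// edge is a weighted association between two memories (weight in [0, 1]).
type edge struct {
	to     string
	weight float64
}

// spread propagates activation from seed memories across associations.
// Each hop passes activation*weight*decay to neighbors, and a memory keeps
// the highest activation it has received so far.
func spread(graph map[string][]edge, seeds map[string]float64, hops int, decay float64) map[string]float64 {
	act := make(map[string]float64, len(seeds))
	frontier := make(map[string]float64, len(seeds))
	for id, a := range seeds {
		act[id] = a
		frontier[id] = a
	}
	for h := 0; h < hops && len(frontier) > 0; h++ {
		next := make(map[string]float64)
		for id, a := range frontier {
			for _, e := range graph[id] {
				v := a * e.weight * decay
				if v > act[e.to] {
					act[e.to] = v
					next[e.to] = v
				}
			}
		}
		frontier = next
	}
	return act
}

// rank returns memory IDs ordered by activation, highest first.
// Ties are broken by ID so the order is stable.
func rank(act map[string]float64) []string {
	ids := make([]string, 0, len(act))
	for id := range act {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		if act[ids[i]] != act[ids[j]] {
			return act[ids[i]] > act[ids[j]]
		}
		return ids[i] < ids[j]
	})
	return ids
}

func main() {
	// Seed scores stand in for the FTS5 + embedding search results.
	graph := map[string][]edge{
		"fts5-migration": {{to: "sqlite-choice", weight: 0.8}},
		"sqlite-choice":  {{to: "store-interface", weight: 0.5}},
	}
	seeds := map[string]float64{"fts5-migration": 1.0}
	act := spread(graph, seeds, 2, 0.5)
	for _, id := range rank(act) {
		fmt.Printf("%s %.2f\n", id, act[id])
	}
}
```

In this sketch the direct search hit ranks first, and memories reached only through associations rank lower the more hops away they are. A `feedback` rating could then adjust the edge weights, which is one way to read "drives Hebbian learning" in the tool table.
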