Merge pull request #244 from AppSprout-dev/docs/update-readme-architecture

CalebisGross · web-flow · commit 6aca16f2be5a · 2026-03-20T02:05:32.000-04:00
docs: update README, CLAUDE.md, and ARCHITECTURE.md for v0.22+
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
@@ -128,31 +128,47 @@ Raw Event → Size Filter → Pattern Blacklist → Frequency Dedup → Content
 
 ### Layer 2 — Encoding (Compression & Linking)
 
-Triggered by `RawMemoryCreated` event:
-1. LLM compresses raw content → summary + concepts (structured output)
-2. Generate embedding via embedding provider
-3. Find similar memories via embedding + FTS
-4. Create association links with strength weights
-5. Emit `MemoryEncoded` event
+Triggered by `RawMemoryCreated` event. MCP-sourced memories are processed first (priority queue by source).
+
+1. Atomic claim via `ClaimRawForEncoding` (prevents duplicate encoding across processes)
+2. LLM compresses raw content → summary + concepts (structured output, enforced vocabulary)
+3. Post-process concepts: strip metadata, normalize casing, deduplicate
+4. Generate embedding via embedding provider
+5. Deduplication check: if cosine similarity to an existing memory exceeds the threshold, boost that memory instead of storing a duplicate
+6. Find similar memories via embedding + FTS, create association links
+7. Emit `MemoryEncoded` event
 
 ### Layer 3 — Consolidation (Sleep Cycle)
 
-Runs every 6 hours (or on-demand). **Budget-constrained**: max 100 memories per cycle.
+Runs periodically (configurable, default 1h) or on-demand. **Budget-constrained**: max 100 memories per cycle.
 
 Operations in order:
-1. **Decay**: `new_salience = salience * decay_rate^(hours_since_access)`
+1. **Decay**: `new_salience = salience * decay_rate^(hours_since_access)` with recency protection
 2. **State transitions**: active → fading (< 0.3) → archived (< 0.1) → deleted (> 90 days)
 3. **Strengthen**: Recently accessed memories get salience boost
 4. **Prune associations**: Weaken/remove low-strength, never-activated links
 5. **Merge** (max 5 per cycle): Cluster highly-related memories → LLM creates gist memory
+6. **Archive never-recalled noise**: Non-MCP memories with 0 access after 30 days → archived
+7. **Pattern extraction**: Identify recurring themes, deduplicate near-identical patterns
+8. **Abstraction dedup**: Archive zombie abstractions with near-zero confidence
 
 ### Layer 4 — Retrieval (Associative Recall)
 
 ```
-Query → [Parse + Embed] → Entry Points (FTS top 3 + Embedding top 3) → Spread Activation (3 hops) → Rank → Top 7 → [Optional: LLM Synthesis]
+Query → [Parse + Embed] → Entry Points (FTS + Embedding) → Spread Activation (3 hops) → Rank → Filter → Diversity (MMR) → [Optional: LLM Synthesis]
 ```
 
-Spread activation follows strongest association links, with activation decaying per hop. Every accessed memory gets strengthened (access_count++, last_accessed updated).
+Ranking considers multiple signals:
+- **Activation score** from spread activation traversal
+- **Recency bonus** (exponential decay from last access)
+- **Source weight** (MCP: 1.0, terminal: 0.8, filesystem: 0.5 -- configurable)
+- **Feedback adjustment** (memories rated helpful get boosted, irrelevant get penalized)
+- **Pattern evidence boost** (+0.1 for memories supporting matched patterns, +0.05 for abstraction sources)
+- **Significance multiplier** (critical/important memories from structured concept extraction)
+
+Suppressed memories (accumulated negative feedback) are filtered out by default. Every accessed memory gets strengthened (access_count++, last_accessed updated).
+
+The `recall` tool supports `explain: true` to surface score breakdowns, `include_associations: true` to show the knowledge graph, and `format: "json"` for structured output.
 
 ### Layer 5 — Episoding (Temporal Clustering)
 
@@ -195,7 +211,7 @@ Runs periodically (default every 2 hours). Builds hierarchical knowledge:
 - **Level 2 — Principles**: Generalizations across patterns
 - **Level 3 — Axioms**: Fundamental truths with high confidence
 
-Abstractions are grounded in evidence. Those that lose supporting evidence are demoted or archived.
+Abstractions are grounded in evidence via `verifyGrounding()`. Young abstractions (< 7 days) have a confidence floor of 0.5 to prevent premature demotion. Frequently accessed abstractions (> 5 recalls) resist decay. Demotion is graduated: 0.9x for moderate evidence loss, 0.7x for significant, 0.5x for near-total (softened from 0.3x).
 
 ### Layer 9 — Reactor (Event-Driven Rules)
 
@@ -236,6 +252,7 @@ GET    /abstractions            Hierarchical abstractions
 GET    /projects                Project summaries
 
 GET    /llm/usage               LLM token usage by agent
+GET    /tool/usage              MCP tool call analytics
 
 GET    /graph                   Association graph for D3.js visualization
 
@@ -265,6 +282,7 @@ Served at `http://localhost:9999/`. Features:
 - Query tester with score explanations
 - System health (LLM status, store health, watcher status)
 - LLM usage monitoring (per-agent token consumption and cost)
+- MCP tool usage analytics (call frequency, latency, error rates)
 - Memory source tags (hoverable, showing origin: filesystem, terminal, clipboard, MCP, consolidation)
 - 5 themes: Midnight, Ember, Nord, Slate, Parchment (persists in localStorage)
 - Agent SDK dashboard: evolution state, principles, strategies, session timeline, chat
@@ -442,6 +460,47 @@ CREATE TABLE llm_usage (
     latency_ms INTEGER NOT NULL DEFAULT 0,
     success INTEGER NOT NULL DEFAULT 1
 );
+
+-- MCP tool usage tracking
+CREATE TABLE tool_usage (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    timestamp TEXT NOT NULL,
+    tool_name TEXT NOT NULL,
+    session_id TEXT NOT NULL DEFAULT '',
+    project TEXT NOT NULL DEFAULT '',
+    latency_ms INTEGER NOT NULL DEFAULT 0,
+    success INTEGER NOT NULL DEFAULT 1,
+    error_message TEXT NOT NULL DEFAULT '',
+    query_text TEXT NOT NULL DEFAULT '',
+    result_count INTEGER NOT NULL DEFAULT 0,
+    memory_type TEXT NOT NULL DEFAULT '',
+    rating TEXT NOT NULL DEFAULT '',
+    response_size INTEGER NOT NULL DEFAULT 0
+);
+
+-- Memory amendment audit trail
+CREATE TABLE memory_amendments (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    memory_id TEXT NOT NULL,
+    old_content TEXT NOT NULL,
+    old_summary TEXT NOT NULL,
+    new_content TEXT NOT NULL,
+    new_summary TEXT NOT NULL,
+    amended_at TEXT NOT NULL,
+    source TEXT NOT NULL DEFAULT 'mcp'
+);
+
+-- Runtime watcher exclusions (managed via MCP tools)
+CREATE TABLE runtime_exclusions (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    pattern TEXT NOT NULL UNIQUE,
+    source TEXT NOT NULL DEFAULT 'mcp',
+    created_at TEXT NOT NULL
+);
+
+-- Additional columns on memories table (added via migrations):
+-- feedback_score INTEGER DEFAULT 0     — accumulated feedback (helpful=+1, irrelevant=-1)
+-- recall_suppressed INTEGER DEFAULT 0  — auto-suppressed when feedback_score <= -3
 ```
 
 ---
@@ -493,7 +552,7 @@ mnemonic/
 │   │   ├── server.go                      # Static file serving (go:embed)
 │   │   └── static/index.html              # Dashboard (D3.js graph, live feed, query tester)
 │   ├── ingest/                            # Project ingestion engine
-│   ├── mcp/server.go                      # MCP server (13 tools for Claude Code)
+│   ├── mcp/server.go                      # MCP server (19 tools for Claude Code)
 │   ├── backup/                            # Export/import logic
 │   ├── daemon/                            # Service management (macOS LaunchAgent + Linux systemd)
 │   ├── config/config.go                   # Configuration loading
@@ -513,7 +572,7 @@ mnemonic/
 
 ## Build History
 
-All original build phases are **complete**. Current focus is SDK features, dashboard polish, and recall quality.
+All original build phases are **complete**. Current focus is memory quality, retrieval intelligence, and training a bespoke local LLM (Mnemonic-LM).
 
 ### Completed
 
@@ -531,12 +590,9 @@ All original build phases are **complete**. Current focus is SDK features, dashb
 
 - **Multi-modal memory** (images, audio) — text-only for v1
 - **Cross-device sync** — single machine for v1
-- **User preference learning** — needs feedback data from v1
 - **Advanced consolidation** (hierarchical memory, schema learning)
 - **Database encryption** — air-gapped assumption covers v1
-- **Local model fine-tuning** — LM Studio handles v1
 - **Native macOS menu bar widget** — web dashboard covers v1, native UI later
-- ~~**MCP server integration**~~ — **Done.** 13 MCP tools implemented (`internal/mcp/server.go`)
 
 ---
 
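Review note on the ARCHITECTURE.md hunks above: the Layer 3 decay formula and salience cutoffs can be sketched in Go as follows. This is a minimal illustration, not the project's implementation. The names `State`, `Decay`, and `Classify` are hypothetical, and the sketch leaves out the "recency protection" and the 90-day deletion step because the diff does not spell out their rules.

```go
package main

import "math"

// State is a hypothetical lifecycle label; the values follow the
// "active → fading → archived" wording in the doc.
type State string

const (
	Active   State = "active"
	Fading   State = "fading"
	Archived State = "archived"
)

// Decay applies the documented formula:
// new_salience = salience * decay_rate^(hours_since_access).
func Decay(salience, decayRate, hoursSinceAccess float64) float64 {
	return salience * math.Pow(decayRate, hoursSinceAccess)
}

// Classify maps a salience value onto a state using the documented
// cutoffs (< 0.3 fading, < 0.1 archived). Treating the cutoffs as
// strict inequalities is an assumption taken from the "<" signs.
func Classify(salience float64) State {
	switch {
	case salience < 0.1:
		return Archived
	case salience < 0.3:
		return Fading
	default:
		return Active
	}
}
```

For example, `Decay(1.0, 0.5, 2)` yields `0.25`, which `Classify` labels as fading.
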
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -34,7 +34,7 @@ internal/
     reactor/           Event-driven rule engine
   api/                 REST API server + routes
   web/                 Embedded dashboard (single-page app, D3.js charts)
-  mcp/                 MCP server (13 tools for Claude Code)
+  mcp/                 MCP server (19 tools for Claude Code)
   store/               Store interface + SQLite implementation
   llm/                 LLM provider interface + implementations (LM Studio, Gemini/cloud API)
   ingest/              Project ingestion engine
@@ -107,23 +107,29 @@ See [GitHub Issues](https://github.com/appsprout-dev/mnemonic/issues) for tracke
 
 ## MCP Tools Available
 
-You have 13 tools via the `mnemonic` MCP server:
+You have 19 tools via the `mnemonic` MCP server:
 
 | Tool | When to Use |
 |------|-------------|
-| `remember` | Store decisions, errors, insights, learnings |
-| `recall` | Search for relevant memories before starting work |
+| `remember` | Store decisions, errors, insights, learnings (returns raw ID + salience) |
+| `recall` | Semantic search with spread activation (`explain`, `include_associations`, `format`, `synthesize` params) |
 | `forget` | Archive irrelevant memories |
-| `status` | Check memory system health and stats |
+| `amend` | Update a memory's content in place (preserves associations, history, salience) |
+| `check_memory` | Inspect a memory's encoding status, concepts, and associations |
+| `status` | System health, encoding pipeline status, source distribution |
 | `recall_project` | Get project-specific context and patterns |
 | `recall_timeline` | See what happened in a time range |
+| `recall_session` | Retrieve all memories from a specific MCP session |
+| `list_sessions` | List recent sessions with time range and memory count |
 | `session_summary` | Summarize current/recent session |
 | `get_patterns` | View discovered recurring patterns |
 | `get_insights` | View metacognition observations and abstractions |
-| `feedback` | Report recall quality (helps system learn) |
+| `feedback` | Report recall quality (drives ranking, can auto-suppress noisy memories) |
 | `audit_encodings` | Review recent encoding quality and suggest improvements |
 | `coach_local_llm` | Write coaching guidance to improve local LLM prompts |
 | `ingest_project` | Bulk-ingest a project directory into memory |
+| `exclude_path` | Add a watcher exclusion pattern at runtime |
+| `list_exclusions` | List all runtime watcher exclusion patterns |
 
 ### At Session Start
 
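Review note on the `feedback` row above: the "can auto-suppress noisy memories" behaviour can be sketched from the schema comments in the ARCHITECTURE.md diff (`helpful=+1`, `irrelevant=-1`, suppressed when `feedback_score <= -3`). The type and method names here are hypothetical. Recomputing the flag on every rating, so that a memory can recover once its score rises, is an assumption; the diff does not say whether suppression is reversible.

```go
package main

// suppressThreshold mirrors the documented rule:
// recall_suppressed is set when feedback_score <= -3.
const suppressThreshold = -3

// MemoryFeedback is a hypothetical in-memory view of the two columns
// added to the memories table.
type MemoryFeedback struct {
	FeedbackScore    int
	RecallSuppressed bool
}

// ApplyRating accumulates one rating and refreshes the suppression flag.
// Ratings other than "helpful" and "irrelevant" leave the score unchanged
// (an assumption; the source lists only these two values).
func (m *MemoryFeedback) ApplyRating(rating string) {
	switch rating {
	case "helpful":
		m.FeedbackScore++
	case "irrelevant":
		m.FeedbackScore--
	}
	// Assumption: the flag is recomputed each time rather than latched.
	m.RecallSuppressed = m.FeedbackScore <= suppressThreshold
}
```

Under these assumptions, three `irrelevant` ratings on a fresh memory bring the score to -3 and set `RecallSuppressed`.
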
diff --git a/README.md b/README.md
@@ -13,7 +13,7 @@ A local-first semantic memory daemon that watches your work, learns from it, and
 - **Autonomous** — Watches your filesystem, terminal, and clipboard. Encodes memories without you lifting a finger.
 - **Biological** — Memories consolidate, decay, form patterns, and become principles. It doesn't just store — it *processes*.
 - **Local-first** — Air-gapped, SQLite-backed, never phones home. Your data stays on your machine.
-- **13 MCP tools** — Drop-in memory layer for Claude Code and other AI agents.
+- **19 MCP tools** — Drop-in memory layer for Claude Code and other AI agents.
 - **Self-updating** — Built-in update mechanism checks GitHub Releases and applies updates in-place.
 - **Cross-platform** — macOS, Linux, and Windows. Daemon management via launchd, systemd, or Windows Services.
 
@@ -42,7 +42,7 @@ Or [build from source](#development) (requires Go 1.23+).
 **Configure and run:**
 
 ```bash
-cp config.yaml ~/.mnemonic/config.yaml
+cp config.example.yaml ~/.mnemonic/config.yaml
 # Edit ~/.mnemonic/config.yaml — set llm.endpoint, llm.chat_model, llm.embedding_model
 # For local LLM: see docs/setup-lmstudio.md
 # For Gemini: set endpoint to Gemini API URL and export LLM_API_KEY
@@ -72,6 +72,7 @@ Open `http://127.0.0.1:9999` for the embedded web UI:
 - **Explore** — Browse episodes, memories, patterns, and abstractions
 - **Timeline** — Chronological view with date range filters and type/tag filtering
 - **LLM** — Per-agent token consumption, cost tracking, and usage charts
+- **Tools** — MCP tool usage analytics: call frequency, latency, error rates
 - **SDK** — Agent evolution dashboard: principles, strategies, session timeline, chat interface
 - **Activity drawer** — Slide-out panel with live event feed and metacognition insights
 - **Themes** — 5 dashboard themes: Midnight, Ember, Nord, Slate, Parchment
@@ -87,8 +88,8 @@ Mnemonic implements a cognitive pipeline inspired by neuroscience — 8 agents p
 1. **Perception** — Watch filesystem, terminal, clipboard, MCP events. Pre-filter with heuristics.
 2. **Encoding** — LLM-powered compression into memories. Extract concepts, generate embeddings, create association links.
 3. **Episoding** — Cluster memories into temporal episodes with LLM synthesis.
-4. **Consolidation** — Sleep cycle (every 6h). Decay salience, merge related memories, extract recurring patterns.
-5. **Retrieval** — Spread activation: embed query, find entry points (FTS + embedding), traverse association graph 3 hops, LLM synthesis with tool-use.
+4. **Consolidation** — Sleep cycle. Decay salience, merge related memories, extract patterns, archive never-recalled watcher noise.
+5. **Retrieval** — Spread activation: embed query, find entry points (FTS + embedding), traverse association graph 3 hops, rank by feedback history + source weight + pattern evidence, optional LLM synthesis.
 6. **Metacognition** — Self-reflection. Audit memory quality, analyze feedback, re-embed orphaned memories.
 7. **Dreaming** — Replay memories, strengthen associations, cross-pollinate across projects, generate insights.
 8. **Abstraction** — Build hierarchical knowledge: patterns (level 1) → principles (level 2) → axioms (level 3).
@@ -97,15 +98,15 @@ Mnemonic implements a cognitive pipeline inspired by neuroscience — 8 agents p
 
 **Reactor** — Event-driven rule engine. Fires condition → action chains in response to system events.
 
-**Feedback loop** — Helpful recalls strengthen associations and boost salience. Irrelevant results weaken them. The system learns from usage.
+**Feedback loop** — Helpful recalls strengthen associations, boost salience, and inform future ranking. Irrelevant results weaken associations and can auto-suppress noisy memories. Feedback scores directly influence retrieval ranking, and patterns discovered from your usage boost evidence memories.
 
 All agents communicate via an event bus — none call each other directly.
 
 For the full deep dive, see [ARCHITECTURE.md](ARCHITECTURE.md).
 
 ## MCP Integration
 
-Mnemonic exposes 13 tools via the [Model Context Protocol](https://modelcontextprotocol.io/) for Claude Code and other AI agents:
+Mnemonic exposes 19 tools via the [Model Context Protocol](https://modelcontextprotocol.io/) for Claude Code and other AI agents:
 
 **Claude Code config** (`~/.claude/settings.local.json`):
 
@@ -124,19 +125,25 @@ Mnemonic exposes 13 tools via the [Model Context Protocol](https://modelcontextp
 
 | Tool | Purpose |
 | ---- | ------- |
-| `remember` | Store decisions, errors, insights, learnings |
-| `recall` | Semantic search with spread activation |
+| `remember` | Store decisions, errors, insights, learnings (returns salience + encoding status) |
+| `recall` | Semantic search with spread activation, feedback-informed ranking, optional synthesis |
 | `forget` | Archive a memory |
-| `status` | System health and stats |
+| `amend` | Update a memory's content in place (preserves associations and history) |
+| `check_memory` | Inspect encoding status, concepts, associations for a specific memory |
+| `status` | System health, pipeline status, source distribution |
 | `recall_project` | Project-scoped context and patterns |
 | `recall_timeline` | Chronological retrieval within a time range |
+| `recall_session` | Retrieve all memories from a specific session |
+| `list_sessions` | List recent MCP sessions with metadata |
 | `session_summary` | Summarize current/recent session |
 | `get_patterns` | View discovered recurring patterns |
 | `get_insights` | View metacognition observations and abstractions |
-| `feedback` | Report recall quality (trains retrieval) |
+| `feedback` | Report recall quality (drives ranking, can auto-suppress noisy memories) |
 | `audit_encodings` | Review encoding quality |
 | `coach_local_llm` | Write coaching guidance for local LLM prompts |
 | `ingest_project` | Bulk-ingest a project directory |
+| `exclude_path` | Add a watcher exclusion pattern at runtime |
+| `list_exclusions` | List all runtime watcher exclusions |
 
 See [CLAUDE.md](CLAUDE.md) for Claude Code usage guidelines.
 
@@ -178,7 +185,7 @@ All settings live in `config.yaml`. Key sections:
 - **perception** — Watch directories, shell, clipboard; heuristic thresholds; project identity
 - **encoding** — Concept extraction, similarity search, contextual encoding
 - **consolidation** — Decay rate, salience thresholds, pattern extraction
-- **retrieval** — Spread activation hops, decay, synthesis tokens
+- **retrieval** — Spread activation hops, decay, synthesis tokens, source weights, feedback weight
 - **metacognition** — Reflection interval, feedback processing
 - **episoding** — Episode window, minimum events
 - **dreaming** — Replay interval, association boost, noise pruning
@@ -211,7 +218,7 @@ internal/
   agent/            8 cognitive agents + orchestrator + reactor
   api/              HTTP + WebSocket server
   web/              Embedded dashboard (single-page app)
-  mcp/              MCP server (13 tools)
+  mcp/              MCP server (19 tools)
   store/            Store interface + SQLite (FTS5 + vector search)
   llm/              LLM provider interface (LM Studio, Gemini, cloud APIs)
   ingest/           Project ingestion engine
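
Review note on the retrieval changes: the docs list the ranking signals but not how they are combined, so the sketch below is one plausible arrangement rather than the actual scoring code. It multiplies the activation score by the documented default source weights (MCP 1.0, terminal 0.8, filesystem 0.5), adds the documented evidence boosts (+0.1 and +0.05), and filters suppressed memories first. `Candidate`, `Score`, and `Rank` are hypothetical names. The recency bonus, feedback adjustment, significance multiplier, and MMR diversity step are omitted because their formulas are not given.

```go
package main

import "sort"

// sourceWeights holds the documented defaults; the doc says they are configurable.
var sourceWeights = map[string]float64{
	"mcp":        1.0,
	"terminal":   0.8,
	"filesystem": 0.5,
}

// Candidate is a hypothetical recall result after spread activation.
type Candidate struct {
	Activation        float64
	Source            string
	SupportsPattern   bool // memory supports a matched pattern
	AbstractionSource bool // memory is a source of an abstraction
	Suppressed        bool // accumulated negative feedback
}

// Score combines the signals. Multiplying by source weight and then
// adding the boosts is an assumption, not the documented formula.
func Score(c Candidate) float64 {
	w, ok := sourceWeights[c.Source]
	if !ok {
		w = 1.0 // assumption: unknown sources are not penalized
	}
	s := c.Activation * w
	if c.SupportsPattern {
		s += 0.1
	}
	if c.AbstractionSource {
		s += 0.05
	}
	return s
}

// Rank drops suppressed candidates and sorts the rest by descending score.
func Rank(cands []Candidate) []Candidate {
	kept := make([]Candidate, 0, len(cands))
	for _, c := range cands {
		if !c.Suppressed {
			kept = append(kept, c)
		}
	}
	sort.SliceStable(kept, func(i, j int) bool {
		return Score(kept[i]) > Score(kept[j])
	})
	return kept
}
```

With these weights, an MCP memory with activation 0.6 outranks a filesystem memory with activation 0.9, which illustrates why watcher noise tends to sink in the results.
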