Mnemonic v1 — Refined Architecture Specification

Local-first, air-gapped, agentic memory that learns through use. The architecture is the intelligence. The LLM is the judgment.

Core Design Philosophy

The cognitive model (spread activation, salience decay, associative linking) is the durable foundation — it's based on decades of cognitive science and won't be obsoleted by next month's model release. Everything else — LLM providers, storage backends, embedding models — is swappable plumbing behind clean interfaces.

Tech Stack

| Component | Choice | Rationale |
|---|---|---|
| Language | Go | Fast, single binary, great concurrency, clean daemon |
| Store | SQLite (WAL mode) | Sub-ms lookups, FTS5, ACID, single file, embedded |
| LLM runtime | LM Studio / Gemini API / any OpenAI-compatible provider | Local or cloud, model-agnostic |
| Embeddings | Provider-supplied (e.g. embeddinggemma, Gemini embedding) | Separate model slot, local or cloud semantic search |
| Platform | macOS ARM (primary), Linux x86_64, Windows x86_64 | Cross-platform via build tags |

Core Abstractions (Interfaces)

These are the load-bearing walls. Get them right and everything downstream is swappable.

1. `llm.Provider`

type Provider interface {
    Complete(ctx context.Context, req CompletionRequest) (CompletionResponse, error)
    Embed(ctx context.Context, text string) ([]float32, error)
    BatchEmbed(ctx context.Context, texts []string) ([][]float32, error)
    Health(ctx context.Context) error
    ModelInfo(ctx context.Context) (ModelMetadata, error)
}

Implementations: LM Studio (local, OpenAI-compatible), Gemini API (cloud), any OpenAI-compatible endpoint
Instrumented wrapper tracks per-call token usage, latency, and caller identity
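
The instrumented wrapper can be read as a decorator around the provider. The sketch below is a minimal illustration under stated assumptions, not the project's code: `Embedder` is a reduced stand-in for `llm.Provider`, and `UsageRecord`, `Instrumented`, and `fakeEmbedder` are hypothetical names introduced only for this example.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Embedder is a reduced, hypothetical slice of llm.Provider.
type Embedder interface {
	Embed(ctx context.Context, text string) ([]float32, error)
}

// UsageRecord is an assumed shape for one per-call measurement.
type UsageRecord struct {
	Operation string
	Caller    string
	Latency   time.Duration
	Success   bool
}

// Instrumented wraps an Embedder and reports every call to a sink.
type Instrumented struct {
	inner  Embedder
	caller string
	record func(UsageRecord)
}

// Embed forwards to the wrapped provider and records latency and outcome.
func (p *Instrumented) Embed(ctx context.Context, text string) ([]float32, error) {
	start := time.Now()
	vec, err := p.inner.Embed(ctx, text)
	p.record(UsageRecord{
		Operation: "embed",
		Caller:    p.caller,
		Latency:   time.Since(start),
		Success:   err == nil,
	})
	return vec, err
}

// fakeEmbedder stands in for a real provider; it returns the text length.
type fakeEmbedder struct{}

func (fakeEmbedder) Embed(_ context.Context, text string) ([]float32, error) {
	return []float32{float32(len(text))}, nil
}

// demo makes one instrumented call and returns the vector and the records.
func demo() ([]float32, []UsageRecord) {
	var records []UsageRecord
	p := &Instrumented{
		inner:  fakeEmbedder{},
		caller: "encoding",
		record: func(r UsageRecord) { records = append(records, r) },
	}
	vec, _ := p.Embed(context.Background(), "hello")
	return vec, records
}

func main() {
	vec, records := demo()
	fmt.Println(vec, records[0].Operation, records[0].Caller, records[0].Success)
}
```

Because the wrapper satisfies the same interface as the provider it wraps, agents would not need to know whether they hold an instrumented or a bare provider.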

2. `store.Store`

type Store interface {
    // Raw memory CRUD
    WriteRaw(ctx context.Context, raw RawMemory) error
    ListRawUnprocessed(ctx context.Context, limit int) ([]RawMemory, error)
    // Encoded memory CRUD
    WriteMemory(ctx context.Context, mem Memory) error
    GetMemory(ctx context.Context, id string) (Memory, error)
    UpdateSalience(ctx context.Context, id string, salience float32) error
    // Search (FTS + embedding + concepts)
    SearchByFullText(ctx context.Context, query string, limit int) ([]Memory, error)
    SearchByEmbedding(ctx context.Context, embedding []float32, limit int) ([]RetrievalResult, error)
    // Association graph
    CreateAssociation(ctx context.Context, assoc Association) error
    GetAssociations(ctx context.Context, memoryID string) ([]Association, error)
    // Spread activation (first-class operation)
    SpreadActivation(ctx context.Context, entryPoints []Memory, maxHops int, config ActivationConfig) ([]RetrievalResult, error)
    // Batch ops for consolidation
    BatchUpdateSalience(ctx context.Context, updates map[string]float32) error
    BatchMergeMemories(ctx context.Context, sourceIDs []string, gistMemory Memory) error
    // Transactions
    BeginTx(ctx context.Context) (Tx, error)
    // Housekeeping
    GetStatistics(ctx context.Context) (StoreStatistics, error)
}

v1 implementation: SQLite with FTS5 + binary embeddings
Future: PostgreSQL, vector DBs — zero agent code changes
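
As an illustration of what "binary embeddings" could look like on disk, the sketch below packs a `[]float32` into bytes suitable for a BLOB column and reads it back. The little-endian layout and the function names `encodeEmbedding` and `decodeEmbedding` are assumptions made for this example; the spec does not state the actual encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeEmbedding packs a float32 vector into 4 bytes per element.
// Little-endian byte order is an assumption, not taken from the spec.
func encodeEmbedding(vec []float32) []byte {
	buf := make([]byte, 4*len(vec))
	for i, v := range vec {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(v))
	}
	return buf
}

// decodeEmbedding reverses encodeEmbedding and rejects truncated blobs.
func decodeEmbedding(blob []byte) ([]float32, error) {
	if len(blob)%4 != 0 {
		return nil, fmt.Errorf("embedding blob length %d is not a multiple of 4", len(blob))
	}
	vec := make([]float32, len(blob)/4)
	for i := range vec {
		vec[i] = math.Float32frombits(binary.LittleEndian.Uint32(blob[4*i:]))
	}
	return vec, nil
}

func main() {
	blob := encodeEmbedding([]float32{0.25, -1.5, 3})
	vec, err := decodeEmbedding(blob)
	fmt.Println(len(blob), vec, err)
}
```

A fixed-width layout like this keeps the vector dimension recoverable from the blob length alone, which is one reason it is a common choice for embedded stores.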

3. `watcher.Watcher`

type Watcher interface {
    Start(ctx context.Context) error
    Stop(ctx context.Context) error
    Events() <-chan Event
    Health(ctx context.Context) error
}

v1 implementations: Filesystem, Terminal History, Clipboard, Git
Each watcher is independent and pluggable

4. `agent.Agent`

type Agent interface {
    Name() string
    Run(ctx context.Context, bus events.EventBus) error
    Health(ctx context.Context) error
}

All 8 cognitive agents + orchestrator implement this (reactor is a separate engine)
Loosely coupled through the event bus

5. `events.EventBus`

type EventBus interface {
    Publish(ctx context.Context, event Event) error
    Subscribe(eventType string, handler Handler) error
    Close() error
}

v1 implementation: In-memory pub/sub
Agents communicate through events, never direct calls
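
To make the in-memory pub/sub concrete, here is a minimal synchronous sketch. The `Event` and `Handler` shapes are assumptions (the spec does not define them), and delivering events synchronously in registration order is a simplification chosen for the example; a real bus may well dispatch asynchronously.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Event and Handler are hypothetical shapes for illustration.
type Event struct {
	Type    string
	Payload any
}

type Handler func(ctx context.Context, e Event) error

// InMemoryBus is a synchronous pub/sub sketch guarded by a RWMutex.
type InMemoryBus struct {
	mu       sync.RWMutex
	handlers map[string][]Handler
	closed   bool
}

func NewInMemoryBus() *InMemoryBus {
	return &InMemoryBus{handlers: make(map[string][]Handler)}
}

// Subscribe registers a handler for one event type.
func (b *InMemoryBus) Subscribe(eventType string, h Handler) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.closed {
		return errors.New("bus closed")
	}
	b.handlers[eventType] = append(b.handlers[eventType], h)
	return nil
}

// Publish calls each subscriber in registration order and joins their errors.
func (b *InMemoryBus) Publish(ctx context.Context, e Event) error {
	b.mu.RLock()
	if b.closed {
		b.mu.RUnlock()
		return errors.New("bus closed")
	}
	hs := append([]Handler(nil), b.handlers[e.Type]...) // copy so handlers run unlocked
	b.mu.RUnlock()

	var errs []error
	for _, h := range hs {
		if err := h(ctx, e); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

// Close marks the bus closed; later Publish and Subscribe calls fail.
func (b *InMemoryBus) Close() error {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.closed = true
	return nil
}

// demo subscribes to one event type and returns the types it received.
func demo() []string {
	bus := NewInMemoryBus()
	var seen []string
	_ = bus.Subscribe("memory_encoded", func(_ context.Context, e Event) error {
		seen = append(seen, e.Type)
		return nil
	})
	_ = bus.Publish(context.Background(), Event{Type: "raw_memory_created"})
	_ = bus.Publish(context.Background(), Event{Type: "memory_encoded"})
	return seen
}

func main() {
	fmt.Println(demo())
}
```

The publisher never references a subscriber directly, which is the property the architecture relies on when it says agents never call each other.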

Cognitive Layers

Layer 1 — Perception (Full Ingestion)

Four watchers active from v1:

Filesystem: Watch configured dirs, track create/modify/delete
Terminal: Poll shell history, detect new entries
Clipboard: Poll for text changes
Git: Watch repositories for commits, branches, and diffs

Heuristic pre-filter pipeline (before any LLM call):

Raw Event → Size Filter → Pattern Blacklist → Frequency Dedup → Content Heuristic → [LLM Gate] → RawMemory

Size: Skip >100KB or <10 chars
Patterns: Skip .git, node_modules, temp files, .DS_Store
Frequency: Skip if same event >5 times in 10 min
Content: Simple keyword scoring (error, todo, important → higher score)
Only events passing heuristics go to LLM for salience judgment
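
The stages above can be sketched as a chain of cheap predicates that short-circuits on the first rejection. This is an illustrative sketch only: `RawEvent`, `Filter`, and the function names are hypothetical, the frequency filter keys on the path as a stand-in for "same event", and temp-file patterns are omitted for brevity.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// RawEvent is a hypothetical, reduced perception event.
type RawEvent struct {
	Path    string
	Content string
	At      time.Time
}

// Filter reports whether an event should continue down the pipeline.
type Filter func(e RawEvent) bool

// sizeFilter skips content over 100KB or under 10 characters.
func sizeFilter(e RawEvent) bool {
	n := len(e.Content)
	return n >= 10 && n <= 100*1024
}

// patternFilter skips paths that match the blacklist.
func patternFilter(e RawEvent) bool {
	for _, bad := range []string{".git/", "node_modules/", ".DS_Store"} {
		if strings.Contains(e.Path, bad) {
			return false
		}
	}
	return true
}

// newFrequencyFilter rejects an event once the same path has been seen
// more than max times inside the window.
func newFrequencyFilter(max int, window time.Duration) Filter {
	seen := map[string][]time.Time{}
	return func(e RawEvent) bool {
		cutoff := e.At.Add(-window)
		kept := seen[e.Path][:0]
		for _, t := range seen[e.Path] {
			if t.After(cutoff) {
				kept = append(kept, t)
			}
		}
		kept = append(kept, e.At)
		seen[e.Path] = kept
		return len(kept) <= max
	}
}

// keywordScore adds one point per distinct keyword found in the content.
func keywordScore(content string) float64 {
	lower := strings.ToLower(content)
	score := 0.0
	for _, kw := range []string{"error", "todo", "important"} {
		if strings.Contains(lower, kw) {
			score++
		}
	}
	return score
}

// passes runs the filters in order and stops at the first rejection.
func passes(e RawEvent, filters ...Filter) bool {
	for _, f := range filters {
		if !f(e) {
			return false
		}
	}
	return true
}

func main() {
	freq := newFrequencyFilter(5, 10*time.Minute)
	e := RawEvent{Path: "notes/plan.md", Content: "TODO: fix the error in the importer", At: time.Now()}
	if passes(e, sizeFilter, patternFilter, freq) {
		fmt.Println("forward to LLM gate, heuristic score:", keywordScore(e.Content))
	}
}
```

Ordering the cheapest checks first means most noise is discarded before any per-path state is touched, and well before an LLM call is considered.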

Layer 2 — Encoding (Compression & Linking)

Triggered by RawMemoryCreated event. MCP-sourced memories are processed first (priority queue by source).

Atomic claim via ClaimRawForEncoding (prevents duplicate encoding across processes)
LLM compresses raw content → summary + concepts (structured output, enforced vocabulary)
Post-process concepts: strip metadata, normalize casing, deduplicate
Generate embedding via embedding provider
Deduplication check: if cosine similarity > threshold, boost existing memory instead
Find similar memories via embedding + FTS, create association links
Emit MemoryEncoded event
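
The deduplication check in the steps above rests on cosine similarity between embeddings. The sketch below shows that check in isolation; the `Memory` struct is reduced for the example, `findDuplicate` is a hypothetical helper, and the 0.95 threshold used in `main` is an assumed value since the spec does not give one.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors.
// It returns 0 when lengths differ or either vector has zero magnitude.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Memory is a reduced, hypothetical view of an encoded memory.
type Memory struct {
	ID        string
	Embedding []float32
}

// findDuplicate returns the index of the most similar stored memory whose
// similarity exceeds threshold, or -1 when none does.
func findDuplicate(candidate []float32, stored []Memory, threshold float64) int {
	best, bestSim := -1, threshold
	for i, m := range stored {
		if sim := cosine(candidate, m.Embedding); sim > bestSim {
			best, bestSim = i, sim
		}
	}
	return best
}

func main() {
	stored := []Memory{
		{ID: "a", Embedding: []float32{1, 0}},
		{ID: "b", Embedding: []float32{0, 1}},
	}
	// 0.95 is an assumed threshold for illustration.
	if i := findDuplicate([]float32{0.99, 0.05}, stored, 0.95); i >= 0 {
		fmt.Println("boost existing memory", stored[i].ID)
	} else {
		fmt.Println("store as new memory")
	}
}
```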

Layer 3 — Consolidation (Sleep Cycle)

Runs periodically (configurable, default 1h) or on-demand. Budget-constrained: max 100 memories per cycle.

Operations in order:

Decay: new_salience = salience * decay_rate^(hours_since_access) with recency protection
State transitions: active → fading (salience < 0.3) → archived (salience < 0.1) → deleted (> 90 days)
Strengthen: Recently accessed memories get salience boost
Prune associations: Weaken/remove low-strength, never-activated links
Merge (max 5 per cycle): Cluster highly-related memories → LLM creates gist memory
Archive never-recalled noise: Non-MCP memories with 0 access after 30 days → archived
Pattern extraction: Identify recurring themes, deduplicate near-identical patterns
Abstraction dedup: Archive zombie abstractions with near-zero confidence
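
The decay and state-transition steps can be sketched as two small functions. The formula and the 0.3 / 0.1 thresholds come from the list above; everything else is an assumption for illustration: "recency protection" is modeled here as skipping decay inside a protection window, and the 0.99 hourly rate and 24-hour window in `main` are hypothetical values.

```go
package main

import (
	"fmt"
	"math"
)

// decay applies new_salience = salience * decayRate^hoursSinceAccess.
// Memories accessed within protectHours are left untouched; this reading
// of "recency protection" is an assumption.
func decay(salience, decayRate, hoursSinceAccess, protectHours float64) float64 {
	if hoursSinceAccess < protectHours {
		return salience
	}
	return salience * math.Pow(decayRate, hoursSinceAccess)
}

// stateFor maps a salience value to the lifecycle state it implies.
// Time-based deletion (> 90 days) is not modeled here.
func stateFor(salience float64) string {
	switch {
	case salience < 0.1:
		return "archived"
	case salience < 0.3:
		return "fading"
	default:
		return "active"
	}
}

func main() {
	const decayRate = 0.99 // hypothetical per-hour rate
	for _, h := range []float64{1, 48, 240} {
		s := decay(0.8, decayRate, h, 24)
		fmt.Printf("%.0fh since access: salience %.3f, state %s\n", h, s, stateFor(s))
	}
}
```

Because the exponent is hours since last access, any recall resets the clock, which is how use keeps a memory alive.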

Layer 4 — Retrieval (Associative Recall)

Query → [Parse + Embed] → Entry Points (FTS + Embedding) → Spread Activation (3 hops) → Rank → Filter → Diversity (MMR) → [Optional: LLM Synthesis]

Ranking considers multiple signals:

Activation score from spread activation traversal
Recency bonus (exponential decay from last access)
Source weight (MCP: 1.0, terminal: 0.8, filesystem: 0.5; configurable)
Feedback adjustment (memories rated helpful get boosted, irrelevant get penalized)
Pattern evidence boost (+0.1 for memories supporting matched patterns, +0.05 for abstraction sources)
Significance multiplier (critical/important memories from structured concept extraction)

Suppressed memories (accumulated negative feedback) are filtered out by default. Every accessed memory gets strengthened (access_count++, last_accessed updated).

The recall tool supports explain: true to surface score breakdowns, include_associations: true to show the knowledge graph, and format: "json" for structured output.
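
The spread-activation step of the pipeline can be illustrated with a small graph traversal. This is a sketch under assumptions rather than the project's algorithm: activation is propagated as `activation * strength * hopDecay`, each node keeps its best activation, and the names `Edge`, `Result`, and `hopDecay` are introduced for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Edge is a weighted association to another memory.
type Edge struct {
	Target   string
	Strength float64
}

// Result pairs a memory ID with its final activation.
type Result struct {
	ID         string
	Activation float64
}

// spreadActivation propagates activation from the entry points for up to
// maxHops, keeping the highest activation reached per node.
func spreadActivation(graph map[string][]Edge, entry map[string]float64, maxHops int, hopDecay float64) []Result {
	best := make(map[string]float64, len(entry))
	frontier := make(map[string]float64, len(entry))
	for id, a := range entry {
		best[id] = a
		frontier[id] = a
	}
	for hop := 0; hop < maxHops && len(frontier) > 0; hop++ {
		next := map[string]float64{}
		for id, a := range frontier {
			for _, e := range graph[id] {
				act := a * e.Strength * hopDecay
				if act > best[e.Target] {
					best[e.Target] = act
					next[e.Target] = act // only improved nodes spread further
				}
			}
		}
		frontier = next
	}
	out := make([]Result, 0, len(best))
	for id, a := range best {
		out = append(out, Result{ID: id, Activation: a})
	}
	// Highest activation first; ties broken by ID for stable output.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Activation != out[j].Activation {
			return out[i].Activation > out[j].Activation
		}
		return out[i].ID < out[j].ID
	})
	return out
}

func main() {
	graph := map[string][]Edge{
		"a": {{Target: "b", Strength: 0.8}},
		"b": {{Target: "c", Strength: 0.5}},
		"c": {{Target: "d", Strength: 1.0}},
	}
	for _, r := range spreadActivation(graph, map[string]float64{"a": 1.0}, 2, 0.5) {
		fmt.Printf("%s %.2f\n", r.ID, r.Activation)
	}
}
```

With a hop limit of 2 the traversal reaches `b` and `c` from `a` but never `d`, which shows how the hop budget bounds the cost of a query regardless of graph size.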

Layer 5 — Episoding (Temporal Clustering)

Clusters raw memories into time-window episodes (default 10-minute windows). When a window closes:

Collect raw memories in the time range
LLM synthesizes a title, narrative, and emotional tone
Extract concepts, files modified, and event timeline
Create episode record linking to constituent memories
Emit EpisodeClosed event

The synthesis prompt is Claude-aware for AI-assisted development sessions: it recognizes coding patterns, debugging flows, and decision-making.
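
One plausible reading of the time-window clustering is a tumbling window: each raw memory falls into the fixed 10-minute bucket that contains its timestamp. The sketch below shows that interpretation only; the spec does not say whether windows are fixed or gap-based, and `RawMemory`, `Episode`, and `groupIntoEpisodes` are reduced, hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// RawMemory and Episode are reduced, hypothetical types.
type RawMemory struct {
	ID        string
	Timestamp time.Time
}

type Episode struct {
	Start, End time.Time
	MemoryIDs  []string
}

// groupIntoEpisodes buckets raw memories into fixed windows (tumbling
// windows, an assumed interpretation) and returns them in time order.
func groupIntoEpisodes(raws []RawMemory, window time.Duration) []Episode {
	sorted := append([]RawMemory(nil), raws...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Timestamp.Before(sorted[j].Timestamp)
	})
	var eps []Episode
	for _, r := range sorted {
		start := r.Timestamp.Truncate(window)
		if n := len(eps); n == 0 || !eps[n-1].Start.Equal(start) {
			eps = append(eps, Episode{Start: start, End: start.Add(window)})
		}
		last := &eps[len(eps)-1]
		last.MemoryIDs = append(last.MemoryIDs, r.ID)
	}
	return eps
}

// demo groups four memories around a 09:10 UTC window boundary.
func demo() []Episode {
	base := time.Date(2025, 1, 1, 9, 0, 0, 0, time.UTC)
	raws := []RawMemory{
		{ID: "r1", Timestamp: base.Add(1 * time.Minute)},
		{ID: "r2", Timestamp: base.Add(4 * time.Minute)},
		{ID: "r3", Timestamp: base.Add(12 * time.Minute)},
		{ID: "r4", Timestamp: base.Add(9*time.Minute + 59*time.Second)},
	}
	return groupIntoEpisodes(raws, 10*time.Minute)
}

func main() {
	for _, ep := range demo() {
		fmt.Println(ep.Start.Format("15:04"), "to", ep.End.Format("15:04"), ep.MemoryIDs)
	}
}
```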

Layer 6 — Meta-Cognition (Self-Reflection)

Runs periodically (default every 4 hours). Each run reviews the signals below and takes corrective action where needed:

Memory growth patterns (topic concentration)
Retrieval success/failure rate (via user feedback)
Association graph health (isolated clusters, density)
Consolidation effectiveness
Re-embed orphaned memories, trigger consolidation when needed
Log observations to meta_observations table

Layer 7 — Dreaming (Memory Replay)

Runs periodically (default every 1 hour). Replays and cross-pollinates memories:

Select batch of recent memories (default 60)
Strengthen associations between related memories across projects
Link memories to discovered patterns
Generate higher-order insights from memory clusters
Prune noise associations (below threshold)

Layer 8 — Abstraction (Hierarchical Knowledge)

Runs periodically (default every 2 hours). Builds hierarchical knowledge:

Level 1 — Patterns: Recurring themes extracted from memory clusters
Level 2 — Principles: Generalizations across patterns
Level 3 — Axioms: Fundamental truths with high confidence

Abstractions are grounded in evidence via verifyGrounding(). Young abstractions (< 7 days) have a confidence floor of 0.5 to prevent premature demotion. Frequently accessed abstractions (> 5 recalls) resist decay. Demotion is graduated: 0.9x for moderate evidence loss, 0.7x for significant, 0.5x for near-total (softened from 0.3x).
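
The demotion rules in the paragraph above can be sketched as follows. The multipliers (0.9, 0.7, 0.5), the 7-day confidence floor, and the more-than-5-recalls rule come from the text; the evidence-loss cut-offs (0.25, 0.5, 0.9) and the choice to model "resist decay" as skipping demotion entirely are assumptions made for this example.

```go
package main

import "fmt"

// Abstraction is a reduced, hypothetical view of an abstraction record.
type Abstraction struct {
	Confidence float64
	AgeDays    float64
	Recalls    int
}

// demotionFactor maps the fraction of supporting evidence that has been
// lost to a confidence multiplier. The cut-offs are hypothetical.
func demotionFactor(lost float64) float64 {
	switch {
	case lost >= 0.9:
		return 0.5 // near-total loss
	case lost >= 0.5:
		return 0.7 // significant loss
	case lost >= 0.25:
		return 0.9 // moderate loss
	default:
		return 1.0
	}
}

// demote returns the new confidence after a grounding check.
func demote(a Abstraction, lost float64) float64 {
	if a.Recalls > 5 {
		return a.Confidence // assumption: frequently recalled abstractions skip demotion
	}
	c := a.Confidence * demotionFactor(lost)
	if a.AgeDays < 7 && c < 0.5 {
		c = 0.5 // young abstractions keep a confidence floor
	}
	return c
}

func main() {
	old := Abstraction{Confidence: 0.8, AgeDays: 30}
	young := Abstraction{Confidence: 0.8, AgeDays: 3}
	fmt.Println(demote(old, 0.95), demote(young, 0.95))
}
```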

Layer 9 — Reactor (Event-Driven Rules)

Event-driven rule engine that fires condition-action chains in response to system events:

Trigger consolidation when DB grows too large
Kick off dreaming when an episode closes
Escalate health alerts when agents fail repeatedly
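
A condition-action rule engine of this kind can be very small. The sketch below is illustrative only: the `Rule` and `Event` shapes, the event type strings, the action names, and the thresholds (500 MB, 3 consecutive failures) are all hypothetical and chosen to mirror the three examples above.

```go
package main

import "fmt"

// Event is a hypothetical system event with numeric attributes.
type Event struct {
	Type string
	Data map[string]float64
}

// Rule fires Action when an event of type On satisfies Condition.
// A nil Condition means the rule fires on every matching event.
type Rule struct {
	Name      string
	On        string
	Condition func(Event) bool
	Action    string
}

// react returns the actions of every rule triggered by the event.
func react(rules []Rule, e Event) []string {
	var fired []string
	for _, r := range rules {
		if r.On == e.Type && (r.Condition == nil || r.Condition(e)) {
			fired = append(fired, r.Action)
		}
	}
	return fired
}

// defaultRules mirrors the three examples; thresholds are assumed values.
func defaultRules() []Rule {
	return []Rule{
		{
			Name:      "db-too-large",
			On:        "stats_updated",
			Condition: func(e Event) bool { return e.Data["db_size_mb"] > 500 },
			Action:    "run_consolidation",
		},
		{Name: "dream-after-episode", On: "episode_closed", Action: "run_dreaming"},
		{
			Name:      "agent-failing",
			On:        "agent_health_failed",
			Condition: func(e Event) bool { return e.Data["consecutive_failures"] >= 3 },
			Action:    "escalate_alert",
		},
	}
}

func main() {
	rules := defaultRules()
	fmt.Println(react(rules, Event{Type: "episode_closed"}))
	fmt.Println(react(rules, Event{Type: "stats_updated", Data: map[string]float64{"db_size_mb": 600}}))
}
```

Keeping rules as data rather than code paths means a new trigger can be added without touching the agents it ends up invoking.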

API Surface

HTTP REST (`http://localhost:9999/api/v1/`)

GET    /health                 System health check
GET    /stats                  Memory statistics

POST   /memories               Create raw memory (explicit user input)
GET    /memories                List memories (with filters)
GET    /memories/:id            Get specific memory
GET    /memories/:id/context    Get memory with surrounding context
GET    /raw/:id                 Get raw (unprocessed) memory

GET    /episodes                List episodes
GET    /episodes/:id            Get specific episode

POST   /query                   Query memories (spread activation + LLM synthesis)
POST   /feedback                Submit retrieval feedback

POST   /consolidation/run       Force consolidation cycle
POST   /ingest                  Bulk-ingest a directory

GET    /insights                Meta-cognition observations
GET    /patterns                Discovered patterns
GET    /abstractions            Hierarchical abstractions
GET    /projects                Project summaries

GET    /llm/usage               LLM token usage by agent
GET    /tool/usage              MCP tool call analytics

GET    /graph                   Association graph for D3.js visualization

GET    /agent/evolution          Agent SDK evolution state (conditional)
GET    /agent/changelog          Agent evolution changelog (conditional)
GET    /agent/sessions           Agent session history (conditional)
GET    /agent/config             Agent SDK configuration (conditional)

GET    /forum/categories         Forum categories with summaries
GET    /forum/threads            List threads (with limit/offset)
GET    /forum/threads/:id        Get thread posts
POST   /forum/posts              Create new post or thread
GET    /forum/posts/:id          Get a single post
PATCH  /forum/posts/:id          Update post state
POST   /forum/posts/:id/internalize  Absorb post into memory system

Optional bearer token authentication via Authorization: Bearer <token> header (configure with mnemonic generate-token).
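
As a usage illustration, a Go client might attach the bearer token like this. The base URL, the `/health` endpoint, and the header format come from this section; `newHealthRequest` is a hypothetical helper and `example-token` is a placeholder, not a real credential.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

const baseURL = "http://localhost:9999/api/v1"

// newHealthRequest builds a GET /health request and, when a token is
// configured, adds the Authorization: Bearer header.
func newHealthRequest(token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/health", nil)
	if err != nil {
		return nil, err
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	req, err := newHealthRequest("example-token") // placeholder token
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("daemon not reachable:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```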

WebSocket (`ws://localhost:9999/ws`)

Real-time event stream. Clients can filter by event type:

raw_memory_created — new perception event
memory_encoded — memory compressed and stored
consolidation_completed — sleep cycle finished
query_executed — retrieval performed

Web Dashboard (embedded in Go binary)

Served at http://localhost:9999/. Forum-style interface (phpBB-inspired) where cognitive agents are first-class participants:

Search — Query memories with spread activation, retrieval scores, and synthesis
Forum — Nested navigation (index > category > thread > post), agent @mentions with autocomplete, quote/reply, internalization (absorb posts into memory)
Timeline — Chronological view with date range filters and type/tag filtering
SDK — Agent evolution dashboard: principles, strategies, session timeline, chat
LLM — Per-agent token consumption, cost tracking, and usage charts
Tools — MCP tool usage analytics: call frequency, latency, error rates
Agent identity system — each cognitive agent has a distinct personality, avatar, and posting style
Live activity feed — agents post to the forum in real-time as they work
Memory source tags (hoverable, showing origin: filesystem, terminal, clipboard, MCP, consolidation)
5 themes: Midnight, Ember, Nord, Slate, Parchment (persists in localStorage)
Modular frontend: 12 ES modules + per-page CSS (no external CDN dependencies)

SQLite Schema

-- Raw observations before encoding
CREATE TABLE raw_memories (
    id TEXT PRIMARY KEY,
    timestamp DATETIME NOT NULL,
    source TEXT NOT NULL,        -- 'terminal', 'filesystem', 'clipboard', 'user'
    type TEXT,
    content TEXT NOT NULL,
    metadata JSON,
    heuristic_score REAL,
    initial_salience REAL DEFAULT 0.5,
    processed BOOLEAN DEFAULT FALSE,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Encoded memories (the real memory store)
CREATE TABLE memories (
    id TEXT PRIMARY KEY,
    raw_id TEXT REFERENCES raw_memories(id),
    timestamp DATETIME NOT NULL,
    content TEXT NOT NULL,       -- compressed/encoded form
    summary TEXT NOT NULL,       -- one-liner
    concepts JSON,               -- extracted concepts
    embedding BLOB,              -- float32 vector
    salience REAL DEFAULT 0.5,
    access_count INTEGER DEFAULT 0,
    last_accessed DATETIME,
    state TEXT DEFAULT 'active', -- active | fading | archived | merged
    gist_of JSON,               -- if merged: source memory IDs
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- FTS5 for full-text search (auto-synced via triggers)
CREATE VIRTUAL TABLE memories_fts USING fts5(
    id UNINDEXED, summary, content, concepts,
    content='memories', content_rowid='rowid'
);

-- Association graph
CREATE TABLE associations (
    source_id TEXT NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
    target_id TEXT NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
    strength REAL DEFAULT 0.5,
    relation_type TEXT DEFAULT 'similar',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    last_activated DATETIME,
    activation_count INTEGER DEFAULT 0,
    PRIMARY KEY (source_id, target_id)
);

-- Meta-cognition observations
CREATE TABLE meta_observations (
    id TEXT PRIMARY KEY,
    observation_type TEXT NOT NULL,
    severity TEXT DEFAULT 'info',
    details JSON,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Retrieval feedback for learning
CREATE TABLE retrieval_feedback (
    query_id TEXT PRIMARY KEY,
    query_text TEXT NOT NULL,
    retrieved_memory_ids JSON,
    feedback TEXT,              -- 'helpful', 'partial', 'irrelevant'
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Consolidation history
CREATE TABLE consolidation_history (
    id TEXT PRIMARY KEY,
    start_time DATETIME NOT NULL,
    end_time DATETIME NOT NULL,
    duration_ms INTEGER,
    memories_processed INTEGER,
    memories_decayed INTEGER,
    merged_clusters INTEGER,
    associations_pruned INTEGER,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Episodes (temporal clusters of raw events)
CREATE TABLE episodes (
    id TEXT PRIMARY KEY,
    title TEXT,
    start_time DATETIME NOT NULL,
    end_time DATETIME NOT NULL,
    summary TEXT,
    narrative TEXT,
    salience REAL DEFAULT 0.5,
    emotional_tone TEXT,
    state TEXT DEFAULT 'open',       -- open | closed
    concepts JSON DEFAULT '[]',
    files_modified JSON DEFAULT '[]',
    event_timeline JSON DEFAULT '[]',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Multi-resolution memory (gist / narrative / detail)
CREATE TABLE memory_resolutions (
    memory_id TEXT PRIMARY KEY REFERENCES memories(id) ON DELETE CASCADE,
    gist TEXT,
    narrative TEXT,
    detail_raw_ids JSON,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Structured concept extraction
CREATE TABLE concept_sets (
    memory_id TEXT PRIMARY KEY REFERENCES memories(id) ON DELETE CASCADE,
    topics JSON,
    entities JSON,
    actions JSON,
    causality JSON,
    significance TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Patterns discovered through consolidation/dreaming
CREATE TABLE patterns (
    id TEXT PRIMARY KEY,
    pattern_type TEXT NOT NULL,
    title TEXT NOT NULL,
    description TEXT NOT NULL,
    evidence_ids JSON DEFAULT '[]',
    strength REAL DEFAULT 0.5,
    project TEXT,
    concepts JSON DEFAULT '[]',
    embedding BLOB,
    state TEXT DEFAULT 'active',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Abstractions: hierarchical knowledge (level 1=pattern, 2=principle, 3=axiom)
CREATE TABLE abstractions (
    id TEXT PRIMARY KEY,
    level INTEGER DEFAULT 1,
    title TEXT NOT NULL,
    description TEXT NOT NULL,
    parent_id TEXT,
    source_pattern_ids JSON DEFAULT '[]',
    confidence REAL DEFAULT 0.5,
    concepts JSON DEFAULT '[]',
    embedding BLOB,
    state TEXT DEFAULT 'active',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- System metadata key-value store
CREATE TABLE system_meta (
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- LLM usage tracking (per-call instrumentation)
CREATE TABLE llm_usage (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    operation TEXT NOT NULL,
    caller TEXT NOT NULL DEFAULT '',
    model TEXT NOT NULL DEFAULT '',
    prompt_tokens INTEGER NOT NULL DEFAULT 0,
    completion_tokens INTEGER NOT NULL DEFAULT 0,
    total_tokens INTEGER NOT NULL DEFAULT 0,
    latency_ms INTEGER NOT NULL DEFAULT 0,
    success INTEGER NOT NULL DEFAULT 1
);

-- MCP tool usage tracking
CREATE TABLE tool_usage (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT NOT NULL,
    tool_name TEXT NOT NULL,
    session_id TEXT NOT NULL DEFAULT '',
    project TEXT NOT NULL DEFAULT '',
    latency_ms INTEGER NOT NULL DEFAULT 0,
    success INTEGER NOT NULL DEFAULT 1,
    error_message TEXT NOT NULL DEFAULT '',
    query_text TEXT NOT NULL DEFAULT '',
    result_count INTEGER NOT NULL DEFAULT 0,
    memory_type TEXT NOT NULL DEFAULT '',
    rating TEXT NOT NULL DEFAULT '',
    response_size INTEGER NOT NULL DEFAULT 0
);

-- Memory amendment audit trail
CREATE TABLE memory_amendments (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    memory_id TEXT NOT NULL,
    old_content TEXT NOT NULL,
    old_summary TEXT NOT NULL,
    new_content TEXT NOT NULL,
    new_summary TEXT NOT NULL,
    amended_at TEXT NOT NULL,
    source TEXT NOT NULL DEFAULT 'mcp'
);

-- Runtime watcher exclusions (managed via MCP tools)
CREATE TABLE runtime_exclusions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    pattern TEXT NOT NULL UNIQUE,
    source TEXT NOT NULL DEFAULT 'mcp',
    created_at TEXT NOT NULL
);

-- Forum communication (decoupled from core memory)
CREATE TABLE forum_categories (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    slug TEXT NOT NULL UNIQUE,
    description TEXT,
    icon TEXT,
    color TEXT,
    type TEXT DEFAULT 'custom',  -- system | custom | agent
    sort_order INTEGER DEFAULT 0
);

CREATE TABLE forum_posts (
    id TEXT PRIMARY KEY,
    parent_id TEXT,              -- self-reference for threaded replies
    thread_id TEXT,
    author_type TEXT,            -- human | agent
    author_name TEXT,
    author_key TEXT,
    content TEXT NOT NULL,
    mentions JSON,               -- @agent names
    memory_ids JSON,
    event_ref TEXT,
    category_id TEXT,
    pinned INTEGER DEFAULT 0,
    state TEXT DEFAULT 'active', -- active | archived | internalized
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Additional columns on memories table (added via migrations):
-- feedback_score INTEGER DEFAULT 0     — accumulated feedback (helpful=+1, irrelevant=-1)
-- recall_suppressed INTEGER DEFAULT 0  — auto-suppressed when feedback_score <= -3

Project Structure

mnemonic/
├── cmd/
│   ├── mnemonic/
│   │   ├── main.go                        # Daemon entry point + CLI
│   │   └── ingest.go                      # Bulk ingest subcommand
│   ├── benchmark/main.go                  # End-to-end benchmark
│   └── benchmark-quality/                 # Memory quality IR benchmark
├── internal/
│   ├── llm/
│   │   ├── provider.go                    # LLM interface
│   │   ├── lmstudio.go                    # LM Studio / OpenAI-compatible implementation
│   │   ├── embedded.go                    # Embedded LLM backend interface
│   │   ├── llamacpp/                      # Optional llama.cpp CGo backend (build-tagged)
│   │   ├── instrumented.go                # Usage-tracking wrapper (tokens, latency, caller)
│   │   └── pricing.go                     # Token cost estimation
│   ├── store/
│   │   ├── store.go                       # Store interface + domain types
│   │   └── sqlite/                        # SQLite implementation (FTS5, embeddings, episodes, patterns)
│   ├── events/
│   │   ├── bus.go                         # EventBus interface
│   │   ├── inmemory.go                    # In-memory implementation
│   │   └── types.go                       # Event type definitions
│   ├── watcher/
│   │   ├── watcher.go                     # Watcher interface
│   │   ├── filesystem/                    # FSEvents (macOS) + fsnotify (Linux)
│   │   ├── terminal/watcher.go            # Shell history polling
│   │   ├── clipboard/watcher.go           # Cross-platform clipboard
│   │   └── git/                           # Git repository watcher
│   ├── agent/
│   │   ├── agent.go                       # Agent interface
│   │   ├── perception/                    # Layer 1: Watch + heuristic filter
│   │   ├── encoding/                      # Layer 2: LLM compression + linking
│   │   ├── episoding/                     # Layer 5: Temporal episode clustering
│   │   ├── forum/                         # Agent personality system for forum communication
│   │   ├── consolidation/                 # Layer 3: Decay, merge, prune
│   │   ├── retrieval/                     # Layer 4: Spread activation + synthesis
│   │   ├── metacognition/                 # Layer 6: Self-reflection + audit
│   │   ├── dreaming/                      # Layer 7: Memory replay + cross-pollination
│   │   ├── abstraction/                   # Layer 8: Patterns → principles → axioms
│   │   ├── orchestrator/                  # Autonomous scheduler + health monitoring
│   │   └── reactor/                       # Event-driven rule engine
│   ├── api/
│   │   ├── server.go                      # HTTP + WebSocket server
│   │   └── routes/                        # REST endpoints (memories, query, graph, etc.)
│   ├── web/
│   │   ├── server.go                      # Static file serving (go:embed)
│   │   └── static/                        # Forum-style dashboard (modular ES modules + CSS)
│   ├── ingest/                            # Project ingestion engine
│   ├── mcp/server.go                      # MCP server (24 tools for Claude Code)
│   ├── backup/                            # Export/import logic
│   ├── daemon/                            # Service management (launchd, systemd, Windows Services)
│   ├── config/config.go                   # Configuration loading
│   └── logger/logger.go                   # Structured logging
├── sdk/                                   # Python agent SDK (self-evolving assistant)
│   ├── agent/                             # Agent implementation
│   ├── tests/                             # SDK tests
│   └── pyproject.toml
├── third_party/
│   └── llama.cpp/                         # llama.cpp submodule (for embedded LLM builds)
├── training/                              # Mnemonic-LM training infrastructure (Qwen spoke adapters)
├── migrations/                            # SQLite schema migrations
├── scripts/                               # Utility scripts
├── config.yaml
├── Makefile
└── go.mod

Build History

All original build phases are complete. Current focus is memory quality, retrieval intelligence, and training a bespoke local LLM (Mnemonic-LM).

Completed

Phase 1: Foundations (config, logging, LLM client, SQLite store)
Phase 2: Core memory loop (event bus, encoding, retrieval)
Phase 3: Perception (filesystem, terminal, clipboard watchers + heuristics)
Phase 4: API + Dashboard (REST API, WebSocket, D3.js graph, query tester)
Phase 5: Consolidation + Meta (decay, merge, prune, metacognition)
Phase 6: Polish (CLI, daemon, signal handling)
Bonus: Episoding, dreaming, abstraction agents; orchestrator; reactor; MCP server; Python agent SDK

Explicitly Deferred to v2+

Multi-modal memory (images, audio) — text-only for v1
Cross-device sync — single machine for v1
Advanced consolidation (hierarchical memory, schema learning)
Database encryption — air-gapped assumption covers v1
Native macOS menu bar widget — web dashboard covers v1, native UI later

Key Design Decisions

Embeddings in v1: Yes. Without them, retrieval is keyword-only and feels dumb. LM Studio serves embedding models with minimal overhead on M4.
Heuristic pre-filter before LLM: Multi-stage gate saves token budget. 80% of filesystem/terminal events are obvious noise.
Spread activation over pure vector search: Graph traversal captures meaning relationships (caused_by, contradicts). Embeddings are entry points, not the whole answer.
Budget-constrained consolidation: Max 100 memories and 5 merges per cycle. Prevents runaway LLM calls. Scales predictably.
Event bus architecture: Agents never call each other directly. Adding new observers or agents doesn't touch existing code.
API-first: The daemon is a service. CLI, dashboard, Claude Code, future tools — all talk to the same API. The memory system is infrastructure, not an app.

Verification Plan

After each phase, verify:

Phase 1: LLM client can chat + embed. Store can CRUD memories. FTS5 returns results.
Phase 2: POST /memories → encoding pipeline → POST /query returns relevant results with spread activation.
Phase 3: Start daemon, edit a file in a watched dir, see raw_memory appear. Run a terminal command, see it captured.
Phase 4: Open http://localhost:9999, see live dashboard. WebSocket streams events in real-time. Query tester works.
Phase 5: Run consolidation, verify salience decay. Let system run 24h, check meta-cognition observations.
