
# slmrs - SuperLocalMemory Rust

## Project Overview

slmrs is a direct port of SuperLocalMemory V3 from Python to Rust. The primary motivations are binary size (a single ~3MB binary vs Python + pip + venv + torch) and startup performance (instant vs a multi-second Python import chain). The port preserves the original architecture and algorithms faithfully, with Rust-specific optimizations where they naturally fit: startup time, search performance, and memory efficiency.

This is NOT a reimagining or rewrite. When porting upstream fixes or features, match the Python implementation closely. Deviate only when Rust's type system, ownership model, or performance characteristics make a different approach clearly better.

## Upstream Relationship

| Item | Value |
|---|---|
| Upstream repo | github.com/qualixar/superlocalmemory |
| Fork | github.com/ccustine/superlocalmemory |
| Local Python clone | ~/development/superlocalmemory |
| Base commit | cd70c89 (v3.2.3, 2026-03-30). Upstream HEAD is v3.3.19 (5ae80d7) as of 2026-04-03. |
| Base dir | ~/.slmrs (Python uses ~/.superlocalmemory) |
| DB file | ~/.slmrs/memory.db (schema-compatible with Python) |

To check for new upstream commits:

```shell
cd ~/development/superlocalmemory && git fetch upstream && git log --oneline 9d4eace..upstream/main
```

Sync status is tracked in the Red Hat HQ Obsidian vault at Projects/slmrs/slmrs Upstream Sync Status.

## Architecture

The Rust module structure mirrors the Python package layout. Every .rs file has doc comments mapping it to its Python source file(s), including function-level mappings. Constants that must stay in sync with Python are marked with /// SYNC: comments; grep for these when porting upstream changes.
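For illustration, the convention might look like the sketch below. The file name, constant name, value, and helper are all invented for this example, not taken from the actual codebase:

```rust
// Ports: retrieval/constants.py (illustrative mapping comment)

/// SYNC: must match DEFAULT_RECALL_LIMIT in Python retrieval/constants.py
pub const DEFAULT_RECALL_LIMIT: usize = 10;

/// Clamp a user-requested recall limit into a sane range.
fn clamp_limit(requested: usize) -> usize {
    requested.min(DEFAULT_RECALL_LIMIT * 10).max(1)
}

fn main() {
    println!("{} {}", clamp_limit(0), clamp_limit(500));
}
```

Because the markers are plain doc comments, `grep -rn "SYNC:" src/` is enough to enumerate everything that must be re-checked against a Python diff.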

### Module Map

| Rust module | Python source | Purpose |
|---|---|---|
| models | storage/models.py | Data types, enums |
| config | core/config.py + core/modes.py | Config, mode capabilities |
| storage | storage/database.py + storage/schema.py | SQLite with WAL, FTS5, CRUD |
| math | math/fisher.py, sheaf.py, langevin.py | Fisher-Rao, sheaf cohomology, Langevin dynamics. Hopfield network and TurboQuant not yet ported. |
| trust | trust/scorer.py, provenance.py | Bayesian trust, SHA-256 provenance chain |
| encoding | encoding/*.py | 11-step encoding pipeline, context generation, auto-linking, temporal validation, graph analysis, consolidation |
| retrieval | retrieval/*.py | 5-channel search + RRF fusion (semantic, BM25, entity graph, temporal, spreading activation) |
| hooks | hooks/*.py | Auto-invoke engine (multi-signal auto-recall) |
| learning | learning/adaptive.py, behavioral.py, outcomes.py | Adaptive weights, behavioral patterns |
| compliance | compliance/*.py | EU AI Act, GDPR, lifecycle, retention |
| llm | llm/backbone.py, core/embeddings.py | Multi-provider LLM client, embedding service |
| mcp | mcp/server.py, tools_*.py | MCP server (28 tools via rmcp) |
| engine | core/engine.py | Unified store/recall orchestrator |
| server | server/api.py, server/ws.py, infra/event_bus.py | Axum web dashboard, REST API, WebSocket, event bus |
| cli | cli/main.py, commands.py, json_output.py | CLI (26 subcommands, includes consolidate) |
| tui | - | Ratatui TUI (18 views, command palette, help overlay) |
| daemon | - | Persistent daemon (socket, service, MCP proxy) |
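The RRF fusion named in the retrieval row can be sketched generically. This is a textbook reciprocal-rank fusion over per-channel rankings, not slmrs's actual implementation; k = 60 is the conventional default, and the weighting slmrs applies per channel may differ:

```rust
use std::collections::HashMap;

/// Fuse several per-channel rankings of memory ids into one ranking.
/// Each item contributes 1 / (k + rank + 1) per channel it appears in.
fn rrf_fuse(channels: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in channels {
        for (rank, id) in ranking.iter().enumerate() {
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut out: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first.
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}

fn main() {
    let channels = vec![vec!["a", "b"], vec!["b", "a"], vec!["b", "c"]];
    // "b" ranks first in two channels, so it wins the fused ranking.
    println!("{:?}", rrf_fuse(&channels, 60.0));
}
```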

### Key Architecture Differences from Python

- Python's MemoryEngine (core/engine.py) is a monolithic orchestrator. Rust has a unified engine.rs orchestrator that both CLI and MCP share, with encoding/pipeline.rs (store path) and retrieval/engine.rs (recall path) as lower-level modules.
- Python uses a subprocess for embeddings (keeps main process < 60MB). Rust uses in-process ONNX via the ort crate for local embeddings, plus cloud and Ollama providers.
- Python uses FastMCP; Rust uses rmcp with #[tool] macros.
- Schemas are compatible - both Python and Rust can read the same SQLite tables (including v3.2 extension tables).
- Python uses sqlite-vec for vector indexing; Rust uses brute-force cosine scan (sufficient for target user base).
- Python's engine_wiring/store_pipeline/recall_pipeline refactor maps to the existing Rust engine.rs + encoding/pipeline.rs + retrieval/engine.rs split.
- Python v3.3+ has 6 retrieval channels (added Hopfield); Rust still has 5 (semantic, BM25, entity graph, temporal, spreading activation).
- Python v3.3+ uses TurboQuant for vector quantization; Rust has no quantization yet.
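The brute-force cosine scan mentioned above amounts to scoring every stored vector against the query and keeping the top hits. A minimal stdlib-only sketch, with function names invented for illustration:

```rust
/// Cosine similarity of two equal-length vectors; 0.0 for zero vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Scan every stored embedding and return the top-k (index, score) pairs.
fn top_k(query: &[f32], store: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = store
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}

fn main() {
    let store = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    println!("{:?}", top_k(&[1.0, 0.0], &store, 2));
}
```

At the memory counts a single local user accumulates, an O(n) scan per query stays well under interactive latency, which is why skipping an ANN index is a reasonable trade.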

## Building and Testing

```shell
cargo build              # Dev build
cargo build --release    # Release build
cargo test               # Run all 266 tests
cargo install --path .   # Install to ~/.cargo/bin/slmrs
slmrs --version          # Verify: shows version, git hash, build timestamp
```

The build.rs script embeds git hash + dirty flag + build timestamp into the binary for build identification.
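A build.rs of this shape typically shells out to git and exports the result via a `cargo:rustc-env` directive. This is a sketch of the pattern only; the env var name, fallback value, and helper are assumptions, not slmrs's actual code:

```rust
use std::process::Command;

/// Resolve a short git hash, falling back to "unknown" outside a repo.
fn git_hash() -> String {
    Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()
        .filter(|o| o.status.success())
        .and_then(|o| String::from_utf8(o.stdout).ok())
        .map(|s| s.trim().to_string())
        .unwrap_or_else(|| "unknown".to_string())
}

fn main() {
    // The crate can then read this at compile time with env!("SLMRS_GIT_HASH").
    // (Hypothetical variable name; the real one may differ.)
    println!("cargo:rustc-env=SLMRS_GIT_HASH={}", git_hash());
}
```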

## Spikes (Proof of Concept Binaries)

| Binary | Purpose |
|---|---|
| spike-rusqlite | Validates rusqlite + WAL + FTS5 + schema compatibility |
| spike-mcp | Validates rmcp + stdio MCP transport |
| spike-onnx | Validates ONNX model inference for embeddings |

## MCP Server

The MCP server runs via slmrs mcp (stdio transport). Plugin configuration for Claude Code is in the slmrs local plugin at ~/.claude-shared/plugins/local/slmrs/.

## TUI

The TUI launches when slmrs is run with no arguments in an interactive terminal. It connects to the daemon for live events and has direct SQLite access for reads/writes.

Key bindings: ? for help, / for command palette, Tab to toggle sidebar/main focus, q to quit.

## Daemon

The persistent daemon (slmrs daemon start) centralizes MCP server handling and event broadcasting.

```shell
slmrs daemon start       # Start in foreground
slmrs daemon stop        # Stop running daemon
slmrs daemon status      # Check status
slmrs daemon install     # Install as system service (launchd/systemd)
slmrs daemon uninstall   # Remove system service
```

MCP shims (slmrs mcp) auto-detect the daemon and forward requests to it, falling back to standalone mode if the daemon is unavailable.
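The detect-then-fall-back step can be sketched as a single connect attempt on the daemon's socket (Unix-only sketch; the socket path, enum, and function names here are hypothetical, not slmrs's actual API):

```rust
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Either a live connection to the daemon, or standalone operation.
enum Transport {
    Daemon(UnixStream),
    Standalone,
}

/// Try the daemon socket first; any connect error means standalone mode.
fn select_transport(sock: &Path) -> Transport {
    match UnixStream::connect(sock) {
        Ok(stream) => Transport::Daemon(stream),
        Err(_) => Transport::Standalone,
    }
}

fn main() {
    // A path that almost certainly has no listener behind it.
    let mode = select_transport(Path::new("/tmp/slmrs-no-such.sock"));
    println!("standalone = {}", matches!(mode, Transport::Standalone));
}
```

Keying the decision off an actual connect (rather than a mere file-exists check) also handles stale socket files left behind by a crashed daemon.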

## Operating Modes

| Mode | Description | LLM | Embeddings | EU AI Act |
|---|---|---|---|---|
| A | Local Guardian | None (rule-based) | Local ONNX | Compliant |
| B | Smart Local | Ollama (default: llama3.2) | Local ONNX | Compliant |
| C | Full Power | Cloud (OpenAI/Anthropic/Azure) | Cloud | Not compliant |

All three modes are functional. Modes B and C add LLM-based fact extraction, agentic multi-hop retrieval, and cross-encoder reranking.
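The capability columns of the table above map naturally onto a small enum with predicate methods. This is an illustrative sketch, not slmrs's actual config types:

```rust
/// Hypothetical mirror of the operating-mode table.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Mode {
    A, // Local Guardian
    B, // Smart Local
    C, // Full Power
}

impl Mode {
    /// Modes B and C have an LLM; Mode A is purely rule-based.
    fn has_llm(self) -> bool {
        self != Mode::A
    }
    /// Only Mode C sends data off the local machine.
    fn uses_cloud(self) -> bool {
        self == Mode::C
    }
    /// Per the table, compliance tracks locality.
    fn eu_ai_act_compliant(self) -> bool {
        !self.uses_cloud()
    }
}

fn main() {
    for m in [Mode::A, Mode::B, Mode::C] {
        println!("{:?}: llm={} cloud={}", m, m.has_llm(), m.uses_cloud());
    }
}
```

Centralizing capability checks like this keeps mode-dependent branches (fact extraction, reranking, etc.) out of individual call sites.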

## Porting Guidelines

When porting new features or fixes from upstream:

1. Check the upstream commit against the sync status doc
2. Read the Python diff and identify which Rust files are affected using the doc comment mappings
3. Search for /// SYNC: markers in the affected Rust files
4. Match Python's behavior exactly unless there's a clear Rust-idiomatic improvement
5. Run cargo test to verify
6. Update the sync status doc in Chris HQ vault at Projects/slmrs/slmrs Upstream Sync Status

### What NOT to port

- Python packaging (pip, pyproject.toml, package.json, postinstall scripts)
- Python subprocess worker infrastructure (embedding_worker.py, worker_pool.py)
- Windows CLI wrapper scripts (bin/slm, bin/slm.bat)
- Python dependency management changes

## Not Yet Ported

These Python modules have no Rust equivalent yet, organized by priority.

### High Priority (v3.3.x - impacts retrieval quality and schema compat)

- Math: Modern Hopfield network (math/hopfield.py - 6th retrieval channel, ~15-20% recall quality), TurboQuant vector quantization (math/turbo_quant.py - replaces PolarQuant, 2x storage savings)
- Retrieval: Fisher Bayesian variance update on recall (24pp benchmark drop without it), Hopfield channel (retrieval/hopfield_channel.py), forgetting filter, quantization-aware search (3-tier QAS for 90x speedup)
- Storage: v3.3 schema migration - 6 new tables: fact_retention, polar_embeddings, embedding_quantization_metadata, ccq_consolidated_blocks, ccq_audit_log, soft_prompt_templates
- Encoding: Verbatim content storage alongside extracted facts (v3.3.11), full CCQ 6-step cognitive consolidation pipeline with audit logging
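For reference when porting, one retrieval step of a modern Hopfield network (the 6th channel mentioned above) is softmax attention over the stored patterns: out = X^T softmax(beta * X q). This sketch is the generic formulation, not the upstream implementation; beta and the test vectors are illustrative:

```rust
/// One update step of a modern (continuous) Hopfield network.
/// `patterns` are the stored rows of X; higher `beta` sharpens retrieval.
fn hopfield_step(patterns: &[Vec<f64>], query: &[f64], beta: f64) -> Vec<f64> {
    // Similarity logits: beta * <pattern_i, query>.
    let logits: Vec<f64> = patterns
        .iter()
        .map(|p| beta * p.iter().zip(query).map(|(a, b)| a * b).sum::<f64>())
        .collect();
    // Numerically stable softmax over the logits.
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|l| (l - max).exp()).collect();
    let z: f64 = exps.iter().sum();
    // Weighted sum of patterns = retrieved (cleaned-up) memory.
    let mut out = vec![0.0; query.len()];
    for (w, p) in exps.iter().zip(patterns) {
        for (o, x) in out.iter_mut().zip(p) {
            *o += (w / z) * x;
        }
    }
    out
}

fn main() {
    let patterns = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    // A noisy query near the first pattern is pulled toward it.
    println!("{:?}", hopfield_step(&patterns, &[0.9, 0.1], 4.0));
}
```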

### Medium Priority (quality and performance)

- Retrieval: Entity graph in-memory adjacency cache (31s -> 500ms), ANN index, profile_channel
- Math: Ebbinghaus-Langevin coupling (dynamics/ebbinghaus_langevin_coupling.py), Ebbinghaus forgetting curves with lifecycle zones
- Encoding: observation_builder, scene_builder, foresight
- Core: maintenance scheduler (background Langevin/Ebbinghaus/sheaf), summarizer, embedding migration detection
- Trust: trust gates (gate.py), burst detection (signals.py)
- Compliance: hash-chain audit, ABAC, scheduler
- Hooks: auto_capture, rules engine, claude_code_hooks, ide_connector
- Learning: Soft prompt parameterization (parameterization/*.py), LightGBM Phase 3 training (crate broken on macOS, gated behind ml feature flag). All 12 other learning modules ported.

### Low Priority (infrastructure, UI, nice-to-have)

- Server: dashboard frontend UI (backend REST routes are done, including v3_api)
- Infra: caching, webhooks, rate limiter
- Attribution: mathematical DNA, signing, watermarking
- CLI: setup wizard, audit
- Core: AgentRegistry

## Code Style

- The codebase is a faithful port, not a ground-up Rust redesign. It uses Rust idioms (enums with derive macros, iterators, Option/Result) but preserves Python's architecture and naming
- Models are in a single models.rs file mirroring Python's storage/models.py
- Each encoding/retrieval/compliance submodule is a standalone file with its own types and logic
- Tests live in #[cfg(test)] mod tests blocks within each source file
- Temporary/debug files go in tmp/ (gitignored)
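A minimal example of the in-file test convention (the function here is invented purely to illustrate the layout):

```rust
/// Hypothetical helper: trim and lowercase an identifier.
fn normalize(s: &str) -> String {
    s.trim().to_lowercase()
}

fn main() {
    println!("{}", normalize("  Foo "));
}

// Unit tests sit next to the code they cover, compiled only under `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn normalizes_whitespace_and_case() {
        assert_eq!(normalize("  Foo "), "foo");
    }
}
```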