Versioned, graph-based persistent memory for AI coding agents — powered by Sphinx-Needs.
AI agents lose context between sessions. This protocol gives them a structured way to remember, recall, and evolve knowledge — with full Git history, typed entries, graph links, and machine-readable output.
- Typed memories — observations, decisions, facts, preferences, risks, goals, open questions
- Graph links — `relates`, `supports`, `depends`, `supersedes`, `contradicts`, `example_of`
- Tag-based discovery — `topic:api`, `repo:backend`, `tier:core`
- Context-optimized output — brief / compact / context / JSON formats with body toggling
- Stale detection — auto-expire, review reminders, staleness checks
- Auto-scaling — RST files split at 50 entries, transparent to queries
- Git-native — every memory is an RST directive, fully diffable and versioned
- MCP server — expose memory as tools for Claude Desktop, VS Code Copilot, and other MCP clients
- Autonomous capture — extract memories from Git commits, CI logs, and discussion transcripts
- Planning engine — analyze memory graph and propose maintenance actions
- CLI-first — 16+ subcommands for full lifecycle management
```bash
git clone https://github.com/bburda/ai_memory_protocol.git
pipx install -e ai_memory_protocol/

# With MCP server support
pipx install -e 'ai_memory_protocol/[mcp]'
```

This installs the `memory` CLI command (and optionally `memory-mcp-stdio`) globally on PATH.
```bash
# 1. Create a memory workspace
memory init .memories --name "My Project" --install

# 2. Add your first memory
memory add fact "API runs on port 8080" \
    --tags "topic:api,repo:backend" \
    --confidence high \
    --body "Gateway listens on 0.0.0.0:8080 by default" \
    --rebuild

# 3. Search
memory recall api port
memory recall --tag topic:api --format brief

# 4. Get full details
memory get FACT_api_runs_on_port_8080
```

```
RST files (memory/*.rst)              ← Human + AI editable, Git-tracked
        │
        ▼  memory rebuild (sphinx-build)
needs.json (_build/html/needs.json)   ← Machine-readable index
        │
        ▼  memory recall / get / list
Formatted output                      ← Optimized for LLM context windows
```
Memories are stored as Sphinx-Needs directives in RST files. The `memory rebuild` command runs Sphinx to produce `needs.json` — the single query layer for all search operations. This means memories are simultaneously human-readable documentation and machine-queryable data.
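Because `needs.json` is the only query layer, the same workspace can be read either as formatted text or as JSON for programmatic use. A minimal round trip, reusing the tag from the quick start above:

```bash
# Refresh the index after editing RST files by hand
memory rebuild

# Human-oriented view: one-line summaries
memory recall --tag topic:api --format brief

# Machine-readable view of the same memories
memory recall --tag topic:api --format json
```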
```bash
memory init <dir>                           # Create a new workspace
memory add <type> "<title>" [options]       # Record a memory
memory recall [query] [--tag ...] [--format brief|compact|context|json]
memory get <ID>                             # Full details of one memory
memory related <ID> [--hops N]              # Graph walk from a memory
memory list [--type TYPE] [--status S]      # Browse all memories
memory update <ID> [--confidence ...] [--add-tags ...]
memory deprecate <ID> [--by NEW_ID]         # Mark as deprecated
memory tags [--prefix PREFIX]               # Discover tags in use
memory stale                                # Find expired/overdue memories
memory review                               # Show memories needing review
memory rebuild                              # Rebuild needs.json
memory capture git                          # Extract memories from recent commits
memory capture ci --input <file|->          # Extract memories from CI/test logs
memory capture discussion --input <file|->  # Extract from conversation transcripts
memory plan [--auto-apply]                  # Analyze graph and propose maintenance
memory apply <plan.json>                    # Execute a generated plan
memory doctor                               # Verify installation health
```

Key flags for `recall`:

- `--format brief` — ultra-compact, minimal tokens
- `--body` — include body text (off by default)
- `--sort newest|oldest|confidence|updated`
- `--limit N` — cap results
- `--expand 0` — disable graph expansion
- `--stale` — only expired/review-overdue
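A typical low-token exploration pass combines several of these flags; the tag value here is illustrative:

```bash
# Newest entries for a topic, titles only, no graph expansion
memory recall --tag topic:api --format brief --limit 10 --expand 0 --sort newest
```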
Expose memory tools to LLM clients via the Model Context Protocol.
Install with MCP extras:
```bash
pipx install -e 'ai_memory_protocol/[mcp]'
```

Register the server with the `claude` CLI:

```bash
claude mcp add --transport stdio --env MEMORY_DIR=/path/to/.memories memory -- memory-mcp-stdio
```

Or add to `.mcp.json` in your project root (project scope):
```json
{
  "mcpServers": {
    "memory": {
      "type": "stdio",
      "command": "memory-mcp-stdio",
      "env": {
        "MEMORY_DIR": "/path/to/.memories"
      }
    }
  }
}
```

Add to `.vscode/mcp.json`:
```json
{
  "servers": {
    "memory": {
      "command": "memory-mcp-stdio",
      "env": {
        "MEMORY_DIR": "${workspaceFolder}/.memories"
      }
    }
  }
}
```

| Tool | Description |
|---|---|
| `memory_recall` | Search memories by text/tags with formatting options |
| `memory_get` | Get full details of a specific memory |
| `memory_add` | Record a new memory with tags and metadata |
| `memory_update` | Update metadata (status, confidence, tags, etc.) |
| `memory_deprecate` | Mark a memory as deprecated |
| `memory_tags` | List all tags with counts |
| `memory_stale` | Find expired/overdue memories |
| `memory_rebuild` | Rebuild needs.json index |
| `memory_capture_git` | Extract memories from recent Git commits |
| `memory_capture_ci` | Extract memories from CI/test log output |
| `memory_capture_discussion` | Extract memories from conversation transcripts |
| `memory_plan` | Analyze memory graph and propose maintenance actions |
| `memory_apply` | Execute a generated maintenance plan |
| Type | Prefix | Use Case |
|---|---|---|
| `mem` | `MEM_` | Observation, note, or finding |
| `dec` | `DEC_` | Design or architectural decision |
| `fact` | `FACT_` | Verified, stable knowledge |
| `pref` | `PREF_` | Coding style or convention |
| `risk` | `RISK_` | Uncertainty or assumption |
| `goal` | `GOAL_` | Objective or target |
| `q` | `Q_` | Open question needing resolution |
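As a sketch of how the type argument maps to ID prefixes, recording a decision and a risk might look like this (tag values are illustrative; IDs are derived from the prefix plus a slug of the title, as in the quick-start example):

```bash
# Architectural decision → ID prefixed with DEC_
memory add dec "Use tl::expected over exceptions" \
    --tags "topic:error-handling,repo:backend" \
    --confidence high \
    --rebuild

# Known risk → ID prefixed with RISK_
memory add risk "JWT secret hardcoded in tests" \
    --tags "topic:auth,repo:backend" \
    --confidence medium
```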
| Link | Meaning |
|---|---|
| `relates` | General association |
| `supports` | Evidence or justification |
| `depends` | Hard dependency |
| `supersedes` | Replaces older memory |
| `contradicts` | Conflict or tension |
| `example_of` | Concrete instance of a concept |
| Field | Values | Purpose |
|---|---|---|
| `confidence` | low / medium / high | Trust level |
| `scope` | global, repo:X, product:X | Applicability |
| `tags` | prefix:value format | Categorization |
| `source` | URL, commit, description | Provenance |
| `review_after` | ISO date | Staleness trigger |
| `expires_at` | ISO date | Auto-expire date |
| `created_at` | ISO date | Capture timestamp |
Tags use the `prefix:value` format for consistent discovery:

- `topic:` — Subject area (`topic:gateway`, `topic:auth`)
- `repo:` — Repository (`repo:backend`, `repo:web-ui`)
- `domain:` — Knowledge domain (`domain:robotics`, `domain:web`)
- `tier:` — Importance level (`tier:core`, `tier:detail`)
- `intent:` — Purpose (`intent:decision`, `intent:coding-style`)
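Before filtering, it can help to check which values already exist under a prefix. A small sketch (whether `--prefix` expects the bare prefix or the trailing colon is an assumption here):

```bash
# Discover existing topic tags
memory tags --prefix topic

# Narrow a recall to one of them
memory recall --tag topic:auth --format brief
```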
Always use a two-phase approach. Never go straight to body text on broad queries.
Phase A — Peek (scan titles, zero body text):

```bash
memory recall --tag topic:gateway --format brief --expand 0
```

Returns `[ID] Title (confidence)` one-liners. Minimal tokens. Do this FIRST.

Phase B — Drill (read full body of specific memories):

```bash
memory get DEC_handler_context_pattern
```

Only after peeking — pick the 2-3 most relevant IDs and get them individually.
When to recall — recall is NOT just a session-start ritual. Recall at each of these moments:
| Trigger | What to recall |
|---|---|
| Session start | `recall --format brief --limit 20 --sort newest` |
| New task or topic | `recall --tag topic:<X> --format brief` |
| Entering unfamiliar code | `recall --tag repo:<X> --type fact --format brief` |
| Before a design decision | `recall --tag topic:<X> --type dec` |
| Encountering an error or failure | `recall <error message keywords>` — FIRST reaction before debugging; check if this problem was already solved |
| Stuck after initial attempts | `recall --tag topic:<X> --type mem,fact` — broaden search to related areas and past solutions |
| Before implementing a pattern | `recall --tag intent:coding-style --type pref` |
Recording memories is NOT optional. Write at these concrete moments:
| Trigger | Type | Example |
|---|---|---|
| Chose approach A over B | `dec` | "Use tl::expected over exceptions" |
| Fixed a non-obvious bug | `mem` | "EntityCache race condition fix" |
| Discovered undocumented API | `fact` | "Routes match in registration order" |
| User stated a preference | `pref` | "Prefer Zustand over Redux" |
| Identified a risk | `risk` | "JWT secret hardcoded in tests" |
| Question remains unanswered | `q` | "Should synthetic components expose operations?" |
End-of-task writes: summarize architecture learned (`fact`), record conventions (`pref`), note anything a future agent needs (`mem`), capture unfinished goals (`goal`).
Write quality rules:

- `--tags` is mandatory — without tags, the memory is unfindable
- `--body` must be self-contained with file paths and concrete details
- Use the `--rebuild` flag to make new memories immediately searchable
When knowledge changes, add a new entry with `--supersedes OLD_ID` and deprecate the old one.
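Assuming `--supersedes` is accepted by `memory add` as described, the hand-off might look like this (the IDs are illustrative, following the quick-start naming):

```bash
# Record the replacement and link it back to the outdated entry
memory add fact "API now runs on port 9090" \
    --tags "topic:api,repo:backend" \
    --confidence high \
    --supersedes FACT_api_runs_on_port_8080 \
    --rebuild

# Mark the old entry as deprecated, pointing at its replacement
memory deprecate FACT_api_runs_on_port_8080 --by FACT_api_now_runs_on_port_9090
```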
Run `memory stale` at the start of long sessions to keep the graph accurate.
- `recall` omits body by default — this is intentional, not a limitation
- Peek with `--format brief` → drill with `get <ID>` — this is the core pattern
- Use `--limit 10` and `--expand 0` when exploring broad topics
- Use `--tag` filters to narrow results instead of free-text
- Use `memory tags` to discover available tag prefixes before filtering
```
ai_memory_protocol/
├── pyproject.toml              # Package definition, CLI + MCP entry points
├── README.md
├── LICENSE                     # Apache 2.0
├── CONTRIBUTING.md
├── .pre-commit-config.yaml
├── .github/workflows/ci.yml
└── src/
    └── ai_memory_protocol/
        ├── __init__.py
        ├── cli.py              # CLI (argparse, 16+ subcommands)
        ├── mcp_server.py       # MCP server (13 tools, stdio transport)
        ├── capture.py          # Knowledge extraction (git, CI, discussion)
        ├── planner.py          # Graph analysis and maintenance planning
        ├── executor.py         # Plan execution engine
        ├── config.py           # Type definitions, constants
        ├── engine.py           # Workspace detection, search, graph walk
        ├── formatter.py        # Output formatting (brief/compact/context/json)
        ├── rst.py              # RST generation, editing, file splitting
        └── scaffold.py         # Workspace scaffolding (init command)
```
Memory data lives in a separate workspace (e.g., `.memories/`), created with `memory init`.
The protocol supports a fully autonomous memory lifecycle — agents can capture, plan, and maintain knowledge without human intervention:
```
capture (git / CI / discussion)
        │
        ▼
plan (analyze graph → propose actions)
        │
        ▼
apply (execute plan → add/update/deprecate)
        │
        ▼
rebuild (sphinx-build → needs.json)
        │
        ▼
recall (search updated graph)
```
Capture sources:

- `memory capture git` — scans recent commits, extracts decisions, bug fixes, refactors
- `memory capture ci --input <log>` — parses test failures, compiler errors, deprecation warnings
- `memory capture discussion --input <transcript>` — classifies conversation into decisions, facts, preferences, risks, goals, questions
Planning engine:

- `memory plan` — analyzes the memory graph for staleness, missing links, and contradictions, and proposes maintenance actions
- `memory plan --auto-apply` — execute the plan immediately after analysis
- `memory apply plan.json` — execute a previously saved plan
All captured candidates include provenance (`--source`) and are deduplicated against existing memories.
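Using only the commands above, one autonomous maintenance pass might be scripted like this (the CI log filename is illustrative):

```bash
# Harvest candidates from recent commits and a CI log
memory capture git
memory capture ci --input build.log

# Analyze the graph and apply the proposed actions in one step
memory plan --auto-apply

# Refresh the index and confirm the updated graph is searchable
memory rebuild
memory recall --format brief --limit 20 --sort newest
```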
The Sphinx build acts as a quality gate for the memory graph. `needs_warnings` in `conf.py` define constraints that fire during `memory rebuild`:
```python
needs_warnings = {
    "missing_topic_tag": "type in ['mem','dec','fact',...] and not any(t.startswith('topic:') for t in tags)",
    "empty_body": "description == '' or description == 'TODO: Add description.'",
    "deprecated_without_supersede": "status == 'deprecated' and len(supersedes_back) == 0",
}
```

With `sphinx-build -W` (warnings as errors), the build fails if any memory violates these constraints. This means:
- Every memory must have at least one `topic:` tag
- No empty placeholders survive to the index
- Deprecated memories must be superseded by a replacement
Agents learn to self-correct: if rebuild fails, they read the warning, fix the offending memory, and retry.
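A minimal sketch of that self-correction loop, assuming the `missing_topic_tag` warning above fired for one entry (the ID and tag are illustrative):

```bash
# Rebuild fails: a memory has no topic: tag
memory rebuild

# Fix the offending memory, then retry
memory update MEM_entitycache_race_condition_fix --add-tags "topic:caching"
memory rebuild
```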
Humans are observers and editors, not gatekeepers:
- Dashboards — `memory/dashboards.rst` contains `needtable`, `needlist`, and `needflow` directives rendering the live state of the memory graph as HTML
- RST editing — memories are plain RST, editable in any text editor or IDE with full diff/blame in Git
- Override — humans can update status, confidence, or tags on any memory via CLI or direct RST edit
- Review — `memory review` surfaces memories whose `review_after` date has passed, prompting human validation
The protocol is designed so that agents maintain knowledge autonomously while humans retain full visibility and override capability.
See CONTRIBUTING.md for guidelines on how to contribute.