A cognitive database for AI agents — vector memory, a typed knowledge graph, and hybrid retrieval in one substrate. Cognitive operations (encode, recall, plan, reason, forget) are the primitives; entities, statements, and relations are the secondary layer that activates when a schema is declared.
Status: Specification complete — 32 sections covering both the substrate (§00–§16) and the knowledge layer (§17–§31). Implementation is phased: substrate phases 0–12 shipped; phases 13–14 (benchmarks + substrate acceptance) close the substrate; phases 15–24 deliver the knowledge layer. The v1.0.0 tag lands at the end of Phase 24, after the combined acceptance suite passes.
```rust
// 1. Substrate-only: encode an experience, recall what's relevant.
brain.encode("Had a difficult conversation with Alex about the project").await?;
let memories = brain.recall("conflicts with Alex").top_k(5).await?;
// → ranked by semantic similarity, edge proximity, temporal recency,
//   and salience, not just vector distance.

// 2. Declare a schema; the knowledge layer comes alive.
brain.schema_upload(include_str!("acme.brain")).await?;
brain.encode("Priya kicked off the billing rewrite project today").await?;
// → background extractors create: entity Priya, entity Project(billing rewrite),
//   statement Event(Priya, kicked_off, billing rewrite), relation Priya→Project.

// 3. Hybrid query: semantic + lexical + graph retrieval, RRF-fused.
let results = brain.query()
    .text("what's Priya working on?")
    .filter_subject_entity("priya")
    .top_k(10)
    .await?;
```

- What Brain is
- Why this exists
- The two layers
- The cognitive primitives
- The knowledge layer
- End-to-end: what `brain.encode("…")` actually does
- Architecture
- The seven invariants
- Latency targets
- Tech stack
- Implementation status
- Development environment
- Repository layout
- License
Brain is to AI agents what SQL is to applications: a substrate where the application says what it wants cognitively and the substrate handles how.
Today, building memory for an LLM agent typically looks like: glue a vector database for similarity, a graph database for relationships, a full-text store for keyword matching, an LLM extraction pipeline, an embedding service, a key-value store for state, a queue for async events, and write a thousand lines of orchestration to keep them in sync. Half of that orchestration is reinventing transaction semantics across systems that don't agree on what "committed" means.
Brain collapses that stack into one substrate:
- A memory is the atomic record — text, embedding, salience, time, edges, provenance — stored once and queried many ways.
- A schema (optional) declares the entity types, predicates, and relation types the user cares about. The substrate runs three tiers of extractors over incoming memories and surfaces typed entities, statements (Fact / Preference / Event), and relations.
- Recall isn't `top-k by cosine`: it's similarity blended with recency, salience, edge proximity, and typed filters, with a hybrid retriever that fuses semantic + lexical + graph ranks via RRF.
Three observations drove the design:
- Agents need cognitive operations, not storage operations. "Remember this," "what's relevant?", "plan a path," "why do we think X?" map awkwardly onto INSERT / SELECT / JOIN. They map cleanly onto verbs that already carry the semantics: encode, recall, plan, reason, forget.
- The data has structure, but it's latent. An agent's stream of observations contains people, projects, decisions, and preferences, but a vector database flattens that to similarity scores. A property graph captures the structure but doesn't help you find anything by meaning. You need both, with a query layer that can join them.
- Predictable tail latency is non-negotiable. Agents call the memory system on every turn. A p99 spike at 200 ms turns into perceived agent latency of seconds. Brain is built thread-per-core (Glommio + `io_uring`), with WAL group-commit, lock-free reads, and a single-writer-per-shard discipline, chosen for tail-latency reasons, not throughput.
Brain ships in two layers that share one shard, one storage system, and one wire protocol:
```text
                      ┌─────────────────────────────────────────────┐
                      │  KNOWLEDGE LAYER (activates on schema)      │
                      │                                             │
Layer 3: STATEMENTS   │  Facts, Preferences, Events                 │
                      │  (typed claims, with provenance + conf.)    │
                      │        ▲                                    │
                      │        │ derived from                       │
Layer 2: ENTITIES +   │        │                                    │
         RELATIONS    │  Canonical nouns, typed edges               │
                      │        ▲                                    │
                      │        │ references / anchored to           │
                      └────────│────────────────────────────────────┘
                      ┌────────│────────────────────────────────────┐
Layer 1: MEMORIES     │        │                                    │
(substrate)           │  Raw episodic / semantic / consolidated     │
                      │  memories, embedded, indexed in HNSW        │
                      │                                             │
                      │  SUBSTRATE (always active)                  │
                      └─────────────────────────────────────────────┘
```
- Substrate-only mode (no schema declared): Brain runs as a pure vector memory store. The wire surface is the 8 cognitive primitives. Knowledge-layer tables exist on disk but are empty; knowledge-layer workers don't run. This is a first-class deployment posture, not a legacy mode.
- Schema-declared mode: a `SCHEMA_UPLOAD` activates the knowledge layer. Background extractors (pattern → classifier → LLM) run over incoming memories. Entities, statements, and relations populate the knowledge-layer tables and indexes. `RECALL` transparently routes through the hybrid retriever; the typed knowledge SDK becomes useful.
A deployment can move in either direction. Declaring a schema after months of substrate-only use kicks off a backfill. Stopping schema operations returns the substrate to vector-only mode without losing knowledge data.
Eight verbs at the substrate level (spec §09):
| Verb | What it does |
|---|---|
| ENCODE | Store an experience. Brain embeds the text, picks a slot in the arena, writes the WAL record, updates redb metadata, inserts into HNSW, and (if a schema is declared) queues extractors. |
| RECALL | Find memories relevant to a cue. Ranks by similarity blended with salience, recency, and edge proximity. With a schema declared, routes through the hybrid retriever. |
| PLAN | Construct a path from one cognitive state to another. Pull-based executor with budgets (steps, wall time, branches). |
| REASON | Multi-hop traversal explaining why X is connected to Y. Returns the path, the evidence memories, and the confidence. |
| FORGET | Soft (mark as forgotten; grace period) or hard (zero the slot immediately) tombstoning. Cascades to derived knowledge-layer records when a schema is active. |
| LINK / UNLINK | Manually assert / retract a typed edge between two memories. |
| SUBSCRIBE | Stream events: memory created, statement created, extractor failed, schema updated, etc. |
```rust
use brain_sdk::{Client, EdgeKind, ForgetMode, PlanState, PlanBudget};
use futures::StreamExt; // assumption: the streams below implement `futures::Stream`

let brain = Client::connect("127.0.0.1:7860", "mind-agent-1").await?;

// 1. Encode a memory; capture an edge to the previous turn.
let memory = brain
    .encode("Alex pushed the deadline to next Friday")
    .edge(EdgeKind::FollowedBy, prev_turn_id)
    .await?;
println!("stored as {}, salience {:.2}", memory.id, memory.salience);

// 2. Later: recall everything related to deadlines and Alex.
let mut results = brain.recall("when did Alex change the deadline?")
    .top_k(5)
    .include_edges(true)
    .stream()
    .await?;
while let Some(r) = results.next().await {
    println!("{:.2} {}", r.confidence, r.text);
}

// 3. Plan from "current sprint" to "shipping the feature":
let mut plan = brain.plan(
        PlanState::ByText("current sprint state".into()),
        PlanState::ByText("feature shipped".into()),
    )
    .budget(PlanBudget { max_steps: 8, max_wall_time_ms: 1_000, max_branches_explored: 50 })
    .stream()
    .await?;
while let Some(step) = plan.next().await {
    println!("step {}: {}", step.step_index, step.text);
}

// 4. Soft-forget when something becomes private:
brain.forget(memory.id, ForgetMode::Soft).await?;
```

When a schema is declared, the substrate exposes typed cognition. Three layers, ten decisions, one set of guarantees.
The decisions (spec §17–§31):
- Property graph, not RDF. Operational knowledge graphs converged on property graphs; RDF reification is too expensive for fact-with-metadata workloads.
- Three statement kinds, shared storage. Fact / Preference / Event are distinct in the API (different mutation rules) but one table with a `kind` discriminator. Cross-kind queries work; per-kind queries are fast.
- Three retrievers + three filters + RRF fusion. Semantic (HNSW) + Lexical (tantivy BM25) + Graph (entity-joined) for retrieval; Type + Temporal + Confidence for filtering; Reciprocal Rank Fusion (`k=60`) for combining ranks.
- Three extractor tiers. Pattern (regex, free) → Classifier (small bundled model, cheap) → LLM (expensive, capable). Tiered fallback keeps cost manageable.
- Declarative schema DSL. Users declare entity types, relation types, predicates, and extractor bindings. The substrate picks the right extractor tier per declaration. Schema is versioned; data migrates.
- Valid time only (v1). `valid_from` / `valid_to` on statements and relations. Transaction-time "as of" queries are deferred.
- Confidence is first-class. Every derived record has confidence in `[0,1]` and an evidence list. Contradictions surface, don't auto-resolve.
- Schema is optional. No-schema runs as a pure vector substrate (real deployment posture, not legacy).
- Single node, single schema namespace. No federation, no multi-tenant schemas: one Brain, one knowledge graph.
- No silent failures. Schema-validated outputs only; ambiguous resolutions queue for review; contradictions surface; FORGET cascades are audited.
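The rank-fusion decision above can be illustrated with a minimal sketch of Reciprocal Rank Fusion. Only `k = 60` comes from the text; the `rrf_fuse` name, the `u64` ids, and the tie-break by id are assumptions made for the example, not Brain's actual API.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over several ranked id lists (best match first).
/// Each list contributes 1 / (k + rank) per id, with rank starting at 1.
/// `rrf_fuse` is a hypothetical helper, not part of the Brain SDK.
pub fn rrf_fuse(ranked_lists: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for list in ranked_lists {
        for (i, id) in list.iter().enumerate() {
            let rank = (i + 1) as f64;
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

An id that appears near the top of several lists outranks an id that tops only one, which is why fusion needs ranks rather than comparable raw scores from each retriever.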
The wire surface (spec §28) adds ~30 opcodes on top of the substrate's primitives:
| Range | Group | Examples |
|---|---|---|
| `0x20–0x2F` | Schema | SCHEMA_UPLOAD, SCHEMA_GET, EXTRACTOR_LIST, EXTRACTOR_DISABLE |
| `0x30–0x3F` | Entities | ENTITY_CREATE, ENTITY_RESOLVE, ENTITY_MERGE, ENTITY_RENAME |
| `0x40–0x4F` | Statements | STATEMENT_CREATE, STATEMENT_SUPERSEDE, STATEMENT_HISTORY |
| `0x50–0x5F` | Relations | RELATION_CREATE, RELATION_TRAVERSE (1–3 hop), RELATION_LIST_FROM |
| `0x60–0x6F` | Query | QUERY (hybrid), QUERY_EXPLAIN, QUERY_TRACE, RECALL_HYBRID |
| `0x70–0x7F` | Admin | ADMIN_BACKFILL, ADMIN_LIST_PENDING_RESOLUTIONS, ADMIN_LIST_STALE_STATEMENTS |
End-to-end (docs/development/usage/practical-guide.md walks through this in full):
```text
ENCODE "Priya kicked off the billing rewrite today."
   │
   ▼  Substrate writes memory + embedding + WAL + HNSW.
   │
   ├─→ Pattern extractor (sync, regex):
   │      "Priya"           → entity_resolver: tier 1 exact hit on existing entity_42
   │      "billing rewrite" → tier 2 fuzzy → create new entity_71 (Project)
   │
   ├─→ Classifier extractor (NER, near-foreground): confirms Person + Project tags.
   │
   ├─→ LLM extractor (async worker, cached): produces Event(Priya, kicked_off,
   │      billing rewrite, event_at=today, confidence=0.91, evidence=[memory_id]).
   │
   ▼
Statement and Relation written; tantivy + statement HNSW indexed.
Audit entries written. Subscribers notified.

LATER: QUERY "What's Priya working on right now?"
   │
   ▼  Query router classifies: known entity "Priya" → graph lane; known
   │  predicate-shape "working on" → type filter (Event/Fact).
   │
   ├─→ GraphRetriever: statements where subject=entity_42, status=current
   ├─→ SemanticRetriever (HNSW): statements similar to query text
   └─→ LexicalRetriever (tantivy BM25): statements with matching tokens
   │
   ▼  RRF fusion (k=60), confidence filter ≥ 0.5, temporal filter (last 30 days)
   │
   ▼
Result: top-N statements with provenance trail back to source memories.
```
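The last stage of the query flow (confidence filter, temporal filter, top-N) might look roughly like the sketch below. The `ScoredStatement` struct, its field names, and the use of Unix seconds are assumptions for illustration; only the `≥ 0.5` confidence threshold and the 30-day window come from the walkthrough.

```rust
/// One fused candidate. The shape is hypothetical, not Brain's wire type.
#[derive(Debug, Clone, PartialEq)]
pub struct ScoredStatement {
    pub id: u64,
    pub rrf_score: f64,
    pub confidence: f32,
    pub event_at_secs: u64, // Unix seconds (assumed representation)
}

/// Drop low-confidence and out-of-window candidates, then keep the best `top_n`.
pub fn apply_filters(
    mut fused: Vec<ScoredStatement>,
    now_secs: u64,
    min_confidence: f32,
    window_days: u64,
    top_n: usize,
) -> Vec<ScoredStatement> {
    let cutoff = now_secs.saturating_sub(window_days * 86_400);
    fused.retain(|s| s.confidence >= min_confidence && s.event_at_secs >= cutoff);
    // Highest fused score first.
    fused.sort_by(|a, b| b.rrf_score.total_cmp(&a.rrf_score));
    fused.truncate(top_n);
    fused
}
```

Filtering after fusion, as sketched here, is one possible ordering; pushing the filters down into each retriever would be an equally plausible design.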
```mermaid
sequenceDiagram
    autonumber
    participant App as Agent app
    participant SDK as brain-sdk
    participant Conn as Connection layer<br/>(Tokio)
    participant Shard as Shard executor<br/>(Glommio)
    participant Embed as Embedding<br/>(BGE-small via candle)
    participant WAL as WAL<br/>(O_DIRECT + RWF_DSYNC)
    participant Arena as Arena<br/>(mmap'd 1600-byte slots)
    participant Meta as Metadata<br/>(redb)
    participant Index as ANN index<br/>(HNSW in RAM)
    participant Ext as Extractor pipeline<br/>(if schema declared)
    App->>SDK: encode("Alex said the deadline...")
    SDK->>SDK: build ENCODE_REQ frame<br/>(rkyv payload + 32B header + CRC)
    SDK->>Conn: TCP write (single frame, EOS)
    Conn->>Conn: validate frame, lookup shard<br/>by agent then shard_id
    Conn->>Shard: enqueue(EncodeRequest)
    Shard->>Embed: embed(text)
    Embed-->>Shard: f32 vector dim 384, L2-normalized
    Shard->>Shard: idempotency, dedup on RequestId
    Shard->>WAL: append(EncodeMemory record)
    WAL->>WAL: pwritev2(RWF_DSYNC)<br/>group commit
    WAL-->>Shard: Lsn (durable)
    Shard->>Arena: write slot(version, vector, metadata)<br/>plus slot CRC
    Shard->>Meta: redb txn,<br/>memory plus edges plus idempotency
    Meta-->>Shard: txn committed
    Shard->>Index: insert(memory_id, vector)
    Shard-->>Conn: ENCODE_RESP(memory_id, salience, was_dedup)
    Conn-->>SDK: TCP write
    SDK-->>App: Memory { id, salience, ... }
    Note over Shard,Ext: After response: pattern extractors run sync;<br/>classifier + LLM run as background workers.
    Shard-)Ext: dispatch (if schema declared)
```
The order matters. The WAL append and fsync (steps 9–11 in the diagram) are what make the operation durable; the response is not sent until that fsync returns. The arena, metadata, and index writes (steps 12–15) happen after the fsync but before the response; if any of them fail, the WAL record is the source of truth and recovery replays it. This is the "WAL-before-acknowledge" invariant. Knowledge-layer extraction runs after the response is sent, so substrate-write latency is unaffected.
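As a toy model of that ordering, the sketch below appends to a log first, applies to derived state second, and only then returns. `Shard`, `EncodeRecord`, and the in-memory `Vec` / `HashMap` stand-ins are hypothetical simplifications: there is no real fsync, arena, redb, or HNSW here.

```rust
use std::collections::HashMap;

/// A logged ENCODE. Field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct EncodeRecord {
    pub lsn: u64,
    pub memory_id: u64,
    pub text: String,
}

#[derive(Default)]
pub struct Shard {
    pub wal: Vec<EncodeRecord>,      // stand-in for the durable log
    pub store: HashMap<u64, String>, // stand-in for arena + metadata + index
}

impl Shard {
    /// Log first, apply second, acknowledge (return the LSN) last.
    pub fn encode(&mut self, memory_id: u64, text: &str) -> u64 {
        let lsn = self.wal.len() as u64 + 1;
        let rec = EncodeRecord { lsn, memory_id, text: text.to_string() };
        self.wal.push(rec.clone()); // 1. durable record (fsync in the real system)
        self.apply(&rec);           // 2. derived stores
        lsn                         // 3. acknowledge
    }

    pub fn apply(&mut self, rec: &EncodeRecord) {
        self.store.insert(rec.memory_id, rec.text.clone());
    }

    /// Rebuild derived state purely from the log, as recovery would after a crash.
    pub fn recover(wal: Vec<EncodeRecord>) -> Shard {
        let mut shard = Shard::default();
        for rec in &wal {
            shard.apply(rec);
        }
        shard.wal = wal;
        shard
    }
}
```

Because every acknowledged write is in the log before anything else is touched, `recover` can reconstruct the derived stores even if the process died between steps 1 and 2.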
```mermaid
flowchart TB
    SDK1[Agent A]
    SDK2[Agent B]
    SDK3[Agent C]
    SDK1 -->|TCP+TLS| Conn
    SDK2 -->|TCP+TLS| Conn
    SDK3 -->|TCP+TLS| Conn
    subgraph Server[Brain server · Linux]
        Conn[Connection layer<br/>Tokio · accept · TLS · frame validate · dispatch]
        subgraph S0[Shard 0 · Glommio · thread-per-core]
            S0Arena[Arena<br/>mmap]
            S0Wal[WAL]
            S0Meta[redb]
            S0Hnsw[HNSW · memory]
            S0Ehnsw[HNSW · entity]
            S0Shnsw[HNSW · statement]
            S0Tantivy[tantivy · memory + statement]
        end
        subgraph S1[Shard 1 · Glommio]
            S1All[Same per-shard set]
        end
        subgraph SN[Shard N · Glommio]
            SNAll[Same per-shard set]
        end
        Conn -->|message channel| S0
        Conn -->|message channel| S1
        Conn -->|message channel| SN
        Workers[Background workers<br/>decay · consolidation · HNSW maint · GC<br/>+ extractors · sweepers · migrations]
        Workers -.-> S0
        Workers -.-> S1
        Workers -.-> SN
    end
```
Two runtimes, one host:
- Connection layer runs on Tokio. Many lightweight tasks: accept TCP, terminate TLS, decode the 32-byte frame header, validate, dispatch to the right shard via a bounded message channel. Tokio is great here — many tasks, varied shapes, async I/O.
- Shard layer runs on Glommio. Thread-per-core, `io_uring`, single-task-per-shard for the writer. Each shard owns its files, its indexes, its caches. No cross-shard locks. Reads use `ArcSwap` + `crossbeam-epoch` for lock-free hot paths.
The two layers communicate via channels carrying messages (plain Send structs). Per-shard data never crosses the boundary; the connection task hands off an EncodeRequest, the shard hands back an EncodeResponse.
Sharding is by agent (AgentId → ShardId), so every operation for an agent goes to one shard. That makes the shard's discipline easy to reason about: one writer, no locks needed, no cross-shard coordination on the hot path.
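A rough sketch of agent-based routing over bounded channels follows. The text does not say how `AgentId → ShardId` is computed, so the hash-modulo mapping is an assumption, and `std::sync::mpsc::sync_channel` stands in for the real Tokio-to-Glommio channel. `Router`, `shard_for`, and the `EncodeRequest` fields are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Plain `Send` message handed from the connection layer to a shard.
#[derive(Debug)]
pub struct EncodeRequest {
    pub agent_id: String,
    pub text: String,
}

/// Assumed mapping: hash the agent id, take it modulo the shard count.
/// `DefaultHasher` is not stable across Rust releases, so a real system
/// would need a fixed hash function.
pub fn shard_for(agent_id: &str, shard_count: usize) -> usize {
    assert!(shard_count > 0, "at least one shard is required");
    let mut hasher = DefaultHasher::new();
    agent_id.hash(&mut hasher);
    (hasher.finish() % shard_count as u64) as usize
}

pub struct Router {
    senders: Vec<SyncSender<EncodeRequest>>,
}

impl Router {
    /// One bounded queue per shard; the receivers go to the shard executors.
    pub fn new(shard_count: usize, capacity: usize) -> (Router, Vec<Receiver<EncodeRequest>>) {
        let mut senders = Vec::with_capacity(shard_count);
        let mut receivers = Vec::with_capacity(shard_count);
        for _ in 0..shard_count {
            let (tx, rx) = sync_channel(capacity);
            senders.push(tx);
            receivers.push(rx);
        }
        (Router { senders }, receivers)
    }

    /// Non-blocking hand-off; a full queue surfaces as an error (backpressure).
    pub fn dispatch(&self, req: EncodeRequest) -> Result<usize, TrySendError<EncodeRequest>> {
        let shard = shard_for(&req.agent_id, self.senders.len());
        self.senders[shard].try_send(req)?;
        Ok(shard)
    }
}
```

Every request for the same agent lands on the same queue, so the shard's single writer sees that agent's operations in arrival order without any cross-shard coordination.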
Background workers keep the substrate healthy: salience decay over time, consolidation (multiple similar memories → one summary), HNSW link maintenance, slot reclamation past the tombstone grace window, idempotency-table TTL expiry, and so on. When a schema is declared, additional workers run: pattern + classifier + LLM extractors, the statement-embedding worker, the supersession sweeper, the backfill worker, the schema-migration runner. Workers run as their own Glommio tasks, scheduled around the per-shard writer with priorities (spec §11 + §27).
```mermaid
flowchart LR
    Op[Op handler<br/>encode · recall · forget · query · ...]
    subgraph storage[Storage layer]
        WAL[WAL<br/>O_DIRECT + pwritev2<br/>+ RWF_DSYNC group commit]
        Arena[Arena<br/>mmap'd 1600 B slots<br/>per-slot CRC32C + version]
    end
    subgraph metadata[Metadata]
        Redb[redb B-tree<br/>memory · edges · context<br/>idempotency · tombstones<br/>+ entities · statements · relations<br/>+ predicates · audits]
        LLMCache[LLM cache redb<br/>extractor responses · TTL]
    end
    subgraph index[Indexes]
        Hnsw[HNSW · memory<br/>M=16 · ef=200/100<br/>cosine · ArcSwap-published]
        Ehnsw[HNSW · entity<br/>M=16 · ef=100/64]
        Shnsw[HNSW · statement<br/>M=32 · ef=200/128]
        Tant[tantivy · memory_text<br/>tantivy · statement_text<br/>BM25]
    end
    Op -->|1. fsync first| WAL
    WAL -->|2. then slot| Arena
    Op -->|3. metadata commit| Redb
    Op -->|4. index update| Hnsw
    Op -.->|ANN| Hnsw
    Op -.->|graph| Redb
    Op -.->|lexical| Tant
    Op -.->|entity res| Ehnsw
    Op -.->|stmt sem| Shnsw
    Op -.->|LLM extract| LLMCache
```
Six data structures per shard, each pinned to a spec section:
- Arena (`spec/05/02`): memory-mapped file of 1600-byte slots. Each slot is 1536 bytes of `f32` vector (384 × 4) plus 64 bytes of metadata (kind, salience, timestamps, slot version, CRC). The file has a 4 KiB header recording the shard UUID, format version, slot count, embedding-model fingerprint, and a header CRC. Vectors are little-endian on disk.
- WAL (`spec/05/04..08`): append-only log of operations, one segment per ~64 MiB. Writes use `O_DIRECT` for predictable latency and `pwritev2(RWF_DSYNC)` for durable fsync; the WAL also batches concurrent writes via group commit so N pending records share one fsync. Recovery replays from the last checkpoint forward, tolerating a torn tail (the last record may be partial; we stop there). The knowledge layer adds frame types `0x10..0x50` for entity / statement / relation / schema / audit records (spec §26).
- redb (`spec/07` + `spec/26`): embedded B-tree for metadata. Substrate tables: text bodies, edge lists, context names, idempotency dedupe, tombstones. Knowledge-layer tables: entities (+ aliases, trigrams, mentions), statements (+ chain, indexes by subject/predicate/object/event-time), relations (+ direction indexes), predicates, entity types, relation types, extractors, schema versions, audits, merge log. ACID transactions wrap multi-table writes from a single op.
- HNSW (memory) (`spec/06`): Hierarchical Navigable Small World index for ANN search over memory vectors. `M=16`, `ef_construction=200`, cosine distance over L2-normalized 384-dim vectors. Held in RAM; persisted incrementally; published to readers via `ArcSwap`.
- HNSW (entity + statement) (spec §26): a smaller HNSW for entity embeddings (used by the entity resolver) and a separate HNSW for statement embeddings (used by the semantic retriever to find statements similar to a query).
- tantivy (spec §26): two per-shard BM25 indexes, one over memory text and one over statement text representations. Backs the lexical retriever in hybrid queries.
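The arena arithmetic above can be made concrete with a small sketch. The sizes (4 KiB header, 1600-byte slots, 384 little-endian `f32` values, 64 bytes of metadata) are from the text; placing slots immediately after the header and the vector before the metadata inside each slot are assumptions, as are the function names.

```rust
pub const HEADER_BYTES: u64 = 4096; // 4 KiB file header
pub const SLOT_BYTES: u64 = 1600;   // one memory per slot
pub const DIM: usize = 384;
pub const VECTOR_BYTES: usize = DIM * 4; // 1536 bytes of f32
pub const META_BYTES: usize = 64;

/// Byte offset of a slot in the arena file.
/// Assumes slots start right after the header with no padding.
pub fn slot_offset(slot_index: u64) -> u64 {
    HEADER_BYTES + slot_index * SLOT_BYTES
}

/// Decode the little-endian vector from one slot.
/// Assumes the vector occupies the first 1536 bytes of the slot.
pub fn decode_vector(slot: &[u8]) -> Option<Vec<f32>> {
    if slot.len() != SLOT_BYTES as usize {
        return None;
    }
    Some(
        slot[..VECTOR_BYTES]
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Fixed-size slots make the address of any memory a single multiply-add, which is what lets the arena be read directly through the mmap.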
The discipline that makes this work without per-record locking:
- Single writer per shard. Only one task mutates shard state. Other Glommio tasks may read via `ArcSwap`-published snapshots (HNSW) or via the mmap (arena, redb). The writer never blocks on a lock because there's nobody to lock against.
- WAL-before-ack. Step 1 in the diagram: the WAL append + fsync happens before any other store is touched. If the process dies at step 2, the WAL record on restart drives the rest.
- CRC on every slot, every record. Two checksums: arena slot CRC32C catches in-flight slot corruption; WAL record CRC32C catches log corruption. Both halt the shard on mismatch; never overwrite a stored CRC.
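For readers unfamiliar with CRC32C, here is a self-contained bitwise version of the Castagnoli checksum together with a fail-stop check. It is an illustration only: a production build would normally use a hardware-accelerated implementation, and `verify_slot` and `CrcMismatch` are hypothetical names.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
/// Slow but dependency-free; shown only to make the check concrete.
pub fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

#[derive(Debug, PartialEq, Eq)]
pub struct CrcMismatch {
    pub stored: u32,
    pub computed: u32,
}

/// Fail-stop check: report the mismatch to the caller, never "repair" the stored CRC.
pub fn verify_slot(payload: &[u8], stored_crc: u32) -> Result<(), CrcMismatch> {
    let computed = crc32c(payload);
    if computed == stored_crc {
        Ok(())
    } else {
        Err(CrcMismatch { stored: stored_crc, computed })
    }
}
```

Returning an error rather than recomputing the checksum mirrors the rule above: a mismatch is evidence of corruption, so the caller halts instead of papering over it.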
Brain ships a custom binary protocol over TCP (with optional TLS). The 32-byte frame header is fixed; the payload is rkyv-encoded structured data plus optional raw f32 vector blobs.
```text
Frame header (32 bytes, big-endian)
+-----------+-----------+-----------+-----------+
|                magic = "BRN0"                 |
+-----------+-----------+-----------+-----------+
|  ver(1)   |   op(1)   |       flags (2)       |
+-----------+-----------+-----------+-----------+
|               header_crc32c (4)               |
+-----------+-----------+-----------+-----------+
|                 stream_id (4)                 |
+-----------+-----------+-----------+-----------+
|          payload_len (3)          |reserved(1)|
+-----------+-----------+-----------+-----------+
|              payload_crc32c (4)               |
+-----------+-----------+-----------+-----------+
|                 reserved (8)                  |
+-----------+-----------+-----------+-----------+

Payload layout
+----------------------------------+
| rkyv-encoded body                |  <-- request/response/event struct
+----------------------------------+
| padding (0–3 bytes)              |  <-- align next section to 4 bytes
+----------------------------------+
| raw f32 vectors (N × 1536 bytes) |  <-- optional; bytemuck::cast_slice<u8, f32>
+----------------------------------+
```
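The header layout can be packed and unpacked with plain byte slicing, as in the sketch below. Field order, widths, and big-endian encoding follow the diagram; the `FrameHeader` struct and its method names are assumptions, and the sketch deliberately does not compute or verify either CRC because the text does not say which bytes `header_crc32c` covers.

```rust
pub const MAGIC: [u8; 4] = *b"BRN0";
pub const HEADER_LEN: usize = 32;
pub const MAX_PAYLOAD_LEN: u32 = 0x00FF_FFFF; // payload_len is a 3-byte field

/// In-memory view of the 32-byte header (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrameHeader {
    pub version: u8,
    pub opcode: u8,
    pub flags: u16,
    pub header_crc32c: u32,
    pub stream_id: u32,
    pub payload_len: u32,
    pub payload_crc32c: u32,
}

impl FrameHeader {
    /// Serialize to the wire layout; `None` if the length does not fit in 24 bits.
    pub fn to_bytes(&self) -> Option<[u8; HEADER_LEN]> {
        if self.payload_len > MAX_PAYLOAD_LEN {
            return None;
        }
        let mut b = [0u8; HEADER_LEN];
        b[0..4].copy_from_slice(&MAGIC);
        b[4] = self.version;
        b[5] = self.opcode;
        b[6..8].copy_from_slice(&self.flags.to_be_bytes());
        b[8..12].copy_from_slice(&self.header_crc32c.to_be_bytes());
        b[12..16].copy_from_slice(&self.stream_id.to_be_bytes());
        b[16..19].copy_from_slice(&self.payload_len.to_be_bytes()[1..4]);
        // b[19] is reserved and stays zero.
        b[20..24].copy_from_slice(&self.payload_crc32c.to_be_bytes());
        // b[24..32] is reserved and stays zero.
        Some(b)
    }

    /// Parse the wire layout; `None` on a bad magic. CRCs are not checked here.
    pub fn from_bytes(b: &[u8; HEADER_LEN]) -> Option<FrameHeader> {
        if b[..4] != MAGIC[..] {
            return None;
        }
        Some(FrameHeader {
            version: b[4],
            opcode: b[5],
            flags: u16::from_be_bytes([b[6], b[7]]),
            header_crc32c: u32::from_be_bytes([b[8], b[9], b[10], b[11]]),
            stream_id: u32::from_be_bytes([b[12], b[13], b[14], b[15]]),
            payload_len: u32::from_be_bytes([0, b[16], b[17], b[18]]),
            payload_crc32c: u32::from_be_bytes([b[20], b[21], b[22], b[23]]),
        })
    }
}
```

A 3-byte length caps a single payload just under 16 MiB, so larger transfers would have to be split across frames.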
The opcode space is laid out by group (spec §03/05 + §28):
```text
0x00–0x0F   reserved
0x10–0x1F   substrate primitives (ENCODE, RECALL, PLAN, REASON, FORGET, LINK, UNLINK, TXN_*, SUBSCRIBE, …)
0x20–0x2F   schema operations
0x30–0x3F   entity operations
0x40–0x4F   statement operations
0x50–0x5F   relation operations
0x60–0x6F   query operations (hybrid retrieval, EXPLAIN, TRACE)
0x70–0x7F   admin operations (backfill, audit, pending resolutions)
```
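Because each group occupies one 16-opcode range, the group can be read off the high nibble of the opcode. The sketch below shows that mapping; the `OpGroup` enum and `op_group` function are hypothetical, and treating `0x80` and above as `Unassigned` is an assumption since the listing stops at `0x7F`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OpGroup {
    Reserved,
    Substrate,
    Schema,
    Entity,
    Statement,
    Relation,
    Query,
    Admin,
    Unassigned, // 0x80..=0xFF: not described in the listing (assumed unassigned)
}

/// Classify an opcode by its high nibble.
pub fn op_group(opcode: u8) -> OpGroup {
    match opcode >> 4 {
        0x0 => OpGroup::Reserved,
        0x1 => OpGroup::Substrate,
        0x2 => OpGroup::Schema,
        0x3 => OpGroup::Entity,
        0x4 => OpGroup::Statement,
        0x5 => OpGroup::Relation,
        0x6 => OpGroup::Query,
        0x7 => OpGroup::Admin,
        _ => OpGroup::Unassigned,
    }
}
```

A dispatcher could use this to reject knowledge-layer opcodes early when no schema is declared, before any payload decoding.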
Validation is layered (spec/03/11): frame-level (magic, version, CRC, length), payload-level (rkyv structural validation, vector norm checks), and operation-level (per-opcode field constraints).
Errors come back as a typed ERROR frame with a category (Protocol, Authentication, Validation, NotFound, Conflict, ResourceExhausted, Internal, Unavailable) and a stable code drawn from the §10 error table plus the knowledge-layer codes in §28. The category drives the SDK's retry policy.
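One way a client could turn the category into a retry decision is sketched below. The eight category names are from the text; the `RetryPolicy` enum and, in particular, which categories are treated as retryable are assumptions for illustration, not the SDK's documented behavior.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorCategory {
    Protocol,
    Authentication,
    Validation,
    NotFound,
    Conflict,
    ResourceExhausted,
    Internal,
    Unavailable,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RetryPolicy {
    /// Retrying the same request cannot succeed; surface the error.
    Never,
    /// Transient condition; retry with backoff.
    Backoff,
}

/// Assumed mapping: only capacity and availability problems are retried.
pub fn retry_policy(category: ErrorCategory) -> RetryPolicy {
    match category {
        ErrorCategory::ResourceExhausted | ErrorCategory::Unavailable => RetryPolicy::Backoff,
        ErrorCategory::Protocol
        | ErrorCategory::Authentication
        | ErrorCategory::Validation
        | ErrorCategory::NotFound
        | ErrorCategory::Conflict
        | ErrorCategory::Internal => RetryPolicy::Never,
    }
}
```

Writing the match exhaustively means adding a new category forces a compile-time decision about its retry behavior.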
Non-negotiable rules. Code that violates them is wrong, regardless of test results.
| # | Invariant | What it prevents |
|---|---|---|
| 1 | WAL-before-acknowledge. No operation returns success until its WAL record is fsynced. | Lost writes after a crash. |
| 2 | Single writer per shard. No locks needed; the discipline enforces it. | Lock contention on the hot path; two-writer races. |
| 3 | CRC everywhere. Every WAL record, every arena slot. Reads verify; mismatches halt. | Silent corruption from bad disk / memory / cosmic ray. |
| 4 | Slot version on MemoryId. Encoded in the ID; stale references → NotFound. | Reading the wrong memory after slot reuse. |
| 5 | Idempotency by RequestId. 24h TTL. Same params → cached response. Different params → Conflict. | Duplicate effects on retry. |
| 6 | Tombstone grace before reclamation. Default 7 days. Hard FORGET zeroes immediately. | Surprise: data still recoverable when soft-forgotten / data lingers when hard-forgotten. |
| 7 | No silent corruption. Fail-stop and alert. Never return wrong data. | Trusting outputs that may be wrong; quietly papering over bit rot. |
Tested per spec/16/06_durability_criteria.md. The random-kill recovery test exercises 1, 2, 3, 5, and 7 directly; the GC tests cover 4 and 6.
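Invariant 5 can be illustrated with a small in-memory table. The 24-hour TTL and the three outcomes (first use, cached response, conflict) come from the invariant; the `IdempotencyTable` type, the `u128` request id, the parameter hash, and the explicit `now_secs` argument are assumptions chosen to keep the example deterministic.

```rust
use std::collections::HashMap;

pub const TTL_SECS: u64 = 24 * 60 * 60; // 24h

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Unseen (or expired) request id: execute the operation.
    Fresh,
    /// Same id, same params: replay the stored response.
    Cached(String),
    /// Same id, different params: reject.
    Conflict,
}

struct Entry {
    params_hash: u64,
    response: String,
    stored_at_secs: u64,
}

#[derive(Default)]
pub struct IdempotencyTable {
    entries: HashMap<u128, Entry>,
}

impl IdempotencyTable {
    pub fn check(&self, request_id: u128, params_hash: u64, now_secs: u64) -> Outcome {
        match self.entries.get(&request_id) {
            Some(e) if now_secs.saturating_sub(e.stored_at_secs) < TTL_SECS => {
                if e.params_hash == params_hash {
                    Outcome::Cached(e.response.clone())
                } else {
                    Outcome::Conflict
                }
            }
            _ => Outcome::Fresh, // never seen, or past the TTL
        }
    }

    /// Remember the response once the operation has completed.
    pub fn record(&mut self, request_id: u128, params_hash: u64, response: String, now_secs: u64) {
        self.entries.insert(
            request_id,
            Entry { params_hash, response, stored_at_secs: now_secs },
        );
    }
}
```

A retried request therefore gets the original answer instead of a second write, while reusing an id with different parameters is surfaced as an error rather than silently accepted.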
Hard targets from spec/16/02_latency_targets.md. Single-shard, 1M memories, mixed workload, 100 concurrent clients, reference hardware (16-core x86_64 / 64 GB RAM / NVMe SSD):
| Operation | p50 | p95 | p99 | p99.9 |
|---|---|---|---|---|
| ENCODE | 8 ms | 15 ms | 25 ms | 50 ms |
| RECALL (K=10, no text) | 5 ms | 12 ms | 20 ms | 40 ms |
| RECALL (K=10, with text) | 7 ms | 18 ms | 30 ms | 60 ms |
| PLAN (depth 3) | 4 ms | 10 ms | 18 ms | 35 ms |
| REASON (depth 3) | 8 ms | 20 ms | 35 ms | 70 ms |
| FORGET | 3 ms | 8 ms | 15 ms | 30 ms |
| LINK / UNLINK | 2 ms | 5 ms | 10 ms | 20 ms |
These are MUST targets for v1.0. Brain optimizes for predictable tails, not minimum averages — a p50 of 5 ms with a p99 of 20 ms is preferred over a p50 of 2 ms with a p99 of 80 ms.
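To make the percentile columns concrete, the sketch below computes a nearest-rank percentile over latency samples. It is not the project's benchmark harness; the function name, the microsecond unit, and the per-mille argument (so that p99.9 stays an integer) are all assumptions.

```rust
/// Nearest-rank percentile over latency samples in microseconds.
/// `per_mille` is the percentile times ten: 500 = p50, 990 = p99, 999 = p99.9.
pub fn percentile_us(samples_us: &[u64], per_mille: u32) -> Option<u64> {
    if samples_us.is_empty() || per_mille == 0 || per_mille > 1000 {
        return None;
    }
    let mut sorted = samples_us.to_vec();
    sorted.sort_unstable();
    let n = sorted.len() as u64;
    // rank = ceil(per_mille / 1000 * n), computed in integers to avoid float rounding.
    let rank = (per_mille as u64 * n + 999) / 1000;
    Some(sorted[(rank - 1) as usize])
}
```

Comparing `percentile_us(&samples, 500)` with `percentile_us(&samples, 990)` on the same run shows the spread between the typical and the tail case, which is the quantity the targets above constrain.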
| Component | Crate | Why |
|---|---|---|
| Async runtime (shards) | `glommio` | Thread-per-core, `io_uring`, no work-stealing; predictable per-core latency. |
| Async runtime (connection layer) | `tokio` | Many tasks, varied shapes, mature ecosystem. |
| Wire encoding | `rkyv` + `bytemuck` | Zero-copy structured deserialization (`rkyv`) + raw vector slice access (`bytemuck::cast_slice`). |
| Metadata store | `redb` | Pure-Rust ACID B-tree; embeddable; no external services. |
| ANN index | `hnsw_rs` | HNSW; battle-tested parameters (M=16, ef=200/100 for memory; tuned variants for entity / statement). |
| Lexical index | `tantivy` | Pure-Rust BM25 + tokenizer; backs the lexical retriever (phase 22). |
| Embedding | `candle` family + `tokenizers` | Pure-Rust inference; BGE-small-en-v1.5; substrate-owned. |
| SIMD math | `matrixmultiply` + `wide` | Cosine distance kernel; portable AVX2 / NEON fallbacks. |
| Lock-free swap | `arc-swap` | Cross-shard read snapshots without locking. |
| Epoch GC | `crossbeam-epoch` | Safe memory reclamation for lock-free reads. |
| CRC | `crc32c` | iSCSI Castagnoli polynomial; hardware-accelerated. |
| UUIDs | `uuid` (v7) | Time-ordered IDs; agents/contexts/requests/transactions, plus entities/statements/relations. |
| Errors | `thiserror` (libs) + `anyhow` (bins) | Typed errors at boundaries, ergonomic at the top. |
| Telemetry | `tracing` + `opentelemetry` | Spans, structured fields, OTel export. |
| HTTP transport | `hyper` 1.x + `tokio-tungstenite` | HTTP/1.1 + HTTP/2 + WebSocket + SSE for the admin / web surface. |
Deps are pinned in the workspace Cargo.toml; new ones require commit-message justification.
The specification is complete — 32 sections, 17 substrate (§00–§16) + 15 knowledge layer (§17–§31). Implementation is phased.
| Phase | Scope | Status |
|---|---|---|
| 0 | Workspace skeleton, CI | ✓ phase-0-complete |
| 1 | Wire protocol & core types | ✓ phase-1-complete |
| 2 | Storage: arena + WAL + recovery | ✓ phase-2-complete |
| 3 | Metadata + redb integration | ✓ phase-3-complete |
| 4 | ANN index (HNSW) | ✓ phase-4-complete |
| 5 | Embedding service | ✓ phase-5-complete |
| 6 | Query planner | ✓ phase-6-complete |
| 7 | Cognitive operations | ✓ phase-7-complete |
| 8 | Background workers | ✓ phase-8-complete |
| 9 | Server binary (Tokio + Glommio wiring) | ✓ phase-9-complete |
| 10 | Rust SDK + admin CLI | ✓ phase-10-complete |
| 11 | `brain-http` (HTTP/WS/SSE transport) | ✓ phase-11-complete |
| 12 | Observability (metrics / logs / tracing / dashboards / alerts) | ✓ phase-12-complete |
| 13 | Benchmarks & chaos | planned |
| 14 | Substrate acceptance & `v0.9.x-substrate-rc` | planned |
Knowledge layer (phases 15–24): activates when a schema is declared. Substrate-only deployments are unaffected.
| Phase | Scope | Status |
|---|---|---|
| 15 | Knowledge storage extensions (tables, WAL frames, indexes, flags) | planned |
| 16 | Entity layer (resolver tiers 1–3, entity HNSW) | planned |
| 17 | Statement layer (Fact / Preference / Event, supersession, contradictions) | planned |
| 18 | Relation layer (cardinality, symmetry, 1–3 hop traversal) | planned |
| 19 | Schema DSL (parser, validator, versioning, migration plan) | planned |
| 20 | Pattern + classifier extractors (regex + bundled NER) | planned |
| 21 | LLM extractor (cache, retry, cost budget, resolver tier 4) | planned |
| 22 | Tantivy / lexical retrieval | planned |
| 23 | Hybrid query engine (router, RRF fusion, filter chain, EXPLAIN/TRACE) | planned |
| 24 | Sweepers, knowledge acceptance & `v1.0.0` | planned |
See ROADMAP.md for the phase index and docs/development/phases/ for per-phase sub-task breakdowns. The dependency DAG for the knowledge-layer phases lives in docs/development/phases/README.md.
For a hands-on walkthrough of every feature in context (substrate primitives, schema declaration, extractors, hybrid query, FORGET cascade), see docs/development/usage/practical-guide.md.
Linux x86_64 / aarch64, kernel ≥ 5.15. Brain depends on io_uring, O_DIRECT, pwritev2(RWF_DSYNC), and a few Linux-only madvise / fallocate flags — see spec/01_system_architecture/05_hardware.md §1.1 for why we chose a single-platform backend over portable shims.
| Crate | Linux | macOS / Windows native |
|---|---|---|
| `brain-core` | ✓ | ✓ (pure value types) |
| `brain-protocol` | ✓ | ✓ (codec + DSL parser) |
| `brain-cli` | ✓ | ✓ (no runtime dep yet) |
| `brain-sdk-rust` | ✓ | ✓ (client-side) |
| `brain-http` | ✓ | ✓ (hyper-based; client side works) |
| `brain-planner` | ✓ | ✓ (pure logic) |
| `brain-storage` | ✓ | ✗ (`compile_error!`) |
| `brain-metadata` | ✓ | ✗ (redb with O_DIRECT-aware paths) |
| `brain-index` | ✓ | ✗ (HNSW + tantivy persistence) |
| `brain-embed` | ✓ | ✗ (candle wiring) |
| `brain-ops` | ✓ | △ partial (wires runtime crates) |
| `brain-workers` | ✓ | ✗ (runs on Glommio) |
| `brain-extractors` | ✓ | ✗ (phase 20) |
| `brain-llm` | ✓ | ✗ (phase 21) |
| `brain-server` | ✓ | ✗ (Glommio + Tokio runtime) |
CI (.github/workflows/ci.yml) runs everything on ubuntu-latest and is the authoritative test gate.
```bash
rustup toolchain install stable
rustup component add rustfmt clippy
just verify          # fmt + build + clippy + test
```

The repo ships a `.devcontainer/` config. With Docker, OrbStack, or Colima running:

```bash
just shell           # builds the image on first run, drops you into bash
# inside the container:
just verify
```

Editor integration: VS Code, Cursor, JetBrains, and GitHub Codespaces auto-detect `.devcontainer/devcontainer.json`. Use "Reopen in Container."
| Runtime | macOS recommendation |
|---|---|
| OrbStack | Recommended on Apple Silicon. Permissive seccomp; works with io_uring. |
| Docker Desktop | Works, but 4.42+ needs a custom seccomp profile or --security-opt seccomp=unconfined to allow io_uring. |
| Colima | Works similarly to Docker Desktop; uses Lima/QEMU. |
- `O_DIRECT` against bind-mounted source. macOS-hosted Docker mounts via VirtioFS / 9P; these may not support `O_DIRECT`. Storage tests should write under `/tmp` (tmpfs) or a named-volume path inside the container, not `/workspaces/brain` (the bind mount).
- First build is slow (downloads ~100 crates). Persistent named volumes (`brain-cargo-registry`, `brain-cargo-git`, `brain-target-cache`) keep the cache across container restarts.
A cross-target `cargo check` validates compilation without running anything. Pair it with the dev container (or CI) for actual tests.
```bash
rustup target add x86_64-unknown-linux-gnu
brew install lld     # or: cargo install --locked cargo-zigbuild
cargo check --workspace --target x86_64-unknown-linux-gnu
```

```bash
just verify                                                    # full verify: fmt + build + clippy + test
cargo test -p brain-protocol                                   # per-crate (cross-platform crates)
cargo run --bin brain-server -- --config config/dev.toml       # Linux only
cargo run --bin brain-cli -- stats                             # cross-platform
cargo +nightly fuzz run protocol_frame -- -max_total_time=60   # nightly + Linux
just shell                                                     # enter the dev container (Docker required)
```

A step-by-step setup, CLI tour, SDK tour, and troubleshooting walkthrough lives under `docs/development/usage/`.
- `liburing-sys` link error on macOS native: expected. Use the dev container.
- `compile_error!` mentioning README.md: the friendly Linux gate; switch to the container.
- `io_uring_setup: Function not implemented`: kernel too old or seccomp restricted; check host kernel and container runtime.
- `O_DIRECT` returns `EINVAL`: the filesystem under the test path doesn't support direct I/O; use `/tmp` or a named volume.
```text
brain/
├── README.md                     # This file
├── ROADMAP.md                    # Phase index (0–24)
├── .devcontainer/                # Linux dev container for non-Linux contributors
│   ├── Dockerfile
│   ├── devcontainer.json
│   └── post-create.sh
├── docs/                         # See docs/README.md for navigation
│   ├── README.md                 # Navigation hub (Diátaxis-shaped)
│   ├── tutorials/                # Learning-oriented (getting started)
│   ├── guides/                   # Task-oriented (install, configure, operate, upgrade, observability)
│   ├── reference/                # Info-oriented (perf targets, error codes)
│   ├── runbooks/                 # Operator procedures (RB-1 … RB-11)
│   └── development/              # Contributor-oriented
│       ├── usage/                # Build + run + debug + test workflow
│       ├── spec-deviations.md    # Recorded conscious deviations from the spec
│       └── phases/               # Per-phase plans (0–24); dev history
├── monitoring/                   # Deployment assets (Grafana dashboards + Alertmanager rules)
├── spec/                         # The 32-section specification (read-only)
│   ├── 00_master_overview/       # Substrate (§00–§16)
│   ├── 01_system_architecture/
│   ├── 02_data_model/
│   ├── 03_wire_protocol/
│   ├── 05_storage_arena_wal/
│   ├── …
│   ├── 16_benchmarks_acceptance/
│   ├── 17_knowledge_model/       # Knowledge layer (§17–§31)
│   ├── 18_entities/
│   ├── 19_statements/
│   ├── 20_relations/
│   ├── 21_schema_dsl/
│   ├── 22_extractors/
│   ├── 23_retrievers/
│   ├── 24_hybrid_query/
│   └── … (through 31_complete_acceptance/)
├── crates/
│   ├── brain-core/               # Shared value types (MemoryId, EntityId, StatementId, …)
│   ├── brain-protocol/           # Wire codec + schema DSL parser
│   ├── brain-storage/            # Arena + WAL + recovery (Linux only)
│   ├── brain-metadata/           # redb wrapper (substrate + knowledge tables)
│   ├── brain-index/              # HNSW (memory + entity + statement) + tantivy
│   ├── brain-embed/              # BGE inference
│   ├── brain-planner/            # Query planner + hybrid query router
│   ├── brain-ops/                # Cognitive operations
│   ├── brain-workers/            # Background workers
│   ├── brain-extractors/         # Pattern + classifier extractors (phase 20)
│   ├── brain-llm/                # LLM client + cache + budget (phase 21)
│   ├── brain-http/               # HTTP/WS/SSE transport
│   ├── brain-server/             # Server binary
│   ├── brain-sdk-rust/           # Rust SDK (substrate + typed knowledge SDK)
│   └── brain-cli/                # Admin CLI
├── benches/                      # criterion benches per crate
├── fuzz/                         # cargo-fuzz harnesses
├── tests/                        # cross-crate integration + chaos + soak
├── scripts/
│   └── acceptance/               # acceptance gate runners
├── config/                       # Example TOML configs
├── justfile                      # Convenience commands (`just verify`, `just shell`, …)
└── .github/workflows/ci.yml      # CI: build + test + clippy + fmt + miri + audit
```
Apache-2.0. Source code, spec, and documentation are all under the same license.
By submitting a pull request, you agree that your contribution is licensed under the Apache-2.0 terms (per Apache-2.0 §5 — "Submission of Contributions"). The Apache-2.0 patent grant applies; see LICENSE for details.
Repository: https://github.com/brain-db-io/brain-db