Memory System

How agents remember facts across conversations using a 3-tier architecture with automatic consolidation.

Overview

GoClaw v3 gives agents long-term memory that persists across sessions. Memory is organized into three tiers (working, episodic, and semantic), each serving a distinct purpose in the recall lifecycle. A background consolidation pipeline promotes memories across tiers automatically, with no agent action required.

3-Tier Memory Architecture

graph TD
    L0["L0 — Working Memory<br/>(MEMORY.md, memory/*.md)<br/>FTS + Vector, per-agent/user"]
    L1["L1 — Episodic Memory<br/>(episodic_summaries table)<br/>Session summaries, 90-day TTL"]
    L2["L2 — Semantic Memory<br/>(Knowledge Graph)<br/>Entities + relations, temporal"]

    S["Session"] -->|"episodic_worker creates<br/>on session.completed"| L1
    L1 -->|"semantic_worker extracts<br/>KG facts on episodic.created"| L2
    L1 -->|"dreaming_worker synthesizes into L0<br/>after ≥5 unpromoted episodes"| L0

Tier	Storage	Content	Lifespan	Search
L0 Working	`memory_documents` + `memory_embeddings`	Agent-curated facts, auto-flush notes, dreaming output	Permanent until deleted	FTS + vector hybrid
L1 Episodic	`episodic_summaries`	Session summaries, key topics, L0 abstracts	90 days (configurable)	FTS + HNSW vector
L2 Semantic	Knowledge Graph tables	Entities, relations, temporal validity windows	Permanent	Graph traversal

Tier Boundaries and Promotion Rules

Session → L1: When a session completes, episodic_worker summarizes it into an episodic_summaries row. It uses the compaction summary if one is available; otherwise it calls the LLM with the session messages (30-second timeout, max 1,024 tokens).
L1 → L2: After each episodic summary is created, semantic_worker extracts KG entities and relations from the summary text and ingests them into the knowledge graph with temporal validity (valid_from = now).
L1 → L0: When ≥5 unpromoted episodic entries accumulate for an agent/user pair, dreaming_worker synthesizes them into a long-term Markdown document written to _system/dreaming/YYYYMMDD-consolidated.md and marks the episodes as promoted.

How It Works

graph LR
    W[Agent writes<br/>MEMORY.md or memory/*] --> CH[Chunk<br/>Split by paragraphs]
    CH --> EM[Embed<br/>Generate vectors]
    EM --> DB[(PostgreSQL<br/>memory_documents +<br/>memory_embeddings)]
    Q[Agent queries memory] --> HS[Hybrid Search<br/>FTS + Vector]
    HS --> DB
    DB --> R[Ranked Results]

Writing Memory (L0)

When an agent writes to MEMORY.md or files in memory/*, GoClaw:

Intercepts the file write (routed to DB, not filesystem)
Chunks the text by paragraph boundaries (max 1,000 chars per chunk)
Embeds each chunk using the configured embedding provider
Stores both the text (with tsvector for FTS) and the embedding vector

Only .md files are chunked and embedded. Non-markdown files (e.g., .json, .txt) are stored in the DB but are not indexed or searchable via memory_search.
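
The chunking step can be sketched as follows. This is a minimal illustration, not GoClaw's code: the function name `chunkParagraphs` is hypothetical, the limit is counted in runes, and both the packing of short paragraphs into one chunk and the hard split of oversized paragraphs are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkParagraphs splits text on blank lines and packs consecutive
// paragraphs into chunks of at most maxChars runes. Packing and the
// hard split of oversized paragraphs are assumptions of this sketch.
func chunkParagraphs(text string, maxChars int) []string {
	if maxChars <= 0 {
		return nil
	}
	var chunks []string
	var cur []rune
	flush := func() {
		if len(cur) > 0 {
			chunks = append(chunks, string(cur))
			cur = nil
		}
	}
	for _, p := range strings.Split(text, "\n\n") {
		para := []rune(strings.TrimSpace(p))
		if len(para) == 0 {
			continue
		}
		// Close the current chunk if this paragraph would overflow it.
		if len(cur) > 0 && len(cur)+2+len(para) > maxChars {
			flush()
		}
		// A paragraph longer than the limit is cut into fixed-size pieces.
		for len(para) > maxChars {
			chunks = append(chunks, string(para[:maxChars]))
			para = para[maxChars:]
		}
		if len(cur) > 0 {
			cur = append(cur, '\n', '\n')
		}
		cur = append(cur, para...)
	}
	flush()
	return chunks
}

func main() {
	doc := "User prefers dark mode.\n\nProject uses PostgreSQL 15."
	for i, c := range chunkParagraphs(doc, 1000) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

Each returned chunk would then be embedded and stored alongside its tsvector.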

Searching Memory

When an agent calls memory_search, GoClaw runs a hybrid search combining FTS and vector similarity:

Method	Weight	How It Works
Full-text search (FTS)	0.3	PostgreSQL `tsvector` + `plainto_tsquery('simple')` — good for exact terms
Vector similarity	0.7	`pgvector` cosine distance — good for semantic meaning

Weighted merge algorithm: FTS scores are normalized to 0..1 range (vector scores are already 0..1), then combined as (FTS × 0.3) + (vector × 0.7). When only one channel returns results, its scores are used directly (effective weight normalized to 1.0).
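
A minimal sketch of the weighted merge, under stated assumptions: the function name `mergeScores` is hypothetical, scores are keyed by a chunk ID, and FTS scores are normalized by dividing by the maximum score (the source only says they are normalized to 0..1, not how).

```go
package main

import "fmt"

const (
	ftsWeight    = 0.3
	vectorWeight = 0.7
)

// mergeScores combines raw FTS scores and 0..1 vector scores, both keyed
// by chunk ID. FTS scores are normalized by dividing by the maximum,
// which is one possible normalization and an assumption of this sketch.
func mergeScores(fts, vec map[string]float64) map[string]float64 {
	maxFTS := 0.0
	for _, s := range fts {
		if s > maxFTS {
			maxFTS = s
		}
	}
	norm := func(s float64) float64 {
		if maxFTS == 0 {
			return 0
		}
		return s / maxFTS
	}

	out := make(map[string]float64, len(fts)+len(vec))
	switch {
	case len(vec) == 0: // FTS only: normalized scores are used directly
		for id, s := range fts {
			out[id] = norm(s)
		}
	case len(fts) == 0: // vector only: scores are used directly
		for id, s := range vec {
			out[id] = s
		}
	default: // both channels: (FTS x 0.3) + (vector x 0.7)
		for id, s := range fts {
			out[id] += ftsWeight * norm(s)
		}
		for id, s := range vec {
			out[id] += vectorWeight * s
		}
	}
	return out
}

func main() {
	fts := map[string]float64{"chunk-a": 2.0, "chunk-b": 1.0}
	vec := map[string]float64{"chunk-a": 0.5, "chunk-c": 1.0}
	fmt.Println(mergeScores(fts, vec))
}
```

A chunk found by only one channel in a two-channel search keeps just that channel's weighted share, so chunks matched by both channels tend to rank higher.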

Results are then ranked:

Per-user boost: results scoped to the current user get a 1.2× multiplier
Deduplication: if both user-scoped and global results match, user copy wins
Final sort by weighted score
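
The ranking steps above could look like the sketch below. The `Result` type and `rank` function are hypothetical, and deduplicating by file path is an assumption; the source does not say which key identifies "the same" result.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is a hypothetical search hit; UserScoped is false for global memory.
type Result struct {
	Path       string
	UserScoped bool
	Score      float64
}

const userBoost = 1.2

// rank applies the per-user boost, keeps the user copy when a path
// appears in both scopes, and sorts by final score (descending).
func rank(results []Result) []Result {
	best := make(map[string]Result)
	for _, r := range results {
		if r.UserScoped {
			r.Score *= userBoost
		}
		prev, seen := best[r.Path]
		switch {
		case !seen:
			best[r.Path] = r
		case r.UserScoped && !prev.UserScoped: // user copy wins
			best[r.Path] = r
		case r.UserScoped == prev.UserScoped && r.Score > prev.Score:
			best[r.Path] = r
		}
	}
	out := make([]Result, 0, len(best))
	for _, r := range best {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].Path < out[j].Path // stable tie-break for the sketch
	})
	return out
}

func main() {
	hits := []Result{
		{Path: "MEMORY.md", UserScoped: false, Score: 0.9},
		{Path: "MEMORY.md", UserScoped: true, Score: 0.5},
		{Path: "memory/2026-03-23.md", UserScoped: false, Score: 0.7},
	}
	for _, r := range rank(hits) {
		fmt.Printf("%s user=%v score=%.2f\n", r.Path, r.UserScoped, r.Score)
	}
}
```

Note that in this sketch the user copy wins even when the global copy scored higher, which is one reading of "user copy wins".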

Embedding cache: The embedding_cache table is wired into the IndexDocument hot path. Repeated re-indexing of unchanged content reuses cached embeddings instead of calling the embedding provider, reducing latency and API cost.
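
The idea behind the cache can be illustrated with a content-hash lookup. This is a sketch only: an in-memory map stands in for the `embedding_cache` table, and the `CachedEmbedder` type and the choice of SHA-256 as the cache key are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Embedder stands in for a call to the configured embedding provider.
type Embedder func(text string) []float32

// CachedEmbedder reuses vectors for content it has already seen.
type CachedEmbedder struct {
	cache map[string][]float32 // stand-in for the embedding_cache table
	embed Embedder
	Calls int // number of real provider calls, for illustration
}

func NewCachedEmbedder(e Embedder) *CachedEmbedder {
	return &CachedEmbedder{cache: make(map[string][]float32), embed: e}
}

// Embed returns the cached vector when the chunk text is unchanged.
func (c *CachedEmbedder) Embed(text string) []float32 {
	sum := sha256.Sum256([]byte(text))
	key := hex.EncodeToString(sum[:])
	if v, ok := c.cache[key]; ok {
		return v
	}
	c.Calls++
	v := c.embed(text)
	c.cache[key] = v
	return v
}

func main() {
	// A fake provider that "embeds" text as its length.
	fake := func(text string) []float32 { return []float32{float32(len(text))} }
	ce := NewCachedEmbedder(fake)
	ce.Embed("User prefers dark mode.")
	ce.Embed("User prefers dark mode.") // served from cache
	fmt.Println("provider calls:", ce.Calls)
}
```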

Fallback behavior: if per-user search returns no results, GoClaw falls back to the global memory pool. This applies to both MEMORY.md and memory/*.md files.

Knowledge Graph Search

knowledge_graph_search complements memory_search for relationship and entity queries. While memory_search retrieves factual text chunks, knowledge_graph_search traverses entity relationships — useful for questions like "what projects is Alice working on?" or "which tools does this agent use?"

Consolidation Workers

The consolidation pipeline runs entirely in the background, event-driven via the internal event bus. Workers are registered once at startup via consolidation.Register() and subscribe to domain events.

sequenceDiagram
    participant S as Session
    participant EW as episodic_worker
    participant SW as semantic_worker
    participant DW as dedup_worker
    participant DR as dreaming_worker
    participant L0A as l0_abstract

    S->>EW: session.completed event
    EW->>EW: LLM summarize (or use compaction summary)
    EW->>EW: l0_abstract (extractive, no LLM)
    EW-->>SW: episodic.created event
    EW-->>DR: episodic.created event
    SW->>SW: Extract KG entities + relations
    SW-->>DW: entity.upserted event
    DW->>DW: Merge/flag duplicate entities
    DR->>DR: Count unpromoted (debounce 10min, threshold 5)
    DR->>DR: LLM synthesis → _system/dreaming/YYYYMMDD-consolidated.md
    DR->>DR: Mark episodes as promoted
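
The event flow in the diagram can be approximated with a tiny in-process bus. Everything here is hypothetical (the `Bus` type, `register`, and string payloads), and the bus is synchronous so the order is deterministic; the real pipeline runs in the background and is wired through consolidation.Register().

```go
package main

import "fmt"

// Bus is a minimal synchronous publish/subscribe bus for illustration.
type Bus struct {
	handlers map[string][]func(payload string)
}

func NewBus() *Bus {
	return &Bus{handlers: make(map[string][]func(string))}
}

func (b *Bus) Subscribe(topic string, h func(payload string)) {
	b.handlers[topic] = append(b.handlers[topic], h)
}

func (b *Bus) Publish(topic, payload string) {
	for _, h := range b.handlers[topic] {
		h(payload)
	}
}

// register wires stub workers to the events named in the diagram and
// records which worker ran, in order, in trace.
func register(b *Bus, trace *[]string) {
	b.Subscribe("session.completed", func(id string) {
		*trace = append(*trace, "episodic_worker")
		b.Publish("episodic.created", id)
	})
	b.Subscribe("episodic.created", func(id string) {
		*trace = append(*trace, "semantic_worker")
		b.Publish("entity.upserted", id)
	})
	b.Subscribe("episodic.created", func(id string) {
		*trace = append(*trace, "dreaming_worker")
	})
	b.Subscribe("entity.upserted", func(id string) {
		*trace = append(*trace, "dedup_worker") // terminal: publishes nothing
	})
}

func main() {
	bus := NewBus()
	var trace []string
	register(bus, &trace)
	bus.Publish("session.completed", "sess-1")
	fmt.Println(trace)
}
```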

`episodic_worker`

Trigger: session.completed event
Action: Creates an episodic_summaries row for each completed session.

Checks source_id (sessionKey:compactionCount) to prevent duplicate summaries.
Uses the compaction summary if present; otherwise reads session messages and calls the LLM with a 30-second timeout.
Generates an L0 abstract — a 1-sentence extractive summary (~200 runes) for fast context injection, with no LLM call.
Extracts key_topics as capitalized proper noun phrases for FTS boosting.
Sets expires_at to 90 days from creation (configurable via episodic_ttl_days).
Publishes episodic.created for downstream workers.
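
The duplicate check and expiry calculation can be sketched like this. The `summaryStore` type and `createOnce` method are hypothetical and use an in-memory map in place of the episodic_summaries table; only the `sessionKey:compactionCount` format and the 90-day default come from the text above.

```go
package main

import (
	"fmt"
	"time"
)

// Summary holds the fields this sketch needs from an episodic summary row.
type Summary struct {
	SourceID  string
	CreatedAt time.Time
	ExpiresAt time.Time
}

// summaryStore is an in-memory stand-in for the episodic_summaries table.
type summaryStore struct {
	bySource map[string]Summary
}

func newSummaryStore() *summaryStore {
	return &summaryStore{bySource: make(map[string]Summary)}
}

// sourceID follows the sessionKey:compactionCount format described above.
func sourceID(sessionKey string, compactionCount int) string {
	return fmt.Sprintf("%s:%d", sessionKey, compactionCount)
}

// createOnce inserts a summary unless one already exists for the same
// source ID. The second return value reports whether a row was created.
func (s *summaryStore) createOnce(sessionKey string, compactionCount, ttlDays int) (Summary, bool) {
	id := sourceID(sessionKey, compactionCount)
	if existing, ok := s.bySource[id]; ok {
		return existing, false
	}
	now := time.Now().UTC()
	sum := Summary{SourceID: id, CreatedAt: now, ExpiresAt: now.AddDate(0, 0, ttlDays)}
	s.bySource[id] = sum
	return sum, true
}

func main() {
	st := newSummaryStore()
	_, created := st.createOnce("sess-1", 0, 90)
	_, again := st.createOnce("sess-1", 0, 90)
	fmt.Println("first insert:", created, "second insert:", again)
}
```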

`semantic_worker`

Trigger: episodic.created event
Action: Extracts knowledge graph entities and relations from the episodic summary text.

Calls the EntityExtractor (KG extraction, not a raw LLM call).
Stamps extracted entities with valid_from = now() and scopes them to agent_id + user_id.
Ingests into the KG store via IngestExtraction.
Publishes entity.upserted for the dedup worker.
Failures are non-fatal — extraction errors are logged as warnings and do not block the pipeline.

`dedup_worker`

Trigger: entity.upserted event
Action: Detects and merges duplicate KG entities after each extraction batch.

Calls kgStore.DedupAfterExtraction with the newly upserted entity IDs.
Merges semantically equivalent entities and flags ambiguous ones.
Terminal worker — no downstream events.
Failures are non-fatal.

`dreaming_worker`

Trigger: episodic.created event
Action: Consolidates unpromoted episodic summaries into long-term L0 memory.

Debounce: skips the run if the worker already ran within the last 10 minutes for the same agent/user pair.
Threshold: requires ≥5 unpromoted episodic entries before running (configurable).
Fetches up to 10 unpromoted entries and calls the LLM to synthesize long-term facts (max 4,096 tokens).
Synthesis prompt extracts: user preferences, project facts, recurring patterns, key decisions.
Writes output to _system/dreaming/YYYYMMDD-consolidated.md in L0 memory and indexes it for search.
Marks all processed entries as promoted by setting promoted_at = now().
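
The debounce, threshold, and batch limit can be combined into a single gate, sketched below. The `dreamGate` type and `plan` method are hypothetical, and recording the last-run time only when a run actually happens is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// dreamGate decides whether the dreaming step should run for a pair.
type dreamGate struct {
	debounce  time.Duration
	threshold int
	batchSize int
	lastRun   map[string]time.Time // keyed by agent/user pair
}

func newDreamGate() *dreamGate {
	return &dreamGate{
		debounce:  10 * time.Minute,
		threshold: 5,
		batchSize: 10,
		lastRun:   make(map[string]time.Time),
	}
}

// plan returns how many unpromoted episodes to synthesize now.
// Zero means the run is skipped (debounced or below the threshold).
func (g *dreamGate) plan(agentID, userID string, unpromoted int, now time.Time) int {
	key := agentID + "/" + userID
	if last, ok := g.lastRun[key]; ok && now.Sub(last) < g.debounce {
		return 0
	}
	if unpromoted < g.threshold {
		return 0
	}
	g.lastRun[key] = now
	if unpromoted > g.batchSize {
		return g.batchSize
	}
	return unpromoted
}

func main() {
	g := newDreamGate()
	now := time.Now()
	fmt.Println(g.plan("agent-1", "user-1", 4, now))  // below threshold
	fmt.Println(g.plan("agent-1", "user-1", 12, now)) // capped at batch size
	fmt.Println(g.plan("agent-1", "user-1", 12, now.Add(time.Minute))) // debounced
}
```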

`l0_abstract`

Not a standalone worker — a utility called by episodic_worker to produce a brief L0 abstract from a full summary. Uses an extractive sentence-splitting approach (no LLM call, no latency). The abstract is stored in the l0_abstract column of episodic_summaries and used by the auto-injector.
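
One way to build such an extractive abstract is to take the first sentence and cap it at a rune budget. The sketch below assumes exactly that; the function name `extractAbstract` and the sentence-boundary rule (a terminator followed by whitespace or end of text) are illustrative choices, not the actual implementation.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// extractAbstract returns the first sentence of summary, truncated to
// maxRunes. A sentence ends at '.', '!' or '?' followed by whitespace
// or the end of the text, so "v1.2" does not end a sentence.
func extractAbstract(summary string, maxRunes int) string {
	runes := []rune(strings.TrimSpace(summary))
	end := len(runes)
	for i, r := range runes {
		if r == '.' || r == '!' || r == '?' {
			if i+1 == len(runes) || unicode.IsSpace(runes[i+1]) {
				end = i + 1
				break
			}
		}
	}
	if end > maxRunes {
		end = maxRunes
	}
	return strings.TrimSpace(string(runes[:end]))
}

func main() {
	summary := "User prefers Go for backend work. They also use PostgreSQL."
	fmt.Println(extractAbstract(summary, 200))
}
```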

Periodic pruning: A goroutine runs every 6 hours to delete episodic summaries past their expires_at date.

Auto-Injector

The auto-injector automatically surfaces relevant memories into the agent's system prompt at the start of each turn, before the LLM call.

Interface: AutoInjector.Inject(ctx, InjectParams) — called once per turn in the context build stage.
How it works: Checks the user's message against the memory index. Returns a formatted section for the system prompt (empty string if nothing is relevant). Budget: max ~200 tokens of L0 abstracts.
Default parameters (overridable per agent in agents.settings JSONB):

Parameter	Default	Description
`auto_inject_enabled`	`true`	Enable/disable auto-injection
`auto_inject_threshold`	`0.3`	Minimum relevance score (0–1) for a memory to be injected
`auto_inject_max_tokens`	`200`	Token budget for injected memory section
`episodic_ttl_days`	`90`	Days before episodic summaries expire
`consolidation_enabled`	`true`	Enable/disable consolidation pipeline

The injector returns an InjectResult with observability fields: MatchCount, Injected, and TopScore.
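
Per-agent overrides can be modeled as JSON laid over the defaults from the table. This is a sketch under assumptions: the `MemorySettings` struct and `loadSettings` function are hypothetical, and the keys are assumed to sit at the top level of the settings JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MemorySettings mirrors the parameters in the table above.
type MemorySettings struct {
	AutoInjectEnabled    bool    `json:"auto_inject_enabled"`
	AutoInjectThreshold  float64 `json:"auto_inject_threshold"`
	AutoInjectMaxTokens  int     `json:"auto_inject_max_tokens"`
	EpisodicTTLDays      int     `json:"episodic_ttl_days"`
	ConsolidationEnabled bool    `json:"consolidation_enabled"`
}

func defaultSettings() MemorySettings {
	return MemorySettings{
		AutoInjectEnabled:    true,
		AutoInjectThreshold:  0.3,
		AutoInjectMaxTokens:  200,
		EpisodicTTLDays:      90,
		ConsolidationEnabled: true,
	}
}

// loadSettings overlays raw JSON on the defaults. Keys missing from the
// JSON keep their default value because Unmarshal only sets present fields.
func loadSettings(raw []byte) (MemorySettings, error) {
	s := defaultSettings()
	if len(raw) == 0 {
		return s, nil
	}
	if err := json.Unmarshal(raw, &s); err != nil {
		return defaultSettings(), err
	}
	return s, nil
}

func main() {
	s, err := loadSettings([]byte(`{"auto_inject_threshold": 0.5}`))
	if err != nil {
		fmt.Println("invalid settings:", err)
		return
	}
	fmt.Printf("%+v\n", s)
}
```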

Trivial Filter

The trivial filter prevents low-value messages from triggering memory injection, reducing unnecessary database lookups.

isTrivialMessage(msg) returns true when the message contains fewer than 3 meaningful words after removing stopwords (greetings like "hi", "ok", "thanks", acknowledgments, single-word responses). Trivial messages skip the auto-injector entirely.
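
A minimal sketch of such a filter is shown below. Only the function name and the "fewer than 3 meaningful words" rule come from the text above; the stopword list and the tokenization are illustrative assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stopwords is an illustrative list, not GoClaw's actual list.
var stopwords = map[string]bool{
	"hi": true, "hello": true, "hey": true, "ok": true, "okay": true,
	"thanks": true, "thank": true, "you": true, "yes": true, "no": true,
	"sure": true, "cool": true, "great": true, "got": true, "it": true,
	"the": true, "a": true, "an": true, "is": true, "to": true, "and": true,
}

// isTrivialMessage reports whether msg has fewer than 3 meaningful
// words once stopwords are removed.
func isTrivialMessage(msg string) bool {
	words := strings.FieldsFunc(strings.ToLower(msg), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	meaningful := 0
	for _, w := range words {
		if !stopwords[w] {
			meaningful++
		}
	}
	return meaningful < 3
}

func main() {
	for _, m := range []string{"ok thanks!", "What database does the billing service use?"} {
		fmt.Printf("%q trivial=%v\n", m, isTrivialMessage(m))
	}
}
```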

Memory vs Sessions

Aspect	Memory	Sessions
Lifespan	Permanent until deleted (L1 episodic summaries expire after 90 days by default)	Per-conversation
Content	Facts, preferences, knowledge	Message history
Search	Hybrid (FTS + vector)	Sequential access
Scope	Per-user per-agent	Per-session key

Memory is for things worth remembering long term. Sessions are for conversation flow.

Auto Memory Flush

During auto-compaction, GoClaw extracts important facts from the conversation and saves them to memory before summarizing the history.

Trigger: more than 50 messages or more than 85% of the context window in use (either condition triggers compaction)
Process: Synchronous flush, max 5 iterations, 90-second timeout
What's saved: Key facts, user preferences, decisions, action items
Order: Memory flush runs before history compaction — facts are persisted first, then history is summarized and truncated

Memory flush only triggers as part of auto-compaction — not independently. The flush runs synchronously inside the compaction lock and appends extracted facts to memory/YYYY-MM-DD.md. This means agents gradually build up knowledge about each user without explicit "remember this" commands.
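
The compaction trigger reduces to a simple either/or check. The function name `shouldCompact` and its parameters are hypothetical; the thresholds are the ones listed above, and integer math is used to avoid floating-point edge cases at exactly 85%.

```go
package main

import "fmt"

const (
	maxMessages       = 50
	maxContextPercent = 85
)

// shouldCompact reports whether auto-compaction (and therefore the
// memory flush that precedes it) should run.
func shouldCompact(messageCount, usedTokens, contextWindow int) bool {
	if messageCount > maxMessages {
		return true
	}
	// usedTokens/contextWindow > 85%, written without division.
	return contextWindow > 0 && usedTokens*100 > contextWindow*maxContextPercent
}

func main() {
	fmt.Println(shouldCompact(51, 1000, 100000)) // message count exceeded
	fmt.Println(shouldCompact(10, 90000, 100000)) // context usage exceeded
	fmt.Println(shouldCompact(10, 1000, 100000)) // neither condition met
}
```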

Extractive Memory Fallback

If the LLM-based flush fails (timeout, provider error, bad output), GoClaw falls back to extractive memory: a keyword-based pass over the conversation that extracts key facts without an LLM call. This ensures memories are saved even when the LLM is unavailable, at the cost of lower quality extraction.
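
The fallback can be sketched as "try the LLM under a timeout, otherwise run a keyword pass". All names here are hypothetical (`flushWithFallback`, `keywordExtract`, the stub LLM functions), the marker words are invented for illustration, and treating an empty result as bad output is an assumption. A real worker would likely derive the context from its caller rather than from context.Background().

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
	"time"
)

const flushTimeout = 90 * time.Second

// flushFunc stands in for the LLM-based fact extraction call.
type flushFunc func(ctx context.Context, messages []string) ([]string, error)

// keywordExtract keeps messages that contain a marker word.
// The marker list is illustrative only.
func keywordExtract(messages []string) []string {
	markers := []string{"prefer", "decided", "always", "never", "remember"}
	var facts []string
	for _, m := range messages {
		lower := strings.ToLower(m)
		for _, k := range markers {
			if strings.Contains(lower, k) {
				facts = append(facts, m)
				break
			}
		}
	}
	return facts
}

// flushWithFallback returns the extracted facts and whether the
// extractive fallback was used.
func flushWithFallback(messages []string, llm flushFunc) ([]string, bool) {
	ctx, cancel := context.WithTimeout(context.Background(), flushTimeout)
	defer cancel()
	facts, err := llm(ctx, messages)
	if err != nil || len(facts) == 0 {
		return keywordExtract(messages), true
	}
	return facts, false
}

// Stub LLM calls used to demonstrate both paths.
func failingLLM(ctx context.Context, _ []string) ([]string, error) {
	return nil, errors.New("provider unavailable")
}

func workingLLM(ctx context.Context, _ []string) ([]string, error) {
	return []string{"User prefers tabs over spaces"}, nil
}

func main() {
	msgs := []string{"I prefer tabs over spaces", "ok", "We decided to ship Friday"}
	facts, fallback := flushWithFallback(msgs, failingLLM)
	fmt.Println(facts, "fallback:", fallback)
}
```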

Memory File Variants

GoClaw recognizes four memory file types:

File	Role	Notes
`MEMORY.md`	Curated memory (Markdown)	Primary file; auto-included in system prompt
`memory.md`	Fallback for `MEMORY.md`	Checked if `MEMORY.md` is absent
`MEMORY.json`	Machine-readable index	Deprecated — no longer recommended
Inline (`memory/*.md`)	Date-stamped files from auto-flush	Indexed and searchable; e.g. `memory/2026-03-23.md`

All .md variants are chunked, embedded, and searchable via memory_search. MEMORY.json is stored but not indexed.

Requirements

Memory requires:

PostgreSQL 15+ with the pgvector extension
An embedding provider configured (OpenAI, Anthropic, or compatible)
memory: true in agent config (enabled by default)

Set memory: false in an agent's config to disable memory entirely for that agent — no reads, no writes, no auto-flush.

Team Memory Sharing

When agents work as a team, team members can read the leader's memory as a fallback:

memory_search: Searches the member's own memory first. If no results, automatically falls back to the leader's memory and merges results.
memory_get: Reads from the member's own memory first. If the file isn't found, falls back to the leader's memory.
Writes are blocked: Team members cannot save or modify memory files — only the team leader can write memory. Members attempting to write receive: "memory is read-only for team members".

This allows knowledge sharing within a team without duplication. The leader accumulates shared knowledge, and all members benefit from it automatically.
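
The read-fallback and write-blocking rules can be modeled in a few lines. The `teamMemory` type and its methods are hypothetical, and an in-memory map replaces the database; only the error message text is taken from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errReadOnly = errors.New("memory is read-only for team members")
	errNotFound = errors.New("memory file not found")
)

// teamMemory is an in-memory stand-in for per-agent memory files.
type teamMemory struct {
	leaderID string
	files    map[string]map[string]string // agent ID -> path -> content
}

func newTeamMemory(leaderID string) *teamMemory {
	return &teamMemory{leaderID: leaderID, files: make(map[string]map[string]string)}
}

// get reads the agent's own file first, then falls back to the leader's.
func (t *teamMemory) get(agentID, path string) (string, error) {
	if content, ok := t.files[agentID][path]; ok {
		return content, nil
	}
	if agentID != t.leaderID {
		if content, ok := t.files[t.leaderID][path]; ok {
			return content, nil
		}
	}
	return "", errNotFound
}

// write allows only the team leader to save memory files.
func (t *teamMemory) write(agentID, path, content string) error {
	if agentID != t.leaderID {
		return errReadOnly
	}
	if t.files[agentID] == nil {
		t.files[agentID] = make(map[string]string)
	}
	t.files[agentID][path] = content
	return nil
}

func main() {
	tm := newTeamMemory("leader")
	_ = tm.write("leader", "MEMORY.md", "Deploys happen on Fridays.")
	content, _ := tm.get("member-1", "MEMORY.md") // falls back to the leader
	fmt.Println(content)
	fmt.Println(tm.write("member-1", "MEMORY.md", "x"))
}
```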

Common Issues

Problem	Solution
Memory search returns nothing	Check that pgvector extension is installed; verify embedding provider is configured
Agent forgets things	Ensure `memory: true` in config; check if auto-compaction is running
Irrelevant memories surfacing	Memory accumulates over time; consider clearing old memories via the API
Episodic summaries not created	Verify consolidation workers are registered at startup; check event bus is running
Dreaming worker never promotes	Check that ≥5 sessions have completed for the agent/user pair; review debounce logs

What's Next

Multi-Tenancy — Per-user memory isolation
Sessions and History — How conversation history works
Context Pruning — How pruning integrates with the consolidation pipeline
Agents Explained — Agent types and context files
