Engram is configured via a YAML file (default: `engram.yaml`). Most fields can also be overridden with an `ENGRAM_*` environment variable (see the table at the end of this section).
```
./engram --config /path/to/engram.yaml
```

| Key | Default | Description |
|---|---|---|
| `port` | 8080 | HTTP server port |
| `api_key` | (none) | Bearer token required for all `/v1/*` requests. If unset, auth is disabled. |
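For example, assuming these keys nest under a `server:` block (as the `ENGRAM_SERVER_API_KEY` → `server.api_key` env-var mapping later in this section suggests):

```yaml
server:
  port: 8080
  api_key: "change-me"  # omit to disable auth
```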
| Key | Default | Description |
|---|---|---|
| `path` | ./engram.db | SQLite database file path |
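A minimal sketch, assuming the key nests under a `storage:` block (as the `ENGRAM_STORAGE_PATH` → `storage.path` env-var mapping suggests):

```yaml
storage:
  path: "./engram.db"
```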
Engram uses three LLM configs for different pipeline stages, each with different quality requirements:
| Config key | Purpose | Recommended |
|---|---|---|
| `compression_llm` | Pyramid compression — squeeze episode text to N words (L4/L8/L16/L32/L64) | Small/local model (Ollama) |
| `consolidation_llm` | Engram/trace summarization — synthesize groups of episodes into coherent memories | Medium model |
| `inference_llm` | Relationship/edge detection — infer semantic edges between episodes (structured JSON output) | Medium–high model |
All three share the same set of fields:
| Key | Description |
|---|---|
| `provider` | `anthropic`, `ollama`, or `claude-code` |
| `model` | Model name |
| `api_key` | Anthropic API key (if `provider: anthropic`). Falls back to the `ANTHROPIC_API_KEY` env var. |
| `base_url` | Ollama server URL (if `provider: ollama`). Defaults to `http://localhost:11434`. |
| `binary_path` | Path to the Claude Code CLI binary (if `provider: claude-code`). Defaults to `claude`. |
| Config | Default provider | Default model |
|---|---|---|
| `compression_llm` | ollama | qwen2.5:7b |
| `consolidation_llm` | anthropic | claude-haiku-4-5-20251001 |
| `inference_llm` | anthropic | claude-haiku-4-5-20251001 |
```yaml
# Pyramid compression — local Ollama is fast enough for word-count compression
compression_llm:
  provider: "ollama"
  model: "qwen2.5:7b"
  base_url: "http://localhost:11434"

# Engram summarization — Haiku handles coherent prose well
consolidation_llm:
  provider: "anthropic"
  model: "claude-haiku-4-5-20251001"

# Relationship detection — Haiku handles structured JSON output reliably
inference_llm:
  provider: "anthropic"
  model: "claude-haiku-4-5-20251001"
```

To run fully local, point all three stages at Ollama:

```yaml
compression_llm:
  provider: "ollama"
  model: "qwen2.5:3b"
  base_url: "http://localhost:11434"

consolidation_llm:
  provider: "ollama"
  model: "qwen2.5:7b"
  base_url: "http://localhost:11434"

inference_llm:
  provider: "ollama"
  model: "qwen2.5:7b"
  base_url: "http://localhost:11434"
```

To use the Claude Code CLI for the heavier stages while keeping compression local:

```yaml
consolidation_llm:
  provider: "claude-code"
  model: "claude-sonnet-4-6" # optional

inference_llm:
  provider: "claude-code"
  model: "claude-sonnet-4-6"

compression_llm:
  provider: "ollama"
  model: "qwen2.5:7b"
  base_url: "http://localhost:11434"
```

The single `llm` key is deprecated. It is still supported as a fallback but will be removed in a future release. If `compression_llm`, `consolidation_llm`, or `inference_llm` is not set, Engram falls back to `llm` for that stage. A deprecation warning is logged at startup when `llm` is in use.

```yaml
# Old style — all three stages use the same model (still works, but deprecated)
llm:
  provider: "anthropic"
  model: "claude-haiku-4-5-20251001"
```

Migrate by replacing `llm:` with the three stage-specific keys above.
| Key | Default | Description |
|---|---|---|
| `base_url` | http://localhost:11434 | Ollama-compatible embedding server URL |
| `model` | nomic-embed-text | Embedding model name |
| `api_key` | (none) | API key if required by the embedding server |
If the embedding server is unavailable, Engram falls back to lexical-only retrieval (BM25 + entity seeding).
| Key | Default | Description |
|---|---|---|
| `provider` | ollama | `spacy` or `ollama` |
| `model` | qwen2.5:7b | Model name (Ollama only) |
| `spacy_url` | (none) | spaCy server URL (if `provider: spacy`) |
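A sketch of the spaCy variant, assuming the keys nest under a `ner:` block (as the `ENGRAM_NER_*` env-var mappings suggest) and that the sidecar listens on port 8001 as in the commands below:

```yaml
ner:
  provider: "spacy"
  spacy_url: "http://localhost:8001"
```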
NER is optional — omit this block to skip entity extraction. Retrieval still works via semantic and lexical seeding; entity-based seeding is simply absent.
spaCy is faster and more accurate for English. Run the sidecar:
```
docker run -p 8001:8001 ghcr.io/vthunder/engram-ner:latest
```

Or from the repo:

```
cd ner && uvicorn server:app --port 8001
```

Background consolidation settings:

| Key | Default | Description |
|---|---|---|
| `enabled` | true | Run background consolidation |
| `interval` | 15m | How often to check consolidation eligibility (Go duration string) |
| `min_episodes` | 10 | Minimum unconsolidated episodes in any channel before consolidation is eligible |
| `idle_time` | 30m | Time since the last episode in a channel before that channel can trigger consolidation |
| `max_buffer` | 100 | Unconsolidated episode count that triggers consolidation immediately, regardless of idle time |
Consolidation now runs conditionally on each tick. For each channel with unconsolidated episodes, it checks:
```
unconsolidated_count >= min_episodes AND (idle_time elapsed OR unconsolidated_count >= max_buffer)
```
This prevents consolidation from firing mid-conversation (the idle_time gate) while ensuring the buffer doesn't grow unbounded (the max_buffer gate). Setting max_buffer equal to the bot's maximum unconsolidated episode fetch limit ensures older episodes are always reachable either via the unconsolidated buffer or as consolidated engrams retrievable by spreading activation.
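Putting the defaults together — a sketch assuming these keys nest under a `consolidation:` block, as the `ENGRAM_CONSOLIDATION_*` env-var mappings suggest:

```yaml
consolidation:
  enabled: true
  interval: "15m"      # eligibility check cadence
  min_episodes: 10     # floor before any consolidation
  idle_time: "30m"     # wait for the conversation to go quiet
  max_buffer: 100      # hard cap: consolidate even mid-conversation
```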
Controls automatic background activation decay. Engram applies exponential decay to all engram activations on each tick — no client scheduling required. Operational engrams decay at 3× the base rate.
| Key | Default | Description |
|---|---|---|
| `interval` | 1h | How often to run decay (Go duration string). Set to 0 to disable. |
| `lambda` | 0.005 | Exponential decay coefficient λ. Higher values = faster forgetting. |
| `floor` | 0.01 | Minimum activation level — engrams never decay below this value. |
The decay formula applied to each engram:

```
new_activation = current_activation × exp(−λ × hours_since_last_access)
```

With λ = 0.005 per hour, an engram that hasn't been accessed for 7 days (168 hours) retains exp(−0.84) ≈ 43% of its activation; after 30 days, roughly 3%; after 90 days it has long since been clamped to the 0.01 floor. Reinforce engrams with `POST /v1/engrams/{id}/reinforce` to reset their `last_accessed` timestamp and restart the decay clock.
POST /v1/activation/decay remains available for manual or one-off decay runs.
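A configuration sketch using the defaults — note that the `decay:` block name is an assumption here, since no env-var mapping pins it down:

```yaml
decay:
  interval: "1h"   # 0 disables background decay
  lambda: 0.005    # per-hour decay coefficient
  floor: 0.01      # activation never drops below this
```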
When set, the consolidation pipeline uses role-aware memory formation. The bot's own episodes are written in first person; owner episodes are attributed by name; third-party episodes are attributed correctly. One-time approvals ("ok you can restart") are recorded with temporal anchoring rather than as standing permissions.
| Key | Default | Description |
|---|---|---|
| `name` | (none) | Bot's display name, e.g. "Bud". Matched against `episode.author` |
| `author_id` | (none) | Bot's author ID. Matched against `episode.author_id` |
| `owner_ids` | (none) | List of owner `author_id` values for owner-specific framing |
```yaml
identity:
  name: "Bud"
  author_id: "bud"
  owner_ids:
    - "thunder"
```

Effects:

- Episodes where `author_id` matches `identity.author_id` are written in first person ("I should...")
- Episodes where `author_id` is in `identity.owner_ids` are attributed as "the owner" or by name
- `POST /v1/thoughts` automatically sets `author` and `author_id` from identity config
Config fields can be set or overridden with `ENGRAM_*` environment variables:
| Variable | Config field |
|---|---|
| `ENGRAM_SERVER_API_KEY` | `server.api_key` |
| `ENGRAM_STORAGE_PATH` | `storage.path` |
| `ENGRAM_COMPRESSION_LLM_PROVIDER` | `compression_llm.provider` |
| `ENGRAM_COMPRESSION_LLM_MODEL` | `compression_llm.model` |
| `ENGRAM_COMPRESSION_LLM_API_KEY` | `compression_llm.api_key` |
| `ENGRAM_COMPRESSION_LLM_BASE_URL` | `compression_llm.base_url` |
| `ENGRAM_COMPRESSION_LLM_BINARY_PATH` | `compression_llm.binary_path` |
| `ENGRAM_CONSOLIDATION_LLM_PROVIDER` | `consolidation_llm.provider` |
| `ENGRAM_CONSOLIDATION_LLM_MODEL` | `consolidation_llm.model` |
| `ENGRAM_CONSOLIDATION_LLM_API_KEY` | `consolidation_llm.api_key` |
| `ENGRAM_CONSOLIDATION_LLM_BASE_URL` | `consolidation_llm.base_url` |
| `ENGRAM_CONSOLIDATION_LLM_BINARY_PATH` | `consolidation_llm.binary_path` |
| `ENGRAM_INFERENCE_LLM_PROVIDER` | `inference_llm.provider` |
| `ENGRAM_INFERENCE_LLM_MODEL` | `inference_llm.model` |
| `ENGRAM_INFERENCE_LLM_API_KEY` | `inference_llm.api_key` |
| `ENGRAM_INFERENCE_LLM_BASE_URL` | `inference_llm.base_url` |
| `ENGRAM_INFERENCE_LLM_BINARY_PATH` | `inference_llm.binary_path` |
| `ANTHROPIC_API_KEY` | Fallback `api_key` for all Anthropic-using configs |
| `ENGRAM_LLM_PROVIDER` | `llm.provider` (deprecated) |
| `ENGRAM_LLM_MODEL` | `llm.model` (deprecated) |
| `ENGRAM_LLM_API_KEY` | `llm.api_key` (deprecated) |
| `ENGRAM_LLM_BASE_URL` | `llm.base_url` (deprecated) |
| `ENGRAM_LLM_BINARY_PATH` | `llm.binary_path` (deprecated) |
| `ENGRAM_EMBEDDING_BASE_URL` | `embedding.base_url` |
| `ENGRAM_EMBEDDING_MODEL` | `embedding.model` |
| `ENGRAM_EMBEDDING_API_KEY` | `embedding.api_key` |
| `ENGRAM_NER_PROVIDER` | `ner.provider` |
| `ENGRAM_NER_MODEL` | `ner.model` |
| `ENGRAM_NER_SPACY_URL` | `ner.spacy_url` |
| `ENGRAM_IDENTITY_NAME` | `identity.name` |
| `ENGRAM_IDENTITY_AUTHOR_ID` | `identity.author_id` |
| `ENGRAM_CONSOLIDATION_MIN_EPISODES` | `consolidation.min_episodes` |
| `ENGRAM_CONSOLIDATION_IDLE_TIME` | `consolidation.idle_time` |
| `ENGRAM_CONSOLIDATION_MAX_BUFFER` | `consolidation.max_buffer` |
Decay does not currently have env var overrides; configure it via the YAML file.
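A sketch of overriding a couple of fields at launch — the variable names come from the table above, but the specific values here are illustrative:

```
export ENGRAM_STORAGE_PATH="/var/lib/engram/engram.db"
export ENGRAM_CONSOLIDATION_LLM_PROVIDER="ollama"
export ENGRAM_CONSOLIDATION_LLM_MODEL="qwen2.5:7b"
./engram --config engram.yaml
```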