| title | Configuration |
|---|---|
| description | Complete configuration reference for Signet. |
| order | 2 |
| section | Getting Started |
Complete reference for all Signet configuration options. For initial setup, see [[quickstart]]. For the [[daemon]] runtime, see [[architecture]].
All files live in your active Signet workspace.
- Default workspace: `~/.agents/`
- Persisted workspace setting: `~/.config/signet/workspace.json`
- Override for a single process: `SIGNET_PATH=/some/path`
| File | Purpose |
|---|---|
| `agent.yaml` | Main configuration and manifest |
| `AGENTS.md` | Agent-managed operating rules and instructions (synced to harnesses) |
| `SOUL.md` | Agent-managed personality, tone, values, and temperament |
| `MEMORY.md` | System-managed working memory summary (auto-generated, do not edit manually) |
| `IDENTITY.md` | Agent-managed identity metadata |
| `USER.md` | Agent-managed user profile and relationship context |
The loader checks `agent.yaml`, `AGENT.yaml`, and `config.yaml` in that
order, using the first file it finds. All sections are optional; omitting
a section falls back to the documented defaults.
Use the CLI to inspect or change the default workspace path:
```
signet workspace status
signet workspace set ~/.openclaw/workspace
```

`signet workspace set` is idempotent. It safely migrates files, stores the
new default workspace in `~/.config/signet/workspace.json`, and updates
detected OpenClaw-family configs to keep `agents.defaults.workspace` aligned.
Resolution order for the effective workspace is:
1. `--path` CLI option
2. `SIGNET_PATH` environment variable
3. Stored CLI workspace setting (`~/.config/signet/workspace.json`)
4. Default `~/.agents/`
The primary configuration file. Created by signet setup and editable
via signet configure or the dashboard's config editor.
```yaml
version: 1
schema: signet/v1

agent:
  name: "My Agent"
  description: "Personal AI assistant"
  created: "2025-02-17T00:00:00Z"
  updated: "2025-02-17T00:00:00Z"

owner:
  address: "0x..."
  localId: "user123"
  ens: "user.eth"
  name: "User Name"

harnesses:
  - forge
  - claude-code
  - openclaw
  - opencode

embedding:
  provider: ollama
  model: nomic-embed-text
  dimensions: 768
  base_url: http://localhost:11434

search:
  alpha: 0.7
  top_k: 20
  min_score: 0.3

memory:
  database: memory/memories.db
  session_budget: 2000
  decay_rate: 0.95
  synthesis:
    harness: openclaw
    model: sonnet
    schedule: daily
    max_tokens: 4000
  pipelineV2:
    enabled: true
    shadowMode: false
    extraction:
      provider: ollama
      model: qwen3:4b
    synthesis:
      enabled: true
      provider: ollama
      model: qwen3:4b
    graph:
      enabled: true
    autonomous:
      enabled: true
      maintenanceMode: execute

hooks:
  sessionStart:
    recallLimit: 10
    includeIdentity: true
    includeRecentContext: true
    recencyBias: 0.7
  userPromptSubmit:
    enabled: true
    recallLimit: 10
    maxInjectChars: 500
    minScore: 0.8
  preCompaction:
    includeRecentMemories: true
    memoryLimit: 5

auth:
  mode: local
  defaultTokenTtlSeconds: 604800
  sessionTokenTtlSeconds: 86400

trust:
  verification: none
```

Core agent identity metadata.
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | yes | Agent display name |
| `description` | string | no | Short description |
| `created` | string | yes | ISO 8601 creation timestamp |
| `updated` | string | yes | ISO 8601 last update timestamp |
Optional owner identification. Reserved for future cryptographic identity verification.
| Field | Type | Description |
|---|---|---|
| `address` | string | Cryptographic identity address or external identity ID, reserved for future use |
| `localId` | string | Local user identifier |
| `ens` | string | Optional ENS or human-friendly identity alias |
| `name` | string | Human-readable name |
List of AI platforms to integrate with. Valid values: forge,
claude-code, opencode, openclaw, and codex. Support for
cursor, windsurf, chatgpt, and gemini is planned.
Vector embedding configuration for semantic memory search.
| Field | Type | Default | Description |
|---|---|---|---|
| `provider` | string | `"ollama"` | `"ollama"` or `"openai"` |
| `model` | string | `"nomic-embed-text"` | Embedding model name |
| `dimensions` | number | `768` | Output vector dimensions |
| `base_url` | string | `"http://localhost:11434"` | Ollama API base URL |
| `api_key` | string | — | API key or `$secret:NAME` reference |
Recommended Ollama models:
| Model | Dimensions | Notes |
|---|---|---|
| `nomic-embed-text` | 768 | Default; good quality/speed balance |
| `all-minilm` | 384 | Faster, smaller vectors |
| `mxbai-embed-large` | 1024 | Better quality, more resource usage |
Recommended OpenAI models:
| Model | Dimensions | Notes |
|---|---|---|
| `text-embedding-3-small` | 1536 | Cost-effective |
| `text-embedding-3-large` | 3072 | Highest quality |
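Putting the pieces together, a remote OpenAI embedding setup might look like the sketch below. The `dimensions` value must match the chosen model's output size:

```yaml
embedding:
  provider: openai
  model: text-embedding-3-small
  dimensions: 1536                 # must match the model's output size
  api_key: $secret:OPENAI_API_KEY
```

Switching embedding models changes the stored vector dimensions, so existing embeddings should be regenerated against the new model.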
Rather than putting an API key in plain text, store it with
`signet secret put OPENAI_API_KEY` and reference it as:

```yaml
api_key: $secret:OPENAI_API_KEY
```

Hybrid search tuning. Controls the blend between semantic (vector) and keyword (BM25) retrieval.
| Field | Type | Default | Description |
|---|---|---|---|
| `alpha` | number | `0.7` | Vector weight 0-1. Higher = more semantic. |
| `top_k` | number | `20` | Candidate count fetched from each source |
| `min_score` | number | `0.3` | Minimum combined score to return a result |
At alpha: 0.9 results are heavily semantic, suitable for conceptual
queries. At alpha: 0.3 results skew toward keyword matching, better for
exact-phrase lookups. The default of 0.7 works well generally.
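For example, a workspace used mostly for exact-phrase lookups could lean keyword-heavy:

```yaml
search:
  alpha: 0.3     # keyword-leaning; better for exact-phrase lookups
  top_k: 20
  min_score: 0.3
```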
Memory system settings.
| Field | Type | Default | Description |
|---|---|---|---|
| `database` | string | `"memory/memories.db"` | SQLite path (relative to the active workspace) |
| `session_budget` | number | `2000` | Character limit for session context injection |
| `decay_rate` | number | `0.95` | Daily importance decay factor for non-pinned memories |
Non-pinned memories lose importance over time using the formula:
importance(t) = base_importance × decay_rate^days_since_access
Accessing a memory resets the decay timer.
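As a concrete illustration of the decay formula: with the default rate, a memory left untouched for two weeks retains roughly half its base importance (0.95^14 ≈ 0.49).

```yaml
memory:
  decay_rate: 0.95   # after 14 days without access: importance ≈ base × 0.49
```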
Configuration for periodic MEMORY.md regeneration. The synthesis
process reads all memories and asks a model to write a coherent summary.
| Field | Type | Default | Description |
|---|---|---|---|
| `harness` | string | `"openclaw"` | Which harness runs synthesis (forge, openclaw, claude-code, codex, opencode) |
| `model` | string | `"sonnet"` | Model identifier |
| `schedule` | string | `"daily"` | `"daily"`, `"weekly"`, or `"on-demand"` |
| `max_tokens` | number | `4000` | Max output tokens |
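For example, a sketch that runs synthesis weekly through Claude Code instead of the defaults (values illustrative):

```yaml
memory:
  synthesis:
    harness: claude-code
    model: sonnet
    schedule: weekly
    max_tokens: 4000
```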
Signet's shared inference control plane is configured under the top-level
inference key in agent.yaml.
If inference is omitted, Signet preserves the old behavior by compiling
memory.pipelineV2.extraction and memory.pipelineV2.synthesis into an
implicit inference profile. That keeps existing agents working without change.
Use inference when you want Signet to choose models across harnesses,
accounts, APIs, and local runtimes per turn or per subtask.
Example:
```yaml
inference:
  enabled: true
  defaultPolicy: auto
  accounts:
    claude-dot:
      kind: subscription_session
      providerFamily: anthropic
      label: Dot Claude Connected
      sessionRef: CLAUDE_DOT_SESSION
    openrouter-main:
      kind: api
      providerFamily: openrouter
      credentialRef: OPENROUTER_API_KEY
  targets:
    opus:
      executor: claude-code
      account: claude-dot
      models:
        opus46:
          model: opus-4.6
          reasoning: high
          toolUse: true
          streaming: true
    sonnet:
      executor: openrouter
      account: openrouter-main
      privacy: remote_ok
      endpoint: https://openrouter.ai/api/v1
      models:
        default:
          model: anthropic/claude-sonnet-4-6
          reasoning: medium
          toolUse: true
          streaming: true
          costTier: medium
    local:
      executor: ollama
      endpoint: http://127.0.0.1:11434
      privacy: local_only
      models:
        gemma4:
          model: gemma4
          reasoning: medium
          streaming: true
          costTier: low
  policies:
    auto:
      mode: automatic
      defaultTargets:
        - opus/opus46
        - sonnet/default
        - local/gemma4
  taskClasses:
    casual_chat:
      reasoning: medium
      preferredTargets:
        - sonnet/default
    hard_coding:
      reasoning: high
      toolsRequired: true
      preferredTargets:
        - opus/opus46
    hipaa_sensitive:
      privacy: local_only
      preferredTargets:
        - local/gemma4
  workloads:
    interactive:
      policy: auto
      taskClass: casual_chat
    memoryExtraction:
      policy: auto
      taskClass: casual_chat
    sessionSynthesis:
      policy: auto
      taskClass: casual_chat
  agents:
    rose:
      defaultPolicy: auto
      roster:
        - opus/opus46
        - sonnet/default
        - local/gemma4
      pinnedTargets:
        hard_coding: opus/opus46
```

Named account or credential identities used by targets.
| Field | Type | Description |
|---|---|---|
| `kind` | string | `subscription_session` or `api` |
| `providerFamily` | string | Provider family label, for example `anthropic`, `openai`, `openrouter` |
| `label` | string | Human-readable account label |
| `credentialRef` | string | Secret name or env var name for API-backed targets |
| `sessionRef` | string | Session identifier for subscription-backed targets |
| `usageTier` | string | Optional account tier label |
Executable route targets. A target can be a local runtime, API backend, subscription-backed CLI session, or gateway.
| Field | Type | Description |
|---|---|---|
| `executor` | string | `claude-code`, `codex`, `opencode`, `anthropic`, `openrouter`, `ollama`, `llama-cpp`, `openai-compatible`, or `command` |
| `kind` | string | Optional explicit target kind. Inferred when omitted |
| `account` | string | Account id from `inference.accounts` |
| `endpoint` | string | Optional base URL override |
| `command` | object | Command executor config with `bin`, optional `args`, `cwd`, and `env` |
| `privacy` | string | `remote_ok`, `restricted_remote`, or `local_only` |
| `models` | map | Named model entries for this target |
Model fields:
| Field | Type | Description |
|---|---|---|
| `model` | string | Provider-native model identifier |
| `label` | string | Optional display label |
| `reasoning` | string | `low`, `medium`, or `high` |
| `contextWindow` | number | Maximum prompt tokens the model can accept |
| `toolUse` | boolean | Whether tool use is supported |
| `streaming` | boolean | Whether streaming is supported |
| `multimodal` | boolean | Whether multimodal input is supported |
| `costTier` | string | `low`, `medium`, or `high` |
| `averageLatencyMs` | number | Optional routing latency hint |
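The `command` executor routes a prompt through an arbitrary local process: the prompt is sent on stdin (and exposed as `SIGNET_PROMPT`), and the response is read from stdout. A minimal sketch, where the target name and wrapper script are hypothetical:

```yaml
inference:
  targets:
    my-wrapper:
      executor: command
      privacy: local_only
      command:
        bin: ./scripts/run-model.sh   # hypothetical wrapper: reads prompt on stdin, writes response to stdout
      models:
        default:
          model: local-custom          # illustrative model label
          streaming: false
```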
Named routing policies that agents and workloads reference.
| Field | Type | Description |
|---|---|---|
| `mode` | string | `strict`, `automatic`, or `hybrid` |
| `allow` | array | Route refs allowed by the policy |
| `deny` | array | Route refs denied by the policy |
| `defaultTargets` | array | Ordered preferred target refs |
| `taskTargets` | map | Task-class specific preferred target refs |
| `fallbackTargets` | array | Explicit fallback refs |
| `maxLatencyMs` | number | Hard latency ceiling used by routing |
| `costCeiling` | string | Hard cost ceiling used by routing |
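A strict policy can hard-restrict routing to a fixed set of targets. A sketch, assuming a local Ollama target named `local/gemma4` exists (the policy id is illustrative):

```yaml
inference:
  policies:
    local-strict:
      mode: strict          # only routes on the allow list are eligible
      allow:
        - local/gemma4
      defaultTargets:
        - local/gemma4
```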
Task-family hints for automatic routing.
| Field | Type | Description |
|---|---|---|
| `reasoning` | string | Required reasoning depth |
| `toolsRequired` | boolean | Require tool use support |
| `streamingPreferred` | boolean | Prefer or require streaming support |
| `multimodalRequired` | boolean | Require multimodal support |
| `privacy` | string | Hard privacy tier, including `local_only` |
| `maxLatencyMs` | number | Task latency budget |
| `costCeiling` | string | Task cost ceiling |
| `expectedInputTokens` | number | Prompt-size hint |
| `expectedOutputTokens` | number | Output-size hint |
| `preferredTargets` | array | Preferred target refs |
| `keywords` | array | Lightweight classifier keywords |
Binds Signet-owned workloads to router policies or explicit targets.
Supported workload keys: `interactive`, `memoryExtraction`, and `sessionSynthesis`.
Each workload can define:
| Field | Type | Description |
|---|---|---|
| `policy` | string | Named policy id |
| `taskClass` | string | Default task class for this workload |
| `target` | string | Explicit target/model pin |
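For example, a sketch that pins memory extraction to a local target while leaving interactive traffic on the automatic policy (target ref illustrative; it must exist under `inference.targets`):

```yaml
inference:
  workloads:
    interactive:
      policy: auto
    memoryExtraction:
      target: local/gemma4   # explicit pin; bypasses policy selection for this workload
```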
Per-agent routing overrides.
| Field | Type | Description |
|---|---|---|
| `defaultPolicy` | string | Default policy for that agent |
| `roster` | array | Allowed target refs for that agent |
| `preferredTargets` | map | Task-class target preferences |
| `pinnedTargets` | map | Hard pins, usually managed by `signet route pin` |
The V2 [[pipeline|memory pipeline]] lives at packages/daemon/src/pipeline/. It runs
LLM-based fact extraction against incoming conversation text, then decides
whether to write new memories, update existing ones, or skip. Config lives
under memory.pipelineV2 in agent.yaml.
Inference selection for extraction and session synthesis can also be routed
through the top-level inference.workloads bindings. When explicit routing is
enabled for default, memoryExtraction, sessionSynthesis, widgetGeneration, or repair, those workloads use the
shared inference control plane. Legacy extraction and synthesis fields are treated as load-time compatibility input, not separate runtime providers.
The config uses a nested structure with grouped sub-objects. Legacy flat
keys (e.g. extractionModel, workerPollMs) are still supported for
backward compatibility, but nested keys take precedence when both are
present.
Enable the pipeline:
```yaml
memory:
  pipelineV2:
    enabled: true
    shadowMode: true   # extract without writing — safe first step
    extraction:
      provider: ollama
      model: qwen3:4b
```

These top-level boolean fields gate major pipeline behaviors.
| Field | Default | Description |
|---|---|---|
| `enabled` | `true` | Master switch. Pipeline does nothing when false. |
| `shadowMode` | `false` | Extract facts but skip writes. Useful for evaluation. |
| `mutationsFrozen` | `false` | Allow reads; block all writes. Overrides `shadowMode`. |
| `semanticContradictionEnabled` | `false` | Enable LLM-based semantic contradiction detection for UPDATE/DELETE proposals. |
| `telemetryEnabled` | `false` | Enable anonymous telemetry reporting. |
The relationship between shadowMode and mutationsFrozen matters:
shadowMode suppresses writes from the normal extraction path only;
mutationsFrozen is a harder freeze that blocks all write paths
including repairs and graph updates.
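For example, to freeze all pipeline writes during an investigation while keeping reads available:

```yaml
memory:
  pipelineV2:
    mutationsFrozen: true   # hard freeze: blocks extraction writes, repairs, and graph updates
```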
Controls the LLM-based extraction stage. Supports multiple providers.
| Field | Default | Range | Description |
|---|---|---|---|
| `provider` | `"ollama"` | — | `"none"`, `"ollama"`, `"claude-code"`, `"opencode"`, `"codex"`, `"anthropic"`, `"openrouter"`, or `"command"` |
| `model` | `"qwen3:4b"` | — | Model name for the configured provider |
| `timeout` | `90000` | 5000-300000 ms | Extraction call timeout |
| `minConfidence` | `0.7` | 0.0-1.0 | Confidence threshold; facts below this are dropped |
| `structuredOutput` | `true` | — | Send JSON schema in the `format` field of LLM requests. Set false when the provider rejects structured output (e.g. GitHub Copilot API). The daemon also auto-detects unsupported providers at runtime and disables this transparently. |
| `command` | — | — | Command provider config (`bin`, `args[]`, optional `cwd`, optional `env`) — required when `provider: "command"` |
| `rateLimit.maxCallsPerHour` | `200` when `rateLimit` is set | 0-10000 | Max extraction-provider calls per hour; set 0 to disable rate limiting |
| `rateLimit.burstSize` | `20` when `rateLimit` is set | 1-1000 | Max burst size before throttling begins |
| `rateLimit.waitTimeoutMs` | `5000` when `rateLimit` is set | 0-60000 ms | How long to wait for a token before failing with `RateLimitExceededError` |
For safety, the intended extraction setups are:

- `claude-code` on a Haiku model
- `codex` on a GPT Mini model
- local `ollama` with `nemotron-3-nano:4b` (preferred) or `qwen3:4b` (deprecated; Nemotron's stronger reasoning makes it the better default, and expect degraded extraction quality with Qwen3 in future updates)
Set provider: none to disable extraction entirely, which is the
recommended default for VPS installs that should not make background LLM
calls.
Remote API extraction can accumulate extreme fees quickly because the
pipeline runs continuously in the background. Use anthropic,
openrouter, or remote OpenCode routes only when you explicitly want
that billing behavior.
rateLimit is opt-in. If the stanza is omitted, Signet preserves the
provider's existing behavior with no throughput throttling. When
configured, it applies only to remote or paid providers
(claude-code, anthropic, openrouter, codex, opencode).
Ollama and command providers are always exempt. If you set rateLimit
on an exempt provider, Signet logs a warning and passes calls through
unthrottled.
An empty rateLimit: {} block is treated as disabled. Set at least one
sub-field to opt in, or omit the stanza entirely to leave rate limiting
off.
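A minimal opt-in `rateLimit` stanza for a remote-backed extraction provider, using the documented defaults (the model alias is illustrative):

```yaml
memory:
  pipelineV2:
    extraction:
      provider: claude-code
      model: haiku              # illustrative model alias
      rateLimit:
        maxCallsPerHour: 200
        burstSize: 20
        waitTimeoutMs: 5000
```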
When a rate-limited job fails (the bucket is empty and the wait timeout
expires), it is classified as non-retryable and sent directly to
dead-letter status. Dead-lettered jobs are not retried when the rate-limit
window resets. Choose maxCallsPerHour high enough to handle sustained
ingestion bursts, or you will permanently lose extraction for memories
queued during exhaustion. Dead-letter jobs are purged after 30 days by
the retention worker.
When configured via YAML, burstSize is clamped to a minimum of 1.
The lower-level withRateLimit() helper is more defensive: passing
burstSize: 0 or maxCallsPerHour: 0 disables the wrapper entirely
instead of constructing a limiter that can never acquire a token.
Rate-limiter state is in-memory only. After a daemon restart the full
burstSize is available immediately (the token bucket starts full). In
environments with frequent restarts (crash-loops, rolling deployments),
this means the limiter cannot protect against a burst of calls right
after startup. Set burstSize conservatively if your daemon restarts
often under load.
When using ollama, the model must be available locally. When using
claude-code, the Claude Code CLI must be on PATH. codex uses the
Codex CLI as the extraction provider. Lower minConfidence to capture
more facts at the cost of noise; raise it to write only high-confidence
facts.
There are two command paths with different contracts. Top-level
inference.targets.*.executor: command is a normal inference provider: the
prompt is sent on stdin, exposed as SIGNET_PROMPT, and the model response is
read from stdout.
Legacy memory.pipelineV2.extraction.provider: command keeps the old
side-effecting extractor contract. The summary worker executes
memory.pipelineV2.extraction.command, writes the transcript to a temporary
file, substitutes that path into args/env, and expects the command to write
memories to Signet state directly. Stdout and stderr are ignored except for
process failure.
Available legacy extraction command tokens:
- `$TRANSCRIPT` (alias `$TRANSCRIPT_PATH`) — temp transcript file path
- `$SESSION_KEY` — session key (or empty string)
- `$PROJECT` — project path (or empty string)
- `$AGENT_ID` — agent id for the queued job
- `$SIGNET_PATH` — active Signet workspace path
For safety, user-derived tokens ($SESSION_KEY, $PROJECT, $TRANSCRIPT) are
intended for args/env substitution. Keep bin and cwd fixed, or use only
trusted $SIGNET_PATH / $AGENT_ID there.
Example:
```yaml
memory:
  pipelineV2:
    extraction:
      provider: command
      command:
        bin: node
        args:
          - ./scripts/custom-extractor.mjs
          - --transcript
          - $TRANSCRIPT
          - --session
          - $SESSION_KEY
```

Controls the provider used by the summary-worker for session summaries.
This is separate from fact extraction once explicitly configured.
If the synthesis block is omitted entirely, Signet falls back to the
resolved extraction provider, model, endpoint, and timeout. When an explicit
top-level inference: block exists, workload bindings decide which target
handles synthesis.
| Field | Default | Range | Description |
|---|---|---|---|
| `enabled` | `true` | — | Enable background session summary generation |
| `provider` | inherited from extraction when omitted | — | `"none"`, `"ollama"`, `"claude-code"`, `"codex"`, `"opencode"`, `"anthropic"`, or `"openrouter"` |
| `model` | inherited from extraction when omitted | — | Model name for the configured provider |
| `endpoint` | inherited from extraction when omitted | — | Optional base URL override for Ollama, OpenCode, or OpenRouter |
| `timeout` | inherited from extraction when omitted | 5000-300000 ms | Summary generation timeout |
| `structuredOutput` | inherited from extraction when omitted | — | Send JSON schema in the `format` field of LLM requests. Set false when the synthesis provider rejects structured output (e.g. GitHub Copilot API). Falls back to `extraction.structuredOutput` when omitted. |
| `rateLimit.maxCallsPerHour` | `200` when `rateLimit` is set | 0-10000 | Max synthesis-provider calls per hour; set 0 to disable rate limiting |
| `rateLimit.burstSize` | `20` when `rateLimit` is set | 1-1000 | Max burst size before throttling begins |
| `rateLimit.waitTimeoutMs` | `5000` when `rateLimit` is set | 0-60000 ms | How long to wait for a token before failing with `RateLimitExceededError` |
Set provider: none or enabled: false to disable background session
summary synthesis entirely.
synthesis.provider: command is invalid and rejected during config load.
Widget HTML generation uses a separate provider instance by default, so
widget traffic does not consume the synthesis pipeline's rateLimit
bucket.
As with extraction, an empty rateLimit: {} block is treated as
disabled. Set at least one sub-field to opt in.
Rate-limited synthesis jobs that fail are sent to dead-letter without
retry. See the extraction rateLimit docs above for the full warning.
The pipeline processes jobs through a queue with lease-based concurrency control.
| Field | Default | Range | Description |
|---|---|---|---|
| `pollMs` | `2000` | 100-60000 ms | How often the worker polls for pending jobs |
| `maxRetries` | `3` | 1-10 | Max retry attempts before a job goes to dead-letter |
| `leaseTimeoutMs` | `300000` | 10000-600000 ms | Time before an uncompleted job lease expires |
| `maxLoadPerCpu` | `0.8` | 0.1-8.0 | Load-per-CPU threshold above which extraction polling is deferred |
| `overloadBackoffMs` | `30000` | 1000-300000 ms | Delay between poll attempts while host load stays above threshold |
A job that exceeds maxRetries moves to dead-letter status and is
eventually purged by the retention worker.
Legacy flat keys workerMaxLoadPerCpu and workerOverloadBackoffMs are
still accepted for backward compatibility.
When graph.enabled: true, the pipeline builds entity-relationship links
from extracted facts and uses them to boost search relevance.
| Field | Default | Range | Description |
|---|---|---|---|
| `enabled` | `true` | — | Enable knowledge graph building and querying |
| `boostWeight` | `0.15` | 0.0-1.0 | Weight applied to graph-neighbor score boost |
| `boostTimeoutMs` | `500` | 50-5000 ms | Timeout for graph lookup during search |
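A sketch with the defaults spelled out, nested under `memory.pipelineV2.graph`:

```yaml
memory:
  pipelineV2:
    graph:
      enabled: true
      boostWeight: 0.15    # contribution of graph-neighbor boost to the final score
      boostTimeoutMs: 500  # skip the boost if the graph lookup exceeds this
```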
Structural workers classify extracted facts into entity aspects, extract direct entity dependencies from facts, and synthesize cross-entity dependency edges from the existing graph.
| Field | Default | Range | Description |
|---|---|---|---|
| `enabled` | `true` | — | Enable structural classification and dependency workers |
| `classifyBatchSize` | `8` | 1-20 | Max facts per entity classification call |
| `dependencyBatchSize` | `5` | 1-10 | Max stale entities or dependency jobs per worker tick |
| `pollIntervalMs` | `10000` | 2000-120000 ms | Structural job polling interval |
| `synthesisEnabled` | `true` | — | Enable cross-entity dependency synthesis |
| `synthesisIntervalMs` | `60000` | 10000-600000 ms | Dependency synthesis polling interval |
| `synthesisTopEntities` | `20` | 5-100 | Candidate entities considered per synthesis call |
| `synthesisMaxFacts` | `10` | 3-50 | Facts included for the focal entity |
| `synthesisMaxStallMs` | `1800000` | 0-86400000 ms | Pause dependency synthesis when extraction has made no successful progress for this long; set 0 to disable the circuit breaker |
The aliases dependencySynthesis.maxStallMs and
dependencySynthesis.synthesisMaxStallMs are accepted for
structural.synthesisMaxStallMs.
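A sketch of the structural worker settings with their defaults, nested under `memory.pipelineV2.structural`:

```yaml
memory:
  pipelineV2:
    structural:
      enabled: true
      classifyBatchSize: 8
      pollIntervalMs: 10000
      synthesisEnabled: true
      synthesisMaxStallMs: 1800000   # 30 min circuit breaker; 0 disables
```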
Prospective indexing generates hypothetical future queries at write time. These "hints" are indexed in FTS5 so memories match by anticipated cue, not just stored content. For example, a memory about "switched from PostgreSQL to SQLite" might generate hints like "database migration", "why SQLite", and "storage engine decision" — queries the user is likely to ask later.
| Field | Default | Range | Description |
|---|---|---|---|
| `enabled` | `true` | — | Enable prospective indexing |
| `max` | `5` | 1-20 | Maximum hints generated per memory |
| `timeout` | `30000` | 5000-120000 ms | Hint generation LLM timeout |
| `maxTokens` | `256` | 32-1024 | Max tokens for hint generation |
| `poll` | `5000` | 1000-60000 ms | Job polling interval |
```yaml
memory:
  pipelineV2:
    hints:
      enabled: true
      max: 5
      timeout: 30000
      maxTokens: 256
      poll: 5000
```

Graph traversal controls how the knowledge graph is walked during
retrieval. When primary: true, graph traversal produces the base
candidate pool and flat search fills gaps. When primary: false,
traditional hybrid search runs first with graph boost as
supplementary.
| Field | Default | Range | Description |
|---|---|---|---|
| `enabled` | `true` | — | Enable graph traversal |
| `primary` | `true` | — | Use traversal as primary retrieval strategy |
| `maxAspectsPerEntity` | `10` | 1-50 | Max aspects to collect per entity |
| `maxAttributesPerAspect` | `20` | 1-100 | Max attributes per aspect |
| `maxDependencyHops` | `10` | 1-50 | Max hops for dependency walking |
| `minDependencyStrength` | `0.3` | 0.0-1.0 | Minimum edge strength to follow |
| `maxBranching` | `4` | 1-20 | Max branching factor during traversal |
| `maxTraversalPaths` | `50` | 1-500 | Max paths to explore |
| `minConfidence` | `0.5` | 0.0-1.0 | Minimum confidence for results |
| `timeoutMs` | `500` | 50-5000 ms | Traversal timeout |
| `boostWeight` | `0.2` | 0.0-1.0 | Weight for traversal boost in hybrid search |
| `constraintBudgetChars` | `1000` | 100-10000 | Character budget for constraint injection |
```yaml
memory:
  pipelineV2:
    traversal:
      enabled: true
      primary: true
      maxAspectsPerEntity: 10
      maxAttributesPerAspect: 20
      maxDependencyHops: 10
      minDependencyStrength: 0.3
      maxBranching: 4
      maxTraversalPaths: 50
      minConfidence: 0.5
      timeoutMs: 500
      boostWeight: 0.2
      constraintBudgetChars: 1000
```

The `primary` flag determines the retrieval strategy. In primary mode,
entities are extracted from the query, the graph is walked to collect
related memories, and flat hybrid search only runs to fill remaining
slots. In supplementary mode (primary: false), the standard hybrid
search runs first and traversal results are blended in using
boostWeight. Primary mode is faster for entity-dense queries;
supplementary mode is more conservative and better for freeform text.
An optional reranking pass that runs after initial retrieval. An embedding-based reranker is built in (uses cached vectors, no extra LLM calls). Optionally, reranking can call the active extraction provider model.
| Field | Default | Range | Description |
|---|---|---|---|
| `enabled` | `true` | — | Enable the reranking pass |
| `model` | `""` | — | Model name for the reranker (empty uses embedding-based) |
| `useExtractionModel` | `false` | — | When true, use the extraction provider LLM for reranking and emit a synthesized summary card |
| `topN` | `20` | 1-100 | Number of candidates to pass to the reranker |
| `timeoutMs` | `2000` | 100-30000 ms | Timeout for the reranking call |
Controls autonomous maintenance, repair, and mutation behavior.
| Field | Default | Description |
|---|---|---|
| `enabled` | `true` | Allow autonomous pipeline operations (maintenance, repair). |
| `frozen` | `false` | Block autonomous writes; autonomous reads still allowed. |
| `allowUpdateDelete` | `true` | Permit the pipeline to update or delete existing memories. |
| `maintenanceIntervalMs` | `1800000` | How often maintenance runs (30 min). Range: 60s-24h. |
| `maintenanceMode` | `"execute"` | `"observe"` logs issues; `"execute"` attempts repairs. |
In "observe" mode the worker emits structured log events but makes no
changes. When frozen is true, the maintenance interval never starts,
though the worker's tick() method remains callable for on-demand
inspection.
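For a cautious rollout, maintenance can be switched to observe-only while also forbidding mutations of existing memories:

```yaml
memory:
  pipelineV2:
    autonomous:
      enabled: true
      maintenanceMode: observe   # log issues without attempting repairs
      allowUpdateDelete: false   # never update or delete existing memories
```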
Repair sub-workers limit how aggressively they re-embed, re-queue, or deduplicate items to avoid overloading providers.
| Field | Default | Range | Description |
|---|---|---|---|
| `reembedCooldownMs` | `300000` | 10s-1h | Min time between re-embed batches |
| `reembedHourlyBudget` | `10` | 1-1000 | Max re-embed operations per hour |
| `requeueCooldownMs` | `60000` | 5s-1h | Min time between re-queue batches |
| `requeueHourlyBudget` | `50` | 1-1000 | Max re-queue operations per hour |
| `dedupCooldownMs` | `600000` | 10s-1h | Min time between dedup batches |
| `dedupHourlyBudget` | `3` | 1-100 | Max dedup operations per hour |
| `dedupSemanticThreshold` | `0.92` | 0.0-1.0 | Cosine similarity threshold for semantic dedup |
| `dedupBatchSize` | `100` | 10-1000 | Max candidates evaluated per dedup batch |
Controls chunking for ingesting large documents into the memory store.
| Field | Default | Range | Description |
|---|---|---|---|
| `workerIntervalMs` | `10000` | 1s-300s | Poll interval for pending document jobs |
| `chunkSize` | `2000` | 200-50000 | Target chunk size in characters |
| `chunkOverlap` | `200` | 0-10000 | Overlap between adjacent chunks (chars) |
| `maxContentBytes` | `10485760` | 1 KB-100 MB | Max document size accepted |
Chunk overlap ensures context is not lost at chunk boundaries. A value of
10-15% of chunkSize is a reasonable starting point.
Content size limits applied during extraction and recall to prevent oversized content from degrading pipeline performance.
| Field | Default | Range | Description |
|---|---|---|---|
| `maxContentChars` | `500` | 50-100000 | Max characters stored per memory |
| `chunkTargetChars` | `300` | 50-50000 | Target chunk size for content splitting |
| `recallTruncateChars` | `500` | 50-100000 | Max characters returned per memory in recall results |
These limits are enforced at the pipeline level. Content exceeding
maxContentChars is truncated before storage. Recall results are
truncated at recallTruncateChars to keep session context budgets
predictable.
Session checkpoint configuration for continuity recovery. Checkpoints capture periodic snapshots of session state (focus, prompts, memory activity) to aid recovery after context compaction or session restart.
| Field | Default | Range | Description |
|---|---|---|---|
| `enabled` | `true` | — | Master switch for session checkpoints |
| `promptInterval` | `10` | 1-1000 | Prompts between periodic checkpoints |
| `timeIntervalMs` | `900000` | 60s-1h | Time between periodic checkpoints (15 min default) |
| `maxCheckpointsPerSession` | `50` | 1-500 | Per-session checkpoint cap (oldest pruned) |
| `retentionDays` | `7` | 1-90 | Days before old checkpoints are hard-deleted |
| `recoveryBudgetChars` | `2000` | 200-10000 | Max characters for recovery digest |
Checkpoints are triggered by five events: periodic, pre_compaction,
session_end, agent, and explicit. Secrets are redacted before
storage.
Anonymous usage telemetry. Only active when telemetryEnabled: true.
Events are batched and flushed periodically.
| Field | Default | Range | Description |
|---|---|---|---|
| `posthogHost` | `""` | — | PostHog instance URL (empty disables) |
| `posthogApiKey` | `""` | — | PostHog project API key |
| `flushIntervalMs` | `60000` | 5s-10min | Time between event flushes |
| `flushBatchSize` | `50` | 1-500 | Max events per flush batch |
| `retentionDays` | `90` | 1-365 | Days before local telemetry data is purged |
Background polling loop that detects stale or missing embeddings and refreshes them in small batches. Runs alongside the extraction pipeline.
| Field | Default | Range | Description |
|---|---|---|---|
| `enabled` | `true` | — | Master switch |
| `pollMs` | `5000` | 1s-60s | Polling interval between refresh cycles |
| `batchSize` | `8` | 1-20 | Max embeddings refreshed per cycle |
The tracker detects embeddings that are missing, have a stale content
hash, or were produced by a different model than the currently configured
one. It uses setTimeout chains for natural backpressure.
Auth configuration lives under the auth key in agent.yaml. Signet
uses short-lived signed tokens for dashboard and API access.
```yaml
auth:
  mode: local
  defaultTokenTtlSeconds: 604800   # 7 days
  sessionTokenTtlSeconds: 86400    # 24 hours
  rateLimits:
    forget:
      windowMs: 60000
      max: 30
    modify:
      windowMs: 60000
      max: 60
    inferenceExplain:
      windowMs: 60000
      max: 120
    inferenceExecute:
      windowMs: 60000
      max: 20
    inferenceGateway:
      windowMs: 60000
      max: 30
```

| Field | Default | Description |
|---|---|---|
| `mode` | `"local"` | Auth mode: `"local"`, `"team"`, or `"hybrid"` |
| `defaultTokenTtlSeconds` | `604800` | API token lifetime (7 days) |
| `sessionTokenTtlSeconds` | `86400` | Session token lifetime (24 hours) |
In "local" mode the token secret is generated automatically and stored
at $SIGNET_WORKSPACE/.daemon/auth-secret. In "team" and "hybrid" modes,
the daemon validates HMAC-signed bearer tokens with role and scope
claims.
Rate limits are sliding-window counters that reset on daemon restart. Each key controls a category of potentially destructive operations.
| Operation | Default window | Default max | Description |
|---|---|---|---|
| `forget` | 60 s | 30 | Soft-delete a memory |
| `modify` | 60 s | 60 | Update memory content |
| `batchForget` | 60 s | 5 | Bulk soft-delete |
| `forceDelete` | 60 s | 3 | Hard-delete (bypasses tombstone) |
| `admin` | 60 s | 10 | Admin API operations |
| `inferenceExplain` | 60 s | 120 | Dry-run route decisions |
| `inferenceExecute` | 60 s | 20 | Native routed prompt execution |
| `inferenceGateway` | 60 s | 30 | OpenAI-compatible gateway completions |
| `recallLlm` | 60 s | 60 | LLM-backed recall summarization |
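A minimal sliding-window counter of the kind described above might look like this (an illustrative sketch, not Signet's implementation):

```typescript
class SlidingWindowLimiter {
  private hits: number[] = []; // timestamps (ms) of allowed operations

  constructor(private windowMs: number, private max: number) {}

  // Returns true if the operation is allowed, false if rate-limited.
  allow(now: number = Date.now()): boolean {
    // Drop hits older than the window, then check capacity.
    this.hits = this.hits.filter((t) => now - t < this.windowMs);
    if (this.hits.length >= this.max) return false;
    this.hits.push(now);
    return true;
  }
}
```

Because state lives only in memory, counters reset whenever the daemon restarts, matching the behavior documented above.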
Override any limit under `auth.rateLimits.<operation>`:

```yaml
auth:
  rateLimits:
    forceDelete:
      windowMs: 60000
      max: 1
```

The retention worker runs on a fixed interval and purges data that has exceeded its retention window. It is not directly configurable in `agent.yaml`; the defaults below are compiled in and apply unconditionally when the pipeline is running.
| Field | Default | Description |
|---|---|---|
| `intervalMs` | `21600000` | Sweep frequency (6 hours) |
| `tombstoneRetentionMs` | `2592000000` | Soft-deleted memories kept for 30 days before hard purge |
| `historyRetentionMs` | `15552000000` | Memory history events kept for 180 days |
| `completedJobRetentionMs` | `1209600000` | Completed pipeline jobs kept for 14 days |
| `deadJobRetentionMs` | `2592000000` | Dead-letter jobs kept for 30 days |
| `batchLimit` | `500` | Max rows purged per step per sweep (backpressure) |
The retention worker also cleans up graph links and embeddings that belong to purged tombstones, and removes orphaned entity nodes that have no remaining mentions. The `batchLimit` prevents a single sweep from locking the database for too long under high load.

Soft-deleted memories remain recoverable via `POST /api/memory/:id/recover` until their tombstone window expires.
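The tombstone purge decision reduces to a simple age check against `tombstoneRetentionMs`; a sketch under the default 30-day window:

```typescript
// Default tombstone retention: 30 days, matching 2592000000 ms in the table above.
const TOMBSTONE_RETENTION_MS = 30 * 24 * 60 * 60 * 1000;

// Illustrative: is a soft-deleted memory past its recovery window?
function tombstoneExpired(
  deletedAt: string, // ISO timestamp from the deleted_at column
  nowMs: number,
  retentionMs: number = TOMBSTONE_RETENTION_MS,
): boolean {
  return nowMs - Date.parse(deletedAt) >= retentionMs;
}
```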
Controls what Signet injects during [[harnesses|harness]] lifecycle events. See [[hooks]] for full details.
```yaml
hooks:
  sessionStart:
    recallLimit: 10
    includeIdentity: true
    includeRecentContext: true
    recencyBias: 0.7
  userPromptSubmit:
    enabled: true
    recallLimit: 10
    maxInjectChars: 500
    minScore: 0.8
  preCompaction:
    includeRecentMemories: true
    memoryLimit: 5
    summaryGuidelines: "Focus on technical decisions."
```

`hooks.sessionStart` controls what is injected at the start of a new harness session:
| Field | Default | Description |
|---|---|---|
| `recallLimit` | `50` | Number of memories to inject |
| `candidatePoolLimit` | `100` | Number of candidate memories to rank before token budgeting |
| `includeIdentity` | `true` | Include agent name and description |
| `includeRecentContext` | `true` | Include `MEMORY.md` content |
| `recencyBias` | `0.7` | Weight toward recent vs. important memories (0-1) |
| `maxInjectTokens` | `12000` | Maximum session-start injection budget after context assembly |
Predicted context from recent session summaries is scoped to the active project. If the harness does not provide a project path, Signet skips predicted-context FTS at session start to avoid global broad-term scans over large memory stores.
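One way to read `recencyBias` is as a linear blend between a normalized recency score and stored importance; the actual ranking internals are not documented here, but a sketch of that interpretation:

```typescript
// Illustrative: blend recency (0-1, 1 = newest) against importance (0-1).
// recencyBias = 1 ranks purely by recency; 0 ranks purely by importance.
function rankScore(recency: number, importance: number, recencyBias: number): number {
  return recencyBias * recency + (1 - recencyBias) * importance;
}
```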
`hooks.preCompaction` controls what is included when the harness triggers a pre-compaction summary:
| Field | Default | Description |
|---|---|---|
| `includeRecentMemories` | `true` | Include recent memories in the prompt |
| `memoryLimit` | `5` | How many recent memories to include |
| `summaryGuidelines` | built-in | Custom instructions for the session summary |
`hooks.userPromptSubmit` controls per-prompt memory injection:
| Field | Default | Description |
|---|---|---|
| `enabled` | `true` | Enable per-prompt recall injection |
| `recallLimit` | `10` | Max recall candidates considered |
| `maxInjectChars` | `500` | Prompt-time injection character budget |
| `minScore` | `0.8` | Minimum top recall score required before injecting memories |
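Taken together, `minScore` gates whether anything is injected at all and `maxInjectChars` caps how much; a sketch of that interaction (illustrative, not Signet's actual assembly logic):

```typescript
interface RecallCandidate {
  content: string;
  score: number; // recall relevance, 0-1, sorted descending
}

// Returns the text to inject, or null if the top candidate is too weak.
function buildInjection(
  candidates: RecallCandidate[],
  minScore: number,
  maxChars: number,
): string | null {
  // The top score gates injection: weak matches inject nothing.
  if (candidates.length === 0 || candidates[0].score < minScore) return null;
  let out = "";
  for (const c of candidates) {
    if (out.length + c.content.length > maxChars) break; // respect char budget
    out += (out ? "\n" : "") + c.content;
  }
  return out || null;
}
```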
Environment variables take precedence over `agent.yaml` for runtime overrides. They are useful in containerized or CI environments where editing the config file is impractical.
| Variable | Default | Description |
|---|---|---|
| `SIGNET_PATH` | — | Runtime override for agents directory |
| `SIGNET_PORT` | `3850` | Daemon HTTP port |
| `SIGNET_HOST` | `127.0.0.1` | Daemon host for local calls and default bind address |
| `SIGNET_BIND` | `SIGNET_HOST` | Explicit bind address override (`0.0.0.0`, etc.) |
| `SIGNET_LOG_FILE` | — | Optional explicit daemon log file path |
| `SIGNET_LOG_DIR` | `$SIGNET_WORKSPACE/.daemon/logs` | Optional daemon log directory override |
| `SIGNET_SQLITE_PATH` | — | macOS explicit SQLite dylib override used before Bun opens the database |
| `SIGNET_SESSION_START_TIMEOUT` | `15000` | Session-start daemon wait budget in ms for Signet-managed clients. Generated Claude Code hook config writes this value directly; generated Codex hook config rounds up to seconds and adds 5 seconds of harness grace |
| `SIGNET_FETCH_TIMEOUT` | `15000` | Legacy fallback for the session-start timeout in ms when `SIGNET_SESSION_START_TIMEOUT` is unset |
| `SIGNET_PROMPT_SUBMIT_TIMEOUT` | `5000` | Prompt-submit daemon wait budget in ms; OpenCode uses this value directly, generated Claude Code hook config writes this value + 2000 ms grace |
| `SIGNET_TRUSTED_PROVIDER_ENDPOINT_HOSTS` | — | Comma-separated host allowlist for Anthropic endpoint overrides used during credentialed startup preflight (supports entries like `proxy.example.com` and `*.example.com`) |
| `OPENAI_API_KEY` | — | OpenAI key when embedding provider is `openai` |
`SIGNET_PATH` changes where Signet reads and writes all agent data for that process, including the config file itself. Use it for temporary overrides in CI or isolated local testing.
On macOS, `SIGNET_SQLITE_PATH` can point at a `libsqlite3.dylib` build that supports `loadExtension()`. If it is set, Signet treats it as an authoritative override and refuses to fall back if the file is missing. If it is unset, Signet checks `$SIGNET_WORKSPACE/libsqlite3.dylib`, where `$SIGNET_WORKSPACE` resolves from `SIGNET_PATH`, then `~/.config/signet/workspace.json`, then the default `~/.agents`, before trying standard Homebrew SQLite locations and finally falling back to Apple's system SQLite.
For non-loopback Anthropic endpoint overrides, `daemon-rs` only sends provider credentials during startup preflight when the host is trusted. Official provider hosts are trusted by default; add trusted proxy/gateway hosts through `SIGNET_TRUSTED_PROVIDER_ENDPOINT_HOSTS`.
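The allowlist semantics (exact hostnames plus `*.example.com`-style wildcards) can be sketched as follows (illustrative; the daemon's actual matcher may differ):

```typescript
// Illustrative: check a host against an allowlist of exact entries and
// "*." wildcard entries. "*.example.com" matches subdomains of example.com
// but not the bare apex "example.com".
function hostTrusted(host: string, allowlist: string[]): boolean {
  const h = host.toLowerCase();
  return allowlist.some((entry) => {
    const e = entry.toLowerCase();
    if (e.startsWith("*.")) return h.endsWith(e.slice(1)); // ".example.com" suffix
    return h === e;
  });
}
```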
The main agent identity file. Synced to all configured harnesses on change (2-second debounce). Write it in plain markdown — there is no required structure, but a typical layout looks like this:
# Agent Name
Short introduction paragraph.
## Personality
Communication style, tone, and approach.
## Instructions
Specific behaviors, preferences, and task guidance.
## Rules
Hard rules the agent must follow.
## Context
Background about the user and their work.When AGENTS.md changes, the daemon writes updated copies to:
~/.claude/CLAUDE.md(if~/.claude/exists)~/.config/opencode/AGENTS.md(if~/.config/opencode/exists)
Each copy is prefixed with a generated header identifying the source file and timestamp, and includes a warning not to edit the copy directly.
Optional personality file for deeper character definition. Loaded by harnesses that support separate personality and instruction files.
# Soul
## Voice
How the agent speaks and writes.
## Values
What the agent prioritizes.
## Quirks
Unique personality characteristics.Auto-generated working memory summary. Updated by the synthesis system.
Do not edit by hand — changes will be overwritten on the next synthesis
run. Loaded at session start when hooks.sessionStart.includeRecentContext
is true.
The SQLite database at `memory/memories.db` contains three main tables.
| Column | Type | Description |
|---|---|---|
| `id` | TEXT | Primary key (UUID) |
| `content` | TEXT | Memory content |
| `type` | TEXT | `fact`, `preference`, `decision`, `daily-log`, `episodic`, `procedural`, `semantic`, `system` |
| `source` | TEXT | Source system or harness |
| `importance` | REAL | 0-1 score, decays over time |
| `tags` | TEXT | Comma-separated tags |
| `who` | TEXT | Source harness name |
| `pinned` | INTEGER | 1 if critical/pinned (never decays) |
| `is_deleted` | INTEGER | 1 if soft-deleted (tombstone) |
| `deleted_at` | TEXT | ISO timestamp of soft-delete |
| `created_at` | TEXT | ISO timestamp |
| `updated_at` | TEXT | ISO timestamp |
| `last_accessed` | TEXT | Last access timestamp |
| `access_count` | INTEGER | Number of times recalled |
| `confidence` | REAL | Extraction confidence (0-1) |
| `version` | INTEGER | Optimistic concurrency version |
| `manual_override` | INTEGER | 1 if user has manually edited |
| Column | Type | Description |
|---|---|---|
| `id` | TEXT | Primary key (UUID) |
| `content_hash` | TEXT | SHA-256 hash of embedded text |
| `vector` | BLOB | Float32 array (raw bytes) |
| `dimensions` | INTEGER | Vector size (e.g. 768) |
| `source_type` | TEXT | `memory`, `conversation`, etc. |
| `source_id` | TEXT | Reference to parent memory UUID |
| `chunk_text` | TEXT | The text that was embedded |
| `created_at` | TEXT | ISO timestamp |
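Storing a Float32 vector as a raw-byte BLOB is a round trip between a number array and its underlying bytes; a sketch of how that encoding typically works (illustrative, not Signet's actual code):

```typescript
// Encode a vector as raw float32 bytes (4 bytes per dimension,
// platform-native endianness, little-endian on typical hardware).
function encodeVector(values: number[]): Uint8Array {
  return new Uint8Array(new Float32Array(values).buffer);
}

// Decode a BLOB back into a typed array without copying the bytes.
function decodeVector(blob: Uint8Array): Float32Array {
  return new Float32Array(blob.buffer, blob.byteOffset, blob.byteLength / 4);
}
```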
FTS5 virtual table for keyword search over content, backed by the `memories` table and created with the `unicode61` tokenizer. Triggers keep the index in sync when rows are inserted, deleted, or updated.
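A minimal sketch of how such an external-content FTS5 index and its sync triggers are typically declared (illustrative DDL following the standard SQLite pattern, not Signet's exact schema):

```sql
CREATE VIRTUAL TABLE memories_fts USING fts5(
  content,
  content='memories',
  content_rowid='rowid',
  tokenize='unicode61'
);

-- Keep the index in sync with the base table.
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
  INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
END;
CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
    VALUES ('delete', old.rowid, old.content);
END;
CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
    VALUES ('delete', old.rowid, old.content);
  INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
END;
```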
Location: `~/.claude/`

`settings.json` installs hooks that fire at session lifecycle events:

```json
{
  "hooks": {
    "SessionStart": [{
      "hooks": [{
        "type": "command",
        "command": "python3 $SIGNET_WORKSPACE/memory/scripts/memory.py load --mode session-start",
        "timeout": 3000
      }]
    }],
    "UserPromptSubmit": [{
      "hooks": [{
        "type": "command",
        "command": "python3 $SIGNET_WORKSPACE/memory/scripts/memory.py load --mode prompt",
        "timeout": 2000
      }]
    }],
    "SessionEnd": [{
      "hooks": [{
        "type": "command",
        "command": "python3 $SIGNET_WORKSPACE/memory/scripts/memory.py save --mode auto",
        "timeout": 10000
      }]
    }]
  }
}
```

Location: `~/.config/opencode/plugins/`
`signet.mjs` is a bundled OpenCode plugin installed by `@signet/connector-opencode` that exposes `/remember` and `/recall` as native tools within the harness.

> **Note:** Legacy `memory.mjs` installations are automatically migrated to `~/.config/opencode/plugins/signet.mjs` on reconnect.
Location: `$SIGNET_WORKSPACE/hooks/agent-memory/` (hook directory)

Also configures the OpenClaw workspace in `~/.openclaw/openclaw.json` (and compatible clawdbot / moltbot config locations):

```json
{
  "agents": {
    "defaults": {
      "workspace": "$SIGNET_WORKSPACE"
    }
  }
}
```

See `HARNESSES.md` for the full OpenClaw adapter docs.
If your Signet workspace is a git repository, the daemon auto-commits file changes with a 5-second debounce after the last detected change. Commit messages use the format `YYYY-MM-DDTHH-MM-SS_auto_<filename>`. The setup wizard offers to initialize git on first run and creates a backup commit before making any changes.
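The commit-message format can be reproduced with a small helper (illustrative sketch, not the daemon's actual code):

```typescript
// Build a commit message in the YYYY-MM-DDTHH-MM-SS_auto_<filename> format.
function autoCommitMessage(filename: string, when: Date = new Date()): string {
  const pad = (n: number) => String(n).padStart(2, "0");
  const ts =
    `${when.getFullYear()}-${pad(when.getMonth() + 1)}-${pad(when.getDate())}` +
    `T${pad(when.getHours())}-${pad(when.getMinutes())}-${pad(when.getSeconds())}`;
  return `${ts}_auto_${filename}`;
}
```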
Recommended `.gitignore` for your workspace:

```
.daemon/
.secrets/
__pycache__/
*.pyc
*.log
```