This guide supplements the README with lessons from live production deployments. If you're running the correlation plugin in anger, these notes will save you time.
The README covers the what and why. This guide covers the how in practice:
- How to integrate correlation surfacing into your heartbeat loop
- How to tune confidence thresholds without drowning in noise
- How to manage the rule lifecycle without making a mess
- What actually goes wrong in production, and how to debug it
The correlation plugin shines when it surfaces context proactively, not just when you remember to ask. The recommended integration point is your OpenClaw heartbeat.
In your HEARTBEAT.md, add a periodic check that looks at what you're currently working on and surfaces related contexts:

    # HEARTBEAT.md
    ## Periodic Tasks
    ### Correlation Surfacing (every 5 heartbeats)
    When working on a topic, check correlation rules to surface related contexts:
    bash scripts/correlation-surfacing.sh "<current topic keywords>"

The `correlation-surfacing.sh` script reads your current context (passed as an argument), matches it against `memory/correlation-rules.json`, and outputs what else you should know.
Script location: Place the script in your OpenClaw workspace scripts directory (e.g., ~/.openclaw/workspace/scripts/correlation-surfacing.sh). The script is available in the plugin repository under scripts/.

    # Example invocation
    bash scripts/correlation-surfacing.sh "config change"

Output:

    {
      "context": "config change",
      "matched": 1,
      "rules": [{
        "id": "cr-001",
        "trigger": "config-change",
        "fetches": ["backup-location", "rollback-instructions", "recent-changes"]
      }]
    }

Why every 5 heartbeats and not every one? Correlation surfacing fires a memory search per matched rule. On busy systems, every heartbeat is too aggressive. Every 5 gives you regular enrichment without token burn.
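To make the matching step concrete, here is a minimal Python sketch of what a script like `correlation-surfacing.sh` might do with the context string. It is not the plugin's implementation: the rule fields (`trigger_keywords`, `trigger_context`, `must_also_fetch`) are borrowed from this guide, the second sample rule and its `plugin-inventory` context are hypothetical, and the real matching may be smarter than whole-word lookup.

```python
# Hypothetical rule shape; field names follow this guide, the real schema may differ.
SAMPLE_RULES = [
    {
        "id": "cr-001",
        "trigger_context": "config-change",
        "trigger_keywords": ["config", "change"],
        "must_also_fetch": ["backup-location", "rollback-instructions", "recent-changes"],
    },
    {
        "id": "cr-002",  # hypothetical second rule, for illustration only
        "trigger_context": "plugin-management",
        "trigger_keywords": ["plugin", "install"],
        "must_also_fetch": ["plugin-inventory"],
    },
]


def surface(context: str, rules: list) -> dict:
    """Return the rules whose keywords share at least one word with the context."""
    words = set(context.lower().split())
    matched = [
        {
            "id": rule["id"],
            "trigger": rule["trigger_context"],
            "fetches": rule["must_also_fetch"],
        }
        for rule in rules
        if words & {k.lower() for k in rule["trigger_keywords"]}
    ]
    return {"context": context, "matched": len(matched), "rules": matched}
```

Calling `surface("config change", SAMPLE_RULES)` returns a dictionary shaped like the example output above, with only `cr-001` matched. A context such as `"help"` matches nothing here.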
The script accepts natural language. Good context strings:
- `"config change"`: fires the config-safety rules
- `"error 400 gateway"`: fires error-debugging + gateway rules
- `"plugin install"`: fires plugin-management rules
Bad context strings (too generic, fire too often):
- `"help"`: no rule should trigger on this
- `"status"`: too broad
Confidence (0.0–1.0) controls how strongly a rule's correlation is weighted. Getting this right is the difference between useful surfacing and noise.
| Confidence | When to use | Example |
|---|---|---|
| 0.95–0.99 | Critical operations where getting this wrong is expensive | Config changes, gateway restarts, plugin installs |
| 0.85–0.90 | High-value patterns that reliably indicate the context | Backup operations, error debugging |
| 0.70–0.80 | Useful correlations but with some false-positive risk | Session recovery, git operations |
| < 0.70 | Exploratory rules or very niche patterns | Almost never needed in practice |
A common mistake is setting everything to 0.95 because "high confidence sounds better." This causes signal drowning: your high-confidence rules dominate every search, and you never see the lower-confidence correlations that might actually be relevant to the current situation.
Rule of thumb: Only use 0.95+ for operations where the cost of missing the correlation is catastrophic (gateway down, data loss). Everything else belongs in the 0.70–0.90 range.
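One way to catch confidence inflation early is to bucket your rules by tier and look at how many land in the top one. The sketch below is an illustration, not part of the plugin: the tier names are made up, and because the table leaves gaps (for example 0.90–0.95), the sketch assumes a value in a gap belongs to the tier below it.

```python
def confidence_tier(confidence: float) -> str:
    """Map a confidence value to a tier from the table (tier names are hypothetical)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be within 0.0-1.0, got {confidence}")
    if confidence >= 0.95:
        return "critical"
    if confidence >= 0.85:
        return "high-value"
    if confidence >= 0.70:
        return "useful"
    return "exploratory"


def critical_share(confidences: list) -> float:
    """Fraction of rules sitting in the critical tier (0.0 for an empty list)."""
    if not confidences:
        return 0.0
    critical = sum(1 for c in confidences if confidence_tier(c) == "critical")
    return critical / len(confidences)
```

If `critical_share` comes back close to 1.0 for your rule file, that is a hint that the top tier has stopped meaning anything.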
- Deploy a new rule with `lifecycle.state: "testing"` and confidence `0.70`
- Run your heartbeat for a few days, watch how often it fires
- If it fires on every unrelated query → lower confidence or narrow keywords
- If it never fires when it should → widen keywords or raise confidence
- When stable, move to `validated`
Rules are not static. They have a lifecycle from idea to retirement.
proposal → testing → validated → promoted → retired
proposal — A new idea for a correlation rule. Written with lower confidence (0.60–0.75), deployed in testing mode only. Not active by default unless you explicitly enable proposal rules.
testing — The rule is live but being evaluated. You'll see it fire but it won't auto-surface until validated. Set confidence based on how sure you are.
validated — The rule fires correctly and the signal-to-noise ratio is acceptable. It is now active.
promoted — The rule is rock-solid. High confidence (0.90+), well-tested, and you want it to always be available. Only promoted rules should have confidence 0.99.
retired — The rule is obsolete. Maybe the pattern it detected no longer exists, or it was replaced by a better rule. Keep it in the file (with retired) rather than deleting it — this preserves the learned_from history.
When you discover a new correlation pattern:
- Add rule as `proposal` with `confidence: 0.70`
- After a week of testing, move to `testing`
- After validation (noise acceptable), move to `validated`
- After 30+ firings with no issues, consider `promoted`
Don't rush promotion. A premature promoted rule with 0.99 confidence that fires inappropriately is hard to undo because people trust it.
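If you script your rule edits, a small guard can keep states from skipping ahead. The following is a sketch under assumptions, not plugin behavior: it assumes forward moves go one step at a time, that any active state can be retired, that retirement is final, and that a validated or promoted rule can be sent back to `testing` for recalibration.

```python
LIFECYCLE = ["proposal", "testing", "validated", "promoted", "retired"]


def can_transition(current: str, target: str) -> bool:
    """Check a lifecycle move against the assumed policy described above."""
    for state in (current, target):
        if state not in LIFECYCLE:
            raise ValueError(f"unknown lifecycle state: {state!r}")
    if current == "retired":
        return False  # assumption: retired rules stay retired
    if target == "retired":
        return True  # any active rule may be retired
    if target == "testing" and current in ("validated", "promoted"):
        return True  # step back to recalibrate a noisy rule
    # Otherwise only allow the next state in the chain (no skipping).
    return LIFECYCLE.index(target) == LIFECYCLE.index(current) + 1
```

With this policy, `can_transition("testing", "promoted")` is rejected, which enforces the "don't rush promotion" advice mechanically.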
trigger_keywords: ["error"] will fire on almost every message in a development channel. This makes the rule useless because it always fires but rarely means anything.
Fix: Be specific. ["error", "400", "crash"] — require 2+ keywords to co-occur, or use a specific phrase as the trigger_context.
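The co-occurrence idea can be sketched in a few lines of Python. This is an illustration only: `min_hits` is a hypothetical parameter, not a documented plugin option, and the tokenizer is deliberately simple, so multi-word or hyphenated keywords would need extra handling.

```python
import re


def keyword_hits(context: str, keywords: list) -> list:
    """Return the keywords that appear as whole tokens in the context."""
    tokens = set(re.findall(r"[a-z0-9]+", context.lower()))
    return [k for k in keywords if k.lower() in tokens]


def fires(context: str, keywords: list, min_hits: int = 2) -> bool:
    """Fire only when at least `min_hits` keywords co-occur in the context."""
    return len(keyword_hits(context, keywords)) >= min_hits
```

With `["error", "400", "crash"]`, the context `"error 400 gateway"` fires because two keywords are present, while a message that merely mentions an error does not.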
A rule that says must_also_fetch: ["recovery-procedures"] but there's no file at memory/recovery-procedures.md silently does nothing. The plugin doesn't warn you.
Fix: Always verify that every context in must_also_fetch actually exists in your memory directory.
Rule A triggers on ["config"], Rule B triggers on ["change"]. Both fire on "config change". Now you get duplicate surfacing.
Fix: Before adding a rule, search existing rules for keyword overlap. Use distinct trigger_context values to keep rules distinguishable.
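Note that the two rules in this example share no keywords, so comparing keyword lists alone would miss the problem. A more reliable check is to run a handful of realistic context strings through all rules and see which ones trigger more than one. The sketch below assumes simple whole-word matching and the `trigger_keywords` field used in this guide; the rule IDs are hypothetical.

```python
def co_firing(rules: list, sample_contexts: list) -> dict:
    """Map each sample context to the rule IDs it triggers, when more than one fires."""
    report = {}
    for context in sample_contexts:
        words = set(context.lower().split())
        fired = [
            rule["id"]
            for rule in rules
            if words & {k.lower() for k in rule["trigger_keywords"]}
        ]
        if len(fired) > 1:
            report[context] = fired
    return report


# Hypothetical rules mirroring the example above.
RULES = [
    {"id": "rule-a", "trigger_keywords": ["config"]},
    {"id": "rule-b", "trigger_keywords": ["change"]},
]
```

Here `co_firing(RULES, ["config change", "plugin install"])` reports only `"config change"`, with both rule IDs listed.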
Rule A (confidence: 0.99) always fires on your context. Rule B (confidence: 0.85) would also be relevant but Rule A's results dominate.
Fix: If a 0.99 rule fires on every config operation, ask: does it need to be that high? Reduce to 0.90 and let lower rules breathe.
Rules without learned_from are impossible to audit. When the rule fires incorrectly in 6 months, you won't remember why you created it.
Fix: Every rule needs a learned_from that names the incident, pattern, or lesson that prompted it. Keep it descriptive but concise.
The plugin provides a debug tool to check which rules match a given context:

    openclaw exec correlation_check --context "config change"

This shows:
- Which rules matched
- Why they matched (which keywords fired)
- What contexts they would fetch
To check one specific rule against a context, pass its ID:

    openclaw exec correlation_check --rule-id cr-001 --context "openclaw.json edit"

If a rule should fire but doesn't:
- Check that your keywords appear in the context string
- Check that `trigger_context` matches the semantic domain
- Check the rule's lifecycle state: `proposal` rules need explicit enabling
If a rule fires when it shouldn't:
- The keywords are too broad → narrow them
- Confidence too high → lower it
- Context is wrong → the same keyword can mean different things in different contexts
Most common cause: one or more contexts listed in must_also_fetch don't exist as files in your memory directory.
Fix: Run ls memory/ and verify every referenced context has a corresponding file. Create missing ones or remove the reference.
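Checking by eye gets tedious once you have more than a few rules. A small script can do the same comparison; the sketch below assumes each context name maps to `memory/<name>.md`, as in the `recovery-procedures` example earlier, and that rules carry `id` and `must_also_fetch` fields. Adjust both if your layout differs.

```python
from pathlib import Path


def missing_contexts(rules: list, memory_dir: str = "memory") -> dict:
    """Map rule IDs to the must_also_fetch contexts that have no file on disk."""
    root = Path(memory_dir)
    missing = {}
    for rule in rules:
        absent = [
            name
            for name in rule.get("must_also_fetch", [])
            if not (root / f"{name}.md").is_file()  # assumed naming convention
        ]
        if absent:
            missing[rule["id"]] = absent
    return missing
```

An empty result means every referenced context has a file. Anything else lists, per rule, what to create or remove.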
- Check that your `trigger_keywords` actually appear in the context string you're passing
- Verify the rule's `lifecycle.state`: `proposal` rules need explicit opt-in
- Try the `correlation_check` tool directly: `openclaw exec correlation_check --context "your context here"`
- If keywords overlap with another higher-confidence rule, the lower-confidence rule's results may be buried
Two rules firing on the same context and fetching overlapping contexts.
Fix: Review both rules' must_also_fetch lists for overlap. Narrow keywords on one of the rules, or reduce its confidence so the other dominates.
The rule fires on almost every query.
Fix: Raise the confidence threshold, or narrow the keyword list. A rule with confidence: 0.70 that fires constantly should probably be 0.50 or removed.
usage_count is high but signal-to-noise has degraded. This usually means the operational pattern changed (new tool, new workflow) and the rule's keywords are now triggering in unrelated contexts.
Fix: Narrow the keywords, or set lifecycle.state: "testing" to stop auto-surfacing while you recalibrate.
Correlation rules are powerful but not universal. Avoid them when:
1. The relationship is 1:1, not N:M. If the rule amounts to "when X happens, always do Y", that's automation, not correlation. Write a script.
2. The keyword is too common. Words like "help", "check", "status", "info" fire on almost everything. They'll generate noise, not signal.
3. The contexts don't exist. If you reference `memory/recovery-procedures.md` but never created that file, the correlation silently fails. Only reference contexts you actually maintain.
4. The pattern is genuinely one-off. If it happened once and won't happen again, don't write a rule. Just document it.
5. You need exact precision. Correlation is probabilistic. If you need guaranteed checks (e.g., security compliance), use deterministic rules or scripts, not correlation confidence scoring.
Before deploying a new rule, verify:
- Keywords are specific enough not to fire on every message
- All `must_also_fetch` contexts exist in `memory/`
- `confidence` is appropriate (not everything needs 0.95)
- `learned_from` describes why this rule exists
- `lifecycle.state` is set (default to `testing` for new rules)
- No existing rule has significant keyword overlap
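Parts of this checklist can be automated. The following Python sketch lints a single rule for the items that don't need the filesystem or other rules. It assumes the field names used throughout this guide and a nested `lifecycle.state`. The list of generic words comes from the "keyword is too common" advice above, and the problem messages are made up for illustration.

```python
VALID_STATES = {"proposal", "testing", "validated", "promoted", "retired"}
GENERIC_KEYWORDS = {"help", "check", "status", "info"}


def lint_rule(rule: dict) -> list:
    """Return a list of human-readable problems; an empty list means the rule passes."""
    problems = []
    keywords = [k.lower() for k in rule.get("trigger_keywords", [])]
    if not keywords:
        problems.append("no trigger_keywords")
    generic = sorted(set(keywords) & GENERIC_KEYWORDS)
    if generic:
        problems.append(f"generic keywords: {', '.join(generic)}")
    confidence = rule.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        problems.append("confidence missing or outside 0.0-1.0")
    if not str(rule.get("learned_from") or "").strip():
        problems.append("learned_from is empty")
    state = (rule.get("lifecycle") or {}).get("state")
    if state not in VALID_STATES:
        problems.append("lifecycle.state missing or unknown")
    return problems
```

Keyword overlap and missing context files still need a pass over the whole rule file and the `memory/` directory, so treat this as a first filter rather than a full gate.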