OWASP Agent Memory Guard – Protect OpenAI Agent Memory from Poisoning Attacks #3337
Replies: 2 comments
v0.3.0 update: just shipped a major release with new capabilities.

New in v0.3.0:
Detection rate improved to 94.2% on AgentThreatBench with ML enabled.

    # New CLI usage
    pip install agent-memory-guard
    amg scan agent_memories.json --format sarif

    # Or as a sidecar API
    amg serve --port 8000
    curl -X POST localhost:8000/scan -d '{"content": "..."}'

Full changelog: https://github.com/OWASP/www-project-agent-memory-guard
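For anyone calling the sidecar from Python rather than curl, the request from the comment above could be built roughly like this. It is only a sketch: the `/scan` path, port, and `{"content": ...}` body come from the curl example, while the helper name `build_scan_request` and the JSON `Content-Type` header are assumptions, not part of AMG's documented API.

```python
import json
import urllib.request


def build_scan_request(content, base_url="http://localhost:8000"):
    """Build (but do not send) a POST to the sidecar's /scan endpoint.

    Hypothetical helper: only the path and body shape are taken from the
    curl example in the thread; the Content-Type header is an assumption.
    """
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/scan",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With `amg serve --port 8000` running, the request can be sent with `urllib.request.urlopen(build_scan_request("..."))`. The response format is not shown in the thread, so it is not parsed here.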
The scanning approach (detect and reject at write time) addresses the most obvious attack vector, but production experience shows that a complementary defense is needed for memories that bypass the scanner: no detection system catches 100% of adversarial inputs.

Importance-weighted decay provides this second layer. Every stored memory carries an importance score that degrades based on access recency. Legitimate memories get reinforced each time an agent actually references them in its output. Poisoned memories that enter the store (either through a scanner bypass or from before the guard was installed) but are never reinforced by legitimate use patterns decay below the retrieval threshold within hours.

This changes the threat model: instead of needing perfect detection at write time, you force the attacker to continuously reinject content to keep poisoned memories above the recall threshold, which is a fundamentally harder attack and much easier to detect via access-pattern anomalies.

For the provenance dimension: attaching (agent_id, session_id, confidence_score) to each stored memory lets downstream consumers weight recalled context by trustworthiness. A memory stored by a verified internal agent at confidence 0.95 gets full retrieval weight; a memory from an unverified external input at confidence 0.3 gets proportionally less influence.

Decay-weighted memory + provenance metadata in practice: https://github.com/Dakera-AI/dakera-deploy/blob/main/examples/tif-provenance/validate_tif_provenance.py
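The decay and provenance ideas in this comment can be illustrated with a small in-memory sketch. It is not the linked Dakera implementation or anything from AMG. The class names, the exponential half-life, the additive reinforcement boost, the 0.2 threshold, and the choice to multiply importance by `confidence_score` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    content: str
    agent_id: str             # provenance: who wrote it
    session_id: str           # provenance: in which session
    confidence_score: float   # provenance: trust in [0, 1]
    importance: float = 1.0   # importance as of last_access
    last_access: float = 0.0  # timestamp in seconds


class DecayingStore:
    """Hypothetical store: importance halves every half_life_s unless reinforced."""

    def __init__(self, half_life_s=3600.0, threshold=0.2, boost=0.5):
        self.half_life_s = half_life_s
        self.threshold = threshold
        self.boost = boost
        self.memories = []

    def write(self, memory, now):
        memory.last_access = now
        self.memories.append(memory)

    def current_importance(self, memory, now):
        # Exponential decay based on time since the last reinforcement.
        elapsed = max(0.0, now - memory.last_access)
        return memory.importance * 0.5 ** (elapsed / self.half_life_s)

    def reinforce(self, memory, now):
        # Call when the agent actually uses the memory in its output.
        decayed = self.current_importance(memory, now)
        memory.importance = min(1.0, decayed + self.boost)
        memory.last_access = now

    def recall(self, now):
        # Return (memory, weight) pairs above the threshold, best first.
        # Weight = decayed importance * provenance confidence.
        hits = []
        for m in self.memories:
            importance = self.current_importance(m, now)
            if importance >= self.threshold:
                hits.append((m, importance * m.confidence_score))
        return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

In this sketch `recall` deliberately does not reinforce. Only an explicit `reinforce` call (the agent referencing the memory in its output) resets decay, so retrieval alone cannot keep a poisoned entry alive.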
What is it?
OWASP Agent Memory Guard (AMG) is an open-source Python library that protects AI agent memory from poisoning attacks. If you're building agents with OpenAI's API that use persistent memory (conversation history, RAG, vector stores), AMG scans every memory write for:
Quick Start
Results
Links