A toolkit for red-teaming web-browsing AI agents against indirect prompt injection (IPI) attacks. Two modes of operation:
- Target Server - A fake but realistic-looking webpage with malicious instructions pre-injected. Use when the agent can visit arbitrary URLs.
- Intercepting Proxy - The agent browses whitelisted domains as normal, but a MITM proxy injects malicious instructions into the real HTML before it reaches the agent. Use when the agent is restricted to approved domains.
Mode 1 - Target Server (simplest):
AI Agent                      Target Server (:8081)
(under test)                  (payload pre-injected)
     │                               │
     ├── GET /page?... ────────────→ │
     │                               │── look up payload + embed + inject
     ◀── HTML with payload ──────────┘
Mode 2 - Intercepting Proxy (for whitelisted domains):
AI Agent              Proxy (:8080)                 Real Domain
(under test)                                        (whitelisted)
     │                     │                              │
     ├── GET url ────────→ │── forward request ─────────→ │
     │                     │◀── original HTML ────────────┘
     │                     │── inject payload into HTML
     ◀── modified HTML ────┘
Exfil Tracker (:9090) ◀── logs any callbacks triggered by the agent
Tip: Mode 2 also supports url_pattern: "*" to intercept any URL - useful when there is no fixed whitelist. See Config for details.
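To make the Mode 2 flow concrete, here is a minimal sketch of a mitmproxy addon that rewrites HTML responses in flight. It is not the project's proxy/addon.py: the class name, the fixed PAYLOAD_SNIPPET, and the "insert before the closing body tag" rule are assumptions for illustration. It relies only on mitmproxy's response hook and the flow.response.headers and flow.response.text attributes.

```python
import re

# Hypothetical placeholder; the real toolkit draws payloads from its library.
PAYLOAD_SNIPPET = "<!-- IMPORTANT SYSTEM MESSAGE: example payload -->"

_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def inject_before_body_close(html: str, snippet: str) -> str:
    """Insert snippet before the last </body>, or append it if none exists."""
    matches = list(_BODY_CLOSE.finditer(html))
    if not matches:
        return html + snippet
    pos = matches[-1].start()
    return html[:pos] + snippet + html[pos:]


class InjectAddon:
    """mitmproxy calls response() once the full server response is available."""

    def response(self, flow) -> None:
        content_type = flow.response.headers.get("content-type", "")
        if "text/html" not in content_type.lower():
            return  # leave images, JSON, scripts, etc. untouched
        body = flow.response.text  # decoded body; may be None when empty
        if body is None:
            return
        flow.response.text = inject_before_body_close(body, PAYLOAD_SNIPPET)


# mitmproxy discovers addons through this module-level list.
addons = [InjectAddon()]
```

Saved under a name such as sketch_addon.py (hypothetical), a script like this would be loaded with mitmdump -s sketch_addon.py, the same way the quick start loads proxy/addon.py.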
pip install mitmproxy fastapi uvicorn pyyaml requests pytest
# Build the payload library (first time only - clones benchmark repos)
python payloads/extract_bipia.py
python payloads/extract_injecagent.py
python payloads/extract_agentdojo.py
python payloads/extract_tensor_trust.py
python payloads/extract_wasp.py
python payloads/extract_llmail_inject.py # HuggingFace step is optional
# Mode 1 - Target Server
python tracker/server.py & # exfil tracker on :9090
python target/server.py & # target server on :8081
# Point agent to: http://localhost:8081/page?attack_type=data_exfil&embedding=html_comment
# Mode 2 - Intercepting Proxy
python tracker/server.py & # exfil tracker on :9090
mitmdump -s proxy/addon.py --set ipi_config=config/example_test.yaml
# Route agent traffic through localhost:8080
# Standalone Demo (no proxy setup needed)
python test_run/demo_fetch_inject.py --config test_run/any_url_demo.yaml --all-techniques

Check http://localhost:9090/log or tracker/exfil_log.jsonl for exfil hits.
payloads/unified.jsonl - 820 deduplicated attack strings from six
published benchmarks. A spreadsheet view lives at payloads/attack_payloads.xlsx.
| Source | Payloads | License | Origin |
|---|---|---|---|
| BIPIA (Microsoft) | 220 | MIT | Text & code injection benchmarks |
| InjecAgent (UIUC) | 62 | MIT | Tool-integrated agent attacks |
| AgentDojo (ETH Zurich) | 47 | MIT | Multi-suite task injections |
| Tensor Trust (UC Berkeley) | 400 | MIT | Hijacking + prompt extraction robustness |
| WASP (Meta / FAIR) | 84 | CC-BY-NC-4.0 | WebArena attacker objectives × 4 wrappers |
| LLMail-Inject (Microsoft / ISTA / ETH) | 7 | MIT | Challenge scenarios (HF dataset optional) |
License note: WASP is non-commercial. Each WASP entry carries license: "CC-BY-NC-4.0" so downstream consumers can filter before any commercial use. All other sources are MIT-compatible.
Attack types: behavior_manipulation (258), prompt_extraction (200),
data_exfil (134), goal_hijacking (124), unauthorized_action (99),
capability_enum (5).
Domains: chat (405), web (253), code (104), email (58).
See payloads/SOURCES.md for full extraction details
and license terms.
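Because the licenses differ, a consumer may want to filter the library before use. The sketch below assumes payloads/unified.jsonl holds one JSON object per line and that each record exposes license and attack_type keys. The license key is described above; attack_type as a record field and both function names are assumptions, so check the file itself or payloads/SOURCES.md for the actual schema.

```python
import json
from typing import FrozenSet, Iterable, Iterator, List, Optional, Set


def load_jsonl(path: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def filter_payloads(
    records: Iterable[dict],
    attack_types: Optional[Set[str]] = None,
    exclude_licenses: FrozenSet[str] = frozenset({"CC-BY-NC-4.0"}),
) -> List[dict]:
    """Drop records under an excluded license, then optionally keep only
    the requested attack types (field names are assumed, see lead-in)."""
    kept = []
    for record in records:
        if record.get("license") in exclude_licenses:
            continue
        if attack_types is not None and record.get("attack_type") not in attack_types:
            continue
        kept.append(record)
    return kept
```

Under those assumptions, filter_payloads(load_jsonl("payloads/unified.jsonl"), attack_types={"data_exfil"}) would return only the data_exfil entries that are not under the non-commercial license.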
Each technique is invisible to humans viewing the rendered page, but readable by agents processing raw HTML.
| Technique | How it hides the payload | Trade-off |
|---|---|---|
| HTML comment | <!-- IMPORTANT SYSTEM MESSAGE: {payload} --> | Simplest; broadly effective |
| Invisible CSS | <div style="display:none"> | Works against most agent pipelines |
| Semantic prose | LLM-generated paragraph embedding the payload naturally | Highest success rate across models |
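The first two techniques in the table are simple string templates. A minimal sketch follows; the function names are hypothetical and the escaping choices are assumptions, not necessarily what templates/html_comment.py and templates/invisible_css.py do.

```python
import html


def embed_html_comment(payload: str) -> str:
    """Wrap the payload in an HTML comment, which browsers do not render."""
    # A literal "-->" inside the payload would close the comment early,
    # so the sequence is broken up (an assumed mitigation).
    safe = payload.replace("-->", "-- >")
    return f"<!-- IMPORTANT SYSTEM MESSAGE: {safe} -->"


def embed_invisible_css(payload: str) -> str:
    """Place the payload in a div that CSS removes from the rendered page."""
    # Escaping keeps payload text from being parsed as markup.
    return f'<div style="display:none">{html.escape(payload)}</div>'
```

Both functions return a snippet that still has to be placed somewhere in the page, which is the job of the insertion point.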
| Point | Where it goes |
|---|---|
| head_meta | Inside <head> (as a meta tag or comment) |
| body_top | First element after <body> |
| body_inline | Between existing <p> paragraphs |
| sidebar | Appended to <aside> or <nav> elements |
| footer | Just before </body> |
| script_comment | Inside the first <script> block as a JS comment |
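As an illustration of how an insertion point could map to a position in the HTML, the sketch below handles three of the six points (body_top, body_inline, footer) with regular expressions. The function name insert_at and the append-on-miss fallback are assumptions; proxy/injector.py may well use an HTML parser and different fallbacks.

```python
import re


def insert_at(html: str, snippet: str, point: str) -> str:
    """Insert snippet at a named point; append if the anchor tag is absent."""
    if point == "body_top":
        # Right after the opening <body ...> tag.
        match = re.search(r"<body\b[^>]*>", html, re.IGNORECASE)
        pos = match.end() if match else None
    elif point == "body_inline":
        # Right after the first closing </p>, i.e. between paragraphs.
        match = re.search(r"</p\s*>", html, re.IGNORECASE)
        pos = match.end() if match else None
    elif point == "footer":
        # Just before the last closing </body>.
        matches = list(re.finditer(r"</body\s*>", html, re.IGNORECASE))
        pos = matches[-1].start() if matches else None
    else:
        raise ValueError(f"unsupported insertion point: {point}")
    if pos is None:
        return html + snippet
    return html[:pos] + snippet + html[pos:]
```

Regex matching is enough for a sketch but can be fooled by tags inside scripts or comments, which is one reason a real injector might prefer a parser.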
target:
url_pattern: "*.clientcorp.com/docs/*" # glob pattern, or "*" for any URL
# Can also be a list:
# url_pattern: ["*://www.amazon.com/*", "*://docs.clientcorp.com/*"]
injection:
payload_filter:
attack_type: [data_exfil, goal_hijacking]
source_benchmark: [bipia, injecagent]
embedding: html_comment # html_comment | invisible_css | semantic | random
insertion_point: body_inline # body_top | body_inline | footer | head_meta | sidebar | script_comment
rotation:
mode: sequential # sequential | random
payloads_per_session: 10

Pre-built configs: config/example_test.yaml, config/top100_sites.yaml (100 popular domains). See also test_run/ for demo configs.
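The url_pattern globs in the config could be evaluated with Python's fnmatch module. This is a sketch under the assumption that the toolkit applies shell-style wildcard matching to the full URL; the function name url_matches is hypothetical, and the actual matching in the proxy may differ (for example in how it treats the scheme).

```python
from fnmatch import fnmatchcase
from typing import Iterable, Union


def url_matches(url: str, patterns: Union[str, Iterable[str]]) -> bool:
    """Return True if url matches the glob, or any glob in a list."""
    if isinstance(patterns, str):
        patterns = [patterns]
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
    return any(fnmatchcase(url, pattern) for pattern in patterns)
```

With this approach, "*" matches every URL, and a list behaves as a logical OR. Note that ? and [...] are also wildcards in fnmatch patterns, so URLs containing those characters would need care.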
├── payloads/
│ ├── extract_bipia.py # Extract from Microsoft BIPIA
│ ├── extract_injecagent.py # Extract from UIUC InjecAgent
│ ├── extract_agentdojo.py # Extract from ETH AgentDojo (payloads only)
│ ├── extract_tensor_trust.py # Extract from UC Berkeley Tensor Trust
│ ├── extract_wasp.py # Extract from Meta WASP (CC-BY-NC-4.0)
│ ├── extract_llmail_inject.py # Extract from Microsoft LLMail-Inject
│ ├── unified.jsonl # 820 deduplicated attack payloads
│ ├── attack_payloads.xlsx # Spreadsheet view of the library
│ └── SOURCES.md # Per-source extraction details + licenses
├── proxy/
│ ├── addon.py # mitmproxy addon (YAML-configurable)
│ ├── config.py # Config & payload loading
│ └── injector.py # HTML injection logic (6 insertion points)
├── target/
│ └── server.py # Target page server on :8081 (Mode 1)
├── templates/
│ ├── html_comment.py # <!-- payload -->
│ ├── invisible_css.py # Hidden div with display:none
│ └── semantic_embed.py # Prose paragraph (optional LLM mode)
├── tracker/
│ └── server.py # FastAPI exfil tracker on :9090
├── config/
│ ├── example_test.yaml # Sample test case config
│ └── top100_sites.yaml # Top-100 sites whitelist config
├── test_run/
│ ├── demo_fetch_inject.py # Fetch real pages, inject, save before/after
│ ├── any_url_demo.yaml # Demo config - any URLs
│ └── amazon_demo.yaml # Demo config - Amazon example
├── tests/
│ ├── conftest.py # Shared fixtures
│ ├── test_injector.py # Template + injection + config unit tests
│ ├── test_addon.py # mitmproxy addon unit tests
│ ├── test_tracker.py # FastAPI exfil tracker tests
│ └── test_e2e.py # End-to-end: spawns mitmdump subprocess
└── CLAUDE.md # Full project spec
python -m pytest tests/ # 48 unit tests (fast)
python -m pytest tests/ -m e2e # end-to-end (spawns real mitmdump)

| OWASP ID | Risk | How this toolkit tests it |
|---|---|---|
| ASI01 | Agent Goal Hijacking | Goal hijacking payloads redirect the agent's task |
| ASI02 | Tool Misuse | Unauthorized action payloads trigger unintended tool calls |
| ASI03 | Identity & Privilege Abuse | Data exfil payloads steal credentials and tokens |
| ASI06 | Memory Poisoning | Behavior manipulation payloads alter agent output |
| LLM01 | Prompt Injection | All payload categories - this is the core attack vector |
- BIPIA - github.com/microsoft/BIPIA (arXiv:2312.14197)
- InjecAgent - github.com/uiuc-kang-lab/InjecAgent (arXiv:2403.02691)
- AgentDojo - github.com/ethz-spylab/agentdojo (arXiv:2406.13352)
- Tensor Trust - github.com/HumanCompatibleAI/tensor-trust-data (arXiv:2311.01011)
- WASP - github.com/facebookresearch/wasp (arXiv:2504.18575)
- LLMail-Inject - github.com/microsoft/llmail-inject-challenge (OpenReview GM9H3iM7VJ)
- OWASP Agentic Top 10 - genai.owasp.org
- OWASP LLM Top 10 2025 - genai.owasp.org
This toolkit is for authorized red-teaming and defensive research only. All payload strings are treated as data and never executed. Only run the proxy against systems you own or have explicit permission to test.