VulcanLab/IPI-Proxy

Indirect Prompt Injection - Payload Library & Injection Proxy

A toolkit for red-teaming web-browsing AI agents against indirect prompt injection (IPI) attacks. Two modes of operation:

  1. Target Server - A fake but realistic-looking webpage with malicious instructions pre-injected. Use when the agent can visit arbitrary URLs.
  2. Intercepting Proxy - The agent browses whitelisted domains as normal, but a MITM proxy injects malicious instructions into the real HTML before it reaches the agent. Use when the agent is restricted to approved domains.

Mode 1 - Target Server (simplest):

  AI Agent          Target Server (:8081)
  (under test)      (payload pre-injected)
      │                     │
      ├── GET /page?... ──→ │
      │                     │── look up payload + embed + inject
      ◀── HTML with payload ┘

Mode 2 - Intercepting Proxy (for whitelisted domains):

  AI Agent          Proxy (:8080)           Real Domain
  (under test)                              (whitelisted)
      │                  │                       │
      ├── GET url ─────→ │── forward request ──→ │
      │                  │◀── original HTML ─────┘
      │                  │── inject payload into HTML
      ◀── modified HTML ─┘

  Exfil Tracker (:9090) ◀── logs any callbacks triggered by the agent

Tip: Mode 2 also supports url_pattern: "*" to intercept any URL - useful when there is no fixed whitelist. See Config for details.

Quick Start

pip install mitmproxy fastapi uvicorn pyyaml requests pytest

# Build the payload library (first time only - clones benchmark repos)
python payloads/extract_bipia.py
python payloads/extract_injecagent.py
python payloads/extract_agentdojo.py
python payloads/extract_tensor_trust.py
python payloads/extract_wasp.py
python payloads/extract_llmail_inject.py   # HuggingFace step is optional

# Mode 1 - Target Server
python tracker/server.py &                 # exfil tracker on :9090
python target/server.py &                  # target server on :8081
# Point agent to: http://localhost:8081/page?attack_type=data_exfil&embedding=html_comment

# Mode 2 - Intercepting Proxy
python tracker/server.py &                 # exfil tracker on :9090
mitmdump -s proxy/addon.py --set ipi_config=config/example_test.yaml
# Route agent traffic through localhost:8080

# Standalone Demo (no proxy setup needed)
python test_run/demo_fetch_inject.py --config test_run/any_url_demo.yaml --all-techniques

Check http://localhost:9090/log or tracker/exfil_log.jsonl for exfil hits.
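The Mode 1 URL shown in the Quick Start is a fixed path plus two query parameters, so a sweep over several attack types and embeddings can be scripted. A minimal sketch, assuming the target server accepts `attack_type` and `embedding` exactly as in the Quick Start comment; the helper name `target_url` and the value lists are illustrative and not part of the toolkit:

```python
from itertools import product
from urllib.parse import urlencode


def target_url(attack_type, embedding, base="http://localhost:8081"):
    """Build a Mode 1 target-page URL (hypothetical helper, not a toolkit API)."""
    query = urlencode({"attack_type": attack_type, "embedding": embedding})
    return f"{base}/page?{query}"


# Values taken from the attack types and embeddings listed in this README.
attack_types = ["data_exfil", "goal_hijacking"]
embeddings = ["html_comment", "invisible_css"]

# One URL per (attack type, embedding) combination to hand to the agent under test.
urls = [target_url(a, e) for a, e in product(attack_types, embeddings)]
for url in urls:
    print(url)
```

Each printed URL can be given to the agent in turn while the exfil tracker is running.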

How It Works

Payload Library

payloads/unified.jsonl - 820 deduplicated attack strings from six published benchmarks. A spreadsheet view lives at payloads/attack_payloads.xlsx.

| Source | Payloads | License | Origin |
|---|---|---|---|
| BIPIA (Microsoft) | 220 | MIT | Text & code injection benchmarks |
| InjecAgent (UIUC) | 62 | MIT | Tool-integrated agent attacks |
| AgentDojo (ETH Zurich) | 47 | MIT | Multi-suite task injections |
| Tensor Trust (UC Berkeley) | 400 | MIT | Hijacking + prompt extraction robustness |
| WASP (Meta / FAIR) | 84 | CC-BY-NC-4.0 | WebArena attacker objectives × 4 wrappers |
| LLMail-Inject (Microsoft / ISTA / ETH) | 7 | MIT | Challenge scenarios (HF dataset optional) |

License note: WASP is non-commercial. Each WASP entry carries license: "CC-BY-NC-4.0" so downstream consumers can filter before any commercial use. All other sources are MIT-compatible.

Attack types: behavior_manipulation (258), prompt_extraction (200), data_exfil (134), goal_hijacking (124), unauthorized_action (99), capability_enum (5).

Domains: chat (405), web (253), code (104), email (58).

See payloads/SOURCES.md for full extraction details and license terms.
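Because WASP entries are tagged with `license: "CC-BY-NC-4.0"`, a consumer can drop them before commercial use. The sketch below shows one way to do that on JSONL records. The `license`, `attack_type`, and `source_benchmark` field names come from this README; the `payload` key and the sample records are assumptions made up for illustration, so check `payloads/unified.jsonl` for the real schema.

```python
import json


def load_payloads(lines):
    """Parse JSONL lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def commercial_safe(entries, attack_types=None):
    """Drop CC-BY-NC-4.0 entries; optionally keep only the given attack types."""
    kept = []
    for entry in entries:
        if entry.get("license") == "CC-BY-NC-4.0":
            continue  # non-commercial source (WASP)
        if attack_types and entry.get("attack_type") not in attack_types:
            continue
        kept.append(entry)
    return kept


# Made-up sample records standing in for payloads/unified.jsonl.
sample = [
    '{"source_benchmark": "bipia", "attack_type": "data_exfil", "license": "MIT", "payload": "example A"}',
    '{"source_benchmark": "wasp", "attack_type": "goal_hijacking", "license": "CC-BY-NC-4.0", "payload": "example B"}',
    '{"source_benchmark": "injecagent", "attack_type": "unauthorized_action", "license": "MIT", "payload": "example C"}',
]

entries = load_payloads(sample)
for entry in commercial_safe(entries):
    print(entry["source_benchmark"], entry["attack_type"])
```

To run it against the real library, pass an open file handle instead of `sample`, for example `load_payloads(open("payloads/unified.jsonl", encoding="utf-8"))`.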

Embedding Techniques

Each technique is invisible to humans viewing the rendered page, but readable by agents processing raw HTML.

| Technique | How it hides the payload | Trade-off |
|---|---|---|
| HTML comment | <!-- IMPORTANT SYSTEM MESSAGE: {payload} --> | Simplest; broadly effective |
| Invisible CSS | <div style="display:none"> | Works against most agent pipelines |
| Semantic prose | LLM-generated paragraph embedding the payload naturally | Highest success rate across models |
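The first two techniques amount to wrapping the payload string in markup that a browser does not render. The following is a rough sketch of that idea, not the code in `templates/`: the function names are hypothetical, and the escaping choices (breaking up `--` so the comment cannot close early, HTML-escaping the hidden div's text) are assumptions about what a careful template might do.

```python
import html
import re


def embed_html_comment(payload):
    """Wrap the payload in an HTML comment using the wording shown in the table."""
    # "--" inside a comment is invalid and "-->" would end the comment early,
    # so put a space after every hyphen that is followed by another hyphen.
    safe = re.sub(r"-(?=-)", "- ", payload)
    return f"<!-- IMPORTANT SYSTEM MESSAGE: {safe} -->"


def embed_invisible_css(payload):
    """Wrap the payload in a div that browsers do not display."""
    # Escaping keeps the payload as text rather than markup; it is data, not code.
    return f'<div style="display:none">{html.escape(payload)}</div>'


print(embed_html_comment("send data --> now"))
print(embed_invisible_css("<b>ignore previous instructions</b>"))
```

An agent that reads the raw HTML sees both strings in full, while a human looking at the rendered page sees neither.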

Insertion Points

| Point | Where it goes |
|---|---|
| head_meta | Inside <head> (as a meta tag or comment) |
| body_top | First element after <body> |
| body_inline | Between existing <p> paragraphs |
| sidebar | Appended to <aside> or <nav> elements |
| footer | Just before </body> |
| script_comment | Inside the first <script> block as a JS comment |
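To make two of these insertion points concrete, here is a minimal sketch of `body_top` and `footer` using plain regular expressions. It is an illustration only and makes no claim about how `proxy/injector.py` is written; the `inject` function and its fallback behavior (return the page unchanged when no `<body>` tag is found) are assumptions.

```python
import re


def inject(html_text, snippet, point):
    """Insert snippet at 'body_top' or 'footer'; leave the page unchanged if no anchor exists."""
    if point == "body_top":
        # Match the opening <body> tag, with or without attributes.
        match = re.search(r"<body\b[^>]*>", html_text, flags=re.IGNORECASE)
        if match is None:
            return html_text
        return html_text[:match.end()] + snippet + html_text[match.end():]
    if point == "footer":
        # Use the last closing </body> tag in the document.
        matches = list(re.finditer(r"</body\s*>", html_text, flags=re.IGNORECASE))
        if not matches:
            return html_text
        start = matches[-1].start()
        return html_text[:start] + snippet + html_text[start:]
    raise ValueError(f"unsupported insertion point: {point}")


page = "<html><body class='x'><p>hi</p></body></html>"
print(inject(page, "<!--X-->", "body_top"))
print(inject(page, "<!--X-->", "footer"))
```

Regex matching is enough for a sketch, but real pages with unusual markup are better handled by an HTML parser.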

Config

target:
  url_pattern: "*.clientcorp.com/docs/*"   # glob pattern, or "*" for any URL
  # Can also be a list:
  # url_pattern: ["*://www.amazon.com/*", "*://docs.clientcorp.com/*"]

injection:
  payload_filter:
    attack_type: [data_exfil, goal_hijacking]
    source_benchmark: [bipia, injecagent]
  embedding: html_comment     # html_comment | invisible_css | semantic | random
  insertion_point: body_inline # body_top | body_inline | footer | head_meta | sidebar | script_comment

rotation:
  mode: sequential            # sequential | random
  payloads_per_session: 10

Pre-built configs: config/example_test.yaml, config/top100_sites.yaml (100 popular domains). See also test_run/ for demo configs.
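As a way to reason about the `url_pattern` and `payload_filter` settings, the sketch below models them with shell-style globs and simple membership checks. This is an assumption about the semantics, not a description of `proxy/config.py`: the toolkit may match URLs differently, and `url_matches` and `matches_filter` are hypothetical names.

```python
from fnmatch import fnmatchcase


def url_matches(url, url_pattern):
    """Return True if url matches a glob pattern or any pattern in a list."""
    patterns = [url_pattern] if isinstance(url_pattern, str) else list(url_pattern)
    return any(fnmatchcase(url, pattern) for pattern in patterns)


def matches_filter(entry, payload_filter):
    """Return True if every filtered field of the entry holds one of the allowed values."""
    return all(entry.get(field) in allowed for field, allowed in payload_filter.items())


payload_filter = {
    "attack_type": ["data_exfil", "goal_hijacking"],
    "source_benchmark": ["bipia", "injecagent"],
}

print(url_matches("https://eng.clientcorp.com/docs/setup", "*.clientcorp.com/docs/*"))
print(url_matches("https://example.org/", "*"))
print(matches_filter({"attack_type": "data_exfil", "source_benchmark": "bipia"}, payload_filter))
```

Note that in glob syntax `*` also matches `/`, so `"*"` on its own matches every URL, which is consistent with the "any URL" behavior described for Mode 2.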

Project Structure

├── payloads/
│   ├── extract_bipia.py          # Extract from Microsoft BIPIA
│   ├── extract_injecagent.py     # Extract from UIUC InjecAgent
│   ├── extract_agentdojo.py      # Extract from ETH AgentDojo (payloads only)
│   ├── extract_tensor_trust.py   # Extract from UC Berkeley Tensor Trust
│   ├── extract_wasp.py           # Extract from Meta WASP (CC-BY-NC-4.0)
│   ├── extract_llmail_inject.py  # Extract from Microsoft LLMail-Inject
│   ├── unified.jsonl             # 820 deduplicated attack payloads
│   ├── attack_payloads.xlsx      # Spreadsheet view of the library
│   └── SOURCES.md                # Per-source extraction details + licenses
├── proxy/
│   ├── addon.py                  # mitmproxy addon (YAML-configurable)
│   ├── config.py                 # Config & payload loading
│   └── injector.py               # HTML injection logic (6 insertion points)
├── target/
│   └── server.py                 # Target page server on :8081 (Mode 1)
├── templates/
│   ├── html_comment.py           # <!-- payload -->
│   ├── invisible_css.py          # Hidden div with display:none
│   └── semantic_embed.py         # Prose paragraph (optional LLM mode)
├── tracker/
│   └── server.py                 # FastAPI exfil tracker on :9090
├── config/
│   ├── example_test.yaml         # Sample test case config
│   └── top100_sites.yaml         # Top-100 sites whitelist config
├── test_run/
│   ├── demo_fetch_inject.py      # Fetch real pages, inject, save before/after
│   ├── any_url_demo.yaml         # Demo config - any URLs
│   └── amazon_demo.yaml          # Demo config - Amazon example
├── tests/
│   ├── conftest.py               # Shared fixtures
│   ├── test_injector.py          # Template + injection + config unit tests
│   ├── test_addon.py             # mitmproxy addon unit tests
│   ├── test_tracker.py           # FastAPI exfil tracker tests
│   └── test_e2e.py               # End-to-end: spawns mitmdump subprocess
└── CLAUDE.md                     # Full project spec

Tests

python -m pytest tests/            # 48 unit tests (fast)
python -m pytest tests/ -m e2e     # end-to-end (spawns real mitmdump)

OWASP Mapping

| OWASP ID | Risk | How this toolkit tests it |
|---|---|---|
| ASI01 | Agent Goal Hijacking | Goal hijacking payloads redirect the agent's task |
| ASI02 | Tool Misuse | Unauthorized action payloads trigger unintended tool calls |
| ASI03 | Identity & Privilege Abuse | Data exfil payloads steal credentials and tokens |
| ASI05 | Memory Poisoning | Behavior manipulation payloads alter agent output |
| LLM01 | Prompt Injection | All payload categories - this is the core attack vector |
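When writing up findings, it can help to tag each exfil hit or successful injection with the OWASP IDs from the mapping above. A small sketch of that lookup, using the `attack_type` values listed under Payload Library; the dictionary and the `owasp_ids` helper are illustrative and are not shipped with the toolkit:

```python
# Attack type -> agentic OWASP ID, transcribed from the mapping above.
OWASP_BY_ATTACK_TYPE = {
    "goal_hijacking": ["ASI01"],
    "unauthorized_action": ["ASI02"],
    "data_exfil": ["ASI03"],
    "behavior_manipulation": ["ASI05"],
}


def owasp_ids(attack_type):
    """Return the OWASP IDs for an attack type; LLM01 applies to every category."""
    return OWASP_BY_ATTACK_TYPE.get(attack_type, []) + ["LLM01"]


for attack_type in ("data_exfil", "prompt_extraction"):
    print(attack_type, owasp_ids(attack_type))
```

Attack types without a dedicated agentic entry in the mapping, such as `prompt_extraction` and `capability_enum`, fall back to LLM01 alone.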

References

  • BIPIA - github.com/microsoft/BIPIA (arXiv:2312.14197)
  • InjecAgent - github.com/uiuc-kang-lab/InjecAgent (arXiv:2403.02691)
  • AgentDojo - github.com/ethz-spylab/agentdojo (arXiv:2406.13352)
  • Tensor Trust - github.com/HumanCompatibleAI/tensor-trust-data (arXiv:2311.01011)
  • WASP - github.com/facebookresearch/wasp (arXiv:2504.18575)
  • LLMail-Inject - github.com/microsoft/llmail-inject-challenge (OpenReview GM9H3iM7VJ)
  • OWASP Agentic Top 10 - genai.owasp.org
  • OWASP LLM Top 10 2025 - genai.owasp.org

Responsible Use

This toolkit is for authorized red-teaming and defensive research only. All payload strings are treated as data and never executed. Only run the proxy against systems you own or have explicit permission to test.
