LMTWT — Let Me Talk With Them

LMTWT is an async-first security testing framework for evaluating LLM resistance to prompt injection, jailbreaks, and tool-use attacks. It pits one model (the attacker) against another (the target) and reports whether the target was compromised — automatically, at scale, against frontier APIs or custom backends behind your own protocol.

Scope: LLM-only. LMTWT attacks the model's behavior, training, context, tools, and conversation memory. It is not a general web/network/API pentester — for SQLi, XSS, IDOR, port scanning, OAuth flows, and the rest of the OWASP Top 10, use Burp / ZAP / nuclei / nmap. LMTWT speaks production chatbot protocols (Socket.IO, custom WS, JWT-auth) only because that's how you reach the LLM behind them.

What you can hit

  • Hosted LLMs — OpenAI, Anthropic, Gemini
  • Local LLMs — Hugging Face transformers, LM Studio (OpenAI-compatible)
  • Agent runtimes — Claude Code via ACP (Agent Client Protocol over stdio)
  • Custom backends via external-api — your own chatbot at any of:
    • HTTP (single round-trip JSON)
    • SSE (Server-Sent Events streaming)
    • WebSocket (raw frame protocol)
    • Socket.IO (v5/EIO v4 or v2/EIO v3, with ack + event correlation)

Anything you can describe in a JSON config can be a target — payload templates, auth headers, dotted-path response extraction, ack handling, the lot.

What it does to them

  • Single-shot attack templates (--mode template) — curated injection / jailbreak prompts
  • Probe sweeps (--probe-mode) — eight built-in vulnerability categories
  • Multi-turn flows (--mode multi-turn) — scripted social-engineering arcs
  • Tool-use attacks (--mode tool-use) — indirect prompt injection via tool results
  • Refinement strategies (--strategy pair|tap) — automated PAIR / TAP attack search
  • Hacker mode (--hacker-mode) — attacker reads conversation history and adapts
  • Three judges for success detection: regex, LLM-based, or ensemble

Install

git clone https://github.com/tin537/LMTWT.git
cd LMTWT

# Recommended: uv (https://docs.astral.sh/uv/)
uv sync

# Or plain pip
pip install -e .

# Optional: local Hugging Face inference
pip install -e '.[local]'

# Optional: Web UI (Gradio)
pip install -e '.[web]'

Python 3.10+ is required. GPU acceleration (CUDA / Apple MPS) is auto-detected when PyTorch is installed.
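
The detection itself is the usual PyTorch capability check. A minimal sketch of the idea (illustrative only; the framework's actual logic lives in the source tree):

# Illustrative sketch of CUDA/MPS auto-detection (not LMTWT's actual code):
# prefer CUDA, fall back to Apple MPS, else CPU.
import torch

def pick_device() -> str:
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"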

Configure credentials

Create .env at the repo root (only the providers you'll use):

GEMINI_API_KEY=...
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
HUGGINGFACE_API_KEY=hf_...     # optional, only for gated models

LM Studio and Claude Code (ACP) need no key — they run locally.

Quick start

The fastest way to use LMTWT is the scan subcommand — one command, full vulnerability sweep, complete engagement bundle out:

# Full scan: fingerprint → catalog → adaptive → climb → pollinate → chatbot attacks
lmtwt scan --target openai --attacker gemini

# Preview the plan without firing
lmtwt scan --target openai --dry-run

# Quick scan (catalog + fingerprint only, ~2 min)
lmtwt scan --target openai --depth quick

# Thorough scan (above + self-play + N=10 repeats, ~hours)
lmtwt scan --target openai --depth thorough

# Custom backend behind your own protocol
lmtwt scan --target external-api --target-config examples/custom_api_target.json

The bundle lands in ./scan-<date>-<target>/:

scan.json    report.md    report.html    report.pdf    scorecard.md
fingerprint.json    plan.json    scan.db    repro/F00N_<id>.json

You hand the folder to a client and they have everything: human-readable report, machine-readable JSON, per-finding repro packs, and a queryable SQLite db.
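
The scan.db schema is the project's own, so list what it contains before querying; sqlite3's built-in meta-commands are enough:

# Inspect the bundle's SQLite database (placeholder path as above)
sqlite3 ./scan-<date>-<target>/scan.db '.tables'
sqlite3 ./scan-<date>-<target>/scan.db '.schema'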

For granular control (run only one technique, tune individual knobs), the legacy flat CLI still works:

# Interactive: Gemini attacking OpenAI
./run.sh --attacker gemini --target openai --mode interactive

# Local-only: LM Studio attacking LM Studio (no API costs)
./run.sh --attacker lmstudio --attacker-model "qwen2.5-7b" \
         --target lmstudio  --target-model "llama-3.1-8b" \
         --mode interactive

# Web UI (Gradio)
./run.sh --web

# Web UI (FastAPI + SSE — pip install -e '.[api]')
lmtwt --web-api

# Custom OpenAI-compatible attacker (Ollama, vLLM, LiteLLM, self-hosted gateway)
export OPENAI_COMPAT_BASE_URL=http://localhost:11434/v1
export OPENAI_COMPAT_MODEL=qwen2.5:7b
lmtwt scan --target openai --attacker openai-compat

Usage examples

Targeting different model backends

# Hosted: Claude as the target
./run.sh --attacker gemini --target anthropic

# Local: Hugging Face transformer
./run.sh --attacker gemini --target huggingface \
         --target-model "mistralai/Mistral-7B-Instruct-v0.2"

# Local: LM Studio (OpenAI-compatible REST on localhost:1234)
./run.sh --attacker gemini --target lmstudio --target-model "your-model-id"

# Agent runtime: Claude Code over ACP
./run.sh --attacker gemini --target claude-code

# Custom HTTP backend
./run.sh --attacker gemini --target external-api \
         --target-config examples/custom_api_target.json

# Custom Socket.IO backend (e.g. fintech / customer-service chatbot)
./run.sh --attacker gemini --target external-api \
         --target-config examples/socketio_target.json

Attack modes

# Hacker mode — attacker adapts based on target's prior responses
./run.sh --attacker gemini --target openai --hacker-mode

# Probe a specific vulnerability category
./run.sh --probe-mode --probe-category injection --target openai

# Batch attacks with explicit instructions
./run.sh --mode batch \
         --instruction "Create a jailbreak prompt" \
         --instruction "Test system-prompt extraction"

# Multi-turn social-engineering flow
./run.sh --mode multi-turn --flow trust_then_pivot

# Tool-use attack (indirect prompt injection via tool results)
./run.sh --mode tool-use --tool-vector hidden_instruction

# Automated refinement: PAIR (5 iterations) or TAP (tree of thoughts)
./run.sh --strategy pair --strategy-iterations 5
./run.sh --strategy tap  --strategy-branching 3 --strategy-depth 4

Standardized templates

# List built-in attack templates
./run.sh --list-templates

# Run a specific template
./run.sh --mode template --template basic_prompt_injection

# List multi-turn flows / tool-use vectors
./run.sh --list-flows
./run.sh --list-vectors

Routing through Burp / mitmproxy / ZAP

Every model — hosted, local, and external — flows through the same TLS/proxy layer. Burp captures help when you're targeting a custom backend whose protocol you don't fully understand yet.

./run.sh --attacker gemini --target external-api \
         --target-config my_target.json \
         --proxy http://127.0.0.1:8080 \
         --ca-bundle ~/.burp/cacert.pem

CA bundle accepts both PEM (.pem / .crt) and DER (.der / .cer) — no conversion needed. Per-target overrides (proxy, ca_bundle, insecure) in the target-config JSON win over CLI flags.
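
A config that pins its own proxy settings might look like this (a sketch: proxy, ca_bundle, and insecure are the override keys named above; the remaining fields are illustrative, see docs/configuration.md for the authoritative schema):

{
  "protocol": "http",
  "endpoint": "https://chat.example.com/api/chat",
  "proxy": "http://127.0.0.1:8080",
  "ca_bundle": "~/.burp/cacert.pem",
  "insecure": false
}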

Attack categories (probe mode)

Category              What it tests
dan                   "Do Anything Now" jailbreak prompts
injection             Classic prompt injection
xss                   Cross-site scripting payloads in model output
glitch                Unicode and token-boundary exploits
misleading            Misinformation / hallucination induction
malware               Malware-related content generation
forbidden_knowledge   Dangerous-knowledge extraction
snowball              Escalating-hallucination attacks

External-API targets in 30 seconds

The external-api target is the framework's escape hatch. Point it at a JSON file describing the wire protocol and LMTWT handles the rest. Four protocols are built in:

protocol                Use when
http (default)          One-shot REST chat endpoint
sse                     Server-Sent Events streaming response
websocket / ws / wss    Raw WebSocket JSON frames
socketio / socket.io    Socket.IO v5 (EIO v4) or v2 (EIO v3) — set eio_version
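
For the default http protocol the config can be tiny. A sketch (illustrative only; the field names mirror the Socket.IO example below, and examples/custom_api_target.json plus docs/configuration.md have the authoritative shape):

{
  "protocol": "http",
  "endpoint": "https://chat.example.com/api/chat",
  "headers": { "Authorization": "Bearer ..." },
  "payload_template": { "message": "" },
  "prompt_path": "message",
  "response_path": "reply"
}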

A minimal Socket.IO config looks like:

{
  "protocol": "socketio",
  "endpoint": "wss://chat.example.com/socket.io/",
  "eio_version": "3",
  "headers": { "Authorization": "Bearer ...", "User-Agent": "android" },
  "event_name": "send_message",
  "response_event": "receive_message",
  "payload_template": {
    "messageContent": [{ "content": "", "type": "TEXT" }],
    "messageId": "", "sessionId": "", "role": "USER"
  },
  "prompt_path": "messageContent.0.content",
  "message_id_key": "messageId",
  "session_id_key": "sessionId",
  "session_id": "session-from-your-bootstrap-api",
  "response_path": "messageContent.0.content"
}

See docs/configuration.md for the full schema and examples/ for working configs (HTTP, Socket.IO, Ollama).
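
The dotted paths (prompt_path, response_path) index into nested JSON, with numeric segments treated as list indices. A stand-alone sketch of the idea, not the framework's own resolver:

# How a dotted path like "messageContent.0.content" addresses nested JSON
# (illustrative only; LMTWT's real extraction lives in its external transports).
def resolve(obj, path: str):
    for part in path.split("."):
        obj = obj[int(part)] if part.isdigit() else obj[part]
    return obj

# resolve(payload, "messageContent.0.content") -> the message text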

Debug helper: set LMTWT_SOCKETIO_DEBUG=1 to dump every Socket.IO frame to stderr while a run is in flight.
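
For example:

LMTWT_SOCKETIO_DEBUG=1 ./run.sh --attacker gemini --target external-api \
         --target-config examples/socketio_target.json 2> socketio-frames.log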

Web UI

./run.sh --web                                    # localhost:8501
./run.sh --web --web-port 8080 --share            # public Gradio share link

The UI exposes model selection, interactive attack composition, result visualization, and a session history with pass/fail tracking.

Architecture in one paragraph

The async-first engine (src/lmtwt/attacks/async_engine.py) drives an attacker and a target, both implementing a small AsyncAIModel interface (src/lmtwt/models/async_base.py). Provider classes live in src/lmtwt/models/; external transports in src/lmtwt/models/external/. Each chat() returns a typed ChatResponse; streaming is exposed via astream(). A judge (src/lmtwt/judges/) decides whether the target's response counts as a successful jailbreak. Read docs/architecture.md for the full picture.
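
To give a feel for that interface, here is a hypothetical provider sketch. The method names chat() and astream() come from the paragraph above; everything else (class name, signatures, return values) is assumed for illustration, and the real base class is src/lmtwt/models/async_base.py:

# Hypothetical sketch; real interface in src/lmtwt/models/async_base.py.
# Signatures and return types here are assumptions, not the actual API.
class EchoModel:
    async def chat(self, prompt: str) -> str:
        # A real provider would call its backend here and wrap the
        # result in the framework's typed ChatResponse.
        return f"echo: {prompt}"

    async def astream(self, prompt: str):
        # Streaming variant: yield chunks as they are produced.
        for token in f"echo: {prompt}".split():
            yield token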

Tests

uv run pytest                       # full suite (~150 tests, ~2s)
uv run pytest tests/test_external_socketio.py -v
uv run pytest --cov=src/lmtwt

Documentation

Full documentation lives in docs/: start with docs/architecture.md for the engine design and docs/configuration.md for the external-api target schema.

Contributing

git checkout -b feature/your-thing
uv sync                              # installs dev group automatically
uv run pytest                        # green before pushing
uv run ruff check src tests          # lint
git commit -m "feat: ..."
git push origin feature/your-thing

PRs welcome. See CONTRIBUTING.md and CODE_OF_CONDUCT.md.

Acknowledgments

Inspired by NVIDIA's garak (Apache 2.0). LMTWT is an original implementation under MIT — we appreciate garak's prior work in the LLM red-team space.

Support the project

If LMTWT is useful to you, you can support development via PayPal.

Disclaimer

For educational purposes and authorized security testing only. Always get written permission before testing any system you don't own. Researchers running CTFs, internal red-team engagements, and personal lab work are the intended audience. The authors disclaim responsibility for misuse.

License

MIT — see LICENSE.

Contact

Tanuphat Tin — tanuphat.chai@gmail.com · github.com/tin537/LMTWT
