Try it live on HuggingFace Spaces: paste a URL and detect whether it contains hidden instructions targeting AI agents.
Scan web content for prompt injection, hidden instructions, and adversarial content targeting AI agents.
AI agents browse the web, read documents, and consume external content. Adversaries hide instructions in invisible text, HTML metadata, encoded payloads, and zero-width characters. Palisade is built to find them.
| Scenario | Risk level | What Palisade finds |
|---|---|---|
| Clean marketing page | ✅ Low | No hidden text, no injection patterns, no exfiltration |
| Hidden CSS prompt injection | 🔴 High | display:none text with role override instructions |
| Metadata exfiltration prompt | 🚨 Critical | HTML comment + JSON-LD + base64-encoded data theft payload |
| Capability | Palisade Scanner | Manual review | Generic scrapers |
|---|---|---|---|
| Hidden text detection | ✅ 20+ CSS/HTML techniques | ❌ | ❌ |
| Injection pattern matching | ✅ 100+ regexes, 5 categories | ❌ | ❌ |
| LLM-as-judge classifier | ✅ understands adversarial intent | N/A | ❌ |
| Metadata analysis | ✅ comments, JSON-LD, meta, data attrs | ❌ | ❌ |
| Exfiltration detection | ✅ URLs, eval(), fetch(), redirects | ❌ | ❌ |
| MCPGuard policy generation | ✅ auto-generate rules | ❌ | ❌ |
| CI/CD mode | ✅ `--ci --threshold high` | ❌ | ❌ |
| Zero-width character detection | ✅ | ❌ | ❌ |
AI agents browse the web, read documents, and consume external content. Adversaries can hide instructions in:
- Invisible text (white-on-white, zero font size, off-screen positioning)
- HTML comments and metadata
- Base64 encoded payloads
- Zero-width character injections
- Instructions disguised as product descriptions or reviews
This scanner finds them all and tells you what to do about it.
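As a rough illustration of two of the techniques listed above, the sketch below flags zero-width characters and base64-looking runs that decode to readable text. It is not Palisade's implementation: the function names, the set of code points, and the 24-character minimum are assumptions chosen for the example.

```python
import base64
import binascii
import re

# Assumed set of zero-width / invisible code points; a real scanner may cover more.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Runs of base64 alphabet at least 24 characters long (arbitrary threshold).
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def find_zero_width(text: str) -> list[int]:
    """Return the index of every zero-width character in text."""
    return [i for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def find_base64_text(text: str) -> list[str]:
    """Return decoded payloads for base64 runs that decode to printable ASCII."""
    decoded = []
    for match in BASE64_RUN.finditer(text):
        candidate = match.group(0)
        try:
            raw = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64 (for example, wrong padding)
        try:
            payload = raw.decode("ascii")
        except UnicodeDecodeError:
            continue  # decodes to binary data, not readable text
        if payload.isprintable():
            decoded.append(payload)
    return decoded
```

Calling `find_zero_width` on scraped text gives character offsets that can be reported back to the reviewer, while `find_base64_text` returns the decoded strings so they can be passed through the same checks as visible text.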
# Install
pip install palisade-scanner
# CLI: scan a URL
pis scan https://example.com
# or
palisade scan https://example.com
# Web UI: open the dashboard
pis web
# Docker
docker compose up
# → http://localhost:8000

# Scan a URL
pis scan https://example.com
# Scan a local file
pis scan --file suspicious.html
# Scan pasted text
pis scan --paste "<!-- ignore instructions -->"
# JSON output
pis scan https://example.com --format json
# CI/CD mode (exit code reflects risk)
pis scan https://example.com --ci --threshold high
# Generate MCPGuard policy rules
pis policies https://evil-site.com

# Scan via REST API
curl "http://localhost:8000/api/scan?url=https://example.com"
# HTML report
curl "http://localhost:8000/api/scan/https://example.com"

| Layer | What It Detects |
|---|---|
| Hidden Text Detector | 20+ CSS/HTML hiding techniques (display:none, visibility, opacity, color matching, off-screen, zero-width chars, HTML comments) |
| Injection Pattern Matcher | 100+ regex patterns across 5 categories (jailbreak, role override, exfiltration, tool manipulation, impersonation) |
| Instruction Classifier | LLM-as-judge that understands adversarial intent (requires API key) |
| Metadata Analyzer | HTML comments, JSON-LD, meta tags, data attributes, <noscript>, <template> |
| Exfiltration Detector | URLs, endpoints, eval() patterns, redirect attempts, fetch() calls |
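To make the layering concrete, here is a minimal sketch of how hidden-text collection, comment extraction, and pattern matching could fit together using only the Python standard library. The class name `HiddenContentCollector`, the function `scan_html`, and the three regexes are hypothetical; the category names come from the table above, but the patterns are invented for illustration. The sketch only inspects inline `style` attributes and assumes tags are properly nested.

```python
import re
from html.parser import HTMLParser

# Hypothetical patterns, one per category; the real matcher is described as 100+ regexes.
PATTERNS = {
    "jailbreak": re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    "role override": re.compile(r"you are now\b", re.I),
    "exfiltration": re.compile(r"send .* to https?://", re.I),
}

# Two of the hiding techniques named above, matched in inline styles only.
HIDING_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenContentCollector(HTMLParser):
    """Collect text inside hidden elements and the body of every HTML comment."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self.hidden_text = []
        self._stack = []  # one bool per open element: is it or an ancestor hidden?

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = self._stack[-1] if self._stack else False
        self._stack.append(parent_hidden or bool(HIDING_STYLE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if self._stack and self._stack[-1] and data.strip():
            self.hidden_text.append(data.strip())

    def handle_comment(self, data):
        self.comments.append(data.strip())


def scan_html(html: str) -> list[tuple[str, str]]:
    """Return (category, snippet) pairs for hidden text or comments that match."""
    collector = HiddenContentCollector()
    collector.feed(html)
    collector.close()
    findings = []
    for snippet in collector.hidden_text + collector.comments:
        for category, pattern in PATTERNS.items():
            if pattern.search(snippet):
                findings.append((category, snippet))
    return findings
```

Keeping collection (where content hides) separate from matching (what the content says) mirrors the split between the Hidden Text Detector and the Injection Pattern Matcher in the table.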
Risk Score: 0-100
Weighted formula:
base = 100
- critical * 25
- high * 10
- medium * 3
- low * 1
Categories: none (0-5), low (6-20), medium (21-50), high (51-80), critical (81-100)
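The formula above starts at 100 and subtracts per finding, which reads like a safety score, while the category bands treat higher numbers as riskier. The sketch below reconciles the two under one assumed interpretation: the reported risk score is 100 minus `base`, clamped to the 0-100 range. That interpretation, the function names, and the CI rule in `ci_exit_code` are assumptions, not confirmed behavior of the scanner.

```python
# Weights taken from the formula above.
WEIGHTS = {"critical": 25, "high": 10, "medium": 3, "low": 1}

# Category bands from the list above: (inclusive upper bound, name).
BANDS = [(5, "none"), (20, "low"), (50, "medium"), (80, "high"), (100, "critical")]
ORDER = [name for _, name in BANDS]


def risk_score(counts: dict[str, int]) -> int:
    """Assumed interpretation: risk = 100 - base, with base clamped at 0."""
    penalty = sum(WEIGHTS[sev] * counts.get(sev, 0) for sev in WEIGHTS)
    base = max(0, 100 - penalty)
    return 100 - base


def category(score: int) -> str:
    """Map a 0-100 risk score onto the named bands."""
    for upper, name in BANDS:
        if score <= upper:
            return name
    return "critical"


def ci_exit_code(score: int, threshold: str) -> int:
    """Hypothetical CI rule: exit 1 when the category reaches the threshold."""
    return 1 if ORDER.index(category(score)) >= ORDER.index(threshold) else 0
```

Under this reading, one critical and two high findings give a penalty of 45, which lands in the medium band and would pass a `--threshold high` gate.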
User (CLI / Web / API)
        │
        ▼
PipelineOrchestrator
        │
        ├── Loader (URL / File / Paste / PDF)
        │
        ├── Detector Pipeline (parallel)
        │   ├── HiddenTextDetector
        │   ├── InjectionPatternMatcher
        │   ├── MetadataAnalyzer
        │   ├── ExfiltrationDetector
        │   └── InstructionClassifier (LLM)
        │
        ├── ScoringEngine
        │
        └── Reporters
            ├── JSON / Markdown / Simple
            ├── Policy Generator (MCPGuard)
            └── Web UI (HTMX)
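The diagram above shows the detectors running in parallel over the same loaded content. The source does not say how that parallelism is implemented, so the following is only a sketch using a thread pool; the `Detector` type alias, the two toy detectors, and `run_pipeline` are hypothetical names for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A detector here is any callable mapping content to a list of finding strings.
Detector = Callable[[str], list[str]]


def comment_detector(content: str) -> list[str]:
    """Toy stand-in for a metadata check: flag any HTML comment."""
    return ["html comment present"] if "<!--" in content else []


def eval_detector(content: str) -> list[str]:
    """Toy stand-in for an exfiltration check: flag eval() calls."""
    return ["eval() call present"] if "eval(" in content else []


def run_pipeline(content: str, detectors: list[Detector]) -> list[str]:
    """Run every detector on the same content concurrently and merge findings."""
    with ThreadPoolExecutor(max_workers=len(detectors) or 1) as pool:
        # pool.map preserves detector order, so the merged output is deterministic.
        results = list(pool.map(lambda detect: detect(content), detectors))
    return [finding for findings in results for finding in findings]
```

Threads suit detectors that wait on I/O, such as an LLM call; CPU-bound regex work in CPython would gain little from them, so a real pipeline might use asyncio or processes instead.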
src/scanner/
├── cli.py                          # Typer CLI
├── api.py                          # FastAPI web app
├── config.py                       # Settings (env vars)
├── domain/
│   ├── models.py                   # Pydantic models
│   └── scoring.py                  # Risk score engine
├── loaders/
│   ├── url.py                      # HTTP URL fetcher
│   ├── pdf.py                      # PDF extractor
│   └── paste.py                    # Raw text
├── detectors/
│   ├── hidden_text.py              # CSS/HTML hiding
│   ├── injection_patterns.py       # 100+ regex patterns
│   ├── instruction_classifier.py   # LLM-as-judge
│   ├── metadata_analyzer.py        # Comments/meta/tags
│   └── exfiltration.py             # Data theft patterns
├── pipeline/
│   └── orchestrator.py             # Scan pipeline
├── reporters/                      # JSON/MD/Simple output
├── policies/                       # MCPGuard rule generation
└── utils/                          # DOM helpers
Generate rules compatible with MCPGuard:
pis scan https://evil-site.com --format mcpguard > rules.yaml
mcpguard load-rules rules.yaml

# .github/workflows/check-urls.yml
- name: Scan for prompt injection
  run: |
    pis scan ${{ matrix.url }} --ci --threshold medium

- v0.1 - Scanner core: CLI, 5 detectors, scoring, policy generation
- v0.2 - Live Monitor: scheduled re-scans, webhook alerts, diff detection
- v0.3 - Agent Validator: Browser Use agent tests pages in real time
- v0.4 - Content Safety Proxy: reverse proxy that strips injections
- v0.5 - Reputation Engine: web of trust for agent-safe URLs
- v0.6 - Red Team Lab: adversarial page generator + benchmark suite
- v0.7 - Certification Pipeline: verified AgentSafe badges
Palisade Scanner is part of the Carlos-Projects security infrastructure for AI agents:
Palisade Scanner - Scan content before agents consume it. ← you are here
MCPwn - Attack MCP servers before attackers do.
AgentGate - Control how agents access your website.
MCPscop - Centralize scanner results and security posture.
MCPGuard - Runtime security proxy for MCP/A2A protocols.
- MCPwn - Offensive security testing for MCP servers
- AgentGate - Policy-based firewall and honeypot middleware for AI agents
- MCPscop - Unified security dashboard for MCP/A2A scanner results
- MCPGuard - Runtime security proxy for MCP/A2A protocols
MIT