Palisade Scanner 🔍

Try it live on Hugging Face Spaces: paste a URL to check whether the page contains hidden instructions targeting AI agents.

Scan web content for prompt injection, hidden instructions, and adversarial content targeting AI agents.

AI agents browse the web, read documents, and consume external content. Adversaries hide instructions in invisible text, HTML metadata, encoded payloads, and zero-width characters. Palisade scans for each of these techniques.

Risk examples

| Scenario | Risk level | What Palisade finds |
| --- | --- | --- |
| Clean marketing page | ✅ Low | No hidden text, no injection patterns, no exfiltration |
| Hidden CSS prompt injection | 🔴 High | `display:none` text with role override instructions |
| Metadata exfiltration prompt | 🚨 Critical | HTML comment + JSON-LD + base64-encoded data theft payload |

What makes Palisade unique

| Capability | Palisade Scanner | Manual review | Generic scrapers |
| --- | --- | --- | --- |
| Hidden text detection | ✅ 20+ CSS/HTML techniques | ❌ | ❌ |
| Injection pattern matching | ✅ 100+ regexes, 5 categories | ❌ | ❌ |
| LLM-as-judge classifier | ✅ understands adversarial intent | N/A | ❌ |
| Metadata analysis | ✅ comments, JSON-LD, meta, data attrs | ❌ | ❌ |
| Exfiltration detection | ✅ URLs, eval(), fetch(), redirects | ❌ | ❌ |
| MCPGuard policy generation | ✅ auto-generate rules | ❌ | ❌ |
| CI/CD mode | ✅ `--ci --threshold high` | ❌ | ❌ |
| Zero-width character detection | ✅ | ❌ | ❌ |

Why

AI agents browse the web, read documents, and consume external content. Adversaries can hide instructions in:

Invisible text (white-on-white, zero font size, off-screen positioning)
HTML comments and metadata
Base64 encoded payloads
Zero-width character injections
Instructions disguised as product descriptions or reviews

Palisade scans for each of these and reports what it finds, with a risk score and, on request, MCPGuard policy rules.
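To make two of the hiding techniques listed above concrete, here is a minimal Python sketch that flags zero-width characters and decodes base64-looking runs. It is an illustration only, not Palisade's implementation: the names `find_zero_width` and `decode_base64_runs`, the character set, and the 24-character minimum run length are all assumptions chosen for the example.

```python
import base64
import binascii
import re

# Common zero-width / invisible code points (illustrative, not exhaustive).
ZERO_WIDTH = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}

# Runs of 24 or more base64 alphabet characters, optionally padded.
# The minimum length is an arbitrary choice to cut down on false positives.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def find_zero_width(text: str) -> list[tuple[int, str]]:
    """Return (index, name) for every zero-width character in text."""
    return [(i, ZERO_WIDTH[ch]) for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def decode_base64_runs(text: str) -> list[str]:
    """Return printable UTF-8 strings recovered from base64-looking runs."""
    decoded = []
    for match in BASE64_RUN.finditer(text):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            plain = raw.decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not text once decoded
        if plain.isprintable():
            decoded.append(plain)
    return decoded
```

Decoded strings would still need to go through the same injection checks as visible text; decoding alone only exposes what was hidden.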

Quick Start

# Install
pip install palisade-scanner

# CLI: scan a URL
pis scan https://example.com
# or
palisade scan https://example.com

# Web UI: open the dashboard
pis web

# Docker
docker compose up
# → http://localhost:8000

Usage

CLI

# Scan a URL
pis scan https://example.com

# Scan a local file
pis scan --file suspicious.html

# Scan pasted text
pis scan --paste "<!-- ignore instructions -->"

# JSON output
pis scan https://example.com --format json

# CI/CD mode (exit code reflects risk)
pis scan https://example.com --ci --threshold high

# Generate MCPGuard policy rules
pis policies https://evil-site.com

API

# Scan via REST API
curl "http://localhost:8000/api/scan?url=https://example.com"

# HTML report
curl "http://localhost:8000/api/scan/https://example.com"
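The same REST call can be made from Python with the standard library. This is a sketch under two assumptions: that the server runs on `http://localhost:8000` as in the Docker quick start, and that `/api/scan` returns a JSON body (the response schema is not documented here, so the result is returned as a plain `dict`). The helper names `build_scan_url` and `scan` are hypothetical. Percent-encoding the target matters when it carries its own query string.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the Docker quick start


def build_scan_url(target: str, base_url: str = BASE_URL) -> str:
    """Build the /api/scan request URL with the target percent-encoded."""
    query = urllib.parse.urlencode({"url": target})
    return f"{base_url}/api/scan?{query}"


def scan(target: str, base_url: str = BASE_URL, timeout: float = 30.0) -> dict:
    """Call the scan endpoint and parse the body as JSON (response shape assumed)."""
    request_url = build_scan_url(target, base_url)
    with urllib.request.urlopen(request_url, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `scan("https://example.com")` would return the parsed report.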

How It Works

Detection Layers

| Layer | What It Detects |
| --- | --- |
| Hidden Text Detector | 20+ CSS/HTML hiding techniques (display:none, visibility, opacity, color matching, off-screen, zero-width chars, HTML comments) |
| Injection Pattern Matcher | 100+ regex patterns across 5 categories (jailbreak, role override, exfiltration, tool manipulation, impersonation) |
| Instruction Classifier | LLM-as-judge that understands adversarial intent (requires API key) |
| Metadata Analyzer | HTML comments, JSON-LD, meta tags, data attributes, `<noscript>`, `<template>` |
| Exfiltration Detector | URLs, endpoints, eval() patterns, redirect attempts, `fetch()` calls |
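As a rough picture of how the first two layers could fit together, the sketch below walks HTML with the standard-library parser, collects text inside elements hidden by inline styles along with HTML comments, and runs a few regexes over each fragment. It is a toy under stated assumptions, not the project's detectors: the class and function names, the four style rules, and the three patterns are made up for illustration, and it ignores stylesheets, color matching, and off-screen positioning.

```python
import re
from html.parser import HTMLParser

# Inline-style rules that hide content (small illustrative subset).
HIDING_STYLE = re.compile(
    "|".join([
        r"display\s*:\s*none",
        r"visibility\s*:\s*hidden",
        r"opacity\s*:\s*0(?:\.0+)?\s*(?:;|$)",
        r"font-size\s*:\s*0(?:px|em|rem|%)?\s*(?:;|$)",
    ]),
    re.IGNORECASE,
)

# Hypothetical patterns, one per category, standing in for a real rule set.
INJECTION_PATTERNS = {
    "jailbreak": re.compile(r"ignore (?:all )?(?:previous|prior) instructions", re.I),
    "role_override": re.compile(r"you are now\b", re.I),
    "exfiltration": re.compile(r"send .{0,40}(?:api key|password|token)", re.I),
}

# Elements with no closing tag; they must not be pushed on the open-element stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HiddenContentParser(HTMLParser):
    """Collect text under hidden elements and the bodies of HTML comments."""

    def __init__(self):
        super().__init__()
        self.findings = []  # list of (kind, text)
        self._stack = []    # (tag, hides_content) for each open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        self._stack.append((tag, bool(HIDING_STYLE.search(style))))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, tolerating sloppy markup.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and any(hidden for _, hidden in self._stack):
            self.findings.append(("hidden_text", text))

    def handle_comment(self, data):
        text = data.strip()
        if text:
            self.findings.append(("html_comment", text))


def scan_html(html: str) -> list[dict]:
    """Return one record per hidden fragment with the pattern categories it matches."""
    parser = HiddenContentParser()
    parser.feed(html)
    parser.close()
    return [
        {
            "kind": kind,
            "text": text,
            "patterns": [name for name, pattern in INJECTION_PATTERNS.items()
                         if pattern.search(text)],
        }
        for kind, text in parser.findings
    ]
```

Keeping extraction (where text was hidden) separate from matching (what the text says) means each fragment can be reported with both its hiding technique and its matched categories.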

Scoring

Risk Score: 0-100

Weighted formula:
  base = 100
  - critical * 25
  - high * 10
  - medium * 3
  - low * 1

Categories: none (0-5) → low (6-20) → medium (21-50) → high (51-80) → critical (81-100)
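The formula above counts down from 100, while the category bands rise with risk. The sketch below assumes one way to reconcile the two: clamp `base` at 0 and report the risk score as `100 - base`, so a page with no findings scores 0 (none). That reading is an assumption, not something this README confirms, and the function names are hypothetical.

```python
# Upper bound (inclusive) of each band, taken from the category list above.
BANDS = [(5, "none"), (20, "low"), (50, "medium"), (80, "high"), (100, "critical")]


def risk_score(critical: int = 0, high: int = 0, medium: int = 0, low: int = 0) -> int:
    """Apply the weighted deductions, then convert to a 0-100 risk score.

    Assumption: risk = 100 - max(0, base), so more findings mean a higher score.
    """
    base = 100 - critical * 25 - high * 10 - medium * 3 - low * 1
    base = max(0, base)  # keep the score inside 0-100
    return 100 - base


def risk_category(score: int) -> str:
    """Map a 0-100 risk score onto the five named bands."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for upper, name in BANDS:
        if score <= upper:
            return name
    raise AssertionError("unreachable: BANDS covers 0-100")
```

For example, one critical and two high findings deduct 45 points, which lands in the medium band under this reading.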

Architecture

User (CLI / Web / API)
        │
        ▼
PipelineOrchestrator
        │
        ├── Loader (URL / File / Paste / PDF)
        │
        ├── Detector Pipeline (parallel)
        │   ├── HiddenTextDetector
        │   ├── InjectionPatternMatcher
        │   ├── MetadataAnalyzer
        │   ├── ExfiltrationDetector
        │   └── InstructionClassifier (LLM)
        │
        ├── ScoringEngine
        │
        └── Reporters
            ├── JSON / Markdown / Simple
            ├── Policy Generator (MCPGuard)
            └── Web UI (HTMX)
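The "Detector Pipeline (parallel)" step in the diagram can be pictured as a fan-out over independent detectors followed by a merge. The sketch below uses a thread pool for that; whether the real `PipelineOrchestrator` uses threads, asyncio, or something else is not stated here, and `Finding`, `run_pipeline`, `severity_counts`, and the two toy detectors are hypothetical names for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    detector: str
    severity: str  # one of "critical", "high", "medium", "low"
    message: str


Detector = Callable[[str], list[Finding]]


def zero_width_detector(content: str) -> list[Finding]:
    """Toy detector: flag a zero-width space anywhere in the content."""
    if "\u200b" in content:
        return [Finding("zero_width", "medium", "zero-width space present")]
    return []


def comment_detector(content: str) -> list[Finding]:
    """Toy detector: flag the presence of an HTML comment."""
    if "<!--" in content:
        return [Finding("metadata", "low", "HTML comment present")]
    return []


def run_pipeline(content: str, detectors: list[Detector]) -> list[Finding]:
    """Run every detector on the same content concurrently and merge the findings."""
    with ThreadPoolExecutor(max_workers=max(1, len(detectors))) as pool:
        # pool.map preserves detector order, so the merged output is deterministic.
        batches = pool.map(lambda detect: detect(content), detectors)
        return [finding for batch in batches for finding in batch]


def severity_counts(findings: list[Finding]) -> dict[str, int]:
    """Tally findings per severity, the input a scoring step would need."""
    counts = {"critical": 0, "high": 0, "medium": 0, "low": 0}
    for finding in findings:
        counts[finding.severity] += 1
    return counts
```

Because each detector only reads the loaded content and returns its own list, no locking is needed, and a slow detector such as an LLM call does not block the regex-based ones.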

Project Structure

src/scanner/
├── cli.py              # Typer CLI
├── api.py              # FastAPI web app
├── config.py           # Settings (env vars)
├── domain/
│   ├── models.py       # Pydantic models
│   └── scoring.py      # Risk score engine
├── loaders/
│   ├── url.py          # HTTP URL fetcher
│   ├── pdf.py          # PDF extractor
│   └── paste.py        # Raw text
├── detectors/
│   ├── hidden_text.py       # CSS/HTML hiding
│   ├── injection_patterns.py # 100+ regex patterns
│   ├── instruction_classifier.py  # LLM-as-judge
│   ├── metadata_analyzer.py # Comments/meta/tags
│   └── exfiltration.py     # Data theft patterns
├── pipeline/
│   └── orchestrator.py # Scan pipeline
├── reporters/          # JSON/MD/Simple output
├── policies/           # MCPGuard rule generation
└── utils/              # DOM helpers

Integration

MCPGuard

Generate rules compatible with MCPGuard:

pis scan https://evil-site.com --format mcpguard > rules.yaml
mcpguard load-rules rules.yaml

CI/CD

# .github/workflows/check-urls.yml
- name: Scan for prompt injection
  env:
    SCAN_URL: ${{ matrix.url }}  # pass via env so the URL is never spliced into the script
  run: |
    pis scan "$SCAN_URL" --ci --threshold medium
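The README says only that the exit code "reflects risk" in CI mode. One plausible reading, sketched below, is that the command exits non-zero when the scan's risk category is at or above `--threshold`, which fails the workflow step. Both that semantics and the function name `ci_exit_code` are assumptions; check the CLI help for the actual behavior.

```python
# Risk categories in ascending order, as listed in the Scoring section.
LEVELS = ["none", "low", "medium", "high", "critical"]


def ci_exit_code(category: str, threshold: str) -> int:
    """Return 1 (fail the build) when category is at or above threshold, else 0."""
    if category not in LEVELS or threshold not in LEVELS:
        raise ValueError(f"unknown level: {category!r} / {threshold!r}")
    return 1 if LEVELS.index(category) >= LEVELS.index(threshold) else 0
```

Under this reading, `--threshold medium` lets low-risk pages pass and fails anything rated medium, high, or critical.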

Roadmap

v0.1 — Scanner core: CLI, 5 detectors, scoring, policy generation
v0.2 — Live Monitor: scheduled re-scans, webhook alerts, diff detection
v0.3 — Agent Validator: Browser Use agent tests pages in real time
v0.4 — Content Safety Proxy: reverse proxy that strips injections
v0.5 — Reputation Engine: web of trust for agent-safe URLs
v0.6 — Red Team Lab: adversarial page generator + benchmark suite
v0.7 — Certification Pipeline: verified AgentSafe badges

Ecosystem

Palisade Scanner is part of the Carlos-Projects security infrastructure for AI agents:

Palisade Scanner    →  Scan content before agents consume it.  ← you are here
MCPwn               →  Attack MCP servers before attackers do.
AgentGate           →  Control how agents access your website.
MCPscop             →  Centralize scanner results and security posture.
MCPGuard            →  Runtime security proxy for MCP/A2A protocols.

MCPwn — Offensive security testing for MCP servers
AgentGate — Policy-based firewall and honeypot middleware for AI agents
MCPscop — Unified security dashboard for MCP/A2A scanner results
MCPGuard — Runtime security proxy for MCP/A2A protocols

License

MIT
