The web protocol for the agentic economy.
An open-source infrastructure layer for any site with content: blogs, news, portfolios, forums, docs. Site owners and content creators serve machine-ready content to AI agents and monetize AI traffic, replacing illegal scraping with a paid, signed, legally verifiable pipeline.
Quick Start · Site owner guide · AI Agent Guide · Development
Every day, AI companies scrape the same web pages thousands of times, burning compute, violating copyright, and creating legal liability for everyone involved.
Fairfetch solves all three problems at once:
|
Pre-process once at the source. Sites convert HTML to clean Markdown and generate summaries once. AI agents fetch the result, eliminating the redundant 1,000x compute cost of web crawling. |
Cryptographic proof of legal access. Every request produces an Ed25519-signed Usage Grant: courtroom-grade evidence of authorized usage. You set the terms. AI companies sleep at night. |
Cut out the middleman crawlers. Edge workers steer known bots (GPTBot, CCBot) from raw HTML toward the official API, converting "illegal" crawls into paid, legal API hits in real time. |
| | Content provider (site owner) | AI Agent (consumer) |
|---|---|---|
| Goal | Monetize content from AI traffic, stop illegal scraping | Get clean content with legal cover |
| Deploys | Edge worker + Fairfetch API | MCP client or REST calls |
| Gets | Revenue, analytics, legal control | Markdown, signatures, Usage Grants |
| Guide | Site owner onboarding → | AI Agent Integration → |
git clone https://github.com/Fairfetch-co/fairfetch.git
cd fairfetch
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
make setup-dev
# Start the development server
make dev
For site owners: verify your setup
# Health check
curl http://localhost:8402/health
# Simulate an AI agent paying for content
curl -v -H "X-PAYMENT: test_paid_fairfetch" \
-H "Accept: application/ai-context+json" \
"http://localhost:8402/content/fetch?url=https://example.com"
# Response headers include the three pillars:
# X-FairFetch-Origin-Signature → Legal (content is signed)
# X-FairFetch-License-ID → Indemnity (Usage Grant issued)
# X-PAYMENT-RECEIPT → Payment (settlement confirmed)
# Simulate a scraper β see the steering headers
curl -v -H "X-PAYMENT: test_paid_fairfetch" \
-H "User-Agent: GPTBot/1.0" \
-H "Accept: text/html" \
"http://localhost:8402/content/fetch?url=https://example.com" 2>&1 \
| grep "X-FairFetch"
For AI Agents: get content in 30 seconds
Option 1: MCP (zero code):
npx @modelcontextprotocol/inspector python -m mcp_server.server
# → Connect → Tools → call fetch_article_markdown with url: "https://example.com"
Option 2: REST API:
curl -H "X-PAYMENT: test_paid_fairfetch" \
-H "Accept: text/markdown" \
"http://localhost:8402/content/fetch?url=https://example.com"
Option 3: Python client:
import asyncio
import httpx
async def main():
    async with httpx.AsyncClient() as c:
        r = await c.get(
            "http://localhost:8402/content/fetch",
            params={"url": "https://example.com"},
            headers={
                "X-PAYMENT": "test_paid_fairfetch",  # test token, local development only
                "Accept": "application/ai-context+json",
            },
        )
        print(r.json())
        print(r.headers["X-FairFetch-License-ID"])  # Usage Grant reference
asyncio.run(main())
Tip
Every response includes the Green + Legal + Indemnity triple:
clean Markdown content, an X-FairFetch-Origin-Signature header, and an X-FairFetch-License-ID header.
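As a rough illustration of the tip above, a client could confirm that both headers are present before trusting a response. The helper name `missing_pillars` is hypothetical and not part of Fairfetch; only the two header names come from this README.

```python
# Hypothetical client-side check; not part of the Fairfetch codebase.
REQUIRED_HEADERS = (
    "X-FairFetch-Origin-Signature",  # Legal: content is signed
    "X-FairFetch-License-ID",        # Indemnity: Usage Grant issued
)


def missing_pillars(headers: dict[str, str]) -> list[str]:
    """Return the required headers absent from a response (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]


if __name__ == "__main__":
    sample = {"x-fairfetch-origin-signature": "GllQLb/..."}
    print(missing_pillars(sample))
```

An empty list means both headers arrived; anything else names what is missing.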
              AI Agent / LLM
       (ChatGPT, Claude, Perplexity)
              /             \
    MCP (stdio)               REST API
         |                        |
+--------v--------+  +------------v------------+
|   MCP Server    |  |     FastAPI (:8402)     |
|    (FastMCP)    |  |                         |
|                 |  |  Content      x402      |
|  get_summary    |  |  Negotiation  Middleware|
|  fetch_md       |  |  Bot Steering           |
|  verified_facts |  |  Usage Grants           |
+--------+--------+  +------------+------------+
         |                        |
         +-----------+  +---------+
                     |  |
          +----------v--v-----------+
          |       Core Engine       |
          |    (Green AI Layer)     |
          |                         |
          |  Converter  Summarizer  |
          |  (trafilat) (LiteLLM)   |
          |  KnowledgePacket (LD)   |
          |  Ed25519 Signatures     |
          +----------+--------------+
                     |
          +----------v--------------+
          |    Interfaces Layer     |
          |     (Open Standard)     |
          |                         |
          |  BaseFacilitator        |
          |  BaseSummarizer         |
          |  BaseLicenseProvider    |
          +-------------------------+
The payment flow works like a toll booth: you ask for content, get told the price, pay, and then receive the content along with a receipt and a legal access grant.
Path A: x402 (one-time payment). Agent → 402 → pay → retry → Fairfetch settles with Facilitator → 200 + content + grant.
Agent                         Fairfetch                  Facilitator
  |                               |                           |
  |  GET /content/fetch?url=...   |                           |
  |      &usage=rag               |                           |
  |------------------------------>|                           |
  |                               |                           |
  |  402 Payment Required         |                           |
  |  { accepts: { price (2x),     |                           |
  |    usage_category: "rag" },   |                           |
  |    available_tiers: {...} }   |                           |
  |<------------------------------|                           |
  |                               |                           |
  |  GET + X-PAYMENT: <proof>     |                           |
  |------------------------------>|                           |
  |                               |  POST /settle             |
  |                               |-------------------------> |
  |                               |  { valid, tx_hash }       |
  |                               |<------------------------- |
  |                               |                           |
  |  200 OK + Content             |                           |
  |  X-PAYMENT-RECEIPT: 0x...     |                           |
  |  X-FairFetch-License-ID: ...  |                           |
  |  X-FairFetch-Usage-Category:  |                           |
  |    rag                        |                           |
  |  X-FairFetch-Compliance-Level:|                           |
  |    standard                   |                           |
  |<------------------------------|                           |
Path B: Wallet (pre-funded). Agent sends X-WALLET-TOKEN; Fairfetch charges the ledger and returns content in one round-trip. No 402, no Facilitator call.
Agent                         Fairfetch                    Ledger
  |                               |                           |
  |  GET /content/fetch?url=...   |                           |
  |  X-WALLET-TOKEN: <token>      |                           |
  |------------------------------>|                           |
  |                               |  charge(wallet, price)    |
  |                               |-------------------------> |
  |                               |  { tx_id, balance }       |
  |                               |<------------------------- |
  |                               |                           |
  |  200 OK + Content             |                           |
  |  X-FairFetch-Payment-Method:  |                           |
  |    wallet                     |                           |
  |  X-FairFetch-Wallet-Balance:  |                           |
  |  X-PAYMENT-RECEIPT: ff_...    |                           |
  |  X-FairFetch-License-ID: ...  |                           |
  |<------------------------------|                           |
Step 1: Ask for content (no payment yet):
curl "http://localhost:8402/content/fetch?url=https://example.com&usage=rag"
You get back a 402 Payment Required response. This is not an error; it's a price quote:
{
"accepts": {
"price": "2000",
"asset": "USDC",
"network": "base",
"payTo": "0x742d35Cc...",
"usage_category": "rag",
"compliance_level": "standard"
},
"available_tiers": {
"search_engine_indexing": { "price": "0", "compliance_level": "standard" },
"summary": { "price": "1000", "compliance_level": "standard" },
"rag": { "price": "2000", "compliance_level": "standard" },
"research": { "price": "3000", "compliance_level": "elevated" },
"training": { "price": "5000", "compliance_level": "strict" },
"commercial": { "price": "10000", "compliance_level": "strict" }
},
"error": "Payment Required",
"message": "This content requires micro-payment via x402..."
}
- price: cost in the smallest unit of USDC (1000 = $0.001). The base price can vary by content URL path when the site owner sets route-based pricing (e.g. /business vs /sports); it is then multiplied by the usage tier.
- payTo: the content owner's wallet address where payment goes.
- available_tiers: all usage options with their prices, so you can pick the right one.
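To make the price arithmetic concrete, here is a minimal sketch that converts a quoted price into dollars. It assumes USDC's six decimal places, which is consistent with the 1000 = $0.001 figure above. The function name `usdc_units_to_usd` is illustrative and not part of Fairfetch.

```python
from decimal import Decimal

USDC_DECIMALS = 6  # assumption: 1 USDC = 1_000_000 smallest units


def usdc_units_to_usd(units: str) -> Decimal:
    """Convert a price string in smallest USDC units to a dollar amount."""
    return Decimal(units) / (Decimal(10) ** USDC_DECIMALS)


if __name__ == "__main__":
    quote = {"price": "2000", "asset": "USDC"}  # values from the 402 example
    print(usdc_units_to_usd(quote["price"]))  # 0.002
```

Using `Decimal` rather than `float` avoids rounding surprises when many micro-payments are summed.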
Step 2: Pay and get content:
curl -H "X-PAYMENT: test_paid_fairfetch" \
-H "Accept: text/markdown" \
"http://localhost:8402/content/fetch?url=https://example.com&usage=rag"
The X-PAYMENT header carries your payment proof. In production this is a cryptographic token from a real payment. For local testing, any value starting with test_ works.
You get back 200 OK with the content and these headers:
X-PAYMENT-RECEIPT: 0x6d8ce1bf2daf... # Transaction proof (like a bank receipt)
X-FairFetch-License-ID: 47db4290...:k2+w # Your legal access grant (store this!)
X-FairFetch-Usage-Category: rag # Confirmed: you paid for RAG usage
X-FairFetch-Compliance-Level: standard # Compliance tier for this usage
X-FairFetch-Origin-Signature: GllQLb/... # Content owner's digital signature on the content
X-Content-Hash: sha256:2c449548... # Fingerprint of the content
Note
For local testing, any X-PAYMENT value starting with test_ is accepted.
The magic token test_paid_fairfetch always works. No real wallet or money needed.
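The two steps above can be sketched as a single client function. This is an illustration under stated assumptions, not the project's client: the transport is injected so the flow runs without a server, `fake_server` only mimics the responses shown above, the `pay` callback is hypothetical, and the `{"content": ...}` body shape is a placeholder.

```python
from typing import Callable, Mapping

# A transport takes request headers and returns (status, response_headers, body).
Transport = Callable[[Mapping[str, str]], tuple[int, dict[str, str], dict]]


def fetch_with_x402(send: Transport, pay: Callable[[dict], str]) -> tuple[dict[str, str], dict]:
    """Request content; on 402, pay the quoted terms and retry once."""
    status, headers, body = send({})
    if status == 402:
        proof = pay(body["accepts"])  # hypothetical payment callback
        status, headers, body = send({"X-PAYMENT": proof})
    if status != 200:
        raise RuntimeError(f"unexpected status {status}")
    return headers, body


def fake_server(request_headers: Mapping[str, str]):
    """Stand-in for the local server, mimicking the responses shown above."""
    if request_headers.get("X-PAYMENT", "").startswith("test_"):
        return 200, {"X-PAYMENT-RECEIPT": "0x..."}, {"content": "# Example"}
    return 402, {}, {"accepts": {"price": "2000", "usage_category": "rag"}}


if __name__ == "__main__":
    headers, body = fetch_with_x402(fake_server, lambda terms: "test_paid_fairfetch")
    print(headers["X-PAYMENT-RECEIPT"], body["content"])
```

Retrying only once keeps a rejected payment proof from looping; a real client would surface the second 402 to the caller.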
The 402 round-trip makes sense for discovery, but once an AI company is onboarded it's inefficient to negotiate payment on every request. Fairfetch supports pre-funded wallets that skip the 402 entirely:
# Register a wallet (in production, this happens through the Fairfetch marketplace)
curl -X POST "http://localhost:8402/wallet/register?owner=AcmeAI&initial_balance=100000"
# → {"wallet_token": "wallet_a1b2c3d4...", "balance": 100000, ...}
# Now fetch content instantly: no 402, no X-PAYMENT negotiation
curl -H "X-WALLET-TOKEN: wallet_test_agent_alpha" \
-H "Accept: text/markdown" \
"http://localhost:8402/content/fetch?url=https://example.com"
The response includes your remaining balance and a transaction receipt:
X-FairFetch-Payment-Method: wallet # Paid via wallet (not x402)
X-FairFetch-Wallet-Balance: 99000 # Remaining balance after this charge
X-PAYMENT-RECEIPT: ff_3a7c9e2b... # Transaction ID in the ledger
X-FairFetch-License-ID: 47db4290...:k2+w # Usage Grant (same as x402 flow)
How it works in practice:
| | x402 (One-Time Payment) | Wallet (Pre-Funded) |
|---|---|---|
| First request | 402 → pay → retry → content | Content immediately |
| Round-trips | 2 | 1 |
| Best for | Occasional access, discovery | High-volume production use |
| Billing | Per-request settlement | Balance deducted, settled monthly (Premium) |
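For intuition about the wallet path, the sketch below shows a toy in-memory ledger with a `charge` step that returns a transaction ID and the remaining balance. The class and exception names are hypothetical; the real `payments/wallet_ledger.py` may be structured quite differently.

```python
import uuid
from dataclasses import dataclass, field


class InsufficientFunds(Exception):
    """Raised when a wallet cannot cover a charge."""


@dataclass
class WalletLedger:
    """Toy in-memory ledger, for illustration only."""
    balances: dict[str, int] = field(default_factory=dict)

    def charge(self, token: str, price: int) -> dict:
        """Deduct price from the wallet and return a receipt."""
        balance = self.balances.get(token)
        if balance is None:
            raise KeyError("unknown wallet token")
        if balance < price:
            raise InsufficientFunds(f"balance {balance} < price {price}")
        self.balances[token] = balance - price
        return {"tx_id": f"ff_{uuid.uuid4().hex[:8]}", "balance": self.balances[token]}


if __name__ == "__main__":
    ledger = WalletLedger({"wallet_test_agent_alpha": 100_000})
    print(ledger.charge("wallet_test_agent_alpha", 1000)["balance"])  # 99000
```

Checking the balance before deducting means a failed charge leaves the wallet untouched.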
Wallet management endpoints
# Check balance
curl "http://localhost:8402/wallet/balance?token=wallet_test_agent_alpha"
# → {"owner": "TestAgentAlpha", "balance": 99000, ...}
# Add funds
curl -X POST "http://localhost:8402/wallet/topup?token=wallet_test_agent_alpha&amount=50000"
# → {"amount_added": 50000, "new_balance": 149000}
# Transaction history
curl "http://localhost:8402/wallet/transactions?token=wallet_test_agent_alpha"
# → {"transactions": [{"tx_id": "ff_...", "amount": 1000, ...}, ...]}
Tip
Two test wallets are pre-loaded for local development:
- wallet_test_agent_alpha: balance 100,000 ($0.10)
- wallet_test_agent_beta: balance 500,000 ($0.50)
Not all content usage is equal. Fairfetch defines usage categories that control what an AI agent is permitted to do with the content, with escalating compliance requirements and pricing:
| Category | Compliance | Price Multiplier | Use Case |
|---|---|---|---|
| search_engine_indexing | Standard | 0x (free) | Search engine crawling for indexing; free when site owner allows (see config) |
| summary | Standard | 1x | Display a short summary or snippet |
| rag | Standard | 2x | Retrieval-Augmented Generation / search grounding |
| research | Elevated | 3x | Academic or internal research use |
| training | Strict | 5x | Model fine-tuning or pre-training |
| commercial | Strict | 10x | Redistribution or commercial derivative works |
Important
The usage parameter is specified via query param (?usage=rag), HTTP header (X-USAGE-CATEGORY: training), or MCP tool argument. It determines the effective price and the compliance level recorded in the Usage Grant.
# Fetch for RAG (2x base price)
curl -H "X-PAYMENT: test_paid_fairfetch" \
"http://localhost:8402/content/fetch?url=https://example.com&usage=rag"
# Fetch for training (5x base price, strict compliance)
curl -H "X-PAYMENT: test_paid_fairfetch" \
"http://localhost:8402/content/fetch?url=https://example.com&usage=training"
# The 402 response includes all available tiers and their prices
curl "http://localhost:8402/content/fetch?url=https://example.com"
Every 402 response includes an available_tiers object showing the price for each category, so agents can choose the appropriate tier for their needs.
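The tier pricing can be read as a simple lookup: effective price = base price × tier multiplier. The sketch below uses the multipliers and compliance levels listed in this section; the `quote` function itself is hypothetical and not the server's implementation.

```python
# Multipliers and compliance levels copied from the usage-category table.
TIERS = {
    "search_engine_indexing": (0, "standard"),
    "summary": (1, "standard"),
    "rag": (2, "standard"),
    "research": (3, "elevated"),
    "training": (5, "strict"),
    "commercial": (10, "strict"),
}


def quote(base_price: int, usage: str = "summary") -> dict:
    """Return the effective price and compliance level for a usage category."""
    if usage not in TIERS:
        raise ValueError(f"unknown usage category: {usage!r}")
    multiplier, compliance = TIERS[usage]
    return {
        "price": str(base_price * multiplier),  # prices are strings in the 402 body
        "usage_category": usage,
        "compliance_level": compliance,
    }


if __name__ == "__main__":
    print(quote(1000, "rag"))
```

With the default base price of 1000, this reproduces the `available_tiers` values in the sample 402 response (for example, 5000 for training and 10000 for commercial).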
A Usage Grant is your proof of legal access. Think of it as a digitally signed receipt that says "this AI agent paid for and was authorized to use this content, for this specific purpose, on this date."
Every field is included in the digital signature, so nothing can be changed after the fact:
{
"grant_id": "a1b2c3d4...",
"content_url": "https://example.com/article",
"content_hash": "sha256:2c449548...",
"license_type": "publisher-terms",
"usage_category": "rag",
"granted_to": "0xPayerWallet...",
"granted_at": "2026-02-22T12:00:00Z",
"signature": {
"algorithm": "Ed25519",
"signature": "GllQLb/V4Vd+SuTY9Gk...",
"publicKey": "J2nlmFsgoUtF3Avdmkt..."
}
}
| Field | What It Means |
|---|---|
| grant_id | A unique ID for this specific access event, like an order number. |
| content_url | The article or page that was accessed. |
| content_hash | A fingerprint of the exact content delivered, proving what was received. |
| license_type | The terms set by the content owner (e.g. "publisher-terms", "research-only"). |
| usage_category | What the AI agent said it would use the content for (e.g. "rag", "training"). This is locked in: you can't pay for "summary" and later claim you used it for "training." |
| granted_to | The wallet or identity of who paid. |
| granted_at | When the access happened. |
| signature | The content owner's digital signature covering all the fields above. Like a notarized stamp: unforgeable and tamper-proof. The publicKey lets anyone independently verify it. |
Important
Store your grants. If a content owner ever questions whether you had permission to use their content, the grant is your courtroom-ready evidence. No he-said-she-said, just math.
How to verify a grant
The signature covers all grant fields joined with |. You can verify it with any Ed25519 library:
from nacl.signing import VerifyKey
import base64
public_key = base64.b64decode("J2nlmFsgoUtF3Avdmkt...")
signature = base64.b64decode("GllQLb/V4Vd+SuTY9Gk...")
payload = "a1b2c3d4...|https://example.com/article|sha256:2c449548...|publisher-terms|rag|0xPayerWallet...|2026-02-22T12:00:00Z"
VerifyKey(public_key).verify(payload.encode(), signature) # raises BadSignatureError if tampered
Or use the built-in helper:
from interfaces.license_provider import UsageGrant
# grant_data: the grant JSON shown above, parsed into a dict
grant = UsageGrant.model_validate(grant_data)
print(f"Valid: {grant.verify()}")
Three tools for AI agents (all accept an optional usage parameter for tier selection):
| Tool | Description |
|---|---|
| get_site_summary | Summary + origin signature + usage grant |
| fetch_article_markdown | Clean Markdown (Green AI) |
| get_verified_facts | Full knowledge packet + lineage + grant |
Test with MCP Inspector
make dev-mcp
# Opens the Inspector UI in your browser. Click Connect → Tools → call any tool.
Add to Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):
{
"mcpServers": {
"fairfetch": {
"command": "python",
"args": ["-m", "mcp_server.server"],
"cwd": "/absolute/path/to/fairfetch",
"env": { "FAIRFETCH_TEST_MODE": "true" }
}
}
}
Add to Cursor IDE
Create .cursor/mcp.json in your project root:
{
"mcpServers": {
"fairfetch": {
"command": "python",
"args": ["-m", "mcp_server.server"],
"cwd": "/absolute/path/to/fairfetch",
"env": { "FAIRFETCH_TEST_MODE": "true" }
}
}
}
When a known crawler (GPTBot, CCBot, etc.) requests raw HTML, edge workers inject headers steering it to the legal path:
X-FairFetch-Preferred-Access: mcp+json-ld
X-FairFetch-LLMS-Txt: /.well-known/llms.txt
X-FairFetch-MCP-Endpoint: /mcp
Link: </.well-known/llms.txt>; rel="ai-policy", </mcp>; rel="ai-content-api"
The /health endpoint reports scraper_interceptions, the count of crawler requests steered, so site owners can track how many crawls were redirected to the official API.
Edge boilerplates are provided for Cloudflare Workers, AWS CloudFront Lambda@Edge, Fastly Compute@Edge, and Akamai EdgeWorkers.
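The steering decision can be sketched in a few lines. This is not the shipped edge-worker code: the function name and the substring matching are assumptions, the crawler list holds only the two bots named in this README, and the header values are copied from the example above.

```python
KNOWN_CRAWLERS = ("GPTBot", "CCBot")  # README examples; a real list would be longer

STEERING_HEADERS = {
    "X-FairFetch-Preferred-Access": "mcp+json-ld",
    "X-FairFetch-LLMS-Txt": "/.well-known/llms.txt",
    "X-FairFetch-MCP-Endpoint": "/mcp",
    "Link": '</.well-known/llms.txt>; rel="ai-policy", </mcp>; rel="ai-content-api"',
}


def steering_headers(user_agent: str, accept: str) -> dict[str, str]:
    """Return headers to add when a known crawler asks for raw HTML."""
    is_crawler = any(bot.lower() in user_agent.lower() for bot in KNOWN_CRAWLERS)
    wants_html = "text/html" in accept.lower()
    return dict(STEERING_HEADERS) if is_crawler and wants_html else {}


if __name__ == "__main__":
    print(steering_headers("GPTBot/1.0", "text/html"))
```

Regular browsers and agents that already request Markdown or JSON get an empty dict, so their responses are left unchanged.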
Every successful response includes these headers. Think of them as a receipt and proof-of-origin attached to the content:
| Header | What It Means | Example Value |
|---|---|---|
| X-Data-Origin-Verified | "This content came directly from the content source, not a third party." Required by the EU AI Act for provenance tracking. | true |
| X-AI-License-Type | The terms under which the content owner is licensing this content to you. | publisher-terms |
| X-FairFetch-Usage-Category | What you told us you're using the content for. This is locked into your Usage Grant. | rag |
| X-FairFetch-Compliance-Level | How strict the rules are for your chosen usage. Higher-impact uses (like training) require stricter compliance. | standard |
| X-FairFetch-Origin-Signature | A digital fingerprint proving the content owner's server produced this exact content. Like a notary stamp: tamper-proof. | GllQLb/V4Vd+Su... (base64) |
| X-FairFetch-License-ID | Your Usage Grant reference. Store this; it's your proof of legal access if questions arise later. Format: grant_id:signature_prefix. | 47db4290...:k2+wXE3x... |
| X-Content-Hash | A fingerprint of the content body itself, so you can verify nothing was altered in transit. | sha256:2c449548... |
| X-PAYMENT-RECEIPT | Proof that payment was settled. For x402: a transaction hash. For wallets: a ledger transaction ID (ff_...). | 0x6d8ce1bf... or ff_3a7c9e... |
| X-FairFetch-Payment-Method | How the agent paid: wallet (pre-funded account) or x402 (one-time payment). | wallet |
| X-FairFetch-Wallet-Balance | Remaining wallet balance after this charge (only present for wallet payments). | 99000 |
| X-FairFetch-Version | Protocol version, so clients know which Fairfetch spec they're talking to. | 0.2 |
Tip
For a plain-language explanation of all Fairfetch concepts, headers, and terminology, see the Concepts Guide.
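A client might use two of these headers as follows. Both helpers are hypothetical sketches: the first assumes X-Content-Hash is the lowercase hex SHA-256 of the exact response bytes (the README only shows the `sha256:` prefix), and the second assumes the grant ID itself contains no colon.

```python
import hashlib
import hmac


def content_hash_matches(body: bytes, header_value: str) -> bool:
    """Compare a body against an X-Content-Hash value like 'sha256:<hex>'."""
    algorithm, _, expected = header_value.partition(":")
    if algorithm != "sha256" or not expected:
        return False
    actual = hashlib.sha256(body).hexdigest()
    return hmac.compare_digest(actual, expected.lower())


def parse_license_id(header_value: str) -> tuple[str, str]:
    """Split 'grant_id:signature_prefix' into its two parts."""
    grant_id, sep, signature_prefix = header_value.partition(":")
    if not sep or not grant_id or not signature_prefix:
        raise ValueError("expected 'grant_id:signature_prefix'")
    return grant_id, signature_prefix


if __name__ == "__main__":
    body = b"# Example Domain"
    header = "sha256:" + hashlib.sha256(body).hexdigest()
    print(content_hash_matches(body, header))
    print(parse_license_id("47db4290:k2+w"))
```

If the server hashes a normalized form of the content rather than the raw bytes, the comparison would need the same normalization applied first.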
| Layer | Open Source (this repo) | Cloud (Commercial) |
|---|---|---|
| interfaces/ | Abstract standard | Same |
| core/ | HTML-to-MD, signing | Same |
| api/ · mcp_server/ | REST + MCP protocol | Same |
| payments/ | Mock facilitator | Real EIP-3009 settlement |
| compliance/ | Headers, lineage | Same |
| plugins/ | Placeholder stubs | Managed Clearinghouse |
The open-source repo is fully functional for local development and testing. The commercial cloud layer adds real on-chain payments, content-owner-verified key management, and persistent Usage Grant audit trails.
fairfetch/
├── docs/                     # Guides for site owners & AI agents
│   ├── CONCEPTS.md           # Plain-language concepts & headers
│   ├── PUBLISHER_GUIDE.md    # CDN deployment & onboarding
│   └── AI_AGENT_GUIDE.md     # MCP/REST integration for agents
├── interfaces/               # Open Standard (abstract bases)
│   ├── facilitator.py        # BaseFacilitator
│   ├── summarizer.py         # BaseSummarizer
│   └── license_provider.py   # BaseLicenseProvider + UsageGrant + UsageCategory
├── core/                     # Green AI layer
│   ├── converter.py          # HTML → Markdown (trafilatura)
│   ├── summarizer.py         # LiteLLM implementation
│   ├── knowledge_packet.py   # JSON-LD builder
│   ├── signatures.py         # Ed25519 signing
│   └── url_validation.py     # SSRF protection (block private/metadata URLs)
├── mcp_server/               # Direct Pipeline (MCP)
│   └── server.py             # FastMCP tools + resources
├── api/                      # Direct Pipeline (REST)
│   ├── main.py               # FastAPI app
│   ├── routes.py             # Endpoints + triple validation
│   ├── negotiation.py        # Content negotiation + bot steering
│   └── dependencies.py       # FairFetchConfig + DI
├── payments/                 # x402 micro-payments
│   ├── x402.py               # Middleware (wallet + x402)
│   ├── wallet_ledger.py      # In-memory wallet ledger (test_mode seeds)
│   ├── mock_facilitator.py   # Local test facilitator
│   └── mock_license_facilitator.py
├── compliance/               # EU AI Act 2026
│   ├── headers.py            # Standardized headers
│   ├── lineage.py            # Data lineage tracking
│   └── copyright.py          # Copyright opt-out log
├── plugins/                  # Cloud extension point
│   └── cloud_adapter.py      # Managed Clearinghouse stub
├── deploy/                   # Edge boilerplates
│   ├── cloudflare/           # Workers (TS)
│   ├── cloudfront/           # Lambda@Edge (Python)
│   ├── fastly/               # Compute@Edge (Rust)
│   └── akamai/               # EdgeWorkers (JS)
├── scripts/                  # Dev scripts
│   └── dev_server.py         # Local launcher (make dev)
├── tests/                    # 127 tests · 98% coverage
├── .github/workflows/        # CI pipeline
├── openapi.yaml              # REST API spec (set servers[].url to your API when self-hosting)
├── mcp.json                  # MCP Inspector config
├── pyproject.toml            # Package config
├── Makefile                  # Dev commands
└── LICENSE                   # Apache 2.0
Deploying your own API? If you use the Cloudflare Worker (deploy/cloudflare/wrangler.toml), set FAIRFETCH_API_ORIGIN to your API base URL. If you use the OpenAPI spec for clients or docs, set the production servers[].url in openapi.yaml to your deployed API.
| Variable | Default | Description |
|---|---|---|
| FAIRFETCH_TEST_MODE | true | Enable mock facilitator + grants; when false, CORS is restricted to your domain and no test wallets are pre-seeded |
| FAIRFETCH_PUBLISHER_WALLET | 0x000... | EVM wallet for payments |
| FAIRFETCH_PUBLISHER_DOMAIN | localhost | Your site domain (also used as CORS origin when test mode is off) |
| FAIRFETCH_CONTENT_PRICE | 1000 | Default base price in smallest USDC unit; used when no route rule matches |
| FAIRFETCH_PRICE_BY_ROUTE | (omit) | Optional JSON map of path prefix → price for variable pricing by route (e.g. {"": "1000", "/business": "2000", "/sports": "500"}). See Site owner guide. |
| FAIRFETCH_SIGNING_KEY | (generated) | Ed25519 private key (b64) |
| FAIRFETCH_LICENSE_TYPE | publisher-terms | Default license |
| FAIRFETCH_DEFAULT_USAGE_CATEGORY | summary | Default usage tier for pricing |
| FAIRFETCH_SEARCH_ENGINES_ALLOWED | (built-in list) | Comma-separated User-Agent substrings for search engines allowed free indexing (e.g. Googlebot, Bingbot, DuckDuckBot). Overrides default. |
| FAIRFETCH_SEARCH_ENGINES_BLOCKED | (empty) | Comma-separated User-Agent substrings never given free indexing (takes precedence over allowed). |
| FAIRFETCH_ENABLE_GRANTS | true | Issue Usage Grants |
| FAIRFETCH_PREFERRED_ACCESS | true | Inject bot-steering headers |
| LITELLM_MODEL | gpt-4o-mini | LLM for summarization |
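To illustrate how a FAIRFETCH_PRICE_BY_ROUTE map might be applied, here is a minimal sketch of a longest-prefix lookup that falls back to the default price. The function name and the segment-boundary matching rule are assumptions, not the project's actual implementation.

```python
import json
import posixpath
from urllib.parse import unquote, urlsplit


def price_for_url(content_url: str, routes_json: str, default_price: int = 1000) -> int:
    """Look up the base price for a URL by longest matching path prefix."""
    routes = json.loads(routes_json) if routes_json else {}
    # Decode percent-encoding and collapse '.' and '..' segments before matching.
    path = posixpath.normpath("/" + unquote(urlsplit(content_url).path).lstrip("/"))
    best_prefix, best_price = None, default_price
    for prefix, price in routes.items():
        if not str(price).isdigit():
            continue  # ignore non-numeric prices
        matches = prefix == "" or path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if matches and (best_prefix is None or len(prefix) > len(best_prefix)):
            best_prefix, best_price = prefix, int(price)
    return best_price


if __name__ == "__main__":
    routes = '{"": "1000", "/business": "2000", "/sports": "500"}'
    print(price_for_url("https://example.com/business/story", routes))  # 2000
```

Normalizing the path first means a request such as `/sports/%2e%2e/business/x` is priced as `/business/x` rather than as a cheaper route.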
- URL validation: The url parameter is validated before any outbound request. Private IPs (e.g. 127.0.0.1, 10.x, 192.168.x), cloud metadata endpoints (e.g. 169.254.169.254), and non-HTTP(S) schemes are rejected with 400 and error: "url_blocked". This prevents SSRF (server-side request forgery).
- Route-based pricing: The content URL path used for price lookup is normalized (percent-encoding decoded, . and .. segments collapsed) so clients cannot bypass route matching to get a different price. Only numeric prices are accepted; invalid route or default prices fall back safely.
- Test mode: With FAIRFETCH_TEST_MODE=false, CORS allows only https://{FAIRFETCH_PUBLISHER_DOMAIN} and the ledger does not pre-seed test wallets. Use test mode only for local development.
- Error responses: Upstream fetch and summarization errors return generic messages to clients; details are logged server-side only.
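The URL validation rule can be approximated with the standard library. This is a simplified sketch, not the contents of `core/url_validation.py`: the function name is hypothetical, and it only inspects IP literals and the name `localhost`, whereas a production check would also resolve hostnames and validate every resolved address.

```python
import ipaddress
from urllib.parse import urlsplit


def is_url_allowed(url: str) -> bool:
    """Reject non-HTTP(S) schemes and literal private/metadata addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A hostname, not an IP literal. This sketch does not resolve DNS.
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved)


if __name__ == "__main__":
    for candidate in ("https://example.com", "http://169.254.169.254/latest", "file:///etc/passwd"):
        print(candidate, is_url_allowed(candidate))
```

The link-local check is what catches the cloud metadata address 169.254.169.254 mentioned above.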
| Guide | What's Inside |
|---|---|
| Concepts (Plain Language) | What every header, value, and term means, without jargon |
| Site owner onboarding | CDN deployment for Cloudflare, CloudFront, Fastly, Akamai, Nginx |
| AI Agent Integration | MCP for Claude/Cursor, REST clients (Python & TS), Usage Grant verification |
| Development | Local dev setup, testing the three pillars, architecture decisions |
| Contributing | How to contribute, CLA, code standards |
Apache 2.0: use it freely, commercially or otherwise.
Website · Docs · Issues · Discussions
Built with conviction that AI and content creators can thrive together.