Consume content from Fairfetch-enabled sites instead of scraping — with clean Markdown, verified provenance, and a cryptographic Usage Grant for legal indemnity.
This guide is for developers building AI agents, RAG pipelines, or MCP clients that need to fetch web content in a compliant, paid, and legally verifiable way.
| Section | What you’ll get |
|---|---|
| System requirements & checks | Python/Node versions, network, and limitations |
| Why Fairfetch vs scraping | Quick comparison and benefits |
| Option A: MCP integration | Connect via MCP (Claude, Cursor, custom clients) |
| Option B: REST API | curl, Python, TypeScript examples and flows |
| Payment: x402 vs wallet | When to use which; example requests and responses |
| Failure points & mitigations | Status codes, errors, and what to do |
| Verifying Usage Grants | How to verify and store grants |
| Content negotiation & detection | Accept header and Fairfetch-enabled site detection |
| End-to-end example | RAG pipeline with multiple sources and grants |
Recommended environment
| Requirement | Minimum | How to check |
|---|---|---|
| Python | 3.11+ | python3 --version or python --version |
| Node.js (for MCP Inspector only) | 18+ | node --version |
| Network | Outbound HTTPS | Can reach content provider API and (if used) LiteLLM/OpenAI |
Run these before starting:
python3 --version # e.g. Python 3.11.6
node --version # optional; e.g. v20.x for MCP Inspector
curl -sI https://example.com | head -1 # e.g. HTTP/2 200

Infrastructure limitations (open-source Fairfetch)
| Limitation | Meaning |
|---|---|
| URL allowlist | Only public HTTP/HTTPS URLs. Private IPs, localhost, and cloud metadata URLs (e.g. 169.254.169.254) are rejected with 400 url_blocked. |
| No built-in rate limiting | The open-source server does not throttle by IP or key; deploy behind a reverse proxy or use Fairfetch Premium for production rate limits. |
| Test wallets | Pre-seeded test wallets (wallet_test_agent_alpha, wallet_test_agent_beta) exist only when the server runs with FAIRFETCH_TEST_MODE=true. In production, register via /wallet/register or the marketplace. |
| Summarization | Requires a configured LLM (e.g. OPENAI_API_KEY for LiteLLM). If missing, summary endpoints may return 503; content-only endpoints (e.g. Markdown) still work. |
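
To avoid a wasted round-trip on the URL allowlist, a client can pre-screen URLs before calling the API. The following is a minimal sketch using only the standard library. The function name `looks_publicly_fetchable` is hypothetical, and the check only approximates the documented rule; the server's real validation may differ (for example, it may also resolve DNS).

```python
import ipaddress
from urllib.parse import urlsplit


def looks_publicly_fetchable(url: str) -> bool:
    """Client-side guess at whether a URL would pass the allowlist.

    Assumption: this only approximates the documented rule; the server
    remains the authority and may reject URLs this function accepts.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: treat a plain hostname as public.
        return True
    # is_global is False for private, loopback, and link-local ranges,
    # which includes the metadata address 169.254.169.254.
    return ip.is_global
```

A URL that passes this check can still be rejected with 400 url_blocked, so keep handling that response.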
| Scraping | Fairfetch |
|---|---|
| Parse raw HTML, strip ads, handle JS | One call returns clean Markdown |
| No proof of legal access | Ed25519-signed Usage Grant |
| 402/403, CAPTCHAs, blocks | x402 or wallet = predictable access |
| Each agent re-processes the same page | Pre-processed at source (Green AI) |
| Legal risk | Legal indemnity (store grants) |
Best for: Claude Desktop, Cursor, or any MCP-compatible client. One config, then use tools from your AI assistant.
Input (commands):
git clone https://github.com/Fairfetch-co/fairfetch.git
cd fairfetch
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
Expected: No errors; packages install. Then start the MCP server (e.g. under the Inspector):
npx @modelcontextprotocol/inspector python -m mcp_server.server
Expected: The browser opens the MCP Inspector; the status shows “Connected” or a ready transport.
If it fails:
| Symptom | Cause | Mitigation |
|---|---|---|
| python3 not found | Python not installed or not on PATH | Install Python 3.11+; use python if that’s the command on your system. |
| npx not found | Node.js not installed | Install Node 18+ for MCP Inspector; or call the MCP server from your own client without the Inspector. |
| Inspector doesn’t connect | Wrong cwd or transport | Run from repo root; ensure python -m mcp_server.server starts without errors. |
In the Inspector: Tools → pick fetch_article_markdown → set url to https://example.com → Run.
Example input (JSON):
{ "url": "https://example.com" }
Example output (shortened):
# Example Domain
This domain is for use in documentation examples...
If you see Markdown content, the MCP server and content path work.
Claude Desktop (macOS): Edit ~/Library/Application Support/Claude/claude_desktop_config.json.
Cursor: Create or edit .cursor/mcp.json in your project root.
Example config (replace the path with your actual fairfetch repo path):
{
"mcpServers": {
"fairfetch": {
"command": "python",
"args": ["-m", "mcp_server.server"],
"cwd": "/absolute/path/to/fairfetch",
"env": { "FAIRFETCH_TEST_MODE": "true" }
}
}
}
Restart Claude or Cursor. You can then ask: “Summarize the article at https://example.com” or “Get verified facts from https://example.com/article”.
| Tool | Input | Returns |
|---|---|---|
| get_site_summary | url, optional usage | Title, author, summary, signature, usage_grant |
| fetch_article_markdown | url | Clean Markdown |
| get_verified_facts | url, optional usage | Full JSON-LD knowledge packet + lineage + grant |
| Resource URI | Returns |
|---|---|
| fairfetch://config | Server config (version, model, public key, formats) |
| fairfetch://public-key | Ed25519 public key for signature verification |
The optional usage parameter sets the usage category (search_engine_indexing, summary, rag, research, training, commercial) and thus pricing tier and compliance level on the grant.
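
As an illustration, a client can validate the usage category before calling a tool. This is a minimal sketch; `tool_arguments` is a hypothetical helper, and only the category names come from the list above.

```python
from typing import Dict, Optional

# Usage categories listed in this guide.
USAGE_CATEGORIES = (
    "search_engine_indexing",
    "summary",
    "rag",
    "research",
    "training",
    "commercial",
)


def tool_arguments(url: str, usage: Optional[str] = None) -> Dict[str, str]:
    """Build arguments for get_site_summary or get_verified_facts."""
    args = {"url": url}
    if usage is not None:
        if usage not in USAGE_CATEGORIES:
            raise ValueError(f"unknown usage category: {usage!r}")
        args["usage"] = usage
    return args
```

Rejecting an unknown category locally keeps a typo from silently selecting a different pricing tier.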
Best for: Custom agents, scripts, or non-MCP clients. All operations are HTTP.
Request:
curl -s "http://localhost:8402/content/fetch?url=https://example.com"
Example response (402):
{
"accepts": {
"price": "1000",
"asset": "USDC",
"network": "base",
"payTo": "0x...",
"usage_category": "summary",
"compliance_level": "standard"
},
"available_tiers": {
"search_engine_indexing": { "price": "0", "compliance_level": "standard" },
"summary": { "price": "1000", "compliance_level": "standard" },
"rag": { "price": "2000", "compliance_level": "standard" },
"research": { "price": "3000", "compliance_level": "elevated" },
"training": { "price": "5000", "compliance_level": "strict" },
"commercial": { "price": "10000", "compliance_level": "strict" }
},
"error": "Payment Required",
"message": "This content requires micro-payment..."
}
Only public HTTP/HTTPS URLs are allowed. Private IPs and cloud metadata URLs return 400 with {"error": "url_blocked", "detail": "The requested URL is not allowed."}.
Note: Site owners can set different prices for different URL paths (route-based pricing). The 402 price and available_tiers may therefore differ per content URL (e.g. .../business vs .../sports). Always use the quoted price from the 402 for the URL you are fetching.
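
Because the quote can vary per URL, it helps to read the price out of each 402 body rather than hard-coding it. The sketch below assumes the body has the shape of the example response above; `quoted_price` is a hypothetical helper, and the unit of the price is whatever the server quotes.

```python
def quoted_price(body: dict, usage: str) -> int:
    """Return the price quoted for `usage` in a parsed 402 response body."""
    tiers = body.get("available_tiers", {})
    if usage in tiers:
        return int(tiers[usage]["price"])
    # Fall back to the default quote if it matches the requested usage.
    accepts = body.get("accepts", {})
    if accepts.get("usage_category") == usage:
        return int(accepts["price"])
    raise KeyError(f"no price quoted for usage {usage!r}")


# Trimmed copy of the example 402 body shown above.
sample_402 = {
    "accepts": {"price": "1000", "usage_category": "summary"},
    "available_tiers": {
        "summary": {"price": "1000"},
        "rag": {"price": "2000"},
    },
}
print(quoted_price(sample_402, "rag"))  # prints 2000
```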
Request:
curl -s -H "X-PAYMENT: test_paid_fairfetch" \
  -H "Accept: application/ai-context+json" \
  "http://localhost:8402/content/fetch?url=https://example.com&usage=rag"
Example response (200): JSON-LD body with articleBody, fairfetch:usageGrant, etc., and headers such as X-FairFetch-License-ID, X-PAYMENT-RECEIPT, X-FairFetch-Payment-Method: x402.
Request:
curl -s -H "X-WALLET-TOKEN: wallet_test_agent_alpha" \
  -H "Accept: text/markdown" \
  "http://localhost:8402/content/fetch?url=https://example.com"
Example response (200): Markdown body; headers include X-FairFetch-Payment-Method: wallet, X-FairFetch-Wallet-Balance: 99000, X-PAYMENT-RECEIPT: ff_....
Test wallets are available only when the server runs with FAIRFETCH_TEST_MODE=true. In production, register a wallet first (see below).
| | x402 (one-time) | Wallet (pre-funded) |
|---|---|---|
| First request | 402 → pay → retry → content | Content immediately |
| Round-trips | 2 | 1 |
| Header | X-PAYMENT: <proof> | X-WALLET-TOKEN: <token> |
| Best for | Discovery, occasional use | High-volume production |
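
The two round-trips of the x402 path can be sketched as follows. To keep the example independent of any HTTP or payment library, `send` and `pay` are hypothetical callables that you supply: `send(headers)` performs the request and returns a `(status, body)` pair, and `pay(quote)` settles the quoted terms and returns a proof string.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, dict]


def fetch_with_x402(
    send: Callable[[Dict[str, str]], Response],
    pay: Callable[[dict], str],
) -> Response:
    """Run the 402 -> pay -> retry flow and return the final response."""
    # Round-trip 1: no payment header, so paid content answers 402.
    status, body = send({})
    if status != 402:
        return status, body
    # Settle the quoted terms, then retry with the resulting proof.
    proof = pay(body["accepts"])
    # Round-trip 2: same request, now carrying X-PAYMENT.
    return send({"X-PAYMENT": proof})
```

With a wallet, the first call would carry X-WALLET-TOKEN instead, and the retry step disappears.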
Register (example):
curl -X POST "http://localhost:8402/wallet/register?owner=MyAgent&initial_balance=100000"
Example response:
{
"wallet_token": "wallet_abc123...",
"owner": "MyAgent",
"balance": 100000,
"created_at": "...",
"usage": "Include this token in the X-WALLET-TOKEN header..."
}
Check balance:
curl "http://localhost:8402/wallet/balance?token=wallet_abc123..."
Top up:
curl -X POST "http://localhost:8402/wallet/topup?token=wallet_abc123...&amount=50000"

| Status / symptom | Likely cause | What to do |
|---|---|---|
| 400 url_blocked | URL is a private IP, a metadata address, or non-HTTP | Use only public http:// or https:// URLs. |
| 402 (no payment) | No X-PAYMENT or X-WALLET-TOKEN | Add a payment header; in test mode use X-PAYMENT: test_paid_fairfetch or a valid wallet token. |
| 402 wallet_error: insufficient_balance | Wallet balance < request price | Top up the wallet or use a different wallet/token. |
| 402 verification_error | Invalid or expired payment proof | Retry with a fresh payment proof or wallet token. |
| 502 upstream_fetch_failed | Content provider’s server or target URL unreachable | Retry later; check that the target URL is public and reachable. |
| 503 summarization_unavailable | LLM not configured (e.g. no API key) | Use Accept: text/markdown for content without a summary, or configure LiteLLM. |
| Connection refused | Fairfetch server not running or wrong port | Start the API (e.g. make dev) and use the correct base URL and port. |
| CORS errors (browser) | Origin not allowed (production) | The server allows only the site's domain when not in test mode; call from that origin or from a non-browser client. |
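
The table above can be folded into a small dispatcher that tells an agent what to do next. This is a sketch under assumptions: the action names are hypothetical labels, and the body fields (`error`, `wallet_error`) are taken from the example responses in this guide rather than from a full API schema.

```python
def next_action(status: int, body: dict) -> str:
    """Map a Fairfetch response to a coarse follow-up action."""
    if status < 400:
        return "ok"
    if status == 400 and body.get("error") == "url_blocked":
        return "fix_url"  # only public http(s) URLs are accepted
    if status == 402:
        if body.get("wallet_error") == "insufficient_balance":
            return "top_up_wallet"
        # Missing, invalid, or expired payment: send a fresh proof or token.
        return "attach_payment"
    if status == 502:
        return "retry_later"  # upstream fetch failed
    if status == 503:
        return "request_markdown"  # summarization unavailable
    return "give_up"
```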
Example: handling 402 in code
if resp.status_code == 402:
    data = resp.json()
    if data.get("wallet_error") == "insufficient_balance":
        raise Exception(f"Wallet balance too low: {data['wallet_balance']}, need {data['amount_required']}")
    raise Exception(f"Payment required: {data['accepts']}")
Every successful response includes a Usage Grant (in the body as fairfetch:usageGrant and/or in the header X-FairFetch-License-ID). Store it as proof of legal access.
from interfaces.license_provider import UsageGrant
grant = UsageGrant.model_validate(grant_data) # from JSON body or reconstruct from header + hash
print(grant.verify()) # True if signature valid
The signed payload is:
{grant_id}|{content_url}|{content_hash}|{license_type}|{usage_category}|{granted_to}|{granted_at}
Use the content owner’s public key (e.g. from fairfetch://public-key or response headers) and verify the signature with an Ed25519 library.
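
For clients that cannot import the repository's `UsageGrant` class, the check can be sketched with the `cryptography` package. Assumptions to note: the grant is a dict keyed by the field names in the payload format above, the payload is encoded as UTF-8, and the public key and signature are already decoded to raw bytes (this guide does not say how the server encodes them).

```python
# Field order taken from the signed payload format above.
GRANT_FIELDS = (
    "grant_id",
    "content_url",
    "content_hash",
    "license_type",
    "usage_category",
    "granted_to",
    "granted_at",
)


def signed_payload(grant: dict) -> bytes:
    """Join the grant fields with '|' in the documented order."""
    # Assumption: the server signs the UTF-8 encoding of this string.
    return "|".join(str(grant[name]) for name in GRANT_FIELDS).encode("utf-8")


def verify_grant(grant: dict, public_key: bytes, signature: bytes) -> bool:
    """Check an Ed25519 signature over the payload.

    Requires the third-party `cryptography` package; `public_key` is the
    raw 32-byte key and `signature` the raw 64-byte signature.
    """
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

    try:
        Ed25519PublicKey.from_public_bytes(public_key).verify(
            signature, signed_payload(grant)
        )
    except InvalidSignature:
        return False
    return True
```

If `verify_grant` returns False for a grant that `UsageGrant.verify()` accepts, the encoding assumptions above are the first thing to check.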
Accept header → response format:
| Accept | Response |
|---|---|
| application/ai-context+json | Full JSON-LD knowledge packet + lineage + grant |
| application/ld+json | JSON-LD article with signature |
| text/markdown | Markdown only (fastest) |
| application/json | Default JSON |
Detecting Fairfetch-enabled sites: Prefer the official API over scraping when you see:
- /.well-known/llms.txt: AI policy file
- Link header with rel="ai-content-api": API endpoint
- X-FairFetch-Preferred-Access: mcp+json-ld: preference for API use
Example (Python): HEAD the URL and check for X-FairFetch-Preferred-Access or Link containing ai-content-api; if present, use the Fairfetch API URL from Link or llms.txt instead of scraping.
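
A minimal sketch of that header check follows. It works on any mapping of response headers, so it does not depend on a particular HTTP client. The function names are hypothetical, and the Link header is assumed to follow the usual `<url>; rel="..."` syntax.

```python
import re
from typing import Mapping, Optional

# Matches one Link entry whose rel is ai-content-api and captures its URL.
_AI_API_LINK = re.compile(
    r'<([^>]+)>\s*;[^,]*\brel="?ai-content-api"?', re.IGNORECASE
)


def fairfetch_api_url(headers: Mapping[str, str]) -> Optional[str]:
    """Return the API URL advertised in the Link header, if any."""
    lowered = {name.lower(): value for name, value in headers.items()}
    match = _AI_API_LINK.search(lowered.get("link", ""))
    return match.group(1) if match else None


def is_fairfetch_enabled(headers: Mapping[str, str]) -> bool:
    """True if either detection header from the list above is present."""
    names = {name.lower() for name in headers}
    return (
        "x-fairfetch-preferred-access" in names
        or fairfetch_api_url(headers) is not None
    )
```

With httpx, for example, you could pass `httpx.head(url).headers` to both functions and fall back to /.well-known/llms.txt when neither header is present.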
Scenario: Answer a question using multiple Fairfetch sources and keep grants for compliance.
Example flow:
- Request content from each source URL with X-PAYMENT or X-WALLET-TOKEN and Accept: application/ai-context+json.
- On 200, collect articleBody (or equivalent) and append it to the context; collect X-FairFetch-License-ID (or the full grant) per URL.
- Build a prompt: “Based on the following verified sources: … Answer: {query}”.
- Send the prompt to your LLM.
- Store the list of grants for the sources used.
Minimal Python sketch:
import httpx

FAIRFETCH_URL = "http://localhost:8402"  # local dev server, as in the requests above


async def rag_with_fairfetch(query: str, source_urls: list[str]) -> tuple[str, list[str | None]]:
    context_chunks = []
    grants = []
    async with httpx.AsyncClient() as client:
        for url in source_urls:
            resp = await client.get(
                f"{FAIRFETCH_URL}/content/fetch",
                params={"url": url},
                headers={"X-PAYMENT": "test_paid_fairfetch", "Accept": "application/ai-context+json"},
            )
            # Skip sources that did not return content (e.g. 402 or 502).
            if resp.status_code == 200:
                data = resp.json()
                context_chunks.append(data.get("articleBody", ""))
                # Keep the license ID as the per-source proof of access (None if the header is absent).
                grants.append(resp.headers.get("X-FairFetch-License-ID"))
    context = "\n\n---\n\n".join(context_chunks)
    prompt = f"Based on the following verified sources:\n\n{context}\n\nAnswer: {query}"
    return prompt, grants
Use the returned grants for audit/compliance.
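
One way to keep grants for audit is an append-only JSON Lines file. This is a sketch, not part of Fairfetch: the record layout, the function names, and the file path are all hypothetical, and a production system would more likely use a database.

```python
import json
from pathlib import Path
from typing import Iterable, List, Optional


def grant_log_lines(
    query: str, grants: Iterable[Optional[str]], stored_at: str
) -> List[str]:
    """Serialize one JSON record per grant, skipping missing license IDs."""
    lines = []
    for grant in grants:
        if grant is None:  # response carried no X-FairFetch-License-ID
            continue
        record = {"stored_at": stored_at, "query": query, "grant": grant}
        lines.append(json.dumps(record))
    return lines


def append_grants(log_path: Path, lines: Iterable[str]) -> None:
    """Append records to the audit log, one JSON object per line."""
    with log_path.open("a", encoding="utf-8") as log:
        for line in lines:
            log.write(line + "\n")
```

For example, `append_grants(Path("grants.jsonl"), grant_log_lines(query, grants, timestamp))` records which sources backed a given answer.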
"""Minimal Fairfetch client for AI agents."""
import httpx

FAIRFETCH_URL = "http://localhost:8402"
WALLET_TOKEN = "wallet_test_agent_alpha"  # or None to use X-PAYMENT
PAYMENT_TOKEN = "test_paid_fairfetch"


async def fetch_article(url: str, usage: str = "summary", wallet_token: str | None = None) -> dict:
    headers = {"Accept": "application/ai-context+json"}
    # Prefer the wallet when a token is given; otherwise send an x402 proof.
    if wallet_token:
        headers["X-WALLET-TOKEN"] = wallet_token
    else:
        headers["X-PAYMENT"] = PAYMENT_TOKEN
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            f"{FAIRFETCH_URL}/content/fetch",
            params={"url": url, "usage": usage},
            headers=headers,
        )
    if resp.status_code == 402:
        data = resp.json()
        if data.get("wallet_error") == "insufficient_balance":
            raise Exception(f"Wallet low: have {data['wallet_balance']}, need {data['amount_required']}")
        raise Exception(f"Payment required: {data['accepts']}")
    # Surface any other HTTP error (400, 502, 503, ...).
    resp.raise_for_status()
    return {
        "content": resp.json(),
        "payment_method": resp.headers.get("X-FairFetch-Payment-Method"),
        "wallet_balance": resp.headers.get("X-FairFetch-Wallet-Balance"),
        "license_id": resp.headers.get("X-FairFetch-License-ID"),
        "usage_category": resp.headers.get("X-FairFetch-Usage-Category"),
        "payment_receipt": resp.headers.get("X-PAYMENT-RECEIPT"),
    }

const FAIRFETCH_URL = "http://localhost:8402";
const PAYMENT_TOKEN = "test_paid_fairfetch";

async function fetchArticle(url: string) {
  const resp = await fetch(
    `${FAIRFETCH_URL}/content/fetch?url=${encodeURIComponent(url)}`,
    {
      headers: {
        "X-PAYMENT": PAYMENT_TOKEN,
        "Accept": "application/ai-context+json",
      },
    }
  );
  if (resp.status === 402) {
    const pricing = await resp.json();
    throw new Error(`Payment required: ${JSON.stringify(pricing.accepts)}`);
  }
  const content = await resp.json();
  return {
    content,
    licenseId: resp.headers.get("X-FairFetch-License-ID"),
    paymentReceipt: resp.headers.get("X-PAYMENT-RECEIPT"),
  };
}

For plain-language concepts and header definitions, see the Concepts Guide. For repository and API details, see the Fairfetch-co/fairfetch repository on GitHub.