21 commits
9473d09, c608efc, 046ac3b, 7b4cbbe, a0551ff, a36312e, d70e2c4, 489a28a, 53a7035, 3b078d0, 4b4d249, 70ed186, e03dfcc, b538176, f021fd6, 3b54d3f, e1dd56d, 800c4f9, f13e641, fbcf12d
feat(quickstart): add runnable quickstart, recipes, and fixtures (iss…
levleontiev Mar 17, 2026
a93b377
docs(readme): add quickstart pointer, update project layout, fix benc…
levleontiev Mar 17, 2026
27 changes: 18 additions & 9 deletions README.md
@@ -92,6 +92,12 @@ Fairvisor integrates *alongside* Kong, nginx, and Envoy — it is not a replacem

## Quick start

> **Runnable quickstart:** `examples/quickstart/` — `docker compose up -d` and run your first enforce/reject test in under a minute. See [`examples/quickstart/README.md`](examples/quickstart/README.md).
>
> **Recipes:** `examples/recipes/` — deployable team budgets, runaway agent guard, and provider failover examples.
>
> **Sample artifacts:** `fixtures/` — canonical request/response fixtures for enforce, reject (TPM, TPD, prompt-too-large), and provider-native error bodies (OpenAI, Anthropic, Gemini).

### 1. Create a policy

```bash
@@ -304,7 +310,7 @@ Policies are versioned JSON — commit them to Git, review changes in PRs, roll

**No external datastore.** All enforcement state lives in in-process shared memory (`ngx.shared.dict`). No Redis, no Postgres, no network round-trips in the decision path.

> Reproduce: `git clone https://github.com/fairvisor/benchmark && cd benchmark && ./run-all.sh`
> Reproduce: see [fairvisor/benchmark](https://github.com/fairvisor/benchmark), the canonical source for Fairvisor Edge performance numbers.

## Deployment

@@ -348,14 +354,16 @@ If the SaaS is unreachable, the edge keeps enforcing with the last-known policy
## Project layout

```
src/fairvisor/ runtime modules (OpenResty/LuaJIT)
cli/ command-line tooling
spec/ unit and integration tests (busted)
tests/e2e/ Docker-based E2E tests (pytest)
examples/ sample policy bundles
helm/ Helm chart
docker/ Docker artifacts
docs/ reference documentation
src/fairvisor/ runtime modules (OpenResty/LuaJIT)
cli/ command-line tooling
spec/ unit and integration tests (busted)
tests/e2e/ Docker-based E2E tests (pytest)
examples/quickstart/ runnable quickstart (docker compose up -d)
examples/recipes/ deployable policy recipes (team budgets, agent guard, failover)
fixtures/ canonical request/response sample artifacts
helm/ Helm chart
docker/ Docker artifacts
docs/ reference documentation
```

## Contributing
@@ -376,3 +384,4 @@ pytest tests/e2e -v # E2E (requires Docker)
---

**Docs:** [docs.fairvisor.com](https://docs.fairvisor.com/docs/) · **Website:** [fairvisor.com](https://fairvisor.com) · **Quickstart:** [5 minutes to enforcement](https://docs.fairvisor.com/docs/quickstart/)

97 changes: 97 additions & 0 deletions examples/quickstart/README.md
@@ -0,0 +1,97 @@
# Fairvisor Edge — Quickstart

Go from `git clone` to working policy enforcement in one step.

## Prerequisites

- Docker with Compose V2 (`docker compose version`)
- Port 8080 free on localhost

## Start

```bash
docker compose up -d
```

Wait for the edge service to report healthy:

```bash
docker compose ps
# edge should show "healthy"
```

## Verify enforcement

**Allowed request** — should return `200`:

```bash
curl -s -X POST http://localhost:8080/openai/v1/chat/completions \
-H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-fake-key" \
-H "Content-Type: application/json" \
-d @../../fixtures/normal_request.json
```

Expected response matches `../../fixtures/allow_response.json`.

**Over-limit request** — should return `429`:

```bash
curl -s -X POST http://localhost:8080/openai/v1/chat/completions \
-H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-fake-key" \
-H "Content-Type: application/json" \
-d @../../fixtures/over_limit_request.json
```

Expected response body matches `../../fixtures/reject_tpm_exceeded.json`.
The response will also include:
- `X-Fairvisor-Reason: tpm_exceeded`
- `Retry-After: 60`
- `RateLimit-Limit: 100`
- `RateLimit-Remaining: 0`

## Wrapper mode and auth

This quickstart runs in `FAIRVISOR_MODE=wrapper`. The composite Bearer token format is:

```
Authorization: Bearer CLIENT_JWT:UPSTREAM_KEY
```

- `CLIENT_JWT` — a signed JWT identifying the calling client/tenant (used for policy enforcement)
- `UPSTREAM_KEY` — the real upstream API key forwarded to the provider (e.g. `sk-...` for OpenAI)

Fairvisor strips the composite header and injects the correct provider auth before forwarding. The upstream key is **never returned to the caller** — see `../../fixtures/allow_response.json` for proof (no `Authorization`, `x-api-key`, or `x-goog-api-key` headers in the response).
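
For illustration only (the real parsing happens inside Fairvisor Edge), the composite token splits unambiguously because a JWT contains dots but never a colon:

```python
def split_composite(auth_header: str) -> tuple[str, str]:
    """Split 'Bearer CLIENT_JWT:UPSTREAM_KEY' into (client_jwt, upstream_key).

    Hypothetical helper mirroring the format above; splitting on the
    first ':' works because a JWT never contains a colon.
    """
    scheme, _, token = auth_header.partition(" ")
    if scheme != "Bearer" or ":" not in token:
        raise ValueError("expected 'Bearer CLIENT_JWT:UPSTREAM_KEY'")
    client_jwt, _, upstream_key = token.partition(":")
    return client_jwt, upstream_key

jwt, key = split_composite(
    "Bearer demo-client-jwt.demo-payload.demo-sig:sk-fake-key"
)  # key == "sk-fake-key"
```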

## Provider-prefixed paths

Wrapper mode routes by path prefix:

| Path prefix | Upstream | Auth header |
|---|---|---|
| `/openai/v1/...` | `https://api.openai.com/v1/...` | `Authorization: Bearer UPSTREAM_KEY` |
| `/anthropic/v1/...` | `https://api.anthropic.com/v1/...` | `x-api-key: UPSTREAM_KEY` |
| `/gemini/v1beta/...` | `https://generativelanguage.googleapis.com/v1beta/...` | `x-goog-api-key: UPSTREAM_KEY` |
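
The table reads as a small routing function. This sketch mirrors the mapping above; the actual routing lives inside the edge, and the names here are illustrative:

```python
# Prefix -> (upstream base URL, provider auth header). Mirrors the table above.
ROUTES = {
    "/openai/": ("https://api.openai.com", "Authorization"),
    "/anthropic/": ("https://api.anthropic.com", "x-api-key"),
    "/gemini/": ("https://generativelanguage.googleapis.com", "x-goog-api-key"),
}

def route(path: str, upstream_key: str) -> tuple[str, dict]:
    """Map an edge path to (upstream URL, provider auth headers)."""
    for prefix, (base, header) in ROUTES.items():
        if path.startswith(prefix):
            url = base + path[len(prefix) - 1:]  # keep the leading '/'
            value = f"Bearer {upstream_key}" if header == "Authorization" else upstream_key
            return url, {header: value}
    raise ValueError(f"no provider prefix matches {path}")
```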

## Anthropic example

```bash
curl -s -X POST http://localhost:8080/anthropic/v1/messages \
-H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-ant-fake-key" \
-H "Content-Type: application/json" \
-d @../../fixtures/anthropic_normal_request.json
```

A rejected Anthropic request returns an Anthropic-native error body — see `../../fixtures/reject_anthropic.json`.

## Teardown

```bash
docker compose down
```

## Next steps

- See `../recipes/` for team budgets, runaway agent guard, and provider failover scenarios
- See `../../fixtures/` for all sample request/response artifacts
- See [fairvisor/benchmark](https://github.com/fairvisor/benchmark) for performance benchmarks
- See [docs/install/](../../docs/install/) for Kubernetes, VM, and SaaS deployment options
48 changes: 48 additions & 0 deletions examples/quickstart/docker-compose.yml
@@ -0,0 +1,48 @@
# Fairvisor Edge — Quickstart stack (wrapper mode)
#
# Usage:
# docker compose up -d
# curl -s http://localhost:8080/readyz # health check
# curl -s -X POST http://localhost:8080/openai/v1/chat/completions \
# -H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-fake-upstream-key" \
# -H "Content-Type: application/json" \
#   -d @../../fixtures/normal_request.json       # expect 200
# curl -s -X POST http://localhost:8080/openai/v1/chat/completions \
# -H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-fake-upstream-key" \
# -H "Content-Type: application/json" \
#   -d @../../fixtures/over_limit_request.json   # expect 429
#
# This file is also the base for the e2e-smoke CI check.
# CI extends it via tests/e2e/docker-compose.test.yml; do not diverge the
# service name, port, or volume contract without updating CI as well.

services:
edge:
image: ghcr.io/fairvisor/fairvisor-edge:latest
ports:
- "8080:8080"
environment:
FAIRVISOR_CONFIG_FILE: /etc/fairvisor/policy.json
FAIRVISOR_MODE: wrapper
FAIRVISOR_SHARED_DICT_SIZE: 32m
FAIRVISOR_LOG_LEVEL: info
FAIRVISOR_WORKER_PROCESSES: "1"
volumes:
- ./policy.json:/etc/fairvisor/policy.json:ro
healthcheck:
test: ["CMD", "curl", "-sf", "http://127.0.0.1:8080/readyz"]
interval: 2s
timeout: 2s
retries: 15
start_period: 5s

mock_llm:
image: nginx:1.27-alpine
volumes:
- ./mock-llm.conf:/etc/nginx/nginx.conf:ro
healthcheck:
test: ["CMD", "wget", "-q", "-O", "-", "http://127.0.0.1:80/"]
interval: 2s
timeout: 2s
retries: 10
start_period: 5s
10 changes: 10 additions & 0 deletions examples/quickstart/mock-llm.conf
@@ -0,0 +1,10 @@
events {}
http {
server {
listen 80;
location / {
default_type application/json;
return 200 '{"id":"chatcmpl-qs","object":"chat.completion","choices":[{"index":0,"message":{"role":"assistant","content":"Hello from the mock backend!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":8,"total_tokens":18}}';
}
}
}
31 changes: 31 additions & 0 deletions examples/quickstart/policy.json
@@ -0,0 +1,31 @@
{
"bundle_version": 1,
"issued_at": "2026-01-01T00:00:00Z",
"expires_at": "2030-01-01T00:00:00Z",
"policies": [
{
"id": "quickstart-tpm-policy",
"spec": {
"selector": {
"pathPrefix": "/openai/",
"methods": ["POST"]
},
"mode": "enforce",
"rules": [
{
"name": "tpm-limit",
"limit_keys": ["jwt:sub"],
"algorithm": "token_bucket_llm",
"algorithm_config": {
"tokens_per_minute": 100,
"tokens_per_day": 1000,
"burst_tokens": 100,
"default_max_completion_tokens": 50
}
}
]
}
}
],
"kill_switches": []
}
52 changes: 52 additions & 0 deletions examples/recipes/provider-failover/README.md
@@ -0,0 +1,52 @@
# Recipe: Provider Failover / Edge Control

Run two provider paths under independent policy budgets. When the primary
provider (OpenAI) trips a circuit breaker, your client-side router can
switch to the fallback (Anthropic) — both paths enforced by the same edge.

## How it works

- `/openai/v1/...` — enforced by an OpenAI TPM limit + a spend-based circuit breaker
- `/anthropic/v1/...` — enforced by an Anthropic TPM limit

The circuit breaker on the OpenAI path auto-trips when cumulative spend
exceeds the threshold in a 5-minute window, then auto-resets after 10 minutes.
Your application can detect the 429 with `X-Fairvisor-Reason: circuit_breaker_open`
and switch to the Anthropic path without any Fairvisor configuration change.
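
The breaker's behavior can be modeled in a few lines. This is a hypothetical sketch of the semantics described above (sliding spend window, trip above threshold, auto-reset after cooldown), not the edge's actual implementation:

```python
import time

class SpendCircuitBreaker:
    """Sketch of the spend-window breaker described above. Defaults mirror
    this recipe's policy.json; the real algorithm runs inside the edge."""

    def __init__(self, window=300, threshold=100_000, cooldown=600,
                 clock=time.monotonic):
        self.window, self.threshold, self.cooldown = window, threshold, cooldown
        self.clock = clock
        self.events = []        # (timestamp, spend) pairs inside the window
        self.tripped_at = None

    def record(self, spend):
        """Record spend for a completed request; trip when the window
        total exceeds the threshold."""
        now = self.clock()
        self.events = [(t, s) for t, s in self.events if now - t < self.window]
        self.events.append((now, spend))
        if sum(s for _, s in self.events) > self.threshold:
            self.tripped_at = now

    def is_open(self):
        """True while requests should be rejected with circuit_breaker_open."""
        if self.tripped_at is None:
            return False
        if self.clock() - self.tripped_at >= self.cooldown:
            self.tripped_at = None   # auto-reset after the cooldown
            return False
        return True
```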

## Deploy

```bash
cp policy.json /etc/fairvisor/policy.json
```

## Client-side failover pattern

```python
import httpx

EDGE = "http://localhost:8080"
AUTH = "Bearer my-client-jwt.payload.sig:sk-my-upstream-key"

def chat(messages, provider="openai"):
    # NOTE: this sketch shows only the failover trigger; a real router must
    # also translate the endpoint and body per provider (Anthropic expects
    # /v1/messages and an Anthropic model name, not gpt-4o).
    resp = httpx.post(
        f"{EDGE}/{provider}/v1/chat/completions",
        headers={"Authorization": AUTH, "Content-Type": "application/json"},
        json={"model": "gpt-4o", "messages": messages},
    )
    if resp.status_code == 429:
        reason = resp.headers.get("X-Fairvisor-Reason", "")
        if reason == "circuit_breaker_open" and provider == "openai":
            return chat(messages, provider="anthropic")
    resp.raise_for_status()
    return resp.json()
```

## Auth note

The composite `CLIENT_JWT:UPSTREAM_KEY` format is the same for all providers.
Fairvisor injects the correct provider-native auth header:
- OpenAI: `Authorization: Bearer UPSTREAM_KEY`
- Anthropic: `x-api-key: UPSTREAM_KEY`

The upstream key is stripped from responses — it never reaches your client.
62 changes: 62 additions & 0 deletions examples/recipes/provider-failover/policy.json
@@ -0,0 +1,62 @@
{
"bundle_version": 1,
"issued_at": "2026-01-01T00:00:00Z",
"expires_at": "2030-01-01T00:00:00Z",
"policies": [
{
"id": "provider-failover-primary",
"spec": {
"selector": {
"pathPrefix": "/openai/",
"methods": ["POST"]
},
"mode": "enforce",
"rules": [
{
"name": "openai-tpm",
"limit_keys": ["jwt:org_id"],
"algorithm": "token_bucket_llm",
"algorithm_config": {
"tokens_per_minute": 200000,
"burst_tokens": 200000,
"default_max_completion_tokens": 2048
}
},
{
"name": "openai-circuit-breaker",
"limit_keys": ["jwt:org_id"],
"algorithm": "circuit_breaker",
"algorithm_config": {
"spend_window_seconds": 300,
"spend_threshold": 100000,
"cooldown_seconds": 600
}
}
]
}
},
{
"id": "provider-failover-fallback",
"spec": {
"selector": {
"pathPrefix": "/anthropic/",
"methods": ["POST"]
},
"mode": "enforce",
"rules": [
{
"name": "anthropic-tpm",
"limit_keys": ["jwt:org_id"],
"algorithm": "token_bucket_llm",
"algorithm_config": {
"tokens_per_minute": 100000,
"burst_tokens": 100000,
"default_max_completion_tokens": 2048
}
}
]
}
}
],
"kill_switches": []
}