diff --git a/README.md b/README.md
index 07bc768..c267fa7 100644
--- a/README.md
+++ b/README.md
@@ -92,6 +92,12 @@ Fairvisor integrates *alongside* Kong, nginx, and Envoy — it is not a replacem
 
 ## Quick start
 
+> **Runnable quickstart:** `examples/quickstart/` — `docker compose up -d` and run your first enforce/reject test in under a minute. See [`examples/quickstart/README.md`](examples/quickstart/README.md).
+>
+> **Recipes:** `examples/recipes/` — deployable team budgets, runaway agent guard, and provider failover examples.
+>
+> **Sample artifacts:** `fixtures/` — canonical request/response fixtures for enforce, reject (TPM, TPD, prompt-too-large), and provider-native error bodies (OpenAI, Anthropic, Gemini).
+
 ### 1. Create a policy
 
 ```bash
@@ -304,7 +310,7 @@ Policies are versioned JSON — commit them to Git, review changes in PRs, roll
 
 **No external datastore.** All enforcement state lives in in-process shared memory (`ngx.shared.dict`).
 No Redis, no Postgres, no network round-trips in the decision path.
 
-> Reproduce: `git clone https://github.com/fairvisor/benchmark && cd benchmark && ./run-all.sh`
+> Reproduce: see [fairvisor/benchmark](https://github.com/fairvisor/benchmark) — the canonical source of truth for Fairvisor Edge performance numbers.
 
 ## Deployment
@@ -348,14 +354,16 @@ If the SaaS is unreachable, the edge keeps enforcing with the last-known policy
 ## Project layout
 
 ```
-src/fairvisor/           runtime modules (OpenResty/LuaJIT)
-cli/                     command-line tooling
-spec/                    unit and integration tests (busted)
-tests/e2e/               Docker-based E2E tests (pytest)
-examples/                sample policy bundles
-helm/                    Helm chart
-docker/                  Docker artifacts
-docs/                    reference documentation
+src/fairvisor/           runtime modules (OpenResty/LuaJIT)
+cli/                     command-line tooling
+spec/                    unit and integration tests (busted)
+tests/e2e/               Docker-based E2E tests (pytest)
+examples/quickstart/     runnable quickstart (docker compose up -d)
+examples/recipes/        deployable policy recipes (team budgets, agent guard, failover)
+fixtures/                canonical request/response sample artifacts
+helm/                    Helm chart
+docker/                  Docker artifacts
+docs/                    reference documentation
 ```
 
 ## Contributing
@@ -376,3 +384,4 @@ pytest tests/e2e -v # E2E (requires Docker)
 ---
 
 **Docs:** [docs.fairvisor.com](https://docs.fairvisor.com/docs/) · **Website:** [fairvisor.com](https://fairvisor.com) · **Quickstart:** [5 minutes to enforcement](https://docs.fairvisor.com/docs/quickstart/)
+
diff --git a/examples/quickstart/README.md b/examples/quickstart/README.md
new file mode 100644
index 0000000..a23fc78
--- /dev/null
+++ b/examples/quickstart/README.md
@@ -0,0 +1,97 @@
+# Fairvisor Edge — Quickstart
+
+Go from `git clone` to working policy enforcement in one step.
+
+## Prerequisites
+
+- Docker with Compose V2 (`docker compose version`)
+- Port 8080 free on localhost
+
+## Start
+
+```bash
+docker compose up -d
+```
+
+Wait for the edge service to report healthy:
+
+```bash
+docker compose ps
+# edge should show "healthy"
+```
+
+## Verify enforcement
+
+**Allowed request** — should return `200`:
+
+```bash
+curl -s -X POST http://localhost:8080/openai/v1/chat/completions \
+  -H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-fake-key" \
+  -H "Content-Type: application/json" \
+  -d @../../fixtures/normal_request.json
+```
+
+Expected response matches `../../fixtures/allow_response.json`.
+
+**Over-limit request** — should return `429`:
+
+```bash
+curl -s -X POST http://localhost:8080/openai/v1/chat/completions \
+  -H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-fake-key" \
+  -H "Content-Type: application/json" \
+  -d @../../fixtures/over_limit_request.json
+```
+
+Expected response body matches `../../fixtures/reject_tpm_exceeded.json`.
+The response will also include:
+
+- `X-Fairvisor-Reason: tpm_exceeded`
+- `Retry-After: 60`
+- `RateLimit-Limit: 100`
+- `RateLimit-Remaining: 0`
+
+## Wrapper mode and auth
+
+This quickstart runs in `FAIRVISOR_MODE=wrapper`. The composite Bearer token format is:
+
+```
+Authorization: Bearer CLIENT_JWT:UPSTREAM_KEY
+```
+
+- `CLIENT_JWT` — a signed JWT identifying the calling client/tenant (used for policy enforcement)
+- `UPSTREAM_KEY` — the real upstream API key forwarded to the provider (e.g. `sk-...` for OpenAI)
+
+Fairvisor strips the composite header and injects the correct provider auth before forwarding. The upstream key is **never returned to the caller** — see `../../fixtures/allow_response.json` for proof (no `Authorization`, `x-api-key`, or `x-goog-api-key` headers in the response).
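As an aside, the composite header is easy to assemble in code. Here is a minimal Python sketch of building the wrapper-mode credential described above (the `composite_auth` helper and the demo values are illustrative, not a Fairvisor API):

```python
import json

def composite_auth(client_jwt: str, upstream_key: str) -> str:
    # Wrapper mode expects: Authorization: Bearer CLIENT_JWT:UPSTREAM_KEY
    return f"Bearer {client_jwt}:{upstream_key}"

# Demo placeholders from this quickstart; use real credentials in production.
headers = {
    "Authorization": composite_auth("demo-client-jwt.demo-payload.demo-sig", "sk-fake-key"),
    "Content-Type": "application/json",
}
body = json.dumps({
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 20,
})
# POST headers + body to http://localhost:8080/openai/v1/chat/completions
```

The resulting `headers` and `body` drop into any HTTP client, exactly mirroring the curl examples above.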
+
+## Provider-prefixed paths
+
+Wrapper mode routes by path prefix:
+
+| Path prefix | Upstream | Auth header |
+|---|---|---|
+| `/openai/v1/...` | `https://api.openai.com/v1/...` | `Authorization: Bearer UPSTREAM_KEY` |
+| `/anthropic/v1/...` | `https://api.anthropic.com/v1/...` | `x-api-key: UPSTREAM_KEY` |
+| `/gemini/v1beta/...` | `https://generativelanguage.googleapis.com/v1beta/...` | `x-goog-api-key: UPSTREAM_KEY` |
+
+## Anthropic example
+
+```bash
+curl -s -X POST http://localhost:8080/anthropic/v1/messages \
+  -H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-ant-fake-key" \
+  -H "Content-Type: application/json" \
+  -d @../../fixtures/anthropic_normal_request.json
+```
+
+A rejected Anthropic request returns an Anthropic-native error body — see `../../fixtures/reject_anthropic.json`.
+
+## Teardown
+
+```bash
+docker compose down
+```
+
+## Next steps
+
+- See `../recipes/` for team budgets, runaway agent guard, and provider failover scenarios
+- See `../../fixtures/` for all sample request/response artifacts
+- See [fairvisor/benchmark](https://github.com/fairvisor/benchmark) for performance benchmarks
+- See [docs/install/](../../docs/install/) for Kubernetes, VM, and SaaS deployment options
diff --git a/examples/quickstart/docker-compose.yml b/examples/quickstart/docker-compose.yml
new file mode 100644
index 0000000..0b5affa
--- /dev/null
+++ b/examples/quickstart/docker-compose.yml
@@ -0,0 +1,48 @@
+# Fairvisor Edge — Quickstart stack (standalone mode)
+#
+# Usage (run from examples/quickstart/):
+#   docker compose up -d
+#   curl -s http://localhost:8080/readyz   # health check
+#   curl -s -X POST http://localhost:8080/openai/v1/chat/completions \
+#     -H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-fake-upstream-key" \
+#     -H "Content-Type: application/json" \
+#     -d @../../fixtures/normal_request.json      # expect 200
+#   curl -s -X POST http://localhost:8080/openai/v1/chat/completions \
+#     -H "Authorization: Bearer demo-client-jwt.demo-payload.demo-sig:sk-fake-upstream-key" \
+#     -H "Content-Type: application/json" \
+#     -d @../../fixtures/over_limit_request.json  # expect 429
+#
+# This file is also the base for the e2e-smoke CI check.
+# CI extends it via tests/e2e/docker-compose.test.yml; do not diverge the
+# service name, port, or volume contract without updating CI as well.
+
+services:
+  edge:
+    image: ghcr.io/fairvisor/fairvisor-edge:latest
+    ports:
+      - "8080:8080"
+    environment:
+      FAIRVISOR_CONFIG_FILE: /etc/fairvisor/policy.json
+      FAIRVISOR_MODE: wrapper
+      FAIRVISOR_SHARED_DICT_SIZE: 32m
+      FAIRVISOR_LOG_LEVEL: info
+      FAIRVISOR_WORKER_PROCESSES: "1"
+    volumes:
+      - ./policy.json:/etc/fairvisor/policy.json:ro
+    healthcheck:
+      test: ["CMD", "curl", "-sf", "http://127.0.0.1:8080/readyz"]
+      interval: 2s
+      timeout: 2s
+      retries: 15
+      start_period: 5s
+
+  mock_llm:
+    image: nginx:1.27-alpine
+    volumes:
+      - ./mock-llm.conf:/etc/nginx/nginx.conf:ro
+    healthcheck:
+      test: ["CMD", "wget", "-q", "-O", "-", "http://127.0.0.1:80/"]
+      interval: 2s
+      timeout: 2s
+      retries: 10
+      start_period: 5s
diff --git a/examples/quickstart/mock-llm.conf b/examples/quickstart/mock-llm.conf
new file mode 100644
index 0000000..26603ab
--- /dev/null
+++ b/examples/quickstart/mock-llm.conf
@@ -0,0 +1,10 @@
+events {}
+http {
+  server {
+    listen 80;
+    location / {
+      default_type application/json;
+      return 200 '{"id":"chatcmpl-qs","object":"chat.completion","choices":[{"index":0,"message":{"role":"assistant","content":"Hello from the mock backend!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":8,"total_tokens":18}}';
+    }
+  }
+}
diff --git a/examples/quickstart/policy.json b/examples/quickstart/policy.json
new file mode 100644
index 0000000..f5520aa
--- /dev/null
+++ b/examples/quickstart/policy.json
@@ -0,0 +1,31 @@
+{
+  "bundle_version": 1,
+  "issued_at": "2026-01-01T00:00:00Z",
+  "expires_at": "2030-01-01T00:00:00Z",
+  "policies": [
+    {
+      "id": "quickstart-tpm-policy",
+      "spec": {
+        "selector": {
+          "pathPrefix": "/openai/",
+          "methods": ["POST"]
+        },
+        "mode": "enforce",
+        "rules": [
+          {
+            "name": "tpm-limit",
+            "limit_keys": ["jwt:sub"],
+            "algorithm": "token_bucket_llm",
+            "algorithm_config": {
+              "tokens_per_minute": 100,
+              "tokens_per_day": 1000,
+              "burst_tokens": 100,
+              "default_max_completion_tokens": 50
+            }
+          }
+        ]
+      }
+    }
+  ],
+  "kill_switches": []
+}
diff --git a/examples/recipes/provider-failover/README.md b/examples/recipes/provider-failover/README.md
new file mode 100644
index 0000000..520226c
--- /dev/null
+++ b/examples/recipes/provider-failover/README.md
@@ -0,0 +1,52 @@
+# Recipe: Provider Failover / Edge Control
+
+Run two provider paths under independent policy budgets. When the primary
+provider (OpenAI) trips a circuit breaker, your client-side router can
+switch to the fallback (Anthropic) — both paths enforced by the same edge.
+
+## How it works
+
+- `/openai/v1/...` — enforced by an OpenAI TPM limit + a spend-based circuit breaker
+- `/anthropic/v1/...` — enforced by an Anthropic TPM limit
+
+The circuit breaker on the OpenAI path auto-trips when cumulative spend
+exceeds the threshold in a 5-minute window, then auto-resets after 10 minutes.
+Your application can detect the 429 with `X-Fairvisor-Reason: circuit_breaker_open`
+and switch to the Anthropic path without any Fairvisor configuration change.
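The trip-and-reset behaviour can be modelled in a few lines. The sketch below is an illustrative approximation of the semantics described above (sliding spend window, threshold trip, cooldown auto-reset) using this recipe's numbers; it is not Fairvisor's actual implementation, which runs in LuaJIT inside the edge:

```python
import time
from collections import deque

class SpendCircuitBreaker:
    """Approximation of the recipe's breaker: trips when spend inside a
    sliding window exceeds a threshold, auto-resets after a cooldown."""

    def __init__(self, window_s=300, threshold=100_000, cooldown_s=600,
                 clock=time.monotonic):
        self.window_s = window_s      # spend_window_seconds
        self.threshold = threshold    # spend_threshold
        self.cooldown_s = cooldown_s  # cooldown_seconds
        self.clock = clock            # injectable for testing
        self.events = deque()         # (timestamp, spend) pairs
        self.tripped_at = None

    def allow(self) -> bool:
        now = self.clock()
        if self.tripped_at is not None:
            if now - self.tripped_at < self.cooldown_s:
                return False          # open: caller should fail over
            self.tripped_at = None    # cooldown elapsed: auto-reset
            self.events.clear()
        return True

    def record_spend(self, spend: float) -> None:
        now = self.clock()
        self.events.append((now, spend))
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()     # drop spend outside the window
        if sum(s for _, s in self.events) > self.threshold:
            self.tripped_at = now     # trip the breaker
```

The injectable clock makes the trip/reset timing easy to unit-test without sleeping.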
+
+## Deploy
+
+```bash
+cp policy.json /etc/fairvisor/policy.json
+```
+
+## Client-side failover pattern
+
+```python
+import httpx
+
+EDGE = "http://localhost:8080"
+AUTH = "Bearer my-client-jwt.payload.sig:sk-my-upstream-key"
+
+# Each provider exposes a different endpoint and request shape behind the edge:
+# OpenAI uses /v1/chat/completions, Anthropic uses /v1/messages and requires
+# max_tokens. Model names here are examples; substitute your own.
+ENDPOINTS = {
+    "openai": ("/openai/v1/chat/completions", {"model": "gpt-4o"}),
+    "anthropic": ("/anthropic/v1/messages",
+                  {"model": "claude-3-5-haiku-20241022", "max_tokens": 1024}),
+}
+
+def chat(messages, provider="openai"):
+    path, base_payload = ENDPOINTS[provider]
+    resp = httpx.post(
+        f"{EDGE}{path}",
+        headers={"Authorization": AUTH, "Content-Type": "application/json"},
+        json={**base_payload, "messages": messages},
+    )
+    if resp.status_code == 429:
+        reason = resp.headers.get("X-Fairvisor-Reason", "")
+        if reason == "circuit_breaker_open" and provider == "openai":
+            return chat(messages, provider="anthropic")
+    resp.raise_for_status()
+    return resp.json()
+```
+
+## Auth note
+
+The composite `CLIENT_JWT:UPSTREAM_KEY` format is the same for all providers.
+Fairvisor injects the correct provider-native auth header:
+
+- OpenAI: `Authorization: Bearer UPSTREAM_KEY`
+- Anthropic: `x-api-key: UPSTREAM_KEY`
+
+The upstream key is stripped from responses — it never reaches your client.
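For intuition, the rewrite the edge performs can be sketched in Python. This is an illustrative model of the mapping in the auth note above; `rewrite_auth` is hypothetical, and the real rewrite happens in the edge's Lua layer:

```python
# Provider-native auth header per path prefix (Gemini shown for completeness).
PROVIDER_AUTH = {
    "openai": lambda key: {"Authorization": f"Bearer {key}"},
    "anthropic": lambda key: {"x-api-key": key},
    "gemini": lambda key: {"x-goog-api-key": key},
}

def rewrite_auth(composite_bearer: str, provider: str) -> dict:
    # Split "Bearer CLIENT_JWT:UPSTREAM_KEY" at the first colon; a JWT is
    # dot-separated base64url, so it can never itself contain ":".
    token = composite_bearer.removeprefix("Bearer ")
    client_jwt, _, upstream_key = token.partition(":")
    return PROVIDER_AUTH[provider](upstream_key)
```

The `client_jwt` half is what policy rules key on (e.g. `limit_keys: ["jwt:org_id"]`); only the upstream half is forwarded.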
diff --git a/examples/recipes/provider-failover/policy.json b/examples/recipes/provider-failover/policy.json
new file mode 100644
index 0000000..5974c77
--- /dev/null
+++ b/examples/recipes/provider-failover/policy.json
@@ -0,0 +1,62 @@
+{
+  "bundle_version": 1,
+  "issued_at": "2026-01-01T00:00:00Z",
+  "expires_at": "2030-01-01T00:00:00Z",
+  "policies": [
+    {
+      "id": "provider-failover-primary",
+      "spec": {
+        "selector": {
+          "pathPrefix": "/openai/",
+          "methods": ["POST"]
+        },
+        "mode": "enforce",
+        "rules": [
+          {
+            "name": "openai-tpm",
+            "limit_keys": ["jwt:org_id"],
+            "algorithm": "token_bucket_llm",
+            "algorithm_config": {
+              "tokens_per_minute": 200000,
+              "burst_tokens": 200000,
+              "default_max_completion_tokens": 2048
+            }
+          },
+          {
+            "name": "openai-circuit-breaker",
+            "limit_keys": ["jwt:org_id"],
+            "algorithm": "circuit_breaker",
+            "algorithm_config": {
+              "spend_window_seconds": 300,
+              "spend_threshold": 100000,
+              "cooldown_seconds": 600
+            }
+          }
+        ]
+      }
+    },
+    {
+      "id": "provider-failover-fallback",
+      "spec": {
+        "selector": {
+          "pathPrefix": "/anthropic/",
+          "methods": ["POST"]
+        },
+        "mode": "enforce",
+        "rules": [
+          {
+            "name": "anthropic-tpm",
+            "limit_keys": ["jwt:org_id"],
+            "algorithm": "token_bucket_llm",
+            "algorithm_config": {
+              "tokens_per_minute": 100000,
+              "burst_tokens": 100000,
+              "default_max_completion_tokens": 2048
+            }
+          }
+        ]
+      }
+    }
+  ],
+  "kill_switches": []
+}
diff --git a/examples/recipes/runaway-agent-guard/README.md b/examples/recipes/runaway-agent-guard/README.md
new file mode 100644
index 0000000..7b34491
--- /dev/null
+++ b/examples/recipes/runaway-agent-guard/README.md
@@ -0,0 +1,50 @@
+# Recipe: Runaway Agent Guard
+
+Stop runaway agentic workflows before they exhaust your token budget or
+billing limit.
+
+## Problem
+
+Autonomous agents (LangChain, AutoGPT, custom loops) can enter retry storms
+or infinite planning loops. Without enforcement, a single runaway agent
+can consume thousands of dollars of API budget in minutes.
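A sliding-window request counter is enough to see how such a loop gets caught. The sketch below is an illustrative model using this recipe's numbers (more than 30 requests in a 60 s window trips a 120 s cooldown); it is not Fairvisor's actual `loop_detector` implementation:

```python
import time
from collections import defaultdict, deque

class LoopDetector:
    """Per-agent sliding-window request counter with a trip cooldown."""

    def __init__(self, window_s=60, max_requests=30, cooldown_s=120,
                 clock=time.monotonic):
        self.window_s = window_s
        self.max_requests = max_requests
        self.cooldown_s = cooldown_s
        self.clock = clock              # injectable for testing
        self.hits = defaultdict(deque)  # agent_id -> request timestamps
        self.cooling = {}               # agent_id -> time it tripped

    def allow(self, agent_id: str) -> bool:
        now = self.clock()
        tripped = self.cooling.get(agent_id)
        if tripped is not None:
            if now - tripped < self.cooldown_s:
                return False            # cooling down: reject
            del self.cooling[agent_id]  # cooldown over
        window = self.hits[agent_id]
        window.append(now)
        while window and now - window[0] > self.window_s:
            window.popleft()            # evict hits outside the window
        if len(window) > self.max_requests:
            self.cooling[agent_id] = now  # too many hits in the window: trip
            return False
        return True
```

Because the window is keyed per agent, one looping agent trips its own cooldown without affecting well-behaved agents.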
+
+## How it works
+
+Two rules cooperate:
+
+1. **Loop detector** — counts requests per `agent_id` in a sliding window.
+   If the agent fires more than 30 requests in 60 seconds, it trips a
+   120-second cooldown. This catches tight retry loops.
+
+2. **TPM guard** — caps tokens per minute per agent. A burst-heavy agent
+   that passes the loop check still cannot drain the token pool.
+
+## Deploy
+
+```bash
+cp policy.json /etc/fairvisor/policy.json
+```
+
+## JWT shape expected
+
+```json
+{
+  "sub": "user-456",
+  "agent_id": "autoagent-prod-7",
+  "exp": 9999999999
+}
+```
+
+## Kill switch for incidents
+
+If an agent causes an incident, flip a kill switch without restarting the edge:
+
+```bash
+# Via CLI
+fairvisor kill-switch enable agent-id=autoagent-prod-7
+
+# Or update the policy bundle with a kill_switch entry and hot-reload
+```
+
+See `docs/cookbook/kill-switch-incident-response.md` for the full incident playbook.
diff --git a/examples/recipes/runaway-agent-guard/policy.json b/examples/recipes/runaway-agent-guard/policy.json
new file mode 100644
index 0000000..2715b60
--- /dev/null
+++ b/examples/recipes/runaway-agent-guard/policy.json
@@ -0,0 +1,40 @@
+{
+  "bundle_version": 1,
+  "issued_at": "2026-01-01T00:00:00Z",
+  "expires_at": "2030-01-01T00:00:00Z",
+  "policies": [
+    {
+      "id": "runaway-agent-guard",
+      "spec": {
+        "selector": {
+          "pathPrefix": "/",
+          "methods": ["POST"]
+        },
+        "mode": "enforce",
+        "rules": [
+          {
+            "name": "loop-detection",
+            "limit_keys": ["jwt:agent_id"],
+            "algorithm": "loop_detector",
+            "algorithm_config": {
+              "window_seconds": 60,
+              "max_requests": 30,
+              "cooldown_seconds": 120
+            }
+          },
+          {
+            "name": "agent-tpm-guard",
+            "limit_keys": ["jwt:agent_id"],
+            "algorithm": "token_bucket_llm",
+            "algorithm_config": {
+              "tokens_per_minute": 50000,
+              "burst_tokens": 50000,
+              "default_max_completion_tokens": 512
+            }
+          }
+        ]
+      }
+    }
+  ],
+  "kill_switches": []
+}
diff --git a/examples/recipes/team-budgets/README.md b/examples/recipes/team-budgets/README.md
new file mode 100644
index 0000000..54c1551
--- /dev/null
+++ b/examples/recipes/team-budgets/README.md
@@ -0,0 +1,45 @@
+# Recipe: Team Budgets
+
+Enforce per-team token and cost limits using JWT claims.
+
+## How it works
+
+Each request carries a JWT with a `team_id` claim. Fairvisor uses this as
+the bucket key for two independent rules:
+
+1. **TPM/TPD limit** — token-rate enforcement per minute and per day
+2. **Monthly cost budget** — cumulative cost cap with staged warn/throttle/reject
+
+## Deploy
+
+```bash
+# Copy policy to your edge config path
+cp policy.json /etc/fairvisor/policy.json
+
+# Or use with docker compose (standalone mode):
+FAIRVISOR_CONFIG_FILE=./policy.json FAIRVISOR_MODE=wrapper docker compose up -d
+```
+
+## JWT shape expected
+
+```json
+{
+  "sub": "user-123",
+  "team_id": "engineering",
+  "plan": "pro",
+  "exp": 9999999999
+}
+```
+
+## Staged actions at cost budget thresholds
+
+| Threshold | Action |
+|---|---|
+| 80% | Warn (allow, log, emit business event) |
+| 95% | Throttle (allow with 500 ms delay) |
+| 100% | Reject (429, `budget_exceeded`) |
+
+## Related fixtures
+
+- `../../../fixtures/reject_tpd_exceeded.json` — TPD reject body
+- `../../../fixtures/reject_tpm_exceeded.json` — TPM reject body
diff --git a/examples/recipes/team-budgets/policy.json b/examples/recipes/team-budgets/policy.json
new file mode 100644
index 0000000..d361a30
--- /dev/null
+++ b/examples/recipes/team-budgets/policy.json
@@ -0,0 +1,47 @@
+{
+  "bundle_version": 1,
+  "issued_at": "2026-01-01T00:00:00Z",
+  "expires_at": "2030-01-01T00:00:00Z",
+  "policies": [
+    {
+      "id": "team-token-budget",
+      "spec": {
+        "selector": {
+          "pathPrefix": "/openai/",
+          "methods": ["POST"]
+        },
+        "mode": "enforce",
+        "rules": [
+          {
+            "name": "per-team-tpm",
+            "limit_keys": ["jwt:team_id"],
+            "algorithm": "token_bucket_llm",
+            "algorithm_config": {
+              "tokens_per_minute": 120000,
+              "tokens_per_day": 2000000,
+              "burst_tokens": 120000,
+              "default_max_completion_tokens": 1024
+            }
+          },
+          {
+            "name": "per-team-cost-budget",
+            "limit_keys": ["jwt:team_id"],
+            "algorithm": "cost_based",
+            "algorithm_config": {
+              "budget": 50000,
+              "period": "30d",
+              "cost_key": "fixed",
+              "fixed_cost": 1,
+              "staged_actions": [
+                { "threshold_percent": 80, "action": "warn" },
+                { "threshold_percent": 95, "action": "throttle", "delay_ms": 500 },
+                { "threshold_percent": 100, "action": "reject" }
+              ]
+            }
+          }
+        ]
+      }
+    }
+  ],
+  "kill_switches": []
+}
diff --git a/fixtures/allow_response.json b/fixtures/allow_response.json
new file mode 100644
index 0000000..7cc0312
--- /dev/null
+++ b/fixtures/allow_response.json
@@ -0,0 +1,28 @@
+{
+  "_comment": "Sample 200 response for an allowed request in wrapper mode. Note: no Authorization, x-api-key, or x-goog-api-key headers — upstream auth is stripped on the response side.",
+  "_status": 200,
+  "_headers": {
+    "Content-Type": "application/json",
+    "X-Fairvisor-Reason": null,
+    "Authorization": null,
+    "x-api-key": null,
+    "x-goog-api-key": null
+  },
+  "id": "chatcmpl-example",
+  "object": "chat.completion",
+  "choices": [
+    {
+      "index": 0,
+      "message": {
+        "role": "assistant",
+        "content": "Hello! How can I help you today?"
+      },
+      "finish_reason": "stop"
+    }
+  ],
+  "usage": {
+    "prompt_tokens": 10,
+    "completion_tokens": 9,
+    "total_tokens": 19
+  }
+}
diff --git a/fixtures/anthropic_normal_request.json b/fixtures/anthropic_normal_request.json
new file mode 100644
index 0000000..bcffdbf
--- /dev/null
+++ b/fixtures/anthropic_normal_request.json
@@ -0,0 +1,10 @@
+{
+  "model": "claude-3-5-haiku-20241022",
+  "max_tokens": 20,
+  "messages": [
+    {
+      "role": "user",
+      "content": "Say hello in one sentence."
+    }
+  ]
+}
diff --git a/fixtures/normal_request.json b/fixtures/normal_request.json
new file mode 100644
index 0000000..049a4e4
--- /dev/null
+++ b/fixtures/normal_request.json
@@ -0,0 +1,10 @@
+{
+  "model": "gpt-4o-mini",
+  "messages": [
+    {
+      "role": "user",
+      "content": "Say hello in one sentence."
+    }
+  ],
+  "max_tokens": 20
+}
diff --git a/fixtures/over_limit_request.json b/fixtures/over_limit_request.json
new file mode 100644
index 0000000..b3b554f
--- /dev/null
+++ b/fixtures/over_limit_request.json
@@ -0,0 +1,10 @@
+{
+  "model": "gpt-4o",
+  "messages": [
+    {
+      "role": "user",
+      "content": "Say hello in one sentence."
+    }
+  ],
+  "max_tokens": 200000
+}
diff --git a/fixtures/reject_anthropic.json b/fixtures/reject_anthropic.json
new file mode 100644
index 0000000..bdf468f
--- /dev/null
+++ b/fixtures/reject_anthropic.json
@@ -0,0 +1,13 @@
+{
+  "_comment": "Anthropic-native 429 reject body. Used for /anthropic/* paths.",
+  "_headers": {
+    "X-Fairvisor-Reason": "tpm_exceeded",
+    "Retry-After": "60",
+    "Content-Type": "application/json"
+  },
+  "type": "error",
+  "error": {
+    "type": "rate_limit_error",
+    "message": "Token budget exceeded for this tenant."
+  }
+}
diff --git a/fixtures/reject_gemini.json b/fixtures/reject_gemini.json
new file mode 100644
index 0000000..f0df901
--- /dev/null
+++ b/fixtures/reject_gemini.json
@@ -0,0 +1,13 @@
+{
+  "_comment": "Gemini-native 429 reject body. Used for /gemini/* paths.",
+  "_headers": {
+    "X-Fairvisor-Reason": "tpm_exceeded",
+    "Retry-After": "60",
+    "Content-Type": "application/json"
+  },
+  "error": {
+    "code": 429,
+    "message": "Token budget exceeded for this tenant.",
+    "status": "RESOURCE_EXHAUSTED"
+  }
+}
diff --git a/fixtures/reject_openai.json b/fixtures/reject_openai.json
new file mode 100644
index 0000000..eabd023
--- /dev/null
+++ b/fixtures/reject_openai.json
@@ -0,0 +1,14 @@
+{
+  "_comment": "OpenAI-native 429 reject body. Used for /openai/* paths and OpenAI-compatible providers.",
+  "_headers": {
+    "X-Fairvisor-Reason": "tpm_exceeded",
+    "Retry-After": "60",
+    "Content-Type": "application/json"
+  },
+  "error": {
+    "type": "rate_limit_error",
+    "code": "tpm_exceeded",
+    "message": "Token budget exceeded for this tenant.",
+    "param": null
+  }
+}
diff --git a/fixtures/reject_prompt_too_large.json b/fixtures/reject_prompt_too_large.json
new file mode 100644
index 0000000..9c4cf8c
--- /dev/null
+++ b/fixtures/reject_prompt_too_large.json
@@ -0,0 +1,13 @@
+{
+  "_comment": "429 body returned when the request exceeds max_prompt_tokens.",
+  "_headers": {
+    "X-Fairvisor-Reason": "prompt_too_large",
+    "Content-Type": "application/json"
+  },
+  "error": {
+    "type": "rate_limit_error",
+    "code": "prompt_too_large",
+    "message": "Request prompt exceeds the maximum allowed token count for this policy.",
+    "param": null
+  }
+}
diff --git a/fixtures/reject_tpd_exceeded.json b/fixtures/reject_tpd_exceeded.json
new file mode 100644
index 0000000..8d2bcdb
--- /dev/null
+++ b/fixtures/reject_tpd_exceeded.json
@@ -0,0 +1,16 @@
+{
+  "_comment": "429 body returned when the per-day token budget is exhausted.",
+  "_headers": {
+    "X-Fairvisor-Reason": "tpd_exceeded",
+    "Retry-After": "86400",
+    "RateLimit-Limit": "2000000",
+    "RateLimit-Remaining": "0",
+    "Content-Type": "application/json"
+  },
+  "error": {
+    "type": "rate_limit_error",
+    "code": "tpd_exceeded",
+    "message": "Token budget exceeded for this tenant.",
+    "param": null
+  }
+}
diff --git a/fixtures/reject_tpm_exceeded.json b/fixtures/reject_tpm_exceeded.json
new file mode 100644
index 0000000..26f45d0
--- /dev/null
+++ b/fixtures/reject_tpm_exceeded.json
@@ -0,0 +1,17 @@
+{
+  "_comment": "429 body returned when the per-minute token budget is exhausted.",
+  "_headers": {
+    "X-Fairvisor-Reason": "tpm_exceeded",
+    "Retry-After": "60",
+    "RateLimit-Limit": "120000",
+    "RateLimit-Remaining": "0",
+    "RateLimit-Reset": "",
+    "Content-Type": "application/json"
+  },
+  "error": {
+    "type": "rate_limit_error",
+    "code": "tpm_exceeded",
+    "message": "Token budget exceeded for this tenant.",
+    "param": null
+  }
+}