Why
The edge repo already has a strong README and useful policy samples, but it is still weaker than it should be as a hands-on onboarding funnel. The main gap is packaging, not missing ideas.
Problem
Current examples/ are mostly policy snippets plus markdown, not end-to-end runnable recipes. There is no single canonical quickstart folder that takes a user from clone to working enforcement with expected output. There is also no dedicated place for sample reject/output artifacts.
Scope
- Add one canonical runnable quickstart path in the repo.
- Turn the existing policy samples into a smaller set of deployable recipes.
- Add sample response/output artifacts for the most important behaviors.
- Link to `fairvisor/benchmark` as the benchmark source of truth instead of duplicating benchmark methodology in this repo.
Proposed deliverables
- `examples/quickstart/` with a runnable `docker-compose.yml` or equivalent
- expected-output section or fixture files showing success / reject behavior
- 3 canonical recipes built from existing samples:
- team budgets
- runaway agent guard
- provider failover or equivalent edge-control scenario
- dedicated sample outputs for:
  - `fixtures/reject_tpm_exceeded.json` — 429 JSON body for a TPM limit hit
  - `fixtures/reject_tpd_exceeded.json` — 429 JSON body for a TPD (daily) limit hit
  - `fixtures/reject_prompt_too_large.json` — 429 JSON body for `max_prompt_tokens` exceeded
  - relevant headers (`Retry-After`, `RateLimit-*`, `X-Fairvisor-Reason`) shown in fixture comments or the README
  - an OpenAI-compatible reject-behavior fixture
- README tightening where needed to point into the new runnable path
- README/docs pointer to `fairvisor/benchmark` as the benchmark source of truth
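As a concrete sketch, `fixtures/reject_tpm_exceeded.json` might look like the following; the field names and the `_headers` convention are illustrative assumptions, not the final reject schema:

```json
{
  "error": {
    "type": "rate_limit_exceeded",
    "code": "tpm_exceeded",
    "message": "Token-per-minute budget exhausted for this team; retry after the window resets."
  },
  "_headers": {
    "Retry-After": "12",
    "RateLimit-Remaining": "0",
    "X-Fairvisor-Reason": "tpm_exceeded"
  }
}
```

Keeping the expected headers inside the fixture (or in a sibling README note) lets CI diff both body and headers from one stable path.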
Implementation constraints
- Standalone mode first: the quickstart must work without a SaaS account (standalone/local config mode). SaaS-connected mode is a follow-on step, not a prerequisite.
- Reuse the e2e compose setup: `examples/quickstart/docker-compose.yml` should be reusable as the base for the e2e smoke test (the `e2e-smoke` CI check). Avoid a second parallel compose stack that diverges from the example.
- Named fixture files: sample artifacts must use the specific filenames listed above so CI and docs can reference them by stable paths.
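A minimal shape for the shared compose file could be as small as this; the image name, config mount, and port are placeholder assumptions, not the final layout:

```yaml
# examples/quickstart/docker-compose.yml -- sketch only; image and paths are placeholders
services:
  fairvisor:
    image: fairvisor/edge:latest     # assumed image name
    ports:
      - "8080:8080"
    volumes:
      - ./config:/etc/fairvisor:ro   # standalone/local policy config; no SaaS account required
```

Because the e2e smoke test would reuse this same file, any drift between the example and CI shows up as a failing `e2e-smoke` check rather than a stale doc.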
Spec 019 alignment
Feature 019 (LLM wrapper mode) is now specific enough that the repo onboarding path should account for it explicitly.
Add these expectations to this issue:
- the canonical quickstart should demonstrate at least one wrapper-mode path, not only generic proxy enforcement
- the runnable path should show the concrete spec 019 promise: point a client at Fairvisor, use provider-prefix routing, and enforce policy without SDK code changes
- at least one quickstart example should hit `/openai/v1/chat/completions` (or another wrapper prefix from the provider registry) using the composite `CLIENT_JWT:UPSTREAM_KEY` contract
- sample fixtures should include provider-native wrapper rejection/output examples, not only generic rejects:
- OpenAI/OpenAI-compatible reject body
- Anthropic reject body
- Gemini native reject body
- sample docs/fixtures should distinguish request logs from business events; do not imply per-request audit events if spec 019 says business events are emitted only for notable enforcement outcomes
This does not mean building the wrapper implementation here. It means the repo packaging and examples should already be shaped around that onboarding story once wrapper mode lands.
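To make the composite-auth contract concrete, the quickstart could build the bearer token like this. The credential values below are placeholders; only the colon-joined `CLIENT_JWT:UPSTREAM_KEY` shape comes from spec 019, and the host/port and fixture path in the comment are assumptions:

```shell
#!/bin/sh
# Placeholders -- in the real quickstart these come from Fairvisor and the upstream provider.
CLIENT_JWT="client.jwt.example"
UPSTREAM_KEY="sk-upstream-example"

# Spec 019 composite contract: one bearer token, client JWT and upstream key joined by a colon.
AUTH_HEADER="Authorization: Bearer ${CLIENT_JWT}:${UPSTREAM_KEY}"
echo "${AUTH_HEADER}"

# The wrapper-mode request would then look like:
#   curl -s http://localhost:8080/openai/v1/chat/completions \
#     -H "${AUTH_HEADER}" -H "Content-Type: application/json" \
#     -d @fixtures/normal_request.json
```

The point of showing the header construction explicitly is that no SDK change is needed: any OpenAI-compatible client that lets you set the API key string can carry the composite token.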
Acceptance criteria
- `docker compose up -d && curl -s http://localhost:8080/v1/chat/completions -d @fixtures/over_limit_request.json` returns a `429` with the body matching `fixtures/reject_tpm_exceeded.json`.
- `docker compose up -d && curl -s http://localhost:8080/v1/chat/completions -d @fixtures/normal_request.json` returns `200`.
- A technical user can go from clone to first working result through one canonical repo-owned path without stitching docs together manually.
- `examples/` read as deployable recipes, not disconnected policy fragments.
- The repo contains concrete sample outputs for core enforcement behaviors (at minimum: TPM reject, TPD reject, prompt-too-large reject, allowed pass-through).
- Benchmark content in this repo links to `fairvisor/benchmark` instead of diverging from it.
- Wrapper-mode onboarding is represented explicitly enough that the docs and website can point to it as the no-SDK-change proof path when the spec 019 implementation lands.
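For illustration, `fixtures/over_limit_request.json` could be an ordinary chat-completions payload deliberately sized past the TPM budget; the model name and content below are placeholders, not a prescribed shape:

```json
{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "user",
      "content": "PLACEHOLDER: repeat enough text here to exceed the configured TPM budget in a single request"
    }
  ]
}
```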
Dependencies
- Downstream: `fairvisor/documentation` issue #8 ("fix: avoid colliding IPs in retry-after jitter test") — the docs-site navigation and install/deploy guidance should link into the quickstart path added here. Coordinate the fixture file paths and compose structure before the documentation PR lands.
Spec 019 refresh (second pass)
The issue is directionally aligned already, but the runnable onboarding path should now be tightened to the final wrapper contract in spec 019:
- The canonical wrapper example should use a provider-prefixed path, not a generic OpenAI-only `/v1` path. Prefer `/openai/v1/chat/completions` as the primary proof path.
- The quickstart request example should show the actual composite auth contract explicitly: `Authorization: Bearer CLIENT_JWT:UPSTREAM_KEY`.
- If the repo demonstrates a second path beyond OpenAI, prefer one provider-native path (for example Anthropic `/anthropic/v1/messages` or Gemini-native `/gemini/v1beta/...`) so the examples prove that wrapper mode is not only OpenAI-compatible forwarding.
- Sample output/fixture coverage should include one explicit check that upstream auth headers are not leaked back to the client: response fixtures or expected-output docs should prove that `Authorization`, `x-api-key`, and `x-goog-api-key` are stripped on the response side.
- Header-descriptor examples should stay accurate to the spec: only headers actually referenced via active `header:*` descriptors are stripped before upstream forwarding. Do not word the repo examples as if all custom headers are removed unconditionally.
- If hybrid-mode examples are mentioned, describe them accurately: in `GATEWAY_MODE=hybrid`, known provider prefixes route to wrapper mode and everything else falls back to proxy mode. Do not imply a generic `/{provider}/v1` rewrite layer.
- Keep event wording strict: request logs are per-request operational logs; business events are emitted only for notable enforcement outcomes. Avoid any repo copy that drifts back toward per-request audit-event language.
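One cheap way the expected-output docs could prove response-side stripping is a grep over captured response headers. The heredoc below stands in for `curl -s -D headers.txt ...` output and its contents are illustrative; only the three header names come from the spec wording above:

```shell
#!/bin/sh
# Stand-in for response headers captured via curl -D (contents illustrative).
cat > /tmp/fv_headers.txt <<'EOF'
HTTP/1.1 200 OK
content-type: application/json
x-fairvisor-reason: allowed
EOF

# None of the upstream auth headers may appear on the response side.
leaks=0
for h in authorization x-api-key x-goog-api-key; do
  if grep -qi "^${h}:" /tmp/fv_headers.txt; then
    echo "LEAK: ${h}"
    leaks=1
  fi
done
[ "$leaks" -eq 0 ] && echo "auth headers stripped"
```

A check in this shape could live directly in the `e2e-smoke` script, so the no-leak guarantee is continuously verified rather than only documented.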
Acceptance-criteria tightening
Replace the wrapper curl proof points conceptually with the final path/auth shape:
- `docker compose up -d` plus a request to `/openai/v1/chat/completions` using `Authorization: Bearer CLIENT_JWT:UPSTREAM_KEY` returns `200` for an allowed fixture.
- The same path with an over-limit fixture returns `429` with the expected provider-compatible reject body.
- At least one fixture or documented expected output proves response-side auth-header stripping.