
Package edge repo as a runnable onboarding funnel #32

@levleontiev

Description

Why

The edge repo already has a strong README and useful policy samples, but it is still weaker than it should be as a hands-on onboarding funnel. The main gap is packaging, not missing ideas.

Problem

The current examples/ directory is mostly policy snippets plus markdown, not end-to-end runnable recipes. There is no single canonical quickstart folder that takes a user from clone to working enforcement with expected output, and no dedicated place for sample reject/output artifacts.

Scope

  1. Add one canonical runnable quickstart path in the repo.
  2. Turn the existing policy samples into a smaller set of deployable recipes.
  3. Add sample response/output artifacts for the most important behaviors.
  4. Link to fairvisor/benchmark as the benchmark source of truth instead of duplicating benchmark methodology in this repo.

Proposed deliverables

  • examples/quickstart/ with runnable docker-compose.yml or equivalent
  • expected-output section or fixture files showing success / reject behavior
  • 3 canonical recipes built from existing samples:
    • team budgets
    • runaway agent guard
    • provider failover or equivalent edge-control scenario
  • dedicated sample outputs for:
    • fixtures/reject_tpm_exceeded.json — 429 JSON body for TPM limit hit
    • fixtures/reject_tpd_exceeded.json — 429 JSON body for TPD (daily) limit hit
    • fixtures/reject_prompt_too_large.json — 429 JSON body for max_prompt_tokens exceeded
    • relevant headers (Retry-After, RateLimit-*, X-Fairvisor-Reason) shown in fixture comments or README
    • OpenAI-compatible reject behavior fixture
  • README tightening where needed to point into the new runnable path
  • README/docs pointer to fairvisor/benchmark as benchmark source of truth
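
To make the fixture contract concrete, here is a minimal sketch of what fixtures/reject_tpm_exceeded.json could contain. The field names (error.type, error.code) and the header values in the comments are illustrative assumptions for this issue, not a final schema:

```python
import json

# Illustrative shape for fixtures/reject_tpm_exceeded.json.
# Field names and values are assumptions, not the final schema.
reject_tpm_exceeded = {
    "error": {
        "message": "Tokens-per-minute limit exceeded for this team.",
        "type": "rate_limit_error",
        "code": "tpm_exceeded",
    }
}

# Headers that would accompany the 429, documented next to the fixture
# (per the deliverables list above):
#   Retry-After: 12
#   RateLimit-Remaining: 0
#   X-Fairvisor-Reason: tpm_exceeded

print(json.dumps(reject_tpm_exceeded, indent=2))
```

The TPD and prompt-too-large fixtures would follow the same shape with different code values, so docs and CI can diff against them byte-for-byte.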

Implementation constraints

  • Standalone mode first: the quickstart must work without a SaaS account (standalone/local config mode). SaaS-connected mode is a follow-on step, not a prerequisite.
  • Reuse the e2e compose setup: examples/quickstart/docker-compose.yml should be reusable as the base for the e2e smoke test (e2e-smoke CI check). Avoid a second parallel compose stack diverging from the example.
  • Named fixture files: sample artifacts must use the specific filenames listed above so CI and docs can reference them by stable paths.
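
Because the fixtures use stable paths, the e2e-smoke check can reduce to a small comparison step shared with the quickstart. A sketch of that comparison logic, assuming nothing beyond the fixture names in this issue (the helper itself is hypothetical):

```python
import json

# Hypothetical helper an e2e-smoke check could share with the quickstart:
# given a gateway response, verify it matches a named fixture exactly.
def matches_fixture(status: int, body: str,
                    expected_status: int, expected_body: dict) -> bool:
    """True when both the status code and the JSON body equal the fixture."""
    if status != expected_status:
        return False
    return json.loads(body) == expected_body

# Stand-in for json.load(open("fixtures/reject_tpm_exceeded.json")):
fixture = {"error": {"code": "tpm_exceeded"}}
print(matches_fixture(429, json.dumps(fixture), 429, fixture))  # True
print(matches_fixture(200, "{}", 429, fixture))                 # False
```

Keeping the check this thin is what makes a single compose stack viable for both the example and CI.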

Spec 019 alignment

Feature 019 (LLM wrapper mode) is now specific enough that the repo onboarding path should account for it explicitly.

Add these expectations to this issue:

  • the canonical quickstart should demonstrate at least one wrapper-mode path, not only generic proxy enforcement
  • the runnable path should show the concrete spec 019 promise: point a client at Fairvisor, use provider prefix routing, and enforce policy without SDK code changes
  • at least one quickstart example should hit /openai/v1/chat/completions (or another wrapper prefix from the provider registry) using the composite CLIENT_JWT:UPSTREAM_KEY contract
  • sample fixtures should include provider-native wrapper rejection/output examples, not only generic rejects:
    • OpenAI/OpenAI-compatible reject body
    • Anthropic reject body
    • Gemini native reject body
  • sample docs/fixtures should distinguish request logs from business events; do not imply per-request audit events if spec 019 says business events are only for notable enforcement outcomes

This does not mean building the wrapper implementation here. It means the repo packaging and examples should already be shaped around that onboarding story once wrapper mode lands.
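
For the quickstart docs, the composite CLIENT_JWT:UPSTREAM_KEY contract can be illustrated with a short sketch. Splitting on the first colon is an assumption made here for illustration (JWTs contain dots, not colons, so the first colon can delimit the upstream key); the actual spec 019 parsing rules govern:

```python
# Sketch of splitting the spec 019 composite credential
# "Authorization: Bearer CLIENT_JWT:UPSTREAM_KEY".
# First-colon splitting is an assumption for this example.
def split_composite(authorization: str) -> tuple[str, str]:
    scheme, _, credential = authorization.partition(" ")
    if scheme != "Bearer":
        raise ValueError("expected Bearer scheme")
    client_jwt, sep, upstream_key = credential.partition(":")
    if not sep:
        raise ValueError("expected CLIENT_JWT:UPSTREAM_KEY")
    return client_jwt, upstream_key

jwt, key = split_composite("Bearer eyJhbGci.payload.sig:sk-test-123")
print(jwt)  # eyJhbGci.payload.sig
print(key)  # sk-test-123
```

Showing this in the quickstart README removes the most common first-request failure: users pasting only one of the two credentials.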

Acceptance criteria

  • docker compose up -d && curl -s -H 'Content-Type: application/json' http://localhost:8080/v1/chat/completions -d @fixtures/over_limit_request.json returns a 429 with a body matching fixtures/reject_tpm_exceeded.json.
  • docker compose up -d && curl -s -H 'Content-Type: application/json' http://localhost:8080/v1/chat/completions -d @fixtures/normal_request.json returns 200. (Without the explicit header, curl -d sends application/x-www-form-urlencoded, which an OpenAI-compatible endpoint would reject.)
  • A technical user can go from clone to first working result through one canonical repo-owned path without stitching together docs manually.
  • examples/ read as deployable recipes, not just disconnected policy fragments.
  • The repo contains concrete sample outputs for core enforcement behaviors (at minimum: TPM reject, TPD reject, prompt-too-large reject, allowed pass-through).
  • Benchmark content in this repo links to fairvisor/benchmark instead of diverging from it.
  • Wrapper-mode onboarding is represented explicitly enough that docs and website can point to it as the no-SDK-change proof path when spec 019 implementation lands.

Dependencies

  • Downstream: fairvisor/documentation issue fix: avoid colliding IPs in retry-after jitter test #8 — the docs site navigation and install/deploy guidance should link into the quickstart path added here. Coordinate the fixture file paths and compose structure before the documentation PR lands.

Spec 019 refresh (second pass)

The issue is directionally aligned already, but the runnable onboarding path should now be tightened to the final wrapper contract in spec 019:

  • The canonical wrapper example should use a provider-prefixed path, not a generic OpenAI-only /v1 path. Prefer /openai/v1/chat/completions as the primary proof path.
  • The quickstart request example should show the actual composite auth contract explicitly:
    • Authorization: Bearer CLIENT_JWT:UPSTREAM_KEY
  • If the repo demonstrates a second path beyond OpenAI, prefer one provider-native path (for example Anthropic /anthropic/v1/messages or Gemini native /gemini/v1beta/...) so the examples prove that wrapper mode is not only OpenAI-compatible forwarding.
  • Sample output/fixture coverage should include one explicit check that upstream auth headers are not leaked back to the client. That means response fixtures or expected-output docs should prove that Authorization, x-api-key, and x-goog-api-key are stripped on the response side.
  • Header-descriptor examples should stay accurate to the spec: only headers actually referenced via active header:* descriptors are stripped before upstream forwarding. Do not word the repo examples as if all custom headers are removed unconditionally.
  • If hybrid-mode examples are mentioned, describe them accurately: in GATEWAY_MODE=hybrid, known provider prefixes route to wrapper and everything else falls back to proxy mode. Do not imply a generic /{provider}/v1 rewrite layer.
  • Keep event wording strict: request logs are per-request operational logs; business events are emitted only for notable enforcement outcomes. Avoid any repo copy that drifts back toward per-request audit-event language.
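
The response-side stripping check above can be expressed as a tiny fixture assertion. The three header names come from this issue; the helper and sample headers are illustrative:

```python
# Sketch of a fixture check proving upstream auth headers are not
# leaked back to the client. Header set is from this issue's text;
# the helper itself is hypothetical.
SENSITIVE = {"authorization", "x-api-key", "x-goog-api-key"}

def leaked_auth_headers(response_headers: dict) -> set:
    """Return any sensitive auth headers present in the response."""
    return {k for k in response_headers if k.lower() in SENSITIVE}

clean = {"content-type": "application/json", "retry-after": "12"}
dirty = {"Content-Type": "application/json", "x-api-key": "sk-..."}
print(leaked_auth_headers(clean))  # set()
print(leaked_auth_headers(dirty))  # {'x-api-key'}
```

A check this small can live next to the fixtures and run in the same e2e-smoke job.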

Acceptance-criteria tightening

Conceptually, replace the wrapper curl proof points with the final path/auth shape:

  • docker compose up -d plus a request to /openai/v1/chat/completions using Authorization: Bearer CLIENT_JWT:UPSTREAM_KEY returns 200 for an allowed fixture.
  • The same path with an over-limit fixture returns 429 with the expected provider-compatible reject body.
  • At least one fixture or documented expected output proves response-side auth-header stripping.
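
Putting the tightened proof points together, the first wrapper-mode request would look roughly like this. The host, port, and placeholder credentials are assumptions; only the provider-prefixed path and the composite Bearer format come from spec 019 as described above:

```python
import urllib.request

# Sketch of the tightened proof request: provider-prefixed wrapper path
# plus the composite auth header. Host, port, and the placeholder
# credentials are assumptions for illustration.
req = urllib.request.Request(
    "http://localhost:8080/openai/v1/chat/completions",
    data=b'{"model": "gpt-4o-mini", "messages": '
         b'[{"role": "user", "content": "hi"}]}',
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer CLIENT_JWT:UPSTREAM_KEY",
    },
)
# urllib.request.urlopen(req) against a running quickstart stack would
# return 200 for an allowed fixture and 429 with the provider-compatible
# reject body for an over-limit one.
print(req.full_url, req.get_header("Authorization"))
```

The same request with the path swapped to /anthropic/v1/messages (and a provider-native body) would cover the second, non-OpenAI proof path suggested above.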
