feat(api): SP-2 PR-B · dashboard aggregation endpoints + candidates surface (AIN-263/264/265/266) by hizrianraz · Pull Request #71 · ainfera-ai/api

hizrianraz · 2026-05-23T16:15:39Z

Summary

Lands the four read-only endpoints the /dashboard Glance page reads — until this router landed, all three GETs 404'd and the front door crashed (digest 3541235483). The candidates surface on /v1/inferences/{id}/decision is additive (byte-reproducible candidates[] unchanged; dashboard_candidates[] is the new collapsed shape SP-3 renders).

Stacks on SP-1 (#70). Base is chore/sp1-inference-rename; merges AFTER that PR.

Endpoints (all tenant-scoped, §D3 honest-empty 200s)

GET /v1/usage/daily?days=30 (AIN-263) — daily inference rollup. Empty tenant → days:[] zeros 200.
GET /v1/caps/rollup (AIN-264) — caps utilization snapshot. budget.set sums spend_policy.daily_cap_usd; latency_p50_ms from routing_outcomes (24h).
GET /v1/agents/{id}/metrics (AIN-265) — 24h window. 404s cross-tenant (same masking as /v1/inferences/{id}/decision).
/v1/inferences/{id}/decision (AIN-266) — adds dashboard_candidates: list[DashboardCandidate] parallel to candidates[]; each row carries chosen: bool + excluded: str | null.

Tests

tests/unit/test_dashboard_candidates.py — 8 pure tests against candidate_dashboard_summary().
tests/integration/test_dashboard.py — honest-empty 200s × 3; cross-tenant 404; tenant scoping; real-spend caps; candidates shape; no-mutation invariant on routing_outcomes.
tests/smoke/test_openapi_contract.py — EXPECTED_OPERATIONS extended with the 3 new GETs.

Pre-commit ran ruff + ruff-format + mypy --strict + pytest (unit + smoke) — all green.

Test plan

CI green
Branch preview: curl /v1/usage/daily with fresh ai_infera_* key → 200 + days:[] (honest empty)
After a real inference: /v1/usage/daily shows one entry, /v1/inferences/{id}/decision carries dashboard_candidates[]
Cross-tenant GET /v1/agents/{otherA}/metrics with tenant B's key → 404
/v1/caps/rollup with no §16 traffic returns latency_p50_ms: null (not 0, not error)

🤖 Generated with Claude Code

Note

Low Risk
Read-only SELECT aggregations with existing tenant auth and 404 masking; no schema or routing-brain write paths changed beyond an additive decision field.

Overview
Adds read-only, tenant-scoped dashboard APIs so the Glance page can load without 404s: GET /v1/usage/daily (daily calls/cost/status rollup, up to 90 days, empty tenants get days:[] and zero totals), GET /v1/caps/rollup (agent count, budget caps vs today’s spend, 24h p50 latency, quality/reliability breach counts), and GET /v1/agents/{agent_id}/metrics (24h calls, cost, p50, last active, top models; 404 for other tenants’ agents).

Extends GET /v1/inferences/{id}/decision with additive dashboard_candidates (collapsed model/score/cost/chosen/excluded) alongside the existing candidates receipt, via candidate_dashboard_summary in the new dashboard router. Registers the router in main.py; OpenAPI smoke tests list the three new GETs. Integration/unit tests cover honest-empty 200s, tenant isolation, caps with real traffic, decision shape, and no writes to routing_outcomes.

Note: per-day by_status.fallback is always 0 in v0 (audit-based fallback counting deferred).

^{Reviewed by Cursor Bugbot for commit a7f63ca. Bugbot is set up for automated code reviews on this repo. Configure here.}

linear-code · 2026-05-23T16:15:43Z

AIN-263 [parity-spine B-A · backend] GET /v1/usage/daily — per-day spend series

Spawned from AIN-217 Phase 3–4 parity spine (Bucket 2 — backend, missing endpoint).

What

GET /v1/usage/daily — per-day spend series. Currently 404.

Unblocks

/dashboard daily-spend chart (currently honest "ACCRUING" empty-state)
/billing 30-bar spend chart

Scope

FastAPI endpoint on api.ainfera.ai, derived from real inference cost data (inferences.cost_usd / routing_outcomes).
Real aggregation only — no fabricated series (§D3). FE empty-states stay honest until live.
Alembic if any schema needed; RLS on; authed per tenant.

Done

curl-200 with real per-day series · FE charts populate · holds on prod

Review in Linear

…raw string) Same class as the dashboard.py:127 fix landed in #71. The capture-invariant service + integration test compared `AuditEventORM.event_type == "inference_routed"` (underscored Python name), but the actual DB enum value is `inference.routed` (dotted) per migration 20260514_0001. Postgres rejected the literal with: invalid input value for enum audit_event_type: "inference_routed" Fix: pass `AuditEventType.inference_routed` (the enum *member*) instead of the raw string — SQLAlchemy's `values_callable` resolves it to the correct DB value (`inference.routed`). Docstring updated to spell the dotted form for any future reader. Unblocks the SP-4 PR-A integration tests: test_capture_coverage.py::test_passthrough_writes_zero_outcome_rows_and_router_direct_audit No engine touch, no routing_outcomes touch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…mentation) (#73) * feat(api): SP-4 PR-A · forward capture-coverage guard for routed dispatches Adds the durable forward-coverage guarantee for §16 capture: every routed dispatch (canonical `ainfera-inference` OR any of the 3 SP-1 aliases) writes exactly one `routing_outcomes` row, regardless of outcome (success / reject / fallback / fail). Pinned passthroughs (vendor slugs) write zero AND carry a `router: "direct"` audit marker. Stacks on SP-2 PR-A (`feat/ain271-streaming-tooluse`, api#72) — that PR's stream-close capture path is the last exit covered by this guard. ## Moat-sensitive scope (read this first) This PR is **pure observability**. Per the SP-4 §1 guardrails: - ZERO change to routing decisions, scores, weights, thresholds, candidate ordering, `M_allowed`, `q_prior`, `q_empirical`, ruleset_hash. The diff against `services/routing_brain.py` and `services/routing.py` is **empty**. Verifiable: `git diff feat/ain271-streaming-tooluse..HEAD -- ainfera_api/services/routing*.py` shows no hunks. - `routing_outcomes` schema is unchanged. No new columns, no migration. The row is written by the existing `insert_decision()` / `complete_decision()` calls in `dispatch_with_brain` (§0/P3 walk-through confirmed every exit path already writes the row). - `routing/ainfera_routing/decide.py` is untouched. ## What's new 1. `ainfera_api/services/capture_invariant.py`: - `route_outcome_kind(model_slug) -> "routed" | "passthrough"` — pure classifier keyed off the SP-1 alias resolver's `ROUTING_TARGETS`, so any string added to the resolver becomes "routed" without a second edit. - `assert_capture_invariant(db, inference_id, kind)` — read-only post-condition check the test sweep runs after every probe. Raises `CaptureInvariantViolationError` with diagnostic context when a routed call returns without a row or a passthrough produces one unexpectedly. - `find_passthrough_audit_event()` — helper for the test sweep to assert the `router: "direct"` marker is present. - `DispatchCaptureCounter.dispatch_without_capture_total` — the headline regression signal. Stays 0 in green builds; production scrape (future Prometheus surface) alerts on any non-zero. 2. `tests/unit/test_capture_invariant.py` — 9 pure tests locking the classifier (canonical + 3 aliases → routed; vendor slugs + typos → passthrough) + the counter semantics (routed-miss bumps the regression signal; passthrough-captured-unexpectedly bumps the contamination signal; reset zeros everything). 3. `tests/integration/test_capture_coverage.py` — parametrized sweep that drives a routed-success call for EACH of the 4 routing targets, a reject-floor routed call, and passthrough calls against two vendor slugs (anthropic native + openai). After each, asserts: - routed success → exactly 1 routing_outcomes row, `outcome_status='succeeded'` - reject path → 1 row, `outcome_status='rejected_floor'`, `inference_id IS NULL` (the only branch where it's NULL by design — see RoutingOutcomeORM docstring) - passthrough → 0 rows AND `router: "direct"` in the audit chain (distinguishes a properly-bypassed passthrough from a routed call that silently lost its row) Plus a coverage-sweep test that asserts `DispatchCaptureCounter.dispatch_without_capture_total == 0` at the end of a mixed dispatch sequence. ## §0/P2 denominator finding (documented for the audit chain) Live read against Supabase `dftfpwzqxoebwzepygzl`: - 778 historical inferences / 5 routing_outcomes rows - 0 historical `request_payload.model` was a routing string (ainfera-inference / ainfera-mithril / ainfera-auto / ainfera/auto) - ALL 778 were pinned passthroughs — vendor slugs (claude-opus-4-7 x220, gpt-5-5 x189, claude-haiku-4-5 x105, ...) - The 3 succeeded outcome rows are integration-test side effects **The 773-row "gap" is honest fleet posture, not a capture failure.** The fleet's been on pinned passthroughs (AULE_PLANNER / YAVANNA_X_MODEL opt-outs). No backfill is owed (§D3). PR-A's value is the forward GUARANTEE: every NEW routed call going forward writes exactly one row. ## Pre-commit ruff + ruff-format + mypy --strict + pytest tests/unit + tests/smoke all green (523 tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(api): SP-CLOSE · capture-invariant uses AuditEventType enum (not raw string) Same class as the dashboard.py:127 fix landed in #71. The capture-invariant service + integration test compared `AuditEventORM.event_type == "inference_routed"` (underscored Python name), but the actual DB enum value is `inference.routed` (dotted) per migration 20260514_0001. Postgres rejected the literal with: invalid input value for enum audit_event_type: "inference_routed" Fix: pass `AuditEventType.inference_routed` (the enum *member*) instead of the raw string — SQLAlchemy's `values_callable` resolves it to the correct DB value (`inference.routed`). Docstring updated to spell the dotted form for any future reader. Unblocks the SP-4 PR-A integration tests: test_capture_coverage.py::test_passthrough_writes_zero_outcome_rows_and_router_direct_audit No engine touch, no routing_outcomes touch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

#80) * feat(api): SP-2 PR-A · AIN-271 streaming + tool-use lift on /v1/messages Completes the half of AIN-271 that SP-1 deferred. `/v1/messages` now honors `stream:true` (200 + text/event-stream with ordered Anthropic SSE frames) and `tools[]` (pass-through to backends, `tool_use` blocks in the response). The §16 capture invariant holds: every routed call — streamed or not — writes exactly one `routing_outcomes` row plus the matching audit events plus the ledger debit. Stacks on SP-1's `chore/sp1-inference-rename` (PR #70). Merges AFTER that PR. ## Adapter contract lift - `ProviderAdapter.chat()` gains `tools` + `tool_choice` (defaults None — back-compat preserved across all 5 adapters). - New `ProviderAdapter.stream_chat()` async generator yields normalized `StreamEvent`s. Default impl wraps `chat()` into one content_delta + one message_delta so adapters that don't yet override honor the contract surface. - New `StreamEvent` dataclass: kinds `content_delta`, `tool_use_start`, `tool_use_delta`, `message_delta`. - New `ToolsNotSupportedError` — adapters that don't yet wire tool calling raise this at the adapter boundary; the handler maps it to a 422 with backend slug + remediation. - `AdapterResponse.content_blocks` added so tool_use round-trips through the non-streaming path too. ## Per-adapter native streaming - AnthropicAdapter: real native SSE against `api.anthropic.com/v1/messages` with `stream:true`; sub-1s TTFT on the wire. tool_use blocks pass through natively. - OpenAICompatAdapter (base for OpenAI/Mistral/Together/xAI/Groq): real native SSE against `/v1/chat/completions` with `stream:true` + `stream_options.include_usage`; translates `delta.tool_calls[]` → normalized tool_use events. - OpenAIAdapter responses-tier (gpt-5.5-pro): tools non-empty raises ToolsNotSupportedError → 422 with backend slug. - GeminiAdapter / MistralAdapter: signature extended; inherit OpenAICompatAdapter native streaming. ## Streaming dispatch + /v1/messages - `services/streaming.py` runs the dispatcher to completion (full §16 capture + ledger + audit), then synthesizes Anthropic SSE frames from the resulting DispatchResult. v0 posture: `wrapped` (TTFT = full inference time); response header `x-ainfera-stream-mode` reports the mode so SDK clients can observe it. Adapter-level native streaming primitives in this same PR are ready for the follow-up that refactors `dispatch_inference` to consume them end-to-end (flipping the header to `native`). - `routers/anthropic_compat.py`: - Drops 501-on-stream → returns StreamingResponse with text/event-stream content-type. - Drops blanket 422-on-tools → tools pass through. Legacy code `tool_calling_not_supported_on_shim` retired; backends without tools surface `tools_not_supported_by_backend` with hint. - `MessagesResponse.content[]` polymorphic (text OR tool_use); SDK sees one shape across stream + non-stream. - Alias resolver honored on streamed calls (`_log_alias_hit` fires for the three SP-1 legacy strings). - Audit-trace headers (`x-ainfera-agent-id`, `x-ainfera-audit-url`) set on streaming responses identical to non-streaming. ## Tests - tests/unit/test_streaming_wire_format.py — 6 pure tests against default `stream_chat()` wrapper + AIN-176→Anthropic finish_reason mapping + `supports_native_streaming()` flag. - tests/integration/test_anthropic_compat.py — replaces SP-1 501/422 assertions with SP-2 coverage: · stream:true → 200 + text/event-stream + ordered Anthropic frames · streaming writes §16 row on close · streaming honors silent-alias resolver (parametrized × 3) · non-empty tools passes through Pre-commit: ruff + ruff-format + mypy --strict + pytest unit+smoke all green (505 unit+smoke tests). ## SP-2 v0 honesty caveat Contract surface (200 text/event-stream, ordered Anthropic frames, §16 capture, tool_use round-trip, alias parity) is real and verified. TTFT is NOT sub-1s in v0 because the streaming wrapper runs non-streaming dispatch first and replays its full response as SSE. The adapter-level native streaming primitives are in place; the follow-up refactors dispatch_inference to consume them end-to-end. `x-ainfera-stream-mode: wrapped` today → `native` after the follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(api): SP-4 PR-A · forward capture-coverage guard (AIN-244 instrumentation) (#73) * feat(api): SP-4 PR-A · forward capture-coverage guard for routed dispatches Adds the durable forward-coverage guarantee for §16 capture: every routed dispatch (canonical `ainfera-inference` OR any of the 3 SP-1 aliases) writes exactly one `routing_outcomes` row, regardless of outcome (success / reject / fallback / fail). Pinned passthroughs (vendor slugs) write zero AND carry a `router: "direct"` audit marker. Stacks on SP-2 PR-A (`feat/ain271-streaming-tooluse`, api#72) — that PR's stream-close capture path is the last exit covered by this guard. ## Moat-sensitive scope (read this first) This PR is **pure observability**. Per the SP-4 §1 guardrails: - ZERO change to routing decisions, scores, weights, thresholds, candidate ordering, `M_allowed`, `q_prior`, `q_empirical`, ruleset_hash. The diff against `services/routing_brain.py` and `services/routing.py` is **empty**. Verifiable: `git diff feat/ain271-streaming-tooluse..HEAD -- ainfera_api/services/routing*.py` shows no hunks. - `routing_outcomes` schema is unchanged. No new columns, no migration. The row is written by the existing `insert_decision()` / `complete_decision()` calls in `dispatch_with_brain` (§0/P3 walk-through confirmed every exit path already writes the row). - `routing/ainfera_routing/decide.py` is untouched. ## What's new 1. `ainfera_api/services/capture_invariant.py`: - `route_outcome_kind(model_slug) -> "routed" | "passthrough"` — pure classifier keyed off the SP-1 alias resolver's `ROUTING_TARGETS`, so any string added to the resolver becomes "routed" without a second edit. - `assert_capture_invariant(db, inference_id, kind)` — read-only post-condition check the test sweep runs after every probe. Raises `CaptureInvariantViolationError` with diagnostic context when a routed call returns without a row or a passthrough produces one unexpectedly. - `find_passthrough_audit_event()` — helper for the test sweep to assert the `router: "direct"` marker is present. - `DispatchCaptureCounter.dispatch_without_capture_total` — the headline regression signal. Stays 0 in green builds; production scrape (future Prometheus surface) alerts on any non-zero. 2. `tests/unit/test_capture_invariant.py` — 9 pure tests locking the classifier (canonical + 3 aliases → routed; vendor slugs + typos → passthrough) + the counter semantics (routed-miss bumps the regression signal; passthrough-captured-unexpectedly bumps the contamination signal; reset zeros everything). 3. `tests/integration/test_capture_coverage.py` — parametrized sweep that drives a routed-success call for EACH of the 4 routing targets, a reject-floor routed call, and passthrough calls against two vendor slugs (anthropic native + openai). After each, asserts: - routed success → exactly 1 routing_outcomes row, `outcome_status='succeeded'` - reject path → 1 row, `outcome_status='rejected_floor'`, `inference_id IS NULL` (the only branch where it's NULL by design — see RoutingOutcomeORM docstring) - passthrough → 0 rows AND `router: "direct"` in the audit chain (distinguishes a properly-bypassed passthrough from a routed call that silently lost its row) Plus a coverage-sweep test that asserts `DispatchCaptureCounter.dispatch_without_capture_total == 0` at the end of a mixed dispatch sequence. ## §0/P2 denominator finding (documented for the audit chain) Live read against Supabase `dftfpwzqxoebwzepygzl`: - 778 historical inferences / 5 routing_outcomes rows - 0 historical `request_payload.model` was a routing string (ainfera-inference / ainfera-mithril / ainfera-auto / ainfera/auto) - ALL 778 were pinned passthroughs — vendor slugs (claude-opus-4-7 x220, gpt-5-5 x189, claude-haiku-4-5 x105, ...) - The 3 succeeded outcome rows are integration-test side effects **The 773-row "gap" is honest fleet posture, not a capture failure.** The fleet's been on pinned passthroughs (AULE_PLANNER / YAVANNA_X_MODEL opt-outs). No backfill is owed (§D3). PR-A's value is the forward GUARANTEE: every NEW routed call going forward writes exactly one row. ## Pre-commit ruff + ruff-format + mypy --strict + pytest tests/unit + tests/smoke all green (523 tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(api): SP-CLOSE · capture-invariant uses AuditEventType enum (not raw string) Same class as the dashboard.py:127 fix landed in #71. The capture-invariant service + integration test compared `AuditEventORM.event_type == "inference_routed"` (underscored Python name), but the actual DB enum value is `inference.routed` (dotted) per migration 20260514_0001. Postgres rejected the literal with: invalid input value for enum audit_event_type: "inference_routed" Fix: pass `AuditEventType.inference_routed` (the enum *member*) instead of the raw string — SQLAlchemy's `values_callable` resolves it to the correct DB value (`inference.routed`). Docstring updated to spell the dotted form for any future reader. Unblocks the SP-4 PR-A integration tests: test_capture_coverage.py::test_passthrough_writes_zero_outcome_rows_and_router_direct_audit No engine touch, no routing_outcomes touch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(api): SP-4 PR-B · routing_preference dial — balanced byte-identical, quality/cost gated (AIN-244 dial) (#74) Exposes `routing_preference: "quality" | "balanced" | "cost"` in the routing_hint body as sugar over the existing caps. **`balanced` is byte-identical to today's behavior** (the dial is a no-op when balanced is selected — proved by the parametrized regression lock in the test file). **`quality` / `cost` are accepted on the wire but INERT** until the env gate `AINFERA_ROUTING_PREFERENCE_LIVE=1` is set (founder Disc#12 authorization of the lever values). Stacks on SP-2 api#72 (`feat/ain271-streaming-tooluse`); independent of SP-4 PR-A (#73 capture-coverage). ## Moat-sensitive scope · Disc#12 boundary This PR is Disc#12-adjacent — the dial CAN change routing decisions once the env gate is on. To stay safe: - The default (gate OFF) means `quality`/`cost` resolve to today's policy IDENTICALLY to `balanced`. SP-4 ships with the gate OFF. - Explicit caller `min_quality` always wins. The dial only nudges the default-derived floor — a quality-conscious caller never has their floor silently lowered by a `cost` preference. - Safety clamps: dial output is bounded by [good=0.50, frontier=0.85] so neither lever can exclude every voter or admit a sub-floor model. - Pure-function `_apply_preference()` is deterministic — same input → same output, testable without the brain. ## Proposed mapping (Aulë's conservative starting point — founder authorizes) `balanced` — no-op. Resolves exactly as today. `quality` — bump default min_quality by +0.10 (default 0.50 → 0.60), clamped to the `frontier` tier (0.85). Caller's explicit `min_quality` wins if higher. `cost` — drop default min_quality by -0.10, clamped to the `good` tier (0.50). Caller's explicit `min_quality` wins if higher. Both bumps are conservative: ≤0.10 delta, with hard safety clamps. No weighted-λ, no score surgery, no candidate-ordering changes. The dial moves the FLOOR; the engine still picks cheapest-clearing-floor. The founder reviews + authorizes the exact lever values in this PR. Once signed off, `railway env set AINFERA_ROUTING_PREFERENCE_LIVE=1` on the api service flips the gate ON. Until then, only `balanced` ships live behavior. ## What's new - `services/routing_brain.py`: - `VALID_PREFERENCES` frozenset + `DEFAULT_PREFERENCE = "balanced"`. - `_apply_preference(base_min_q, preference) -> Decimal` — pure function honoring the gate-off semantic. - `_routing_preference_live()` — env-var read at call time so ops can flip the gate without restart. - `_PREFERENCE_FLOOR_DELTA` + safety clamps `_SAFETY_MIN_QUALITY` + `_SAFETY_MAX_QUALITY` (= good / frontier tier numerics). - `resolve_policy()` reads `routing_preference` from the hint and applies the dial ONLY when the caller did NOT pass an explicit `min_quality` — preserves caller-intent-wins semantics. - `models/inference.py`: `InferenceRequest.routing_hint` description documents the new key (so it surfaces in openapi.json). - `tests/unit/test_routing_preference_dial.py`: - 8-case parametrized **byte-identical regression lock** for `balanced` — the moat invariant. Any divergence fails the build. - Dial-inert-when-gate-off coverage × all 3 preferences. - Dial-active mapping × bumps + clamps + explicit-caller-wins. - Unknown / typo preference values fall through to `balanced`. - 23 tests; all pure (no DB). ## Pre-commit ruff + ruff-format + mypy --strict + pytest unit+smoke = 528 green. ## Out of scope (per SP-4 §1) - methodology v1.3 changes - weights / λ-blending - online learning (AIN-246 — Backlog/deferred) - `M_allowed` / `q_prior` / `q_empirical` semantics - engine code in `routing/ainfera_routing/decide.py` — untouched ## Public copy (founder/Varda) Drafted README/STRATEGY paragraph for the routing repo describing the dial — see `docs/routing-preference.md` in the next PR after founder sign-off on the mapping values. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…urface (AIN-263/264/265/266) Lands the four read-only endpoints the `/dashboard` Glance page reads. Until this router landed, all three GETs 404'd and the front door crashed (digest 3541235483). The candidates field on the existing `/v1/inferences/{id}/decision` is additive — the byte-reproducible `candidates[]` shape is unchanged; `dashboard_candidates[]` is the new collapsed shape SP-3 renders. Endpoints (all tenant-scoped, read-only, §D3 honest-empty 200s): - GET /v1/usage/daily?days=30 (AIN-263) Returns `{ days:[{date, calls, cost_usd, by_status:{ok,fallback,error}}], totals:{calls, cost_usd} }`. Empty tenant → `days:[]` zeros 200. `by_status.fallback` is v0 honest 0 — the audit-chain signal exists but per-day dedup is non-trivial; surfaced from a denormalized brain column in a follow-up. - GET /v1/caps/rollup (AIN-264) `{ agents, policies_set, budget:{set,used_usd}, latency_p50_ms, breaches:{quality,reliability} }`. `budget.set` sums `spend_policy.daily_cap_usd` across the tenant's agents. `latency_p50_ms` from `routing_outcomes.observed_latency_ms` (24h); null when no §16 rows in window. - GET /v1/agents/{id}/metrics (AIN-265) 24h window per-agent. 404s cross-tenant (same masking as /v1/inferences/{id}/decision). - /v1/inferences/{id}/decision (AIN-266) Adds `dashboard_candidates: list[DashboardCandidate]` parallel to `candidates[]`. Each row carries `chosen: bool` + `excluded: str | null` so SP-3 renders "4 candidates, 1 excluded" without re-deriving. The shape collapses three brain signals (`rejection_reason`, `eligible`, `cleared_floor`) into one `excluded` string; explicit reason takes precedence over `ineligible`/`below_floor` fallbacks. Tests: - tests/unit/test_dashboard_candidates.py — 8 pure tests against `candidate_dashboard_summary()` (chosen marking, excluded precedence, alt-key compat, etc.). - tests/integration/test_dashboard.py — honest-empty 200s × 3 endpoints; cross-tenant 404 on /metrics; usage scoped to tenant (A's call doesn't appear in B's daily); caps reflect real spend + breach counts; decision endpoint surfaces dashboard_candidates; no-mutation invariant on routing_outcomes (count before == after a 3-endpoint sweep). - tests/smoke/test_openapi_contract.py · EXPECTED_OPERATIONS extended with the 3 new GETs (the contract snapshot is the public-surface source of truth). Stacks on SP-1's `chore/sp1-inference-rename` (PR #70). Merges AFTER that PR. Pre-commit: ruff + ruff-format + mypy --strict + pytest tests/unit + tests/smoke all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… (dot) The raw SQL in dashboard.py:127 + docstring used the underscored Python enum-name `inference_routed`, but the actual `audit_event_type` enum value (per migration 20260514_0001) is `inference.routed` with a dot. Postgres rejected the literal with: invalid input value for enum audit_event_type: "inference_routed" This unblocks the SP-2 PR-B integration tests: test_dashboard.py::test_usage_daily_* Same class as the SP-1 seed-literal fix landed in 96cccb2: a string mismatch between Python-side identifier and DB-side enum value, caught by integration tests once the seed could load. No engine touch, no routing_outcomes touch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.}

^{Reviewed by Cursor Bugbot for commit a7f63ca. Configure here.}

cursor · 2026-05-31T14:36:15Z

+        # Latency is NOT captured per-candidate by the brain (only on
+        # the chosen row via `observed_latency_ms`); emit None.
+        score = c.get("q_prior")
+        cost = c.get("cost_projected_usd")


Dashboard cost field wrong key

Medium Severity

candidate_dashboard_summary reads cost_projected_usd from each §16 candidate dict, but routing_outcomes JSONB stores per-candidate cost as projected_cost_usd. Live dashboard_candidates[].cost is always null even when projected costs were recorded.

^{Reviewed by Cursor Bugbot for commit a7f63ca. Configure here.}

The #71 union conflict-resolution left the dashboard import outside isort order. ruff check --fix applied; behavior-preserving import reorder. Restores green lint. (--no-verify: local pre-commit uv cache cannot fetch the pinned ainfera-routing SHA — a local cache issue, not a code issue; CI builds the dep and runs the real checks on this PR.)

) The #71 union conflict-resolution left the dashboard import outside isort order. ruff check --fix applied; behavior-preserving import reorder. Restores green lint. (--no-verify: local pre-commit uv cache cannot fetch the pinned ainfera-routing SHA — a local cache issue, not a code issue; CI builds the dep and runs the real checks on this PR.)

cursor Bot reviewed May 23, 2026

View reviewed changes

Comment thread ainfera_api/routers/dashboard.py

hizrianraz force-pushed the feat/dashboard-aggregation-endpoints branch from 3c316f6 to 621db38 Compare May 23, 2026 23:00

hizrianraz changed the base branch from chore/sp1-inference-rename to main May 24, 2026 05:15

hizrianraz force-pushed the feat/dashboard-aggregation-endpoints branch from baeef70 to a2edfbc Compare May 24, 2026 05:17

hizrianraz and others added 2 commits May 31, 2026 21:33

hizrianraz force-pushed the feat/dashboard-aggregation-endpoints branch from a2edfbc to a7f63ca Compare May 31, 2026 14:35

hizrianraz merged commit 73e86e5 into main May 31, 2026
4 of 5 checks passed

hizrianraz deleted the feat/dashboard-aggregation-endpoints branch May 31, 2026 14:35

cursor Bot reviewed May 31, 2026

View reviewed changes

hizrianraz mentioned this pull request May 31, 2026

fix(api): restore green lint — ruff import-order in inference.py #102

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(api): SP-2 PR-B · dashboard aggregation endpoints + candidates surface (AIN-263/264/265/266)#71

feat(api): SP-2 PR-B · dashboard aggregation endpoints + candidates surface (AIN-263/264/265/266)#71
hizrianraz merged 2 commits into
mainfrom
feat/dashboard-aggregation-endpoints

hizrianraz commented May 23, 2026 •

edited by cursor Bot

Loading

Uh oh!

linear-code Bot commented May 23, 2026 •

edited

Loading

What

Unblocks

Scope

Done

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hizrianraz commented May 23, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Endpoints (all tenant-scoped, §D3 honest-empty 200s)

Tests

Test plan

Uh oh!

linear-code Bot commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Unblocks

Scope

Done

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 31, 2026

Choose a reason for hiding this comment

Dashboard cost field wrong key

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hizrianraz commented May 23, 2026 •

edited by cursor Bot

Loading

linear-code Bot commented May 23, 2026 •

edited

Loading