feat(dump): per-API-key request dump with live dashboard #61

Open
Menci wants to merge 43 commits into main from request-dump

Menci commented Jun 19, 2026

Owner

Summary

  • Per-API-key opt-in (dump_retention_seconds) records every model-invoking request (full client headers + body + response bytes) at the gateway's external boundary.
  • A captureRequestDump Hono middleware tees both request and response streams, builds a DumpRecord, and hands it fire-and-forget to a per-key DumpStore (durable) + DumpBroker (live fan-out).
  • Node backs both with filesystem + sqlite + an in-process EventTarget; Cloudflare backs both with one KeyDumpDO per key — SQLite metadata index + R2 bundle storage + WebSocket Hibernation fan-out + self-rescheduling retention alarm.
  • Dashboard adds a Requests tab per key: live SSE list (subscribe-then-snapshot-then-drain → at-most-twice, dashboard dedupes); paged history; click-row detail with raw headers/body and an Events vs Collected toggle on streamed responses.
  • Per-protocol collect<X>Stream functions in @floway-dev/protocols/dump-collect fold captured events into a final result. They return CollectOutcome<T> = { result, error, truncated } so broken/aborted streams surface explicitly in the UI.
  • Key editor gains a Request dump retention field (Off / 1h / 6h / 24h / 7d / Custom…). Lowering or disabling purges synchronously before the PATCH returns 200.

Deploy notes

Operator must update wrangler.jsonc to mirror the new bindings in wrangler.example.jsonc BEFORE deploy:

  • durable_objects.bindings: [{ name: "KEY_DUMP_DO", class_name: "KeyDumpDO" }]
  • migrations: [{ tag: "v1", new_sqlite_classes: ["KeyDumpDO"] }]
  • A second r2_buckets entry for DUMP_BLOBS (operator-chosen bucket name; create via pnpm wrangler r2 bucket create <name> first).
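
For orientation, the three additions might look roughly like this in wrangler.jsonc. This is a sketch only: the bucket name `floway-dump-blobs` is a placeholder invented for illustration, the existing entries of your own file are elided, and wrangler.example.jsonc remains the source of truth.

```jsonc
{
  // ...existing configuration stays as it is...
  "durable_objects": {
    "bindings": [{ "name": "KEY_DUMP_DO", "class_name": "KeyDumpDO" }]
  },
  "migrations": [{ "tag": "v1", "new_sqlite_classes": ["KeyDumpDO"] }],
  "r2_buckets": [
    // ...the existing R2 bucket entry stays first...
    { "binding": "DUMP_BLOBS", "bucket_name": "floway-dump-blobs" } // placeholder bucket name
  ]
}
```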

scripts/check-wrangler.ts is unchanged — its existing recursive comparator already validates these new top-level blocks.

The full deploy pipeline (pnpm run db:migrate:remote && pnpm run deploy) applies migrations 0033_api_key_dump_retention.sql and 0034_dump_records_node.sql and ships the DO class.

Test plan

  • pnpm run typecheck — 17/17 projects clean
  • NODE_OPTIONS=--max-old-space-size=8192 pnpm run lint — clean
  • pnpm run test — 2782+/2785 pass (the 3-5 flakes in packages/gateway/src/control-plane/upstreams/routes_test.ts are pre-existing on origin/main, unrelated)
  • Node pnpm run dev:node smoke, end-to-end: key create → retention set → captured /v1/messages + /v1/embeddings → GET /api/dump/keys/:keyId/records → SSE stream snapshot + appended
  • CF wrangler dev smoke — operator must wire the new bindings + create R2 bucket locally first
  • Manual UI walkthrough on the deployed instance: open Requests tab on a key with retention; hit captured endpoints with curl; verify list updates live + detail panel renders four panels + Events/Collected toggle works for SSE
  • Verify retention raise on a key triggers no purge but refreshes the DO's cached retention (avoids the prior busy-alarm bug)
  • Verify key delete purges associated dumps (DO storage drops to ~12 KB, R2 prefix swept)

Notes

  • 28 commits: 13 implementation commits (T1–T9 + 2 fix dispatches) + 15 follow-up commits applying a 25-finding code review.
  • No header redaction: authorization / x-api-key / cookie are captured verbatim. The capture site carries a load-bearing comment explaining why (the API key is already in our database; the dump exposes no secret the operator does not already control).
  • No retention hard ceiling: operators are the ones paying for R2 storage past the 10 GB free tier.
  • The dashboard's "Requests" entry is a row-action button on the keys list (no per-key sidenav exists in this codebase's flat-tab layout).

Menci added 28 commits June 19, 2026 18:37
Lands the schema + repo plumbing for the per-API-key request dump feature.
Adds the nullable INTEGER column, threads ApiKey.dumpRetentionSeconds
through the SQL repo (row shape, columns list, save/getById round-trip),
and fixes every construction site the compiler flagged by passing null.
A new repo-level round-trip test covers both backends.

null means dump capture is disabled (the only off state — there is no
feature flag). Downstream tasks consume the field for capture wiring,
the control-plane API, and the dashboard UI.
…collects

Land the protocols-side surface the dashboard's Request Dump tab depends on:

- `@floway-dev/protocols/dump` defines the bundle types — `DumpRecord`,
  `DumpMetadata`, `DumpStreamEvent`, `DumpResponseBody`, `DumpRecordId` — as
  the data contract between gateway, platform implementations, and the SPA.
  `DumpResponseBody` is a 3-variant discriminated union so the SPA can render
  empty-response failures distinctly from streamed-or-byte responses.
- `packages/protocols/src/gemini/stream.ts` adds `parseGeminiStream`, the
  `\n\n`-framed `data:` extractor used both by the capture middleware (to
  materialize Gemini's `:streamGenerateContent` body as dump events) and by
  the dashboard side. Handles split-frame reassembly and signal abort.
- Per-protocol `collect.ts` siblings fold a captured `DumpStreamEvent[]`
  back into the protocol's non-streaming response shape:
  - `collectMessagesStream` rebuilds an `AnthropicMessage` from
    `message_start`, `content_block_*`, and `message_delta` events.
  - `collectResponsesStream` adopts the terminal `response.completed`
    payload as authoritative; otherwise folds output-item additions and
    `output_text.delta`.
  - `collectChatCompletionsStream` concatenates `delta.content`, assembles
    `tool_calls`, and surfaces terminal usage.
  - `collectGeminiStream` concatenates `candidates[*].content.parts[*].text`
    across chunks and copies `usageMetadata` from the last chunk.
- `@floway-dev/protocols/dump-collect` re-exports all four collects from a
  separate subpath, kept apart from `./dump` so the gateway/worker can
  depend on the types without bundling collect code.
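
To illustrate the split-frame reassembly mentioned for `parseGeminiStream`, here is a minimal sketch of a `\n\n`-framed `data:` extractor. It is not the real function: `createDataFrameParser` is a hypothetical name, it works on already-decoded text chunks rather than a stream, it assumes `\n` line endings, and it leaves out signal handling.

```typescript
// Hypothetical, simplified stand-in for parseGeminiStream.
function createDataFrameParser() {
  let pending = ""; // text received so far that is not yet closed by a blank line

  return {
    // Feed one decoded chunk; returns the `data:` payload of every frame
    // that this chunk completed.
    push(chunk: string): string[] {
      pending += chunk;
      const frames = pending.split("\n\n");
      // The last element is an unfinished frame (or ""), so keep it for later.
      pending = frames.pop() ?? "";

      const payloads: string[] = [];
      for (const frame of frames) {
        const data = frame
          .split("\n")
          .filter((line) => line.startsWith("data:"))
          .map((line) => line.slice("data:".length).trimStart())
          .join("\n");
        if (data !== "") payloads.push(data);
      }
      return payloads;
    },
  };
}
```

A frame that arrives in two network chunks produces no output on the first `push` and one payload on the second, which is the behavior a capture middleware needs when it cannot control chunk boundaries.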
NodeDumpStore persists dump bundles via FileProvider at
dump/<keyId>/<recordId>.json and indexes (key_id, id, meta_json,
created_at) in a new dump_records table — migration 0034 — for
newest-first paginated list() with strict-exclusive `before` cursor and
a server-side cap of 200.

NodeDumpBroker is an in-process per-key fan-out: publish() pushes into
every live subscriber's queue, subscribe() returns an async iterator
backed by a queue + waker promise pair, and signal.abort() ends the
iteration and removes the subscriber.
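
The queue + waker pattern can be sketched as follows. This is an illustration under assumptions, not the NodeDumpBroker source: `InProcessBroker` and `Subscriber` are hypothetical names, and the sketch registers the subscriber eagerly so that a publish before the first `next()` is not lost.

```typescript
type Subscriber<T> = { queue: T[]; wake: (() => void) | null };

class InProcessBroker<T> {
  private readonly subscribers = new Map<string, Set<Subscriber<T>>>();

  publish(keyId: string, message: T): void {
    this.subscribers.get(keyId)?.forEach((sub) => {
      sub.queue.push(message);
      sub.wake?.(); // resolve the pending wait if the consumer is parked
    });
  }

  subscribe(keyId: string, signal: AbortSignal): AsyncGenerator<T> {
    const sub: Subscriber<T> = { queue: [], wake: null };
    const set = this.subscribers.get(keyId) ?? new Set<Subscriber<T>>();
    this.subscribers.set(keyId, set);
    set.add(sub);

    // Idempotent: runs on abort and again when the iterator finishes.
    function cleanup(): void {
      signal.removeEventListener("abort", onAbort);
      set.delete(sub);
    }
    function onAbort(): void {
      cleanup();
      sub.wake?.(); // unpark the consumer so it can observe the abort
    }
    signal.addEventListener("abort", onAbort);

    async function* iterate(): AsyncGenerator<T> {
      try {
        while (!signal.aborted) {
          if (sub.queue.length > 0) {
            yield sub.queue.shift() as T;
            continue;
          }
          // Park until publish() or abort resolves the waker.
          await new Promise<void>((resolve) => {
            sub.wake = resolve;
          });
          sub.wake = null;
        }
      } finally {
        cleanup();
      }
    }
    return iterate();
  }
}
```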

purgeDumpsForAllKeys(store, keys) skips keys with
dumpRetentionSeconds = null and calls purgeExpired for the rest. The
entry's hourly sweep runs it next to runScheduledMaintenance with its
own try/catch so neither failure starves the other.

bootstrapNodePlatform now also returns the FileProvider it already
constructs, so entry.ts can hand it to NodeDumpStore without
re-instantiating. Gateway's index re-exports getRepo so the entry can
read the api-keys list for the sweep without reaching into internals.

Add the Cloudflare deployment target's request-dump backend:

- KeyDumpDO: SQLite-backed Durable Object, one per api-key. Holds the
  metadata index in `records` (id PK, meta_json, created_at) plus a
  `state` table (keyId, retentionSeconds). put() writes the bundle to
  R2 at dump/<keyId>/<recordId>.json, indexes the metadata, fans out to
  every accepted hibernation socket in a single loop (sockets that
  throw on send are closed and removed inline), and reschedules the
  alarm. Alarm fires at oldest+retention*1000, purges expired records
  + R2 blobs, then reschedules; an empty table clears the alarm. list
  and getRecord lazy-filter against the same retention cutoff so a
  late-firing alarm can never leak stale data.
- purgeAll lists the keyId R2 prefix (paginated, 1000 per page) and
  deletes everything before ctx.storage.deleteAll(), so any orphan an
  interrupted put left behind is also swept.
- DumpStore RPC shim: ns.idFromName(keyId) -> stub; put threads the
  retention through retentionLookup (api_keys row read), other methods
  forward unchanged.
- DumpBroker shim: publish is a no-op (DO put already fans out);
  subscribe opens a websocket against the DO via stub.fetch(/ws),
  accepts the client, and yields parsed metadata until the signal
  aborts.
- Hand-rolled cloudflare-do.d.ts declaring DurableObject,
  DurableObjectState, SqlStorage, DurableObjectNamespace,
  WebSocketPair, and the Response.webSocket field — same minimal-shape
  convention as cloudflare-sockets.d.ts and the other "Like" types in
  this app, so the runtime contract still does not pull in the full
  @cloudflare/workers-types surface.
- bootstrap.ts: extend CloudflareEnv + REQUIRED_BINDINGS with
  KEY_DUMP_DO and DUMP_BLOBS.
- entry.ts: export KeyDumpDO at the top level (so wrangler can resolve
  the class), wire the dump store + broker on every fetch. The
  scheduled() handler does NOT iterate keys to purge — retention sweep
  is per-DO alarm.
- wrangler.example.jsonc: durable_objects binding for KEY_DUMP_DO,
  v1 SQLite migration entry, second R2 bucket DUMP_BLOBS. The existing
  scripts/check-wrangler.ts already walks every block recursively, so
  adding new top-level keys is picked up without changes.
- gateway public surface: re-export getRepo so the entry can resolve
  per-key retention via repo.apiKeys.getById(keyId).

The middleware tees client-request and upstream-response bytes at the gateway's
external boundary, parses streams via parseSSEStream / parseGeminiStream, builds
a DumpRecord, and hands it off through backgroundSchedulerFromContext so the
response is not held open on persistence. Mounted only on the data-plane
endpoints that invoke a billable upstream model: messages, responses, chat
completions, embeddings, images, and Gemini generate / streamGenerateContent.

The auth middleware also stamps the resolved ApiKey on the Hono context so the
capture middleware reads dumpRetentionSeconds without a second repo lookup, and
each protocol's respond / passthroughServe writes a `dumpAccounting` slot
alongside its existing recordUsage call so meta carries upstream / model /
input+output token totals without re-deriving them inside the middleware.

Headers — including authorization, x-api-key, and cookie — are captured
verbatim. The api-key value is already in our own database; the dump exposes
no secret the operator does not already control. A header-comment at the
capture site documents this so future contributors do not "fix" it by adding
redaction.

Tests cover SSE / chunked-JSON / bytes / none body variants, the passthrough
path on null retention, request-body re-readability after the tee replay,
verbatim header capture, and dumpAccounting metadata propagation.
…te purge

Lands the dashboard's read path for captured request dumps and wires the
retention/delete write paths through to the dump store.

- New /api/dump/keys/:keyId/stream (snapshot then live appends), /records
  (newest-first, exclusive `before` cursor, 200-row hard cap), and
  /records/:recordId. Ownership check returns 404 on foreign keys to
  avoid leaking existence.
- PATCH /api/keys/:id accepts `dump_retention_seconds` (positive int or
  null). On disable or shrink the relevant DumpStore purge runs
  synchronously before the 200, so the post-PATCH read is consistent.
- DELETE /api/keys/:id and the user-soft-delete cascade both purge every
  affected key. The user path iterates listByUserId before
  softDeleteByUserId, since soft-deleted ids drop out of that listing.
- `apiKeyToJson` carries `dump_retention_seconds` so the dashboard reads
  the current setting back without a second round-trip.
- `setupAppTest` installs no-op DumpStore/DumpBroker defaults so existing
  tests that hit DELETE keep working; dump-touching tests overwrite with
  recording stubs after setupAppTest returns.
… invalid limit param

Two post-review nits from Task 7.

Add a `users/routes_test.ts` case that asserts `getDumpStore().purgeAll`
is called exactly once per live key during user soft-delete, with the
recorded set matching the pre-delete live-key list — which proves the
loop ran before `softDeleteByUserId` removed the ids from
`listByUserId`. The pre-existing soft-delete-cascade test is kept; the
two cover complementary halves of the cascade contract.

In `control-plane/dump.ts`, split the records-list `limit` parser into
a default (100) and a hard cap (200), and reject non-numeric or
non-positive inputs with 400 instead of silently substituting the cap.
Over-cap input is still clamped — the cap is policy on a well-formed
value, not a signal of bad input. The existing limit-cap test is
rewritten to cover the full matrix (missing, valid, over-cap, garbage,
non-positive, empty).
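
A minimal sketch of the default/cap split described above. `parseLimit` and `LimitResult` are hypothetical names, and two details are assumptions rather than facts from the commit: an empty string is rejected, and fractional values are treated as malformed.

```typescript
const DEFAULT_LIMIT = 100;
const MAX_LIMIT = 200;

type LimitResult = { ok: true; limit: number } | { ok: false; status: 400 };

// `raw` is the query-string value, or undefined when the parameter is absent.
function parseLimit(raw: string | undefined): LimitResult {
  if (raw === undefined) return { ok: true, limit: DEFAULT_LIMIT };
  // Digits only: rejects "", "abc", "-5", "1.5", and "10x" (assumed policy).
  if (!/^\d+$/.test(raw)) return { ok: false, status: 400 };
  const value = Number(raw);
  if (value <= 0) return { ok: false, status: 400 };
  // Over-cap input is well-formed, so it is clamped rather than rejected.
  return { ok: true, limit: Math.min(value, MAX_LIMIT) };
}
```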
The dashboard SPA gains a per-key Requests page that streams new dumps
over SSE and renders a one-shot-fetched detail view, and the key editor
gains a "Request dump retention" control that drives the PATCH body
introduced by Task 7.

The new useDumpSubscription composable opens a single EventSource against
/api/dump/keys/<id>/stream, handles snapshot + appended events with
Set-based dedup, and paginates older records on demand. The page combines
RequestList (status badge, method + truncated path, model, tokens,
duration; failed rows tinted with the error message) with RecordDetail,
which fetches a record once and renders four panels with copy buttons:
request headers (sticky), request body (JSON pretty-print or UTF-8
decode of base64 bodies), response headers + status, and response body
that branches on type — bytes render pretty, stream toggles between a
collected view (picks the right collect<Protocol>Stream from the request
path) and a raw events list, and none surfaces meta.error.

The keys table grows a Requests action button that appears only when
dump_retention_seconds is non-null and links to the new route. The key
editor adds a preset Select (Off / 1h / 6h / 24h / 7d / Custom…) with a
parseDuration-backed custom input, plus an amber warning that fires
before save when the operator turns capture off or lowers an existing
window.

The flat dashboard/keys.vue page moves to dashboard/keys/index.vue so
the new keys/[keyId]/requests.vue can sit alongside without changing
the existing /dashboard/keys URL.

Pre-existing lint regressions cleared on the final verification pass:
- websocket_test.ts (commit 7345f28) had import-order errors from the
  responses keepalive fix; eslint --fix sorts them.
- protocols/dump/index.ts had a trailing comment continuation line
  whose indent failed @stylistic/indent. Restructured to a leading
  comment block (matching the headers field above) so the comment reads
  in one voice.

The DumpStore contract now takes retentionSeconds on every read so list/get
can apply the cutoff with the freshly-PATCHed value instead of whatever the
last put cached. The CF DO stops consulting its stored retentionSeconds for
read-path cutoffs (the alarm sweep still uses the cached value, which is
correct — it is the DO's own self-scheduled work). The Node store gains the
created_at gate on both list and get; the spec calls for lazy filtering on
both platforms and Node had none.

The control-plane handler threads key.dumpRetentionSeconds — already loaded
by the ownership check — through to every store call.

Also makes meta.path and request.path agree on the full path-with-query
string; the divergent reqPath variable was a leftover from an earlier
iteration.

Tests cover lazy-filter on list, lazy-filter on get, and the immediate
visibility of records when retention is raised — none of which were
exercised before.

Drops the comment reference to the gitignored spec path on the capture
middleware's no-redaction comment, per the rule against citing uncommitted
documents in code.
…ind; purge-then-softDelete

CF broker subscribe loop tracks a closed flag set by both close and error
listeners; the loop returns when the socket is closed and the queue is
drained. Previously a server-initiated close (DO eviction, retention turned
off mid-session) only fired wake() once and the next iteration awaited a
never-resolving promise, hanging the SSE handler forever. Regression test
uses a minimal EventTarget-based WebSocket double and bounds the wait.

ULID treats a backwards now value (NTP correction) the same as a
same-millisecond collision — keep lastTime, increment the random tail —
instead of letting lastTime regress alongside the smaller timestamp. That
preserves the strictly-increasing ordering the DumpStore relies on for
lexicographic page cursors.
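
The clock-regression rule can be sketched with a small monotonic id factory. This is an illustration, not the project's ULID code: `createMonotonicId`, `encodeTime`, and `incrementTail` are hypothetical names, and `Math.random` stands in for whatever randomness source the real generator uses.

```typescript
const ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"; // Crockford base32

// Encode a millisecond timestamp as 10 base32 characters, most significant first.
function encodeTime(ms: number): string {
  let out = "";
  for (let i = 0; i < 10; i++) {
    out = ALPHABET[ms % 32] + out;
    ms = Math.floor(ms / 32);
  }
  return out;
}

// Add one to a base32 digit array in place, carrying leftwards.
function incrementTail(tail: number[]): void {
  for (let i = tail.length - 1; i >= 0; i--) {
    if (tail[i] < 31) {
      tail[i]++;
      return;
    }
    tail[i] = 0;
  }
  throw new Error("random tail overflowed within one millisecond");
}

function createMonotonicId(
  now: () => number = Date.now,
  random: () => number = Math.random,
): () => string {
  let lastTime = -1;
  let tail: number[] = [];
  return () => {
    const t = now();
    if (t > lastTime) {
      lastTime = t;
      tail = Array.from({ length: 16 }, () => Math.floor(random() * 32));
    } else {
      // Same millisecond OR the clock stepped backwards: keep lastTime and
      // bump the tail so the new id still sorts after the previous one.
      incrementTail(tail);
    }
    return encodeTime(lastTime) + tail.map((d) => ALPHABET[d]).join("");
  };
}
```

Because the timestamp prefix never decreases and the tail only grows within a fixed prefix, plain string comparison of two ids agrees with creation order, which is what an `id < ?` page cursor relies on.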

deleteKey purges before soft-deleting, matching the user-cascade ordering.
A purge failure now leaves the key alive and visible to retry, instead of
orphaning dumps under a tombstoned key that the sweeps no longer iterate.
…e, not only lower

PATCH /api/keys/:id now calls purgeExpired on any positive-value retention
change (raise, lower, or enable from null). The CF KeyDumpDO uses this to
refresh its cached state.retentionSeconds and reschedule its alarm. Without
it, raising 1h to 24h leaves the DO with the old cached value; its alarm
fires under the old retention and scheduleNextAlarm reschedules with the
same stale value, busy-looping until the next captured request refreshes
state via put.

On Node the call passes through to the existing DELETE query and is a
no-op when no records sit over the new window.

Tests: enable-from-null and raise both now expect a single purgeExpired
call with the new retention. Disable still calls purgeAll. Shrink still
calls purgeExpired.
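
The resulting decision table is small enough to write down as a pure function. This is a sketch: `purgeActionFor` and `PurgeAction` are hypothetical names, and treating an unchanged value as "no purge call" is an assumption, since the commit only describes retention changes.

```typescript
type PurgeAction =
  | { kind: "none" }
  | { kind: "purgeAll" }
  | { kind: "purgeExpired"; retentionSeconds: number };

// `previous` and `next` are dump_retention_seconds before and after the PATCH.
function purgeActionFor(previous: number | null, next: number | null): PurgeAction {
  if (next === previous) return { kind: "none" }; // assumption: unchanged value does nothing
  if (next === null) return { kind: "purgeAll" }; // capture disabled
  // Enable from null, raise, or lower: always purgeExpired with the new value,
  // which also refreshes the retention cached by the Cloudflare DO.
  return { kind: "purgeExpired", retentionSeconds: next };
}
```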
…then-snapshot for stream

Two fixes folded into one commit because they touch the same surface:

DumpStore contract revert (F18). The previous fix added a third
`retentionSeconds` arg to `list`/`get`. Push it back inside the impls:

- CF DO reads `retentionSeconds` from its own state row, refreshed on
  every put and on every `purgeExpired` (which the latest PATCH always
  calls now). No shim threading needed on the read path.
- CF shim drops the read-path retention arg and now throws if the lookup
  returns null at put time — capture middleware already gates on
  `dumpRetentionSeconds === null`, so a null here is a real bug and we
  surface it instead of silently dropping the record (F7).
- Node store takes an injected `retentionResolver` at construction time
  and applies the cutoff inside `list`/`get`. The platform-node entry
  passes a resolver that reads the api key on each call so a freshly
  raised retention takes effect on the very next read without a put or
  sweep.
- Control-plane handlers stop computing the cutoff and just call
  `list(keyId, opts)` / `get(keyId, recordId)`.

Stream-handler subscribe ordering (F1). The committed comment about
"accepting the gap" was wrong: snapshot-then-subscribe drops a record
completed in the window between the two operations. Invert the order:
subscribe first, buffer broker pushes, read the snapshot, push the
snapshot, drain the buffer, then forward live broker events. A record
completing inside the window is now delivered by both paths — the
client dedupes by id, matching the at-worst-twice spec guarantee.
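
The subscribe-first ordering can be sketched as below. It is a simplified illustration: `openStream`, `Meta`, and the `Subscribe` shape are hypothetical, and the snapshot read is synchronous here to keep the example small, whereas a real handler would await the store.

```typescript
type Meta = { id: string };

// Hypothetical broker shape: registers a listener, returns an unsubscribe function.
type Subscribe = (listener: (meta: Meta) => void) => () => void;

function openStream(
  subscribe: Subscribe,
  readSnapshot: () => Meta[],
  send: (event: "snapshot" | "appended", payload: Meta[] | Meta) => void,
): () => void {
  const buffer: Meta[] = [];

  // 1. Subscribe first; anything published from now on is buffered.
  let sink = (meta: Meta): void => {
    buffer.push(meta);
  };
  const unsubscribe = subscribe((meta) => sink(meta));

  // 2. Read and push the snapshot. A record completing during this read
  //    lands in BOTH the snapshot and the buffer.
  send("snapshot", readSnapshot());

  // 3. Drain what arrived in the meantime.
  for (const meta of buffer) send("appended", meta);

  // 4. Switch to live forwarding.
  sink = (meta) => send("appended", meta);
  return unsubscribe;
}
```

A record that completes inside the window is sent twice, once in the snapshot and once as `appended`, and the client is expected to drop the duplicate by id.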

Tests updated for the new 2-arg signatures and the resolver-injected
fixture; new gateway test exercises the snapshot-window race directly.
…with context

Three follow-up fixes to the capture middleware:

F3 — replace the synchronous buffer-then-replay with a real body tee.
The handler's half is wrapped into a new Request that carries forward
signal/cache/credentials/referrer from the original, and the capture
half drains into a Uint8Array reader concurrently with `next()`. The
downstream handler now sees streaming bytes as they arrive rather than
waiting for the full body to land in memory first. `duplex: 'half'` is
required on the rebuilt Request anywhere a ReadableStream body is
passed.

F5 — drop the per-call try/catch that swallowed store and broker
errors. Wrap the scheduled work so a failure is re-thrown with
`keyId=...` + `recordId=...` context, then let the BackgroundScheduler
resolver own the log. On Node that resolver is the per-process
`[background]` logger; on Cloudflare it surfaces via the CF logs.

F11/F16 — the `apiKey === undefined` branch was dead in production
(auth.ts runs first and rejects unauth'd callers with 401), so the
guard collapses to `apiKey.dumpRetentionSeconds === null` and the
absent-apiKey test goes away. The `:countTokens` skip comment loses
its reference to a gitignored spec doc.

Tests: new streaming-request-body test sends a body in two enqueue'd
chunks separated by an await turn and asserts the handler observed
both reads before EOF — which is the cleanest way to differentiate a
real tee from a buffer-and-replay.

EventSource silently reconnects on disconnect: the browser re-runs the
SSE handshake without us calling open() again, and the gateway re-emits
the snapshot event on every new stream. But the seen-set was only
cleared inside open(). After a reconnect every snapshot id was already
in seen, the dedup loop produced an empty fresh array, and the visible
record list went blank on any network flake.

Treat each snapshot as ground truth: rebuild seen and records from the
arriving array inline. appended still dedupes through the seen-set so
the snapshot/subscribe race in the gateway stays collapsed to one row.
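
A minimal sketch of the client-side rule, written as a plain reducer rather than the actual Vue composable. `createRecordList` and the event shapes are hypothetical, and prepending new records (newest first) is an assumption about the list order.

```typescript
type Meta = { id: string };

type StreamEvent =
  | { type: "snapshot"; records: Meta[] }
  | { type: "appended"; record: Meta };

function createRecordList() {
  let seen = new Set<string>();
  let records: Meta[] = [];

  return {
    get records(): readonly Meta[] {
      return records;
    },
    apply(event: StreamEvent): void {
      if (event.type === "snapshot") {
        // Ground truth: rebuild both structures from the arriving array,
        // so a reconnect snapshot can never leave the list blank.
        records = [...event.records];
        seen = new Set(records.map((r) => r.id));
        return;
      }
      if (seen.has(event.record.id)) return; // duplicate from the race window
      seen.add(event.record.id);
      records = [event.record, ...records]; // assumption: newest first
    },
  };
}
```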
The dashboard pane needs to be linkable so an operator can paste a
single URL pointing at a specific captured request. A new useHashRef
composable owns the bidirectional sync:

- Initial value is read from location.hash at setup time.
- Ref writes go through history.replaceState (NOT location.hash =) so
  the browser does not scroll the element into view.
- Setting the ref to null drops the hash entirely.
- A hashchange listener feeds back/forward and pasted URLs into the
  ref. Removed via onScopeDispose.

Wired into the requests page so v-model:selected-id is hash-backed.
Seven vitest cases cover seeding, write, drop, hashchange propagation,
and the no-loop guard. Tests stub location/history/window and use
effectScope — the project's web vitest config runs in the node
environment with no DOM library available.

The catch around navigator.clipboard.writeText used to swallow the
error and intentionally show no feedback, so the operator could not
tell whether the copy happened or the page silently dropped the
write (permissions, insecure context, focus issues, browser quirks).

Track a copyFailed ref alongside copied. On exception log the error
to console and flip the affected button to the danger variant with
"Copy failed" for ~2s. The success path only writes copied after the
clipboard promise resolved, so "Copied" never appears on a failure.

Same anti-pattern as the dump record-detail buttons. The catch was
empty with no explanation, so a failed clipboard write produced no
feedback at all. Thread copyFailed through to KeysTable next to copied
and render a rose circle + X icon (with a "Copy failed" tooltip) when
the tag is failing.

The per-protocol collect*Stream functions used to return the bare result
type and silently lose information on truncated or errored streams:

  - Responses only carried the most recent terminal snapshot; every
    incremental delta (output_text, function_call_arguments,
    custom_tool_call_input, reasoning_summary_text, image-generation
    partial frames, web-search lifecycle) was thrown away when the
    stream ended before its terminal frame.
  - Every collector's `default: break` swallowed the protocol-level
    error frame. Gemini's chunk parser would have crashed on the
    `{ error: { code, message, status } }` envelope instead of
    routing it.
  - Messages threw on a `content_block_delta` whose `content_block_start`
    never arrived, taking the whole "collected" view down with it.

Rework. Every collector returns

    { result: T | null; error: string | null; truncated: boolean }

exported as CollectOutcome<T> from @floway-dev/protocols/dump-collect.
Truncated streams fall back to the accumulator instead of throwing;
mid-stream error frames are captured into `error` and the partial result
is still returned. Responses gains a real fold across every delta event
class, with terminal frames overriding when present.

Detection paths:

  - Messages: type === 'error' frame (Anthropic SSE).
  - Responses: type === 'error' frame.
  - Chat Completions: no dedicated error event in the SSE protocol;
    error chunks land as `data: { error: { message, ... } }` instead
    of `data: { choices: [...] }`. Detected by shape, not event name.
  - Gemini: chunk parses as `{ error: { code, message, status } }`
    (GeminiErrorResponse). Detector requires a string `message` so a
    legitimate result chunk carrying an `error` extension doesn't
    misclassify.

Truncation signals per protocol: Messages = no message_stop. Responses
= no response.completed/incomplete/failed (or the terminal status
itself is incomplete/failed). Chat = no [DONE] marker or any choice
missing finish_reason. Gemini = any candidate missing finishReason.
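
To make the envelope concrete, here is a heavily reduced Chat Completions collector that follows the detection and truncation rules above. It is a sketch, not `collectChatCompletionsStream`: `collectChatText` is a hypothetical name, it takes raw `data:` payload strings, folds only the first choice's text, and returns `result: null` when no text was accumulated (an assumption).

```typescript
type CollectOutcome<T> = { result: T | null; error: string | null; truncated: boolean };

// Each entry in `payloads` is the raw `data:` payload of one SSE frame.
function collectChatText(payloads: string[]): CollectOutcome<string> {
  let text = "";
  let sawText = false;
  let sawDone = false;
  let sawFinish = false;
  let error: string | null = null;

  for (const payload of payloads) {
    if (payload === "[DONE]") {
      sawDone = true;
      continue;
    }
    const chunk = JSON.parse(payload);
    // Error chunks are detected by shape, not by event name.
    if (typeof chunk?.error?.message === "string") {
      error = chunk.error.message;
      continue;
    }
    const choice = chunk?.choices?.[0]; // simplification: first choice only
    if (typeof choice?.delta?.content === "string") {
      text += choice.delta.content;
      sawText = true;
    }
    if (choice?.finish_reason != null) sawFinish = true;
  }

  // Truncated when the [DONE] marker or the finish_reason is missing.
  return { result: sawText ? text : null, error, truncated: !(sawDone && sawFinish) };
}
```

A stream that breaks after its first delta still yields the partial text with `truncated: true`, and a mid-stream error chunk fills `error` without discarding what was accumulated before it.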

Tests cover happy path, truncated stream, mid-stream error, multi-
choice/multi-candidate folding, split tool-call argument
concatenation, reasoning summary accumulation, response.incomplete
terminal, and orphan deltas. 25 tests across the four collectors;
full protocols sweep stays green (64 tests).

Dashboard integration follow-up: apps/web/src/components/dump/
RecordDetail.vue (a sibling worktree's concern) still types the
collect return as `unknown`, so the workspace typechecks today; the
component needs to be reshaped to render `result` + `error` +
`truncated` separately, and the existing "Could not collect stream"
branch should only fire when `result === null`.
…the composite index is usable

Migration 0034 builds idx_dump_records_key_created on
(key_id, created_at DESC). The previous ORDER BY id DESC couldn't use
that index — results happened to be ordered correctly because ULIDs are
time-monotonic, but the query planner couldn't prove it and had to sort.
The lazy-filter `created_at >= ?` likewise can't drive the index unless
the ORDER BY agrees with it.

Switching the ORDER BY to (created_at DESC, id DESC) — id as tie-breaker
for same-ms records — lets the planner walk the index in order. The
`id < ?` cursor predicate still means "earlier in time" under ULIDs and
keeps the existing pagination contract.
… API key read surface

Previous coverage only inserted; the SqlApiKeyRepo's ON CONFLICT UPDATE
clause for dump_retention_seconds went untested, as did the column's
appearance on findByRawKey/list/listIncludingDeleted/listByUserId after
an update. The new cases drive the column through positive -> null,
positive -> positive, and read every list path after the update.
…re aside

The migration comment about KeyDumpDO described runtime architecture,
not a schema-level invariant — drop the second sentence; the first one
still pulls its weight by saying what the table is for.

In broker.ts the "Resolver for the next-message wait" sentence just
restated the Waker type annotation; tighten to keep only the actual
information (publish and abort cooperate on the same resolver).

In the broker abort test, replace the empty no-op loop body and its
"should never run" comment with an explicit assertion that no message
was received — the test now fails fast if the iterator does receive
anything before abort, instead of silently passing because the loop
exited.
…ted view

The four collect* functions now return CollectOutcome<T> with result/error/
truncated fields instead of the raw protocol shape. RecordDetail consumes
that envelope:

- Red banner when error is set ("Stream error: ..."); shown even if
  truncated is also true, since the upstream message carries more diagnostic
  value than the missing-terminal-frame signal.
- Amber banner when truncated alone ("Stream truncated — terminal frame
  missing. Showing best-effort accumulated state.").
- Empty-state line when result is null so the renderer never dereferences
  a missing payload; the banner above it explains why.

Copy button still copies the reconstructed JSON when present, falling back
to the error string so users can paste the diagnostic in either state.
… skip the sweep

The previous sequential `for ... await store.purgeExpired(...)` propagated
the first key's failure up to the entry-point catch and the loop bailed,
leaving every later key un-purged. The outer log also collapsed to a
single anonymous `[dump-purge]` line so the operator couldn't tell which
key was at fault.

Per-key try/catch with `[dump-purge] <keyId>` logging keeps the sweep
bounded and the failure attributable. The outer catch in entry.ts now
only fires for `apiKeys.list()` failures and is labeled accordingly.
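
A sketch of the per-key isolation, assuming a store whose `purgeExpired(keyId, retentionSeconds)` returns a promise. The `ApiKey` and `PurgeStore` shapes and the injectable `log` parameter are hypothetical reductions for illustration.

```typescript
type ApiKey = { id: string; dumpRetentionSeconds: number | null };

type PurgeStore = {
  purgeExpired(keyId: string, retentionSeconds: number): Promise<void>;
};

async function purgeDumpsForAllKeys(
  store: PurgeStore,
  keys: ApiKey[],
  log: (message: string, error: unknown) => void = console.error,
): Promise<void> {
  for (const key of keys) {
    if (key.dumpRetentionSeconds === null) continue; // capture disabled for this key
    try {
      await store.purgeExpired(key.id, key.dumpRetentionSeconds);
    } catch (error) {
      // One failing key is logged with its id and does not stop the sweep.
      log(`[dump-purge] ${key.id}`, error);
    }
  }
}
```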
…column

The state table's v column is NOT NULL by schema, so the only meaning of
"no row" is row-missing — readState now returns string | undefined and
callers compare against undefined explicitly instead of folding "row
missing" into "stored null" with `?? null`.

alarm() takes the deleteAll-then-new-put race window as the explicit
no-op: when purgeAll has dropped the state row, the alarm bails and the
next put rewrites retention. expiryCutoff inherits the same shape — no
state row means no retention to enforce yet, so list/get include every
row. Both are documented at the site.
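
The cutoff and alarm arithmetic can be sketched as three pure helpers. The names `expiryCutoff` (used here with a simplified numeric signature), `nextAlarmAt`, and `isVisible` are illustrative, and `undefined` models the missing state row; the real DO reads these values from its SQLite tables.

```typescript
// Oldest created_at (ms) that is still visible, or null when no retention
// is known yet and every row should be included.
function expiryCutoff(nowMs: number, retentionSeconds: number | undefined): number | null {
  if (retentionSeconds === undefined) return null; // no state row: nothing to enforce
  return nowMs - retentionSeconds * 1000;
}

// The alarm fires when the oldest record reaches its retention age.
function nextAlarmAt(
  oldestCreatedAtMs: number | undefined,
  retentionSeconds: number | undefined,
): number | null {
  if (oldestCreatedAtMs === undefined || retentionSeconds === undefined) return null;
  return oldestCreatedAtMs + retentionSeconds * 1000;
}

// Lazy read-path filter, so a late-firing alarm cannot leak stale rows.
function isVisible(createdAtMs: number, cutoff: number | null): boolean {
  return cutoff === null || createdAtMs >= cutoff;
}
```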

Also: records.created_at switches from startedAt to completedAt so the
stored record is born at completion, not at request start; a long
request was previously aged from the wrong moment. getRecord now takes
keyId as a parameter rather than reading it back from state — the shim
already has it, and after purgeAll a missing-row read now returns null
for the right reason.
…ware WS test

Two follow-ups on the broker:

- The subscribe iterator added an abort listener with `{ once: true }`,
  but that flag only auto-removes after the listener fires. On a
  long-lived shared AbortSignal — typical in SSE handlers that thread
  one signal through many subscribers — the listener would accumulate
  per iterator cycle. The finally block now removes it explicitly on
  normal return.

- The broker_test fake conflated the two ends of a WebSocketPair into a
  single EventTarget, so the test passed even though the broker's real
  contract (accept() on the fetch().webSocket client end, observe
  messages the DO server sends through the peer) was not being
  exercised. Replaced it with a proper two-ended pair; fetch() returns
  the client side, the test publishes via server.send(), and the
  iterator observes the messages on the client end.

Restate-only comments in the catch tails were dropped to match the
project's prevailing best-effort-close style.
… on reconnect

Four follow-ups from the final review pass on the request-dump branch:

- Node store now binds `meta.completedAt` as `created_at`, matching the
  CF DO's retention semantics. The CF rationale lived only at the DO
  call site; copy it next to the Node insert so each backend documents
  its own policy.
- `useDumpSubscription` no longer discards the paged-in history when the
  browser auto-reconnects EventSource. Snapshots are authoritative only
  for the window they cover (the newest N); rows older than the
  snapshot's oldest id are preserved. ULID id-comparison stands in for
  startedAt to keep the predicate one field-access deep.
- New tests in `useDumpSubscription_test.ts` cover initial snapshot,
  `loadOlder`, reconnect-preservation, and `appended` dedup against the
  preserved older rows. Modeled on `useHashRef_test.ts`.
- `control-plane/dump.ts` documents why the sink reassignment between
  the buffered-drain and the live-pump is atomic: the next broker push
  can only resume on a later microtask, by which time the live sink is
  in place.
…rowing on first PATCH/getRecord/alarm

The DO's SQLite cursor .one() throws 'Expected exactly one result' when
the result set is empty; the surrounding code (scheduleNextAlarm's empty-
table branch, getRecord's missing-row branch, readState's missing-state
branch) had unreachable null-checks that were never exercised.

Switched to .toArray()[0] which returns undefined on empty, so every
null-check now behaves as written. Surfaced first by hitting PATCH
/api/keys/:id with retention while the DO had no records yet — the
purgeExpired call propagated the throw out to a 500.
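
The difference between the two cursor calls can be shown with a small stand-in. This is a sketch under stated assumptions: `RowCursor`, `fakeCursor`, and `firstRow` are hypothetical names, and the fake only mimics the behavior described above rather than the Workers runtime itself.

```typescript
// Minimal model of the two cursor methods the commit message mentions.
interface RowCursor<T> {
  one(): T;
  toArray(): T[];
}

// Test double that mimics the described behavior: one() throws unless the
// result set holds exactly one row.
function fakeCursor<T>(rows: T[]): RowCursor<T> {
  return {
    one(): T {
      if (rows.length !== 1) throw new Error("Expected exactly one result");
      return rows[0] as T;
    },
    toArray(): T[] {
      return [...rows];
    },
  };
}

// Returns the first row, or undefined when the result set is empty, so the
// caller's null-check is actually reachable.
function firstRow<T>(cursor: RowCursor<T>): T | undefined {
  return cursor.toArray()[0];
}
```

With `firstRow`, an empty table yields `undefined` and the surrounding missing-row branch runs instead of an exception escaping to the route handler.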
Menci added 6 commits June 20, 2026 01:23
…tom headers

Browsers cannot attach 'x-floway-session' (or any custom header) to
native EventSource connections, so the SSE /api/dump/keys/:keyId/stream
endpoint was rejected by the session-cookie auth middleware on every
dashboard open — the Requests tab showed an empty list with a red
'Stream disconnected' banner.

Fix: gateway auth middleware accepts ?session=<token> query string on
that one endpoint only (pinned to GET + the exact path pattern, so
other routes can't leak session via URL); useDumpSubscription appends
the current session token to the EventSource URL.

Also switched the loadOlder pager and RecordDetail.fetchRecord from
plain fetch() to authFetch — they were silently 401-ing too. The test
fixture for useDumpSubscription now installs a fresh Pinia per case
because the composable touches useAuthStore.
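
The route pinning described in the fix can be sketched as a token resolver. The function name `sessionTokenFor` and the exact regular expression are assumptions for illustration; only the `x-floway-session` header, the `?session=` parameter, and the `/api/dump/keys/:keyId/stream` path come from the text above.

```typescript
// Assumed pattern for the single SSE route allowed to carry the session
// token in the query string.
const STREAM_PATH = /^\/api\/dump\/keys\/[^/]+\/stream$/;

// headerToken is the value of the x-floway-session header, or null.
function sessionTokenFor(
  method: string,
  url: string,
  headerToken: string | null,
): string | null {
  // The header always wins when the client was able to send it.
  if (headerToken) return headerToken;
  const parsed = new URL(url);
  // The query-string fallback applies to GET on the stream route only.
  if (method !== "GET" || !STREAM_PATH.test(parsed.pathname)) return null;
  return parsed.searchParams.get("session") || null;
}
```

Because the fallback is checked against both the method and the full path, a `?session=` parameter on any other route is ignored.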
…-section detail layout

Restructures the Requests UI to match the Models page pattern:

- New top-level tab '/dashboard/requests' replaces the row-action button on
  the keys list. The page contains a key selector on the left (filtered
  to keys with dump retention enabled) above the request list, and a
  detail pane on the right.
- Removed the row-action 'Requests' icon from KeysTable; entry is now
  navigation-driven.
- Right detail pane no longer wraps in a 'Detail' titled card. The four
  section cards (Request headers / body, Response headers / body) sit
  directly in the pane and share the viewport via a 1:2:1:2 grid so the
  bodies get more room than the headers and each card scrolls
  independently when content overflows.
- All four section headers use a consistent dark surface
  (bg-surface-800/95) — previously only the request headers section did.
- JSON bodies and stream events are rendered through @floway-dev/ui's
  Code component for Prism syntax highlighting; non-JSON bytes fall back
  to plain <pre>.
- Tightened the request-list row layout: single line for status+method+
  path+time, second line for model+duration+tokens. Upstream id moves to
  the detail pane only (it was redundant in the list and crowded the
  layout).
- typed-router.d.ts regenerated.
- Old pages/dashboard/keys/[keyId]/requests.vue deleted.
… row; natural-height sections

DumpMetadata gains:
- requestBytes / responseBytes: captured wire-level byte counts. The capture
  middleware sums chunks via a TransformStream tap on the response tee (zero
  cost on top of the existing parse pipeline) plus byteLength on the buffered
  request body. Renders in the list as ↑/↓ indicators.
- upstream: now a {id, name, kind} object instead of bare id. The capture
  middleware resolves the id against the upstreams repo at finalize time so
  the dashboard can show a colored provider label without a round-trip. A
  deleted upstream falls back to id-as-name with kind='unknown'.
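
The fallback for a deleted upstream can be sketched as follows. `resolveUpstreamRef` and `UpstreamLookup` are hypothetical names, and the lookup is modeled as synchronous for brevity; the real repo call may well be asynchronous.

```typescript
// Shape described above: the list row needs a name and provider kind.
interface DumpUpstreamRef {
  id: string;
  name: string;
  kind: string;
}

// Assumed lookup signature; returns undefined for a deleted upstream.
type UpstreamLookup = (id: string) => { name: string; kind: string } | undefined;

function resolveUpstreamRef(id: string, lookup: UpstreamLookup): DumpUpstreamRef {
  const row = lookup(id);
  // Deleted upstream: show the id as the name and mark the kind unknown.
  if (row === undefined) return { id, name: id, kind: "unknown" };
  return { id, name: row.name, kind: row.kind };
}
```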

CF KeyDumpDO: re-create the schema after ctx.storage.deleteAll() in purgeAll.
deleteAll drops the records and state tables, but the runtime keeps the DO
instance alive — its constructor (which seeds the schema) only runs on next
eviction. Without this re-create, the next put / purgeExpired after a
PATCH-to-null + PATCH-to-positive cycle 500'd on 'no such table: state'.

RequestList row redesign:
- Line 1: status + method + path + relative-time (unchanged).
- Line 2: ⏱ duration · ↑ req-bytes · ↓ res-bytes (left, always grouped);
  model name (right). flex-wrap; the metrics stay leftmost on the first
  wrapped row, the model stays rightmost wherever it lands.
- Line 3 (conditional, hidden when neither upstream nor tokens are present):
  upstream name colored by provider kind (no badge — copilot/codex cyan,
  azure emerald, custom amber); total tokens on the right formatted as
  '1.2 M tokens'.

RecordDetail layout:
- Replaced the fixed-ratio CSS grid (which forced sections to share the
  viewport and made short bodies look padded and tall bodies have no
  scrollbar) with an outer OverlayScrollbars over the whole detail pane
  and per-section max-h-[70vh] internal scroll. Sections render at their
  natural height up to that cap; the pane scrolls when the stack
  cumulatively overflows.

Test fixtures updated to the new metadata shape (upstream object,
required byte fields).
… width

List row layout: status / model / time on line 1; path on line 2 left,
metrics group on line 2 right. The model is the most operator-relevant
identifier so it takes the prominent slot; when accounting didn't resolve
a model (4xx before upstream attempt, etc.) line 1 shows 'Unknown' in
the standard sans font as a clear 'no value' cue instead of pretending
the path is the model. The POST method is no longer rendered — every
captured endpoint is POST, so it added no signal in the list. Relative
time gets its 'ago' suffix back so it can't be confused with the
duration metric on the next line.

Code box vertical scroll: @floway-dev/ui's Code component's root is now
flex-col with a min-h-0/flex-1 OverlayScrollbars inside, so a
max-h-* class applied to <Code> at the call site propagates and the
inner scrollbar activates instead of the content being clipped by an
outer overflow-hidden box. RecordDetail's wrappers around Code became
flex containers and the Code blocks themselves carry max-h-[70vh]
(events list inside the stream view caps each event at max-h-[40vh]).

Sidebar widened from lg:w-80 → lg:w-90.
…headers

Sensitive request headers default to '••••••••' with a tightly-adjacent
eye icon (lucide-eye / lucide-eye-off). Clicking the icon reveals or
re-hides the value. State is per-row so repeated headers (rare but
possible for authorization) toggle independently, and resets whenever
a different record loads.

Frontend-only — the captured value still lands in the dump record
verbatim, which is the documented contract. This just stops the value
from being visible at a glance when the operator opens the detail
panel near another person.
…rs, no card hierarchy

Redaction now keeps the value's character count intact and masks only the
middle: first 8 + last 8 visible, the rest replaced with the same number
of '•'. Operators recognize the credential type from the prefix (sk-ant-…,
cgw-main-…) and verify against their notes via the tail. Strings of 16 or
fewer chars are fully masked because there's no middle to elide.
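
The masking rule can be sketched as a small pure function. `redactMiddle` is a hypothetical name, and the sketch counts UTF-16 code units, which matches character count for the ASCII values typical of credential headers.

```typescript
// Number of characters left visible at each end.
const VISIBLE = 8;

// Keep the first and last VISIBLE characters and replace the middle with
// the same number of bullets, so the overall length is unchanged.
function redactMiddle(value: string): string {
  // 16 or fewer characters: there is no middle, so mask everything.
  if (value.length <= VISIBLE * 2) return "•".repeat(value.length);
  const head = value.slice(0, VISIBLE);
  const tail = value.slice(-VISIBLE);
  return head + "•".repeat(value.length - VISIBLE * 2) + tail;
}
```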

Detail pane reflow:
- Dropped the outer wrapper card and the glass-card on each individual
  section. Sections now sit flush, separated by a single divide-y line
  (no double border between adjacent section headers and bodies).
- Removed the inner padding around the section stack (was p-4 gap-3).
- Each section header sticks to the top of the right-pane scroll
  container (sticky top-0 z-10) — when the user scrolls past a section
  the header stays pinned until the next section's header pushes it up,
  giving the column a grid-like feel without per-section card chrome.
- Dropped per-section max-h caps on bodies; the right pane scrolls the
  whole stack so long bodies expand naturally and the sticky header
  is the navigation aid.
Menci added 8 commits June 20, 2026 04:09
Apply the synthesized round-1 review findings against the request-dump
branch. Headline items:

- Test fix: capture-dump's `dumpAccounting context var feeds the record
  metadata` test now wires `initRepo(new InMemoryRepo())` and seeds an
  `up_test` upstream so the production code path's
  `getRepo().upstreams.getById(...)` resolves; asserts the new
  `DumpUpstreamRef { id, name, kind }` shape instead of the old raw id.

- Bundle scoping: drop `collect*Stream` re-exports from the per-protocol
  index barrels (`chat-completions`, `responses`, `messages`, `gemini`).
  Collectors stay reachable only via `@floway-dev/protocols/dump-collect`,
  which keeps them out of the gateway/Worker bundle.

- UI honesty: surface base64-decode failures in the dump's Request body
  pane with an inline amber banner instead of silently falling back to
  raw base64.

- DurableObject churn: cache `keyId` / `retentionSeconds` writes in
  KeyDumpDO so the SQL only fires when the value actually changes.

- Shared helpers: extract `sumDumpTokens` (used by LLM `respond.ts` and
  passthrough-serve) and `ownedKeyOr404` (used by control-plane `dump.ts`
  and `api-keys/routes.ts`).

- Code component: make the flex-column root opt-in via a `fillParent`
  prop so the existing call sites in flow layouts keep their original
  vertical behavior.

- YAGNI: drop the dead Gemini-JSON branch from `capture-dump.ts` (the
  SSE branch handles Gemini in production after `streamSSE`).

- Comments: rewrite or remove a batch of redundant / mislabeled /
  positional comments per the project comment rules (no restating
  obvious code; no spec-id prefixes; no `// Line N:` labels).

- Error visibility: surface `keyId` in the broker pump's error log
  (`control-plane/dump.ts`); replace the silent `?? null` on the
  retention resolver in `platform-node/entry.ts` with an explicit throw
  for an unknown keyId.

Verification: vitest (262 files, 2808 tests), typecheck, lint — all
clean.
Address the 8 findings from the round-2 reviewer + cleanup pass.

I1  drop dead parseGeminiStream + co-located test; capture middleware uses
    the generic parseSSEStream for every text/event-stream body.
R2-1 revert Code.vue's fillParent prop — the redesigned detail pane no
    longer caps Code's height, so the opt-in flex shell is unreachable.
M1  rename the Gemini-collect text-concat test to match its fixture;
    mergePart never concatenates function-call args.
M2  drop "by spec" from the dump_test duplicates note (no committed spec
    document); point to the subscribe-before-snapshot comment instead.
M3  un-export DumpPurgeKey in apps/platform-node/src/dump/purge.ts.
M4  drop the redundant Waker comment in apps/platform-node/src/dump/broker.ts.
M5  align CF retentionLookup with Node: throw on unknown-key in
    apps/platform-cloudflare/entry.ts so a null at put-time is only ever
    the genuine PATCH-disabled race; update the store-shim throw message.
M6  tighten the .toArray()[0] vs .one() comment in key-dump-do.ts to
    point at the single source of truth in scheduleNextAlarm.

Verification: per-package vitest, pnpm run typecheck, and pnpm run lint
all clean.
Address the 26 findings from the round-3 reviewer + cleanup pass (10
important + 16 minor).

I1   wire captureRequestDump onto the Codex POST routes (/responses,
     /responses/compact) — they were billable upstream calls going
     un-dumped while every other LLM POST was wrapped.
I2   stamp dumpAccounting on every error path in all four LLM
     respond.ts files and in passthrough-serve. Added
     errorDumpAccounting(performance) + plainDumpAccounting in
     shared/respond.ts; the existing identity-driven setter is now
     setDumpAccountingFromIdentity. Error rows in the dump list now
     carry the resolved model/upstream instead of empty cells.
I3   wrap upstream-ref lookup in capture-dump.ts in try/catch with the
     same fallback ref used for a missing row. A flaky repo no longer
     drops the entire dump.
I4   CF DO list now ORDER BY (created_at DESC, id DESC) so it uses
     idx_records_created and matches Node's cursor semantics.
I5   parseDuration('0') (and 0s/0m/0h/0d) returns null at the parser
     boundary so the dialog surfaces the validation error rather than
     letting a zero through to the backend's positive() rejection.
I6   useDumpSubscription bails the watcher on a falsy keyId (no more
     /api/dump/keys//stream and stuck reconnect loop), and the snapshot
     handler clears error.value so a successful reconnect dismisses
     the "Stream disconnected" banner.
I7   captureBytes drains via a reader loop, preserving every chunk
     received before a mid-stream error. arrayBuffer() discarded the
     whole buffer.
I8   capture-dump_test gets a describe('failure modes') block covering
     store.put throw, broker.publish throw, upstream-ref throw,
     captureBytes mid-stream error, and captureSSE mid-stream error.
     Asserts the wrapped error format and error.cause preservation,
     and that surviving capture lands on disk.
I9   dump_test snapshot-vs-buffer test now freezes the snapshot before
     the gate, then asserts 01B is NOT in snapshot AND IS in appended
     — the buffered subscribe path is the only delivery channel.
I10  two responses/collect_test cases are reshaped to prove the
     branches they claim (drop the terminal frame on the deltas-fold
     test; retarget the incomplete-terminal test at the truncated:
     true contract that actually holds).
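
The I5 rule (a zero duration is rejected at the parser boundary) can be sketched as below. This is not the dashboard's parser: the accepted grammar (digits plus an optional s/m/h/d suffix, bare numbers read as seconds) is an assumption of this sketch.

```typescript
// Seconds per unit suffix; the unit set is an assumption of this sketch.
const UNIT_SECONDS: Record<string, number> = { s: 1, m: 60, h: 3600, d: 86400 };

// Parses inputs such as "90", "30m", "6h", "7d" into whole seconds.
// Returns null for malformed input and for any zero duration, so the
// caller can show a validation error before anything reaches the backend.
function parseDuration(input: string): number | null {
  const match = /^(\d+)([smhd]?)$/.exec(input.trim());
  if (match === null) return null;
  const amount = Number(match[1]);
  const unit = match[2] || "s";
  const seconds = amount * (UNIT_SECONDS[unit] ?? 1);
  return seconds > 0 ? seconds : null;
}
```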

cleanup  drop "by the spec" from data-plane/routes; remove the
         fictional cgw-main-… example in RecordDetail; add
         tabindex/role/keydown handlers to RequestList rows for
         keyboard operability.

M1-M13   misc cleanups — drop unused FakeEventSource.closed field,
         remove unreachable `<= 0` branch in EditKeyDialog, move
         created_at=completedAt rationale onto the DumpStore.put
         contract, trim DumpStreamEvent / DumpUpstreamRef comments,
         tighten 0034 header to acknowledge dual-runtime apply,
         purge dump bundles before api_keys.deleteAll() in replace-
         mode import, add FileProvider.delete(key) for true per-file
         delete in NodeDumpStore.purgeExpired, make
         appendOutputText/appendReasoningSummary slot-overwrite
         symmetric, add signal-abort + multi-subscriber tests to CF
         broker_test, drop async on KeyDumpDO.webSocketMessage, emit
         event:error SSE frame on broker failure in dump.ts, and add
         a 10-year upper bound on dumpRetentionSecondsSchema.

Verification: per-package vitest (61+1278+55+16+36 = 1446 tests),
pnpm run typecheck, and pnpm run lint all clean.
Important:
- control-plane dump stream: own the broker subscription via a local
  AbortController chained to the request signal; abort and drain on a
  snapshot read failure so the broker subscription tears down instead of
  leaking the per-key DO websocket (CF) or in-process queue (Node).
- deleteUser: purge every live key's dumps BEFORE any tombstoning so a
  purge failure leaves the user and keys retriable, matching deleteKey's
  purge-then-soft-delete contract.
- CF broker: separate transport-error handling from clean close; iterator
  throws on error so the control-plane pump emits the final
  `event: error` SSE frame instead of letting the dashboard see an
  opaque disconnect.
- Added a dump_test case covering the broker-throws-after-snapshot SSE
  error-frame path and a CF broker_test for the throw shape.

Minor cleanup:
- Imported `DumpUpstreamRef` in capture-dump.ts and inlined the unused
  public `DumpResponse` interface in protocols.
- Logged transient upstream-ref lookup failures so operators can tell
  "upstream deleted" apart from "repo down".
- Rewrote the test comment referencing gitignored docs to be
  self-contained; tightened the test to also assert the new log line.
- Removed redundant `// never reached` filler and renamed the CF broker
  "DO fanout" test to reflect what it actually proves.
- Asserted the third console.error argument in the Node purge test.
- Removed the redundant length-watcher in RequestList.vue (sentinel
  identity is stable; the existing watcher covers mount/unmount).
- Guarded RecordDetail.vue's fetchRecord with an activeFetchToken so a
  fast A->B click can't repaint A.
- Gave providerColorClass a distinct hue per provider kind from the
  declared accent palette.
- requests.vue now refetches the keys list on window focus via
  initialData.reload().
- retentionPresetFromValue returns `${sec}s` for raw seconds so the
  custom field round-trips with an explicit unit.
- KeyDumpDO.purgeOlderThan now throws on a missing cached keyId instead
  of silently skipping R2 cleanup.
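
The `retentionPresetFromValue` round-trip mentioned in the list above can be sketched as follows. The preset table and the `"off"` return value are assumptions based on the preset labels in the PR summary; only the `${sec}s` fallback is taken from the text.

```typescript
// Assumed mapping from stored seconds to the editor's preset labels.
const PRESETS: Record<number, string> = {
  3600: "1h",
  21600: "6h",
  86400: "24h",
  604800: "7d",
};

// Maps a stored retention value to what the editor shows. Values that are
// not a preset become raw seconds with an explicit unit, so the custom
// field can parse its own output again.
function retentionPresetFromValue(sec: number | null): string {
  if (sec === null) return "off";
  return PRESETS[sec] ?? `${sec}s`;
}
```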
C1: KeyDumpDO.purgeExpired now seeds keyId into the state table before
delegating to purgeOlderThan, and accepts keyId from the platform contract
so the first null→positive retention PATCH on a key with no captures yet
no longer trips the "state was wiped" invariant. The CF store shim forwards
keyId through, mirroring the Node store's signature. A new regression test
exercises the empty-DO path end-to-end via a vi.mock of cloudflare:workers.

I1: useDumpSubscription's error listener now reads MessageEvent.data first
and surfaces the broker's verbatim message when a server-sent `event: error`
SSE frame arrives (readyState stays OPEN per the WHATWG spec); the existing
transport-close branch handles native EventSource errors with empty data.
Two new tests cover both paths.

M1: Data-transfer import now round-trips ApiKey.dumpRetentionSeconds with
the same null|positive-integer ≤10y validation the PATCH endpoint uses;
older exports that lack the field still default to null.

M2: requests.vue reconciles selectedKeyId when dumpKeys shrinks after a
focus-driven refetch (another tab toggling retention off) by falling back
to the first remaining dump-enabled key, letting useDumpSubscription tear
down the now-stale stream.

M3: The control-plane SSE stream removes its onRequestAbort listener in a
finally block so clean-close and broker-error paths drop the listener
explicitly instead of waiting for GC.

M4: Updated the capture-dump comment to describe the actual dependency
order — the streaming respond's finally runs before the stream closes,
which is before forCapture drains, which is before finalize reads
dumpAccounting.

M5: NodeDumpStore.purgeExpired parallelizes per-record file deletes so a
large sweep no longer pays an N-round-trip serial cost.
Five minor findings after dedup of round-6 reviewer + cleanup output.
No important/critical findings remained.

- parseLimit on /api/dump/keys/:keyId/records now rejects decimals
  (Number.isInteger replaces Number.isFinite); the old Math.floor that
  silently coerced 1.5 to 1 is dead and dropped.
- parseImportedDumpRetention (round-5 M1) gets two routes_test.ts
  cases: a positive value round-trips through import, and 0/-1/over-cap
  values are rejected with 400 before any write.
- annotateBase64ContentType in capture-dump now synthesises
  application/octet-stream;base64 when the upstream sent no
  content-type at all so the SPA's /;base64$/i decoder still triggers.
  Covered by a new capture-dump_test.ts case.
- useDumpSubscription's appended-event dedup Set rebuilds from the
  visible records list when it crosses 10_000 entries so a forever-live
  stream cannot grow it without bound.
- key-dump-do_test.ts comment no longer refers to the "R4 invariant"
  by SDD round label; it describes the purgeOlderThan tripwire by what
  it actually guards.
- useDumpSubscription: prepend the appended record before the windowed
  dedup-set rebuild so the just-added id survives a boundary rebuild and
  a duplicate `appended` event for that id is still rejected.
- capture-dump: comment refers to the runtime's background scheduler
  rather than Cloudflare's `waitUntil`, matching the runtime-agnostic
  abstraction used by the code.
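
The `parseLimit` change in the first bullet above can be sketched as below. The default and maximum are placeholder values chosen for this illustration, not the endpoint's real bounds.

```typescript
// Placeholder bounds; the real endpoint's values are not stated above.
const DEFAULT_LIMIT = 50;
const MAX_LIMIT = 200;

// Returns the page size, or null when the query value should be rejected.
function parseLimit(raw: string | undefined): number | null {
  if (raw === undefined) return DEFAULT_LIMIT;
  const n = Number(raw);
  // Number.isInteger rejects NaN, Infinity, and decimals such as 1.5, so
  // no Math.floor coercion is needed afterwards.
  if (!Number.isInteger(n) || n < 1 || n > MAX_LIMIT) return null;
  return n;
}
```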
M1 (capture-dump): the request-body capture IIFE drained its tee'd half
with no try/catch, so a mid-upload client abort rejected
capturedRequestBytesPromise, threw at finalize's await, and dropped the
entire dump record. Mirrored the response-side reader-loop pattern:
introduced a CapturedRequestBody shape with a streamingError channel,
wrapped the loop in try/catch so it keeps the accumulated chunks on
error, and folded the request-side streamingError into meta.error
alongside the response-side one. Added a test that streams two chunks
then errors and asserts the dump lands with partial bytes plus the
error string.
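
The reader-loop pattern from M1 can be sketched as below. To stay self-contained the stream is modeled as an `AsyncIterable<Uint8Array>` rather than a tee'd request body, and the field names on `CapturedRequestBody` are assumptions; only the type name and the `streamingError` channel appear in the text.

```typescript
// Assumed shape: accumulated bytes plus an optional error message.
interface CapturedRequestBody {
  bytes: Uint8Array;
  streamingError: string | null;
}

// Drain the stream chunk by chunk. If it errors midway, keep what has
// arrived so far and record the error instead of rejecting.
async function captureRequestBody(
  chunks: AsyncIterable<Uint8Array>,
): Promise<CapturedRequestBody> {
  const received: Uint8Array[] = [];
  let streamingError: string | null = null;
  try {
    for await (const chunk of chunks) received.push(chunk);
  } catch (err) {
    streamingError = err instanceof Error ? err.message : String(err);
  }
  // Concatenate the chunks that made it through.
  const total = received.reduce((sum, c) => sum + c.byteLength, 0);
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const c of received) {
    bytes.set(c, offset);
    offset += c.byteLength;
  }
  return { bytes, streamingError };
}
```

Because the promise always resolves, a caller awaiting it at finalize time can still write the record and fold `streamingError` into the record's error field.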

M2 (useHashRef): the SPA composable carried three SSR-style
typeof location/history/window guards. The dashboard has no SSR
entry, the colocated test stubs all three globals before invoking the
composable, and grep found no other such guards in apps/web/src.
Dropped the guards - readHash, the watcher, and the hashchange listener
attach unconditionally. The test suite continues to pass because the
stubs still satisfy the direct global reads.