Tracking the deferred audit findings from PR #142. These are real but either defense-in-depth or larger refactors that didn't earn a same-PR fix when batched with the H-tier work. Listing here so they don't get lost.
For full context on the audit that surfaced these, see the analysis trail in PRs #140 and #142.
M2 — Pagination cursor unsigned (DoS surface)
Where: internal/agent/api.go:1786-1825
The opaque pagination token is just base64url(json) — no HMAC. Any authenticated caller can edit CreatedAt / ID in the JSON, e.g. set CreatedAt to 0001-01-01 to force a full-table-scan-then-truncate against messages. The cursor.AgentID check prevents cross-tenant reach, but a single bearer can repeatedly request expensive windows. Bounded by partial indexes today; integrity gap is real.
Fix sketch: HMAC the cursor JSON with cfg.Signing.HMACSecret (same pattern as the approval-token signer in internal/approvaltoken/token.go). Reject on tamper.
M3 — Magic-link approval tokens replay-able within TTL
Where: internal/approvaltoken/token.go, used by internal/agent/hitl_magic_api.go
A signed approval / reject token is valid for the entire message TTL plus a 10-minute grace. The status-race guard at approve/reject time blocks repeated state changes once the message is finalized, but if a token leaks (mail archive scrape, Slack-channel paste, Gmail link-tracking) before the reviewer acts, an attacker who reaches the link can race them.
Fix sketch: Either (a) track magic_link_consumed_at on the messages table (new column + migration + index + handler check), or (b) maintain a single-use jti table that the handler INSERTs into and lets the unique constraint reject replay.
M4 — DKIM private keys stored unencrypted at rest
Where: migrations/014_per_domain_dkim.sql, internal/identity/store.go:350-364 (GetDKIMKeyInternal)
Per-domain RSA-2048 private keys live in domains.dkim_private_key BYTEA with no envelope encryption. A DB-read compromise (backup leak, replica access) yields "sign as any user's domain" capability.
Fix sketch: Envelope-encrypt with a KEK derived via HKDF from cfg.Signing.HMACSecret (same secret-rotation pattern as OAuth signing), or move to an external KMS for hosted deployments. Migration needs to re-encrypt existing rows in a backwards-compatible window.
M5 — Webhook delivery vulnerable to DNS rebinding (SSRF)
Where: internal/webhook/deliverer.go:67-105, validation only at registration in internal/agent/api.go:66-98 (ValidateWebhookURL)
ValidateWebhookURL resolves the host and rejects private IPs at agent-registration time. At delivery time, only requireHTTPS is checked. The underlying DNS may have been rebound to 127.0.0.1 or RFC1918. CheckRedirect blocks 30x → loopback, but the initial DNS lookup of the registered hostname is the SSRF vector — register evil.example.com (public IP, passes validation), later change DNS to 127.0.0.1, e2a then POSTs the inbound payload to localhost.
Fix sketch: Use a custom net.Dialer with a Control callback that rejects RFC1918/loopback per-connection, mounted as the Transport.DialContext on the webhook http.Client. Re-resolves and re-checks on every connection regardless of caching.
M8 — No CSRF token on /api/oauth/consent
Where: internal/agent/oauth_handlers.go:658-755, session cookie at internal/auth/auth.go:233 (SameSite: http.SameSiteLaxMode)
Consent POST has no anti-CSRF token. SameSite=Lax blocks cross-site POSTs reliably in current browsers, so this is defensible — but a future browser/extension default-downgrade or a same-site subdomain XSS could mount the consent flow without user intent and silently authorize an MCP client to a chosen agent. Defense in depth.
Fix sketch: Bind a one-time CSRF token to the session cookie, mint on the /oauth/consent GET, verify on POST. Standard double-submit or per-session origin-bound token.
M11 — formatExpiresIn × 4, none tick
Where: web/src/app/(app)/dashboard/agents/messages/view/page.tsx:53, web/src/app/(app)/dashboard/pending/page.tsx:28, web/src/app/(app)/dashboard/pending/_components/PendingDetailPanel.tsx:55, web/src/app/(app)/api-keys/page.tsx:38
formatExpiresIn is reimplemented 4 times. None of the surfaces have a tick interval — the "expires in 47m" label captures Date.now() at render and never refreshes, so a reviewer who leaves the page open for 30 minutes sees stale numbers until SWR's window-focus revalidation kicks the data over. UX freshness, not correctness.
Fix sketch: Extract a shared <RelativeTime iso={...} kind="expires" /> component that owns its own 30-60s setInterval (mirroring the existing tick pattern in dashboard/page.tsx). Replace all 4 callsites.
Sequencing suggestion
- M2 — HMAC the cursor (smallest, defense-in-depth on existing surface)
- M3 — magic-link single-use (touches one new column + handler)
- M5 — webhook DNS rebind (focused fix on one client; high-impact security)
- M8 — OAuth CSRF (parallel to M2; reuses HMAC secret infrastructure)
- M11 — ticker UI (frontend only, no security implications)
- M4 — DKIM at-rest encryption (largest scope; KEK derivation + migration of existing rows)
Each is independent — they can land in any order or be cherry-picked as ad-hoc work permits.
Tracking the deferred audit findings from PR #142. These are real but either defense-in-depth or larger refactors that didn't earn a same-PR fix when batched with the H-tier work. Listing here so they don't get lost.
For full context on the audit that surfaced these, see the analysis trail in PRs #140 and #142.
M2 — Pagination cursor unsigned (DoS surface)
Where:
internal/agent/api.go:1786-1825The opaque pagination token is just
base64url(json)— no HMAC. Any authenticated caller can editCreatedAt/IDin the JSON, e.g. setCreatedAtto0001-01-01to force a full-table-scan-then-truncate againstmessages. Thecursor.AgentIDcheck prevents cross-tenant reach, but a single bearer can repeatedly request expensive windows. Bounded by partial indexes today; integrity gap is real.Fix sketch: HMAC the cursor JSON with
cfg.Signing.HMACSecret(same pattern as the approval-token signer ininternal/approvaltoken/token.go). Reject on tamper.M3 — Magic-link approval tokens replay-able within TTL
Where:
internal/approvaltoken/token.go, used byinternal/agent/hitl_magic_api.goA signed approval / reject token is valid for the entire message TTL plus a 10-minute grace. The status-race guard at approve/reject time blocks repeated state changes once the message is finalized, but if a token leaks (mail archive scrape, Slack-channel paste, Gmail link-tracking) before the reviewer acts, an attacker who reaches the link can race them.
Fix sketch: Either (a) track
magic_link_consumed_aton the messages table (new column + migration + index + handler check), or (b) maintain a single-usejtitable that the handler INSERTs into and lets the unique constraint reject replay.M4 — DKIM private keys stored unencrypted at rest
Where:
migrations/014_per_domain_dkim.sql,internal/identity/store.go:350-364(GetDKIMKeyInternal)Per-domain RSA-2048 private keys live in
domains.dkim_private_key BYTEAwith no envelope encryption. A DB-read compromise (backup leak, replica access) yields "sign as any user's domain" capability.Fix sketch: Envelope-encrypt with a KEK derived via HKDF from
cfg.Signing.HMACSecret(same secret-rotation pattern as OAuth signing), or move to an external KMS for hosted deployments. Migration needs to re-encrypt existing rows in a backwards-compatible window.M5 — Webhook delivery vulnerable to DNS rebinding (SSRF)
Where:
internal/webhook/deliverer.go:67-105, validation only at registration ininternal/agent/api.go:66-98(ValidateWebhookURL)ValidateWebhookURLresolves the host and rejects private IPs at agent-registration time. At delivery time, onlyrequireHTTPSis checked. The underlying DNS may have been rebound to127.0.0.1or RFC1918.CheckRedirectblocks 30x → loopback, but the initial DNS lookup of the registered hostname is the SSRF vector — registerevil.example.com(public IP, passes validation), later change DNS to127.0.0.1, e2a then POSTs the inbound payload to localhost.Fix sketch: Use a custom
net.Dialerwith aControlcallback that rejects RFC1918/loopback per-connection, mounted as theTransport.DialContexton the webhookhttp.Client. Re-resolves and re-checks on every connection regardless of caching.M8 — No CSRF token on
/api/oauth/consentWhere:
internal/agent/oauth_handlers.go:658-755, session cookie atinternal/auth/auth.go:233(SameSite: http.SameSiteLaxMode)Consent POST has no anti-CSRF token. SameSite=Lax blocks cross-site POSTs reliably in current browsers, so this is defensible — but a future browser/extension default-downgrade or a same-site subdomain XSS could mount the consent flow without user intent and silently authorize an MCP client to a chosen agent. Defense in depth.
Fix sketch: Bind a one-time CSRF token to the session cookie, mint on the
/oauth/consentGET, verify on POST. Standard double-submit or per-session origin-bound token.M11 —
formatExpiresIn× 4, none tickWhere:
web/src/app/(app)/dashboard/agents/messages/view/page.tsx:53,web/src/app/(app)/dashboard/pending/page.tsx:28,web/src/app/(app)/dashboard/pending/_components/PendingDetailPanel.tsx:55,web/src/app/(app)/api-keys/page.tsx:38formatExpiresInis reimplemented 4 times. None of the surfaces have a tick interval — the "expires in 47m" label capturesDate.now()at render and never refreshes, so a reviewer who leaves the page open for 30 minutes sees stale numbers until SWR's window-focus revalidation kicks the data over. UX freshness, not correctness.Fix sketch: Extract a shared
<RelativeTime iso={...} kind="expires" />component that owns its own 30-60ssetInterval(mirroring the existingtickpattern indashboard/page.tsx). Replace all 4 callsites.Sequencing suggestion
Each is independent — they can land in any order or be cherry-picked as ad-hoc work permits.