Add image-load + email-dispatch telemetry #8

Open
sandsower wants to merge 5 commits into main from instrumentation-image-failures

@sandsower
Owner

Two failure surfaces had no telemetry.

Client-side (PostHog):

  • image_load_failed from <img onerror> on item card, preview, detail, map popup
  • image_upload_failed on /api/upload non-2xx
  • report_form_failed on /api/items non-2xx, with reason
  • claim_code_email_failed when API returns claim_code_sent=false
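
A minimal sketch of how the three API-driven events above might be derived from a response. The function name `failureEventFor`, the `ApiResult` shape, the `status` property, and the assumption that `claim_code_sent` arrives on the `/api/items` response are illustrative guesses, not the PR's actual code. Only the event names, the two endpoints, and the `claim_code_sent` flag come from the description.

```typescript
// Hypothetical shapes: only the event names, paths and claim_code_sent come from the PR text.
type ApiResult = {
  path: "/api/upload" | "/api/items";
  status: number;
  body?: { reason?: string; claim_code_sent?: boolean };
};

type TelemetryEvent = { name: string; props: Record<string, unknown> };

// Map one API result to at most one failure event.
function failureEventFor(result: ApiResult): TelemetryEvent | null {
  const ok = result.status >= 200 && result.status < 300;

  if (!ok && result.path === "/api/upload") {
    return { name: "image_upload_failed", props: { status: result.status } };
  }
  if (!ok && result.path === "/api/items") {
    return {
      name: "report_form_failed",
      props: { status: result.status, reason: result.body?.reason ?? "unknown" },
    };
  }
  // Assumption: a 2xx from /api/items can still say the claim-code email was not sent.
  if (ok && result.path === "/api/items" && result.body?.claim_code_sent === false) {
    return { name: "claim_code_email_failed", props: {} };
  }
  return null;
}
```

In the browser, a non-null result would presumably be forwarded to `posthog.capture(event.name, event.props)`.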

Email dispatch audit (database):
_dispatch_email was discarding pg_net's request_id, so an async failure anywhere along the path (queue → Edge Function → Resend) left no breadcrumb. Now every dispatch is recorded in private.email_dispatches, and a new private.email_dispatch_status view joins it with net._http_response.

Diagnose a missing email:

select * from private.email_dispatch_status
where recipient = 'user@example.com'
order by dispatched_at desc;

outcome is one of: config_missing | queue_failed | pending | timed_out | error | sent | http_error.

Track failures we currently have no signal on:
- image_load_failed: <img onerror> on item card, preview, detail, map popup
- image_upload_failed: /api/upload non-2xx
- report_form_failed: /api/items non-2xx with reason
- claim_code_email_failed: API returns claim_code_sent=false

_dispatch_email previously discarded pg_net's request_id, so a queued
email that never reached Resend left no breadcrumb. Now every dispatch
inserts into private.email_dispatches with the request_id, and a new
private.email_dispatch_status view joins it with net._http_response to
show outcome (config_missing|queue_failed|pending|timed_out|error|
sent|http_error).

Diagnose a missing email with:
  select * from private.email_dispatch_status
  where recipient = 'user@example.com'
  order by dispatched_at desc;

[high] private.email_dispatches stored recipient PII with no expiry. Extended
cleanup_expired_items to (a) cascade-delete dispatches tied to expiring items
and (b) prune any dispatch older than 30 days regardless of item. Added
dispatches_deleted to the cron's return JSON; data-retention worker only
reads image_paths so the new key is non-breaking.

[medium] _dispatch_email only read 'itemId' and 'to', so contact, reply, and
support dispatches landed with null item_id/recipient. Now coalesces:
- item id: itemId | item_id
- recipient: to | posterEmail | recipient_email | requesterEmail

Verified locally with one dispatch per payload shape — all four record the
correct item_id (or null for support_request, which has none) and recipient.
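
The key fallback described above can be modelled in a few lines. This is a TypeScript sketch of the coalescing rule only; the real logic lives in the `_dispatch_email` database function, and the names `firstString` and `auditFields` are hypothetical. The key names and their order are taken from the commit message.

```typescript
// Hypothetical payload type: the key names come from the commit message, the rest is assumed.
type DispatchPayload = Record<string, unknown>;

// Return the value of the first listed key that holds a string, else null.
function firstString(payload: DispatchPayload, keys: string[]): string | null {
  for (const key of keys) {
    const value = payload[key];
    if (typeof value === "string") return value;
  }
  return null;
}

// Extract the two audit columns from any of the payload shapes.
function auditFields(payload: DispatchPayload): { itemId: string | null; recipient: string | null } {
  return {
    itemId: firstString(payload, ["itemId", "item_id"]),
    recipient: firstString(payload, ["to", "posterEmail", "recipient_email", "requesterEmail"]),
  };
}
```

A payload with no item key, such as a support request, yields `itemId: null` while still resolving a recipient.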

[high] private.email_dispatches now has item_id with REFERENCES
public.items(id) ON DELETE CASCADE. Item deletion via delete_item
(GDPR erasure RPC), admin delete (PostgREST), and cleanup_expired_items
all drop audit rows automatically. cleanup_expired_items simplifies to
just the 30-day TTL prune for orphan rows (no item_id, e.g.
support_request).

[medium] image_url removed from image_load_failed payloads in
ItemCard, ItemPreview, item/[id]. PostHog only receives surface +
item_id; the URL is recoverable via item lookup if a maintainer
needs it. Privacy-copy promise honored.
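
A minimal sketch of an error handler that sends only surface and item_id, as described above. The `Surface` spellings, the `Capture` callback type, the `imageErrorHandler` name, and the report-once guard are assumptions added for illustration; the components in the PR may be wired differently.

```typescript
// Surfaces named in the PR; the exact identifier spellings are assumed.
type Surface = "item_card" | "preview" | "detail" | "map_popup";

// Stand-in for an analytics capture call, injected so the sketch stays self-contained.
type Capture = (event: string, props: Record<string, unknown>) => void;

// Build an onerror handler that reports surface + item_id and never the image URL.
function imageErrorHandler(capture: Capture, surface: Surface, itemId: string): () => void {
  let reported = false; // assumption: report once per element so repeated errors do not flood
  return () => {
    if (reported) return;
    reported = true;
    capture("image_load_failed", { surface, item_id: itemId });
  };
}
```

The returned function would be attached to the `<img>` element's error event, with `capture` delegating to `posthog.capture`.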

Verified: dispatch row inserted with item_id; deleting the item cascaded
and left 0 dispatch rows. svelte-check 0 errors.
…net rows

[high] _dispatch_email was reachable via PostgREST as
rpc('_dispatch_email', { payload }) because PostgreSQL grants EXECUTE
to PUBLIC by default. Anon could send arbitrary mail through Resend
and inject attacker-controlled rows into the audit table. Added
revoke for public/anon/authenticated; only postgres + service_role
retain EXECUTE. Verified with set role anon → permission denied.

[medium] pg_net purges net._http_response on a ~6h clock; audit rows
live 30 days. Without an age check, a sent or failed dispatch became
indistinguishable from an in-flight one once pg_net pruned its response. Added
an unknown_response_expired case for audit rows older than 6h with no
response row. Verified: backdated synthetic row → unknown_response_expired,
recent synthetic row → pending.
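
The outcome rules can be modelled as a small classifier. This is a TypeScript sketch under stated assumptions, not the view's SQL: the `DispatchRow` shape, the `configMissing` flag, the reading of a null request id as `queue_failed`, and the order of the checks are all guesses. The outcome names and the 6h threshold come from the PR text.

```typescript
// Hypothetical row shape: one audit row left-joined with its pg_net response, if any.
type DispatchRow = {
  configMissing: boolean;    // assumed flag: required settings were absent at dispatch time
  requestId: number | null;  // pg_net request id; null is assumed to mean queueing failed
  dispatchedAt: Date;
  response: null | { timedOut: boolean; errorMsg: string | null; statusCode: number | null };
};

type Outcome =
  | "config_missing" | "queue_failed" | "pending" | "unknown_response_expired"
  | "timed_out" | "error" | "sent" | "http_error";

const RESPONSE_TTL_MS = 6 * 60 * 60 * 1000; // the ~6h pg_net retention from the commit message

function classify(row: DispatchRow, now: Date): Outcome {
  if (row.configMissing) return "config_missing";
  if (row.requestId === null) return "queue_failed";
  if (row.response === null) {
    // No response row: either still in flight, or pg_net already pruned it.
    const ageMs = now.getTime() - row.dispatchedAt.getTime();
    return ageMs > RESPONSE_TTL_MS ? "unknown_response_expired" : "pending";
  }
  if (row.response.timedOut) return "timed_out";
  if (row.response.errorMsg !== null) return "error";
  const code = row.response.statusCode;
  return code !== null && code >= 200 && code < 300 ? "sent" : "http_error";
}
```

The age check is what separates a dispatch that is still in flight from one whose response has already been pruned.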