fix(scratchnode): never silently drop a cold-load send (PR B — queue until live room ready)#445

Merged
HomenShum merged 2 commits into main from fix/coldload-send-race
Jun 1, 2026
@HomenShum
Owner

PR B of the /ask production-grade sprint. Frontend-only, additive.

Root cause (found in PR #443 live verification)

The Convex-routing send override is installed only AFTER the async init (lazy-load the browser client from esm.sh, then await joinEvent). Until then, window.sendComposerMessage is still the prototype-only handler: it clears the composer and renders locally but NEVER persists. A public chat or /ask fired in that cold-load window is silently lost. This was the first-send failure observed live (not an Enter-key bug, and not the member-row race, which the not_joined auto-retry already covers).

Fix

  • Early queueing shim installed before the client-load + join awaits: public sends buffer into _sn_pendingSends (not lost), composer clears, shows a Connecting to live room… hint. Private notes still pass through.
  • Real override drains the queue on install — replays each buffered draft through the full sendMessage → askAgent path, in order.
  • On init failure, _sn_failPendingSends restores the draft instead of dropping it.
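To make the shim and drain steps above concrete, here is a minimal sketch that runs outside the browser. It is not the code in home-v5.html: `room`, `composer`, `isPrivateNote`, `installQueueingShim`, and `installRealSend` are hypothetical stand-ins for the page globals, and how a private note is detected is an assumption made only for the demo.

```javascript
// Hypothetical stand-ins: `composer` mimics the input element, `room` the page globals.
function installQueueingShim(room, composer, showHint) {
  const prototypeSend = room.sendComposerMessage; // local-only handler
  room.pendingSends = [];
  room.sendComposerMessage = function queuedSend() {
    const draft = composer.value.trim();
    if (!draft) return;
    if (room.isPrivateNote(draft)) {
      // Private notes do not need the live room, so they pass straight through.
      prototypeSend();
      return;
    }
    room.pendingSends.push(draft); // buffer instead of dropping
    composer.value = '';
    showHint('Connecting to live room…');
  };
}

function installRealSend(room, composer, realSend) {
  room.sendComposerMessage = realSend;
  const queued = room.pendingSends.splice(0); // take the buffer and empty it
  for (const draft of queued) {
    composer.value = draft;
    room.sendComposerMessage(); // replay through the real path
  }
}

// Demo with fake handlers.
const composer = { value: '' };
const hints = [];
const renderedLocally = [];
const sent = [];
const room = {
  sendComposerMessage: () => { renderedLocally.push(composer.value); composer.value = ''; },
  isPrivateNote: (text) => text.startsWith('/note'), // assumption for the demo
};

installQueueingShim(room, composer, (msg) => hints.push(msg));
composer.value = '/note remember to follow up';
room.sendComposerMessage(); // passes through to the prototype handler
composer.value = '/ask what is on the agenda?';
room.sendComposerMessage(); // buffered
composer.value = 'hello room';
room.sendComposerMessage(); // buffered

installRealSend(room, composer, () => { sent.push(composer.value); composer.value = ''; });
console.log(sent);
```

The replay here calls a synchronous fake; with a real asynchronous send path the loop would need to wait for each send before starting the next to keep the order.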

Verification

All 4 inline <script> blocks pass node --check (live-room block is type=module/strict). Behavioral cold-load verification runs post-deploy on the live showcase per live_dom_verification.

🤖 Generated with Claude Code

…the live room is ready

PR B of the /ask launch-readiness sprint.

Root cause (found during PR #443 live verification): in home-v5.html the Convex-
routing send override is installed only AFTER the async init — lazy-loading the
browser client from esm.sh, then `await joinEvent`. Until that completes,
window.sendComposerMessage is still the prototype-only handler: it clears the
composer and renders locally but NEVER persists to Convex. A public chat / `/ask`
fired in that cold-load window is silently lost — the user believes it sent. This
is exactly the first-send failure observed live (it was NOT an Enter-key bug, and
NOT the member-row race — the not_joined auto-retry already covers that).

Fix (additive, frontend-only):
- Install an early queueing shim BEFORE the client-load + join awaits. Public
  sends are buffered into window._sn_pendingSends (not lost); the composer clears
  and shows a "Connecting to live room…" hint. Private notes still pass straight
  to the prototype handler.
- The real Convex override drains the queue the moment it installs — replaying
  each buffered draft through the full sendMessage → askAgent path, in order.
- If init fails (client load or join), _sn_failPendingSends restores the most
  recent un-sent draft to the composer + toasts, instead of dropping it.

Verification: all 4 inline <script> blocks pass `node --check` (the live-room
block is type=module / strict mode). Behavioral cold-load verification runs
post-deploy on the live showcase per .claude/rules/live_dom_verification.md
(static proto HTML isn't covered by tsc/build).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@HomenShum HomenShum enabled auto-merge (squash) May 30, 2026 23:39
@vercel

vercel Bot commented May 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| nodebench-ai | Ready | Preview, Comment | Jun 1, 2026 5:04pm |

@augmentcode

augmentcode Bot commented May 30, 2026

🤖 Augment PR Summary

Summary: Prevents ScratchNode “cold-load” public sends from being silently dropped before the Convex live-room client finishes initializing.

Changes:

  • Installs an early queueing shim for window.sendComposerMessage to buffer public drafts during client lazy-load + joinEvent
  • Clears the composer and shows a “Connecting to live room…” hint while buffering
  • Restores the most recent pending draft (and notifies) if initialization fails
  • Invokes the failure-restore path on Convex client import failure and joinEvent failure
  • Drains any queued drafts once the real Convex-routing send override is installed


@augmentcode augmentcode Bot left a comment

Review completed. 2 suggestions posted.

Comment augment review to trigger a new review at any time.

Comment thread public/proto/home-v5.html
if (_ci) {
  for (let _i = 0; _i < _queued.length; _i += 1) {
    _ci.value = _queued[_i];
    window.sendComposerMessage();

@augmentcode augmentcode Bot May 30, 2026

public/proto/home-v5.html:5734 — This drains _sn_pendingSends by calling window.sendComposerMessage() in a tight loop, but the real send path is async (sendMessage/askAgent) so these will run concurrently. That can break the stated “in order” replay guarantee and potentially reorder posts under load.

Severity: medium
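One way to address the ordering concern raised above is to await each send before starting the next. The sketch below is illustrative only: `drainInOrder` and `fakeSend` are hypothetical names, and the fake transport deliberately makes the first draft the slowest so that a concurrent drain would arrive out of order.

```javascript
// Replays drafts one at a time; `sendOne` is a hypothetical async send function.
async function drainInOrder(queue, sendOne) {
  while (queue.length > 0) {
    const draft = queue.shift();
    await sendOne(draft); // the next draft waits until this one settles
  }
}

// Fake transport: the first draft takes longest to arrive.
const arrivalOrder = [];
function fakeSend(draft) {
  const delayMs = draft === 'first' ? 30 : 5;
  return new Promise((resolve) => {
    setTimeout(() => {
      arrivalOrder.push(draft);
      resolve();
    }, delayMs);
  });
}

const drainDone = drainInOrder(['first', 'second', 'third'], fakeSend);
drainDone.then(() => console.log(arrivalOrder));
```

In this sketch a rejected send stops the drain and leaves the remaining drafts in the queue; whether to retry or restore them is a separate design choice.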

Comment thread public/proto/home-v5.html
} catch (e) {
  console.warn('[scratchnode] Convex client load failed, live room unavailable:', e.message);
  showLiveRoomError('Could not load the realtime client. <a href="javascript:location.reload()" style="color:#f1d6c8;text-decoration:underline">Retry</a>');
  window._sn_failPendingSends && window._sn_failPendingSends();

@augmentcode augmentcode Bot May 30, 2026

public/proto/home-v5.html:5472 — After an init failure, _sn_failPendingSends() runs but the queueing shim remains installed as window.sendComposerMessage, so subsequent clicks can keep clearing/queueing drafts with no path to ever drain (and without another restore call). This can reintroduce “silent loss” behavior after an error and/or allow _sn_pendingSends to grow until reload.

Severity: medium

Other Locations
  • public/proto/home-v5.html:5495
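A possible shape for the fix suggested above is to swap the shim out as part of the failure path, so later sends neither clear the composer nor grow the queue. This is a sketch under assumptions, not the merged code: `failPendingSends`, `room`, `composer`, and the toast wording are hypothetical.

```javascript
// Restores the newest buffered draft, then replaces the shim with a handler
// that leaves the composer alone. All names here are illustrative.
function failPendingSends(room, composer, toast) {
  const queued = room.pendingSends.splice(0); // empty the buffer
  const latest = queued[queued.length - 1];
  if (latest !== undefined && !composer.value) {
    composer.value = latest;
    toast('Could not connect. Your draft was restored.');
  }
  room.sendComposerMessage = function offlineSend() {
    // composer.value is deliberately left untouched
    toast('Live room unavailable. Reload to retry.');
  };
}

// Demo: two drafts were buffered before init failed.
const composer = { value: '' };
const toasts = [];
const room = {
  pendingSends: ['older draft', 'newest draft'],
  sendComposerMessage: () => {},
};

failPendingSends(room, composer, (msg) => toasts.push(msg));
room.sendComposerMessage(); // user presses send again after the failure
console.log(composer.value, room.pendingSends.length, toasts.length);
```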

@HomenShum HomenShum merged commit de152fa into main Jun 1, 2026
16 checks passed
@HomenShum HomenShum deleted the fix/coldload-send-race branch June 1, 2026 17:15
@github-actions

github-actions Bot commented Jun 1, 2026

Demo: walkthrough of the surfaces this PR changed is available as a workflow artifact (pr-demo-445) at https://github.com/HomenShum/nodebench-ai/actions/runs/26770240385

HomenShum added a commit that referenced this pull request Jun 1, 2026
… (PR C) (#446)

PR C of the /ask launch-readiness sprint. Backend-only, additive (new query,
no schema/contract change).

Launch ops can't run /ask blind. getAskTelemetry(eventId) is a bounded, read-only
aggregate over an event's answers that surfaces the operate-the-launch signals:
  - mode mix { provider, cache, deterministic, provider_fallback }
  - PROVIDER FAILURE RATE = provider_fallback / provider ATTEMPTS (cache +
    deterministic excluded from the denominator — they never reached the provider)
  - quality pass rate + avg score (from the deterministic answer evaluation)
  - total estimated cost (cents) and avg provider latency (from the provider_llm
    trace step)
  - live-search count

Honesty (agentic_reliability):
  - BOUND: scan capped at ≤1000; `capped` flag surfaced when the window is full.
  - HONEST_SCORES: every value is computed from real rows; rates are NULL (not a
    fabricated 0% / 100%) when there's no denominator — the UI must render "—",
    never invent "100% healthy" from zero data.
  - No private data: liveEventAnswers are public; the query never touches userNotes.
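The failure-rate and null-rate rules above can be illustrated with a small standalone function. This is a sketch of the arithmetic only, not the Convex query: `summarizeAsk` is a hypothetical name, the sample rows are made up, and treating provider attempts as `provider + provider_fallback` and `capped` as "scanned rows reached the cap" are interpretations of the description.

```javascript
// Hypothetical aggregate over answer rows; only `agentMode` is read here.
function summarizeAsk(answers, cap = 1000) {
  const scanned = answers.slice(0, cap); // bounded scan
  const modeMix = { provider: 0, cache: 0, deterministic: 0, provider_fallback: 0 };
  for (const answer of scanned) {
    if (Object.prototype.hasOwnProperty.call(modeMix, answer.agentMode)) {
      modeMix[answer.agentMode] += 1;
    }
  }
  // Cache and deterministic answers never reached the provider,
  // so they are excluded from the denominator.
  const attempts = modeMix.provider + modeMix.provider_fallback;
  return {
    modeMix,
    // null, not 0, when there is no denominator
    providerFailureRate: attempts > 0 ? modeMix.provider_fallback / attempts : null,
    capped: scanned.length >= cap,
  };
}

// Made-up sample rows for illustration.
const sampleAnswers = [
  { agentMode: 'provider' },
  { agentMode: 'provider' },
  { agentMode: 'provider' },
  { agentMode: 'provider_fallback' },
  { agentMode: 'cache' },
  { agentMode: 'cache' },
  { agentMode: 'deterministic' },
];

const mixedRoom = summarizeAsk(sampleAnswers);
const emptyRoom = summarizeAsk([]);
console.log(mixedRoom.providerFailureRate, emptyRoom.providerFailureRate);
```

With these sample rows the rate is 1 fallback out of 4 provider attempts, and the empty room yields `null` so a UI can render "—" instead of a percentage.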

Tests (convex/__tests__/scratchnode.events.test.ts): +3 scenario tests — the full
aggregate from a 7-answer mixed-mode room, the HONEST_SCORES empty-room null case,
and the BOUND cap/`capped` flag.

Follow-up (separate frontend PR, after PR #445 lands): surface this in a host
"ask health" line + a degraded badge on provider_fallback answers.

Verification: convex codegen 0, tsc 0, vitest 57 passed / 1 skipped, build 0.

Co-authored-by: hshum <hshum@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
HomenShum pushed a commit that referenced this pull request Jun 1, 2026
… restore + scenario tests

The first /ask sent in the sub-second cold-load window (before the
liveEventMembers row commits) can reject not_joined. PR #445 already added the
pre-init send queue + idempotent join+resend retry; this closes the remaining
gap:

- home-v5.html: the final-catch draft restore now guards `if (!input.value)`
  like its sibling _sn_failPendingSends, so a total-failure restore never
  clobbers a newer draft typed while the send was in flight; the toast stays
  honest about whether it actually repopulated.
- scratchnode.events.test.ts: two scenario tests pin the recovery contract — a
  pre-join send rejects not_joined and persists nothing, then join+resend lands
  exactly one message (not lost, not duplicated); and the idempotent re-join
  during recovery never forks a second member row.
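The guarded restore described in the first bullet can be sketched as follows. The helper name `restoreDraftIfEmpty`, the stand-in `input` objects, and the toast wording are assumptions for illustration; only the `if (!input.value)` guard comes from the commit message.

```javascript
// Restores a failed draft only when the composer is empty, and reports
// honestly whether it repopulated. Names and messages are illustrative.
function restoreDraftIfEmpty(input, failedDraft, toast) {
  if (!input.value) {
    input.value = failedDraft;
    toast('Send failed. Your draft was restored.');
    return true;
  }
  // A newer draft is already in the composer, so it is left alone.
  toast('Send failed. Your newer draft was left in place.');
  return false;
}

const toasts = [];
const emptyInput = { value: '' };
const busyInput = { value: 'typed while the send was in flight' };

const restoredEmpty = restoreDraftIfEmpty(emptyInput, '/ask first question', (m) => toasts.push(m));
const restoredBusy = restoreDraftIfEmpty(busyInput, '/ask first question', (m) => toasts.push(m));
console.log(restoredEmpty, restoredBusy, busyInput.value);
```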

Additive only — no sendMessage/joinEvent contract changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
HomenShum added a commit that referenced this pull request Jun 1, 2026
…447)

* feat(scratchnode): visible degraded badge on provider-fallback /ask answers

Completes the honest-degraded-UX half of the /ask observability work. When the AI
provider is unavailable, askAgent answers from public sources only and records
agentMode=provider_fallback. renderAnswer already LABELLED that ("AI fallback ·
deterministic") but rendered it as neutral text, so a reader could mistake a
degraded answer for a full AI one.

Adds an amber "degraded · sources only" pill (icon + text, never colour alone for
a11y; role=status so screen readers announce it; title tooltip explains it may be
less complete) in the answer head, shown ONLY for provider_fallback answers.
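As a rough illustration of the badge rule above, the following sketch builds the pill markup as a string so it can run without a DOM. It is not the renderAnswer code: `degradedBadgeHtml`, the class name, the icon character, and the tooltip wording are hypothetical; the `role="status"` attribute, the label text, and the provider_fallback condition come from the description.

```javascript
// Returns pill markup only for provider_fallback answers, otherwise ''.
function degradedBadgeHtml(answer) {
  if (answer.agentMode !== 'provider_fallback') return '';
  const tooltip = 'AI provider was unavailable; this answer may be less complete.';
  return (
    '<span class="degraded-pill" role="status" title="' + tooltip + '">' +
    '<span aria-hidden="true">⚠</span> degraded · sources only' + // icon plus text, not colour alone
    '</span>'
  );
}

const fallbackBadge = degradedBadgeHtml({ agentMode: 'provider_fallback' });
const providerBadge = degradedBadgeHtml({ agentMode: 'provider' });
console.log(fallbackBadge);
```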

Verified: all 4 inline <script> blocks pass node --check (the live-room block is
type=module / strict). Renders only on the rare fallback path; Tier-A live-DOM
check post-deploy confirms the code shipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(scratchnode): harden cold-load send race — guard not_joined draft restore + scenario tests

The first /ask sent in the sub-second cold-load window (before the
liveEventMembers row commits) can reject not_joined. PR #445 already added the
pre-init send queue + idempotent join+resend retry; this closes the remaining
gap:

- home-v5.html: the final-catch draft restore now guards `if (!input.value)`
  like its sibling _sn_failPendingSends, so a total-failure restore never
  clobbers a newer draft typed while the send was in flight; the toast stays
  honest about whether it actually repopulated.
- scratchnode.events.test.ts: two scenario tests pin the recovery contract — a
  pre-join send rejects not_joined and persists nothing, then join+resend lands
  exactly one message (not lost, not duplicated); and the idempotent re-join
  during recovery never forks a second member row.

Additive only — no sendMessage/joinEvent contract changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(scratchnode): re-trigger CI (Tier B Vercel preview-poll flake on #447)

---------

Co-authored-by: hshum <hshum@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2 participants