fix(tui): prevent stale status/fix-jobs responses from overwriting fresher data#598

Open
cpcloud wants to merge 10 commits into roborev-dev:main from cpcloud:worktree-snazzy-hatching-backus

Conversation

@cpcloud
Contributor

@cpcloud cpcloud commented Mar 30, 2026

Summary

Closes #596.

  • Add loadingStatus and loadingFixJobs in-flight guard booleans matching the existing loadingJobs pattern, preventing overlapping tick-based fetches from racing
  • Introduce startFetchStatus() and startFetchFixJobs() helpers that check the loading flag before dispatching, used at all call sites (tick, reconnect, view switch, fix trigger, patch apply)
  • Add statusErrMsg type so status fetch errors clear the loading flag instead of going through the generic errMsg handler
  • Add requestFetchFixJobs() for mutation handlers (fix enqueue, patch apply) that sets a fixJobsStale deferred-refresh flag when a fetch is already in flight — handleFixJobsMsg dispatches a follow-up fetch when this flag is set
  • Force-reset loadingFixJobs on reconnect so a stale in-flight request from the dead connection doesn't block all future fix-jobs fetches

🤖 Generated with Claude Code

cpcloud and others added 5 commits March 30, 2026 13:44
…esher data (roborev-dev#596)

Add loadingStatus and loadingFixJobs in-flight guards matching the
existing loadingJobs pattern. When a fetch is already in flight,
handleTickMsg skips dispatching another, preventing overlapping
requests from racing and overwriting fresher data with stale responses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add startFetchStatus and startFetchFixJobs helpers that check the
loading flag before dispatching. Use them at all call sites — tick
handler, reconnect, view switch, fix trigger, and patch apply — so
no path can launch a concurrent request while one is already in flight.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tations

On reconnect, force-reset loadingFixJobs and dispatch a fresh fetch
(matching the existing status handling) so a stale in-flight request
from the dead connection doesn't block all future fix-jobs fetches.

For state-mutating handlers (fix enqueue, patch apply, rebase), use
requestFetchFixJobs which sets a fixJobsStale flag when a fetch is
already in flight. handleFixJobsMsg checks this flag and dispatches
a follow-up fetch to pick up post-mutation state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The fixJobsStale check already covers both success and error paths
(it's outside the if/else), but add an explicit test to prove it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Route fetchStatus/fetchFixJobs in handleSSEEventMsg and
consumeSSEPendingRefresh through the startFetch helpers, consistent
with the tick and reconnect paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cpcloud cpcloud force-pushed the worktree-snazzy-hatching-backus branch from 5e9a2d1 to f90e1a1 on March 30, 2026 at 17:47
@roborev-ci

roborev-ci bot commented Mar 30, 2026

roborev: Combined Review (f90e1a1)

Verdict: Changes improve TUI fetch coordination, but they still leave medium-severity stale-state races in status/fix-job refresh handling.

Medium

  1. Stale refreshes can be dropped while a fetch is in flight

    • Location: cmd/roborev/tui/handlers_msg.go:729, cmd/roborev/tui/handlers_msg.go:751, cmd/roborev/tui/handlers_msg.go:864
    • The new startFetchStatus / startFetchFixJobs guards suppress SSE- and tick-driven refreshes whenever a request is already in flight, but suppressed refreshes are not reliably replayed. fixJobsStale is only set on local mutation paths, and status has no equivalent pending-refresh tracking. If the in-flight request completes with pre-change data, the UI can keep showing stale status or fix-job state until a later poll or event happens.
    • Suggested fix: Track a pending refresh whenever an SSE/tick refresh is skipped, then trigger a follow-up fetch when the current request completes, or add request generations and ignore older responses.
  2. Reconnect can be overwritten by late pre-reconnect responses

    • Location: cmd/roborev/tui/handlers_msg.go:805
    • handleReconnectMsg forces fresh status and fix-job fetches, but it does not invalidate earlier in-flight requests. Because statusMsg / statusErrMsg and fixJobsMsg are not tied to a request generation, a late response or error from before reconnect can overwrite the newer post-reconnect state or incorrectly retrigger connection-error handling.
    • Suggested fix: Add per-request generations/tokens and discard stale responses after reconnect, or otherwise prevent older requests from updating state.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

SSE events signal daemon state changes, so use requestFetchStatus and
requestFetchFixJobs in the SSE handlers. When a fetch is already in
flight, these set a stale flag so handleStatusMsg/handleFixJobsMsg
dispatch a follow-up fetch to pick up post-event state.

Also clear statusStale on reconnect to avoid stale follow-ups from
the dead connection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@roborev-ci

roborev-ci bot commented Mar 30, 2026

roborev: Combined Review (e751b17)

Verdict: Changes are close, but one medium-severity correctness issue remains in the reconnect path.

Medium

  • cmd/roborev/tui/handlers_msg.go:812: handleReconnectMsg starts fresh fetchStatus() and fetchFixJobs() requests on reconnect even if older requests are still in flight, but the corresponding statusMsg and fixJobsMsg handlers still lack a request generation/sequence check. An older pre-reconnect response can therefore arrive later and overwrite newer state, reintroducing the stale/out-of-order update bug on reconnect.
    Suggested fix: add a monotonically increasing generation/sequence to status and fix-jobs requests, and ignore responses from older generations.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Increment a monotonic fetchGen counter on reconnect and embed it in
statusMsg, statusErrMsg, and fixJobsMsg. Response handlers discard
messages from older generations, preventing a late pre-reconnect
response from overwriting post-reconnect state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@roborev-ci

roborev-ci bot commented Mar 30, 2026

roborev: Combined Review (8869dbc)

Verdict: Changes are close, but there is one medium-severity race in the reconnect stale-response handling that should be fixed before merge.

Medium

  • cmd/roborev/tui/handlers_msg.go:225

  • cmd/roborev/tui/handlers_msg.go:439

  • cmd/roborev/tui/handlers_msg.go:957

    The stale-response guard is applied too late in handleStatusMsg, handleFixJobsMsg, and handleStatusErrMsg: loadingStatus / loadingFixJobs are cleared before checking msg.gen < m.fetchGen. After a reconnect, an old pre-reconnect response can therefore clear the active in-flight guard for the new generation, allowing duplicate fetches on the next tick/SSE event and reopening the stale-overwrite race this patch is trying to prevent.

    Fix by checking msg.gen < m.fetchGen before mutating the loading flags, or by tying loading-flag clearing to a request-specific token so only the active fetch can reset it.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

cpcloud and others added 2 commits March 30, 2026 15:55
…flags

Increment fetchSeq on reconnect so pre-reconnect job responses are
also discarded (jobs already had seq-based staleness checking).

Move loadingStatus/loadingFixJobs flag clearing after the fetchGen
check so stale pre-reconnect responses don't clear the in-flight
guard for the current-generation request.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reset loadingMore during reconnect so a stale pre-reconnect pagination
response (now discarded by fetchSeq) doesn't leave the flag stuck true,
which would cause future tick/SSE refreshes to skip full job reloads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@roborev-ci

roborev-ci bot commented Mar 30, 2026

roborev: Combined Review (64f6ddc)

Summary verdict: Changes are mostly sound, but there is one medium-severity reconnect bug that can leave fix jobs stale or hidden in the TUI.

Medium

  • cmd/roborev/tui/handlers_msg.go:817: handleReconnectMsg only resets and re-fetches fix jobs when the pre-reconnect tasksWorkflowEnabled() state is already true. If the daemon enables tasks after reconnect, or an older fix-jobs request was still in flight while local state was false, reconnect can skip the refresh and leave loadingFixJobs / hasActiveFixJobs stale. That can hide active fix jobs in the queue view and may block later startFetchFixJobs calls if loadingFixJobs remains stuck true.

Suggested fix: clear loadingFixJobs and fixJobsStale on every reconnect, and trigger the reconnect refresh independently of stale pre-reconnect task state, either unconditionally or after the first post-reconnect status update confirms task support.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Clear loadingFixJobs and fixJobsStale unconditionally on reconnect,
not just when tasksWorkflowEnabled() is true. Prevents a stuck
loadingFixJobs flag if a pre-reconnect request was in flight while
tasks were disabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@roborev-ci

roborev-ci bot commented Mar 30, 2026

roborev: Combined Review (967fd37)

Verdict: No medium-or-higher issues found; the change looks clean based on all three reviews.

All reviewers agreed there are no findings to report at Medium, High, or Critical severity. The change appears to correctly add in-flight guards, generation-based stale-response protection, and regression tests for the TUI fetch coordination behavior.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@cpcloud
Contributor Author

cpcloud commented Mar 30, 2026

Failing CI is due to a race condition that I might've introduced in #595. Working on it now.
