fix(test): retry TUI chat correlation E2E on transient gateway races#4382

Draft
hunglp6d wants to merge 3 commits into NVIDIA:main from hunglp6d:fix/nightly-e2e-tui-chat-correlation-race-1daf081

hunglp6d commented May 28, 2026

Summary

The openclaw-tui-chat-correlation-e2e live test retried only on the "zero-event capture" transient race, not on two other transient gateway patterns that cause intermittent nightly failures (e.g. run 26546628518). This PR adds detection for both additional patterns and extends the retry loop from 1 to 2 retries (3 total attempts), while preserving signal for real partial correlation regressions.

Related Issue

Fixes #4383

Changes

  • Add looksLikeTotalCorrelationRace() to detect the transient pattern where all chat events are uncorrelated and later user turns are missing from chat.history
  • Add looksLikePartialReplyDelivery() to detect the transient pattern where the gateway delivers only a subset of replies (correctly correlated) while the remaining replies never arrive
  • Add looksLikeTransientGatewayRace() combining all three transient detectors
  • Extend runLiveIssue2603ReproWithEventCaptureRetry from a single if-guard to a while loop with MAX_TRANSIENT_RETRIES=2 (3 total attempts)
  • Add unit tests covering both new transient patterns, including negative cases for partial regressions and mixed failures
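The bounded retry described in the list above can be sketched roughly as follows. This is an illustration only, not the PR's code: `runWithTransientRetry`, `runOnce`, and `isTransient` are hypothetical names, and the sketch is synchronous for brevity (the live test is presumably asynchronous). Only `MAX_TRANSIENT_RETRIES = 2` and the "3 total attempts" behavior are taken from the description.

```typescript
// Hypothetical sketch of a bounded retry loop; only the constant's name and
// value come from the PR description.
const MAX_TRANSIENT_RETRIES = 2; // 2 retries => at most 3 total attempts

function runWithTransientRetry<T>(
  runOnce: () => T,                     // hypothetical: one run of the live repro
  isTransient: (result: T) => boolean,  // hypothetical: combined transient detector
): { result: T; attempts: number } {
  let result = runOnce();
  let retries = 0;
  // Retry only while the failure still looks transient and budget remains.
  while (isTransient(result) && retries < MAX_TRANSIENT_RETRIES) {
    retries += 1;
    result = runOnce();
  }
  // The last result is returned even if it is still a failure, so the
  // caller's assertions see it and a persistent problem is not hidden.
  return { result, attempts: retries + 1 };
}
```

Keeping the predicate separate from the loop means a result that does not match a transient signature is never retried, which is how a real partial regression would still fail on the first attempt.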

Validation

A focused custom-e2e.yaml workflow was run on a sibling branch to confirm this fix repairs the regression. The workflow re-runs only the jobs from the original nightly that this PR targets, on ubuntu-latest, off the same fix commit as this PR.

The validation branch is intentionally not the head of this PR — it carries an extra .github/workflows/custom-e2e.yaml commit that is scaffolding, not part of the fix. Re-run the validation by pushing any commit to the validation branch.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes

AI Disclosure

  • AI-assisted — tool: Claude Code

Signed-off-by: Hung Le <hple@nvidia.com>

hunglp6d added 2 commits May 28, 2026 01:06
…y race

The openclaw-tui-chat-correlation-e2e live test retried only when the
WebSocket client captured zero chat events (the "event capture failure"
pattern).  A second transient race — where all replies arrive but every
runId differs from the chat.send response — was not retried, causing
intermittent nightly failures (e.g. run 26546628518).

Add looksLikeTotalCorrelationRace() to detect this second transient
pattern (all events uncorrelated + later user turns missing from
chat.history) and extend the retry loop to allow up to two retries
(three total attempts) for either transient signature.  A real
correlation regression would break only some runs or leave user turns
intact, so it will not be masked by the retry.

Signed-off-by: Hung Le <hple@nvidia.com>
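As a rough illustration of the signature this commit message describes (every reply uncorrelated, plus later user turns missing from chat.history), a detector might look like the sketch below. The `CorrelationSummary` shape and all of its field names except `uncorrelatedReplies` are assumptions made for the example, as is the function name; the actual `looksLikeTotalCorrelationRace()` may inspect different data.

```typescript
// Hypothetical result shape; only `uncorrelatedReplies` is named in the PR.
interface CorrelationSummary {
  expectedReplies: number;           // assumed: how many replies the test sent for
  uncorrelatedReplies: string[];     // replies whose runId differs from the chat.send response
  missingHistoryUserTurns: number[]; // assumed: 0-based indices of user turns absent from chat.history
}

function sketchTotalCorrelationRace(s: CorrelationSummary): boolean {
  // "Total" means every expected reply arrived but none correlated.
  const allUncorrelated =
    s.expectedReplies > 0 && s.uncorrelatedReplies.length === s.expectedReplies;
  // Assumption: "later" user turns are those after the first (index > 0).
  const laterTurnsMissing = s.missingHistoryUserTurns.some((i) => i > 0);
  return allUncorrelated && laterTurnsMissing;
}
```

Requiring both conditions reflects the reasoning in the message: a regression that breaks only some runs, or that leaves the user turns intact, does not match and so is not retried.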
The validation run for the total-correlation-race fix (attempt 1)
revealed a third transient pattern: partial reply delivery where
the gateway delivers only a subset of replies correctly while the
remaining replies never arrive.  Add looksLikePartialReplyDelivery()
to detect this signature (missingReplies > 0, uncorrelatedReplies
empty, no empty finals or duplicates) and include it in the
transient gateway race retry logic.

Signed-off-by: Hung Le <hple@nvidia.com>
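The partial-delivery signature listed in this commit message (missingReplies > 0, uncorrelatedReplies empty, no empty finals or duplicates) could be expressed along these lines. This is a sketch under assumptions: the fields are modeled as arrays, and `emptyFinals`, `duplicateReplies`, and the function name are hypothetical stand-ins rather than the PR's identifiers.

```typescript
// Hypothetical result shape; `missingReplies` and `uncorrelatedReplies` are
// named in the PR, the other two fields are assumed for illustration.
interface DeliverySummary {
  missingReplies: string[];      // replies that never arrived
  uncorrelatedReplies: string[]; // replies with a mismatched runId
  emptyFinals: string[];         // assumed: final events with no content
  duplicateReplies: string[];    // assumed: replies delivered more than once
}

function sketchPartialReplyDelivery(s: DeliverySummary): boolean {
  return (
    s.missingReplies.length > 0 &&       // some replies never arrived
    s.uncorrelatedReplies.length === 0 && // everything that arrived correlated
    s.emptyFinals.length === 0 &&         // no other failure modes mixed in
    s.duplicateReplies.length === 0
  );
}
```

Any mixed failure (for example, a missing reply alongside an uncorrelated one) falls outside the signature, so it is reported as a failure rather than retried.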
copy-pr-bot Bot commented May 28, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

coderabbitai Bot commented May 28, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5ad05ec0-eaf2-40d7-ace7-bd91015ce5cb

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


wscurran added the E2E (End-to-end testing: Brev infrastructure, test cases, nightly failures, and coverage gaps) and fix labels May 28, 2026

Labels

E2E (End-to-end testing: Brev infrastructure, test cases, nightly failures, and coverage gaps), fix

Development

Successfully merging this pull request may close these issues.

nightly-e2e: openclaw-tui-chat-correlation-e2e flaky on transient gateway races (run 26546628518)