
retry on connection errors #436

Merged
xzrderek merged 6 commits into main from derekx/retry-on-connection
Mar 13, 2026

@xzrderek xzrderek commented Mar 13, 2026

Note

Medium Risk
Touches rollout execution lifecycle and shared-session cleanup behavior across parallel runs, which can affect concurrency and resource management. Retry behavior also changes for remote HTTP failures and could mask some server-side issues if misclassified.

Overview
Improves retry and resource handling for rollout execution. rollout_processor_with_retry no longer calls rollout_processor cleanup per run; cleanup is now performed once in evaluation_test.py (including the priority-scheduler path) to avoid closing shared sessions while other parallel runs are still in flight.
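The reason for moving cleanup out of the per-run path can be illustrated with a minimal sketch. Everything named here (`SharedSession`, `run_once`, `run_all`) is hypothetical and stands in for the real rollout processor and session objects, which are not shown in this PR summary. The sketch only demonstrates the pattern described above: parallel runs share one session, and the session is closed once, after all runs have finished.

```python
import asyncio


class SharedSession:
    """Hypothetical stand-in for a session shared by parallel runs."""

    def __init__(self) -> None:
        self.closed = False
        self.close_calls = 0

    async def request(self, run_id: int) -> str:
        # Simulate I/O of varying length; later runs take longer.
        await asyncio.sleep(0.01 * run_id)
        if self.closed:
            # This is the failure mode per-run cleanup could trigger:
            # one run closes the session while another is still in flight.
            raise RuntimeError("session closed while run was in flight")
        return f"run-{run_id}-ok"

    async def close(self) -> None:
        self.closed = True
        self.close_calls += 1


async def run_once(session: SharedSession, run_id: int) -> str:
    # No cleanup here: an individual run must not close the shared session.
    return await session.request(run_id)


async def run_all(num_runs: int):
    session = SharedSession()
    try:
        results = await asyncio.gather(
            *(run_once(session, i) for i in range(num_runs))
        )
    finally:
        # Cleanup happens exactly once, after every run has completed.
        await session.close()
    return results, session


results, session = asyncio.run(run_all(3))
```

If `run_once` closed the session itself, the shortest run would close it first and the longer runs would hit the `RuntimeError` branch; closing in the caller's `finally` avoids that.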

Expands and refines retry triggers. Default retryable exceptions now include aiohttp connection/disconnect errors, and RemoteRolloutProcessor treats HTTP 5xx from /init as a ConnectionError (retryable) while keeping 4xx as non-retryable failures.
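A rough sketch of this classification, under stated assumptions: `check_init_status`, `NonRetryableInitError`, `call_with_retry`, and `make_init` are hypothetical names, not the project's API, and the retryable set here contains only the built-in `ConnectionError` (the PR's defaults also include aiohttp connection and disconnect errors, which are omitted to keep the example dependency-free).

```python
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


class NonRetryableInitError(Exception):
    """Hypothetical error type for 4xx responses from /init."""


def check_init_status(status: int) -> None:
    """Map an HTTP status from /init to a retryable or non-retryable error."""
    if 500 <= status <= 599:
        # Server-side failure: surfaced as ConnectionError so it is retried.
        raise ConnectionError(f"/init returned {status}")
    if 400 <= status <= 499:
        # Client-side failure: retrying the same request would not help.
        raise NonRetryableInitError(f"/init returned {status}")


# Assumed retryable set for this sketch only.
RETRYABLE_EXCEPTIONS = (ConnectionError,)


def call_with_retry(fn: Callable[[], T], max_attempts: int = 3) -> Tuple[T, int]:
    """Call fn, retrying on retryable errors; return (result, attempts used)."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return fn(), attempts
        except RETRYABLE_EXCEPTIONS:
            if attempts >= max_attempts:
                raise  # Retries exhausted: propagate the last error.


def make_init(statuses: Iterable[int]) -> Callable[[], int]:
    """Build a fake /init call that yields the given statuses in order."""
    it = iter(statuses)

    def init() -> int:
        status = next(it)
        check_init_status(status)
        return status

    return init


# Two 5xx responses are retried; the third attempt succeeds.
status, attempts = call_with_retry(make_init([503, 502, 200]))
```

A 4xx status raises `NonRetryableInitError`, which is not in the retryable set, so it propagates on the first attempt without any retry.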

Tests are updated to stop asserting per-run cleanup from rollout_processor_with_retry.

Written by Cursor Bugbot for commit a11c7dd. This will update automatically on new commits.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

@xzrderek xzrderek merged commit 3c8d8f2 into main Mar 13, 2026
17 checks passed
@xzrderek xzrderek deleted the derekx/retry-on-connection branch March 13, 2026 23:09