Skip to content

fix: Handle WSClient write failure when server closes WebSocket#6671

Merged
bthomee merged 2 commits intodevelopfrom
pratik/fix-wsclient-unhandled-exception-on-close
Apr 7, 2026
Merged

fix: Handle WSClient write failure when server closes WebSocket#6671
bthomee merged 2 commits intodevelopfrom
pratik/fix-wsclient-unhandled-exception-on-close

Conversation

@pratikmankawde
Copy link
Copy Markdown
Contributor

@pratikmankawde pratikmankawde commented Mar 26, 2026

High Level Overview of Change

Fix intermittent xrpl.rpc.RPCOverload test failure caused by an unhandled exception in WSClientImpl::invoke() when the server closes the WebSocket connection after booting an overloaded client.

Context of Change

Failing CI job: https://github.com/XRPLF/rippled/actions/runs/23560752956/job/68599747791

The bug was introduced when the test infrastructure was originally written. WSClientImpl::invoke() uses the throwing overload of ws_.write_some(). When the server boots a client for exceeding resource thresholds (session->close({policy_error, "threshold exceeded"}) in ServerHandler::processSession), a WebSocket close frame is sent. There is a race between:

  1. The server sending the close frame
  2. The test's main thread calling invoke()write_some() on the next loop iteration

If write_some() executes before the async read handler (on_read_msg) processes the close frame, Beast's internal check_stop_now() throws Operation canceled [system:125]. Since invoke() has no exception handling around the write, this propagates as an unhandled exception.

The fix replaces the throwing write_some overload with the error_code overload. On write failure, invoke() returns {} (empty Json::Value) — the same return value it already produces when findMsg() times out. The RPCOverload test already handles this correctly: if (jv.isNull()) { booted = true; }.

Observed in CI job rhel-10-clang-any-amd64-release (PR #6619 merge queue run, 2026-03-25):

#19 failed: unhandled exception: Operation canceled [system:125 at
boost/beast/websocket/impl/stream_impl.hpp:355:68 in function
'bool boost::beast::websocket::stream::impl_type::check_stop_now(error_code &)']
failed: xrpl.rpc.RPCOverload had 1 failures.

API Impact

None — this change is limited to test infrastructure (src/test/jtx/impl/WSClient.cpp).

…-5573)

Use the error_code overload of write_some in WSClientImpl::invoke()
to gracefully handle the case where the server has already closed the
WebSocket connection (e.g. after booting a client that exceeded
resource thresholds). This prevents an unhandled "Operation canceled"
exception that intermittently fails xrpl.rpc.RPCOverload in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@xrplf-ai-reviewer xrplf-ai-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues.

Review by Claude Opus 4.6 · Prompt: V12

Copy link
Copy Markdown
Contributor

@xrplf-ai-reviewer xrplf-ai-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues.

Review by Claude Opus 4.6 · Prompt: V12

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.4%. Comparing base (2c765f6) to head (6884543).
⚠️ Report is 37 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##           develop   #6671     +/-   ##
=========================================
- Coverage     81.4%   81.4%   -0.0%     
=========================================
  Files          998     998             
  Lines        74443   74443             
  Branches      7563    7562      -1     
=========================================
- Hits         60632   60624      -8     
- Misses       13811   13819      +8     

see 4 files with indirect coverage changes

Impacted file tree graph

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@kuznetsss kuznetsss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@pratikmankawde pratikmankawde added the Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. label Mar 30, 2026
@bthomee bthomee requested a review from Copilot April 7, 2026 19:00
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes an intermittent unit test failure in the WebSocket-based test client (WSClientImpl) by preventing an exception from escaping when the server has already initiated WebSocket close (e.g., booting an overloaded client).

Changes:

  • Switch WSClientImpl::invoke() from the throwing ws_.write_some() overload to the error_code overload.
  • On write failure, return an empty Json::Value (null) to match existing “no response” behavior (e.g., findMsg() timeout), allowing the RPCOverload test to treat it as “booted”.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@bthomee bthomee added this pull request to the merge queue Apr 7, 2026
Merged via the queue into develop with commit 6d1a5be Apr 7, 2026
37 checks passed
@bthomee bthomee deleted the pratik/fix-wsclient-unhandled-exception-on-close branch April 7, 2026 19:43
mvadari pushed a commit that referenced this pull request Apr 7, 2026
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
abrahim-djordjevic pushed a commit to abrahim-djordjevic/rippled that referenced this pull request Apr 7, 2026
…F#6671)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants