Skip to content

fix: Prevent client destruction race condition#1183

Merged
bjsowa merged 3 commits intoRobotWebTools:ros2from
field-ai:hotfix/client-destruction
Mar 17, 2026
Merged

fix: Prevent client destruction race condition#1183
bjsowa merged 3 commits intoRobotWebTools:ros2from
field-ai:hotfix/client-destruction

Conversation

@FieldSwan
Copy link
Contributor

Public API Changes
None

Description

  1. Adds a client creation/destruction stress test to replicate the issue observed where sometimes we would see:
Exception in thread Thread-1 (spin):
Traceback (most recent call last):
  File "/usr/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.12/threading.py", line 1010, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/ros/kilted/lib/python3.12/site-packages/rclpy/executors.py", line 374, in spin
    self.spin_once()
  File "/opt/ros/kilted/lib/python3.12/site-packages/rclpy/executors.py", line 968, in spin_once
    self._spin_once_impl(timeout_sec)
  File "/opt/ros/kilted/lib/python3.12/site-packages/rclpy/executors.py", line 951, in _spin_once_impl
    handler, entity, node = self.wait_for_ready_callbacks(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/ros/kilted/lib/python3.12/site-packages/rclpy/executors.py", line 921, in wait_for_ready_callbacks
    return next(self._cb_iter)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/ros/kilted/lib/python3.12/site-packages/rclpy/executors.py", line 820, in _wait_for_ready_callbacks
    waitable.add_to_wait_set(wait_set)
  File "/opt/ros/kilted/lib/python3.12/site-packages/rclpy/event_handler.py", line 176, in add_to_wait_set
rclpy._rclpy_pybind11.InvalidHandle: cannot use Destroyable because destruction was requested

Note that this stress test is relatively slow, takes 20 seconds on my machine.

  1. Fixes the issue by adding the websocket protocol destruction to the executor waitset (if possible) to prevent a race condition with the executor's other tasks.

Tested on ROS2 kilted.

May be related to: #1144

@bjsowa
Copy link
Member

bjsowa commented Mar 16, 2026

Overall LGTM. Just one minor change I added in field-ai#4

@FieldSwan
Copy link
Contributor Author

Overall LGTM. Just one minor change I added in field-ai#4

Sounds good! Merged your change in.

@bjsowa
Copy link
Member

bjsowa commented Mar 17, 2026

Please update the branch with the base branch

@ikwilnaarhuisman
Copy link

@FieldSwan Tested this. It fixes the fact that you can no longer use the executor. Which is great. But any goal send after that forces it to spawn threads that are not destroyed. Making this change a killer in the long run. Same for the hotfix @YannickdeHoop and I made mentioned in #1144 (comment)

This video demonstrates the following: Using the rosbridge I am calling the navigation of my robot using teleoperation, basically sending "go to XYZ", I limited the cpu usage allowed for the rosbridge to be 10% which is equivalent to the actual robots we use at lowpad. Considering I have that much of a beefy cpu. As shown in the video, once this happens threads keep spawning and the rosbridge keeps functioning as /intended/, but the threads keep and keep coming which will eventually grind the entire system to a halt.

output2.mp4

@FieldSwan FieldSwan force-pushed the hotfix/client-destruction branch from 420c1e4 to eb76081 Compare March 17, 2026 10:18
@bjsowa bjsowa changed the title Prevent client destruction race condition fix: Prevent client destruction race condition Mar 17, 2026
@bjsowa
Copy link
Member

bjsowa commented Mar 17, 2026

@FieldSwan Tested this. It fixes the fact that you can no longer use the executor. Which is great. But any goal send after that forces it to spawn threads that are not destroyed. Making this change a killer in the long run. Same for the hotfix @YannickdeHoop and I made mentioned in #1144 (comment)

I'm still merging it as it also fixes one race condition.

@bjsowa bjsowa merged commit 5b20a2e into RobotWebTools:ros2 Mar 17, 2026
3 checks passed
@bjsowa
Copy link
Member

bjsowa commented Mar 17, 2026

@Mergifyio backport kilted jazzy humble

@mergify
Copy link

mergify bot commented Mar 17, 2026

backport kilted jazzy humble

✅ Backports have been created

Details

Cherry-pick of 5b20a2e has failed:

On branch mergify/bp/jazzy/pr-1183
Your branch is up to date with 'origin/jazzy'.

You are currently cherry-picking commit 5b20a2e.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   rosbridge_server/package.xml
	modified:   rosbridge_server/src/rosbridge_server/websocket_handler.py
	new file:   rosbridge_server/test/test_stress_clients.py

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   rosbridge_server/CMakeLists.txt

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

Cherry-pick of 5b20a2e has failed:

On branch mergify/bp/humble/pr-1183
Your branch is up to date with 'origin/humble'.

You are currently cherry-picking commit 5b20a2e.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   rosbridge_server/package.xml
	modified:   rosbridge_server/src/rosbridge_server/websocket_handler.py
	new file:   rosbridge_server/test/test_stress_clients.py

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   rosbridge_server/CMakeLists.txt

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

mergify bot pushed a commit that referenced this pull request Mar 17, 2026
* Client creation/destruction race condition stress test

* Fix race condition with client destruction

* Use ros isolated pytest (#4)

---------

Co-authored-by: Błażej Sowa <bsowa123@gmail.com>
(cherry picked from commit 5b20a2e)
mergify bot pushed a commit that referenced this pull request Mar 17, 2026
* Client creation/destruction race condition stress test

* Fix race condition with client destruction

* Use ros isolated pytest (#4)

---------

Co-authored-by: Błażej Sowa <bsowa123@gmail.com>
(cherry picked from commit 5b20a2e)

# Conflicts:
#	rosbridge_server/CMakeLists.txt
mergify bot pushed a commit that referenced this pull request Mar 17, 2026
* Client creation/destruction race condition stress test

* Fix race condition with client destruction

* Use ros isolated pytest (#4)

---------

Co-authored-by: Błażej Sowa <bsowa123@gmail.com>
(cherry picked from commit 5b20a2e)

# Conflicts:
#	rosbridge_server/CMakeLists.txt
bjsowa added a commit that referenced this pull request Mar 17, 2026
* Client creation/destruction race condition stress test

* Fix race condition with client destruction

* Use ros isolated pytest (#4)

---------


(cherry picked from commit 5b20a2e)

Co-authored-by: FieldSwan <michael.swan@fieldai.com>
Co-authored-by: Błażej Sowa <bsowa123@gmail.com>
bjsowa added a commit that referenced this pull request Mar 17, 2026
* fix: Prevent client destruction race condition (#1183)

* Client creation/destruction race condition stress test

* Fix race condition with client destruction

* Use ros isolated pytest (#4)

---------

Co-authored-by: Błażej Sowa <bsowa123@gmail.com>
(cherry picked from commit 5b20a2e)

# Conflicts:
#	rosbridge_server/CMakeLists.txt

* Fix conflicts

---------

Co-authored-by: FieldSwan <michael.swan@fieldai.com>
Co-authored-by: Błażej Sowa <bsowa123@gmail.com>
bjsowa added a commit that referenced this pull request Mar 17, 2026
* fix: Prevent client destruction race condition (#1183)

* Client creation/destruction race condition stress test

* Fix race condition with client destruction

* Use ros isolated pytest (#4)

---------

Co-authored-by: Błażej Sowa <bsowa123@gmail.com>
(cherry picked from commit 5b20a2e)

# Conflicts:
#	rosbridge_server/CMakeLists.txt

* Fix conflicts

---------

Co-authored-by: FieldSwan <michael.swan@fieldai.com>
Co-authored-by: Błażej Sowa <bsowa123@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants