perf(graph): replace 50ms sleep-poll in wait_for_service with condvar#193
Open
YuanYuYuan wants to merge 3 commits into
Open
perf(graph): replace 50ms sleep-poll in wait_for_service with condvar#193YuanYuYuan wants to merge 3 commits into
YuanYuYuan wants to merge 3 commits into
Conversation
Add `change_signal: Arc<(StdMutex<u64>, Condvar)>` to `Graph`. The liveliness callback drops graph.data before acquiring the condvar mutex, increments an epoch counter, and calls notify_all() — preserving the lock ordering (waiter: condvar → data; notifier: data → condvar) to prevent deadlock. Replace the sleep-poll loop in `RawServiceClient::wait_for_service` (ffi/service.rs) with a condvar wait. The waiter holds the condvar mutex across both the condition check and wait_timeout so no signal fired between the two can be missed. The async paths (wait_for_subscription, wait_for_publisher, wait_for_server) already use tokio Notify with correct TOCTOU handling and are unchanged.
…gnal The epoch counter was written on every graph change but never read by the waiter — the condvar mechanism itself guarantees no missed signals when the waiter holds the mutex across the condition check and wait_timeout. Remove the dead u64 and use StdMutex<()> instead.
let _ = mutex.lock() drops the guard immediately — Rust rejects this as a non-binding let on a synchronization lock. The notify_all() call does not require the mutex to be held on the notifier side.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
wait_for_servicewith a condvar notification fromGraph, so callers wake the instant a matching service server appears rather than up to 50ms later.Key Changes
crates/hiroz/src/graph.rs: Addchange_signal: Arc<(StdMutex<u64>, Condvar)>toGraph. The liveliness subscriber callback dropsgraph.datalock before acquiring this mutex, increments an epoch counter, and callsnotify_all(). This preserves the lock ordering (waiter: condvar mutex → graph.data; notifier: graph.data → condvar mutex) so no deadlock is possible.crates/hiroz/src/ffi/service.rs: Replacestd::thread::sleep(50ms)poll loop inRawServiceClient::wait_for_servicewith a condvar wait ongraph.change_signal. The waiter holds the condvar mutex across the condition check and thewait_timeoutcall, so no signal fired between the check and the wait can be missed.Scope
The async paths (
wait_for_subscription,wait_for_publisher,wait_for_server) already usegraph.change_notify(tokioNotify) with the correct TOCTOU-safe pattern and are unchanged. This PR only fixes the sync FFI path.Breaking Changes
None