
Fix flaky PickAsync_UpdateAddressesWhileRequestingConnection_DoesNotDeadlock test #2705

Merged
JamesNK merged 1 commit into grpc:master from JamesNK:fix/flaky-deadlock-test
Mar 31, 2026

Conversation


JamesNK commented Mar 27, 2026

The test's MessageLogged callback intercepted every ConnectionRequested log event, but the SyncPoint is a one-shot mechanism (backed by TaskCompletionSource). After Continue() is called, subsequent WaitToContinue() calls return immediately.
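To illustrate the one-shot behavior, here is a minimal sketch of a sync point backed by a `TaskCompletionSource`. `OneShotSyncPoint` and `OneShotDemo` are hypothetical names used only for illustration; this is not the test suite's actual `SyncPoint` type, and it assumes only that the real one completes a `TaskCompletionSource` in `Continue()`.

```csharp
using System.Threading.Tasks;

// Hypothetical stand-in for the test's SyncPoint; not the grpc-dotnet type.
public sealed class OneShotSyncPoint
{
    private readonly TaskCompletionSource<bool> _continueTcs =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    // Pending until Continue() is called, then completed for every later caller.
    public Task WaitToContinue() => _continueTcs.Task;

    // A TaskCompletionSource cannot be reset, so only the first call has any effect.
    public void Continue() => _continueTcs.TrySetResult(true);
}

public static class OneShotDemo
{
    public static (bool BeforeContinue, bool AfterContinue, bool SecondWait) Run()
    {
        var syncPoint = new OneShotSyncPoint();

        bool before = syncPoint.WaitToContinue().IsCompleted; // still pending
        syncPoint.Continue();
        bool after = syncPoint.WaitToContinue().IsCompleted;  // already completed
        bool second = syncPoint.WaitToContinue().IsCompleted; // no longer blocks anyone

        return (before, after, second);
    }
}
```

In this sketch, any callback that awaits `WaitToContinue()` after the first `Continue()` runs straight through instead of pausing.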

When UpdateAddresses triggers a reconnect that fires additional ConnectionRequested events, the callback re-enters and executes inside the subchannel lock, causing state interleaving that leaves CurrentEndPoint null while the picker is set to PickFirstPicker. PickAsync then waits for a new picker that never arrives, timing out after 5 seconds.

Fix: Use Interlocked.CompareExchange to ensure the callback only intercepts ConnectionRequested once, matching the test's intent of pausing exactly one connection request.
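The guard can be sketched roughly as follows. This is an illustrative pattern only, not the code from the commit: `OnceGate`, `TryEnter`, and `OnceGateDemo` are hypothetical names, and the parallel loop merely simulates a logging callback that fires many times.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: lets exactly one caller through, even under concurrency.
public sealed class OnceGate
{
    private int _entered; // 0 = nothing intercepted yet, 1 = already intercepted

    // CompareExchange writes 1 only if the current value is 0 and returns the
    // previous value, so only the first caller observes 0.
    public bool TryEnter() => Interlocked.CompareExchange(ref _entered, 1, 0) == 0;
}

public static class OnceGateDemo
{
    // Simulates a callback invoked once per logged event and counts how many
    // invocations were actually intercepted.
    public static int CountIntercepts(int events)
    {
        var gate = new OnceGate();
        int intercepted = 0;

        Parallel.For(0, events, _ =>
        {
            if (gate.TryEnter())
            {
                // In the test, this is where the callback would pause on the sync point.
                Interlocked.Increment(ref intercepted);
            }
        });

        return intercepted;
    }
}
```

Under these assumptions, later ConnectionRequested events skip the interception branch entirely, so the callback never waits on an already-completed sync point while the subchannel lock is held.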

Verified stable with 100/100 passes in a loop.

…eadlock test

The test's MessageLogged callback intercepted every ConnectionRequested log
event, but the SyncPoint is a one-shot mechanism. After Continue() is called,
subsequent WaitToContinue() calls return immediately. When UpdateAddresses
triggers a reconnect that fires additional ConnectionRequested events, the
callback re-enters and executes inside the subchannel lock, causing state
interleaving that leaves CurrentEndPoint null while the picker is set to
PickFirstPicker. PickAsync then waits for a new picker that never arrives.

Fix by using Interlocked.CompareExchange to ensure the callback only
intercepts ConnectionRequested once, matching the test's intent.

asheshvidyut left a comment


LGTM

JamesNK merged commit ccf6ab7 into grpc:master Mar 31, 2026
12 of 13 checks passed
JamesNK deleted the fix/flaky-deadlock-test branch March 31, 2026 01:49