Fix panic on peer shutdown during block commit#5393
Fix panic on peer shutdown during block commit#5393Ady0333 wants to merge 1 commit intohyperledger:mainfrom
Conversation
Wait for deliverPayloads goroutine to complete before closing ledger resources in GossipStateProviderImpl.Stop(), preventing send on closed snapshotMgr channels. Signed-off-by: Ady0333 <adityashinde1525@gmail.com>
|
Hi @yacovm , When you get a chance, please take a look at this PR. |
|
@Ady0333 can you please add unit test? |
|
Thank you for the contribution. We currently have neither a peer node stop nor a peer node shutdown command. Because of this, I’m not sure what this PR is intended to address. Our existing shutdown APIs are only used within tests to close services after a unit test run. |
|
@cendhu My understanding was that snapshotMgr.shutdown() can close channels while snapshot generation goroutines are still active, which could lead to a send-on-closed-channel panic. Even if shutdown is currently only used in tests, this panic could still cause non-deterministic failures in test environments where snapshots are generated concurrently. The intent of the fix was to prevent this race and ensure shutdown happens safely even when snapshot generation is in progress. Please let me know if this scenario is not considered relevant for Fabric, and I can close the PR if needed. |
Type of change
Description
This PR fixes a shutdown race in GossipStateProviderImpl that could cause a peer to panic during block commit.
On shutdown, Stop() closed the ledger immediately after signaling stopCh, while the deliverPayloads() goroutine could still be committing blocks. That commit path sends events to the snapshot manager, whose channels are closed as part of ledger.Close(). If shutdown overlaps with an in-flight commit, this results in a send on closed channel panic.
The fix waits for deliverPayloads() to complete before closing ledger resources and adds a stop check in the payload loop so shutdown exits promptly.
Additional details
The shutdown path is now coordinated with block commit to ensure no snapshot manager channels are closed while still in use.
Tested locally with
go test ./gossip/state/....Release Note
Fixes a peer shutdown race that could cause a panic during block commit when
snapshot manager channels were closed while still in use.