Skip to content

Fix panic on peer shutdown during block commit#5393

Open
Ady0333 wants to merge 1 commit intohyperledger:mainfrom
Ady0333:fix/gossip-state-shutdown-panic
Open

Fix panic on peer shutdown during block commit#5393
Ady0333 wants to merge 1 commit intohyperledger:mainfrom
Ady0333:fix/gossip-state-shutdown-panic

Conversation

@Ady0333
Copy link
Contributor

@Ady0333 Ady0333 commented Feb 9, 2026

Type of change

  • Bug fix

Description

This PR fixes a shutdown race in GossipStateProviderImpl that could cause a peer to panic during block commit.

On shutdown, Stop() closed the ledger immediately after signaling stopCh, while the deliverPayloads() goroutine could still be committing blocks. That commit path sends events to the snapshot manager, whose channels are closed as part of ledger.Close(). If shutdown overlaps with an in-flight commit, this results in a send on closed channel panic.

The fix waits for deliverPayloads() to complete before closing ledger resources and adds a stop check in the payload loop so shutdown exits promptly.


Additional details

The shutdown path is now coordinated with block commit to ensure no snapshot manager channels are closed while still in use.

Tested locally with go test ./gossip/state/....


Release Note

Fixes a peer shutdown race that could cause a panic during block commit when
snapshot manager channels were closed while still in use.

Wait for deliverPayloads goroutine to complete before closing
ledger resources in GossipStateProviderImpl.Stop(), preventing
send on closed snapshotMgr channels.

Signed-off-by: Ady0333 <adityashinde1525@gmail.com>
@Ady0333 Ady0333 requested a review from a team as a code owner February 9, 2026 20:17
@Ady0333
Copy link
Contributor Author

Ady0333 commented Feb 9, 2026

Hi @yacovm ,

When you get a chance, please take a look at this PR.

@C0rWin
Copy link
Contributor

C0rWin commented Feb 20, 2026

@Ady0333 can you please add unit test?

@cendhu
Copy link
Contributor

cendhu commented Mar 9, 2026

Thank you for the contribution. We currently have neither a peer node stop nor a peer node shutdown command. Because of this, I’m not sure what this PR is intended to address. Our existing shutdown APIs are only used within tests to close services after a unit test run.

@Ady0333
Copy link
Contributor Author

Ady0333 commented Mar 11, 2026

@cendhu
Thanks for the clarification.

My understanding was that snapshotMgr.shutdown() can close channels while snapshot generation goroutines are still active, which could lead to a send-on-closed-channel panic. Even if shutdown is currently only used in tests, this panic could still cause non-deterministic failures in test environments where snapshots are generated concurrently.

The intent of the fix was to prevent this race and ensure shutdown happens safely even when snapshot generation is in progress.

Please let me know if this scenario is not considered relevant for Fabric, and I can close the PR if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants