@s-h-ubham
Contributor

Title

Persist active-first peers to last_peers.json with debounced writer; remove lastKnownPeers


Summary

This PR reworks peer persistence to make last_peers.json always reflect the most useful peers for fast bootstrap and robust recovery.
It removes the legacy lastKnownPeers logic and replaces it with a unified, debounced persistence mechanism.


Key Changes

Active-first persistence

  • last_peers.json now lists:

    1. Active peers first — gathered from Peers() (currently connected peers).
    2. Recently seen peers — collected from the peerstore (seenPeers), acting as fallbacks.
  • Entries are deduplicated and capped (e.g., at 200) to prevent file bloat.
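
The ordering rule above can be sketched as a small merge function. This is a minimal illustration, not the code from this PR: `buildPeerList`, `maxPersistedPeers`, and the sample multiaddrs are hypothetical names and values, and the cap of 200 is only the example figure mentioned above.

```go
package main

import "fmt"

// maxPersistedPeers is an assumed cap; the description only says "e.g., at 200".
const maxPersistedPeers = 200

// buildPeerList (hypothetical helper) returns active peers first, then
// recently seen peers, skipping duplicates and stopping at limit entries.
func buildPeerList(active, seen []string, limit int) []string {
	if limit <= 0 {
		return nil
	}

	out := make([]string, 0, limit)
	added := make(map[string]struct{}, limit)

	// Walk the active group before the seen group so that active peers
	// always occupy the front of the list.
	for _, group := range [][]string{active, seen} {
		for _, addr := range group {
			if len(out) >= limit {
				return out
			}

			if _, dup := added[addr]; dup {
				continue
			}

			added[addr] = struct{}{}
			out = append(out, addr)
		}
	}

	return out
}

func main() {
	// Sample addresses are placeholders, not real peers.
	active := []string{"/ip4/10.0.0.1/tcp/1478/p2p/peerA", "/ip4/10.0.0.2/tcp/1478/p2p/peerB"}
	seen := []string{"/ip4/10.0.0.2/tcp/1478/p2p/peerB", "/ip4/10.0.0.3/tcp/1478/p2p/peerC"}

	for _, addr := range buildPeerList(active, seen, maxPersistedPeers) {
		fmt.Println(addr)
	}
}
```

Because the active group is walked first, a peer that appears in both lists keeps its earlier (active) position, and the cap trims fallback entries before it trims active ones.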

Debounced runtime writes

  • Peer connect/disconnect events trigger updates to seenPeers and schedule a debounced write, minimizing disk churn.
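
One common way to debounce writes in Go is to restart a timer on every event so that a burst of events produces a single write. The sketch below assumes that approach; the PR does not state how its writer is implemented, and `debouncer`, `trigger`, and `runBurst` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// debouncer coalesces bursts of trigger calls into a single run of fn.
type debouncer struct {
	mu    sync.Mutex
	timer *time.Timer
	delay time.Duration
	fn    func()
}

func newDebouncer(delay time.Duration, fn func()) *debouncer {
	return &debouncer{delay: delay, fn: fn}
}

// trigger restarts the countdown; fn runs once the delay passes
// without another trigger.
func (d *debouncer) trigger() {
	d.mu.Lock()
	defer d.mu.Unlock()

	if d.timer != nil {
		d.timer.Stop()
	}

	d.timer = time.AfterFunc(d.delay, d.fn)
}

// runBurst fires the given number of back-to-back events and reports
// how many writes actually happened after the quiet period.
func runBurst(events int, delay time.Duration) int {
	var writes int32

	d := newDebouncer(delay, func() { atomic.AddInt32(&writes, 1) })

	for i := 0; i < events; i++ {
		d.trigger() // stands in for a peer connect/disconnect event
	}

	time.Sleep(5 * delay) // wait well past the debounce window

	return int(atomic.LoadInt32(&writes))
}

func main() {
	fmt.Println("writes after 5 rapid events:", runBurst(5, 20*time.Millisecond))
}
```

The trade-off is that the file lags the in-memory state by up to the debounce delay, which is why a separate synchronous write at shutdown is still needed.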

Graceful shutdown handling

  • Introduced an isShuttingDown flag:

    • Close() performs a final synchronous persist.
    • Further background writes are prevented after shutdown begins.
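
The shutdown rule can be illustrated with a flag guarded by a mutex. This is a sketch under assumptions: only `isShuttingDown` and `Close()` are named in this PR, while `peerPersister`, `backgroundWrite`, and the `persist` callback are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// peerPersister is a hypothetical wrapper around the file write.
type peerPersister struct {
	mu             sync.Mutex
	isShuttingDown bool
	persist        func(reason string) // stand-in for writing last_peers.json
}

// backgroundWrite models the debounced runtime path. It reports false
// and writes nothing once shutdown has begun.
func (p *peerPersister) backgroundWrite() bool {
	p.mu.Lock()
	defer p.mu.Unlock()

	if p.isShuttingDown {
		return false
	}

	p.persist("runtime")

	return true
}

// Close sets the flag and performs the final write while still holding
// the lock, so no background write can land after it.
func (p *peerPersister) Close() {
	p.mu.Lock()
	defer p.mu.Unlock()

	if p.isShuttingDown {
		return // already closed; do not write twice
	}

	p.isShuttingDown = true
	p.persist("shutdown")
}

func main() {
	p := &peerPersister{persist: func(reason string) {
		fmt.Printf("Writing last_peers.json (%s)\n", reason)
	}}

	p.backgroundWrite()
	p.Close()
	fmt.Println("write accepted after Close:", p.backgroundWrite())
}
```

Holding the lock across the final write is the simplest way to guarantee ordering; it does mean a slow disk delays `Close()`, which is usually acceptable for a single small file.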

Code cleanup

  • Completely removed lastKnownPeers (fields, helpers, locks, and references).
  • Applied minor linter and formatting fixes (wsl, nlreturn, gofmt).

Issue Addressed

Previously, last_peers.json could be overwritten with stale data during shutdown, leading to:

  • Slow reconnections.

  • Dialing of dead or outdated peers.

This update ensures last_peers.json always reflects a current, prioritized, and bounded set of peers, improving reconnection reliability and startup speed.


Impact

Behavior

  • Faster reconnection after restarts — active peers are dialed first.
  • Fallback safety retained via recently seen peers.

Compatibility

  • No configuration changes required.
  • File format remains the same ([]string of multiaddrs).
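
For reference, a `[]string` of multiaddrs round-trips through `encoding/json` as a plain JSON array. The helper names and sample addresses below are hypothetical; the sketch only shows the shape of the file, not how this PR reads or writes it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodePeers (hypothetical) renders the peer list as an indented JSON array.
func encodePeers(addrs []string) ([]byte, error) {
	return json.MarshalIndent(addrs, "", "  ")
}

// decodePeers (hypothetical) parses a JSON array of strings back into a slice.
func decodePeers(data []byte) ([]string, error) {
	var addrs []string

	if err := json.Unmarshal(data, &addrs); err != nil {
		return nil, err
	}

	return addrs, nil
}

func main() {
	// Placeholder multiaddrs for illustration only.
	data, err := encodePeers([]string{
		"/ip4/10.0.0.1/tcp/1478/p2p/peerA",
		"/ip4/10.0.0.3/tcp/1478/p2p/peerC",
	})
	if err != nil {
		panic(err)
	}

	fmt.Println(string(data))

	addrs, err := decodePeers(data)
	if err != nil {
		panic(err)
	}

	fmt.Println("first entry to dial:", addrs[0])
}
```

Since a JSON array preserves order, the active-first priority is carried by position alone and needs no extra fields.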

Performance

  • Minimal overhead thanks to debounced async writes and bounded peer list size.

Testing & Validation

  • Updated unit tests to validate internal peer tracking (numPeers() instead of Peers() where relevant).

  • Manual validation steps:

    1. Start a small cluster, connect/disconnect peers.

    2. Confirm last_peers.json ordering: active peers first, inactive peers next.

    3. On graceful shutdown, confirm that the log contains:

      Writing last_peers.json (shutdown)

      and that last_peers.json shows active-first ordering.

    4. Restart nodes with only last_peers.json: observe dialing prioritizes live peers.


@s-h-ubham s-h-ubham requested a review from R-Santev October 9, 2025 12:45
@s-h-ubham s-h-ubham merged commit c886a28 into develop Oct 10, 2025
4 checks passed
@s-h-ubham s-h-ubham deleted the fix/bootnode-discovery branch October 14, 2025 07:21

3 participants