0.3.81: TCP keepalive + lastSeen-aware stale detection in addPeer dedup #6

Merged: sym-bot merged 1 commit into main from fix/keepalive-stale-handshake-on-replace on Apr 29, 2026

sym-bot commented on Apr 29, 2026

Summary

Mirrors `@sym-bot/sym` v0.5.3 on the Swift side so cross-runtime peers (sym-swift ↔ sym-node) recover symmetrically from peer restarts.

Field problem

The iPhone ↔ Mac Catalyst MeloMove pair flaps after either side rebuilds. The log pattern:

```
[SYM] session: handshake complete with Hongwei (019dd87d)
[SYM] peer: connected: Hongwei (outbound, bonjour)
[SYM] session: disconnected: Connection closed
[SYM] peer: disconnected: Hongwei
```

Repeated indefinitely. Cause: a peer restart leaves a dead-but-ESTABLISHED TCP socket on the survivor. The `addPeer` dedup logic in v0.3.80 keeps rejecting the live new dial in favor of the zombie entry. Without tuned TCP keepalive, the OS can hold the dead socket in ESTABLISHED state for ~2 hours (the macOS default keepalive idle time, `TCP_KEEPALIVE = 7200s`), so the flap continues that long.

Fix

1. TCP keepalive on every NWConnection

The `SymPeerSession.tcpParametersWithKeepalive()` helper builds `NWParameters` with `NWProtocolTCP.Options.enableKeepalive = true`, `keepaliveIdle = 1`, `keepaliveInterval = 1`, `keepaliveCount = 3` (idle and interval are in seconds). Used by:

  • Outbound `NWConnection` inits (`init(outboundTo:)`, `init(remoteHost:port:)`)
  • Inbound `NWListener` parameters (`SymDiscovery.startListener`) — accepted connections inherit keepalive

Dead remote ends are now reaped in ~4 seconds (1s idle + 3 probes × 1s interval) instead of ~2 hours.
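
For reference, a minimal sketch of what a helper like this could look like. `makeKeepaliveParameters` and `KeepaliveSettings` are hypothetical names used for illustration, not the actual `SymPeerSession.tcpParametersWithKeepalive()` source, and plain TCP without TLS is an assumption. The `NWProtocolTCP.Options` properties are the ones named above.

```swift
import Foundation
#if canImport(Network)
import Network
#endif

/// Keepalive settings quoted in the description (idle and interval in seconds).
/// Hypothetical type, for illustration only.
struct KeepaliveSettings {
    var idle = 1
    var interval = 1
    var count = 3

    /// Rough time for a silent peer to be noticed: the idle period before
    /// the first probe, plus `count` unanswered probes sent `interval` apart.
    var approximateDetectionSeconds: Int { idle + interval * count }
}

#if canImport(Network)
/// Hypothetical stand-in for `SymPeerSession.tcpParametersWithKeepalive()`.
func makeKeepaliveParameters(_ settings: KeepaliveSettings = KeepaliveSettings()) -> NWParameters {
    let tcp = NWProtocolTCP.Options()
    tcp.enableKeepalive = true
    tcp.keepaliveIdle = settings.idle
    tcp.keepaliveInterval = settings.interval
    tcp.keepaliveCount = settings.count
    // Assumption: plain TCP. The PR does not say whether TLS is configured.
    return NWParameters(tls: nil, tcp: tcp)
}
#endif
```

Under these assumptions, the same parameters object would be passed to both `NWConnection(to:using:)` for outbound dials and `NWListener(using:)` for the inbound listener, so accepted connections pick up the keepalive options as described.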

2. lastSeen-aware stale-prior detection in `addPeer` dedup

Before applying the dual-dial tie-break or same-direction-duplicate rule, check whether `existing.lastSeen` is older than `staleAfterSeconds` (10s, matching the Node SDK's `_heartbeatInterval`). If it is stale, replace the prior entry with the new connection: the remote re-dialling is itself evidence that its prior connection is dead.
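
The ordering of that check can be sketched as follows. `PeerEntry`, `DedupDecision`, and `dedupDecision` are hypothetical names, not the real `SymNode.addPeer` code. The existing tie-break and duplicate rules are deliberately left opaque because this description does not spell them out.

```swift
import Foundation

/// Hypothetical record for an already-registered peer connection.
struct PeerEntry {
    let peerID: String
    var lastSeen: Date
}

/// Outcome of the pre-check that runs before the existing dedup rules.
enum DedupDecision: Equatable {
    case replaceStalePrior   // prior entry is stale, so the new dial wins
    case applyExistingRules  // fall through to tie-break / duplicate handling
}

/// Returns `.replaceStalePrior` when the existing entry has been silent for
/// longer than `staleAfter` (10s in the description), otherwise defers to
/// the rules that were already in place.
func dedupDecision(existing: PeerEntry,
                   now: Date = Date(),
                   staleAfter: TimeInterval = 10) -> DedupDecision {
    let silence = now.timeIntervalSince(existing.lastSeen)
    return silence > staleAfter ? .replaceStalePrior : .applyExistingRules
}
```

The check is a pure function of `lastSeen` and the current time, so it can be unit-tested with a fixed `now` and no live sockets.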

Tests

71/71 existing unit tests pass.

Test plan

  • All existing unit tests pass (`swift test`)
  • Verify on iPhone ↔ Mac Catalyst MeloMove pair after either side rebuilds. Connection should re-establish within ~10s instead of staying broken for hours.

🤖 Generated with Claude Code

Mirrors @sym-bot/sym v0.5.3 — same shape on the Swift side so cross-runtime
peers (sym-swift ↔ sym-node) recover symmetrically from peer restarts.

Two-part fix:

1. SymPeerSession.tcpParametersWithKeepalive() helper builds NWParameters
   with NWProtocolTCP.Options.enableKeepalive=true, keepaliveIdle=1,
   keepaliveInterval=1, keepaliveCount=3. Used by both outbound NWConnection
   inits and the inbound NWListener parameters in SymDiscovery. Dead remote
   ends are reaped in ~4s instead of waiting for the macOS default
   TCP_KEEPALIVE=7200s.

2. SymNode.addPeer dedup now treats a peer entry with lastSeen older than
   staleAfterSeconds (10s) as stale, and the new dial replaces it
   regardless of dual-dial tie-break or same-direction-duplicate logic.
   Remote re-dialling is itself evidence its prior is dead.

Field problem this fixes: iPhone↔Mac-Catalyst MeloMove peer flap after
either side rebuilds. Old behavior — peer restart leaves dead-but-
ESTABLISHED socket on survivor; addPeer rejects live redial against
zombie; flap continues until OS keepalive eventually reaps. New behavior:
keepalive reaps within ~4s, AND lastSeen-stale check unblocks redial
within 10s of last activity. Recovery is seconds, not hours.

71/71 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sym-bot sym-bot merged commit a56ff3f into main Apr 29, 2026
1 check passed
@sym-bot sym-bot deleted the fix/keepalive-stale-handshake-on-replace branch April 29, 2026 19:02