0.3.82: relax TCP keepalive timings (Wi-Fi-friendly) #7

Merged
sym-bot merged 1 commit into main from fix/relax-keepalive-timings
Apr 29, 2026

@sym-bot sym-bot commented Apr 29, 2026

Summary

v0.3.81 set keepalive to `idle=1s, interval=1s, count=3` (~4s to declare a peer dead), which is far too aggressive for real-world Wi-Fi. Brief mid-handshake pauses on healthy connections got the socket reaped before the application-level handshake exchange could complete:

```
[SYM] session: connection ready (outbound=true)
[SYM] session: handshake timeout after 10s — disconnecting
```

Even fully functional peers produced this in the field on iPhone↔Mac-Catalyst pairs over Wi-Fi.

Fix

`idle=10s, interval=30s, count=3` → ~100s to declare dead (10s idle + 3 × 30s probes). Wi-Fi-friendly: a few seconds of natural network jitter during the handshake or quiet CMB flow doesn't trigger reaping. Peer-restart scenarios still recover within ~100s instead of the macOS default of ~2h.
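For reference, the arithmetic behind the ~4s and ~100s figures can be sketched as below. This is an illustrative Python sketch, not the code changed in this PR: `keepalive_detection_seconds` and `apply_keepalive` are hypothetical helper names, and the socket-option handling assumes a platform where Python exposes `TCP_KEEPIDLE` (Linux) or `TCP_KEEPALIVE` (macOS) for the idle time.

```python
import socket


def keepalive_detection_seconds(idle: int, interval: int, count: int) -> int:
    """Approximate time to declare a silent peer dead.

    The OS waits `idle` seconds of silence, then sends `count` probes
    spaced `interval` seconds apart before giving up.
    """
    return idle + interval * count


def apply_keepalive(sock: socket.socket, idle: int, interval: int, count: int) -> None:
    """Enable TCP keepalive on `sock` with the given timings (hypothetical helper)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The idle-time option is named differently across platforms.
    idle_opt = getattr(socket, "TCP_KEEPIDLE", None)
    if idle_opt is None:
        idle_opt = getattr(socket, "TCP_KEEPALIVE", None)

    options = (
        (idle_opt, idle),
        (getattr(socket, "TCP_KEEPINTVL", None), interval),
        (getattr(socket, "TCP_KEEPCNT", None), count),
    )
    for opt, value in options:
        if opt is not None:  # skip options this platform does not expose
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


print(keepalive_detection_seconds(1, 1, 3))    # v0.3.81 timings -> 4
print(keepalive_detection_seconds(10, 30, 3))  # v0.3.82 timings -> 100
```

The detection time is approximate: real behavior also depends on retransmission state and on how the OS rounds the timers.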

The application-layer `lastSeen`-stale check in `SymNode.addPeer` (shipped in v0.3.81) still gives fast ~10s peer-restart recovery — that's the user-visible "peer restarted, reconnect now" path. OS keepalive is the slower fallback for cases the application layer doesn't see.
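The `lastSeen`-stale idea might look roughly like the sketch below. This is an assumption-laden illustration in Python, not the actual `SymNode.addPeer` implementation, which this PR does not show. `PeerTable`, `Session`, `add_peer`, and `STALE_AFTER_SECONDS` are hypothetical names, and the 10s threshold is taken only from the "~10s" figure above.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict

# Assumed threshold, based on the "~10s" recovery figure in the description.
STALE_AFTER_SECONDS = 10.0


@dataclass
class Session:
    peer_id: str
    last_seen: float
    closed: bool = False


@dataclass
class PeerTable:
    # Injectable clock so the staleness logic can be exercised without sleeping.
    now: Callable[[], float] = time.monotonic
    sessions: Dict[str, Session] = field(default_factory=dict)

    def mark_seen(self, peer_id: str) -> None:
        """Record traffic from a peer, refreshing its last-seen timestamp."""
        session = self.sessions.get(peer_id)
        if session is not None:
            session.last_seen = self.now()

    def add_peer(self, peer_id: str) -> Session:
        """Return the existing session if it is fresh; otherwise replace it."""
        current = self.now()
        existing = self.sessions.get(peer_id)
        if existing is not None:
            if current - existing.last_seen < STALE_AFTER_SECONDS:
                return existing  # recently heard from: keep the live session
            existing.closed = True  # stale: treat the peer as restarted
        session = Session(peer_id=peer_id, last_seen=current)
        self.sessions[peer_id] = session
        return session
```

Under these assumptions, a peer that restarts and is rediscovered gets a fresh session once the old one has been silent for the threshold, without waiting for the OS keepalive to expire.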

Tests

71/71 existing unit tests pass.

🤖 Generated with Claude Code

v0.3.81 keepalive (idle=1s, interval=1s, count=3 → ~4s detection) was
too aggressive. Brief mid-handshake pauses on Wi-Fi got the connection
reaped before the application-level handshake exchange could complete,
producing "session: handshake timeout after 10s — disconnecting" on
healthy connections.

v0.3.82: idle=10s, interval=30s, count=3 → ~100s to declare dead.
Wi-Fi-friendly. Healthy connections survive ordinary network jitter.

Application-layer lastSeen-stale check in addPeer (also from v0.3.81)
still gives fast (~10s) peer-restart recovery; OS keepalive is the
fallback for scenarios the application layer doesn't see.

71/71 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sym-bot sym-bot merged commit 03aed34 into main Apr 29, 2026
1 check passed
@sym-bot sym-bot deleted the fix/relax-keepalive-timings branch April 29, 2026 19:38