fix(sdk): query actual relay state in RelayManager.isConnected by variablefate · Pull Request #101 · variablefate/roadflare-ios

variablefate · 2026-05-07T02:03:50Z

Summary

RelayManager.isConnected was checking only Swift-side state — client != nil && !connectedRelayURLs.isEmpty && _handlerAlive — none of which track WebSocket health. The Client object stays alive through airplane-mode toggles and the notification handler task only dies on hard errors, so the getter has always reported "connected" once a relay had ever been reached on launch.
Discovered on-device while verifying feat(sdk): rethrow publishProfileAndMark + publishAndMark for eager onboarding banner #99 + feat(ui): global offline pill at top of RootView #100. With airplane mode on, the new connectivity pill never appeared and ConnectivitySheet showed all 3 relay dots green even though driver pictures clearly weren't loading.
Fix: query client.relays() and check each Relay.isConnected() (rust-nostr binding's live per-relay status). "At least one relay connected" → true.
The _handlerAlive flag stays — it's still useful for reconnectIfNeeded's liveness gate, but that's a separate concern from "is a relay reachable right now."

Impact

Pre-existing bug, not introduced by #99 or #100, but it gates both from working as intended on real devices:

The ConnectivityPill from feat(ui): global offline pill at top of RootView #100 polls isRelayConnected() — would never show on-device without this fix.
OnboardingPublishFailureBanner's offline-park branch (feat(onboarding): publish-failure banner with retry #95 / ADR-0016) and feat(sdk): rethrow publishProfileAndMark + publishAndMark for eager onboarding banner #99's eager-error connectivity gate both depended on the same getter.
Per-tab inline offline UI (DriversTab, HistoryTab, RideTab) all poll the same getter.
ConnectivitySheet's 3 relay dots use the same global flag (separate UI bug; this fix at least makes the "all green / all red" branch correct rather than always-green-when-offline).

Tests use FakeRelayManager which maintains its own truthful _isConnected state, so the regression never surfaced in CI.

Sequencing

This should land before #99 and #100; both PRs need to be rebased on top so they actually work on-device. Happy to do the rebase once this merges.

Test plan

xcodebuild ... build passes for device.
xcodebuild ... test (full iOS test plan) passes — FakeRelayManager satisfies the protocol's get async requirement either via sync getter (RidestrSDKTests/FakeRelayManager.swift) or get async (RoadFlareTests/FakeRelayManager.swift).
Manual on-device with combined branch (feat(sdk): rethrow publishProfileAndMark + publishAndMark for eager onboarding banner #99 + feat(ui): global offline pill at top of RootView #100 + this fix): airplane mode ON → pill appears within ~10s, ConnectivitySheet dots turn red. Airplane mode OFF → pill clears within ~10s.

🤖 Generated with Claude Code

`isConnected` returned `client != nil && !connectedRelayURLs.isEmpty && _handlerAlive` — none of which reflect WebSocket health. The `Client` object stays alive through airplane-mode toggles; the notification handler task only dies on hard error, not on transient drops. So the getter always reported "connected" once any relay had ever been reached on launch. Side effect: every consumer of the lying signal — the per-tab inline offline UI in DriversTab/HistoryTab/RideTab, the ConnectivitySheet relay dots, the new ConnectivityPill from #100, the `guard await isConnected` short-circuits in AppState reconnect / sync flows — was silently broken on-device. Visible on real device only because tests use FakeRelayManager which has its own truthful state. Fix: query `client.relays()` and check each `Relay.isConnected()` (the rust-nostr binding's live per-relay status). "At least one relay connected" → true. The `_handlerAlive` flag remains relevant for `reconnectIfNeeded`'s liveness gate; that's a separate concern from "is a relay reachable right now." Found during on-device verification of #99 + #100. Pre-existing bug, not introduced by either PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… state Discovered while retesting #101 on-device: even with the truthful isConnected getter, plain background→foreground resume never reflected the offline state. The handler's reconnect path was gated by `guard client == nil || !_handlerAlive else { return }` — and on iOS suspend, the Swift Client object stays alive and `_handlerAlive` stays true (rust-nostr only flips it on a hard error from the notification handler, not on a silently-killed socket). So the foreground handler bailed out without rebuilding, the existing client kept reporting its old per-relay statuses, and `isConnected` (now truthful per HEAD~1) reflected that stale state. Drop the guard entirely. `reconnectIfNeeded` is only called from foreground handlers and explicit user reconnect actions — both are exactly the moments when cached state can't be trusted. Cost is one ~1s handshake per foreground transition; gain is correctness. Force-quit launch already worked because there was no existing client to lie about; this fix extends that correctness to resume paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Even with reconnectIfNeeded force-rebuilding (HEAD~1) and isConnected querying truthful per-relay state (HEAD~2), the offline pill still took ~2 minutes to appear after toggling airplane mode mid-foreground — because nothing was triggering reconnectAndRestoreSession during that window. The 10s connection watchdog kept polling isConnected, which read rust-nostr's cached per-relay state, which stayed stale until rust-nostr's internal heartbeat eventually noticed the dead socket. Add an NWPathMonitor inside ConnectionCoordinator. Any iOS-observed network path change (Wi-Fi/cellular swap, airplane toggle, captive portal) immediately fires reconnect, bypassing the isConnected gate. rust-nostr's stale state can't fool the OS-level signal. The very first path update fires on monitor start with the current path — skipped to avoid a spurious rebuild on launch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

variablefate · 2026-05-07T03:13:36Z

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

_{- If this code review was useful, please react with 👍. Otherwise, react with 👎.}

- Remove `_handlerAlive` flag entirely. The PR's reconnectIfNeeded rewrite dropped its only reader (the cached-state guard), leaving three writes and zero reads. The new isConnected doc-comment was also factually wrong about the flag "remaining relevant for reconnectIfNeeded's liveness gate" — that gate was deleted in the same PR. Removing the flag, its writes, the markHandlerDead helper, and the monitor Task that called it leaves a cleaner RelayManager whose connection state is purely the rust-nostr per-relay query. - ConnectionCoordinator: track path-monitor reconnect Tasks so stop() can cancel any in-flight reconnect. Without this, a fire-and-forget Task spawned by a path-update event could continue calling the injected `reconnect` closure after the coordinator was torn down (e.g. on logout / identity replacement). Mirrors PR #95's tracked-Task pattern for the onboarding-publish watchdog. Code-review follow-up. No user-visible behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

variablefate and others added 3 commits May 6, 2026 19:03

variablefate mentioned this pull request May 7, 2026

feat(ui): global offline pill at top of RootView #100

Closed

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(sdk): query actual relay state in RelayManager.isConnected#101

fix(sdk): query actual relay state in RelayManager.isConnected#101
variablefate wants to merge 4 commits intomainfrom
claude/relay-isconnected-truthful

variablefate commented May 7, 2026

Uh oh!

variablefate commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

variablefate commented May 7, 2026

Summary

Impact

Sequencing

Test plan

Uh oh!

variablefate commented May 7, 2026

Code review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant