From aacc7eafc9ab922dcb90018f66597f4575f8bb7e Mon Sep 17 00:00:00 2001 From: variablefate Date: Wed, 6 May 2026 19:03:28 -0700 Subject: [PATCH 1/7] fix(sdk): query actual relay state in RelayManager.isConnected MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit `isConnected` returned `client != nil && !connectedRelayURLs.isEmpty && _handlerAlive` — none of which reflect WebSocket health. The `Client` object stays alive through airplane-mode toggles; the notification handler task only dies on hard error, not on transient drops. So the getter always reported "connected" once any relay had ever been reached on launch. Side effect: every consumer of the lying signal — the per-tab inline offline UI in DriversTab/HistoryTab/RideTab, the ConnectivitySheet relay dots, the new ConnectivityPill from #100, the `guard await isConnected` short-circuits in AppState reconnect / sync flows — was silently broken on-device. Visible on real device only because tests use FakeRelayManager which has its own truthful state. Fix: query `client.relays()` and check each `Relay.isConnected()` (the rust-nostr binding's live per-relay status). "At least one relay connected" → true. The `_handlerAlive` flag remains relevant for `reconnectIfNeeded`'s liveness gate; that's a separate concern from "is a relay reachable right now." Found during on-device verification of #99 + #100. Pre-existing bug, not introduced by either PR. 
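The shape of the patch-1 fix can be sketched in isolation. `FakeRelay` and `FakeClient` below are hypothetical stand-ins for the rust-nostr `Relay`/`Client` bindings (the real `relays()` call is async on the RelayManager actor); only the query pattern is the point:

```swift
// Truthful "at least one relay connected" check, as in the patch above.
// FakeRelay / FakeClient are hypothetical stand-ins for the rust-nostr
// bindings; the real query is async and lives on the RelayManager actor.
struct FakeRelay {
    let connected: Bool
    func isConnected() -> Bool { connected }
}

struct FakeClient {
    let relayMap: [String: FakeRelay]
    func relays() -> [String: FakeRelay] { relayMap }
}

// No client yet -> false; otherwise true iff any relay reports a live
// WebSocket right now, instead of trusting Swift-side cached flags.
func isAnyRelayConnected(client: FakeClient?) -> Bool {
    guard let client else { return false }
    for relay in client.relays().values where relay.isConnected() {
        return true
    }
    return false
}
```

An empty relay map and an all-disconnected map both report false, which is exactly the case the old cached-state getter got wrong.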
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../Sources/RidestrSDK/Nostr/RelayManager.swift | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift b/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift index 35a191c..3d769d5 100644 --- a/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift +++ b/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift @@ -31,8 +31,23 @@ public actor RelayManager: RelayManagerProtocol { await teardownConnection(clearRelayURLs: true) } + /// At least one relay is currently connected. Queries the underlying + /// rust-nostr `Client` for the live per-relay status — the previous impl + /// returned `client != nil && !connectedRelayURLs.isEmpty && _handlerAlive` + /// which all stay true through airplane-mode toggles and other transient + /// network drops, so it lied about the actual WebSocket state. The + /// `_handlerAlive` flag remains relevant for `reconnectIfNeeded`'s + /// liveness gate; it's a separate concern from "is a relay reachable + /// right now." public var isConnected: Bool { - client != nil && !connectedRelayURLs.isEmpty && _handlerAlive + get async { + guard let client else { return false } + let relays = await client.relays() + for relay in relays.values where relay.isConnected() { + return true + } + return false + } } private func markHandlerDead() { From 6e5099698dcadf39fd0c7ab8575c46dc98ea73f3 Mon Sep 17 00:00:00 2001 From: variablefate Date: Wed, 6 May 2026 19:12:11 -0700 Subject: [PATCH 2/7] fix(sdk): reconnectIfNeeded force-rebuilds instead of trusting cached state MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Discovered while retesting #101 on-device: even with the truthful isConnected getter, plain background→foreground resume never reflected the offline state. 
The handler's reconnect path was gated by `guard client == nil || !_handlerAlive else { return }` — and on iOS suspend, the Swift Client object stays alive and `_handlerAlive` stays true (rust-nostr only flips it on a hard error from the notification handler, not on a silently-killed socket). So the foreground handler bailed out without rebuilding, the existing client kept reporting its old per-relay statuses, and `isConnected` (now truthful per HEAD~1) reflected that stale state. Drop the guard entirely. `reconnectIfNeeded` is only called from foreground handlers and explicit user reconnect actions — both are exactly the moments when cached state can't be trusted. Cost is one ~1s handshake per foreground transition; gain is correctness. Force-quit launch already worked because there was no existing client to lie about; this fix extends that correctness to resume paths. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../RidestrSDK/Nostr/RelayManager.swift | 20 +++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift b/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift index 3d769d5..37262a5 100644 --- a/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift +++ b/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift @@ -54,13 +54,25 @@ public actor RelayManager: RelayManagerProtocol { _handlerAlive = false } - /// Reconnect to relays if disconnected. Call from app foreground handler. - /// Does NOT restart subscriptions — callers must re-subscribe after this returns. + /// Force-rebuild the relay client. Call from app foreground handler. 
+ /// + /// Always replaces the client rather than short-circuiting on cached + /// state because the cached state lies on iOS background→foreground: + /// the OS suspends WebSockets on background, rust-nostr's per-relay + /// status doesn't update until the next read/write attempt fails, and + /// the `_handlerAlive` flag only flips false on a hard error from the + /// notification handler (which doesn't fire on a silently-killed + /// socket). Tearing down and rebuilding is the only way to get + /// truthful state — the alternative is letting the user sit in + /// "everything looks fine" while their relays are dead. + /// + /// Does NOT restart subscriptions — callers must re-subscribe after + /// this returns. Cheap when relays are reachable (~1s handshake); + /// the trade-off vs. per-call cost is correctness on every foreground. public func reconnectIfNeeded() async { guard !connectedRelayURLs.isEmpty else { return } - guard client == nil || !_handlerAlive else { return } - RidestrLogger.error("[RelayManager] Reconnecting relay client") + RidestrLogger.info("[RelayManager] Force-rebuilding relay client (foreground / explicit reconnect)") do { try await replaceClient(with: connectedRelayURLs) } catch { From f309bd572ca42d979972cd1a886427632294c8e9 Mon Sep 17 00:00:00 2001 From: variablefate Date: Wed, 6 May 2026 19:26:46 -0700 Subject: [PATCH 3/7] fix(connection): NWPathMonitor reactive signal for offline detection MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Even with reconnectIfNeeded force-rebuilding (HEAD~1) and isConnected querying truthful per-relay state (HEAD~2), the offline pill still took ~2 minutes to appear after toggling airplane mode mid-foreground — because nothing was triggering reconnectAndRestoreSession during that window. 
The 10s connection watchdog kept polling isConnected, which read rust-nostr's cached per-relay state, which stayed stale until rust-nostr's internal heartbeat eventually noticed the dead socket. Add an NWPathMonitor inside ConnectionCoordinator. Any iOS-observed network path change (Wi-Fi/cellular swap, airplane toggle, captive portal) immediately fires reconnect, bypassing the isConnected gate. rust-nostr's stale state can't fool the OS-level signal. The very first path update fires on monitor start with the current path — skipped to avoid a spurious rebuild on launch. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../ViewModels/ConnectionCoordinator.swift | 58 +++++++++++++++++-- 1 file changed, 52 insertions(+), 6 deletions(-) diff --git a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift index 99c7749..2bb1954 100644 --- a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift +++ b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift @@ -1,16 +1,35 @@ import Foundation +import Network -/// Owns the periodic relay connection watchdog task. +/// Owns the periodic relay connection watchdog task and a path-change +/// reactive signal. /// -/// Checks connectivity on a fixed interval and triggers reconnection -/// when the relay drops. All behavior is injected via closures so the -/// coordinator has no direct dependencies on AppState or SDK services. +/// Two complementary mechanisms drive reconnects: +/// +/// 1. **Periodic watchdog** — polls `isConnected` on a fixed interval and +/// triggers reconnect when it returns false. Bounded latency for any +/// drop the relay manager has noticed. +/// +/// 2. **`NWPathMonitor` reactive signal** — fires whenever iOS observes a +/// network path change (Wi-Fi/cellular swap, airplane mode toggle, +/// captive portal). 
Forces a reconnect immediately, bypassing the +/// `isConnected` gate, because the previous connection's cached state +/// is by definition suspect after a path change. rust-nostr's +/// per-relay status doesn't update until its next read/write fails, +/// which can take minutes on a silently-killed socket — the OS-level +/// signal is much faster. +/// +/// All behavior is injected via closures so the coordinator has no +/// direct dependencies on AppState or SDK services. @MainActor final class ConnectionCoordinator { private var watchdogTask: Task<Void, Never>? + private var pathMonitor: NWPathMonitor? + private var pathMonitorQueue: DispatchQueue? + private var hasReceivedFirstPath = false private var isReconnecting = false - /// Start the periodic watchdog. + /// Start the periodic watchdog and the path-change reactive signal. /// /// - Parameters: /// - interval: Time between connectivity checks. @@ -34,12 +53,39 @@ final class ConnectionCoordinator { await reconnect() } } + + let monitor = NWPathMonitor() + let queue = DispatchQueue(label: "com.roadflare.connection-coordinator.path-monitor") + monitor.pathUpdateHandler = { [weak self] _ in + // The very first path update fires on monitor start with the + // current path — not a real change. Skip that to avoid a + // spurious rebuild on app launch when `setupServices` has just + // established the initial connection. + Task { @MainActor [weak self] in + guard let self else { return } + if !self.hasReceivedFirstPath { + self.hasReceivedFirstPath = true + return + } + guard !self.isReconnecting, shouldReconnect() else { return } + self.isReconnecting = true + defer { self.isReconnecting = false } + await reconnect() + } + } + monitor.start(queue: queue) + self.pathMonitor = monitor + self.pathMonitorQueue = queue } - /// Stop the watchdog and cancel any in-flight reconnection. + /// Stop the watchdog, cancel the path monitor, and any in-flight reconnection.
func stop() { watchdogTask?.cancel() watchdogTask = nil + pathMonitor?.cancel() + pathMonitor = nil + pathMonitorQueue = nil + hasReceivedFirstPath = false isReconnecting = false } } From f397a4a5970fd152a14b4d9a3a1c0fcb1df918e4 Mon Sep 17 00:00:00 2001 From: variablefate Date: Wed, 6 May 2026 20:29:24 -0700 Subject: [PATCH 4/7] chore: address sub-80 code-review polish MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Remove `_handlerAlive` flag entirely. The PR's reconnectIfNeeded rewrite dropped its only reader (the cached-state guard), leaving three writes and zero reads. The new isConnected doc-comment was also factually wrong about the flag "remaining relevant for reconnectIfNeeded's liveness gate" — that gate was deleted in the same PR. Removing the flag, its writes, the markHandlerDead helper, and the monitor Task that called it leaves a cleaner RelayManager whose connection state is purely the rust-nostr per-relay query. - ConnectionCoordinator: track path-monitor reconnect Tasks so stop() can cancel any in-flight reconnect. Without this, a fire-and-forget Task spawned by a path-update event could continue calling the injected `reconnect` closure after the coordinator was torn down (e.g. on logout / identity replacement). Mirrors PR #95's tracked-Task pattern for the onboarding-publish watchdog. Code-review follow-up. No user-visible behavior change. 
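The tracked-Task teardown pattern described above reduces to a few lines. `TrackedReconnect` and its fixed 50 ms debounce are illustrative assumptions, not the app's real API; the point is that keeping the Task handle lets `stop()` cancel work that is still in flight:

```swift
import Foundation

// Illustrative reduction of the tracked-Task pattern: keep the handle of
// the fire-and-forget reconnect Task so teardown can cancel it before it
// calls back into a coordinator that no longer exists.
final class TrackedReconnect {
    private var reconnectTask: Task<Void, Never>?

    func scheduleReconnect(_ work: @escaping @Sendable () async -> Void) {
        reconnectTask = Task {
            // Task.sleep throws CancellationError if the task is
            // cancelled, so stop() during this window aborts the work.
            guard (try? await Task.sleep(nanoseconds: 50_000_000)) != nil else { return }
            await work()
        }
    }

    func stop() {
        reconnectTask?.cancel()
        reconnectTask = nil
    }
}
```

Without the stored handle, `stop()` has nothing to cancel and the closure still fires after teardown, which is the bug this commit closes.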
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../RidestrSDK/Nostr/RelayManager.swift | 32 +++++------------ .../ViewModels/ConnectionCoordinator.swift | 34 +++++++++++++++---- 2 files changed, 36 insertions(+), 30 deletions(-) diff --git a/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift b/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift index 37262a5..fc4bcdb 100644 --- a/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift +++ b/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift @@ -14,7 +14,6 @@ public actor RelayManager: RelayManagerProtocol { private var connectedRelayURLs: [URL] = [] private var notificationHandler: NotificationRouter? private var notificationTask: Task<Void, Never>? - private var _handlerAlive = false // Explicit liveness flag — Task.isCancelled doesn't detect normal completion public init(keypair: NostrKeypair) { self.keypair = keypair @@ -33,12 +32,9 @@ public actor RelayManager: RelayManagerProtocol { /// At least one relay is currently connected. Queries the underlying /// rust-nostr `Client` for the live per-relay status — the previous impl - /// returned `client != nil && !connectedRelayURLs.isEmpty && _handlerAlive` + /// returned `client != nil && !connectedRelayURLs.isEmpty && handlerAlive` /// which all stay true through airplane-mode toggles and other transient - /// network drops, so it lied about the actual WebSocket state. The - /// `_handlerAlive` flag remains relevant for `reconnectIfNeeded`'s - /// liveness gate; it's a separate concern from "is a relay reachable - /// right now." + /// network drops, so it lied about the actual WebSocket state. public var isConnected: Bool { get async { guard let client else { return false } @@ -50,21 +46,16 @@ public actor RelayManager: RelayManagerProtocol { } } - private func markHandlerDead() { - _handlerAlive = false - } - /// Force-rebuild the relay client. Call from app foreground handler.
/// /// Always replaces the client rather than short-circuiting on cached /// state because the cached state lies on iOS background→foreground: - /// the OS suspends WebSockets on background, rust-nostr's per-relay - /// status doesn't update until the next read/write attempt fails, and - /// the `_handlerAlive` flag only flips false on a hard error from the - /// notification handler (which doesn't fire on a silently-killed - /// socket). Tearing down and rebuilding is the only way to get - /// truthful state — the alternative is letting the user sit in - /// "everything looks fine" while their relays are dead. + /// the OS suspends WebSockets on background, and rust-nostr's + /// per-relay status doesn't update until the next read/write attempt + /// fails (which can take minutes on a silently-killed socket). + /// Tearing down and rebuilding is the only way to get truthful state — + /// the alternative is letting the user sit in "everything looks fine" + /// while their relays are dead. /// /// Does NOT restart subscriptions — callers must re-subscribe after /// this returns. 
Cheap when relays are reachable (~1s handshake); @@ -107,7 +98,6 @@ private func startNotificationHandler(for client: Client) { let router = NotificationRouter() self.notificationHandler = router - self._handlerAlive = true let clientRef = client notificationTask = Task.detached { [router] in do { @@ -120,15 +110,9 @@ router.removeAll() RidestrLogger.error("[RelayManager] All subscription streams terminated due to disconnect") } - // Monitor handler liveness from a separate task - Task { [weak self] in - await self?.notificationTask?.value // Suspends until task completes - await self?.markHandlerDead() - } } private func teardownConnection(clearRelayURLs: Bool) async { - _handlerAlive = false notificationHandler?.removeAll() activeStreams.removeAll() subscriptionGenerations.removeAll() diff --git a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift index 2bb1954..bca5178 100644 --- a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift +++ b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift @@ -26,6 +26,13 @@ final class ConnectionCoordinator { private var watchdogTask: Task<Void, Never>? private var pathMonitor: NWPathMonitor? private var pathMonitorQueue: DispatchQueue? + /// Reconnect Tasks spawned by path-update events. Tracked so `stop()` + /// can cancel any in-flight reconnect that landed mid-tear-down (e.g. + /// on logout / identity replacement). Without this, a fire-and-forget + /// path Task could continue calling the injected `reconnect` closure + /// after the coordinator has been torn down. Mirrors the tracked-Task + /// pattern from PR #95's onboarding-publish watchdog.
+ private var pathReconnectTasks: [Task<Void, Never>] = [] private var hasReceivedFirstPath = false private var isReconnecting = false @@ -58,9 +65,12 @@ let queue = DispatchQueue(label: "com.roadflare.connection-coordinator.path-monitor") monitor.pathUpdateHandler = { [weak self] _ in // The very first path update fires on monitor start with the - // current path — not a real change. Skip that to avoid a - // spurious rebuild on app launch when `setupServices` has just - // established the initial connection. + // current path — not a real change in production paths. Skip + // it to avoid a spurious rebuild on app launch right after + // `setupServices` has established the initial connection. + // (Apple's `NWPathMonitor` documentation does not strictly + // guarantee this, but it is the consistently observed behavior + // for `start(queue:)` and is widely relied on across iOS apps.) Task { @MainActor [weak self] in guard let self else { return } if !self.hasReceivedFirstPath { @@ -68,9 +78,17 @@ return } guard !self.isReconnecting, shouldReconnect() else { return } - self.isReconnecting = true - defer { self.isReconnecting = false } - await reconnect() + let reconnectTask = Task { @MainActor [weak self] in + guard let self else { return } + guard !Task.isCancelled else { return } + self.isReconnecting = true + defer { self.isReconnecting = false } + await reconnect() + } + self.pathReconnectTasks.append(reconnectTask) + // Best-effort cleanup of finished Tasks so the array doesn't + // grow unboundedly across many path transitions.
+ self.pathReconnectTasks.removeAll(where: { $0.isCancelled }) } } monitor.start(queue: queue) @@ -85,6 +103,10 @@ pathMonitor?.cancel() pathMonitor = nil pathMonitorQueue = nil + for task in pathReconnectTasks { + task.cancel() + } + pathReconnectTasks.removeAll() hasReceivedFirstPath = false isReconnecting = false } From 541d84cee5953e23c1a696279914e53bb3d41ac4 Mon Sep 17 00:00:00 2001 From: variablefate Date: Thu, 7 May 2026 14:42:26 -0700 Subject: [PATCH 5/7] chore(connection): single pathReconnectTask instead of unbounded array Self-noticed defect in the previous polish commit: the `pathReconnectTasks: [Task<Void, Never>]` array's "best-effort cleanup" line only removed Tasks that had been explicitly cancelled, not Tasks that completed naturally. So the array grew with every path event during a session, bounded only by `stop()`. Tiny leak in practice but the cleanup comment was misleading. `isReconnecting` already serializes path-spawned reconnects to at most one in flight at any moment, so a single Task var is sufficient for the original goal of "let stop() cancel any in-flight reconnect." Co-Authored-By: Claude Opus 4.7 (1M context) --- .../ViewModels/ConnectionCoordinator.swift | 27 ++++++++----------- 1 file changed, 11 insertions(+), 16 deletions(-) diff --git a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift index bca5178..35c6d3f 100644 --- a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift +++ b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift @@ -26,13 +26,14 @@ final class ConnectionCoordinator { private var watchdogTask: Task<Void, Never>? private var pathMonitor: NWPathMonitor? private var pathMonitorQueue: DispatchQueue? - /// Reconnect Tasks spawned by path-update events. Tracked so `stop()` - /// can cancel any in-flight reconnect that landed mid-tear-down (e.g. - /// on logout / identity replacement).
Without this, a fire-and-forget - /// path Task could continue calling the injected `reconnect` closure - /// after the coordinator has been torn down. Mirrors the tracked-Task - /// pattern from PR #95's onboarding-publish watchdog. - private var pathReconnectTasks: [Task<Void, Never>] = [] + /// Most recently spawned path-event reconnect Task. `isReconnecting` + /// already serializes path-spawned reconnects (a second event during an + /// in-flight reconnect short-circuits via the guard), so at most one + /// such Task is doing real work at any moment — tracking the latest + /// is sufficient to cancel an in-flight reconnect on `stop()` (e.g. + /// logout / identity replacement). Mirrors the tracked-Task pattern + /// from PR #95's onboarding-publish watchdog. + private var pathReconnectTask: Task<Void, Never>? private var hasReceivedFirstPath = false private var isReconnecting = false @@ -78,17 +79,13 @@ return } guard !self.isReconnecting, shouldReconnect() else { return } - let reconnectTask = Task { @MainActor [weak self] in + self.pathReconnectTask = Task { @MainActor [weak self] in guard let self else { return } guard !Task.isCancelled else { return } self.isReconnecting = true defer { self.isReconnecting = false } await reconnect() } - self.pathReconnectTasks.append(reconnectTask) - // Best-effort cleanup of finished Tasks so the array doesn't - // grow unboundedly across many path transitions.
- self.pathReconnectTasks.removeAll(where: { $0.isCancelled }) } } monitor.start(queue: queue) @@ -103,10 +100,8 @@ final class ConnectionCoordinator { pathMonitor?.cancel() pathMonitor = nil pathMonitorQueue = nil - for task in pathReconnectTasks { - task.cancel() - } - pathReconnectTasks.removeAll() + pathReconnectTask?.cancel() + pathReconnectTask = nil hasReceivedFirstPath = false isReconnecting = false } From 66e0843d6f2e1f3ad09199205c64d3076118e9de Mon Sep 17 00:00:00 2001 From: variablefate Date: Thu, 7 May 2026 14:51:54 -0700 Subject: [PATCH 6/7] fix(connection): set isReconnecting synchronously in outer path Task MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes a race surfaced during re-review of PR #101. The previous structure spawned an inner Task and set `isReconnecting = true` inside its body, leaving a window where two rapid path events could both pass the outer `guard !isReconnecting` (since inner A hadn't yet set the flag), spawn overlapping inner Tasks, and orphan the first Task's handle in `pathReconnectTask`. NWPathMonitor can fire repeatedly on airplane-mode toggles or Wi-Fi flap, so this isn't theoretical. Setting `isReconnecting = true` in the outer Task body — same critical section as the guard check — closes the race. The inner Task's defer clears it via `self?.isReconnecting = false` (which is also the only clear, removing the duplicate `isReconnecting = true` assignment that was left over from the prior structure). Also tightens three doc-comments that became stale after the SDK changes: - isConnected doc: stop spelling the removed flag as `handlerAlive` (the actual name was `_handlerAlive`); generalise to "Swift-side cached state." - reconnectIfNeeded doc: qualify "always replaces" with "(when relay URLs are configured)" to match the `guard !connectedRelayURLs.isEmpty` early-exit. 
- ConnectionCoordinator class doc: drop "drop the relay manager has noticed" — leftover from the `_handlerAlive` semantics; isConnected now queries live state directly. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../RidestrSDK/Nostr/RelayManager.swift | 26 ++++++++++--------- .../ViewModels/ConnectionCoordinator.swift | 20 +++++++++----- 2 files changed, 28 insertions(+), 18 deletions(-) diff --git a/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift b/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift index fc4bcdb..7fe6cad 100644 --- a/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift +++ b/RidestrSDK/Sources/RidestrSDK/Nostr/RelayManager.swift @@ -31,10 +31,11 @@ public actor RelayManager: RelayManagerProtocol { } /// At least one relay is currently connected. Queries the underlying - /// rust-nostr `Client` for the live per-relay status — the previous impl - /// returned `client != nil && !connectedRelayURLs.isEmpty && handlerAlive` - /// which all stay true through airplane-mode toggles and other transient - /// network drops, so it lied about the actual WebSocket state. + /// rust-nostr `Client` for the live per-relay status — the previous + /// impl checked only Swift-side cached state (the client pointer, a + /// configured-URLs list, and a handler-liveness flag), all of which + /// stay true through airplane-mode toggles and other transient network + /// drops, so it lied about the actual WebSocket state. public var isConnected: Bool { get async { guard let client else { return false } @@ -48,18 +49,19 @@ public actor RelayManager: RelayManagerProtocol { /// Force-rebuild the relay client. Call from app foreground handler. 
/// - /// Always replaces the client rather than short-circuiting on cached - /// state because the cached state lies on iOS background→foreground: - /// the OS suspends WebSockets on background, and rust-nostr's - /// per-relay status doesn't update until the next read/write attempt - /// fails (which can take minutes on a silently-killed socket). - /// Tearing down and rebuilding is the only way to get truthful state — - /// the alternative is letting the user sit in "everything looks fine" - /// while their relays are dead. + /// Always replaces the client (when relay URLs are configured) rather + /// than short-circuiting on cached state because the cached state + /// lies on iOS background→foreground: the OS suspends WebSockets on + /// background, and rust-nostr's per-relay status doesn't update until + /// the next read/write attempt fails (which can take minutes on a + /// silently-killed socket). Tearing down and rebuilding is the only + /// way to get truthful state — the alternative is letting the user + /// sit in "everything looks fine" while their relays are dead. /// /// Does NOT restart subscriptions — callers must re-subscribe after /// this returns. Cheap when relays are reachable (~1s handshake); /// the trade-off vs. per-call cost is correctness on every foreground. + /// No-op if `connect(to:)` has not been called yet (no URLs to reach). public func reconnectIfNeeded() async { guard !connectedRelayURLs.isEmpty else { return } diff --git a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift index 35c6d3f..02b1de2 100644 --- a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift +++ b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift @@ -6,9 +6,11 @@ import Network /// /// Two complementary mechanisms drive reconnects: /// -/// 1. **Periodic watchdog** — polls `isConnected` on a fixed interval and -/// triggers reconnect when it returns false. 
Bounded latency for any -/// drop the relay manager has noticed. +/// 1. **Periodic watchdog** — polls the live `isConnected` query (which, +/// after PR #101, asks rust-nostr for per-relay status rather than +/// reading a cached flag) on a fixed interval and triggers reconnect +/// when it returns false. Bounded latency for any drop visible to +/// rust-nostr. /// /// 2. **`NWPathMonitor` reactive signal** — fires whenever iOS observes a /// network path change (Wi-Fi/cellular swap, airplane mode toggle, @@ -79,11 +81,17 @@ final class ConnectionCoordinator { return } guard !self.isReconnecting, shouldReconnect() else { return } + // Set the busy flag synchronously, in the same critical + // section as the guard check, so a second path event + // landing between this Task and the inner reconnect Task + // observes the busy state and short-circuits. (Setting it + // inside the inner Task left a race window where two + // rapid path events both passed the guard and spawned + // overlapping reconnects.) + self.isReconnecting = true self.pathReconnectTask = Task { @MainActor [weak self] in - guard let self else { return } + defer { self?.isReconnecting = false } guard !Task.isCancelled else { return } - self.isReconnecting = true - defer { self.isReconnecting = false } await reconnect() } } From 1e939d9d39d05a998dba2ad47cb37565b9bc2355 Mon Sep 17 00:00:00 2001 From: variablefate Date: Thu, 7 May 2026 15:10:40 -0700 Subject: [PATCH 7/7] fix(connection): re-check isReconnecting after watchdog's isConnected await MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the same class of race we just fixed in the path-handler. The watchdog's body has an `await isConnected()` between its `!isReconnecting` guard and the `isReconnecting = true` assignment. 
During that suspension, an NWPathMonitor event can land, set `isReconnecting = true`, and start its own reconnect — when the watchdog resumes it unconditionally sets the flag (already true) and calls `await reconnect()` concurrently with the in-flight one. Two back-to-back replaceClient rebuilds, second one tearing down what the first just built. One-line fix: re-check `!self.isReconnecting` after the connectivity await, before claiming the busy slot. Surfaced by the third /code-review pass on this PR (score 92). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../RoadFlareCore/ViewModels/ConnectionCoordinator.swift | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift index 02b1de2..81e740f 100644 --- a/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift +++ b/RoadFlare/RoadFlareCore/ViewModels/ConnectionCoordinator.swift @@ -58,6 +58,12 @@ final class ConnectionCoordinator { try? await Task.sleep(for: interval) guard let self, !self.isReconnecting, shouldReconnect() else { continue } guard !(await isConnected()) else { continue } + // Re-check after the connectivity await: a path-monitor + // event landing during the suspension can have set + // `isReconnecting = true` and started its own reconnect. + // Without this guard we'd kick off a second concurrent + // reconnect, tearing down the in-flight rebuild's client. + guard !self.isReconnecting else { continue } self.isReconnecting = true defer { self.isReconnecting = false } await reconnect()
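The last two commits converge on one discipline: the busy flag is claimed in the same synchronous section that checks it, and re-checked after every suspension point. `ReconnectGate` below is an illustrative reduction of that discipline, not app code:

```swift
// Check-and-claim in one synchronous step: any await between "check the
// flag" and "set the flag" is a window for a second caller to slip in,
// so callers resuming from an await must go through tryBegin() again.
final class ReconnectGate {
    private(set) var isReconnecting = false
    private(set) var reconnectCount = 0

    /// Claim the busy slot; false means a reconnect is already in
    /// flight and the caller must bail instead of starting another.
    func tryBegin() -> Bool {
        guard !isReconnecting else { return false }
        isReconnecting = true
        return true
    }

    func finish() { isReconnecting = false }

    func performReconnect() { reconnectCount += 1 }
}
```

In the watchdog this corresponds to the re-check guard after `await isConnected()`: a path-monitor event may have claimed the slot during the suspension, and the resuming watchdog must observe that rather than trust its pre-await check.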