You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Umbrella tracking issue for test-coverage gaps identified during the audit of #47 (per-component lifecycle state with derived ComapeoState). Each item below is independent; they're grouped here so a single agent (or a human picking up the work) has the full picture and can break them into sub-issues sized to whoever takes them.
The backend (backend/) has zero test infrastructure today (npm test runs expo-module test, which doesn't cover the Node-side code). Several behaviours introduced in #47 live entirely on the Node side and have no tests:
1a. error-native handler — malformed-input rejection (backend/index.js:140-152)
Should ignore (with a console.warn) frames where phase or message is not a string. Expected: no call to handleFatal, no process.exit, no error broadcast.
handleFatal calls controlIpcServer.broadcastError({phase, message, stack}) — every connected client receives an error frame.
After a 100ms flush window, process.exit(1) runs.
1c. shutdown handler — ordering invariant (backend/index.js:104-118)
The first line of the shutdown handler is controlIpcServer.broadcast({ type: \"stopping\" }). The contract is that this frame reaches all connected clients before the socket close, because native's exit-classification logic uses the presence of stopping to distinguish graceful from unexpected disconnects (see ARCHITECTURE.md §5.4).
1d. SimpleRpcServer.broadcast(msg) and broadcastError(payload) (backend/lib/simple-rpc.js:131-165)
broadcast: posts to every connected client, doesn't throw if a single client errors (logs and continues).
broadcastError: same, with the {type:\"error\", ...payload} shape.
Neither replays to late-connecting clients (intentional — see ARCHITECTURE.md §3.1).
setReadinessPhase DOES replay (already implicitly covered by the iOS handshake tests, but a direct unit test would be tighter).
Infrastructure needed
A Node-side test runner. Suggested approach:
Use node:test (built into Node 24, which is the dev runtime per package.json's devEngines).
Add a backend:test npm script.
For 1a/1d: in-process tests using SimpleRpcServer directly with a pair of MessagePort-style mocks or a real AF_UNIX socket pair on /tmp.
For 1b/1c: subprocess tests — spawn backend/index.js as a child process, feed it framed messages over a real socket, observe broadcasts and exit code. The existing MockNodeServer shape on iOS is a useful reference for the wire format.
Acceptance criteria
npm run backend:test exists and runs in CI (extend .github/workflows/).
Coverage for 1a–1d above.
Tests are fast (<5s total) so they can run on every PR.
Item 2: Android start() / stop() refused from ERROR
NodeJSService.start() (android/src/main/java/com/comapeo/core/NodeJSService.kt:345-362) and stop() (:559-575) both check getState() and refuse to proceed from ERROR. iOS has equivalent guards and they're tested by testStartFromErrorStateIsRejected and the body of testStopWhenAlreadyStoppedIsIgnored. Android has no equivalent test.
Constraints
JVM unit tests can't construct NodeJSService because the companion-object init block calls System.loadLibrary(\"comapeo-core-react-native\"), which fails outside a device/emulator.
Instrumented test in apps/example/tests/android/. Construct NodeJSService, drive it to ERROR (e.g. via the watchdog by passing a tiny startupTimeoutMs and never sending started), assert subsequent start() throws and stop() no-ops. Cost: +1 instrumented test, runs only on the emulator job in CI.
Refactor the guards out into a small testable function that takes state: State and returns a sealed StartDecision / StopDecision. Pure, JVM-testable. Lower coverage than the instrumented option (doesn't verify the call site uses the function correctly) but vastly cheaper.
Acceptance criteria
A failing test would catch "someone removed the ERROR guard from start() / stop()".
Whichever approach is chosen, it covers both methods on Android.
ComapeoCoreModule.kt's Disconnected handler maps the IPC's pre-disconnect JS state to a post-disconnect outcome (see ARCHITECTURE.md §5.4 table):
when (current) {
JsState.ERROR-> {} // terminalJsState.STOPPING, JsState.STOPPED-> setState(JsState.STOPPED)
JsState.STARTING, JsState.STARTED-> setState(JsState.ERROR, mapOf(
\"errorPhase\" to \"node-runtime-unexpected\",\"errorMessage\" to \"Backend disconnected unexpectedly\", ))}
This is the cross-process equivalent of the iOS nodeRuntime.exited(_, .unexpected) rule. It's the linchpin of correct ERROR detection on Android: if it's wrong, an unannounced backend crash silently lands in STOPPED, which is the bug #47 was designed to prevent.
Constraints
Same as Item 2 — JVM unit tests can't construct the module class (it derives from Expo's Module which depends on Android framework classes).
Suggested approach
Instrumented test that:
Boots the FGS in the test process or relies on it being up.
Connects a real NodeJSIPC to control.sock.
For each pre-disconnect state, drives it (via control frames or by waiting for natural transitions), then closes the FGS-side socket abruptly.
Asserts the JS state observer fires with the expected outcome.
The harder bit is step 1 — needs the FGS to actually be running with a known initial state. Easier path: extract the disconnect-reason mapping into a pure disconnectReason(JsState): JsState function and test that directly. Pair with an integration test that just spot-checks one path end-to-end.
Acceptance criteria
The three branches (ERROR / STOPPING-or-STOPPED / STARTING-or-STARTED) are each verified.
Success path: ipcDeferred completes with a mock IPC, sendMessage is called with the JSON-serialised {type:\"error-native\", phase, message} payload.
Timeout path: ipcDeferred never completes; after 2s the frame is dropped and a warning is logged. No exception thrown.
Constraints
Same JNI loading constraint. Plus serviceScope and ipcDeferred are private.
Suggested approach
Two options:
Extract the body into a top-level suspend fun sendErrorNativeFrame(ipcDeferred: Deferred<NodeJSIPC>, phase: String, message: String, json: Json, timeoutMs: Long): Boolean. JVM-testable with a mock NodeJSIPC interface. The class method becomes a one-liner.
Instrumented test that drives NodeJSService through both paths.
Option 1 is cleaner — the function is logically a free helper; nothing it does requires class state.
Acceptance criteria
Both paths covered, including the 2s timeout via runTest's virtual time (advanceTimeBy(2_001)).
Item 5: Android runNode() catch branch
The catch branch in runNode() (android/.../NodeJSService.kt:442-462) sets both backendState = BackendState.Error(...) and nodeRuntime = NodeRuntimeState.Exited(_, REQUESTED) to keep the component triple consistent after the thread unwinds.
Constraints
Hard to reach in a test. Requires startNodeWithArguments (a JNI external) to throw, which is impossible in a JVM unit test without an alternate stub-loaded library.
Priority
Low. Visible state still derives correctly via the backend-error rule (rule 1) even if nodeRuntime were left stale; the consistency fix from PR #48 is defensive against future code that might re-derive after another mutation. Concretely: this branch is only ever entered if startNodeWithArguments itself throws, which in production means the JNI bridge died — at that point most of the rest of the test infra is also broken.
Suggested approach (if pursued)
Instrumented test where the JS index file is deliberately malformed in a way that causes the native runtime to throw at startup. Brittle and platform-version-sensitive.
Or accept that this branch is exercised only in production and skip a test.
Acceptance criteria
Either: a test that causes runNode's catch branch to fire and verifies the component triple ends consistent.
Or: explicit decision (in this issue or its sub-issue's resolution) that the path is too costly to test and we accept the static-analysis-only coverage.
Suggested sub-issue layout
If splitting:
Sub-issue A: Item 1 (backend test harness). Largest piece — sets up new infrastructure, then implements 1a–1d on top.
Sub-issue B: Items 2 + 3 + 4 grouped ("Android testability"). They share the JNI-loading constraint and benefit from a coherent decision on instrumented-vs-extract-and-unit-test.
Sub-issue C: Item 5 (or close as wontfix with rationale).
Each sub-issue should reference back here and link to the relevant ARCHITECTURE.md sections.
Pointers for whoever picks this up
ARCHITECTURE.md §5.5 (Errors) is the design reference for which paths exist and what phase strings each uses.
ARCHITECTURE.md §5.7 (Timeout topology) is the reference for the 2s error-native timeout (Item 4) and other timeout interactions.
The iOS MockBackend (ios/Tests/Helpers/MockBackend.swift) is a useful pattern for what a backend-side test fixture looks like — wire format, framing, handshake sequencing.
LifecycleStateDerivation.kt is the model for "extract the testable bit out of the JNI-loading class" if you go that route on the Android side.
Umbrella tracking issue for test-coverage gaps identified during the audit of #47 (per-component lifecycle state with derived
ComapeoState). Each item below is independent; they're grouped here so a single agent (or a human picking up the work) has the full picture and can break them into sub-issues sized to whoever takes them.Context
The branch under #47 introduces:
deriveState(NodeRuntime, BackendState, stopRequested) → Statefunction on both platforms.stoppingcontrol frame (Node → native) for graceful-shutdown attribution.error-nativecontrol frame (Android FGS → backend) for cross-process error attribution.SimpleRpcServer.broadcast(msg)(generalised frombroadcastError).ERROR(start()/stop()refused once entered).lastErrorcleared on freshstart(), preserved acrosscleanup().What's tested in #47 (post-audit additions):
DeriveStateTests/Test).Stoppingframe parse on both platforms.errorframe → ERROR; rootkey load failure → ERROR;stoppingframe → STOPPING → STOPPED.start()from ERROR refused (existingtestStartFromErrorStateIsRejected).What's NOT tested — the items in this issue.
Architectural reference:
docs/ARCHITECTURE.md§5 (lifecycle state machine), §5.5 (errors), §5.7 (timeout topology).Item 1: Backend (Node) test harness
The backend (
backend/) has zero test infrastructure today (npm testrunsexpo-module test, which doesn't cover the Node-side code). Several behaviours introduced in #47 live entirely on the Node side and have no tests:1a.
error-nativehandler — malformed-input rejection (backend/index.js:140-152)Should ignore (with a
console.warn) frames wherephaseormessageis not a string. Expected: no call tohandleFatal, noprocess.exit, noerrorbroadcast.1b.
error-nativehandler — valid input → re-broadcast + exit (backend/index.js:140-152)A well-formed
{type:\"error-native\", phase, message}frame should:handleFatal(phase, new Error(message)).handleFatalcallscontrolIpcServer.broadcastError({phase, message, stack})— every connected client receives anerrorframe.process.exit(1)runs.1c.
shutdownhandler — ordering invariant (backend/index.js:104-118)The first line of the shutdown handler is
controlIpcServer.broadcast({ type: \"stopping\" }). The contract is that this frame reaches all connected clients before the socket close, because native's exit-classification logic uses the presence ofstoppingto distinguish graceful from unexpected disconnects (see ARCHITECTURE.md §5.4).1d.
SimpleRpcServer.broadcast(msg)andbroadcastError(payload)(backend/lib/simple-rpc.js:131-165)broadcast: posts to every connected client, doesn't throw if a single client errors (logs and continues).broadcastError: same, with the{type:\"error\", ...payload}shape.setReadinessPhaseDOES replay (already implicitly covered by the iOS handshake tests, but a direct unit test would be tighter).Infrastructure needed
A Node-side test runner. Suggested approach:
node:test(built into Node 24, which is the dev runtime perpackage.json'sdevEngines).backend:testnpm script.SimpleRpcServerdirectly with a pair ofMessagePort-style mocks or a real AF_UNIX socket pair on/tmp.backend/index.jsas a child process, feed it framed messages over a real socket, observe broadcasts and exit code. The existingMockNodeServershape on iOS is a useful reference for the wire format.Acceptance criteria
npm run backend:testexists and runs in CI (extend.github/workflows/).Item 2: Android
start()/stop()refused from ERRORNodeJSService.start()(android/src/main/java/com/comapeo/core/NodeJSService.kt:345-362) andstop()(:559-575) both checkgetState()and refuse to proceed from ERROR. iOS has equivalent guards and they're tested bytestStartFromErrorStateIsRejectedand the body oftestStopWhenAlreadyStoppedIsIgnored. Android has no equivalent test.Constraints
NodeJSServicebecause the companion-objectinitblock callsSystem.loadLibrary(\"comapeo-core-react-native\"), which fails outside a device/emulator.LifecycleStateDerivationextraction in refactor: per-component lifecycle state with derived ComapeoState #47 was specifically to keepderiveLifecycleStateJVM-testable; the rest of the class isn't.Suggested approach
Either:
apps/example/tests/android/. ConstructNodeJSService, drive it to ERROR (e.g. via the watchdog by passing a tinystartupTimeoutMsand never sendingstarted), assert subsequentstart()throws andstop()no-ops. Cost: +1 instrumented test, runs only on the emulator job in CI.state: Stateand returns a sealedStartDecision/StopDecision. Pure, JVM-testable. Lower coverage than the instrumented option (doesn't verify the call site uses the function correctly) but vastly cheaper.Acceptance criteria
Item 3: Android
ComapeoCoreModuledisconnect-reason logicComapeoCoreModule.kt'sDisconnectedhandler maps the IPC's pre-disconnect JS state to a post-disconnect outcome (see ARCHITECTURE.md §5.4 table):This is the cross-process equivalent of the iOS
nodeRuntime.exited(_, .unexpected)rule. It's the linchpin of correct ERROR detection on Android: if it's wrong, an unannounced backend crash silently lands in STOPPED, which is the bug #47 was designed to prevent.Constraints
Same as Item 2 — JVM unit tests can't construct the module class (it derives from Expo's
Modulewhich depends on Android framework classes).Suggested approach
NodeJSIPCtocontrol.sock.The harder bit is step 1 — needs the FGS to actually be running with a known initial state. Easier path: extract the disconnect-reason mapping into a pure
disconnectReason(JsState): JsStatefunction and test that directly. Pair with an integration test that just spot-checks one path end-to-end.Acceptance criteria
Item 4: Android
sendErrorNativeFrame(success + 2s timeout)NodeJSService.sendErrorNativeFrame(phase, message)(android/.../NodeJSService.kt:545-585):serviceScope.launch { try { val ipc = withTimeoutOrNull(SEND_ERROR_NATIVE_TIMEOUT_MS) { ipcDeferred.await() } if (ipc == null) { Log.w(TAG, \"Dropping error-native frame: ...\") return@launch } ipc.sendMessage(payload) } catch (e: Exception) { ... } }Two paths to verify:
ipcDeferredcompletes with a mock IPC,sendMessageis called with the JSON-serialised{type:\"error-native\", phase, message}payload.ipcDeferrednever completes; after 2s the frame is dropped and a warning is logged. No exception thrown.Constraints
Same JNI loading constraint. Plus
serviceScopeandipcDeferredare private.Suggested approach
Two options:
suspend fun sendErrorNativeFrame(ipcDeferred: Deferred<NodeJSIPC>, phase: String, message: String, json: Json, timeoutMs: Long): Boolean. JVM-testable with a mockNodeJSIPCinterface. The class method becomes a one-liner.NodeJSServicethrough both paths.Option 1 is cleaner — the function is logically a free helper; nothing it does requires class state.
Acceptance criteria
runTest's virtual time (advanceTimeBy(2_001)).Item 5: Android
runNode()catch branchThe catch branch in
runNode()(android/.../NodeJSService.kt:442-462) sets bothbackendState = BackendState.Error(...)andnodeRuntime = NodeRuntimeState.Exited(_, REQUESTED)to keep the component triple consistent after the thread unwinds.Constraints
Hard to reach in a test. Requires
startNodeWithArguments(a JNI external) to throw, which is impossible in a JVM unit test without an alternate stub-loaded library.Priority
Low. Visible state still derives correctly via the backend-error rule (rule 1) even if
nodeRuntimewere left stale; the consistency fix from PR #48 is defensive against future code that might re-derive after another mutation. Concretely: this branch is only ever entered ifstartNodeWithArgumentsitself throws, which in production means the JNI bridge died — at that point most of the rest of the test infra is also broken.Suggested approach (if pursued)
Acceptance criteria
runNode's catch branch to fire and verifies the component triple ends consistent.Suggested sub-issue layout
If splitting:
Each sub-issue should reference back here and link to the relevant ARCHITECTURE.md sections.
Pointers for whoever picks this up
error-nativetimeout (Item 4) and other timeout interactions.MockBackend(ios/Tests/Helpers/MockBackend.swift) is a useful pattern for what a backend-side test fixture looks like — wire format, framing, handshake sequencing.LifecycleStateDerivation.ktis the model for "extract the testable bit out of the JNI-loading class" if you go that route on the Android side.