callstack · thymikee · Jun 11, 2026 · Jun 11, 2026
diff --git a/docs/adr/0005-ios-runner-interaction-lifecycle.md b/docs/adr/0005-ios-runner-interaction-lifecycle.md
@@ -30,10 +30,24 @@ uses selectors or text queries to find the semantic `XCUIElement`, but when the
 activation taps the resolved center point instead of calling `XCUIElement.tap()`. tvOS remains
 focus/remote-driven because tvOS does not support normal coordinate input.
 
-Ready runner sessions are probed with a short `uptime` preflight before command send. The daemon
-does not keep or consult a "recent success" health cache. Read-only startup commands still skip that
-preflight because the first successful command is the readiness proof for a newly launched runner.
-Readiness probe commands skip preflight to avoid recursion.
+Ready runner sessions are probed with a short `uptime` preflight before command send. Read-only
+startup commands still skip that preflight because the first successful command is the readiness
+proof for a newly launched runner. Readiness probe commands skip preflight to avoid recursion.
+
+The daemon may additionally skip the ready-session `uptime` preflight for an explicit allowlist of
+mutating interactions (`tap`, `tapSeries`, `longPress`, `drag`, `dragSeries`, `swipe`) when the same
+session produced a healthy mutating response — parsed ok and not carrying `runnerFatal` — for the
+same `appBundleId` within 5 seconds. This recency lives only on the `RunnerSession` object as
+`lastHealthyMutation`, so it dies with every invalidation/restart, and it is recorded only after the
+`runnerFatal` check, so sparse AX-fallback snapshots and `runnerFatal` payloads never refresh it.
+Snapshots and other read-only responses never count as a health signal. This narrow skip is
+permitted now because the future-work precondition below is met: coordinate-first activation removed
+the command-induced teardown trigger, and the lifecycle status journal together with
+status-before-invalidate recovery forms the teardown-surviving status surface that resolves any
+ambiguous post-send failure before invalidation. A transport failure after a skip clears the
+recency record and is marked with the skip context; connection-shaped failures (refused, reset,
+hung up) run status recovery instead of a blind replay, while timeout-shaped failures propagate
+with the skip context (the same classification that preflighted sends use).
 
 `uptime` is a direct runner listener probe. It is answered before command journaling, the serial
 command execution queue, app activation, and main-thread XCTest dispatch. It should measure only
@@ -63,9 +77,13 @@ If xcodebuild still exits for another reason, the next command detects the stale
 process/liveness checks and avoids the old 15-second graceful-shutdown wait. The remaining latency is
 fresh xcodebuild runner startup, not a stale transport stall.
 
-The daemon no longer models recent success as a runner-health signal. That adds one cheap `uptime`
-request before ready-session commands, but it removes a false health signal that was observed to be
-unsafe.
+The daemon no longer models a generic "recent success" cache as a runner-health signal. A proven
+healthy mutating response for the same app — recorded only after the `runnerFatal` check and only
+for allowlisted interactions — is now a real end-to-end liveness proof (HTTP listener through to the
+app target), so a hot loop of allowlisted interactions skips the per-command `uptime` request while
+still re-earning each skip from another healthy mutation. The earlier unconditional `uptime` before
+every ready-session command remains the default for non-allowlisted commands and whenever the
+recency record is absent, stale, tied to a different app bundle, or cleared by an invalidation.
 
 Apps with broken accessibility trees may still be impossible for XCTest to inspect deeply, but one
 failed snapshot no longer teaches the runner to keep using a suspect cached app target or to amplify
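
Reviewer note: the skip rule in the ADR text above can be read as a pure decision function plus a guarded recorder. The sketch below is a minimal illustration under stated assumptions, not the daemon's actual code. Only `lastHealthyMutation`, `appBundleId`, `runnerFatal`, the allowlist entries, the 5-second window, and the `recent_healthy_mutation` reason come from this PR; the function names, the record shape, the non-skip reason strings, and the inclusive window boundary are hypothetical. It covers ready sessions only and leaves out the existing startup and readiness-probe skips.

```typescript
// Hypothetical sketch of the ready-session skip decision described in ADR 0005.
const PREFLIGHT_SKIP_ALLOWLIST: ReadonlySet<string> = new Set<string>([
  'tap', 'tapSeries', 'longPress', 'drag', 'dragSeries', 'swipe',
]);
const HEALTHY_MUTATION_WINDOW_MS = 5_000;

// Assumed record shape; the PR only states that the record lives on the session object.
interface HealthyMutationRecord {
  appBundleId: string;
  recordedAtMs: number;
}

interface RunnerSessionSketch {
  ready: boolean;
  lastHealthyMutation?: HealthyMutationRecord;
}

type PreflightDecision =
  | { skip: true; reason: 'recent_healthy_mutation'; ageMs: number }
  | { skip: false; reason: string };

function decideUptimePreflight(
  session: RunnerSessionSketch,
  command: string,
  appBundleId: string | undefined,
  nowMs: number,
): PreflightDecision {
  // Every uncertain path keeps the preflight; only the fully proven path skips it.
  if (!session.ready) return { skip: false, reason: 'session_not_ready' };
  if (!PREFLIGHT_SKIP_ALLOWLIST.has(command)) return { skip: false, reason: 'not_allowlisted' };
  const record = session.lastHealthyMutation;
  if (!record) return { skip: false, reason: 'no_record' };
  if (!appBundleId || record.appBundleId !== appBundleId) {
    return { skip: false, reason: 'app_bundle_changed' };
  }
  const ageMs = nowMs - record.recordedAtMs;
  // Assumption: the window is inclusive and a negative age (clock jump) counts as stale.
  if (ageMs < 0 || ageMs > HEALTHY_MUTATION_WINDOW_MS) {
    return { skip: false, reason: 'stale_record' };
  }
  return { skip: true, reason: 'recent_healthy_mutation', ageMs };
}

function recordHealthyMutation(
  session: RunnerSessionSketch,
  command: string,
  appBundleId: string | undefined,
  response: { ok: boolean; runnerFatal?: unknown },
  nowMs: number,
): void {
  // Runs only after the runnerFatal check, so fatal payloads never refresh the record.
  if (!response.ok || response.runnerFatal !== undefined) return;
  // Snapshots and other read-only commands are not on the allowlist and never count.
  if (!PREFLIGHT_SKIP_ALLOWLIST.has(command) || !appBundleId) return;
  session.lastHealthyMutation = { appBundleId, recordedAtMs: nowMs };
}
```

Keeping the decision free of I/O means each non-skip path (no record, stale record, app change, non-allowlisted command) can be unit-tested without a runner.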

diff --git a/docs/ios-runner-protocol-optimizations.md b/docs/ios-runner-protocol-optimizations.md
@@ -41,21 +41,34 @@ iOS simulator validation:
 
 ### 2. Adaptive `uptime` preflight policy
 
-Status: superseded by ADR 0005 for ready-session command execution.
-
-Goal: reduce unnecessary readiness probes only when another health signal proves the runner is still
-serving new requests. A recent successful command response is not sufficient proof: React Navigation
-dogfood showed XCTest can return a successful tap response and then immediately fail the test runner
-while re-resolving a navigation-disappeared element.
-
-Acceptance criteria:
-
-- Existing first-command/startup readiness behavior is preserved.
-- Existing failed-preflight stale-session recovery is preserved.
-- Repeated hot interactions do not skip `uptime` based on cached recent-success state.
-- Commands that still need conservative readiness checks remain preflighted until measured.
-- A transport failure after skipping preflight runs status recovery before invalidation.
-- Diagnostics expose whether a command used, skipped, or recovered from a readiness preflight.
+Status: implemented with guardrails (see ADR 0005). The earlier blanket "recent success" cache was
+shipped and then reverted in #702 because XCTest could return a successful tap response and then fail
+the runner while re-resolving a navigation-disappeared element, and because sparse AX-fallback
+snapshots were cached as healthy state. #702's coordinate-first activation removed that teardown
+trigger, so the skip is reintroduced as a structurally narrower "healthy mutation recency" signal.
+
+Goal: skip the per-command `uptime` for hot allowlisted interaction loops only when a proven healthy
+mutating response makes the runner's liveness already known, while every uncertain path keeps
+preflighting.
+
+Acceptance criteria (as shipped):
+
+- First-command/startup, no-record, stale-record, app-activation-uncertain, and non-allowlisted
+  (conservative) commands still preflight; readiness probes and read-only startup commands keep
+  their existing skips.
+- Recency is derived only from healthy (parsed ok, non-`runnerFatal`) responses of an explicit
+  mutating allowlist (`tap`, `tapSeries`, `longPress`, `drag`, `dragSeries`, `swipe`) for the same
+  `appBundleId`, within a 5s freshness window, and lives only on the session object so it dies with
+  every invalidation/restart. Snapshots and read-only responses never refresh it.
+- A transport failure after a skipped preflight clears the recency record and marks the error with
+  the skip context (`runnerReadinessPreflightSkipped`, distinct from the restart predicate's
+  `runnerReadinessPreflightFailed`). Connection-shaped failures run status recovery before
+  invalidation — never a replay; timeout-shaped failures propagate with the skip context, matching
+  the existing classification for preflighted sends.
+- Diagnostics expose whether a command used, skipped, or recovered from a readiness preflight,
+  including command type, skip reason, and recency age.
+- Measured threshold: 1 runner request per hot allowlisted command after the first, with no increase
+  in invalidation or failure rate.
 
 iOS simulator validation:
 
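
Reviewer note: the third acceptance criterion above (clear the record, mark the error, then branch on failure shape) could look roughly like the sketch below. It is an illustration under assumptions, not the shipped classifier. The `runnerReadinessPreflightSkipped*` detail keys and the `recent_healthy_mutation` reason appear in this PR's tests; the function name, the types, and the message patterns used to detect connection-shaped failures are hypothetical. Treating every failure that is not connection-shaped as "propagate" is also an assumption.

```typescript
// Hypothetical sketch of post-skip transport failure handling.
interface SkipContextSketch {
  reason: 'recent_healthy_mutation';
  ageMs: number;
}

interface SessionWithRecency {
  lastHealthyMutation?: { appBundleId: string; recordedAtMs: number };
}

interface SkippedFailurePlan {
  action: 'status_recovery' | 'propagate';
  details: {
    runnerReadinessPreflightSkipped: true;
    runnerReadinessPreflightSkipReason: string;
    runnerReadinessPreflightSkippedAgeMs: number;
  };
}

// Illustrative patterns for "refused, reset, hung up"; the real classifier may differ.
const CONNECTION_SHAPED = /econnrefused|econnreset|socket hang up/i;

function planSkippedPreflightFailure(
  session: SessionWithRecency,
  errorMessage: string,
  skip: SkipContextSketch,
): SkippedFailurePlan {
  // A failure after a skip disproves the liveness assumption, so drop the record first.
  delete session.lastHealthyMutation;
  const details = {
    runnerReadinessPreflightSkipped: true as const,
    runnerReadinessPreflightSkipReason: skip.reason,
    runnerReadinessPreflightSkippedAgeMs: skip.ageMs,
  };
  // Connection-shaped failures ask the status journal what happened; nothing is replayed.
  const action = CONNECTION_SHAPED.test(errorMessage) ? 'status_recovery' : 'propagate';
  return { action, details };
}
```

Because the record is cleared before classification, the next command on the same session falls back to the `uptime` preflight whichever branch is taken.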

diff --git a/src/platforms/ios/__tests__/runner-command-retry.test.ts b/src/platforms/ios/__tests__/runner-command-retry.test.ts
@@ -628,6 +628,94 @@ test('mutating commands report recovery guidance when completed status has no re
   });
 });
 
+test('mutating commands run status recovery after transport failure when readiness preflight was skipped', async () => {
+  const session = makeRunnerSession({ port: 8100, ready: true });
+
+  mockEnsureRunnerSession.mockResolvedValueOnce(session);
+  mockExecuteRunnerCommandWithSession
+    .mockRejectedValueOnce(
+      new AppError('COMMAND_FAILED', 'fetch failed', {
+        runnerReadinessPreflightSkipped: true,
+        runnerReadinessPreflightSkipReason: 'recent_healthy_mutation',
+        runnerReadinessPreflightSkippedAgeMs: 1_200,
+      }),
+    )
+    .mockResolvedValueOnce({
+      lifecycleState: 'completed',
+      lifecycleResponseJson: JSON.stringify({ ok: true, data: { message: 'tapped' } }),
+    });
+
+  const result = await runIosRunnerCommand(IOS_SIMULATOR, { command: 'tap', x: 120, y: 240 });
+
+  assert.deepEqual(result, { message: 'tapped' });
+  assert.equal(mockInvalidateRunnerSession.mock.calls.length, 0);
+  assert.equal(mockExecuteRunnerCommandWithSession.mock.calls.length, 2);
+  const recoveryDiagnostic = mockEmitDiagnostic.mock.calls.find(
+    ([event]) => event.phase === 'ios_runner_command_status_recovery',
+  )?.[0];
+  assert.ok(recoveryDiagnostic);
+  assert.equal(recoveryDiagnostic.data?.readinessPreflightSkipped, true);
+  assert.equal(recoveryDiagnostic.data?.readinessPreflightSkipReason, 'recent_healthy_mutation');
+  assert.equal(recoveryDiagnostic.data?.readinessPreflightSkippedAgeMs, 1_200);
+});
+
+test('mutating commands include skipped readiness context in lost-response guidance', async () => {
+  const session = makeRunnerSession({ port: 8100, ready: true });
+
+  mockEnsureRunnerSession.mockResolvedValueOnce(session);
+  mockExecuteRunnerCommandWithSession
+    .mockRejectedValueOnce(
+      new AppError('COMMAND_FAILED', 'fetch failed', {
+        runnerReadinessPreflightSkipped: true,
+        runnerReadinessPreflightSkipReason: 'recent_healthy_mutation',
+        runnerReadinessPreflightSkippedAgeMs: 1_200,
+      }),
+    )
+    .mockResolvedValueOnce({ lifecycleState: 'completed' });
+
+  await assert.rejects(
+    () => runIosRunnerCommand(IOS_SIMULATOR, { command: 'tap', x: 120, y: 240 }),
+    (error: unknown) => {
+      assert.ok(error instanceof AppError);
+      assert.match(String(error.details?.hint), /^This hot command skipped the uptime preflight/);
+      assert.equal(error.details?.readinessPreflightSkipped, true);
+      assert.equal(error.details?.readinessPreflightSkipReason, 'recent_healthy_mutation');
+      assert.equal(error.details?.readinessPreflightSkippedAgeMs, 1_200);
+      return true;
+    },
+  );
+
+  assert.equal(mockInvalidateRunnerSession.mock.calls.length, 0);
+});
+
+test('mutating commands keep conservative invalidation for skipped-preflight failures with unknown lifecycle', async () => {
+  const session = makeRunnerSession({ port: 8100, ready: true });
+
+  mockEnsureRunnerSession.mockResolvedValueOnce(session);
+  mockExecuteRunnerCommandWithSession
+    .mockRejectedValueOnce(
+      new AppError('COMMAND_FAILED', 'fetch failed', {
+        runnerReadinessPreflightSkipped: true,
+        runnerReadinessPreflightSkipReason: 'recent_healthy_mutation',
+        runnerReadinessPreflightSkippedAgeMs: 1_200,
+      }),
+    )
+    .mockResolvedValueOnce({ lifecycleState: 'paused' });
+
+  await assert.rejects(() =>
+    runIosRunnerCommand(IOS_SIMULATOR, { command: 'tap', x: 120, y: 240 }),
+  );
+
+  assert.deepEqual(mockInvalidateRunnerSession.mock.calls, [
+    [session, 'transport_error_after_command_send'],
+  ]);
+  assertDiagnosticDecision({
+    decision: 'retained',
+    reason: 'unknown_lifecycle_state',
+    lifecycleState: 'paused',
+  });
+});
+
 test('mutating commands preserve runner failure details from status recovery', async () => {
   const session = makeRunnerSession({ port: 8100, ready: true });