diff --git a/docs/adr/0005-ios-runner-interaction-lifecycle.md b/docs/adr/0005-ios-runner-interaction-lifecycle.md index 59e4f27a3..fb940e26a 100644 --- a/docs/adr/0005-ios-runner-interaction-lifecycle.md +++ b/docs/adr/0005-ios-runner-interaction-lifecycle.md @@ -30,10 +30,24 @@ uses selectors or text queries to find the semantic `XCUIElement`, but when the activation taps the resolved center point instead of calling `XCUIElement.tap()`. tvOS remains focus/remote-driven because tvOS does not support normal coordinate input. -Ready runner sessions are probed with a short `uptime` preflight before command send. The daemon -does not keep or consult a "recent success" health cache. Read-only startup commands still skip that -preflight because the first successful command is the readiness proof for a newly launched runner. -Readiness probe commands skip preflight to avoid recursion. +Ready runner sessions are probed with a short `uptime` preflight before command send. Read-only +startup commands still skip that preflight because the first successful command is the readiness +proof for a newly launched runner. Readiness probe commands skip preflight to avoid recursion. + +The daemon may additionally skip the ready-session `uptime` preflight for an explicit allowlist of +mutating interactions (`tap`, `tapSeries`, `longPress`, `drag`, `dragSeries`, `swipe`) when the same +session produced a healthy mutating response — parsed ok and not carrying `runnerFatal` — for the +same `appBundleId` within 5 seconds. This recency lives only on the `RunnerSession` object as +`lastHealthyMutation`, so it dies with every invalidation/restart, and it is recorded only after the +`runnerFatal` check, so sparse AX-fallback snapshots and `runnerFatal` payloads never refresh it. +Snapshots and other read-only responses never count as a health signal. 
This narrow skip is +permitted now because the future-work precondition below is met: coordinate-first activation removed +the command-induced teardown trigger, and the lifecycle status journal plus the status-before- +invalidate recovery is the teardown-surviving status surface that resolves any ambiguous post-send +failure before invalidation. A transport failure after a skip clears the recency record and is marked +with the skip context; connection-shaped failures (refused, reset, hung up) run status recovery +instead of a blind replay, while timeout-shaped failures propagate with the skip context (the same +classification preflighted sends use). `uptime` is a direct runner listener probe. It is answered before command journaling, the serial command execution queue, app activation, and main-thread XCTest dispatch. It should measure only @@ -63,9 +77,13 @@ If xcodebuild still exits for another reason, the next command detects the stale process/liveness checks and avoids the old 15-second graceful-shutdown wait. The remaining latency is fresh xcodebuild runner startup, not a stale transport stall. -The daemon no longer models recent success as a runner-health signal. That adds one cheap `uptime` -request before ready-session commands, but it removes a false health signal that was observed to be -unsafe. +The daemon no longer models a generic "recent success" cache as a runner-health signal. A proven +healthy mutating response for the same app — recorded only after the `runnerFatal` check and only +for allowlisted interactions — is now a real end-to-end liveness proof (HTTP listener through to the +app target), so a hot loop of allowlisted interactions skips the per-command `uptime` request while +still re-earning each skip from another healthy mutation. The earlier unconditional `uptime` before +every ready-session command remains the default for non-allowlisted commands and after any +invalidation, stale record, app-bundle change, or absent record. 
Apps with broken accessibility trees may still be impossible for XCTest to inspect deeply, but one failed snapshot no longer teaches the runner to keep using a suspect cached app target or to amplify diff --git a/docs/ios-runner-protocol-optimizations.md b/docs/ios-runner-protocol-optimizations.md index 6c07f6e75..164955546 100644 --- a/docs/ios-runner-protocol-optimizations.md +++ b/docs/ios-runner-protocol-optimizations.md @@ -41,21 +41,34 @@ iOS simulator validation: ### 2. Adaptive `uptime` preflight policy -Status: superseded by ADR 0005 for ready-session command execution. - -Goal: reduce unnecessary readiness probes only when another health signal proves the runner is still -serving new requests. A recent successful command response is not sufficient proof: React Navigation -dogfood showed XCTest can return a successful tap response and then immediately fail the test runner -while re-resolving a navigation-disappeared element. - -Acceptance criteria: - -- Existing first-command/startup readiness behavior is preserved. -- Existing failed-preflight stale-session recovery is preserved. -- Repeated hot interactions do not skip `uptime` based on cached recent-success state. -- Commands that still need conservative readiness checks remain preflighted until measured. -- A transport failure after skipping preflight runs status recovery before invalidation. -- Diagnostics expose whether a command used, skipped, or recovered from a readiness preflight. +Status: implemented with guardrails (see ADR 0005). The earlier blanket "recent success" cache was +shipped and then reverted in #702 because XCTest could return a successful tap response and then fail +the runner while re-resolving a navigation-disappeared element, and because sparse AX-fallback +snapshots were cached as healthy state. #702's coordinate-first activation removed that teardown +trigger, so the skip is reintroduced as a structurally narrower "healthy mutation recency" signal. 
+ +Goal: skip the per-command `uptime` for hot allowlisted interaction loops only when a proven healthy +mutating response makes the runner's liveness already known, while every uncertain path keeps +preflighting. + +Acceptance criteria (as shipped): + +- First-command/startup, no-record, stale-record, app-activation-uncertain, and non-allowlisted + (conservative) commands still preflight; readiness probes and read-only startup commands keep + their existing skips. +- Recency is derived only from healthy (parsed ok, non-`runnerFatal`) responses of an explicit + mutating allowlist (`tap`, `tapSeries`, `longPress`, `drag`, `dragSeries`, `swipe`) for the same + `appBundleId`, within a 5s freshness window, and lives only on the session object so it dies with + every invalidation/restart. Snapshots and read-only responses never refresh it. +- A transport failure after a skipped preflight clears the recency record and marks the error with + the skip context (`runnerReadinessPreflightSkipped`, distinct from the restart predicate's + `runnerReadinessPreflightFailed`). Connection-shaped failures run status recovery before + invalidation — never a replay; timeout-shaped failures propagate with the skip context, matching + the existing classification for preflighted sends. +- Diagnostics expose whether a command used, skipped, or recovered from a readiness preflight, + including command type, skip reason, and recency age. +- Measured threshold: 1 runner request per hot allowlisted command after the first, with no increase + in invalidation or failure rate. 
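The acceptance criteria above can be read as a single decision function. The following is a minimal sketch of that reading, not the code in `runner-session.ts`. The name `decidePreflight`, the order of the checks, the treatment of a missing `appBundleId` as uncertain, and the handling of an age of exactly 5000 ms are all assumptions made for illustration. The existing skips for read-only startup commands and readiness probes are left out.

```typescript
// Hypothetical sketch of the ready-session preflight decision described above.
// Names, check order, and boundary handling are assumptions, not the shipped code.
type HealthyMutation = { atMs: number; appBundleId?: string };

type PreflightDecision =
  | {
      action: 'run';
      reason:
        | 'startup'
        | 'conservative_command'
        | 'no_recent_healthy_mutation'
        | 'app_activation_uncertain'
        | 'healthy_mutation_stale';
      lastHealthyMutationAgeMs?: number;
    }
  | { action: 'skip'; reason: 'recent_healthy_mutation'; lastHealthyMutationAgeMs: number };

const FRESHNESS_MS = 5_000;
const SKIP_ELIGIBLE = new Set<string>([
  'tap',
  'tapSeries',
  'longPress',
  'drag',
  'dragSeries',
  'swipe',
]);

function decidePreflight(
  session: { ready: boolean; lastHealthyMutation?: HealthyMutation },
  command: { command: string; appBundleId?: string },
  nowMs: number,
): PreflightDecision {
  // A session that is not ready yet always proves readiness first.
  if (!session.ready) return { action: 'run', reason: 'startup' };
  // Anything outside the explicit allowlist stays conservative.
  if (!SKIP_ELIGIBLE.has(command.command)) {
    return { action: 'run', reason: 'conservative_command' };
  }
  const record = session.lastHealthyMutation;
  if (!record) return { action: 'run', reason: 'no_recent_healthy_mutation' };
  const ageMs = nowMs - record.atMs;
  // Assumption: a missing or differing bundle id means the target app is uncertain.
  if (record.appBundleId === undefined || record.appBundleId !== command.appBundleId) {
    return { action: 'run', reason: 'app_activation_uncertain', lastHealthyMutationAgeMs: ageMs };
  }
  // Assumption: a negative age (clock moved backwards) is treated as stale.
  if (ageMs < 0 || ageMs > FRESHNESS_MS) {
    return { action: 'run', reason: 'healthy_mutation_stale', lastHealthyMutationAgeMs: ageMs };
  }
  return { action: 'skip', reason: 'recent_healthy_mutation', lastHealthyMutationAgeMs: ageMs };
}
```

Every uncertain branch returns `run`, so the only path to `skip` needs a ready session, an allowlisted command, a record, a matching bundle, and a fresh timestamp all at once. Passing `nowMs` in rather than calling `Date.now()` inside is only a convenience of the sketch so that the freshness window can be exercised without fake timers.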
iOS simulator validation: diff --git a/src/platforms/ios/__tests__/runner-command-retry.test.ts b/src/platforms/ios/__tests__/runner-command-retry.test.ts index f51904bb8..52c180ed0 100644 --- a/src/platforms/ios/__tests__/runner-command-retry.test.ts +++ b/src/platforms/ios/__tests__/runner-command-retry.test.ts @@ -628,6 +628,94 @@ test('mutating commands report recovery guidance when completed status has no re }); }); +test('mutating commands run status recovery after transport failure when readiness preflight was skipped', async () => { + const session = makeRunnerSession({ port: 8100, ready: true }); + + mockEnsureRunnerSession.mockResolvedValueOnce(session); + mockExecuteRunnerCommandWithSession + .mockRejectedValueOnce( + new AppError('COMMAND_FAILED', 'fetch failed', { + runnerReadinessPreflightSkipped: true, + runnerReadinessPreflightSkipReason: 'recent_healthy_mutation', + runnerReadinessPreflightSkippedAgeMs: 1_200, + }), + ) + .mockResolvedValueOnce({ + lifecycleState: 'completed', + lifecycleResponseJson: JSON.stringify({ ok: true, data: { message: 'tapped' } }), + }); + + const result = await runIosRunnerCommand(IOS_SIMULATOR, { command: 'tap', x: 120, y: 240 }); + + assert.deepEqual(result, { message: 'tapped' }); + assert.equal(mockInvalidateRunnerSession.mock.calls.length, 0); + assert.equal(mockExecuteRunnerCommandWithSession.mock.calls.length, 2); + const recoveryDiagnostic = mockEmitDiagnostic.mock.calls.find( + ([event]) => event.phase === 'ios_runner_command_status_recovery', + )?.[0]; + assert.ok(recoveryDiagnostic); + assert.equal(recoveryDiagnostic.data?.readinessPreflightSkipped, true); + assert.equal(recoveryDiagnostic.data?.readinessPreflightSkipReason, 'recent_healthy_mutation'); + assert.equal(recoveryDiagnostic.data?.readinessPreflightSkippedAgeMs, 1_200); +}); + +test('mutating commands include skipped readiness context in lost-response guidance', async () => { + const session = makeRunnerSession({ port: 8100, ready: true }); + + 
mockEnsureRunnerSession.mockResolvedValueOnce(session); + mockExecuteRunnerCommandWithSession + .mockRejectedValueOnce( + new AppError('COMMAND_FAILED', 'fetch failed', { + runnerReadinessPreflightSkipped: true, + runnerReadinessPreflightSkipReason: 'recent_healthy_mutation', + runnerReadinessPreflightSkippedAgeMs: 1_200, + }), + ) + .mockResolvedValueOnce({ lifecycleState: 'completed' }); + + await assert.rejects( + () => runIosRunnerCommand(IOS_SIMULATOR, { command: 'tap', x: 120, y: 240 }), + (error: unknown) => { + assert.ok(error instanceof AppError); + assert.match(String(error.details?.hint), /^This hot command skipped the uptime preflight/); + assert.equal(error.details?.readinessPreflightSkipped, true); + assert.equal(error.details?.readinessPreflightSkipReason, 'recent_healthy_mutation'); + assert.equal(error.details?.readinessPreflightSkippedAgeMs, 1_200); + return true; + }, + ); + + assert.equal(mockInvalidateRunnerSession.mock.calls.length, 0); +}); + +test('mutating commands keep conservative invalidation for skipped-preflight failures with unknown lifecycle', async () => { + const session = makeRunnerSession({ port: 8100, ready: true }); + + mockEnsureRunnerSession.mockResolvedValueOnce(session); + mockExecuteRunnerCommandWithSession + .mockRejectedValueOnce( + new AppError('COMMAND_FAILED', 'fetch failed', { + runnerReadinessPreflightSkipped: true, + runnerReadinessPreflightSkipReason: 'recent_healthy_mutation', + runnerReadinessPreflightSkippedAgeMs: 1_200, + }), + ) + .mockResolvedValueOnce({ lifecycleState: 'paused' }); + + await assert.rejects(() => + runIosRunnerCommand(IOS_SIMULATOR, { command: 'tap', x: 120, y: 240 }), + ); + + assert.deepEqual(mockInvalidateRunnerSession.mock.calls, [ + [session, 'transport_error_after_command_send'], + ]); + assertDiagnosticDecision({ + decision: 'retained', + reason: 'unknown_lifecycle_state', + lifecycleState: 'paused', + }); +}); + test('mutating commands preserve runner failure details from status 
recovery', async () => { const session = makeRunnerSession({ port: 8100, ready: true }); diff --git a/src/platforms/ios/__tests__/runner-session.test.ts b/src/platforms/ios/__tests__/runner-session.test.ts index 03c579bdc..3e2db7fe0 100644 --- a/src/platforms/ios/__tests__/runner-session.test.ts +++ b/src/platforms/ios/__tests__/runner-session.test.ts @@ -332,7 +332,7 @@ test('runner session emits explicit diagnostics when ready sessions are probed', }); assert.match(diagnostics, /ios_runner_readiness_preflight/); - assert.match(diagnostics, /"reason":"ready_session"/); + assert.match(diagnostics, /"reason":"no_recent_healthy_mutation"/); assert.doesNotMatch(diagnostics, /ios_runner_readiness_preflight_skipped/); }); @@ -861,6 +861,338 @@ test('runner session validates supported Apple runner devices', () => { ); }); +const ALLOWLISTED_MUTATIONS: { name: string; command: Record }[] = [ + { name: 'tap', command: { command: 'tap', x: 120, y: 240 } }, + { + name: 'selector tap', + command: { command: 'tap', selectorKey: 'label', selectorValue: 'Open article' }, + }, + { name: 'tapSeries', command: { command: 'tapSeries', x: 1, y: 2, count: 2, intervalMs: 80 } }, + { name: 'longPress', command: { command: 'longPress', x: 1, y: 2 } }, + { name: 'drag', command: { command: 'drag', x: 1, y: 2, x2: 3, y2: 4 } }, + { + name: 'dragSeries', + command: { command: 'dragSeries', x: 1, y: 2, x2: 3, y2: 4, count: 2 }, + }, + { name: 'swipe', command: { command: 'swipe', x: 1, y: 2, x2: 3, y2: 4 } }, +]; + +for (const { name, command } of ALLOWLISTED_MUTATIONS) { + test(`runner session skips readiness preflight for ${name} after a fresh same-bundle healthy mutation`, async () => { + vi.useFakeTimers(); + vi.setSystemTime(new Date('2026-06-11T00:00:00Z')); + try { + const session = makeRunnerSession({ + ready: true, + lastHealthyMutation: { atMs: Date.now() - 1_500, appBundleId: 'com.example.demo' }, + }); + mockSendRunnerCommandOnce.mockResolvedValueOnce(runnerResponse({ acted: true 
})); + + const diagnostics = await captureDiagnostics(async () => { + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { ...command, appBundleId: 'com.example.demo' } as Parameters< + typeof executeRunnerCommandWithSession + >[2], + '/tmp/runner.log', + 30_000, + ); + }); + + assert.equal(mockWaitForRunner.mock.calls.length, 0); + assert.equal(mockSendRunnerCommandOnce.mock.calls.length, 1); + assert.match(diagnostics, /ios_runner_readiness_preflight_skipped/); + assert.match(diagnostics, /"reason":"recent_healthy_mutation"/); + assert.match(diagnostics, /"lastHealthyMutationAgeMs":1500/); + } finally { + vi.useRealTimers(); + } + }); +} + +test('runner session records recency only from allowlisted healthy mutations', async () => { + const session = makeRunnerSession({ ready: true }); + mockWaitForRunner.mockResolvedValueOnce(runnerResponse({ uptimeMs: 42 })); + mockSendRunnerCommandOnce.mockResolvedValueOnce(runnerResponse({ tapped: true })); + + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 120, y: 240, appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ); + + assert.equal(session.lastHealthyMutation?.appBundleId, 'com.example.demo'); + assert.equal(typeof session.lastHealthyMutation?.atMs, 'number'); + + // Second allowlisted command now skips preflight. 
+ mockWaitForRunner.mockClear(); + mockSendRunnerCommandOnce.mockResolvedValueOnce(runnerResponse({ tapped: true })); + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ); + assert.equal(mockWaitForRunner.mock.calls.length, 0); +}); + +test('runner session does not record recency from successful read-only responses', async () => { + const session = makeRunnerSession({ ready: true }); + mockWaitForRunner + .mockResolvedValueOnce(runnerResponse({ uptimeMs: 42 })) + .mockResolvedValueOnce(runnerResponse({ nodes: [], truncated: false })); + + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'snapshot', appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ); + + assert.equal(session.lastHealthyMutation, undefined); + + // The next tap must still preflight because no healthy mutation was recorded. + mockWaitForRunner.mockClear(); + mockWaitForRunner.mockResolvedValueOnce(runnerResponse({ uptimeMs: 42 })); + mockSendRunnerCommandOnce.mockResolvedValueOnce(runnerResponse({ tapped: true })); + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ); + assert.equal(mockWaitForRunner.mock.calls.length, 1); + assertRunnerCommand(mockWaitForRunner.mock.calls[0]?.[2], { command: 'uptime' }); +}); + +test('runner session does not record recency from runnerFatal ok payloads', async () => { + const session = makeRunnerSession({ ready: true }); + mockWaitForRunner.mockResolvedValueOnce(runnerResponse({ uptimeMs: 42 })); + mockSendRunnerCommandOnce.mockResolvedValueOnce( + runnerResponse({ + acted: false, + runnerFatal: true, + runnerFatalReason: 'ax_snapshot_unavailable', + }), + ); + + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 
'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ); + + assert.equal(session.lastHealthyMutation, undefined); +}); + +test('runner session preflights with conservative_command for non-allowlisted mutations', async () => { + const session = makeRunnerSession({ + ready: true, + lastHealthyMutation: { atMs: Date.now(), appBundleId: 'com.example.demo' }, + }); + mockWaitForRunner.mockResolvedValueOnce(runnerResponse({ uptimeMs: 42 })); + mockSendRunnerCommandOnce.mockResolvedValueOnce(runnerResponse({ typed: true })); + + const diagnostics = await captureDiagnostics(async () => { + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'type', text: 'hi', appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ); + }); + + assert.equal(mockWaitForRunner.mock.calls.length, 1); + assert.match(diagnostics, /"reason":"conservative_command"/); +}); + +test('runner session preflights with no_recent_healthy_mutation when ready without a record', async () => { + const session = makeRunnerSession({ ready: true }); + mockWaitForRunner.mockResolvedValueOnce(runnerResponse({ uptimeMs: 42 })); + mockSendRunnerCommandOnce.mockResolvedValueOnce(runnerResponse({ tapped: true })); + + const diagnostics = await captureDiagnostics(async () => { + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ); + }); + + assert.equal(mockWaitForRunner.mock.calls.length, 1); + assert.match(diagnostics, /"reason":"no_recent_healthy_mutation"/); +}); + +test('runner session preflights with healthy_mutation_stale when the record is older than 5s', async () => { + const session = makeRunnerSession({ + ready: true, + lastHealthyMutation: { atMs: Date.now() - 6_000, appBundleId: 'com.example.demo' }, + }); + mockWaitForRunner.mockResolvedValueOnce(runnerResponse({ uptimeMs: 42 })); + mockSendRunnerCommandOnce.mockResolvedValueOnce(runnerResponse({ 
tapped: true })); + + const diagnostics = await captureDiagnostics(async () => { + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ); + }); + + assert.equal(mockWaitForRunner.mock.calls.length, 1); + assert.match(diagnostics, /"reason":"healthy_mutation_stale"/); +}); + +test('runner session preflights with app_activation_uncertain on a differing bundle', async () => { + const session = makeRunnerSession({ + ready: true, + lastHealthyMutation: { atMs: Date.now(), appBundleId: 'com.example.demo' }, + }); + mockWaitForRunner.mockResolvedValueOnce(runnerResponse({ uptimeMs: 42 })); + mockSendRunnerCommandOnce.mockResolvedValueOnce(runnerResponse({ tapped: true })); + + const diagnostics = await captureDiagnostics(async () => { + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 'com.example.other' }, + '/tmp/runner.log', + 30_000, + ); + }); + + assert.equal(mockWaitForRunner.mock.calls.length, 1); + assert.match(diagnostics, /"reason":"app_activation_uncertain"/); +}); + +test('runner session preflights with startup reason for the first command on a fresh session', async () => { + const session = makeRunnerSession({ + ready: false, + lastHealthyMutation: { atMs: Date.now(), appBundleId: 'com.example.demo' }, + }); + mockWaitForRunner.mockResolvedValueOnce(runnerResponse({ uptimeMs: 42 })); + mockSendRunnerCommandOnce.mockResolvedValueOnce(runnerResponse({ tapped: true })); + + const diagnostics = await captureDiagnostics(async () => { + await executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ); + }); + + assert.equal(mockWaitForRunner.mock.calls.length, 1); + assert.match(diagnostics, /"reason":"startup"/); +}); + +test('runner session clears recency and marks the error when a 
skipped-preflight send fails', async () => { + const session = makeRunnerSession({ + ready: true, + lastHealthyMutation: { atMs: Date.now() - 1_000, appBundleId: 'com.example.demo' }, + }); + mockSendRunnerCommandOnce.mockRejectedValueOnce(new Error('fetch failed')); + + await assert.rejects( + () => + executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ), + (error: unknown) => { + assert.ok(error instanceof AppError); + assert.equal(error.details?.runnerReadinessPreflightSkipped, true); + assert.equal(error.details?.runnerReadinessPreflightSkipReason, 'recent_healthy_mutation'); + assert.equal(typeof error.details?.runnerReadinessPreflightSkippedAgeMs, 'number'); + assert.notEqual(error.details?.runnerReadinessPreflightFailed, true); + return true; + }, + ); + + assert.equal(mockWaitForRunner.mock.calls.length, 0); + assert.equal(session.lastHealthyMutation, undefined); +}); + +test('runner session does not mark structured runner failures after a skip as skipped-preflight', async () => { + const session = makeRunnerSession({ + ready: true, + lastHealthyMutation: { atMs: Date.now() - 1_000, appBundleId: 'com.example.demo' }, + }); + mockSendRunnerCommandOnce.mockResolvedValueOnce( + runnerError({ + code: 'COMMAND_FAILED', + message: 'Runner failed after receiving command', + }), + ); + + await assert.rejects( + () => + executeRunnerCommandWithSession( + IOS_SIMULATOR, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ), + (error: unknown) => { + assert.ok(error instanceof AppError); + assert.equal(error.message, 'Runner failed after receiving command'); + assert.notEqual(error.details?.runnerReadinessPreflightSkipped, true); + return true; + }, + ); +}); + +test('runner session clears recency when an allowlisted command returns XCTest recorded failure', async () => { + const device = { 
...IOS_SIMULATOR, id: 'runner-session-skip-xctest-failure-sim' }; + const session = await ensureRunnerSession(device, {}); + session.ready = true; + session.lastHealthyMutation = { atMs: Date.now() - 1_000, appBundleId: 'com.example.demo' }; + mockWaitForRunner.mockClear(); + mockSendRunnerCommandOnce.mockResolvedValueOnce( + runnerError({ + code: 'XCTEST_RECORDED_FAILURE', + message: 'XCTest recorded a failure while executing tap.', + }), + ); + + await assert.rejects( + () => + executeRunnerCommandWithSession( + device, + session, + { command: 'tap', x: 1, y: 2, appBundleId: 'com.example.demo' }, + '/tmp/runner.log', + 30_000, + ), + (error: unknown) => { + assert.ok(error instanceof AppError); + assert.equal(error.code, 'XCTEST_RECORDED_FAILURE'); + return true; + }, + ); + + assert.equal(session.lastHealthyMutation, undefined); + assert.equal(getRunnerSessionSnapshot(device.id), null); +}); + function makeRunnerSession(overrides: Partial = {}): RunnerSession { return { sessionId: `session-${overrides.port ?? 
8100}`, diff --git a/src/platforms/ios/runner-command-recovery.ts b/src/platforms/ios/runner-command-recovery.ts index 486d71b11..e54d93be8 100644 --- a/src/platforms/ios/runner-command-recovery.ts +++ b/src/platforms/ios/runner-command-recovery.ts @@ -23,6 +23,12 @@ type RunnerTransportRecoveryContext = { invalidateSession: (session: RunnerSession, reason: string) => Promise; }; +type RunnerReadinessPreflightRecoveryDetails = { + readinessPreflightSkipped?: boolean; + readinessPreflightSkipReason?: string; + readinessPreflightSkippedAgeMs?: number; +}; + const RUNNER_STATUS_RECOVERY_TIMEOUT_MS = 3_000; export async function handleRunnerTransportErrorAfterCommandSend(params: { @@ -125,6 +131,7 @@ async function tryRecoverRunnerCommandAfterTransportError( signal?: AbortSignal, ): Promise { if (command.command === 'status' || !command.commandId?.trim()) return undefined; + const readinessPreflight = readReadinessPreflightRecoveryDetails(transportError); let status: Record; try { status = await executeRunnerCommandWithSession( @@ -143,6 +150,7 @@ async function tryRecoverRunnerCommandAfterTransportError( command: command.command, commandId: command.commandId, error: error instanceof Error ? 
error.message : String(error), + ...readinessPreflight, }, }); return { type: 'retainInvalidation', reason: 'status_probe_failed' }; @@ -156,6 +164,7 @@ async function tryRecoverRunnerCommandAfterTransportError( command: command.command, commandId: command.commandId, lifecycleState, + ...readinessPreflight, }, }); return handleRunnerCommandStatusRecovery( @@ -240,6 +249,7 @@ function handleCompletedRunnerStatus( lifecycleState: 'completed', }; } + const readinessPreflight = readReadinessPreflightRecoveryDetails(transportError); return { type: 'skipInvalidation', reason: 'completed_without_retained_response', @@ -252,7 +262,8 @@ function handleCompletedRunnerStatus( commandId: command.commandId, lifecycleState: 'completed', recovery: 'completed_without_retained_response', - hint: completedWithoutRetainedResponseHint(command.command), + ...readinessPreflight, + hint: completedWithoutRetainedResponseHint(command.command, readinessPreflight), logPath: options.logPath, transportError: transportError.message, }, @@ -275,6 +286,7 @@ function runnerStatusFailureError( : 'Runner command failed'; const hint = typeof status.lifecycleErrorHint === 'string' ? status.lifecycleErrorHint : undefined; + const readinessPreflight = readReadinessPreflightRecoveryDetails(transportError); return new AppError( toAppErrorCode(errorCode), errorMessage, @@ -283,7 +295,8 @@ function runnerStatusFailureError( commandId: command.commandId, lifecycleState: 'failed', recovery: 'runner_reported_failure', - hint: hint ?? runnerReportedFailureHint(command.command), + ...readinessPreflight, + hint: hint ?? 
runnerReportedFailureHint(command.command, readinessPreflight), logPath: options.logPath, transportError: transportError.message, }, @@ -300,6 +313,7 @@ function runnerStatusInFlightError( if (isReadOnlyRunnerCommand(command.command)) { return transportError; } + const readinessPreflight = readReadinessPreflightRecoveryDetails(transportError); return new AppError( 'COMMAND_FAILED', `Runner command "${command.command}" is still ${lifecycleState} after the transport response was lost.`, @@ -308,7 +322,8 @@ function runnerStatusInFlightError( commandId: command.commandId, lifecycleState, recovery: 'command_still_in_flight', - hint: inFlightAfterLostResponseHint(command.command, lifecycleState), + ...readinessPreflight, + hint: inFlightAfterLostResponseHint(command.command, lifecycleState, readinessPreflight), logPath: options.logPath, transportError: transportError.message, }, @@ -334,16 +349,61 @@ function parseLifecycleResponsePayload(value: string): LifecycleResponsePayload return {}; } -function completedWithoutRetainedResponseHint(command: string): string { - return `The runner is still reachable and reports "${command}" already completed, so agent-device kept the session open and will not replay it. Run snapshot -i to inspect the current UI, then continue from that observed state.`; +function completedWithoutRetainedResponseHint( + command: string, + readinessPreflight: RunnerReadinessPreflightRecoveryDetails, +): string { + return `${lostResponseReadinessContext(readinessPreflight)}The runner is still reachable and reports "${command}" already completed, so agent-device kept the session open and will not replay it. 
Run snapshot -i to inspect the current UI, then continue from that observed state.`; +} + +function runnerReportedFailureHint( + command: string, + readinessPreflight: RunnerReadinessPreflightRecoveryDetails, +): string { + return `${lostResponseReadinessContext(readinessPreflight)}The runner is still reachable and reports "${command}" failed after the transport response was lost, so agent-device kept the session open and did not replay it. Run snapshot -i to inspect the current UI and retry with a selector visible in that snapshot.`; +} + +function inFlightAfterLostResponseHint( + command: string, + lifecycleState: string, + readinessPreflight: RunnerReadinessPreflightRecoveryDetails, +): string { + return `${lostResponseReadinessContext(readinessPreflight)}The runner is still reachable and reports "${command}" is ${lifecycleState}, so agent-device kept the session open and will not replay it. Wait briefly, run snapshot -i to inspect the current UI, then continue from that observed state.`; +} + +function lostResponseReadinessContext( + readinessPreflight: RunnerReadinessPreflightRecoveryDetails, +): string { + if (readinessPreflight.readinessPreflightSkipped !== true) return ''; + return 'This hot command skipped the uptime preflight because the runner had just completed a healthy interaction; status recovery confirmed the runner still observed it. '; +} + +function readBooleanDetail(error: AppError, key: string): boolean | undefined { + const value = error.details?.[key]; + return typeof value === 'boolean' ? value : undefined; +} + +function readStringDetail(error: AppError, key: string): string | undefined { + const value = error.details?.[key]; + return typeof value === 'string' ? value : undefined; } -function runnerReportedFailureHint(command: string): string { - return `The runner is still reachable and reports "${command}" failed after the transport response was lost, so agent-device kept the session open and did not replay it. 
Run snapshot -i to inspect the current UI and retry with a selector visible in that snapshot.`; +function readNumberDetail(error: AppError, key: string): number | undefined { + const value = error.details?.[key]; + return typeof value === 'number' ? value : undefined; } -function inFlightAfterLostResponseHint(command: string, lifecycleState: string): string { - return `The runner is still reachable and reports "${command}" is ${lifecycleState}, so agent-device kept the session open and will not replay it. Wait briefly, run snapshot -i to inspect the current UI, then continue from that observed state.`; +function readReadinessPreflightRecoveryDetails( + error: AppError, +): RunnerReadinessPreflightRecoveryDetails { + const details: RunnerReadinessPreflightRecoveryDetails = {}; + const skipped = readBooleanDetail(error, 'runnerReadinessPreflightSkipped'); + if (skipped !== undefined) details.readinessPreflightSkipped = skipped; + const reason = readStringDetail(error, 'runnerReadinessPreflightSkipReason'); + if (reason !== undefined) details.readinessPreflightSkipReason = reason; + const ageMs = readNumberDetail(error, 'runnerReadinessPreflightSkippedAgeMs'); + if (ageMs !== undefined) details.readinessPreflightSkippedAgeMs = ageMs; + return details; } function unknownLifecycleStateHint(command: string): string { diff --git a/src/platforms/ios/runner-session-types.ts b/src/platforms/ios/runner-session-types.ts index f72529575..3a114b207 100644 --- a/src/platforms/ios/runner-session-types.ts +++ b/src/platforms/ios/runner-session-types.ts @@ -15,6 +15,10 @@ export type RunnerSession = { child: ExecBackgroundResult['child']; ready: boolean; startupTimeoutMs?: number; + // Records the last allowlisted mutating interaction that the runner confirmed + // healthy (parsed ok, non-runnerFatal) for a given app bundle. Lives only on + // the session object so it dies with every invalidation/restart (#702). 
+ lastHealthyMutation?: { atMs: number; appBundleId?: string }; startupTimings?: Record; startupTimingsReported?: boolean; simulatorSetRedirect?: { release: () => Promise }; diff --git a/src/platforms/ios/runner-session.ts b/src/platforms/ios/runner-session.ts index 75e574d3d..44c8eb8ee 100644 --- a/src/platforms/ios/runner-session.ts +++ b/src/platforms/ios/runner-session.ts @@ -56,15 +56,37 @@ const runnerSessions = new Map(); const runnerSessionLocks = new Map>(); const RUNNER_READY_PREFLIGHT_TIMEOUT_MS = 1_000; const RUNNER_STALE_BUNDLE_UNINSTALL_TIMEOUT_MS = 10_000; +const RUNNER_PREFLIGHT_SKIP_FRESHNESS_MS = 5_000; +// Today's scroll verb is covered via 'drag'. The fused 'scroll' runner command +// (PR #760) must be added here when it lands, or hot scroll loops lose the skip. +const PREFLIGHT_SKIP_ELIGIBLE_RUNNER_COMMANDS = new Set([ + 'tap', + 'tapSeries', + 'longPress', + 'drag', + 'dragSeries', + 'swipe', +]); type RunnerReadinessPreflightDecision = | { action: 'run'; - reason: 'startup' | 'ready_session'; + reason: + | 'startup' + | 'conservative_command' + | 'no_recent_healthy_mutation' + | 'app_activation_uncertain' + | 'healthy_mutation_stale'; + lastHealthyMutationAgeMs?: number; } | { action: 'skip'; reason: 'read_only_startup_command' | 'readiness_probe_command'; + } + | { + action: 'skip'; + reason: 'recent_healthy_mutation'; + lastHealthyMutationAgeMs: number; }; function withRunnerSessionLock(deviceId: string, task: () => Promise): Promise { @@ -463,32 +485,78 @@ export async function executeRunnerCommandWithSession( emitRunnerReadinessPreflightSkipped(runnerCommand, session, preflightDecision); } - const response = await sendRunnerCommandAfterPreflight({ - device, - session, - runnerCommand, - logPath, - deadline, - timeoutMs, - signal, - readOnlyCommand, - }); + let response: Response; + try { + response = await sendRunnerCommandAfterPreflight({ + device, + session, + runnerCommand, + logPath, + deadline, + timeoutMs, + signal, + 
readOnlyCommand, + }); + } catch (error) { + // A transport failure right after a skipped preflight means the recency + // bet was wrong; clear it so a flaky transport cannot loop on stale skips, + // and mark the error with the skip context for status recovery. The marker + // key is disjoint from runnerReadinessPreflightFailed, so this never routes + // into the restart-and-replay path. + throw markSkippedPreflightTransportError(error, session, preflightDecision); + } try { const data = await parseRunnerResponse(response, session, logPath); const runnerFatalReason = resolveRunnerFatalReason(data); if (runnerFatalReason) { + session.lastHealthyMutation = undefined; await invalidateRunnerSession(session, runnerFatalReason); + } else if (PREFLIGHT_SKIP_ELIGIBLE_RUNNER_COMMANDS.has(runnerCommand.command)) { + session.lastHealthyMutation = { + atMs: Date.now(), + appBundleId: runnerCommand.appBundleId, + }; } return data; } catch (error) { const runnerFatalReason = resolveRunnerFatalErrorReason(error); if (runnerFatalReason) { + session.lastHealthyMutation = undefined; await invalidateRunnerSession(session, runnerFatalReason); + throw error; } - throw error; + // A body-read or malformed-payload failure is transport-shaped too (the + // runner died mid-response); structured runner failures carry a `runner` + // detail and keep their recency — the runner proved it is alive by + // answering at all. 
+ if (isStructuredRunnerFailure(error)) throw error; + throw markSkippedPreflightTransportError(error, session, preflightDecision); } } +function isStructuredRunnerFailure(error: unknown): boolean { + return error instanceof AppError && error.details?.runner !== undefined; +} + +function markSkippedPreflightTransportError( + error: unknown, + session: RunnerSession, + preflightDecision: RunnerReadinessPreflightDecision, +): unknown { + if ( + preflightDecision.action !== 'skip' || + preflightDecision.reason !== 'recent_healthy_mutation' + ) { + return error; + } + session.lastHealthyMutation = undefined; + return markRunnerPreflightError(error, { + runnerReadinessPreflightSkipped: true, + runnerReadinessPreflightSkipReason: preflightDecision.reason, + runnerReadinessPreflightSkippedAgeMs: preflightDecision.lastHealthyMutationAgeMs, + }); +} + async function sendRunnerCommandAfterPreflight(params: { device: DeviceInfo; session: RunnerSession; @@ -565,6 +633,7 @@ async function runRunnerReadinessPreflight(params: { command: runnerCommand.command, commandId: runnerCommand.commandId, reason: decision.reason, + lastHealthyMutationAgeMs: decision.lastHealthyMutationAgeMs, sessionReady: session.ready, timeoutMs: readinessTimeoutMs, }, @@ -587,6 +656,10 @@ function emitRunnerReadinessPreflightSkipped( command: runnerCommand.command, commandId: runnerCommand.commandId, reason: decision.reason, + lastHealthyMutationAgeMs: + decision.reason === 'recent_healthy_mutation' + ? 
decision.lastHealthyMutationAgeMs + : undefined, sessionReady: session.ready, }, }); @@ -691,9 +764,37 @@ function resolveRunnerReadinessPreflightDecision( reason: 'readiness_probe_command', }; } + if (!PREFLIGHT_SKIP_ELIGIBLE_RUNNER_COMMANDS.has(command.command)) { + return { + action: 'run', + reason: 'conservative_command', + }; + } + const record = session.lastHealthyMutation; + if (!record) { + return { + action: 'run', + reason: 'no_recent_healthy_mutation', + }; + } + if (command.appBundleId !== record.appBundleId) { + return { + action: 'run', + reason: 'app_activation_uncertain', + }; + } + const lastHealthyMutationAgeMs = Date.now() - record.atMs; + if (lastHealthyMutationAgeMs > RUNNER_PREFLIGHT_SKIP_FRESHNESS_MS) { + return { + action: 'run', + reason: 'healthy_mutation_stale', + lastHealthyMutationAgeMs, + }; + } return { - action: 'run', - reason: 'ready_session', + action: 'skip', + reason: 'recent_healthy_mutation', + lastHealthyMutationAgeMs, }; }
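
The skip-or-run ordering in `resolveRunnerReadinessPreflightDecision` can be easier to review as a standalone function. The following is a minimal sketch, not the code in this change: `SketchSession`, `SketchCommand`, `SketchDecision`, and `decidePreflight` are hypothetical names, the types are reduced to the fields the decision reads, the startup and readiness-probe branches are omitted, and the clock is passed in as `nowMs` (an assumption made for testability; the diff calls `Date.now()` directly).

```typescript
// Hypothetical, reduced shapes: the real RunnerSession and runner command carry more fields.
type SketchSession = { lastHealthyMutation?: { atMs: number; appBundleId?: string } };
type SketchCommand = { command: string; appBundleId?: string };

type SketchDecision =
  | { action: 'run'; reason: string; lastHealthyMutationAgeMs?: number }
  | { action: 'skip'; reason: 'recent_healthy_mutation'; lastHealthyMutationAgeMs: number };

// Values copied from the diff: 5-second freshness window and the mutating allowlist.
const FRESHNESS_MS = 5_000;
const SKIP_ELIGIBLE = new Set<string>([
  'tap',
  'tapSeries',
  'longPress',
  'drag',
  'dragSeries',
  'swipe',
]);

// Sketch of the ready-session branch only. Every check that fails falls back
// to running the uptime preflight; the skip is the last, narrowest outcome.
function decidePreflight(
  session: SketchSession,
  command: SketchCommand,
  nowMs: number, // assumption: injected clock instead of Date.now()
): SketchDecision {
  if (!SKIP_ELIGIBLE.has(command.command)) {
    return { action: 'run', reason: 'conservative_command' };
  }
  const record = session.lastHealthyMutation;
  if (!record) {
    return { action: 'run', reason: 'no_recent_healthy_mutation' };
  }
  if (command.appBundleId !== record.appBundleId) {
    return { action: 'run', reason: 'app_activation_uncertain' };
  }
  const ageMs = nowMs - record.atMs;
  if (ageMs > FRESHNESS_MS) {
    return { action: 'run', reason: 'healthy_mutation_stale', lastHealthyMutationAgeMs: ageMs };
  }
  return { action: 'skip', reason: 'recent_healthy_mutation', lastHealthyMutationAgeMs: ageMs };
}

// Usage: a tap 2 seconds after a healthy mutation for the same app skips the preflight.
const hotSession: SketchSession = {
  lastHealthyMutation: { atMs: 10_000, appBundleId: 'com.example.app' },
};
const hotTap = decidePreflight(
  hotSession,
  { command: 'tap', appBundleId: 'com.example.app' },
  12_000,
);
```

Passing the clock explicitly makes the freshness boundary straightforward to test: in this sketch an age of exactly 5,000 ms still skips, because the staleness check uses a strict `>` comparison, as it does in the diff.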