test: comprehensive integration + coverage tests for session-coordinator and session-runtime#142
test: comprehensive integration + coverage tests for session-coordinator and session-runtime#142
Conversation
Cover CAPABILITIES_INIT_REQUESTED (no backend, unsupported adapter, dedup, timer timeout, SESSION_CLOSING clears timer), orchestrateSessionInit (with/without capabilities, git resolver), and CAPABILITIES_APPLIED (commands registration).
…nse, closeBackend, debounce)
…nd lifecycle Tests cover: - applyPolicyCommandForSession: idle_reap, reconnect_timeout, capabilities_timeout - withMutableSession lease guard skips fn for nonexistent sessions - closeSessionInternal logs warning when backend close() throws - createSession model propagation to session snapshot state - onProcessSpawned relay handler seeds cwd, model, and adapterName
Cover the two previously uncovered code paths: - Lines 73-76: logger.warn called when Promise.allSettled yields a rejected result during batch relaunch (verifies allSettled semantics — other sessions still processed when one fails) - Lines 99-104: teardownDomainSubscriptions iteration called from stop(), ensuring all three domain-event off() calls are made; also covers the idempotency guard and the never-started edge case
Covers lines 22-25 (deadline reached while group alive) and lines 31-41 (EPERM polling loop, EPERM-at-deadline, unexpected error code) in the internal waitForProcessGroupDead helper inside node-process-manager.ts. Uses vi.useFakeTimers() to control setTimeout-driven polling without real delays, and vi.spyOn(process, 'kill') to inject EPERM / ESRCH / EBADF errors at the signal-0 check-alive call site.
Covers three previously-uncovered lines in consumer-gatekeeper.ts: - Line 102: clearTimeout called inside cleanup() when auth resolves before the timeout fires (verified via spy on globalThis.clearTimeout). - Line 112: .catch() path where cleanup() returns false — socket was already cancelled before the auth promise rejected; returns null instead of re-throwing. - Line 137: fallback rate-limit config (burstSize:20, tokensPerSecond:50) when config.consumerMessageRateLimit is absent (undefined) at runtime.
Covers four previously-uncovered code paths in IdlePolicy: - Line 32: double-start guard — verifies start() is idempotent by asserting domainEvents.on is called exactly 3 times even when start() is invoked twice. - Line 53: double-subscribe guard — verifies ensureDomainSubscriptions() does not register duplicate event listeners when eventCleanups is already populated. - Lines 103-113: runSweep() early-exit guard — uses synchronous vi.advanceTimersByTime() to enqueue a sweep without draining microtasks, then calls stop() before Promise.resolve() so that runSweep() sees !running and returns before touching the bridge; also covers the null-snapshot continue path (line 113) with a getSession mock that returns null for one session. - Line 119: lastActivity null-coalescing — passes a snapshot with no lastActivity field and asserts the session is reaped (treated as epoch-old). Bonus: covers the !domainEvents guard path (line 53 first clause) by instantiating IdlePolicy without domainEvents and verifying periodic sweep still runs.
The SESSION_CLOSING timer-cleared test was asserting that no
`session_closed` broadcast occurred after advancing timers, but the
CAPABILITIES_TIMEOUT timer fires `deps.emitEvent("capabilities:timeout",
...)` — not a broadcast. The old assertion passed vacuously regardless
of whether the timer was actually cleared.
Replace the broadcast filter with a check on `deps.emitEvent` for
`capabilities:timeout` calls, which is the real observable side-effect
of the timer firing.
… test - Finding 1 (CRITICAL): replace three-hop private chain `reconnectController.deps.bridge.applyPolicyCommand` with the single-hop `applyPolicyCommandForSession` that is already used throughout the file, removing the `policyBridge` variable entirely. - Finding 2 (HIGH): add precondition assertion in the `idle_reap` test to verify the session is in `awaiting_backend` lifecycle state before the policy command is applied, so a regression will surface at setup rather than on the post-condition check. - Finding 3 (HIGH): replace silent `if (backendSession)` guard with a hard `expect(backendSession).not.toBeNull()` assertion so a failure to connect the backend causes an immediate, descriptive test failure rather than a confusing `logger.warn` assertion mismatch.
…erage tests Tests 1 and 3 now capture the killSpy return value and filter mock calls for `(-pid, 0)` — the exact signature used by the polling loop — before asserting at least one such call occurred. This ensures the polling path actually executed, consistent with the pattern already used in Test 4.
…coverage - Replace two Promise.resolve() flushes with four flushPromises() calls imported from src/testing/cli-message-factories.ts; the relaunchStaleSessions chain has multiple async hops (timer → relaunch reject → Promise.allSettled → result iteration → logger.warn) that require more than two microtask turns - Rename idempotency test title to accurately describe the guard that fires: "stop() is idempotent — second call exits early because reconnectTimer is already null" (the reconnectTimer null-check in stop(), not the teardownDomainSubscriptions empty-array guard) - Update inline comment to clarify teardownDomainSubscriptions is not reached on the second stop() call
Summary of ChangesHello @teng-lin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the robustness and reliability of the session management system by introducing a comprehensive suite of new integration and coverage tests. The added tests target critical failure scenarios, race conditions, and resource cleanup, ensuring that previously untested branches in core policies and managers are now thoroughly validated. Furthermore, several identified code quality issues within the new tests have been resolved, leading to a more stable and well-understood codebase. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request adds a comprehensive set of integration and coverage tests, significantly improving the test suite's robustness. The new tests cover various edge cases and failure paths in session management, policy dispatch, and process lifecycle handling. My review focuses on improving the reliability of some asynchronous testing patterns. I've suggested replacing brittle timer-based waits with more robust methods like vi.waitUntil and vi.runAllTicks to prevent test flakiness, aligning with our guidelines for deterministic testing.
| await Promise.resolve(); | ||
| await Promise.resolve(); // two ticks: sweepChain.then wraps runSweep |
There was a problem hiding this comment.
Using multiple Promise.resolve() calls to advance microtask ticks can be brittle. vitest provides vi.runAllTicks() to flush the microtask queue completely, which is more robust and clearly expresses the intent.
| await Promise.resolve(); | |
| await Promise.resolve(); // two ticks: sweepChain.then wraps runSweep | |
| await vi.runAllTicks(); |
References
- In tests, replace non-deterministic synchronization methods with deterministic helpers to avoid flakiness.
| await flushPromises(); | ||
| await flushPromises(); | ||
| await flushPromises(); | ||
| await flushPromises(); |
There was a problem hiding this comment.
Using multiple flushPromises() calls to wait for asynchronous operations is brittle. If the underlying async chain changes, this test could become flaky. A more robust approach is to wait until the expected side effect occurs using vi.waitUntil.
| await flushPromises(); | |
| await flushPromises(); | |
| await flushPromises(); | |
| await flushPromises(); | |
| await vi.waitUntil(() => (logger.warn as any).mock.calls.length > 0); |
References
- In tests, avoid using brittle synchronization methods and instead wait for specific asynchronous events to prevent flakiness.
Cover three previously uncovered branches in slash-command-chain.ts: - Line 112 (LocalHandler catch): String(err) path when rejection value is not an Error instance (plain string and number rejections) - Line 164 (AdapterNativeHandler.execute): early-return guard when adapterSlashExecutor is null at execute time - Line 180 (AdapterNativeHandler.execute): if (!result) return guard when adapter executor resolves null or undefined
Cover lines 111, 225, and 276 — the three null-return branches that were not exercised by the existing tests: - Line 111: default case in translateCodexEvent switch (unknown event type) - Line 225: null return in translateItemAdded when item.type is neither message nor function_call (e.g. function_call_output, unknown types) - Line 276: null return in translateItemDone when item.type is not one of the three handled kinds (function_call_output / function_call / message)
Cover the two uncovered branch groups in child-process-supervisor.ts: - lines 124-125: stopAll() — verified with zero sessions, one session, and multiple sessions all stopped concurrently via Promise.all - lines 137-138: removeSession() — verified session map entry deleted, session count decremented, process handle removed, and non-existent ID handled without throwing
Added 8 new tests to reconnect-policy-coverage.test.ts covering: - process:connected and backend:connected events clearing watchdogs (clearOnConnect callback, lines 86/89) - session:closed event via mock domainEvents (clearOnClose callback) - ensureDomainSubscriptions guard when no domainEvents dep provided (line 83) - start() reconnectTimer guard when called twice (line 26) - start() early-exit when no starting sessions (line 29) - archived session skip during relaunch (line 65) - clearWatchdog no-op for unwatched sessionId (line 108) Branch coverage: 65% -> 95% (target was >=90%).
Bring src/core/messaging/message-tracer.ts branch coverage from 83.97% to 92.81% (target was ≥90%) by covering the following previously untested branches: - error() emit call: parentTraceId forwarding, auto-generated traceId, requestId/command/phase/outcome passthrough, and sessionId resolution from an existing open trace via resolveSessionId() - getSeq() MAX_SESSIONS eviction path (pre-fill 1000 session entries to trigger the oldest-entry eviction before inserting a new session) - emit() catch block (line 586): circular-reference body causes JSON.stringify to throw, exercising the minimal fallback event path - sweepStale() defensive break (line 663): monkey-patch Set.values() to return undefined so the else-break is exercised - summary() stale count for matching session (line 411) - smartSanitize() "type" field array collapse (lines 156-159): arrays of >3 objects with "type" (not "role") collapse to "[N messages]" - roughObjectSize() final return 8 fallback (line 262): Symbol type hits the non-string/number/boolean/array/object branch
…r deterministic async
Cover six previously uncovered branches in codex-session.ts: - Lines 279-280: requestRpc timeout callback fires when RPC does not respond - Line 356: resetThread awaits in-flight initializingThread before clearing state - Line 364: resetThread throws when ensureThreadInitialized leaves threadId null - Line 657: handleNotification else-branch when translateCodexEvent returns null - Line 883: translateResponseItem default case for unknown item types - Line 906: applyTraceToUnified copies requestId to slash_request_id Branch coverage rises from 88.42% to 91.66% (threshold: ≥90%).
Cover five previously uncovered branches in cloudflared-manager.ts (reported as buffered-relay-manager in the coverage task): - Line 207: scheduleRestart() timer fires when stopped=false, calling spawnProcess() - Line 130: handleData() early-return when urlFound is already true - Line 152: onError() false-branch when error fires after URL already found - Line 170: onExit() false-branch when stopped=true suppresses scheduleRestart() - Line 187: buildArgs() production mode without metricsPort omits --metrics flag Branch coverage for cloudflared-manager.ts rises from 83.33% to 97.22%.
Cover three previously uncovered branches in opencode-adapter.ts:
- Line 122: connect() guard that throws when httpClient is missing after
ensureServer() resolves (simulated by mocking ensureServer to resolve
without setting httpClient)
- Lines 194-195: reserveEphemeralPort() rejection path when server.address()
returns null or a string (defensive check for non-TCP/unexpected sockets),
covered via vi.mock('node:net') with a controllable createServer override
- Line 255: runSseLoop() guard that throws when httpClient is undefined
(called directly before ensureServer has run)
Additional tests for SSE retry loop aborted-signal paths (lines 229, 234)
and the for-await signal.aborted break (line 260) bring branch coverage
from 78.94% to 92.1%, exceeding the ≥90% threshold.
Targets BackendRecoveryService (backend-recovery-service.ts) branch coverage gaps, lifting it from 88.88% to 100%: - Line 88: `info.adapterName ?? "unknown"` — exercises the nullish- coalesce fallback by reconnecting a no-PID session with no adapterName, confirming the logger emits "unknown" in the message. - Line 132: `if (this.stopped) return` inside scheduleDedupClear's timer callback — uses a temporary clearTimeout mock to prevent stop() from cancelling the pending timer, then advances fake time past the dedup window so the callback fires while stopped === true.
Cover two previously uncovered branches in src/utils/crypto/pairing.ts:
- Line 97: handlePairingRequest returns {success:false} when sealOpen
succeeds but decrypted payload is not 32 bytes (non-32-byte sealed plaintext)
- Line 168: parsePairingLink throws when decoded public key is not 32 bytes
Statement/line coverage: 85.71% → 100%
Branch coverage: 85.71% → 95.23%
Covers previously uncovered lines 198, 207-226, 269 and extends to reach ≥90% branch coverage for the file: - Line 198: consumeStream() early return when this.query is null - Lines 207-209: system:init with missing/empty session_id (falsy branch) - Lines 217-226: catch block — stream throws while session is open - Line 269: pushInput() early return when inputDone is true - Lines 83-84: send(interrupt) optional chain with null query - Lines 146-150: startQueryLoop resume option with and without backendSessionId - Lines 166-181: canUseTool callback allow and deny decision paths - Lines 259-260: createInputIterable next() waiting branch (sets inputResolve) - Lines 271-273: pushInput() resolves a pending inputResolve promise - Lines 283-285: finishInput() resolves pending inputResolve with done:true Before: 86.79% line coverage (lines 198, 207-226, 269 uncovered) After: 100% lines, 100% statements, 100% functions, 90.56% branches
Summary
Comprehensive test coverage expansion: integration tests for session lifecycle, plus targeted branch coverage tests for 20+ previously under-covered source files. Also includes CI fixes and PR review improvements.
What's Included
Integration Tests (4 files, 34 tests)
Branch Coverage Tests (20+ files, 100+ new tests)
Core policies & session:
Adapters:
Daemon & utils:
Other:
Code Quality Fixes
Addressed 7 high-priority findings from dual code review (Claude + Gemini CLI):
CI & PR Review Fixes
session-runtime-test-helpers.tsTest Results
Testing
🤖 Generated with Claude Code