fix(start): reap CDP children + fail fast on busy port (issue #25) #27

Merged
anilcancakir merged 4 commits into master from fix/start-cdp-port-failure-cleanup
Jun 9, 2026

@anilcancakir

Contributor

Closes #25.

Context

The primary symptom in issue #25 (CDP Page.navigate failing with -32601, VM Service scrape timing out) was already fixed on master by commit 871d0a7, which rewrote defaultChromeNavigate to select a page-type target instead of the browser-level endpoint. This PR addresses the issue's still-open secondary observation, locks the primary fix with a regression test, and confirms the whole flow end-to-end.
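
The page-target selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from commit 871d0a7: selectPageTargetWsUrl is a hypothetical helper, and the sample payload only assumes the usual shape of Chrome's /json target list (entries carrying type and webSocketDebuggerUrl fields).

```dart
import 'dart:convert';

/// Hypothetical helper (not from the PR): returns the WebSocket URL of the
/// first target whose type is "page", or null when no page target exists.
String? selectPageTargetWsUrl(String targetsJson) {
  final decoded = jsonDecode(targetsJson);
  if (decoded is! List) return null;
  for (final target in decoded) {
    if (target is Map && target['type'] == 'page') {
      final url = target['webSocketDebuggerUrl'];
      if (url is String) return url;
    }
  }
  return null;
}

void main() {
  // Sample target list; the IDs and port are made up for illustration.
  const sample = '''
[
  {"type": "service_worker", "webSocketDebuggerUrl": "ws://localhost:9399/devtools/page/A"},
  {"type": "page", "webSocketDebuggerUrl": "ws://localhost:9399/devtools/page/B"}
]''';
  print(selectPageTargetWsUrl(sample));
}
```

Code -32601 is the JSON-RPC "method not found" error, which fits the symptom: Page.navigate is only meaningful on a page target's socket, so filtering on type before connecting avoids sending it to the browser-level endpoint.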

Changes

  • Fail fast on a busy --port. _handleCdpBranch now probes webPort before launching Chrome (new cdpPortProbe seam: ServerSocket.bind, then close). If the port is already bound, the command exits 1 with an actionable error that names the port and suggests fsa stop or a different --port, instead of spawning into a doomed bind and surfacing only the generic 90s timeout. Nothing is spawned on a busy port.
  • Reap children on post-launch failure. The region from the flutter web-server spawn through the state.json write is wrapped in a failure-cleanup block. On any throw (navigate failure, scrape timeout, PID-capture failure) it reaps the spawned Chrome, the flutter web-server (holder + child PIDs), the FIFO, and the tmp profile directory, then surfaces the original error and returns 1. Cleanup is best-effort and never masks the original error. There is no SIGKILL grace loop; fsa stop remains the deliberate full reaper.
  • Regression test pinning page-target selection in defaultChromeNavigate against a fake CDP HTTP + WebSocket endpoint, so the -32601 fix cannot regress.
  • Docs: CHANGELOG [Unreleased] (Added + Fixed) and a state-and-recovery skill recovery section.
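
The bind-then-close probe from the first change can be sketched as below. This is a minimal sketch under stated assumptions: isPortFree and busyPortMessage are hypothetical names standing in for the cdpPortProbe seam and its error text, and binding on IPv4 loopback is an assumption, since the PR does not say which address the real probe uses.

```dart
import 'dart:io';

/// Hypothetical stand-in for the cdpPortProbe seam: true when [port] can be
/// bound on loopback. Binds, then closes immediately so nothing stays open.
Future<bool> isPortFree(int port) async {
  try {
    final socket = await ServerSocket.bind(InternetAddress.loopbackIPv4, port);
    await socket.close();
    return true;
  } on SocketException {
    return false;
  }
}

/// Illustrative wording only; the real error text is not quoted in the PR.
String busyPortMessage(int port) =>
    'Port $port is already in use. Run "fsa stop" or pass a different --port.';

Future<void> main() async {
  // Simulate a busy --port by holding an ephemeral port open.
  final holder = await ServerSocket.bind(InternetAddress.loopbackIPv4, 0);
  if (!await isPortFree(holder.port)) {
    stderr.writeln(busyPortMessage(holder.port));
    exitCode = 1; // fail fast: nothing has been spawned yet
  }
  await holder.close();
}
```

Probing before any process is launched means the busy-port path has nothing to clean up, which keeps it separate from the post-launch reaping in the second change.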

Scope is the --cdp-port branch only; the non-CDP branch and vmServicePort are untouched.
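
The best-effort failure-cleanup described in the second change could look roughly like this. It is a sketch, not the PR's _reapAfterCdpFailure: runWithFailureCleanup and its list-of-steps shape are hypothetical, chosen only to show that each cleanup step is attempted independently and that a failing step never replaces the original error.

```dart
import 'dart:io';

/// Hypothetical wrapper: runs [body]; on any throw, attempts every cleanup
/// step, ignores cleanup failures, reports the original error, returns 1.
Future<int> runWithFailureCleanup(
  Future<void> Function() body,
  List<Future<void> Function()> cleanupSteps,
) async {
  try {
    await body();
    return 0;
  } catch (error) {
    for (final step in cleanupSteps) {
      try {
        await step();
      } catch (_) {
        // Ignored on purpose: a cleanup failure must not mask [error].
      }
    }
    stderr.writeln('start failed: $error');
    return 1;
  }
}

Future<void> main() async {
  // Stand-in for the tmp profile directory created before launch.
  final tmpProfileDir = await Directory.systemTemp.createTemp('profile-');
  exitCode = await runWithFailureCleanup(
    () async => throw StateError('scrape timeout'),
    [
      // A step that fails (e.g. the FIFO is already gone) is skipped over.
      () async => throw FileSystemException('FIFO already gone'),
      () => tmpProfileDir.delete(recursive: true),
    ],
  );
}
```

In this sketch the second step still runs after the first one throws, so the temporary directory is removed and the reported error is the original "scrape timeout".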

Verification

  • dart analyze lib/ test/ bin/ is clean, dart format reports no diff, 1151 tests pass, and project lib/ line coverage is 84.53% (above the 80% gate).
  • Real-app end-to-end: fsa start --cdp-port 9399 --port 3199 against example/ returned exit 0, scraped vmServiceUri, and wrote a populated state.json (chromePid / cdpPort / tmpProfileDir). fsa stop then reaped Chrome + flutter + FIFO + tmp profile cleanly (port freed, both PIDs gone).

Note

_reapAfterCdpFailure takes 6 named params (above the 5-param style threshold). Left as-is deliberately: single internal caller, so a parameter object would be premature structure. Worth revisiting if a second reaper caller appears (e.g. V1.x consolidation with StopCommand._reapChrome).

Probe the web port before launching Chrome and fail fast with an actionable error (exit 1) when it is already bound, instead of spawning into a doomed bind and surfacing only a generic 90s timeout.

Wrap the post-launch region of the --cdp-port branch in failure-cleanup that reaps the spawned Chrome, the flutter web-server (holder + child PIDs), the FIFO, and the tmp profile directory on any throw, then surfaces the original error and returns 1. Cleanup is best-effort and never masks the original error.

Add a regression test that exercises defaultChromeNavigate against a fake CDP endpoint, pinning page-target selection so the -32601 fix cannot regress to the browser-level endpoint.
…#25)

Add CHANGELOG [Unreleased] entries (Added: fail-fast port-in-use error; Fixed: failure-time child reaping) and a state-and-recovery skill section covering the busy-port error and post-probe reaping.
Copilot AI review requested due to automatic review settings June 9, 2026 09:41

Copilot AI left a comment

Pull request overview

This PR hardens the start --cdp-port flow to address issue #25 by failing fast when the requested web port is already in use, and by cleaning up spawned CDP-related resources on post-launch failure. It also adds regression coverage to prevent the prior CDP navigation fix (page-target selection) from regressing, and updates release/skill documentation accordingly.

Changes:

  • Add a web-port availability probe to fail fast before launching Chrome when --port is already bound.
  • Add best-effort cleanup on post-launch failures to reap Chrome/flutter processes and delete the FIFO/tmp profile directory.
  • Add regression tests for CDP page-target selection in defaultChromeNavigate, plus docs and changelog updates.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| lib/src/commands/start_command.dart | Adds port probe seam, fail-fast behavior on busy --port, and post-launch cleanup logic for CDP start failures. |
| test/commands/start_command_test.dart | Adds tests for busy-port fail-fast, cleanup behavior, and a regression test for CDP page-target selection. |
| skills/fluttersdk-artisan/references/state-and-recovery.md | Documents the new fail-fast behavior and updated recovery guidance for CDP start failures. |
| CHANGELOG.md | Records the new fail-fast and cleanup behaviors under [Unreleased]. |

Comments suppressed due to low confidence (1)

lib/src/commands/start_command.dart:463

  • The failure-cleanup try/catch starts after log/FIFO setup, so if logFile.parent.create, logFile.writeAsString, or _ensureFifo throws, the already-launched detached Chrome (and tmp profile dir) will not be reaped. That leaves the same kind of orphaned Chrome/tmp-profile artifacts this PR is aiming to prevent on post-launch failures.
    final logFile = File('${_logDir()}/flutter-dev.log');
    await logFile.parent.create(recursive: true);
    await logFile.writeAsString('');

    final fifoPath = '${_logDir()}/flutter-dev.fifo';
    await _ensureFifo(fifoPath);

Comment thread test/commands/start_command_test.dart Outdated
Comment thread CHANGELOG.md Outdated
Comment thread skills/fluttersdk-artisan/references/state-and-recovery.md Outdated
…est (PR #27 review)

Widen the CDP failure-cleanup try to include log-dir + FIFO creation, so a throw there (e.g. mkfifo failure) still reaps the already-launched Chrome and tmp profile dir instead of leaking them. Add a regression test for that path.

Bind the page-target regression test's fake CDP server to the 'localhost' hostname so it matches defaultChromeNavigate's http://localhost client and stays robust across IPv4/IPv6 resolution.
…#27 review)

CHANGELOG: cleanup failures are ignored silently, not logged; reword to match the code. state-and-recovery: the busy-port fast-fail reports the web port (--port), not the CDP port; clarify the heading, cause, and recovery.
@anilcancakir

Contributor Author

Addressed all Copilot review points in commits 51cfcd9 + ac34e21:

  1. Cleanup gap before the try (low-confidence comment, start_command.dart): valid. The failure-cleanup try now also wraps log-dir creation + _ensureFifo, so a throw during log/FIFO setup still reaps the already-launched Chrome and tmp profile dir. Added a regression test (FIFO setup throws after Chrome launch: returns 1, reaps Chrome + tmp profile dir), confirmed to fail against the pre-fix structure.
  2. test:909 localhost vs 127.0.0.1: fixed. The fake CDP server now binds the localhost hostname (not the IPv4 literal) to match defaultChromeNavigate's http://localhost client, so it stays robust across IPv4/IPv6 resolution.
  3. CHANGELOG.md:17 "logged" claim: fixed. Cleanup failures are swallowed silently, so the wording is now "cleanup failures are ignored and never mask the original error".
  4. state-and-recovery.md:292 web port vs CDP port: fixed. The fast-fail reports the web port (--port), not the CDP port; the heading, cause, and recovery now reference the web port explicitly.
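
Point 2 can be illustrated with a small sketch of a fake target-list server bound to the localhost hostname and queried through http://localhost. The names startFakeCdpServer and fetchTargets are hypothetical, and the WebSocket half of the test's fake endpoint is omitted here.

```dart
import 'dart:convert';
import 'dart:io';

/// Hypothetical fake: serves a one-entry target list on an ephemeral port,
/// bound by hostname so it follows whatever "localhost" resolves to.
Future<HttpServer> startFakeCdpServer() async {
  final server = await HttpServer.bind('localhost', 0);
  server.listen((HttpRequest request) async {
    request.response
      ..headers.contentType = ContentType.json
      ..write(jsonEncode([
        {
          'type': 'page',
          'webSocketDebuggerUrl':
              'ws://localhost:${server.port}/devtools/page/1',
        },
      ]));
    await request.response.close();
  });
  return server;
}

/// Fetches the target list the same way a localhost-based client would.
Future<String> fetchTargets(int port) async {
  final client = HttpClient();
  try {
    final request =
        await client.getUrl(Uri.parse('http://localhost:$port/json'));
    final response = await request.close();
    return await response.transform(utf8.decoder).join();
  } finally {
    client.close();
  }
}

Future<void> main() async {
  final server = await startFakeCdpServer();
  print(await fetchTargets(server.port));
  await server.close(force: true);
}
```

Binding by hostname keeps the server and the client on the same resolution path; a server pinned to the 127.0.0.1 literal could be missed on a host where localhost resolves to an IPv6 address first.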

Full suite green (1152 tests), dart analyze lib/ test/ bin/ clean, project lib/ coverage 84.53%.

@anilcancakir anilcancakir merged commit 774ba5e into master Jun 9, 2026
1 check passed
@anilcancakir anilcancakir deleted the fix/start-cdp-port-failure-cleanup branch June 9, 2026 09:54
@anilcancakir anilcancakir mentioned this pull request Jun 9, 2026
5 tasks
anilcancakir added a commit that referenced this pull request Jun 9, 2026
…t port)

Bump to 0.0.7: CDP failure-cleanup + fail-fast on busy --port (issue #25, #27), plus version-consistency sweep across example pubspecs, MCP server version, and skill docs.

Development

Successfully merging this pull request may close these issues.

start --cdp-port: CDP Page.navigate fails with -32601, VM Service scrape times out

2 participants