feat: opt-in parallel test execution (N=2) — experimental by kevinccbsg · Pull Request #6 · BRIKEV/twd-cli

kevinccbsg · 2026-04-22T08:43:03Z

Summary

Adds an opt-in parallel: true flag to twd.config.json that runs the TWD suite across two isolated Puppeteer browser contexts in parallel. Measured ~1.8× wallclock speedup on a 60-test suite (63.7 s → 34.8 s) on a developer laptop. Zero behavior change for users who don't set the flag.

This ships as experimental / beta — worker count is fixed at 2 for now; higher counts are gated on a future twd-js change that makes test IDs deterministic.

How to use

{
  "url": "http://localhost:5173",
  "parallel": true,        // ← opt-in, default false
  "retryCount": 2          // existing; honored per worker
}

All existing config (coverage, contracts, timeout, puppeteerArgs, headless, retryCount) works unchanged.

What changes for existing users

Nothing. Without parallel: true:

runTests() skips the new branch and runs the existing serial body unchanged.
The serial code itself was not refactored.
All 9 pre-existing runTests.test.js tests pass unchanged; a new test explicitly asserts the serial path is selected when the flag is absent and that anti-throttle flags are NOT in the launch args.
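The branch selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/index.js: `pickRunner` and the `runners` object are hypothetical names, and the only default taken from this PR is `parallel: false`.

```javascript
// Assumed shape of the defaults; only `parallel: false` is stated in this PR.
const DEFAULT_CONFIG = { parallel: false };

// Hypothetical helper: decide which code path handles this run.
// The parallel path is chosen only when the merged config has a truthy flag.
function pickRunner(userConfig, runners) {
  const config = { ...DEFAULT_CONFIG, ...userConfig };
  return config.parallel ? runners.parallel : runners.serial;
}

// Stand-ins for the existing serial body and the new runParallel module.
const runners = {
  serial: (config) => `serial run against ${config.url}`,
  parallel: (config) => `parallel run (N=2) against ${config.url}`,
};

const withoutFlag = pickRunner({ url: 'http://localhost:5173' }, runners);
const withFlag = pickRunner({ url: 'http://localhost:5173', parallel: true }, runners);
```

With the flag absent or false, the serial runner is returned, which is the property the new runTests.test.js assertions are described as checking.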

What's inside

New:

src/runParallel.js — orchestrates the parallel flow (~200 LoC): launch → 2 browser contexts → Promise.all of runByIds self-filtered by idx % N === workerIndex → per-worker coverage dump → merged mock validation → per-worker result trees + combined summary.
src/mergeMocks.js — pure utility that combines per-worker mock maps with worker-index-prefixed keys (defense-in-depth against twd-js random-ID collisions).
tests/runParallel.test.js — 15 unit tests covering launch args, anti-throttle flags, navigation, retry pass-through, pass/fail aggregation, coverage writes (including on failures), .nyc_output cleanup, contract mock exposure, merged-mock validation, and contract error propagation.
tests/mergeMocks.test.js — 5 unit tests for the merge utility.

Modified:

src/config.js — parallel: false added to DEFAULT_CONFIG.
src/index.js — early-return branch: if (config.parallel) return runParallel(...). Serial path below unchanged.
tests/runTests.test.js — 2 new tests asserting branch selection.
README.md — documents the new field with expected speedup and current limitations.

Key design decisions

Self-filter by index, not probe + distribute. Initial design probed one context, split IDs in Node, and passed chunks to each worker. That failed because twd-js generates test IDs via Math.random() — IDs from one context don't exist in another, so runByIds silently matched zero tests in workers ≥1. Each worker now enumerates its own __TWD_STATE__.handlers and takes the slots where idx % N === workerIndex. Registration order is stable across contexts; IDs are not.
Anti-throttle flags appended automatically. --disable-background-timer-throttling, --disable-renderer-backgrounding, --disable-backgrounding-occluded-windows are added to puppeteerArgs unless the user already set them. Measured to materially reduce waitFor timeouts at N≥2.
Coverage always dumps, even on failures. The serial path skips coverage on test failures; parallel always dumps so one worker's flake doesn't blind the other's data. Standard nyc report merges the files automatically.
Per-worker reports (not unified). Each worker's results render with reportResults independently, followed by a combined summary. Unified reporting needs canonical test identity across contexts, which is blocked on deterministic IDs in twd-js — tracked as a follow-up.
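As an illustration of the self-filter decision above, here is a small sketch of index-based partitioning. It assumes each context can list its handler IDs in registration order as a plain array; the real shape of `__TWD_STATE__.handlers` may differ, and `selectOwnIds` is a hypothetical name.

```javascript
// Returns the IDs one worker should run: every slot whose position in
// registration order satisfies idx % workerCount === workerIndex.
function selectOwnIds(handlerIds, workerIndex, workerCount) {
  return handlerIds.filter((_, idx) => idx % workerCount === workerIndex);
}

// Two simulated contexts: same registration order, different random IDs.
const contextA = ['a91', 'f03', 'c77', 'b12', 'e58'];
const contextB = ['q40', 'z85', 'm19', 'k66', 'x02'];

// Each worker filters its OWN list, so the IDs it passes on always exist
// in its own context.
const worker0 = selectOwnIds(contextA, 0, 2); // slots 0, 2, 4 of context A
const worker1 = selectOwnIds(contextB, 1, 2); // slots 1, 3 of context B
```

Because the slot positions are disjoint and together cover every index, each test runs exactly once as long as registration order is the same in both contexts.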

POC evidence

The approach was validated in a throwaway POC at poc/parallel/:

(a) SW isolation: 60/60 tests pass at N=2 on a real app with overlapping mocks across contexts — no cross-contamination.
(b) Coverage split: .nyc_output/out-0.json + out-1.json merge cleanly via nyc report at 85.91% statements — identical to serial baseline.
(c) Full suite completion: 60/60 with a 1.83× wallclock speedup (63.7 s serial → 34.8 s parallel).

See poc/parallel/README.md for the full findings, including the discovery of the random-ID issue and the concurrency ceiling at N≥3.
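To make the coverage split in (b) concrete, the sketch below plans one dump file per worker, whether or not that worker had failures. It is an assumption-laden illustration: `planCoverageDumps` and the `{ coverage, failed }` result shape are hypothetical, and the actual file writes are left out.

```javascript
// workerResults: one entry per worker, in worker order. `coverage` stands in
// for that worker page's window.__coverage__ and may be undefined when the
// app is not instrumented.
function planCoverageDumps(workerResults, outputDir = '.nyc_output') {
  return workerResults
    .map((result, workerIndex) => ({ result, workerIndex }))
    // Failures are deliberately NOT a reason to skip a dump.
    .filter(({ result }) => result.coverage !== undefined)
    .map(({ result, workerIndex }) => ({
      file: `${outputDir}/out-${workerIndex}.json`,
      body: JSON.stringify(result.coverage),
    }));
}

const dumps = planCoverageDumps([
  { failed: 1, coverage: { 'src/a.js': { s: { 0: 3 } } } },
  { failed: 0, coverage: { 'src/b.js': { s: { 0: 1 } } } },
]);
```

Writing each `body` to its `file` would give nyc report one JSON file per worker to merge.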

Known limitations / follow-ups

N is fixed at 2. A configurable workers: N option is gated on deterministic test IDs in twd-js.
Per-worker trees, not one unified tree. Also gated on deterministic IDs.
At N=3+, the 1-second waitFor default gets tight under CPU contention. Retries (which are fully supported) absorb most of this; a per-test timeout override is a future improvement.
test-example-app is not instrumented with istanbul, so the manual smoke run finds no __coverage__ on window. The coverage code path is fully unit-tested, and real coverage behavior was verified against an externally instrumented app in the POC.

Test plan

Unit tests: 199 passed (199) across 10 test files (npx vitest run)
Manual smoke against test-example-app with parallel: true: 71/71 pass, 67 mocks validated, contract report written to .twd/contract-report.md
Manual smoke against test-example-app without parallel: identical output to previous release — no behavior change
Try on the actual target app and compare wallclock to serial baseline in CI
Confirm npx nyc report on merged .nyc_output/out-<i>.json files works on CI runner
Decide on beta release tagging strategy (pre-release version vs. direct 1.2.0)

Related docs (in-tree)

Spec: docs/superpowers/specs/2026-04-21-parallel-test-execution-production-design.md
Plan: docs/superpowers/plans/2026-04-22-parallel-test-execution-production.md
POC spec/plan/findings: docs/superpowers/specs/2026-04-21-parallel-test-execution-poc-design.md, docs/superpowers/plans/2026-04-21-parallel-test-execution-poc.md, poc/parallel/README.md

🤖 Generated with Claude Code

Proves three POC criteria against the holafly/web-checkout app:
- Service-worker isolation works across browser.createBrowserContext(); 60/60 tests pass at N=2 with overlapping mocks and zero contamination.
- Per-worker __coverage__ dumps to .nyc_output/out-<i>.json merge cleanly via nyc report (85.91% statements, identical to serial baseline).
- Full suite completes with a 1.83× wallclock speedup (63.7s → 34.8s).

Includes anti-throttle Chromium flags, which reduced flakiness at moderate N values. Documents findings and the random-test-ID discovery that forced a self-filter-by-index pivot during implementation. Throwaway script; production feature will be implemented under src/ in a follow-up commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Specifies the opt-in `parallel: true` config flag and the supporting architecture: new src/runParallel.js module, src/mergeMocks.js utility, and a thin branch in src/index.js. Serial path remains byte-identical when the flag is absent or false. N=2 is hardcoded in this release; deterministic test IDs, unified reporting, and configurable worker counts are documented follow-ups. The plan decomposes the work into six TDD-style tasks (config default, mergeMocks utility, runParallel core, contract handling, index.js branch, manual smoke + README). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a `parallel` boolean field to the config schema with a default of false. No behavior change; subsequent commits wire the flag into a new runParallel module. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Worker-index-prefixed keys prevent silent collisions if two browser contexts happen to generate the same twd-js random testId. Each merged mock carries its workerIndex field for downstream testName resolution. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
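A minimal sketch of the merge described in this commit, assuming each worker collects its mocks into a Map with string keys. The `w<index>:` key prefix and the mock fields shown are hypothetical; only the ideas of worker-index-prefixed keys and a `workerIndex` field come from the commit message.

```javascript
// perWorkerMocks: array of Maps, one per worker, in worker order.
// Returns a single Map whose keys cannot collide across workers.
function mergeMocks(perWorkerMocks) {
  const merged = new Map();
  perWorkerMocks.forEach((mocks, workerIndex) => {
    for (const [key, mock] of mocks) {
      // Prefix the key and tag the mock so later steps know which
      // worker's handler tree to consult.
      merged.set(`w${workerIndex}:${key}`, { ...mock, workerIndex });
    }
  });
  return merged;
}

// Both workers happen to produce the same key (a random-ID collision).
const w0 = new Map([['abc:getUser', { alias: 'getUser', testId: 'abc' }]]);
const w1 = new Map([['abc:getUser', { alias: 'getUser', testId: 'abc' }]]);
const merged = mergeMocks([w0, w1]);
```

Without the prefix, the second `set` would silently overwrite the first entry; with it, both mocks survive and remain attributable to their worker.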

Introduces src/runParallel.js. Launches one Puppeteer browser with anti-throttle flags, creates two isolated browser contexts, navigates each to the configured URL, runs a test chunk via runByIds (self-filtered by idx % N === workerIndex inside page.evaluate), dumps per-worker window.__coverage__ to .nyc_output/out-<i>.json, and aggregates pass/fail/skip counts. Contract mock collection and merging land in the next commit. Not yet wired into src/index.js. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
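The anti-throttle handling mentioned here could look something like the sketch below. The three flag names are the ones listed in the PR description; `withAntiThrottleFlags` is a hypothetical helper name, and exact string matching is an assumption about how "already set" is detected.

```javascript
const ANTI_THROTTLE_FLAGS = [
  '--disable-background-timer-throttling',
  '--disable-renderer-backgrounding',
  '--disable-backgrounding-occluded-windows',
];

// Appends only the flags the user has not already supplied, preserving the
// user's own arguments and their order.
function withAntiThrottleFlags(puppeteerArgs = []) {
  const missing = ANTI_THROTTLE_FLAGS.filter((flag) => !puppeteerArgs.includes(flag));
  return [...puppeteerArgs, ...missing];
}

const launchArgs = withAntiThrottleFlags([
  '--no-sandbox',
  '--disable-renderer-backgrounding',
]);
```

The resulting array would be passed as the `args` option when launching the browser, so a flag the user set explicitly is never duplicated.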

Each worker exposes its own __twdCollectMock via page.exposeFunction, writing into a per-worker Map. After both workers complete, mergeMocks combines them with worker-indexed keys. Each mock carries workerIndex so buildTestPath can pick the correct handler tree for testName resolution. Validation and markdown reporting reuse the existing serial pipeline unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds an early-return branch at the top of runTests(): if config.parallel is truthy, load contract validators and hand off to runParallel. Serial code path below is textually unchanged and runs when parallel is absent or false. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds the `parallel` boolean to the Configuration Options table and a new section covering how the feature works, the 1.8× measured speedup, and current limitations (N=2 fixed, per-worker reporting, CI tuning). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-22T08:43:43Z

TWD Contract Validation

Spec	Passed	Failed	Warnings	Mode
`./contracts/users-3.0.json`	2	3	1	`warn`
`./contracts/posts-3.1.json`	2	2	0	`warn`
`./contracts/products-3.0.json`	13	23	2	`warn`
`./contracts/events-3.1.json`	6	13	0	`warn`

23 passed · 41 failed · 3 warnings · 1 skipped

Failed validations

./contracts/users-3.0.json

GET /users/{userId} (200) — mock getUserNoAddress — in "Contract Validation - Mismatches > should fail: missing nested address field"
- response.address: missing required property "address"
GET /users/{userId} (200) — mock getUserBadAddress — in "Contract Validation - Mismatches > should fail: nested address missing required city"
- response.address.city: missing required property "city"
- response.address.country: missing required property "country"
GET /users/{userId} (200) — mock getUserBadRole — in "Contract Validation - Mismatches > should fail: oneOf role with invalid variant"
- response.role: oneOf best match (branch 2 of 2) failed: must be one of: "viewer"

./contracts/posts-3.1.json

GET /posts/{postId} (200) — mock getPostNoAuthor — in "Contract Validation - Mismatches > should fail: post missing nested author object"
- response.author: missing required property "author"
GET /posts/{postId} (200) — mock getPostBadMeta — in "Contract Validation - Mismatches > should fail: post oneOf metadata matches neither variant"
- response.metadata: oneOf best match (branch 1 of 2) failed: missing required property "category", unexpected property "duration", must be one of: "article"

./contracts/products-3.0.json

GET /products (200) — mock getProductEmptyName — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: empty name violates minLength"
- response[0].name: must NOT have fewer than 1 characters
GET /products (200) — mock getProductBadSku — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid SKU pattern"
- response[0].sku: must match pattern "^[A-Z]{2,4}-\d{4,8}$"
GET /products (200) — mock getProductBadUuid — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid uuid format for id"
- response[0].id: must match format "uuid"
GET /products (200) — mock getProductBadDateTime — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid date-time format"
- response[0].createdAt: must match format "date-time"
GET /products (200) — mock getProductBadDate — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid date format"
- response[0].releaseDate: must match format "date"
GET /products (200) — mock getProductBadEmail — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid email format"
- response[0].contactEmail: must match format "email"
GET /products (200) — mock getProductBadUri — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid uri format"
- response[0].website: must match format "uri"
GET /products (200) — mock getProductBadIp — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid ipv4 format"
- response[0].serverIp: must match format "ipv4"
GET /products (200) — mock getProductBadIpV6 — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid ipv6 format"
- response[0].serverIpV6: must match format "ipv6"
GET /products (200) — mock getProductZeroPrice — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: price of 0 violates exclusiveMinimum"
- response[0].price: must be > 0
GET /products (200) — mock getProductNegQty — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: negative quantity violates minimum"
- response[0].quantity: must be >= 0
GET /products (200) — mock getProductOverQty — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: quantity exceeds maximum"
- response[0].quantity: must be <= 999999
GET /products (200) — mock getProductBadWeight — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: weight not multipleOf 0.01"
- response[0].weight: must be multiple of 0.01
GET /products (200) — mock getProductBadRating — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: rating above maximum (5)"
- response[0].rating: must be <= 5
GET /products (200) — mock getProductBadCurrency — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid enum value for currency"
- response[0].currency: must be one of: "USD", "EUR", "GBP", "JPY"
GET /products (200) — mock getProductBadCategory — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid enum value for category"
- response[0].category: must be one of: "electronics", "clothing", "food", "books", "toys"
GET /products (200) — mock getProductBadBool — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: string value for boolean inStock"
- response[0].inStock: expected boolean, got string
GET /products (200) — mock getProductDupTags — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: duplicate tags violates uniqueItems"
- response[0].tags: must NOT have duplicate items (items ## 1 and 0 are identical)
GET /products (200) — mock getProductTooManyTags — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: tags exceeds maxItems (10)"
- response[0].tags: must NOT have more than 10 items
GET /products (200) — mock getProductBadMeta — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: non-string value in metadata additionalProperties"
- response[0].metadata.count: expected string, got number
GET /settings (200) — mock getSettingsBadExtra — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: extra property on Settings (additionalProperties: false)"
- response.extraField: unexpected property "extraField"
GET /settings (200) — mock getSettingsBadLang — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: invalid language pattern in Settings"
- response.language: must match pattern "^[a-z]{2}(-[A-Z]{2})?$"
GET /products (200) — mock getProductBadNullable — in "Contract Validation - Products Mismatches (OpenAPI 3.0 — error mode) > should fail: wrong type for nullable description (number instead of string|null)"
- response[0].description: expected string,null, got number

./contracts/events-3.1.json

GET /events (200) — mock getEventsEmpty — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: empty events array violates minItems (1)"
- response: must NOT have fewer than 1 items
GET /events (200) — mock getEventShortName — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: event name too short (minLength: 3)"
- response[0].name: must NOT have fewer than 3 characters
GET /events (200) — mock getEventBadDate — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: invalid date-time format for startDate"
- response[0].startDate: must match format "date-time"
GET /events (200) — mock getEventFloatId — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: float value for integer id"
- response[0].id: expected integer, got number
- response[0].id: must match format "int64"
GET /events (200) — mock getEventBadBool — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: number value for boolean active"
- response[0].active: expected boolean, got number
GET /events (200) — mock getEventBadStatus — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: invalid enum value for status"
- response[0].status: must be one of: "draft", "published", "archived"
GET /events (200) — mock getEventScoreMax — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: score at exclusiveMaximum boundary (100)"
- response[0].score: must be < 100
GET /events (200) — mock getEventLowPriority — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: priority below minimum (1)"
- response[0].priority: must be >= 1
GET /events (200) — mock getEventHighPriority — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: priority above maximum (5)"
- response[0].priority: must be <= 5
GET /events (200) — mock getEventDupAttendees — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: duplicate attendees violates uniqueItems"
- response[0].attendees: must NOT have duplicate items (items ## 1 and 0 are identical)
GET /events (200) — mock getEventNoAttendees — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: empty attendees array violates minItems (1)"
- response[0].attendees: must NOT have fewer than 1 items
GET /events (200) — mock getEventBadAttendee — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: invalid email format in attendees"
- response[0].attendees[0]: must match format "email"
GET /events/{eventId} (200) — mock getEventBadNullable — in "Contract Validation - Events Mismatches (OpenAPI 3.1 — error mode) > should fail: wrong type for nullable description (number instead of string|null)"
- response.description: expected string,null, got number


kevinccbsg and others added 8 commits April 22, 2026 09:03

kevinccbsg wants to merge 8 commits into main from feat/parallel-execution
