Openspec/architecture alignment v2 #2
Conversation
- Create test-eyes-reporter package in collectors/playwright-reporter
- Implement Playwright Reporter interface for automatic data collection
- Add CLI for JUnit XML support (any test framework)
- Extend schema with flakyCount for tracking flaky tests
- Add deploy command for dashboard deployment
- Update frontend to display flaky test count column

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add structured error handling with detailed logging
- Add 19 unit tests for aggregate, junit-parser, file-operations
- Add vitest as dev dependency

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Break submitData() from 95 lines into 6 focused methods (~15 lines each)
- Add SubmitContext interface for cleaner data flow
- Extract mapStatus(), checkoutDataBranch(), saveData(), commitAndPush()
- Follow golden rule: functions under 20 lines

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
titlePath() returns empty strings for the root suite/project, causing test names like " > > example.spec.ts > auth > ...". Now correctly outputs: "example.spec.ts > auth > should login". Bumped to v0.1.3 and published to npm.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
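The fix described above boils down to filtering the empty root-suite/project segments out of Playwright's `titlePath()` before joining. A minimal sketch (illustrative only — the reporter's actual code may differ, and `formatTestName` is a hypothetical name):

```typescript
// Hypothetical helper showing the v0.1.3 behavior: drop empty segments
// that titlePath() emits for the root suite and project before joining.
function formatTestName(titlePath: string[]): string {
  return titlePath.filter(segment => segment.length > 0).join(' > ')
}

// Before the fix, the two leading empty strings produced " > > example.spec.ts > …"
console.log(formatTestName(['', '', 'example.spec.ts', 'auth', 'should login']))
// → example.spec.ts > auth > should login
```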
Previously, all runs on the same day with the same commit SHA would overwrite each other. Now the format is date_HHMMSS_sha (e.g. 2026-03-20_144506_d68aea9). Bumped to v0.1.4.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
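The run-id scheme above can be sketched as follows — a minimal reconstruction mirroring the single-snapshot variant suggested later in the review comments (the `now` parameter is added here for testability):

```typescript
// Sketch of the v0.1.4 run-id format: date_HHMMSS_sha (e.g. 2026-03-20_144506_d68aea9).
function generateRunId(commitSha: string, now: Date = new Date()): string {
  const iso = now.toISOString()                                 // "2026-03-20T14:45:06.000Z"
  const date = iso.split('T')[0]                                // "2026-03-20"
  const time = iso.split('T')[1].slice(0, 8).replace(/:/g, '')  // "144506"
  return `${date}_${time}_${commitSha.slice(0, 7)}`             // short SHA disambiguates runs
}

console.log(generateRunId('d68aea9f00', new Date('2026-03-20T14:45:06Z')))
// → 2026-03-20_144506_d68aea9
```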
📝 Walkthrough
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Playwright as Playwright<br/>(test runner)
    participant Reporter as TestEyes<br/>Reporter
    participant Git as Git<br/>Operations
    participant DataBranch as gh-data<br/>Branch
    participant Agg as Aggregation<br/>Engine
    Playwright->>Reporter: onTestEnd(test)
    Reporter->>Reporter: Extract result,<br/>mark flaky if retried
    Playwright->>Reporter: onEnd()
    Reporter->>Git: getCurrentSha(),<br/>getCurrentBranch()
    Reporter->>Git: checkoutBranch<br/>(gh-data)
    Reporter->>Reporter: Generate run data<br/>(runId, tests)
    Reporter->>Reporter: saveRunData()<br/>to data/
    Reporter->>Agg: aggregate(dataDir)
    Agg->>Agg: Load existing<br/>main-test-data.json
    Agg->>Agg: Scan for new<br/>JSON run files
    Agg->>Agg: Merge test stats,<br/>calc avg & p95<br/>duration
    Agg->>Agg: Track flaky<br/>counts
    Agg->>DataBranch: Save<br/>main-test-data.json
    Reporter->>Git: stageFiles(),<br/>commit(),<br/>push(gh-data)
    Reporter->>Git: checkoutBranch<br/>(original)
```

```mermaid
sequenceDiagram
    participant CLI as test-eyes<br/>CLI
    participant Parser as JUnit<br/>Parser
    participant Reporter as Reporter/<br/>Aggregator
    participant Git as Git<br/>Operations
    participant Deploy as Dashboard<br/>Deploy
    CLI->>Parser: parseJUnitFile(path)
    Parser->>Parser: Extract testcase,<br/>status, duration
    Parser-->>CLI: TestResult[]
    CLI->>Git: configureGit()
    CLI->>Reporter: aggregate(dataDir)
    Reporter->>Reporter: Reconstruct<br/>statistics
    Reporter->>Reporter: Calc duration<br/>percentiles
    CLI->>Git: stage, commit,<br/>push to gh-data
    alt Deploy enabled
        CLI->>Deploy: deployDashboard()
        Deploy->>Git: checkoutBranch<br/>(gh-data)
        Deploy->>Deploy: Fetch aggregated<br/>data JSON
        Deploy->>Deploy: Copy frontend<br/>dist + data
        Deploy->>Git: checkoutBranch<br/>(gh-pages)
        Deploy->>Deploy: Commit & push<br/>gh-pages
    end
```
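The "Merge test stats, calc avg & p95 duration" step in the diagram can be sketched as a small pure function — this is illustrative (nearest-rank p95, rounded mean), and the real aggregation engine may compute these statistics differently:

```typescript
// Illustrative merge of one test's per-run durations into the aggregated
// avgDurationMs / p95DurationMs fields; the real aggregate.ts may differ.
function mergeDurations(durations: number[]): { avgDurationMs: number; p95DurationMs: number } {
  const sorted = [...durations].sort((a, b) => a - b)
  const avg = Math.round(sorted.reduce((sum, d) => sum + d, 0) / sorted.length)
  // Nearest-rank 95th percentile: the value at ceil(0.95 * n) in the sorted list.
  const p95 = sorted[Math.max(0, Math.ceil(0.95 * sorted.length) - 1)]
  return { avgDurationMs: avg, p95DurationMs: p95 }
}

// One slow outlier dominates p95 while the average stays moderate.
console.log(mergeDurations([100, 120, 110, 900]))
```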
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 19
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/frontend/src/App.tsx (1)
42-42: ⚠️ Potential issue | 🟡 Minor — Typo in state setter name.

`setSllowestSorting` has an extra 'l'. This doesn't affect functionality but reduces readability.

✏️ Fix typo

```diff
-const [slowestSorting, setSllowestSorting] = useState<SortingState>([{ id: 'p95DurationMs', desc: true }])
+const [slowestSorting, setSlowestSorting] = useState<SortingState>([{ id: 'p95DurationMs', desc: true }])
```

Also update the usage at line 98:

```diff
-  onSortingChange: setSllowestSorting,
+  onSortingChange: setSlowestSorting,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/frontend/src/App.tsx` at line 42, There's a typo in the state setter name: rename setSllowestSorting to setSlowestSorting where the state is declared (const [slowestSorting, setSllowestSorting] = useState<SortingState>(...)) and update all usages of setSllowestSorting elsewhere in this component (e.g., the handler referenced near the sorting update) to the corrected identifier so the state follows the slowestSorting / setSlowestSorting convention and remains consistent.
🟡 Minor comments (4)
.openspec/changes/playwright-reporter/02-solution.md (1)
43-46: ⚠️ Potential issue | 🟡 Minor — Inconsistent flaky detection logic with updated spec.

This document specifies flaky detection as `result.status === 'passed' && result.retry > 0`, but the updated spec at `openspec/changes/architecture-alignment/specs/playwright-reporter/spec.md` requires using `test.outcome() === 'flaky'` instead. The documentation should be updated to reflect the canonical approach.

📝 Suggested update

```diff
 ## Flaky Detection Logic
-wasFlaky: result.status === 'passed' && result.retry > 0
+isFlaky: test.outcome() === 'flaky'
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.
In @.openspec/changes/playwright-reporter/02-solution.md around lines 43 - 46,
Update the flaky-detection expression to use the canonical Playwright API:
replace the old logic that computes wasFlaky using result.status and
result.retry (wasFlaky: result.status === 'passed' && result.retry > 0) with an
isFlaky check that calls test.outcome() === 'flaky' (rename the variable to
isFlaky and use test.outcome() to determine flakiness).

openspec/changes/architecture-alignment/specs/flow-testing/spec.md (1)

28-30: ⚠️ Potential issue | 🟡 Minor — Inconsistency between spec and implementation: `isFlaky` vs `wasFlaky`.

The spec references `isFlaky: true` but the `TestResult` interface in both `collectors/playwright-reporter/src/types.ts` and `apps/test-processing/src/types.ts` uses `wasFlaky`. This should be aligned to avoid confusion during implementation.

📝 Proposed fix

```diff
 #### Scenario: Flaky test produces correct output
 - **WHEN** reporter receives 3 attempts (fail, fail, pass) for one test with outcome 'flaky'
-- **THEN** pushToGitHub receives RunData with 1 test having isFlaky: true, retries: 2, status: 'passed'
+- **THEN** pushToGitHub receives RunData with 1 test having wasFlaky: true, retries: 2, status: 'passed'
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/changes/architecture-alignment/specs/flow-testing/spec.md` around lines 28 - 30, The spec uses isFlaky but the codebase defines wasFlaky on TestResult; update the spec language so it references wasFlaky to match the existing TestResult types (collectors/playwright-reporter/src/types.ts and apps/test-processing/src/types.ts) and any RunData pushed to pushToGitHub so expectations read "wasFlaky: true" (keep retries: 2 and status: 'passed' unchanged); alternatively, if you prefer changing code instead, rename the field on TestResult and all usages (including RunData and pushToGitHub consumers) from wasFlaky to isFlaky consistently across those two modules.

.openspec/changes/playwright-reporter/04-status.md (1)
15-15: ⚠️ Potential issue | 🟡 Minor — Typo in demo URL.

The URL appears to contain a typo: `test-eyeT` should likely be `test-eyes`.

📝 Proposed fix

```diff
-- **Demo:** danylosoft-create.github.io/test-eyeT/
+- **Demo:** danylosoft-create.github.io/test-eyes/
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.openspec/changes/playwright-reporter/04-status.md at line 15, The demo URL contains a typo: update the string "Demo: danylosoft-create.github.io/test-eyeT/" to the correct URL "Demo: danylosoft-create.github.io/test-eyes/" so the demo link points to the intended "test-eyes" page (look for the line starting with "Demo:" that includes "test-eyeT").

collectors/playwright-reporter/package.json (1)
47-50: ⚠️ Potential issue | 🟡 Minor — Replace the placeholder repository metadata before publishing.
Leaving the manifest pointed at a placeholder repo will send npm users to a dead project page.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/package.json` around lines 47 - 50, The package.json currently contains a placeholder "repository" object; update the repository metadata by replacing the placeholder "type" and "url" values in the repository field with the actual VCS type (e.g., "git") and the real repository URL for this project (or remove the repository key if this package should not expose a public repo), and confirm package.json's repository entry matches the repo used by your CI/publishing pipeline so npm users are directed to the correct project page.
🧹 Nitpick comments (9)
collectors/playwright-reporter/src/file-operations.ts (3)
10-13: Use one timestamp snapshot in `generateRunId`.

Line 11 and Line 12 call `toISOString()` separately; a single snapshot avoids edge-case date/time mismatch.

Suggested tweak

```diff
 export function generateRunId(commitSha: string): string {
-  const now = new Date()
-  const date = now.toISOString().split('T')[0]
-  const time = now.toISOString().split('T')[1].slice(0, 8).replace(/:/g, '')
+  const iso = new Date().toISOString()
+  const [date, timePart] = iso.split('T')
+  const time = timePart.slice(0, 8).replace(/:/g, '')
   const shortSha = commitSha.slice(0, 7)
   return `${date}_${time}_${shortSha}`
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/src/file-operations.ts` around lines 10 - 13, The code builds date and time from two separate toISOString() calls causing possible mismatch; in generateRunId capture a single timestamp (e.g., const iso = new Date().toISOString()) and derive date and time from that single iso value (use iso.split('T')[0] for date and iso.split('T')[1].slice(0,8).replace(/:/g,'') for time) while keeping shortSha = commitSha.slice(0,7); update occurrences of now/date/time to use the single snapshot to avoid edge-case mismatches.
54-60: Differentiate missing-directory from real filesystem failures.

Line 58 currently converts all `readdir` failures into `[]`; permission or transient FS failures should surface instead of silently skipping data.

Suggested error handling split

```diff
 export async function listJsonFiles(dirPath: string): Promise<string[]> {
   try {
     const files = await fs.readdir(dirPath)
     return files.filter((f: string) => f.endsWith('.json'))
-  } catch {
-    return []
+  } catch (error) {
+    const err = error as NodeJS.ErrnoException
+    if (err.code === 'ENOENT') return []
+    throw error
   }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/src/file-operations.ts` around lines 54 - 60, The current listJsonFiles function swallows all fs.readdir errors and returns an empty array; change error handling so only a missing-directory (ENOENT) returns [] while other filesystem errors are rethrown. In practice, inside listJsonFiles catch the thrown error, check error.code === 'ENOENT' (or error?.code) and return [] for that case, otherwise throw the original error so permission or transient failures surface; reference the listJsonFiles function and its use of fs.readdir when making this change.
32-38: Avoid swallowing all JSON/file errors in `loadJson`.

Returning `null` for every failure path hides malformed JSON and permission issues, which makes production failures hard to diagnose.

Suggested error handling split

```diff
 export async function loadJson<T>(filepath: string): Promise<T | null> {
   try {
     const content = await fs.readFile(filepath, 'utf-8')
     return JSON.parse(content) as T
-  } catch {
-    return null
+  } catch (error) {
+    const err = error as NodeJS.ErrnoException
+    if (err.code === 'ENOENT') return null
+    throw error
   }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/src/file-operations.ts` around lines 32 - 38, loadJson currently swallows all errors and always returns null, hiding malformed JSON and permission problems; update the loadJson(filepath: string) implementation so it only returns null for "file not found" (ENOENT) cases but rethrows or propagates other errors (like JSON.parse syntax errors or EACCES permission errors); specifically, inside loadJson catch the thrown error, inspect error.code === 'ENOENT' and return null only then, otherwise rethrow the error so callers see parse/permission failures.

collectors/playwright-reporter/tests/aggregate.test.ts (1)
71-107: Consider adding p95 calculation test.

The multi-run test verifies `avgDurationMs` but doesn't test `p95DurationMs` calculation. Given that p95 is displayed in the dashboard's "Slowest Tests" table, consider adding a test case with enough data points to verify percentile logic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/tests/aggregate.test.ts` around lines 71 - 107, Add a test assertion that verifies the p95 percentile is computed by aggregate: create enough run entries for the test case (use the existing TEST_DIR, run objects like run1/run2 or add run3..runN) with varying durationMs so the 95th percentile is meaningful, call aggregate(TEST_DIR), read main-test-data.json into mainData, and add an expectation checking mainData.tests['test1'].p95DurationMs equals the expected p95 value (referencing aggregate and main-test-data.json to locate where the value is produced and stored).

collectors/playwright-reporter/src/junit-parser.ts (3)
4-45: Unused interfaces and helper functions.

The interfaces `JUnitTestCase`, `JUnitTestSuite`, `JUnitReport` and helper functions `normalizeToArray`, `getTestStatus`, and `parseTestCase` are defined but never used. The actual parsing logic uses `parseTestAttrs` instead. These appear to be remnants from a previous XML library-based approach.

Consider removing this dead code to reduce maintenance burden.

🧹 Proposed cleanup

```diff
 import { readFile } from 'fs/promises'
 import type { TestResult } from './types.js'

-// JUnit XML types
-interface JUnitTestCase {
-  '@_classname'?: string
-  '@_name': string
-  '@_time'?: string
-  failure?: unknown
-  error?: unknown
-  skipped?: unknown
-}
-
-interface JUnitTestSuite {
-  '@_name'?: string
-  testcase?: JUnitTestCase | JUnitTestCase[]
-}
-
-interface JUnitReport {
-  testsuites?: { testsuite: JUnitTestSuite | JUnitTestSuite[] }
-  testsuite?: JUnitTestSuite
-}
-
-function normalizeToArray<T>(value: T | T[] | undefined): T[] {
-  if (!value) return []
-  return Array.isArray(value) ? value : [value]
-}
-
-function getTestStatus(testCase: JUnitTestCase): TestResult['status'] {
-  if (testCase.skipped !== undefined) return 'skipped'
-  if (testCase.failure !== undefined || testCase.error !== undefined) return 'failed'
-  return 'passed'
-}
-
-function parseTestCase(testCase: JUnitTestCase): TestResult {
-  const className = testCase['@_classname'] || ''
-  const testName = testCase['@_name'] || 'unknown'
-  const name = className ? `${className} ${testName}` : testName
-
-  return {
-    name: name.trim(),
-    durationMs: Math.round(parseFloat(testCase['@_time'] || '0') * 1000),
-    status: getTestStatus(testCase)
-  }
-}
-
 export function parseJUnitXml(xml: string): TestResult[] {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/src/junit-parser.ts` around lines 4 - 45, Remove the dead/unused JUnit XML artifacts: delete the interfaces JUnitTestCase, JUnitTestSuite, JUnitReport and the helper functions normalizeToArray, getTestStatus, and parseTestCase from junit-parser.ts; verify that parseTestAttrs (and any other active parsing helpers) remain untouched, search the repo for any remaining references to those symbols before committing, and run type-checks/tests to ensure no regressions.
71-74: Regex attribute parsing has edge-case limitations.

The attribute regex `/name=["']([^"']+)["']/` won't handle:

- XML entities in attribute values (e.g., `&quot;`, `&amp;`)
- Test names containing the quote character used as delimiter
This is acceptable for most JUnit outputs, but consider documenting this limitation or using a more robust XML parsing approach if broader compatibility is needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/src/junit-parser.ts` around lines 71 - 74, The regex-based parsing in parseTestAttrs (function parseTestAttrs) fails for attributes containing XML entities or quote chars used as delimiters; update the implementation to use a proper XML/HTML parser (e.g., DOMParser in Node/browser or an XML library like fast-xml-parser/xml2js) to extract name, classname and time attributes reliably and unescape entities, or at minimum document this limitation near parseTestAttrs/TestResult explaining that names with embedded quotes or XML entities are not supported and recommend using a real XML parser for broader compatibility.
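To make the limitation concrete, here is a minimal sketch of attribute extraction that also decodes the common XML entities. The names `parseAttr` and `decodeEntities` are hypothetical, not from the codebase, and the delimiter-quote limitation the comment mentions still applies:

```typescript
// Hypothetical entity-aware attribute parsing; the project's parseTestAttrs may differ.
function decodeEntities(value: string): string {
  return value
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&') // run last so "&amp;quot;" decodes to the literal "&quot;"
}

function parseAttr(tag: string, attr: string): string | undefined {
  // Match either quote style; still cannot handle the delimiter quote inside the value.
  const match = tag.match(new RegExp(`${attr}=["']([^"']*)["']`))
  return match ? decodeEntities(match[1]) : undefined
}

console.log(parseAttr(`<testcase name="login &amp; logout" time="0.5">`, 'name'))
// → login & logout
```

A real XML parser remains the robust option; this only patches the entity case.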
94-97: Consider adding error handling for file read failures.

The `parseJUnitFile` function will throw if the file doesn't exist or isn't readable. Consider whether a more graceful error message would help users diagnose issues.

💡 Optional improvement

```diff
 export async function parseJUnitFile(filepath: string): Promise<TestResult[]> {
-  const xml = await readFile(filepath, 'utf-8')
-  return parseJUnitXml(xml)
+  try {
+    const xml = await readFile(filepath, 'utf-8')
+    return parseJUnitXml(xml)
+  } catch (error) {
+    throw new Error(`Failed to parse JUnit file "${filepath}": ${error instanceof Error ? error.message : error}`)
+  }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/src/junit-parser.ts` around lines 94 - 97, The parseJUnitFile function currently lets readFile errors bubble up; wrap the await readFile(filepath, 'utf-8') and subsequent parseJUnitXml(xml) in a try/catch inside parseJUnitFile, and on error throw or reject with a clearer message that includes the filepath and the original error (e.g., "Failed to read/parse JUnit file: <filepath> - <original error>"), or return a controlled value if preferred; refer to parseJUnitFile and parseJUnitXml when adding the try/catch and include the original error as the cause/context for easier debugging.

collectors/playwright-reporter/src/aggregate.ts (1)
116-120: Skipped tests increment `totalRuns` but are not reflected in pass/fail counts.

A skipped test increments `stats.totalRuns` (line 114) but doesn't increment any status counter. This creates a gap where `passCount + failCount < totalRuns` for tests with skipped runs. If this is intentional, consider adding a `skipCount` field for completeness; otherwise, skipped tests may need different handling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/src/aggregate.ts` around lines 116 - 120, The current aggregation increments stats.totalRuns but only updates stats.passCount or stats.failCount for test.status 'passed'/'failed', leaving skipped runs uncounted; modify the aggregation logic in aggregate.ts to handle test.status 'skipped' by adding a new counter stats.skipCount (initialize where stats is created) and increment it when test.status === 'skipped', or alternatively adjust existing logic to not increment totalRuns for skipped runs—update references to stats.totalRuns, stats.passCount, stats.failCount, and the new stats.skipCount accordingly so passCount + failCount + skipCount == totalRuns.

collectors/playwright-reporter/src/git-operations.ts (1)
55-58: Avoid shell error suppression with `|| true`.

The `|| true` masks all errors from `git rm -rf .`, making failures invisible. Handle the error in JavaScript instead, so you can differentiate between "nothing to remove" (expected) and actual failures.

♻️ Proposed fix

```diff
 export async function createOrphanBranch(branch: string): Promise<void> {
-  await run(`git checkout --orphan ${branch}`)
-  await run('git rm -rf . || true')
+  await run('git', ['checkout', '--orphan', branch])
+  try {
+    await run('git', ['rm', '-rf', '.'])
+  } catch {
+    // Ignore error when there are no files to remove (empty repo)
+  }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@collectors/playwright-reporter/src/git-operations.ts` around lines 55 - 58, In createOrphanBranch, stop using the shell-suppressed command "git rm -rf . || true" and instead await run('git rm -rf .') inside a try/catch so we can handle failures explicitly: remove the "|| true", wrap the await run call in try/catch, inspect the thrown error (message/stdout/stderr) from run and silently ignore only the benign "did not match any files" / "pathspec" (no files to remove) cases, but rethrow or log and propagate any other errors; reference createOrphanBranch and the run helper to locate and update the code.
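Several of the injection findings below hinge on switching the `run` helper from shell-string interpolation to argv-style execution. A minimal sketch of such a helper — the `run(cmd, args)` signature is an assumption for illustration, not the project's actual implementation:

```typescript
// Hypothetical argv-based run helper: execFile spawns the binary directly
// (no shell), so arguments such as branch names or commit messages are
// passed verbatim and never interpreted by a shell.
import { execFile } from 'node:child_process'
import { promisify } from 'node:util'

const execFileAsync = promisify(execFile)

async function run(cmd: string, args: string[]): Promise<string> {
  const { stdout } = await execFileAsync(cmd, args)
  return stdout.trim()
}

// A ref containing shell metacharacters arrives as one literal argument.
run('node', ['-e', 'console.log(process.argv.pop())', 'feature/x; rm -rf /']).then(out => {
  console.log(out) // prints: feature/x; rm -rf /
})
```

Usage would then look like `run('git', ['checkout', branch])` instead of `` run(`git checkout ${branch}`) ``.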
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/test-processing/src/aggregate.ts`:
- Around line 87-89: The code increments stats.flakyCount when test.wasFlaky but
older aggregated records may lack flakyCount causing NaN; before incrementing in
the aggregation logic around the stats object (where stats and test.wasFlaky are
used) ensure flakyCount is initialized to a numeric 0 if missing or not a number
(e.g., set stats.flakyCount = 0 when typeof stats.flakyCount !== 'number' or use
a fallback) and then perform the increment so flaky totals remain correct.
In `@collectors/playwright-reporter/package.json`:
- Around line 30-38: The package.json for collectors/playwright-reporter is
missing a dependency on the shared collection core in apps/test-processing, so
the repo continues to use the duplicated runtime files (aggregate.ts,
git-operations.ts, file-operations.ts, deploy.ts); update
collectors/playwright-reporter's package.json to add a dependency (or workspace
reference) to the apps/test-processing package, change imports in the reporter
code to import the shared orchestration/collection functions from that package
instead of the local duplicate modules (search for imports referencing
aggregate, git-operations, file-operations, deploy) and remove the duplicated
modules from the runtime path so the reporter delegates collection/orchestration
to the shared core.
- Around line 12-18: The package.json exports currently include a "require"
condition pointing at "./dist/index.js" while the package uses "type": "module",
causing CommonJS consumers to fail; either remove the "require" field from the
exports map (exports["."].require) or produce and point it to a dedicated
CommonJS build (e.g., compile a .cjs bundle such as "./dist/index.cjs" and
update exports["."].require to that file), and ensure the "types" and ESM
"import"/"default" still point to the ESM outputs (e.g., "./dist/index.d.ts" and
"./dist/index.js") so Node CJS consumers can require the .cjs file without
breaking ESM consumers.
In `@collectors/playwright-reporter/README.md`:
- Around line 15-19: Update the installation instructions in the README: replace
the incorrect package name string "test-eyes-reporter" with the published
package name "@practica/test-eyes" in both npm and pnpm examples so users can
successfully install the released package; ensure both occurrences in the
markdown block use "@practica/test-eyes".
- Around line 30-37: The reporter string in the Playwright config's reporter
array ('test-eyes-reporter') must match the actual npm package name; open the
reporter configuration where the reporter array is defined (the reporter: [...]
block) and replace 'test-eyes-reporter' with the package name exactly as
specified in the reporter package.json (or the installed module name), keeping
the existing options object (dataBranch, deployBranch, deploy) intact so
Playwright can resolve the module.
In `@collectors/playwright-reporter/src/aggregate.ts`:
- Around line 88-92: The reconstructed durations map currently fills samples
using stats.avgDurationMs (for each entry in data.tests), which loses historical
variance and breaks p95 accuracy; modify the loop that builds durations (the
code iterating over data.tests and calling durations.set) to first check for and
use stats.durationSamples (an array of real samples) when present, falling back
to Array(stats.totalRuns).fill(stats.avgDurationMs) only for backward
compatibility, and additionally consider using a stored stats.p95 value when
neither samples nor reliable totals exist to avoid synthetic recalculation.
In `@collectors/playwright-reporter/src/cli.ts`:
- Around line 51-60: The CLI currently accepts unvalidated branch refs (options
'data-branch' and any deploy branch from collectCommand / parseArgs) and later
passes them into git helper shell commands, creating a command-injection risk;
update collectCommand to validate ref names (e.g., allow only [A-Za-z0-9._/-]
and reject/escape anything else) before using them in any git helper, or change
the git helper invocation to use argv-style process execution (spawn/execFile)
rather than shell interpolation; ensure you apply the same validation or
safe-call change for the deploy branch path referenced around the code handling
deploy (lines near 138-149) so both dataBranch and deployBranch are sanitized or
passed as safe args to the git helpers.
- Around line 92-133: The checkout to the data branch and subsequent
save/aggregate/push sequence can leave the repo on the data branch if an error
is thrown; wrap the block that begins after switching branches (the
save/aggregate/commit/push flow that calls ensureDir, saveRunData, aggregate,
stageFiles, hasChanges, commit, push, and deployDashboard) in a try/finally and
perform the checkout back to originalBranch in the finally (using
checkoutBranch(originalBranch)) so the workspace is always restored; preserve
the existing logic that skips checkout when originalBranch is falsy or equals
dataBranch and keep deployDashboard only after successful work inside the try.
In `@collectors/playwright-reporter/src/deploy.ts`:
- Around line 57-158: The deployDashboard function can leave the repo on the
wrong branch or skip removing the tempDir if an exception occurs; wrap the main
work (everything after saving originalBranch and before cleanup) in a
try/finally so that in the finally block you always call await
checkoutBranch(originalBranch) and await fs.rm(tempDir, { recursive: true,
force: true }) (or equivalent cleanup), and ensure any early returns (e.g., when
no local frontend) still perform the cleanup by moving that logic inside the try
and returning after cleanup; reference deployDashboard, tempDir, originalBranch,
checkoutBranch, and fs.rm to locate the changes.
- Around line 81-98: deployDashboard currently uses a hard-coded localDist
('apps/frontend/dist') preventing the exposed frontendDistPath option from being
honored; update deployDashboard (and its options type) to accept and use the
frontendDistPath option (or import frontendDistPath from
collectors/playwright-reporter/src/types.ts) instead of the hard-coded
localDist, then replace uses of localDist in the fileExists and copyDir calls
(and any logging) so external repos can pass their build output; ensure the
options/interface signature that calls deployDashboard (and the variable
tempDir/fileExists/copyDir usage) is updated accordingly so the override is
threaded through.
In `@collectors/playwright-reporter/src/git-operations.ts`:
- Around line 60-64: The stageFiles function is injecting untrusted pattern
strings directly into a shell command causing command injection; update
stageFiles to call the underlying runner with argv (not a shell-interpolated
string) — e.g. invoke the run helper or child_process API using an argument
array like ['git','add', pattern] (or run('git', ['add', pattern])) for each
pattern so the pattern is passed as a single argument and not interpreted by the
shell; ensure any helper run function supports argv arrays and adjust its call
sites if needed.
- Around line 75-91: The commit() and push() functions call run() with
interpolated strings making message and branch vulnerable to shell-injection and
their catch blocks swallow errors; update commit() and push() to call run() with
safe, argumentized execution (use an API that accepts argv arrays or escape
arguments instead of string interpolation) so the message and branch are passed
as separate args, and in both catch blocks log the caught error (include error
details and context mentioning commit(message) or push(branch)) before returning
null/false (or rethrow if preferred); reference the commit, push, run, and
getCurrentSha symbols when making these changes.
- Around line 33-53: The branch parameter is currently interpolated into shell
commands in fetchBranch, branchExists, and checkoutBranch which allows command
injection; update these functions to call the underlying execFile-style runner
with argument arrays (e.g., run('git', ['fetch', 'origin', branch + ':' +
branch]) or preferably split into ['fetch','origin',`${branch}:${branch}`],
run('git', ['rev-parse','--verify', branch]) and run('git', ['checkout',
branch]) ) instead of string interpolation, or modify the run helper to accept
(cmd, args) and use child_process.execFile/spawn without a shell; ensure you do
not reintroduce shell=true and keep error handling unchanged.
- Around line 20-23: The configureGit function is vulnerable to command
injection because it interpolates config.userName and config.userEmail into
shell strings passed to run; update configureGit to call a non-shell command
executor (e.g., use child_process.execFile or spawn with an args array) instead
of string interpolation, or change the run helper to accept a command and an
args array and call run('git', ['config', 'user.name', config.userName]) and
run('git', ['config', 'user.email', config.userEmail]); ensure inputs are passed
as arguments (not concatenated into a shell) and validate/trim
userName/userEmail before passing them.
In `@collectors/playwright-reporter/src/reporter.ts`:
- Around line 112-117: The upload path is currently ignoring commit()/push()
failures and uses a force-push, so change saveToDataBranch (and any callers like
submitData) to fail-fast on any commit or push error: have saveToDataBranch
return/throw on push failure instead of returning success, and ensure
reporter.ts (the block around saveToDataBranch, returnToOriginalBranch, deploy)
checks that result and aborts deployment when upload fails. In addition, remove
or change the force-push behavior in git-operations.ts push() (stop using
--force) and handle push conflicts (return error to caller) so concurrent CI
jobs cannot silently overwrite each other. Ensure both the first occurrence
(around lines 112–117) and the other occurrence (around lines 177–188) follow
the same fail-fast behavior and propagate push errors up to prevent deploying
stale data.
- Around line 78-84: onTestEnd is currently appending an entry for every attempt
(using result.status), causing duplicate entries for retried tests; change it to
record only the final logical outcome by using test.outcome() instead of
result.status and derive wasFlaky from test.outcome() and test.retries/run
history (so you push one entry per TestCase into this.tests). Update the
onTestEnd implementation to use test.outcome() for status, compute wasFlaky from
the test's final/flaky indicators, and ensure aggregate() continues to operate
on this.tests which will now contain only final outcomes.
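A sketch of the suggested shape, using trimmed stand-ins for Playwright's `TestCase` type (only the members used here are modeled); the helper name `toFinalResult` is hypothetical:

```typescript
// Stand-ins for Playwright's types; only the members used are modeled.
interface AttemptResult { duration: number }
interface TestCaseLike {
  titlePath(): string[]
  outcome(): 'expected' | 'unexpected' | 'flaky' | 'skipped'
  results: AttemptResult[]
}

type Status = 'passed' | 'failed' | 'skipped'

// Called once per TestCase (e.g. from onEnd), not once per attempt,
// so a retried test yields exactly one entry.
export function toFinalResult(test: TestCaseLike) {
  const outcome = test.outcome()
  const status: Status =
    outcome === 'unexpected' ? 'failed'
    : outcome === 'skipped' ? 'skipped'
    : 'passed' // 'expected' and 'flaky' both end in a pass
  const last = test.results[test.results.length - 1]
  return {
    name: test.titlePath().filter(Boolean).join(' > '),
    durationMs: last ? last.duration : 0,
    status,
    wasFlaky: outcome === 'flaky',
  }
}
```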
In `@openspec/changes/architecture-alignment/proposal.md`:
- Around line 33-41: The CLI relocation is a breaking change because the
published package no longer exposes the previous binaries (test-eyes /
test-eyes-reporter) after removing collectors/playwright-reporter/src/cli.ts and
moving collection into collectors/junit/; either mark this in the proposal as a
breaking change or add a compatibility shim that re-exports or re-installs the
old CLI entrypoints (e.g., restore a small module at
collectors/playwright-reporter/src/cli.ts that imports and delegates to the new
collectors/junit/ entry, and update package.json "bin" to point to those legacy
paths) so existing consumers invoking the published binaries continue to work.
In
`@openspec/changes/architecture-alignment/specs/multi-format-collectors/spec.md`:
- Around line 3-12: Update the spec for collectFromRunData to require that
runData aggregation collapses multiple retry attempts of the same logical test
into a single TestResult (e.g., identify tests by a stable key such as
testId/name + suite/path) and compute flaky counts per logical test rather than
per-attempt; change both Playwright and JUnit scenarios to assert that
collectFromRunData deduplicates attempts, produces one TestResult per logical
test, and increments a flakyCount when a test has mixed pass/fail attempts
across retries before committing/pushing.
In `@openspec/changes/architecture-alignment/specs/playwright-reporter/spec.md`:
- Around line 29-38: The reporter currently duplicates collection, git and file
logic; remove the local implementations in git-operations.ts, file-operations.ts
and aggregate.ts and instead import collectFromRunData from the test-processing
package; update the reporter's onEnd handler (the function named onEnd in the
reporter) to call collectFromRunData(runData) and use its returned payload for
any subsequent save/commit/push operations, replacing calls to local functions
previously defined in git-operations.ts, file-operations.ts and aggregate.ts;
ensure any tests or callers now reference the imported collectFromRunData and
delete the redundant local modules.
---
Outside diff comments:
In `@apps/frontend/src/App.tsx`:
- Line 42: There's a typo in the state setter name: rename setSllowestSorting to
setSlowestSorting where the state is declared (const [slowestSorting,
setSllowestSorting] = useState<SortingState>(...)) and update all usages of
setSllowestSorting elsewhere in this component (e.g., the handler referenced
near the sorting update) to the corrected identifier so the state follows the
slowestSorting / setSlowestSorting convention and remains consistent.
---
Minor comments:
In @.openspec/changes/playwright-reporter/02-solution.md:
- Around line 43-46: Update the flaky-detection expression to use the canonical
Playwright API: replace the old logic that computes wasFlaky using result.status
and result.retry (wasFlaky: result.status === 'passed' && result.retry > 0) with
an isFlaky check that calls test.outcome() === 'flaky' (rename the variable to
isFlaky and use test.outcome() to determine flakiness).
In @.openspec/changes/playwright-reporter/04-status.md:
- Line 15: The demo URL contains a typo: update the string "Demo:
danylosoft-create.github.io/test-eyeT/" to the correct URL "Demo:
danylosoft-create.github.io/test-eyes/" so the demo link points to the intended
"test-eyes" page (look for the line starting with "Demo:" that includes
"test-eyeT").
In `@collectors/playwright-reporter/package.json`:
- Around line 47-50: The package.json currently contains a placeholder
"repository" object; update the repository metadata by replacing the placeholder
"type" and "url" values in the repository field with the actual VCS type (e.g.,
"git") and the real repository URL for this project (or remove the repository
key if this package should not expose a public repo), and confirm package.json's
repository entry matches the repo used by your CI/publishing pipeline so npm
users are directed to the correct project page.
In `@openspec/changes/architecture-alignment/specs/flow-testing/spec.md`:
- Around line 28-30: The spec uses isFlaky but the codebase defines wasFlaky on
TestResult; update the spec language so it references wasFlaky to match the
existing TestResult types (collectors/playwright-reporter/src/types.ts and
apps/test-processing/src/types.ts) and any RunData pushed to pushToGitHub so
expectations read "wasFlaky: true" (keep retries: 2 and status: 'passed'
unchanged); alternatively, if you prefer changing code instead, rename the field
on TestResult and all usages (including RunData and pushToGitHub consumers) from
wasFlaky to isFlaky consistently across those two modules.
---
Nitpick comments:
In `@collectors/playwright-reporter/src/aggregate.ts`:
- Around line 116-120: The current aggregation increments stats.totalRuns but
only updates stats.passCount or stats.failCount for test.status
'passed'/'failed', leaving skipped runs uncounted; modify the aggregation logic
in aggregate.ts to handle test.status 'skipped' by adding a new counter
stats.skipCount (initialize where stats is created) and increment it when
test.status === 'skipped', or alternatively adjust existing logic to not
increment totalRuns for skipped runs—update references to stats.totalRuns,
stats.passCount, stats.failCount, and the new stats.skipCount accordingly so
passCount + failCount + skipCount == totalRuns.
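The first option (adding `skipCount`) could look like this sketch; the `Stats` shape and the `applyRun` name are illustrative, not the actual code in `aggregate.ts`:

```typescript
interface Stats {
  totalRuns: number
  passCount: number
  failCount: number
  skipCount: number
}

type Status = 'passed' | 'failed' | 'skipped'

// Every run increments exactly one of the three counters, so the
// invariant passCount + failCount + skipCount === totalRuns holds.
export function applyRun(stats: Stats, status: Status): void {
  stats.totalRuns++
  if (status === 'passed') stats.passCount++
  else if (status === 'failed') stats.failCount++
  else stats.skipCount++
}
```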
In `@collectors/playwright-reporter/src/file-operations.ts`:
- Around line 10-13: The code builds date and time from two separate
toISOString() calls causing possible mismatch; in generateRunId capture a single
timestamp (e.g., const iso = new Date().toISOString()) and derive date and time
from that single iso value (use iso.split('T')[0] for date and
iso.split('T')[1].slice(0,8).replace(/:/g,'') for time) while keeping shortSha =
commitSha.slice(0,7); update occurrences of now/date/time to use the single
snapshot to avoid edge-case mismatches.
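A sketch of the single-snapshot fix; the optional injected `now` parameter is an addition for testability, not part of the original signature:

```typescript
// One Date snapshot, so date and time can never straddle a boundary
// (e.g. a run starting at 23:59:59.999).
export function generateRunId(commitSha: string, now: Date = new Date()): string {
  const iso = now.toISOString()                                  // single snapshot
  const date = iso.split('T')[0]                                 // e.g. 2026-03-20
  const time = iso.split('T')[1].slice(0, 8).replace(/:/g, '')   // e.g. 144506
  const shortSha = commitSha.slice(0, 7)
  return `${date}_${time}_${shortSha}`
}
```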
- Around line 54-60: The current listJsonFiles function swallows all fs.readdir
errors and returns an empty array; change error handling so only a
missing-directory (ENOENT) returns [] while other filesystem errors are
rethrown. In practice, inside listJsonFiles catch the thrown error, check
error.code === 'ENOENT' (or error?.code) and return [] for that case, otherwise
throw the original error so permission or transient failures surface; reference
the listJsonFiles function and its use of fs.readdir when making this change.
- Around line 32-38: loadJson currently swallows all errors and always returns
null, hiding malformed JSON and permission problems; update the
loadJson(filepath: string) implementation so it only returns null for "file not
found" (ENOENT) cases but rethrows or propagates other errors (like JSON.parse
syntax errors or EACCES permission errors); specifically, inside loadJson catch
the thrown error, inspect error.code === 'ENOENT' and return null only then,
otherwise rethrow the error so callers see parse/permission failures.
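The ENOENT-only pattern (equally applicable to `listJsonFiles` above) might look like this sketch:

```typescript
import { promises as fs } from 'node:fs'

// Returns null only when the file does not exist; malformed JSON and
// permission errors (EACCES) propagate to the caller.
export async function loadJson(filepath: string): Promise<unknown> {
  try {
    const raw = await fs.readFile(filepath, 'utf-8')
    return JSON.parse(raw)
  } catch (error) {
    if ((error as { code?: string }).code === 'ENOENT') return null
    throw error
  }
}
```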
In `@collectors/playwright-reporter/src/git-operations.ts`:
- Around line 55-58: In createOrphanBranch, stop using the shell-suppressed
command "git rm -rf . || true" and instead await run('git rm -rf .') inside a
try/catch so we can handle failures explicitly: remove the "|| true", wrap the
await run call in try/catch, inspect the thrown error (message/stdout/stderr)
from run and silently ignore only the benign "did not match any files" /
"pathspec" (no files to remove) cases, but rethrow or log and propagate any
other errors; reference createOrphanBranch and the run helper to locate and
update the code.
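The suggested structure, sketched with `run` injected so the filtering logic is testable in isolation; the matched substrings ("did not match any files", "pathspec") follow the review's own list of benign cases:

```typescript
type Run = (cmd: string, args: string[]) => Promise<string>

// Replaces "git rm -rf . || true": only the benign nothing-to-remove
// cases are swallowed, everything else propagates to the caller.
export async function clearWorkingTree(run: Run): Promise<void> {
  try {
    await run('git', ['rm', '-rf', '.'])
  } catch (error) {
    const msg = String((error as Error).message ?? error)
    if (/did not match any files|pathspec/.test(msg)) return
    throw error
  }
}
```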
In `@collectors/playwright-reporter/src/junit-parser.ts`:
- Around line 4-45: Remove the dead/unused JUnit XML artifacts: delete the
interfaces JUnitTestCase, JUnitTestSuite, JUnitReport and the helper functions
normalizeToArray, getTestStatus, and parseTestCase from junit-parser.ts; verify
that parseTestAttrs (and any other active parsing helpers) remain untouched,
search the repo for any remaining references to those symbols before committing,
and run type-checks/tests to ensure no regressions.
- Around line 71-74: The regex-based parsing in parseTestAttrs (function
parseTestAttrs) fails for attributes containing XML entities or quote chars used
as delimiters; update the implementation to use a proper XML/HTML parser (e.g.,
DOMParser in Node/browser or an XML library like fast-xml-parser/xml2js) to
extract name, classname and time attributes reliably and unescape entities, or
at minimum document this limitation near parseTestAttrs/TestResult explaining
that names with embedded quotes or XML entities are not supported and recommend
using a real XML parser for broader compatibility.
- Around line 94-97: The parseJUnitFile function currently lets readFile errors
bubble up; wrap the await readFile(filepath, 'utf-8') and subsequent
parseJUnitXml(xml) in a try/catch inside parseJUnitFile, and on error throw or
reject with a clearer message that includes the filepath and the original error
(e.g., "Failed to read/parse JUnit file: <filepath> - <original error>"), or
return a controlled value if preferred; refer to parseJUnitFile and
parseJUnitXml when adding the try/catch and include the original error as the
cause/context for easier debugging.
In `@collectors/playwright-reporter/tests/aggregate.test.ts`:
- Around line 71-107: Add a test assertion that verifies the p95 percentile is
computed by aggregate: create enough run entries for the test case (use the
existing TEST_DIR, run objects like run1/run2 or add run3..runN) with varying
durationMs so the 95th percentile is meaningful, call aggregate(TEST_DIR), read
main-test-data.json into mainData, and add an expectation checking
mainData.tests['test1'].p95DurationMs equals the expected p95 value (referencing
aggregate and main-test-data.json to locate where the value is produced and
stored).
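For the expectation itself, a nearest-rank p95 is one common convention the test could compute against; whether the real `aggregate()` uses nearest-rank or interpolation is an assumption to verify before asserting:

```typescript
// Nearest-rank p95: sort ascending and take the value at
// ceil(0.95 * n) - 1. Mirrors one common convention only.
export function p95(durations: number[]): number {
  if (durations.length === 0) return 0
  const sorted = [...durations].sort((a, b) => a - b)
  const idx = Math.ceil(0.95 * sorted.length) - 1
  return sorted[idx]
}
```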
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fa738906-ddbe-4866-b85a-3feb0a30cd99
⛔ Files ignored due to path filters (6)
openspec/.DS_Store is excluded by !**/.DS_Store
openspec/changes/.DS_Store is excluded by !**/.DS_Store
openspec/changes/componentize-frontend/.DS_Store is excluded by !**/.DS_Store
openspec/changes/github-action-install/.DS_Store is excluded by !**/.DS_Store
openspec/changes/github-action-install/specs/.DS_Store is excluded by !**/.DS_Store
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (34)
.openspec/changes/playwright-reporter/01-problem.md
.openspec/changes/playwright-reporter/02-solution.md
.openspec/changes/playwright-reporter/03-tasks.md
.openspec/changes/playwright-reporter/04-status.md
apps/frontend/src/App.tsx
apps/frontend/src/hooks/useTestFilter.ts
apps/frontend/src/types/index.ts
apps/test-processing/src/aggregate.ts
apps/test-processing/src/types.ts
collectors/playwright-reporter/README.md
collectors/playwright-reporter/package.json
collectors/playwright-reporter/src/aggregate.ts
collectors/playwright-reporter/src/cli.ts
collectors/playwright-reporter/src/deploy.ts
collectors/playwright-reporter/src/file-operations.ts
collectors/playwright-reporter/src/git-operations.ts
collectors/playwright-reporter/src/index.ts
collectors/playwright-reporter/src/junit-parser.ts
collectors/playwright-reporter/src/reporter.ts
collectors/playwright-reporter/src/types.ts
collectors/playwright-reporter/tests/aggregate.test.ts
collectors/playwright-reporter/tests/file-operations.test.ts
collectors/playwright-reporter/tests/junit-parser.test.ts
collectors/playwright-reporter/tsconfig.json
openspec/changes/architecture-alignment/.openspec.yaml
openspec/changes/architecture-alignment/README.md
openspec/changes/architecture-alignment/design.md
openspec/changes/architecture-alignment/proposal.md
openspec/changes/architecture-alignment/specs/flow-testing/spec.md
openspec/changes/architecture-alignment/specs/multi-format-collectors/spec.md
openspec/changes/architecture-alignment/specs/playwright-reporter/spec.md
openspec/changes/architecture-alignment/specs/test-aggregation/spec.md
openspec/changes/architecture-alignment/tasks.md
pnpm-workspace.yaml
```
if (test.wasFlaky) {
  stats.flakyCount++
}
```
Initialize legacy flakyCount before incrementing.
At Line 88, older aggregated records without flakyCount will produce NaN on increment, corrupting flaky totals.
Proposed fix
- if (test.wasFlaky) {
- stats.flakyCount++
- }
+ if (typeof stats.flakyCount !== 'number' || Number.isNaN(stats.flakyCount)) {
+ stats.flakyCount = 0
+ }
+ if (test.wasFlaky === true) {
+ stats.flakyCount++
+ }

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```
if (typeof stats.flakyCount !== 'number' || Number.isNaN(stats.flakyCount)) {
  stats.flakyCount = 0
}
if (test.wasFlaky === true) {
  stats.flakyCount++
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/test-processing/src/aggregate.ts` around lines 87 - 89, The code
increments stats.flakyCount when test.wasFlaky but older aggregated records may
lack flakyCount causing NaN; before incrementing in the aggregation logic around
the stats object (where stats and test.wasFlaky are used) ensure flakyCount is
initialized to a numeric 0 if missing or not a number (e.g., set
stats.flakyCount = 0 when typeof stats.flakyCount !== 'number' or use a
fallback) and then perform the increment so flaky totals remain correct.
| "exports": { | ||
| ".": { | ||
| "types": "./dist/index.d.ts", | ||
| "import": "./dist/index.js", | ||
| "require": "./dist/index.js", | ||
| "default": "./dist/index.js" | ||
| } |
🧩 Analysis chain
🌐 Web query:
In Node.js package exports, if package.json has `"type": "module"` and `"exports": { ".": { "require": "./dist/index.js" } }`, can CommonJS `require()` load that target across Node 18, 20, and 22 without a separate `.cjs` build?
💡 Result:
No.
Because your package has "type": "module", the target file ./dist/index.js is treated as ESM (Node treats .js as ESM inside a "type": "module" package, and you’d need a .cjs file to force CommonJS). (nodejs.org)
So a CommonJS consumer doing require("your-pkg") will not be able to load ./dist/index.js on Node 18 (it errors, e.g. ERR_REQUIRE_ESM), and on Node 20/22 it only works if you opt into the newer require(esm) behavior (Node 20 added this in v20.17.0, and Node 22 documents it as only supported when --experimental-require-module is enabled, and only for synchronous ESM graphs). (nodejs.org)
For “just works” across Node 18, 20, and 22 without flags, you still need a real CommonJS entry (e.g. a .cjs build, or a CommonJS “wrapper” file / nested package.json with "type": "commonjs" in dist/). (nodejs.org)
Citations:
- 1: https://nodejs.org/api/packages.html?utm_source=openai
- 2: https://nodejs.org/download/release/v20.18.1/docs/api/modules.html?utm_source=openai
- 3: https://nodejs.org/api/packages.html?utm_source=openai
Drop the require export or ship a separate CommonJS build.
With "type": "module", the file ./dist/index.js is treated as ESM. CommonJS consumers cannot load it via require() on Node 18, and on Node 20–22 it only works with experimental flags. Either remove the require condition from exports["."] or provide a proper .cjs build.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@collectors/playwright-reporter/package.json` around lines 12 - 18, The
package.json exports currently include a "require" condition pointing at
"./dist/index.js" while the package uses "type": "module", causing CommonJS
consumers to fail; either remove the "require" field from the exports map
(exports["."].require) or produce and point it to a dedicated CommonJS build
(e.g., compile a .cjs bundle such as "./dist/index.cjs" and update
exports["."].require to that file), and ensure the "types" and ESM
"import"/"default" still point to the ESM outputs (e.g., "./dist/index.d.ts" and
"./dist/index.js") so Node CJS consumers can require the .cjs file without
breaking ESM consumers.
| "peerDependencies": { | ||
| "@playwright/test": ">=1.40.0" | ||
| }, | ||
| "devDependencies": { | ||
| "@playwright/test": "^1.52.0", | ||
| "@types/node": "^22.0.0", | ||
| "typescript": "^5.6.3", | ||
| "vitest": "^3.0.0" | ||
| }, |
🛠️ Refactor suggestion | 🟠 Major
This manifest still doesn't wire in the shared collection core.
The new OpenSpec change says collectors/playwright-reporter should delegate collection/orchestration to apps/test-processing, but there is no dependency here on that package. As-is, the duplicated aggregate.ts / git-operations.ts / file-operations.ts / deploy.ts stack remains the runtime path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@collectors/playwright-reporter/package.json` around lines 30 - 38, The
package.json for collectors/playwright-reporter is missing a dependency on the
shared collection core in apps/test-processing, so the repo continues to use the
duplicated runtime files (aggregate.ts, git-operations.ts, file-operations.ts,
deploy.ts); update collectors/playwright-reporter's package.json to add a
dependency (or workspace reference) to the apps/test-processing package, change
imports in the reporter code to import the shared orchestration/collection
functions from that package instead of the local duplicate modules (search for
imports referencing aggregate, git-operations, file-operations, deploy) and
remove the duplicated modules from the runtime path so the reporter delegates
collection/orchestration to the shared core.
```bash
npm install test-eyes-reporter
# or
pnpm add test-eyes-reporter
```
Package name inconsistency with published package.
The installation instructions reference test-eyes-reporter, but the status document indicates the published package name is @practica/test-eyes@0.1.4. This will cause installation failures for users.
📝 Proposed fix
```bash
-npm install test-eyes-reporter
+npm install @practica/test-eyes
 # or
-pnpm add test-eyes-reporter
+pnpm add @practica/test-eyes
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@collectors/playwright-reporter/README.md` around lines 15 - 19, Update the
installation instructions in the README: replace the incorrect package name
string "test-eyes-reporter" with the published package name
"@practica/test-eyes" in both npm and pnpm examples so users can successfully
install the released package; ensure both occurrences in the markdown block use
"@practica/test-eyes".
```typescript
reporter: [
  ['list'],
  ['test-eyes-reporter', {
    dataBranch: 'gh-data',    // Branch to store test data
    deployBranch: 'gh-pages', // Branch to deploy dashboard
    deploy: true              // Auto-deploy after test run
  }]
]
```
Reporter name should match package name.
The reporter configuration uses 'test-eyes-reporter' which should align with the actual package name.
📝 Proposed fix
reporter: [
['list'],
- ['test-eyes-reporter', {
+ ['@practica/test-eyes', {
dataBranch: 'gh-data', // Branch to store test data
deployBranch: 'gh-pages', // Branch to deploy dashboard
deploy: true // Auto-deploy after test run
}]
]

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@collectors/playwright-reporter/README.md` around lines 30 - 37, The reporter
string in the Playwright config's reporter array ('test-eyes-reporter') must
match the actual npm package name; open the reporter configuration where the
reporter array is defined (the reporter: [...] block) and replace
'test-eyes-reporter' with the package name exactly as specified in the reporter
package.json (or the installed module name), keeping the existing options object
(dataBranch, deployBranch, deploy) intact so Playwright can resolve the module.
```typescript
onTestEnd(test: TestCase, result: PlaywrightTestResult): void {
  this.tests.push({
    name: test.titlePath().filter(Boolean).join(' > '),
    durationMs: result.duration,
    status: this.mapStatus(result.status),
    wasFlaky: result.status === 'passed' && result.retry > 0
  })
```
🧩 Analysis chain
🌐 Web query:
In the Playwright Reporter API, does `onTestEnd(test, result)` run once per retry attempt, and is `test.outcome()` the API for the final logical test outcome (including `flaky`)?
💡 Result:
Yes.
`reporter.onTestEnd(test, result)` is invoked after each individual test run in a worker (i.e., the initial run and every retry), because the `result` argument is explicitly "Result of the test run", and each run has its own `TestResult` with a `testResult.retry` counter you can use to distinguish attempts. [1], [2] `test.outcome()` is the API for the final logical outcome of the `TestCase`, computed from all runs (it can be `"flaky"` if it passed on a retry). [3]
Sources:
[1] Playwright Reporter API (onTestEnd) https://playwright.dev/docs/next/api/class-reporter
[2] Playwright TestResult.retry https://playwright.dev/docs/api/class-testresult
[3] Playwright TestCase.outcome() / TestCase.results https://playwright.dev/docs/next/api/class-testcase
Record final test outcome, not per-retry attempts.
onTestEnd() runs after each test attempt (initial run and every retry). This code appends an entry for every attempt, so a flaky test—initially failed, then passed on retry—creates two entries in this.tests (failed + passed). When aggregate() counts every entry, flaky tests are double-counted, inflating failure and pass totals.
Use test.outcome() to record only the final logical outcome per test, which correctly reflects flaky status and prevents duplicate counting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@collectors/playwright-reporter/src/reporter.ts` around lines 78 - 84,
onTestEnd is currently appending an entry for every attempt (using
result.status), causing duplicate entries for retried tests; change it to record
only the final logical outcome by using test.outcome() instead of result.status
and derive wasFlaky from test.outcome() and test.retries/run history (so you
push one entry per TestCase into this.tests). Update the onTestEnd
implementation to use test.outcome() for status, compute wasFlaky from the
test's final/flaky indicators, and ensure aggregate() continues to operate on
this.tests which will now contain only final outcomes.
```typescript
const success = await this.saveToDataBranch(context)
await this.returnToOriginalBranch(context.originalBranch)

if (success && this.options.deploy) {
  await this.deploy()
}
```
Make the upload path fail-fast and stop force-pushing the data branch.
push() is implemented as a force-push in collectors/playwright-reporter/src/git-operations.ts, so concurrent CI jobs can overwrite each other's newly written run files. This path also ignores commit()/push() failures, so submitData() can still treat the upload as successful and proceed to deployment with stale remote data.
Also applies to: 177-188
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@collectors/playwright-reporter/src/reporter.ts` around lines 112 - 117, The
upload path is currently ignoring commit()/push() failures and uses a
force-push, so change saveToDataBranch (and any callers like submitData) to
fail-fast on any commit or push error: have saveToDataBranch return/throw on
push failure instead of returning success, and ensure reporter.ts (the block
around saveToDataBranch, returnToOriginalBranch, deploy) checks that result and
aborts deployment when upload fails. In addition, remove or change the
force-push behavior in git-operations.ts push() (stop using --force) and handle
push conflicts (return error to caller) so concurrent CI jobs cannot silently
overwrite each other. Ensure both the first occurrence (around lines 112–117)
and the other occurrence (around lines 177–188) follow the same fail-fast
behavior and propagate push errors up to prevent deploying stale data.
```markdown
**APIs**:
- New exports from test-processing: `collectFromRunData`, `pushToGitHub`
- TestResult type: add `retries?: number`

**Dependencies**:
- playwright-reporter depends on test-processing (workspace dependency, bundled on publish)

**Breaking**:
- Internal refactor only, npm package API unchanged
```
Please call out the CLI move as a breaking change or ship a compatibility shim.
Moving collection into collectors/junit/ and deleting collectors/playwright-reporter/src/cli.ts changes the published test-eyes / test-eyes-reporter binaries this package currently exposes. That makes “npm package API unchanged” too strong as written.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openspec/changes/architecture-alignment/proposal.md` around lines 33 - 41,
The CLI relocation is a breaking change because the published package no longer
exposes the previous binaries (test-eyes / test-eyes-reporter) after removing
collectors/playwright-reporter/src/cli.ts and moving collection into
collectors/junit/; either mark this in the proposal as a breaking change or add
a compatibility shim that re-exports or re-installs the old CLI entrypoints
(e.g., restore a small module at collectors/playwright-reporter/src/cli.ts that
imports and delegates to the new collectors/junit/ entry, and update
package.json "bin" to point to those legacy paths) so existing consumers
invoking the published binaries continue to work.
```markdown
### Requirement: Format-agnostic collection core
The system SHALL provide a `collectFromRunData()` function in `apps/test-processing/` that accepts RunData from any collector and handles git operations, saving, and aggregation.

#### Scenario: Playwright collector calls shared core
- **WHEN** Playwright reporter calls `collectFromRunData({ runData, dataBranch: 'gh-data' })`
- **THEN** system saves run data to data branch, aggregates statistics, commits and pushes

#### Scenario: JUnit collector calls shared core
- **WHEN** JUnit collector calls `collectFromRunData({ runData, dataBranch: 'gh-data' })`
- **THEN** system saves run data to data branch, aggregates statistics, commits and pushes
```
Encode the per-logical-test/flaky-count contract in the spec.
These scenarios only describe save/aggregate/commit/push orchestration. They never require retries to collapse into one logical TestResult, so the current per-attempt overcounting bug would still satisfy this spec.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@openspec/changes/architecture-alignment/specs/multi-format-collectors/spec.md`
around lines 3 - 12, Update the spec for collectFromRunData to require that
runData aggregation collapses multiple retry attempts of the same logical test
into a single TestResult (e.g., identify tests by a stable key such as
testId/name + suite/path) and compute flaky counts per logical test rather than
per-attempt; change both Playwright and JUnit scenarios to assert that
collectFromRunData deduplicates attempts, produces one TestResult per logical
test, and increments a flakyCount when a test has mixed pass/fail attempts
across retries before committing/pushing.
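The deduplication contract could be illustrated with a sketch like the following, where `Attempt`, `LogicalResult`, and the stable `key` (e.g. suite path + test name) are hypothetical names for illustration:

```typescript
interface Attempt {
  key: string // stable identity, e.g. suite path + test name
  status: 'passed' | 'failed' | 'skipped'
  durationMs: number
}

interface LogicalResult {
  key: string
  status: Attempt['status']
  wasFlaky: boolean
  durationMs: number
}

// Collapse per-attempt records into one TestResult per logical test.
export function dedupeAttempts(attempts: Attempt[]): LogicalResult[] {
  const byKey = new Map<string, Attempt[]>()
  for (const a of attempts) {
    const group = byKey.get(a.key) ?? []
    group.push(a)
    byKey.set(a.key, group)
  }
  return [...byKey.values()].map(group => {
    const final = group[group.length - 1] // last attempt decides the outcome
    const sawFail = group.some(a => a.status === 'failed')
    return {
      key: final.key,
      status: final.status,
      durationMs: final.durationMs,
      wasFlaky: final.status === 'passed' && sawFail, // mixed fail→pass = flaky
    }
  })
}
```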
```markdown
### Requirement: Import shared logic from test-processing
The reporter SHALL import `collectFromRunData` from test-processing instead of implementing its own collection logic.

#### Scenario: Uses collectFromRunData
- **WHEN** reporter's onEnd completes
- **THEN** it calls collectFromRunData from test-processing package

#### Scenario: No local collection implementation
- **WHEN** inspecting playwright-reporter source
- **THEN** there is no local implementation of git operations, file saving, or aggregation
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify that playwright-reporter delegates to test-processing and doesn't have local implementations
echo "=== Checking for collectFromRunData import ==="
rg -n 'collectFromRunData' collectors/playwright-reporter/src/
echo ""
echo "=== Checking for local git operations in reporter ==="
rg -n 'git\s+(commit|push|checkout)' collectors/playwright-reporter/src/ || echo "No local git operations found"
echo ""
echo "=== Listing files in playwright-reporter/src ==="
fd . collectors/playwright-reporter/src/ --type f

Repository: practicajs/test-eyes
Length of output: 1059
Reporter implementation violates shared logic architecture requirements.
The playwright-reporter contains local implementations of git operations, file saving, and aggregation that should be delegated to test-processing instead. Specifically:
- git-operations.ts implements local git operations (git checkout, commit, push)
- file-operations.ts contains local file handling logic
- aggregate.ts implements local data aggregation
- collectFromRunData from test-processing is not imported anywhere in the reporter source
Per the architecture spec, the reporter should import and use collectFromRunData from the test-processing package rather than duplicating this logic locally.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openspec/changes/architecture-alignment/specs/playwright-reporter/spec.md`
around lines 29 - 38, The reporter currently duplicates collection, git and file
logic; remove the local implementations in git-operations.ts, file-operations.ts
and aggregate.ts and instead import collectFromRunData from the test-processing
package; update the reporter's onEnd handler (the function named onEnd in the
reporter) to call collectFromRunData(runData) and use its returned payload for
any subsequent save/commit/push operations, replacing calls to local functions
previously defined in git-operations.ts, file-operations.ts and aggregate.ts;
ensure any tests or callers now reference the imported collectFromRunData and
delete the redundant local modules.
Summary by CodeRabbit
Release Notes
New Features
- @practica/test-eyes npm package: a Playwright reporter for automated test analytics integration.

Documentation