Skip to content

chore: 🤖 GitHub action for visual regression (development environment based [initial version])#932

Open
punkbit wants to merge 2 commits intochore/vis-regression-docker-setup-consistent-screenshotsfrom
chore/ci-cd-github-action-for-visual-regression-initial
Open

chore: 🤖 GitHub action for visual regression (development environment based [initial version])#932
punkbit wants to merge 2 commits intochore/vis-regression-docker-setup-consistent-screenshotsfrom
chore/ci-cd-github-action-for-visual-regression-initial

Conversation

@punkbit
Copy link
Collaborator

@punkbit punkbit commented Mar 17, 2026

Why?

Introduces a GitHub Action for visual regression testing within the CI/CD pipeline. Unlike our existing Chromatic implementation, which relies on Storybook stories and cloud-managed snapshots, this action executes tests using screenshots managed directly within the development environment (curated by author). This ensures snapshots remain tightly coupled with the implementation code.

⚠️ Depends on #931 (should be merged first)

How?

  • Created a new GitHub workflow for visual-regression-tests

Preview?

N/A

@changeset-bot
Copy link

changeset-bot bot commented Mar 17, 2026

⚠️ No Changeset found

Latest commit: 701b619

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@github-actions
Copy link

github-actions bot commented Mar 17, 2026

Claude finished @punkbit's task in 49s —— View job


PR Review: Visual Regression GitHub Action

  • Gather context
  • Read workflow file
  • Analyze and post review

Thanks for the contribution! 🙌 This is a clean, focused workflow. A few things worth addressing before merge:

🚨 Must Fix

  • actions/checkout@v6 / setup-node@v6 / upload-artifact@v6: These action versions don't exist — the latest major versions are v4 for all three. Using v6 will cause the workflow to fail immediately. Check the other workflows in the repo — they all use v4 (e.g., unit-tests.yml uses actions/checkout@v6... wait, same issue there). Double-check against the GitHub Actions marketplace — as of now the latest stable versions are actions/checkout@v4, actions/setup-node@v4, and actions/upload-artifact@v4.

  • node-version: 'latest': Using latest is non-deterministic and can cause unexpected breakage when Node releases a new major version. Pin to the same version used in other workflows (check package.json engines field or other CI files for the project's target Node version).

  • Storybook not started: playwright.config.ts uses yarn storybook as the webServer, but the workflow runs npx playwright test without first building Storybook or ensuring it's available. In CI mode (process.env.CI is set), reuseExistingServer is false, so Playwright will try to start Storybook itself — but only if webServer config is respected. That should work, but it means the job will silently spend time starting Storybook on each run. Consider adding a yarn build-storybook step and pointing webServer to a static server in CI for faster, more reliable runs. (This may be intentional for now as an "initial version".)

💡 Suggestions

  • Snapshot updates on failure: Consider uploading the test-results/ diff screenshots on failure so reviewers can inspect visual diffs directly in the Actions UI — you're already uploading playwright-report/ which should cover this, so this is likely fine.

  • retention-days: 30: 30 days is quite long for PR-level artifacts. 7 days is usually sufficient and saves storage costs.

  • No workflow_dispatch: Adding workflow_dispatch as a trigger (like the other workflows) allows manually re-running the workflow without pushing a new commit, which is helpful when debugging flaky snapshots.

  • Snapshot baseline: The PR description notes snapshots are "managed directly within the development environment" — make sure the baseline screenshots are committed to the repo (or cached in CI), otherwise every run will regenerate them and comparisons won't be meaningful.

@github-actions
Copy link

github-actions bot commented Mar 17, 2026

Claude finished @punkbit's task in 1m 5s —— View job


  • Gather context
  • Read workflow file
  • Check existing workflows for conventions
  • Review playwright config
  • Post review

PR Review: Visual Regression GitHub Action

Thanks for the update! The workflow is clean and follows the existing repo conventions. A few things worth flagging:

🚨 Must Fix

  • actions/checkout@v6 / setup-node@v6 / actions/upload-artifact@v6: These version tags don't exist upstream (latest stable is v4 for all three). That said, this matches the existing unit-tests.yml which also uses @v6, so this is consistent with the repo — but it means all these workflows are likely broken. Worth addressing repo-wide, but not blocking this PR specifically since it follows the established pattern.

  • Storybook startup in CI: playwright.config.ts sets reuseExistingServer: !process.env.CI, which in CI means Playwright will try to start Storybook itself via yarn storybook. Since there's no build step beforehand, this will run the dev server on every CI run (slow, ~60–120s spin-up) and risks hitting the 120s timeout configured in the config. Consider either:

    • Adding yarn build-storybook + a static server step before npx playwright test, or
    • Accepting dev-server mode as intentional for this initial version (just document it)

    Fix this →

💡 Suggestions

  • Missing workflow_dispatch: unit-tests.yml and others include workflow_dispatch for manual re-runs. Handy for debugging flaky snapshot comparisons.

  • No dependency caching: Adding a cache: 'yarn' option to setup-node and caching ~/.cache/ms-playwright for Playwright browsers would significantly speed up runs.

  • retention-days: 30: 7 days is typically sufficient for PR-level diff artifacts and saves storage costs.

  • node-version: '24.x': Good choice — this is an improvement over the 'latest' used in unit-tests.yml and aligns with the recent setup-node use node v24.x commit.

  • Snapshot baseline: Make sure the baseline screenshots are committed to the repo — otherwise every first run regenerates them and no meaningful comparison happens.

@punkbit
Copy link
Collaborator Author

punkbit commented Mar 17, 2026

  • actions/checkout@v6 / setup-node@v6 / upload-artifact@v6: These action versions don't exist — the latest major versions are v4 for all three. Using v6 will cause the workflow to fail immediately. Check the other workflows in the repo — they all use v4 (e.g., unit-tests.yml uses actions/checkout@v6... wait, same issue there). Double-check against the GitHub Actions marketplace — as of now the latest stable versions are actions/checkout@v4, actions/setup-node@v4, and actions/upload-artifact@v4.

The versions do exist. Ignore for now!

@punkbit punkbit force-pushed the chore/ci-cd-github-action-for-visual-regression-initial branch from 44706c9 to 701b619 Compare March 17, 2026 14:27
@github-actions
Copy link

github-actions bot commented Mar 17, 2026

Claude finished @punkbit's task in 42s —— View job


  • Gather context
  • Read workflow file
  • Review and post feedback

PR Review: Visual Regression GitHub Action

Thanks for this addition! The workflow is clean and follows existing repo patterns. A few things worth flagging:

🚨 Must Fix

  • Storybook startup race condition: playwright.config.ts uses yarn storybook as the webServer command with a 120s timeout. In CI, reuseExistingServer: false, so Playwright will launch the Storybook dev server on every run. The dev server is slower and less stable than a static build — if startup exceeds 120s (common on cold CI runners), all tests fail. Consider adding a yarn build-storybook step and serving via npx http-server storybook-static -p 6006 in CI. Fix this →

  • No workflow_dispatch: Other workflows (unit-tests.yml, chromatic-deployment.yml) include workflow_dispatch for manual re-runs. This is especially useful for debugging flaky snapshot comparisons without needing a new push.

💡 Suggestions

  • No dependency caching: Adding cache: 'yarn' to setup-node and caching ~/.cache/ms-playwright would speed up runs noticeably — Playwright browser downloads alone can take 30–60s.

  • retention-days: 30: 7 days is sufficient for PR-level diff artifacts and avoids unnecessary storage costs.

  • screenshot: "only-on-failure" in playwright.config.ts: This is fine for unit tests but for visual regression you likely want screenshots on every run (both pass and fail) to enable baseline comparisons. Consider screenshot: "on" or rely solely on toHaveScreenshot() snapshots.

  • Snapshot baselines: Make sure the baseline .png snapshot files are committed to the repo alongside the tests — otherwise every first CI run regenerates them and no meaningful comparison occurs.

@workflow-authentication-public
Copy link
Contributor

📚 Storybook Preview Deployed

✅ Preview URL: https://click-pzwupr34f-clickhouse.vercel.app

Built from commit: 809964076278062d0b3635e8e98924df47e46b3f

@punkbit punkbit requested a review from Firefds March 17, 2026 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants