Skip to content

Add per-PR visual diff view (trial)#1

Open
RisingOrange wants to merge 51 commits intomainfrom
visual-regression-trial
Open

Add per-PR visual diff view (trial)#1
RisingOrange wants to merge 51 commits intomainfrom
visual-regression-trial

Conversation

@RisingOrange
Copy link
Copy Markdown
Owner

@RisingOrange RisingOrange commented Apr 19, 2026

Adds a per-PR visual diff view: every PR gets side-by-side before/after renders of ~15 key pages, so reviewers can see the visual impact of a change without clicking through Netlify previews page-by-page. Under the hood this is visual regression testing via Playwright + Chromatic.

Non-blocking / informational. The diff view is a signal reviewers can glance at; it never blocks a merge. The workflow exits 0 regardless of diffs, and nothing in branch protection gates on it. Merges remain possible with unreviewed snapshots. This is intentional — adopting this shouldn't change the merge flow, just surface information.

Why

  • Visual changes slip past manual QA sometimes (e.g. a past /outcomes layout bug shipped unnoticed — /outcomes is in the covered set, so this setup would have surfaced it).
  • Netlify previews only enable manual inspection — the reviewer has to open each affected page themselves and remember what it looked like before. A diff view flips that: reviewers see exactly what changed, automatically.
  • Useful both for catching regressions and for intentional changes — a CSS tweak that looks fine on the author's machine may have side-effects on pages they didn't touch.

How it works

  1. On each PR, GitHub Actions runs pnpm build && pnpm preview and navigates Playwright to a set of routes.
  2. @chromatic-com/playwright captures HTML+asset archives of each visited page.
  3. The chromaui/action step uploads archives; Chromatic re-renders them server-side in a controlled browser and diffs them against the baseline.
  4. Chromatic posts a "UI Review" check on the PR (informational — not required):
    • No visual changes → check auto-greens.
    • Visual changes → check shows "N changes" with a link to the Chromatic UI.
  5. Reviewers can optionally open Chromatic's side-by-side viewer, accept/deny per snapshot. Accepted snapshots become the new baseline once main is updated.

No baselines are committed to git — Chromatic stores them per-branch and promotes on merge. No regeneration workflow, no Linux-vs-macOS font-parity problems, no PR churn from binary PNGs.

Coverage

The diff view covers distinct page templates and layout-bearing routes, not every individual piece of content. Two posts rendered through the same +page.svelte produce the same template-level diff, so snapshotting one representative post catches regressions that would affect all of them.

Auto-discovered from src/routes/ (recursive walk over +page.svelte/+page.ts), plus a small curated list of markdown-post representatives chosen to exercise different post template variants. See tests/visual/routes.ts.

Current set: 15 routes × 2 viewports (desktop 1280×800, Pixel 7 412×839) = 30 snapshots per run.

What's covered: every unique route template under src/routes/, layout components (header/footer/banners), the homepage and its sections, landing pages (/join, /action, /learn), and representative samples of the markdown-post layout (/funding with embedded Svelte components, /values for plain long-form, /action for link-heavy, /learn for long-form with many headings).

What's not covered: individual post content. A typo in a specific post's markdown won't show up in the diff — that's a content-review concern, not a visual-regression one.

Intentional skips

Content-churn routes (content changes over time independent of code changes — diffing them would just generate noise):

  • /posts — chronological listing of every post
  • /communities — embeds Mapbox + Luma iframes (third-party content)
  • homepage LatestNews section — handled differently, via a stable page.route mock of /api/news (see External data below)

Other skips (each with a WHY comment in routes.ts):

  • /chat — OpenAI-dependent
  • /quotes — ~22,000px tall, exceeds Chromatic's 25M-pixel cap
  • /contact-us, /verify, /submitted, /uk-email-mp — form/token state
  • /email-builder — admin-only tool

CI behavior

  • Runs on PRs and pushes to main.
  • Non-blocking. exitZeroOnChanges: true — the workflow exits 0 even when Chromatic detects diffs. Chromatic's PR check is informational; branch protection is not configured to require it.
  • exitOnceUploaded: true — CI exits after archive upload. Warm-cache runs complete in ~2 min; cold-cache (first run on a branch, or after 7 days of inactivity) ~5:30. Chromatic's comparison completes asynchronously and updates the PR check separately.
  • concurrency group cancels stale runs on new pushes.
  • Playwright browsers are cached; playwright-report/ (HTML report + per-test screenshots) is uploaded as a GHA artifact on every run (14d retention) for inspection without needing the Chromatic UI.

Fork PRs

Fork PRs work end-to-end, including the Chromatic diff view. GitHub Actions can't pass repo secrets to fork-PR runs, so the Chromatic project token is inlined as plaintext in the workflow — this is Chromatic's own recommended pattern for fork-PR support. The token is a project-scoped write credential (anyone with it can run builds and consume snapshot quota for this project), rotatable from the Chromatic UI.

External data

All data-fetching pages in this repo already have try/catch fallbacks that return placeholder or empty state when API keys are missing. Since CI uses template.env (placeholder keys only), fallbacks render consistently — snapshots are deterministic by construction.

Two exceptions handled via Playwright's page.route:

  • Homepage LatestNews fetches the live Substack RSS feed (no auth required), which would churn every time a new post is published. Intercepted with a stable fixture (tests/visual/fixtures/news.json).
  • Tally form embeds on /statement and /join report their content height back to the parent iframe async, producing ~20px height deltas between runs. Requests to tally.so are aborted; the iframe container stays at its CSS height (deterministic). No coverage loss — Chromatic can't see into the iframe content anyway.

Both keep the test concerns out of production source.

Setup required (for upstream adoption)

  1. Create a Chromatic project linked to this repo (free tier: 5,000 snapshots/month — expected usage is ~1–2k/mo based on current PR volume).
  2. Replace the inlined project token in .github/workflows/visual-diff.yml with the new project's token.
  3. Merge this PR — the first push to main uploads initial baselines; accept them once in Chromatic and the pipeline is live.

Any repo collaborator can review and accept baselines via GitHub OAuth login — no separate Chromatic invites needed.

In practice, the PR reviewer also reviews the diffs — accepting expected changes is a click per snapshot (or a single batch-accept for many snapshots at once) during normal code review, not a separate ownership burden.

Local screenshot workflow

The Playwright suite also doubles as a before/after screenshot tool for intentional visual changes, without needing Chromatic:

pnpm exec playwright test
pnpm exec playwright show-report

Runs the suite against a fresh preview build and opens an HTML report with per-route screenshots. To compare before/after: run on the base branch, save playwright-report/ somewhere, switch branches, re-run, eyeball both reports.

No quota, no PR check side effects — just local screenshots.

Trial results

Validated end-to-end on the fork (RisingOrange/pauseai-website#1):

  • No-op PRs come back green with 0 diffs.
  • Intentional CSS changes produced clean, readable diffs in Chromatic's review UI.
  • Measured CI runtime: ~2 min warm cache (imagetools + Playwright browsers cached), ~5:30 cold.

Known limitations

  • Transient mid-hydration flashes won't be caught. Static snapshots capture the settled state, so bugs where both the SSR output and the post-hydration state look correct but the transition between them flashes (e.g. an async boundary briefly blanking a subtree) are invisible to this setup.
  • Chromium-only. Firefox- or Safari-specific rendering bugs (e.g. color-profile differences, vendor-specific CSS) won't be caught. Adding more browsers is possible but multiplies the snapshot count by the number of browsers.
  • If a new data-fetching page is added without a try/catch fallback, it'll break CI builds. That's a code-quality issue at the page, not a test-suite concern.
  • Free-tier ceiling is 5,000 snapshots/month; we're well under, but worth watching as route count grows.

Minimal trial setup on a branch in the fork. Installs Playwright with
Chromatic's archive integration, runs one smoke test against the
homepage, and wires a non-blocking GitHub Actions workflow that uploads
to Chromatic on push + PR. Site-wide coverage and data-backed fixtures
come next once this pipeline is confirmed green in CI.
@RisingOrange RisingOrange force-pushed the visual-regression-trial branch from a2c7452 to de101b9 Compare April 19, 2026 20:39
Cover all unique-layout top-level routes (discovered from src/routes
dirs) plus a small curated sample of markdown posts. New top-level
routes get coverage automatically; only the exclude list needs
maintenance.
- Drop job-level continue-on-error (was masking build/test failures);
  exitZeroOnChanges already prevents Chromatic diffs from failing CI.
- Add concurrency guard so stale runs get cancelled on new pushes.
- Add exitOnceUploaded: true — Chromatic posts its own PR check, so
  waiting for the comparison in GH Actions is redundant.
- Add playwright-cache restore-keys for partial-hit fallback.
- Filter discovered routes by presence of +page.svelte/+page.ts so we
  don't need to hardcode non-page dirs (api, [slug], sitemap, etc.).
- Wait for networkidle before checking fonts for more stable snapshots.
- Pin chromaui/action to v1 (was @latest — supply-chain risk)
- Upload playwright-report artifact on failure for debugging
- Drop unused VISUAL_TEST=1 env flag (fixture chokepoint isn't
  implemented; YAGNI until it is)
- Sanity-check that route discovery found a plausible set
The reporter is 'list' only, so the upload step had nothing to
archive. Our tests are 3 lines (goto + fonts.ready) — failures are
self-describing in log output, and Chromatic archives already give
us a rich debuggable artifact of what Playwright saw.
- /posts is a chronological listing; every new post would produce a
  noise diff. Exclude it.
- Fork PRs can't access CHROMATIC_PROJECT_TOKEN, which would make the
  publish step fail noisily. Gate it on 'push' events or same-repo PRs;
  fork-PR runs will run tests but skip the publish cleanly.
@RisingOrange RisingOrange changed the title Visual regression scaffold (trial) Add per-PR visual diff view (trial) Apr 19, 2026
- smoke.spec.ts: assert response.ok() on goto — HTTP 404/500 don't
  fail page.goto by default, so a missing post or broken SSR page
  would silently upload an error render as the snapshot.
- routes.ts: exclude /communities — embeds Mapbox tiles and a Luma
  iframe, both third-party content that drifts over time.
- workflow: drop redundant 'pnpm inlang:settings' step — pnpm build
  already runs scripts/l10n/run which invokes inlang-settings.ts.

Not addressed: recursive route discovery (would enlarge scope and
add flaky candidates); acceptable known limitation for now.
Aligns with the PR-description framing where the user-facing value is
a visual diff view; visual regression testing is just the underlying
mechanic.
- On cache miss: 'playwright install --with-deps' as before.
- On cache hit: only run 'playwright install-deps' (browser is cached;
  only the ephemeral runner's system libs need installing). Saves the
  ~20s browser re-download on every cache hit.
- Pin mobile viewport to 393x851 so we don't silently diff if Playwright
  bumps its Pixel 5 preset. Matches how desktop is already pinned.
Pixel 5's viewport (393x851) was representative but the device is
six years old. Pixel 7 (412x915) better represents current mainstream
Android phones. Viewport is still explicitly pinned so future preset
updates won't silently affect snapshots.
Pixel 7's device screen is 412x915 but the web viewport in Chrome
(excluding browser chrome) is 412x839 per Playwright's preset. For
web visual regression we want the viewport.
The plugin uses a content-addressed cache at
node_modules/.cache/imagetools (SHA-1 keys per source image +
transform). Adding/removing images just adds/removes entries;
existing entries remain valid. With restore-keys fallback, partial-hit
runs still benefit. Expected savings: ~45% of build = ~1-1:30 min.
Chromium re-renders SVG edges with slightly different sub-pixel
values between runs (seen on WIRED/FORTUNE logos and orange link
glyphs on the homepage). These diffs were imperceptible to the eye
but Chromatic was flagging them. Real color/position changes are
still caught — this only suppresses anti-aliasing noise.
diffIncludeAntiAliasing is a project-level setting in the Chromatic
web UI, not a valid CLI config key. Configure it there instead.
Chromatic (via Storybook conventions) treats slashes in test titles as
hierarchy delimiters, which was splitting snapshots into two 'visit'
folders — 13 nested stories plus the homepage as an empty-leaf sibling.

Wrap routes in test.describe('routes', ...) and slugify names so slashes
only appear inside the describe hierarchy, not in leaf titles. Produces
a single flat folder: home, about, sayno-share, ...
One-level scan silently missed /sayno/share (and any future nested
static pages). Switch to a recursive walk that still skips dynamic
segments ([slug]) and honors EXCLUDE_ROUTES. Route count: 14 -> 15.
The homepage renders LatestNews, which fetches /api/news — which in
turn fetches the live pauseai.substack.com RSS feed. Every new
Substack post silently creates a homepage visual diff unrelated to
the PR.

Gate the Substack fetch on VISUAL_TEST=1 (returning an empty list)
and set that env var in playwright.config.ts webServer.env. Internal
news still renders from prerendered /api/posts, so the homepage
'Latest news' section stays populated and layout-realistic.
GHA jobs without an explicit permissions block get the default
GITHUB_TOKEN scopes, which include write access to contents on some
repos. This workflow only needs to read the repo (checkout) and hand
off to Chromatic; tighten to contents: read.
Previously the Playwright run only emitted the list reporter, so there
was no browseable output. Add the html reporter with screenshot: 'on'
(PNG per test per viewport) and upload playwright-report/ as an
artifact (14d retention, uploaded even on failure).

Makes diff inspection possible without Chromatic's UI — humans open
the HTML report, agents grab PNGs via 'gh run download'.
The project's ESLint config forbids process.env in SvelteKit source
files. The Substack-stub gate needs to resolve the env var at runtime
(webServer sets it before pnpm preview starts), so $env/dynamic/private
is the right binding.
Swap the production-source env-var gate in /api/news for a test-layer
route interception in tests/visual/smoke.spec.ts, and return a populated
news fixture (tests/visual/fixtures/news.json) so the homepage renders
6 news cards instead of an empty section — better snapshot coverage of
the card/grid layout.

Drops the VISUAL_TEST leak from production code and removes the need
to set env.VISUAL_TEST in playwright.config.ts webServer.

LatestNews fetches client-side via onMount, so page.route interception
is sufficient. If news ever moves to SSR, we'd need to revisit.
Drop screenshot: 'on' (viewport-only by default) in favor of an
explicit page.screenshot({ fullPage: true }) per test. Attached via
testInfo.outputPath so it lands in the HTML report. Matches what
Chromatic archives — lets reviewers inspect the full page from the
GHA artifact.
Codex review flagged: (1) fixture exercised only the placeholder-image
branch; (2) the test couldn't tell if the route mock silently stopped
matching.

Add image: '/Cover.jpg' to three internal items (exercises the Image
component path) and keep three substack items imageless (exercises the
placeholder-gradient fallback). Then assert that the first fixture
title is visible on the homepage — if the mock ever stops intercepting
and live Substack data leaks in, the test fails loudly.
The previous page.screenshot({ path: testInfo.outputPath(...) }) wrote
PNGs into the test-results/ directory but the HTML reporter only
copies explicitly-attached artifacts into playwright-report/data/. The
uploaded artifact dropped from ~5MB to 230KB with screenshots missing.

Switch to testInfo.attach() so the full-page PNGs actually land in the
downloadable report.
@chromatic-com/playwright re-exports expect but strips the web-first
matcher type extensions, so expect(locator).toBeVisible() doesn't
type-check. @playwright/test's expect has the same issue once Chromatic's
test is in the same file. Sidestep the type merge by using the locator's
own waitFor() — same semantics, no import drama.
Fork PRs need repo secrets to run Chromatic, but pull_request_target would
expose them to unreviewed fork code. Put the job behind a chromatic-fork
environment with required reviewers so a maintainer approves each run
before fork code executes.
chromaui/action@v1 ships with CLI 11.27.0, which Chromatic now flags as
significantly outdated. Installing chromatic as a devDep makes the
action use the local version instead.
Both workflows were duplicating ~50 lines of setup, caches, Playwright
install, and Chromatic publish. Move the shared body into
.github/actions/visual-diff so future changes touch one place.

Fork-PR skip becomes a token-empty check inside the action (GHA exposes
the secret as empty string when unavailable, which happens on fork-PR
runs of the non-target workflow). Trust boundary unchanged.

Also drop gitignore entries that our config doesn't produce
(blob-report, playwright/.cache, and chromatic local log files not
written in CI).
chromaui/action@v1 bundles Chromatic CLI 11.27.0 and ignores the local
install, so the outdated-CLI warning persists even after bumping the
chromatic devDep. Call the local binary via pnpm exec instead so we
actually run the v16 CLI.
chromaui/action has current tags matching the CLI version; v1 was
ancient and bundled CLI 11.27.0. Switching to v16 gives us the current
CLI inside the published action — no need for a local chromatic devDep
or a direct pnpm exec invocation.
Chromatic's own docs recommend inlining the project token as plaintext
to support forked PRs (fork workflows can't read repo secrets). The
token is a project-scoped write credential, rotatable, with limited
blast radius — not worth the complexity of the pull_request_target +
environment gate we previously had.

Deletes visual-regression-fork.yml entirely. pull_request fires once
per PR, same-repo or fork, and has access to the inlined token.
Previous key hashed only image files, so a Vite or
@sveltejs/enhanced-img version bump (or a vite.config.ts change)
would restore stale transforms. Add pnpm-lock.yaml and vite.config.ts
to the key.
Only one caller, so the indirection wasn't buying anything.
Wraps playwright test + chromatic upload into a single command so a
contributor can eyeball a visual diff in Chromatic's UI without pushing.

Token is the same public project token already inlined in the workflow
(Chromatic's recommended fork-PR pattern), rotatable from the Chromatic
UI.
Tally's iframe reports its content height back to the parent async,
producing ~20px height deltas between runs and flaky Chromatic diffs.
Blocking the request collapses the container to its CSS size.

No coverage loss — Chromatic can't see into the iframe anyway.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant