Version
1.59.1
Steps to reproduce
- Configure trace: "retain-on-failure" with retries: 1 and a webServer in a Playwright config:
const config: PlaywrightTestConfig = {
  fullyParallel: true,
  retries: ci ? 1 : 0,
  use: {
    trace: "retain-on-failure",
    serviceWorkers: "block",
    navigationTimeout: 30000,
    actionTimeout: 10000,
  },
  webServer: {
    command: "pnpm start:mocked-auth",
    port: 3102,
    reuseExistingServer: !ci,
    timeout: 120000,
    stdout: "pipe",
  },
};
No custom context.tracing.start() / .stop() calls — trace collection is entirely config-driven.
- Run a test that fails in CI on a resource-constrained or contended runner (in our case, an ARM64 GitHub Actions runner in a shared job alongside build/lint/unit-test tasks running via Nx with parallel: 3)
- Inspect the resulting trace.zip files in test-results/
We cannot reproduce the truncation locally (Windows, x64, headed or headless). It only occurs in CI.
Expected behavior
trace.zip should be a valid, complete ZIP archive that can be opened in the Trace Viewer.
Actual behavior
trace.zip starts with valid PK local file headers but is truncated — the End of Central Directory (EOCD) record is missing. The file cannot be opened by any tool:
$ npx playwright trace open trace.zip
Error: End of central directory record signature not found. Either not a zip file, or file is truncated.
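The error above comes down to whether the EOCD signature (50 4B 05 06) can be found near the end of the file. A minimal sketch of that check, assuming the file has already been read into a Uint8Array. findEocdOffset is a name made up for this report and is not part of Playwright or yauzl, and the strict comment-length check is a simplification:

```typescript
// Minimum EOCD record size (no archive comment) and maximum comment length.
const EOCD_MIN_SIZE = 22;
const MAX_COMMENT_LENGTH = 0xffff;
// "PK\x05\x06" read as a little-endian 32-bit integer.
const EOCD_SIGNATURE = 0x06054b50;

// Hypothetical helper: returns the byte offset of the EOCD record, or -1
// when the archive has none (for example because it was truncated).
function findEocdOffset(zip: Uint8Array): number {
  if (zip.length < EOCD_MIN_SIZE) return -1;
  const view = new DataView(zip.buffer, zip.byteOffset, zip.byteLength);
  const lowest = Math.max(0, zip.length - EOCD_MIN_SIZE - MAX_COMMENT_LENGTH);
  // Scan backwards, because the record sits at the very end of the file.
  for (let pos = zip.length - EOCD_MIN_SIZE; pos >= lowest; pos--) {
    if (view.getUint32(pos, true) !== EOCD_SIGNATURE) continue;
    // The comment length field must account for every remaining byte.
    const commentLength = view.getUint16(pos + 20, true);
    if (pos + EOCD_MIN_SIZE + commentLength === zip.length) return pos;
  }
  return -1;
}
```

Running this over the bytes of each trace.zip in test-results/ (read with fs.readFileSync, for instance) should return -1 for the truncated files described here.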
Both the initial run and retry produce independently truncated files of nearly identical size:
| File | Size | Valid PK header | EOCD record |
| --- | --- | --- | --- |
| trace.zip (run 1) | 75,121 bytes | Yes (50 4B 03 04) | Missing |
| trace.zip (retry 1) | 75,222 bytes | Yes (50 4B 03 04) | Missing |
The consistent ~75KB truncation point across both independent attempts suggests a systematic teardown cutoff rather than a random race. By manually walking the local file headers and inflating with zlib, we confirmed the trace data is partially present — action logs, route fulfillments, and fixture teardown events were all recoverable. The ZIP was simply never finalized.
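For anyone who wants to repeat the recovery, the header walk can be sketched roughly as follows. This is an illustration and not the exact script we used: walkLocalHeaders and LocalEntry are hypothetical names, entry names are assumed to be ASCII, and the walk stops at the first entry that uses a data descriptor (general-purpose flag bit 3), since its sizes are not stored in the local header.

```typescript
interface LocalEntry {
  name: string;
  method: number;          // 0 = stored, 8 = deflate
  dataOffset: number;      // where the entry's (compressed) bytes begin
  compressedSize: number;  // as recorded in the local header
  complete: boolean;       // false if the data is cut off or its size is unknown
}

const LOCAL_HEADER_SIGNATURE = 0x04034b50; // "PK\x03\x04", little-endian
const LOCAL_HEADER_SIZE = 30;              // fixed part, before name and extra field

// Hypothetical helper: lists entries by following local file headers from
// offset 0, without relying on the (missing) central directory.
function walkLocalHeaders(zip: Uint8Array): LocalEntry[] {
  const view = new DataView(zip.buffer, zip.byteOffset, zip.byteLength);
  const entries: LocalEntry[] = [];
  let pos = 0;
  while (pos + LOCAL_HEADER_SIZE <= zip.length &&
         view.getUint32(pos, true) === LOCAL_HEADER_SIGNATURE) {
    const flags = view.getUint16(pos + 6, true);
    const method = view.getUint16(pos + 8, true);
    const compressedSize = view.getUint32(pos + 18, true);
    const nameLength = view.getUint16(pos + 26, true);
    const extraLength = view.getUint16(pos + 28, true);
    const dataOffset = pos + LOCAL_HEADER_SIZE + nameLength + extraLength;
    if (dataOffset > zip.length) break; // the header itself is cut off

    let name = "";
    for (let i = 0; i < nameLength; i++)
      name += String.fromCharCode(view.getUint8(pos + LOCAL_HEADER_SIZE + i));

    const usesDataDescriptor = (flags & 0x0008) !== 0;
    const complete = !usesDataDescriptor && dataOffset + compressedSize <= zip.length;
    entries.push({ name, method, dataOffset, compressedSize, complete });
    if (!complete) break; // cannot locate the next header reliably
    pos = dataOffset + compressedSize;
  }
  return entries;
}
```

For each complete entry with method 8, the bytes from dataOffset to dataOffset + compressedSize are a raw deflate stream, so they can be passed to Node's zlib.inflateRawSync. Entries written with a data descriptor would need a scan for the next signature, which this sketch does not attempt.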
This worked correctly on 1.58.x with the same config and CI setup.
Additional context
We run a second Playwright suite (E2E against a deployed environment) with the same trace: "retain-on-failure" config. That suite does not have retries enabled and its traces are always valid. The key differences in the failing (mocked) suite are retries, a webServer, and that it runs in a shared CI job with higher resource contention.
Suspected regression source
PR #39884 (commit a8ea6558, merged 2026-03-27) converted yazl/yauzl from static top-level imports to lazy await import() in the three codepaths that write trace.zip:
- localUtils.zip() — the primary path for config-driven retain-on-failure traces
- SerializedFS._performOperation('zip') in fileUtils.ts
- testTracing.stopIfNeeded() / mergeTraceFiles() — final trace.zip assembly
The teardown flow wraps stopIfNeeded() in _runWithTimeout(...).catch(() => {}) — errors and timeouts are silently swallowed. The lazy await import('../zipBundle') adds async latency at ZIP creation time that did not exist in 1.58 (where the module was already loaded at startup). Under resource contention in CI, this could widen the window for the teardown timeout to fire after yazl has started streaming local file entries but before zipFile.end() completes writing the central directory and EOCD record.
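The suspected interaction can be modelled in isolation. The sketch below is a toy simulation and not Playwright's code: runWithTimeout, writeArchive and teardown are hypothetical stand-ins, the sleeps are arbitrary, and the "archive" is just a list of labels. It only illustrates how extra latency before the first write, combined with a swallowed timeout, could leave an archive with local entries but no EOCD.

```typescript
const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Hypothetical stand-in for a teardown helper that rejects once the budget is spent.
async function runWithTimeout<T>(task: () => Promise<T>, budgetMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("teardown timeout")), budgetMs);
  });
  try {
    return await Promise.race([task(), timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Toy archive writer: `importDelayMs` models the latency of a lazy `await import()`.
async function writeArchive(written: string[], importDelayMs: number): Promise<void> {
  await sleep(importDelayMs);
  written.push("local file entries");
  await sleep(100); // streaming entry data
  written.push("central directory + EOCD");
}

// Mirrors the `.catch(() => {})` pattern: a timeout is swallowed, so the
// caller never learns that the archive was left unfinished.
async function teardown(importDelayMs: number, budgetMs: number): Promise<string[]> {
  const written: string[] = [];
  await runWithTimeout(() => writeArchive(written, importDelayMs), budgetMs).catch(() => {});
  return [...written]; // snapshot of what is "on disk" when teardown gives up
}
```

With a 150 ms budget, teardown(0, 150) should finish the archive, while teardown(100, 150) should return without the final record and without any error reaching the caller. Whether the real teardown behaves this way is exactly what we are asking the maintainers to confirm.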
A secondary area of change is the screencast/tracing refactor (PR #39512, PR #39520, PR #39937) which changed how screencast frames are captured during tracing. This could affect resource flush timing before ZIP assembly.
Environment
System:
OS: Windows 11 10.0.26100
CPU: (16) x64 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
Memory: 4.51 GB / 31.80 GB
Binaries:
Node: 22.22.0
pnpm: 11.1.3
npmPackages:
@playwright/test: 1.59.1 => 1.59.1
CI (where truncation occurs):
OS: Ubuntu 24.04 (ARM64)
Runner: GitHub Actions, 32 CPU / 128 GB
Container: Playwright 1.59 image