Skip to content

fix(virtdisplay): allocate Xvfb display by explicit number with retry#296

Open
JWriter20 wants to merge 2 commits into
apify:masterfrom
JWriter20:pr/upstream-virtdisplay-race
Open

fix(virtdisplay): allocate Xvfb display by explicit number with retry#296
JWriter20 wants to merge 2 commits into
apify:masterfrom
JWriter20:pr/upstream-virtdisplay-race

Conversation

@JWriter20

Copy link
Copy Markdown
Contributor

Problem

With headless: "virtual", the display is acquired via Xvfb -displayfd 3, which makes Xvfb walk display numbers from 0 upward and re-init GLX on every collision. Under concurrent bursts (many NewBrowser calls at once) the collision-walk is the dominant startup cost.

Worse, the acquisition reads the chosen number off the fd-3 pipe with a for await. If Xvfb opens fd 3 but never writes a number — a half-init crash that doesn't close the pipe, a bind-retry loop, an exhausted ulimit — the read hangs forever and the child process leaks.

Fix

Replace the -displayfd read with explicit allocation:

  • Draw a display number from a large sparse range (randomInt), then spawn Xvfb :N.
  • Decide who won the kernel race by reading /tmp/.X{N}-lock: our PID in the lock ⇒ we won; the child exiting ⇒ collision ⇒ retry. The .X{N}-lock file uses the same O_CREAT | O_EXCL atomic primitive Xvfb relies on with -displayfd, so race-safety is identical.
  • Every attempt is bounded by a timeout and losing children are SIGKILLed, so no path hangs or leaks a process. Retries (max 5) cover the rare birthday collision.

Note on framebuffer size

The Xvfb framebuffer stays at 1x1x24. The spoofed screen.width/screen.height come from the BrowserForge fingerprint, not the underlying X display, so the framebuffer size has no fingerprint impact and a 1×1 buffer keeps virtual-display memory minimal.

Net +92 / -39 in src/virtdisplay.ts. Type-checks clean (tsc --noEmit), biome clean.

🤖 Generated with Claude Code

Under concurrent `headless: "virtual"` bursts, `-displayfd 3` makes Xvfb
walk display numbers from 0 upward and re-init GLX on every collision,
which is the dominant cost — and a half-init crash that opens fd 3 but
never writes a number would hang the for-await on the pipe forever,
leaking the child.

Replace the `-displayfd` read with explicit allocation: draw a display
number from a large sparse range, spawn `Xvfb :N`, and decide who won the
kernel race by reading `/tmp/.X{N}-lock` (the same O_CREAT|O_EXCL atomic
primitive Xvfb uses with -displayfd, so race-safety is identical) — our
PID in the lock means we won; the child exiting means collision, retry.
Each attempt is bounded by a timeout and bad children are SIGKILLed, so
there is no path that hangs or leaks a process.

Keeps the Xvfb framebuffer at 1x1x24: the spoofed screen.width/height
come from the BrowserForge fingerprint, not the X display, so the
framebuffer size has no fingerprint impact.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
JWriter20 added a commit to JWriter20/camoufox-js that referenced this pull request Jun 14, 2026
The 1920x1080 bump (249c965) was a workaround for a proxy-auth
page.evaluate crash under headless:"virtual". Reverting to the upstream
1x1x24 framebuffer per decision; the spoofed screen.width/height come
from the BrowserForge fingerprint, not the X display, so the framebuffer
size has no fingerprint impact. Matches upstream PR apify#296.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
kill() sends SIGKILL, so a terminated Xvfb reports exitCode === null and
signalCode === "SIGKILL" — the existing test asserted exitCode became
non-null and so failed against the new implementation. Add a hasExited()
helper that treats either a non-null exitCode or signalCode as
terminated, and use it for the liveness and teardown assertions. Also
refresh the stale -displayfd comments to describe the random-display +
/tmp/.X{N}-lock race the implementation now uses.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants