Pipeline for generating web page variants (HTML), screenshots, and target-element coordinates for visual attribution evaluation.
- web_variants_generation/pipeline/ – Code and source inputs (everything needed to run the generator).
- web_variants_generation/pipeline/shared/ – Shared Python scripts: screenshot generation, coordinate extraction, verification overlay.
- web_variants_generation/pipeline/scenarios/<name>/ – Per-scenario: source HTML, config.json, and JS variation generator.
- web_variants_generation/data/ – Generated outputs (database): variation HTML, screenshots, coordinates.json, verifications. Populated by running the pipeline; optionally gitignored.
- Python 3.8+ with pip.
- Node.js (for running the JS variation generators).
- Install Python dependencies, Node dependencies, and Playwright browsers:
# If node/npm is missing, install Node.js first (example with nvm):
curl -fsSL https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
export NVM_DIR="$HOME/.nvm" && [ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh"
nvm install --lts && nvm use --lts
pip install -r requirements.txt
npm install
playwright install chromium

Optional (recommended) Python setup with uv:
# Install uv (if missing)
curl -LsSf https://astral.sh/uv/install.sh | sh
source "$HOME/.local/bin/env"
uv venv .venv
source .venv/bin/activate
uv pip install -r requirements.txt
npm install
uv run playwright install chromium

Each scenario has its own folder under web_variants_generation/pipeline/scenarios/<name>/ with:
- source/ – Original HTML snapshot(s).
- config.json – Paths and target settings (paths point to data/<name>/).
- generate_variations.js – Produces variation HTML into data/<name>/html/.
- README.md – Scenario-specific instructions.
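As a quick sanity check before running anything, the folder layout listed above can be verified with a few lines of Python. This is a minimal sketch, not part of the pipeline: the helper name `missing_scenario_entries` is hypothetical, and it only tests that each listed entry exists, not that its contents are valid.

```python
from pathlib import Path

# Entries this README lists for every scenario folder.
EXPECTED_ENTRIES = ("source", "config.json", "generate_variations.js", "README.md")


def missing_scenario_entries(scenario_dir):
    """Return the expected entries that are absent from scenario_dir."""
    root = Path(scenario_dir)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

An empty list means the folder has all four entries; anything else names what still needs to be added.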
Typical flow:
- Generate variation HTML (from the scenario directory or repo root):
  cd web_variants_generation/pipeline/scenarios/<name> && node generate_variations.js
  Output: web_variants_generation/data/<name>/html/*.html.
- Run the shared pipeline (screenshots → coordinates → verification images) from repo root:
  python web_variants_generation/pipeline/shared/screenshot_generator.py web_variants_generation/pipeline/scenarios/<name>/config.json
  python web_variants_generation/pipeline/shared/coordinate_calculator.py web_variants_generation/pipeline/scenarios/<name>/config.json
  python web_variants_generation/pipeline/shared/verification_boxer.py web_variants_generation/pipeline/scenarios/<name>/config.json
  Or use the scenario’s run.sh if provided.
- Results appear under web_variants_generation/data/<name>/: screenshots/, coordinates.json, verifications/.
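The flow above can also be driven from one Python script. The sketch below is an illustration rather than the repository's own runner: `build_steps` and `run_scenario` are hypothetical names, and it assumes the script is started from the repository root with `node` on the PATH. It mirrors the commands shown above, running the generator from inside the scenario folder and the three shared scripts from the repo root.

```python
import subprocess
import sys
from pathlib import Path

PIPELINE = Path("web_variants_generation/pipeline")
# Shared scripts in the order used above: screenshots, coordinates, verification.
SHARED_STEPS = (
    "screenshot_generator.py",
    "coordinate_calculator.py",
    "verification_boxer.py",
)


def build_steps(name):
    """Return (argv, cwd) pairs for one scenario, in execution order."""
    scenario = PIPELINE / "scenarios" / name
    config = (scenario / "config.json").as_posix()
    # Step 1: same as `cd <scenario> && node generate_variations.js`.
    steps = [(["node", "generate_variations.js"], scenario.as_posix())]
    # Step 2: each shared script receives the scenario's config.json.
    for script in SHARED_STEPS:
        script_path = (PIPELINE / "shared" / script).as_posix()
        steps.append(([sys.executable, script_path, config], "."))
    return steps


def run_scenario(name):
    """Run every step; check=True raises on the first non-zero exit code."""
    for argv, cwd in build_steps(name):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_scenario("<name>")` with a real scenario name stops at the first failing step, so later steps never run against missing HTML or screenshots.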
Config paths in each scenario’s config.json are written as data/<name>/..., relative to the web_variants_generation folder rather than to the scenario folder, so that generated artifacts stay in one place.
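To make the path rule concrete, here is a small sketch of how such a relative value could be resolved. The function name `resolve_data_path` is hypothetical, and the sketch does not read config.json at all, since the key names used there are not documented in this README.

```python
from pathlib import Path


def resolve_data_path(repo_root, config_value):
    """Resolve a config path like 'data/<name>/html' against web_variants_generation/."""
    return Path(repo_root) / "web_variants_generation" / config_value
```

For example, with a repo checked out at /repo and a scenario called demo (both made up for illustration), `resolve_data_path("/repo", "data/demo/html")` points at /repo/web_variants_generation/data/demo/html.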
To run every scenario in one pass, from the repository root:
bash web_variants_generation/pipeline/run_all.sh

Useful options:
# Keep running remaining scenarios even if one fails
bash web_variants_generation/pipeline/run_all.sh --continue-on-error
# Run only selected scenarios
bash web_variants_generation/pipeline/run_all.sh --scenarios "amazon_first booking npr"

See web_variants_generation/pipeline/scenarios/<name>/README.md for each scenario’s source, target element, and any special notes.
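The difference between the default behavior and --continue-on-error can be pictured with a short Python loop. This is only an illustration of the semantics described in the comments above, not the contents of run_all.sh: `run_scenarios` and its `run_one` callback are hypothetical, and it assumes the default is to stop at the first failing scenario.

```python
def run_scenarios(names, run_one, continue_on_error=False):
    """Call run_one(name) for each scenario and return the names that failed."""
    failed = []
    for name in names:
        try:
            run_one(name)
        except Exception:
            failed.append(name)
            if not continue_on_error:
                # Assumed default: stop at the first failure.
                break
    return failed
```

Passing a subset of names plays the role of --scenarios, and the returned list makes it easy to report which scenarios need a rerun.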