- CSVs (`data/*.csv`) are committed to git because they are compact (~400 KB per race)
- Raw JSON (`data/raw/`) is gitignored because it is too large for git (~10 MB per race)
- Race manifest (`data/races.json`) tracks all available races and is auto-generated by the scraper

- Add the race to `scripts/race-registry.json` with its group URL from `labs-v2.competitor.com`
- Run discovery to find event IDs: `node scripts/discover.js <ironman-results-url-or-group-url>`
- Scrape results: `node scripts/scrape-all.js --slug=<race-slug> --year=2025`
  Or scrape a single race directly: `node scripts/scrape.js <slug> <event-id> --no-raw`
- The app reads from `data/races.json`, so no code changes are needed

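Because the app is driven by the manifest, adding a race is a data change only. The following is a minimal sketch of how a consumer might parse the manifest text and group entries per race. The entry fields `slug` and `year` and the function name `groupRacesBySlug` are assumptions made for illustration; the actual schema of `data/races.json` is not documented here.

```javascript
// Hypothetical manifest shape (an assumption, not the scraper's documented schema):
//   [{ "slug": "some-race", "year": 2025 }, ...]

/**
 * Parse manifest JSON text and group its entries by race slug.
 * Returns a Map from slug to that race's entries, newest year first.
 */
function groupRacesBySlug(manifestText) {
  const entries = JSON.parse(manifestText);
  if (!Array.isArray(entries)) {
    throw new TypeError("expected the manifest to be a JSON array");
  }

  const bySlug = new Map();
  for (const entry of entries) {
    const valid =
      entry !== null &&
      typeof entry === "object" &&
      typeof entry.slug === "string" &&
      Number.isInteger(entry.year);
    if (!valid) {
      throw new TypeError(`malformed manifest entry: ${JSON.stringify(entry)}`);
    }
    const editions = bySlug.get(entry.slug) ?? [];
    editions.push(entry);
    bySlug.set(entry.slug, editions);
  }

  // Sort each race's editions so the most recent year comes first.
  for (const editions of bySlug.values()) {
    editions.sort((a, b) => b.year - a.year);
  }
  return bySlug;
}
```

In a Next.js app this kind of function would run at build time on the contents of `data/races.json`, so a newly scraped race shows up without touching application code.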
ironman.com results page
  → discover.js extracts group UUID → fetches subevent IDs
  → scrape.js fetches API results → raw JSON → CSV
  → scrape-all.js generates data/races.json manifest
  → Next.js app reads races.json + CSVs at build time
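
Two steps of this pipeline can be sketched in a few lines. This is an illustrative sketch, not the code in `discover.js` or `scrape.js`: the function names are hypothetical, and it assumes the group UUID appears in the URL or page body in the canonical 8-4-4-4-12 hexadecimal form and that result records are flat objects.

```javascript
// Matches a canonical 8-4-4-4-12 hexadecimal UUID anywhere in a string.
const UUID_RE = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/i;

// Return the first UUID found in a URL or page body (lowercased), or null if none.
function extractGroupUuid(text) {
  const match = UUID_RE.exec(text);
  return match ? match[0].toLowerCase() : null;
}

// Quote a CSV field when it contains a comma, double quote, or line break,
// doubling any embedded double quotes (RFC 4180 style).
function csvField(value) {
  const s = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Convert an array of flat result objects to CSV text with a header row.
// `columns` fixes both the header and the field order of every row.
function toCsv(records, columns) {
  const lines = [columns.map(csvField).join(",")];
  for (const record of records) {
    lines.push(columns.map((col) => csvField(record[col])).join(","));
  }
  return lines.join("\n") + "\n";
}
```

Passing an explicit column list keeps the CSV layout stable even if individual API records omit a field; missing values are written as empty fields.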