Contributing

Data Strategy

  • CSVs (data/*.csv) are committed to git: they are compact (~400 KB per race)
  • Raw JSON (data/raw/) is gitignored: it is too large for git (~10 MB per race)
  • Race manifest (data/races.json) tracks all available races and is auto-generated by the scraper
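
As a rough illustration of what an auto-generated manifest could look like, the sketch below derives entries from a list of CSV file names. The `<slug>-<year>.csv` naming scheme, the `buildManifest` helper, and the entry fields (`slug`, `year`, `csv`) are all assumptions made for this example; the scraper's real output format may differ.

```javascript
// Hypothetical sketch: build manifest entries from CSV file names.
// ASSUMPTION: files are named "<slug>-<year>.csv"; the real naming may differ.
function buildManifest(fileNames) {
  return fileNames
    .map((name) => /^(.+)-(\d{4})\.csv$/.exec(name)) // null for non-matching names
    .filter(Boolean)                                  // drop anything that is not a race CSV
    .map(([file, slug, year]) => ({
      slug,
      year: Number(year),
      csv: `data/${file}`,
    }))
    .sort((a, b) => a.slug.localeCompare(b.slug) || a.year - b.year);
}

// Made-up file names, for illustration only.
const manifest = buildManifest([
  'kona-2025.csv',
  'README.md',
  'kona-2024.csv',
  'nice-2025.csv',
]);
console.log(JSON.stringify(manifest, null, 2));
```

Sorting by slug and then year keeps the generated file stable between runs, which helps keep git diffs small when a single race is added.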

Adding New Races

  1. Add the race to scripts/race-registry.json with its group URL from labs-v2.competitor.com
  2. Run discovery to find event IDs:
    node scripts/discover.js <ironman-results-url-or-group-url>
    
  3. Scrape results:
    node scripts/scrape-all.js --slug=<race-slug> --year=2025
    
    Or scrape a single race directly:
    node scripts/scrape.js <slug> <event-id> --no-raw
    
  4. The app reads from data/races.json, so no code changes are needed
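
The commands above mix positional arguments (`<slug> <event-id>`) with `--key=value` options and bare flags such as `--no-raw`. A minimal sketch of how arguments of that shape could be parsed is shown here; `parseArgs` is a hypothetical helper written for this example and is not taken from the actual scripts, which may handle arguments differently.

```javascript
// Hypothetical sketch: split CLI arguments into options and positionals.
// "--key=value" becomes options.key = "value"; a bare "--flag" becomes true.
function parseArgs(argv) {
  const options = {};
  const positional = [];
  for (const arg of argv) {
    const match = /^--([^=]+)(?:=(.*))?$/.exec(arg);
    if (!match) {
      positional.push(arg); // e.g. a slug or an event ID
      continue;
    }
    options[match[1]] = match[2] === undefined ? true : match[2];
  }
  return { options, positional };
}

// In a real script the input would be process.argv.slice(2).
// The slug and event ID below are placeholders.
const single = parseArgs(['example-race', '12345', '--no-raw']);
const batch = parseArgs(['--slug=example-race', '--year=2025']);
console.log(single, batch);
```

Note that option values stay strings in this sketch (`'2025'`, not `2025`), so a caller would convert the year explicitly if it needs a number.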

Data Pipeline

ironman.com results page
  → discover.js extracts group UUID → fetches subevent IDs
  → scrape.js fetches API results → raw JSON → CSV
  → scrape-all.js generates data/races.json manifest
  → Next.js app reads races.json + CSVs at build time
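
Two of the pipeline steps above, extracting a group UUID from a URL and flattening JSON results into CSV, can be sketched without any network access. This is only an illustration under stated assumptions: the URL path, the UUID value, the result field names (`bib`, `name`, `finish`), and the helpers `extractGroupId`, `csvEscape`, and `toCsv` are all invented for the example and do not reflect the real API or scripts.

```javascript
// Hypothetical sketch of two pipeline steps; no real API schema is assumed.

// Find the first UUID-shaped token in a URL, or return null if there is none.
function extractGroupId(url) {
  const match = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/i.exec(url);
  return match ? match[0].toLowerCase() : null;
}

// Quote a value (RFC 4180 style) when it contains a comma, quote, or newline.
function csvEscape(value) {
  const s = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Convert an array of flat objects into CSV text with a header row.
function toCsv(rows, columns) {
  const header = columns.map(csvEscape).join(',');
  const lines = rows.map((row) => columns.map((c) => csvEscape(row[c])).join(','));
  return [header, ...lines].join('\n') + '\n';
}

// Placeholder URL: the path and UUID are made up for illustration.
const groupId = extractGroupId(
  'https://labs-v2.competitor.com/example/path/123E4567-e89b-12d3-a456-426614174000'
);

// Placeholder results: the field names are assumptions, not the API schema.
const csv = toCsv(
  [
    { bib: 101, name: 'Doe, Jane', finish: '09:12:33' },
    { bib: 102, name: 'John "JJ" Roe', finish: null },
  ],
  ['bib', 'name', 'finish']
);
console.log(groupId);
console.log(csv);
```

Escaping matters here because athlete names can contain commas or quotes, and a missing value (for example, a non-finisher's time) is written as an empty field rather than the string "null".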