Local ingest utilities for NAEI datasets.
The 2024 PV ingest script lives at scripts/load_naei_data.py and only reads these visible pivot sheets:
- AirPollutants → NAEI2024pv_AQ.csv
- HeavyMetals → NAEI2024pv_HM.csv
- ParticulateMatter → NAEI2024pv_PM.csv
- POPs&PAHs → NAEI2024pv_POP.csv
All other sheets (hidden backing data, query tables, report tabs) are ignored.
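The sheet selection described above can be sketched as a simple lookup. This is only an illustration and not the code in scripts/load_naei_data.py: the names SHEET_TO_SUFFIX, output_name and select_sheets are hypothetical, and sheet visibility is passed in as plain strings rather than read from a real workbook.

```python
# Hypothetical sketch: map the visible pivot sheets to output CSV names.
SHEET_TO_SUFFIX = {
    "AirPollutants": "AQ",
    "HeavyMetals": "HM",
    "ParticulateMatter": "PM",
    "POPs&PAHs": "POP",
}


def output_name(sheet: str, dataset_prefix: str = "NAEI2024pv") -> str:
    """Return the CSV filename for a known pivot sheet."""
    return f"{dataset_prefix}_{SHEET_TO_SUFFIX[sheet]}.csv"


def select_sheets(sheets: list[tuple[str, str]]) -> list[str]:
    """Keep sheets that are visible and listed in the mapping.

    `sheets` holds (name, state) pairs; the state strings
    "visible" / "hidden" are an assumption of this sketch.
    """
    return [
        name
        for name, state in sheets
        if state == "visible" and name in SHEET_TO_SUFFIX
    ]


if __name__ == "__main__":
    # Made-up workbook layout, for illustration only.
    workbook = [
        ("AirPollutants", "visible"),
        ("BackingData", "hidden"),
        ("Report", "visible"),
        ("POPs&PAHs", "visible"),
    ]
    for sheet in select_sheets(workbook):
        print(sheet, "->", output_name(sheet))
```

Keeping the mapping in one dictionary means an unlisted tab is skipped even when it is visible, which matches the "all other sheets are ignored" rule.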
python3 -m pip install -r requirements.txt

Set one of:
- SUPABASE_DB_URL
- DATABASE_URL
You can put this in .env for local runs.
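A minimal sketch of how a script might pick up the connection string from either variable. The function name resolve_db_url is hypothetical, and checking SUPABASE_DB_URL before DATABASE_URL is an assumption; the precedence actually used by scripts/load_naei_data.py is not stated here.

```python
import os


def resolve_db_url(env=None) -> str:
    """Return the first non-empty connection string found.

    The lookup order (SUPABASE_DB_URL, then DATABASE_URL) is an
    assumption of this sketch, not documented behaviour.
    """
    env = os.environ if env is None else env
    for key in ("SUPABASE_DB_URL", "DATABASE_URL"):
        value = env.get(key, "").strip()
        if value:
            return value
    raise RuntimeError("Set SUPABASE_DB_URL or DATABASE_URL")
```

Passing a plain dictionary as `env` makes the lookup easy to exercise without touching the real environment.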
Run:
schema/migrations/2026-04-09_naei2024pv_ingest_patch.sql
This patch adds territory support for naei2024pv_series, missing FKs, and unique constraints needed for idempotent upserts.
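To illustrate why the unique constraints matter for idempotent upserts, here is a small self-contained sketch. It uses an in-memory SQLite database as a stand-in for the real Postgres target (SQLite 3.24 or later is needed for ON CONFLICT), and the table name, columns and values are all made up for the example; they are not the naei2024pv_series schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the UNIQUE constraint is what lets the upsert
# recognise a row it has already loaded.
conn.execute(
    """
    CREATE TABLE series (
        territory TEXT NOT NULL,
        pollutant TEXT NOT NULL,
        year INTEGER NOT NULL,
        value REAL,
        UNIQUE (territory, pollutant, year)
    )
    """
)

UPSERT = """
    INSERT INTO series (territory, pollutant, year, value)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (territory, pollutant, year)
    DO UPDATE SET value = excluded.value
"""


def load(rows):
    """Insert rows, updating in place when the unique key already exists."""
    with conn:  # commit on success, roll back on error
        conn.executemany(UPSERT, rows)


# Illustrative values only, not real NAEI figures.
rows = [("England", "NOx", 2022, 1.5), ("Wales", "NOx", 2022, 0.2)]
load(rows)
load(rows)  # a second run updates the same rows instead of duplicating them
row_count = conn.execute("SELECT COUNT(*) FROM series").fetchone()[0]
```

Without the UNIQUE constraint the ON CONFLICT clause has no target to match, so the database rejects the statement. That is the reason the patch has to be applied before loading.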
python3 scripts/load_naei_data.py extract-pv-xlsx \
--xlsx "/Users/mikehinford/Dropbox/Apps/github-data-explorer-mk2/2026/PivotTableViewer_naei24_AQ_2026_02_26.xlsx" \
--output-dir "/Users/mikehinford/Dropbox/Apps/github-data-explorer-mk2/2026/extracted-pv" \
--dataset-prefix NAEI2024pv

Output files:
- NAEI2024pv_AQ.csv
- NAEI2024pv_HM.csv
- NAEI2024pv_PM.csv
- NAEI2024pv_POP.csv
Normalized columns:
- extracted_at
- source_sheet
- dataset_prefix
- territory_name
- pollutant
- reporting_year
- emission_unit
- source_name
- activity_name
- emission_value
- nfr_code
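A quick way to sanity-check an extracted CSV before loading is to compare its header against the normalized column list. The sketch below is an assumption about how such a check could look; missing_columns is a hypothetical helper and not part of scripts/load_naei_data.py.

```python
import csv
import io

# Normalized column names as listed in this README.
NORMALIZED_COLUMNS = [
    "extracted_at",
    "source_sheet",
    "dataset_prefix",
    "territory_name",
    "pollutant",
    "reporting_year",
    "emission_unit",
    "source_name",
    "activity_name",
    "emission_value",
    "nfr_code",
]


def missing_columns(handle) -> list[str]:
    """Return the expected columns absent from the CSV header."""
    reader = csv.DictReader(handle)
    present = set(reader.fieldnames or [])
    return [col for col in NORMALIZED_COLUMNS if col not in present]


if __name__ == "__main__":
    # Header-only sample with the last column dropped on purpose.
    sample = io.StringIO(",".join(NORMALIZED_COLUMNS[:-1]) + "\n")
    print(missing_columns(sample))
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the handle in the same way.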
python3 scripts/load_naei_data.py load-pv \
--path "/Users/mikehinford/Dropbox/Apps/github-data-explorer-mk2/2026/extracted-pv" \
--dataset-prefix NAEI2024pv \
--source-url "PivotTableViewer_naei24_AQ_2026_02_26.xlsx"

python3 scripts/load_naei_data.py run-pv-ingest \
--xlsx "/Users/mikehinford/Dropbox/Apps/github-data-explorer-mk2/2026/PivotTableViewer_naei24_AQ_2026_02_26.xlsx" \
--output-dir "/Users/mikehinford/Dropbox/Apps/github-data-explorer-mk2/2026/extracted-pv" \
--dataset-prefix NAEI2024pv \
--source-url "PivotTableViewer_naei24_AQ_2026_02_26.xlsx"

- dataset_file.file_name stores the CSV filename.
- dataset_file.extracted_at uses the CSV file timestamp.
- Row-level extracted_at in the normalized CSV comes from the workbook time-stamp (with extraction run timestamp fallback if missing).
- Re-runs are idempotent when the schema patch constraints are applied.
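The fallback rule for the row-level extracted_at value can be sketched as follows. This is a hedged illustration: row_extracted_at is a hypothetical name, and how the script actually reads the workbook time-stamp or formats the result is not specified here, so ISO 8601 output is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def row_extracted_at(
    workbook_stamp: Optional[datetime], run_started: datetime
) -> str:
    """Prefer the workbook time-stamp; fall back to the run timestamp."""
    chosen = workbook_stamp if workbook_stamp is not None else run_started
    return chosen.isoformat()


if __name__ == "__main__":
    run_started = datetime.now(timezone.utc)
    # No workbook time-stamp available, so the run timestamp is used.
    print(row_extracted_at(None, run_started))
```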