43 changes: 33 additions & 10 deletions .claude/skills/model-calibration/SKILL.md
@@ -1,6 +1,6 @@
---
name: model-calibration
description: Run, refresh, or diagnose the model's calibration pipeline (feed -> food_waste -> food_demand -> cost -> stability) that produces the git-tracked artefacts under `data/curated/calibration/`. Covers the dependency order, the `tools/calibrate` wrapper, realistic runtime expectations, when each kind of upstream change forces a re-run, and how to diagnose the most common failure mode: a hidden supply/demand mismatch that inflates the production-stability L1 cost. Use whenever calibration is relevant -- the user touches inputs/build logic that feed the calibration solves, calibration artefacts look off, or a refresh of the artefacts is needed after a model/data change.
description: Run, refresh, or diagnose the model's calibration pipeline (feed -> food_waste -> food_demand -> cost -> stability) that produces the per-config artefact sets under `data/curated/calibration/<source>/` (the default set is git-tracked). Covers the dependency order, the `tools/calibrate` wrapper, realistic runtime expectations, when each kind of upstream change forces a re-run, and how to diagnose the most common failure mode: a hidden supply/demand mismatch that inflates the production-stability L1 cost. Use whenever calibration is relevant -- the user touches inputs/build logic that feed the calibration solves, calibration artefacts look off, or a refresh of the artefacts is needed after a model/data change.
---

<!--
@@ -11,12 +11,19 @@ SPDX-License-Identifier: CC-BY-4.0

# Model Calibration

The default workflow consumes five git-tracked calibration artefacts under
`data/curated/calibration/`. Each is produced by a dedicated validation-mode
solve and absorbs a specific class of residual mismatch so that ordinary
solves don't have to. Without these files in place, production-stability,
The default workflow consumes five calibration artefact groups organized
in per-config *sets* under `data/curated/calibration/<source>/`, selected
by the `calibration.source` config key (the `default` set is
git-tracked). Each is produced by a dedicated validation-mode solve and
absorbs a specific class of residual mismatch so that ordinary solves
don't have to. Without these files in place, production-stability,
costs, and food/feed accounting drift from observed 2020 reality.

Every set carries a `provenance.yaml` stamp of the structural config it
was fit against; workflow runs error at DAG time when their config
differs structurally from the consumed set's stamp (see "Artefact sets
and provenance" below).
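
A minimal sketch of the structural comparison described above, assuming the stamp and the live config are both nested mappings. The helper names (`flatten`, `provenance_mismatches`) and the single exempt prefix are hypothetical illustrations, not the project's actual implementation:

```python
from typing import Any

# Hypothetical exemption list for illustration only; the real workflow
# defines its own set of exempt keys.
EXEMPT_PREFIXES = ("paths.",)
_MISSING = object()


def flatten(config: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten a nested mapping into {"a.b.c": leaf} form."""
    leaves: dict[str, Any] = {}
    for key, value in config.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            leaves.update(flatten(value, prefix=f"{dotted}."))
        else:
            leaves[dotted] = value
    return leaves


def provenance_mismatches(config: dict[str, Any], stamp: dict[str, Any]) -> list[str]:
    """Return the dotted keys whose values differ between config and stamp."""
    current = {k: v for k, v in flatten(config).items() if not k.startswith(EXEMPT_PREFIXES)}
    stamped = {k: v for k, v in flatten(stamp).items() if not k.startswith(EXEMPT_PREFIXES)}
    return sorted(
        key
        for key in current.keys() | stamped.keys()
        if current.get(key, _MISSING) != stamped.get(key, _MISSING)
    )


# Illustrative values: a coarser regional resolution than the stamped one.
live = {"aggregation": {"regions": {"target_count": 200}}, "paths": {"logs_root": "logs"}}
stamp = {"aggregation": {"regions": {"target_count": 400}}, "paths": {"logs_root": "elsewhere"}}
print(provenance_mismatches(live, stamp))  # ['aggregation.regions.target_count']
```

A non-empty result is what a DAG-time check would turn into an error; the exempt `paths.` difference is ignored in this sketch.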

Authoritative reference: `docs/calibration.rst`. This skill is the operational
companion: when to run, how to run, what to expect, what to watch out for.

@@ -58,12 +65,20 @@ tools/calibrate food_waste
tools/calibrate food_demand
tools/calibrate cost
tools/calibrate stability
tools/calibrate --check # per-step staleness probe (dry-run, no execution)
tools/calibrate --check # per-step staleness + provenance probe (no execution)
tools/calibrate --base config/<name>.yaml [all|<step>|--check]
# calibrate a dedicated set for another config
```

The wrapper defaults to `pixi -e gurobi` -- all calibration configs use
Gurobi. HiGHS is too slow here. Override with `CALIBRATE_PIXI_ENV=<env>`.

With `--base`, the base config must declare its own `calibration.source`;
the wrapper refuses to overwrite the shared `default` set. A fresh set is
seeded from `default` and regenerated in order, and the `all` chain uses
`name: calibration-<source>` so processing trees don't thrash. After any
successful run the set is (re)stamped with `provenance.yaml`.
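
The `--base` rules above can be sketched as a small planning function. This is an illustration under stated assumptions: `plan_base_run` and its return keys are hypothetical, and the real wrapper may detect a missing `calibration.source` differently.

```python
from pathlib import Path

CALIBRATION_ROOT = Path("data/curated/calibration")


def plan_base_run(base_config: dict) -> dict:
    """Derive the target set, seed set, and run name for a --base calibration.

    Hypothetical helper: raises when the base config would write into the
    shared 'default' set.
    """
    source = base_config.get("calibration", {}).get("source")
    if source is None or source == "default":
        raise ValueError(
            "--base requires the config to declare its own calibration.source; "
            "refusing to overwrite the shared 'default' set"
        )
    return {
        "set_dir": CALIBRATION_ROOT / source,          # where artefacts are written
        "seed_from": CALIBRATION_ROOT / "default",     # a fresh set starts as a copy of default
        "run_name": f"calibration-{source}",           # keeps processing trees separate
    }


plan = plan_base_run({"calibration": {"source": "coarse"}})
print(plan["run_name"])
```

Here `coarse` is an invented set name; any name the config declares would be handled the same way.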

Pass extra flags through positionally:

```bash
@@ -153,13 +168,21 @@ sequential (each Broyden iteration depends on the previous solve).

## Output landing zones

- `data/curated/calibration/*` -- the five artefacts, **git-tracked**. Commit them together as a refresh; mixed-vintage artefacts are the most common cause of confusing downstream solves.
- `processing/calibration/*` -- shared upstream prep, NOT committed.
- `data/curated/calibration/<source>/*` -- one artefact set per base config, plus its `provenance.yaml` stamp; the `default` set is **git-tracked**. Commit a set together as a refresh; mixed-vintage artefacts are the most common cause of confusing downstream solves.
- `processing/calibration/*` (or `processing/calibration-<source>/*` for non-default bases) -- shared upstream prep, NOT committed.
- `results/calibration/*` -- per-iteration solve logs, NOT committed.
- `results/calibration/calibration/deviation_penalty_trace.csv` -- per-iter Broyden trace (per-component lambda, achieved deviations, residual norm). Inspect when stability behaves oddly.

## Artefact sets and provenance

- A config selects its set with `calibration.source` (default: `default`); all artefact paths resolve through the `{calibration_source}` placeholder at config-load time.
- Structurally divergent configs must either calibrate their own set (`calibration.source: <name>` + `tools/calibrate --base config/<name>.yaml`), point at a compatible set, or set `calibration.accept_provenance_mismatch: true` (test/tutorial-grade escape hatch: warning instead of error).
- The provenance check covers config drift only; code/data staleness remains the job of the per-step staleness probe. `tools/calibrate --check` runs both.
- The stamp compares all non-solve-time leaves minus exempt machinery keys (see `PROVENANCE_EXEMPT_PREFIXES` in `workflow/validation/calibration_provenance.py`). Solve-time knobs (GHG price, value_per_yll, deviation_penalty, scenario overrides) never trip it.
- `tests/test_calibration_provenance.py::TestDefaultStampConsistency` fails when `config/default.yaml` changes structurally without a recalibration/restamp -- that is the intended forcing function.
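
As an illustration of the placeholder resolution mentioned in the first bullet, the following sketch substitutes `{calibration_source}` throughout a nested config. The function name `resolve_calibration_source` is hypothetical, and the real loader may work differently. Plain string replacement is used here rather than `str.format` so that other placeholders, such as `{name}`, pass through untouched.

```python
from typing import Any

PLACEHOLDER = "{calibration_source}"


def resolve_calibration_source(node: Any, source: str) -> Any:
    """Recursively replace the placeholder in every string leaf."""
    if isinstance(node, dict):
        return {key: resolve_calibration_source(value, source) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_calibration_source(value, source) for value in node]
    if isinstance(node, str):
        return node.replace(PLACEHOLDER, source)
    return node  # numbers, booleans, None are left as-is


config = {
    "calibration": {"source": "default", "accept_provenance_mismatch": False},
    "deviation_penalty": {
        "calibrated_yaml": "data/curated/calibration/{calibration_source}/deviation_penalty.yaml",
        "trace_csv": "<results>/{name}/calibration/deviation_penalty_trace.csv",
    },
}
resolved = resolve_calibration_source(config, config["calibration"]["source"])
print(resolved["deviation_penalty"]["calibrated_yaml"])
```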

The currently calibrated L1 centre lives in
`data/curated/calibration/deviation_penalty.yaml` under
`data/curated/calibration/default/deviation_penalty.yaml` under
`l1_costs.<component>`. Solves that set
`deviation_penalty.{land,feed,diet}.l1_cost: "calibrated"` resolve the
sentinel from this file at solve time. Per-component
@@ -289,7 +312,7 @@ tools/smk --configfile config/validation.yaml -- \
results/validation/solved/model_scen-default.nc

# Current calibrated L1 centre
cat data/curated/calibration/deviation_penalty.yaml
cat data/curated/calibration/default/deviation_penalty.yaml

# Per-iter Broyden trace (after a stability run)
cat results/calibration/calibration/deviation_penalty_trace.csv
21 changes: 15 additions & 6 deletions AGENTS.md
@@ -325,14 +325,15 @@ pixi run -e dev pytest -v # verbose output

## Calibration

Five calibrations feed the default workflow. Their outputs live under
`data/curated/calibration/` and are git-tracked; builds depend on them.
When upstream data or build logic changes materially, regenerate in
this order:
Five calibrations feed the default workflow. Their outputs are organized
in per-config artefact *sets* under `data/curated/calibration/<source>/`
(selected by the `calibration.source` config key; the `default` set is
git-tracked) and builds depend on them. When upstream data or build
logic changes materially, regenerate in this order:

1. **feed** — `config/calibration/feed.yaml` → `grassland_yield.csv`,
`fodder_conversion.csv`, `exogenous_forage.csv`,
`exogenous_protein.csv`.
`exogenous_feed.csv`.
2. **food_waste** — `config/calibration/food_waste.yaml` →
`food_waste.yaml` (per-food-group consumer-side waste multipliers).
3. **food_demand** — `config/calibration/food_demand.yaml` →
@@ -354,7 +355,15 @@ Single entrypoint: `tools/calibrate` (`all` by default; `feed`,
`food_waste`, `food_demand`, `cost`, `stability`, or `--check` for
staleness). `tools/smk` prints a one-line reminder when
`data/curated/` inputs are newer than the oldest calibration artefact.
See `docs/calibration.rst` for the full story.

Each artefact set carries a `provenance.yaml` stamp of the structural
config it was fit against (written by `tools/calibrate`); every workflow
run checks its config against the stamp of the set it consumes and
errors on structural mismatch. Configs with different structural
assumptions must declare their own `calibration.source` and run
`tools/calibrate --base config/<name>.yaml`, or set
`calibration.accept_provenance_mismatch: true` (test/tutorial configs
only). See `docs/calibration.rst` for the full story.
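
The error-versus-warning behaviour of `calibration.accept_provenance_mismatch` might look roughly like the sketch below. `CalibrationProvenanceError` and `enforce_provenance` are hypothetical names chosen for illustration; the list of mismatched keys is assumed to come from whatever comparison the workflow performs.

```python
import warnings


class CalibrationProvenanceError(RuntimeError):
    """Hypothetical error type for a structural config/stamp mismatch."""


def enforce_provenance(mismatched_keys: list[str], accept_mismatch: bool) -> None:
    """Raise on mismatch, or only warn when the escape hatch is enabled."""
    if not mismatched_keys:
        return
    message = "calibration provenance mismatch on: " + ", ".join(mismatched_keys)
    if accept_mismatch:
        # Test/tutorial-grade escape hatch: keep going, but make it visible.
        warnings.warn(message, UserWarning, stacklevel=2)
    else:
        raise CalibrationProvenanceError(message)


# A tutorial-style config that sets accept_provenance_mismatch: true only warns.
enforce_provenance(["aggregation.regions.target_count"], accept_mismatch=True)
```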

## Configuration Validation

8 changes: 8 additions & 0 deletions REUSE.toml
@@ -38,6 +38,14 @@ precedence = "aggregate"
SPDX-FileCopyrightText = "2026 Koen van Greevenbroek"
SPDX-License-Identifier = "PDDL-1.0"

# Machine-generated calibration provenance stamps (written by
# workflow/scripts/write_calibration_provenance.py via tools/calibrate).
[[annotations]]
path = "data/curated/calibration/**/provenance.yaml"
precedence = "aggregate"
SPDX-FileCopyrightText = "2026 Koen van Greevenbroek"
SPDX-License-Identifier = "CC-BY-4.0"

# Root LICENSE copy so GitHub detects the project license (it only scans the
# repository root, not LICENSES/). The canonical text lives in LICENSES/.
[[annotations]]
37 changes: 27 additions & 10 deletions config/default.yaml
@@ -51,6 +51,23 @@ paths:
logs_root: "logs"
benchmarks_root: "benchmarks"

# --- section: calibration ---
# Which calibration artefact set to use. The five calibration artefact
# groups (feed, food_waste, food_demand, cost, stability) live under
# ``data/curated/calibration/<source>/``; every artefact path below that
# contains the ``{calibration_source}`` placeholder resolves against this
# key at config-load time. A config whose structural assumptions differ
# from the set it consumes must either regenerate its own set
# (``tools/calibrate --base <config>`` with its own ``source`` name) or
# point ``source`` at a compatible existing set. Compatibility is checked
# at workflow start against the set's ``provenance.yaml``.
calibration:
  source: "default"
  # Downgrade a provenance mismatch from an error to a warning. Only for
  # configs that knowingly reuse a set calibrated under different
  # structural assumptions (e.g. coarse test/tutorial resolutions).
  accept_provenance_mismatch: false

# --- section: netcdf ---
# NetCDF export settings for PyPSA network files (build and solve outputs)
netcdf:
@@ -165,7 +182,7 @@ deviation_penalty:
tolerance: 0.02
max_iter: 8
trust_region_log: 0.693 # log(2): caps |dx|_inf in log-coords per iteration
calibrated_yaml: "data/curated/calibration/deviation_penalty.yaml"
calibrated_yaml: "data/curated/calibration/{calibration_source}/deviation_penalty.yaml"
trace_csv: "<results>/{name}/calibration/deviation_penalty_trace.csv"
seeds:
cropland: 1.0 # crop deviation reaches the 5% target near L1~1.1
@@ -656,9 +673,9 @@ grazing:
grassland_forage_calibration:
enabled: true
generate: false
grassland_yield_correction: "data/curated/calibration/grassland_yield.csv"
fodder_conversion_correction: "data/curated/calibration/fodder_conversion.csv"
exogenous_forage: "data/curated/calibration/exogenous_forage.csv"
grassland_yield_correction: "data/curated/calibration/{calibration_source}/grassland_yield.csv"
fodder_conversion_correction: "data/curated/calibration/{calibration_source}/fodder_conversion.csv"
exogenous_forage: "data/curated/calibration/{calibration_source}/exogenous_forage.csv"
scenario: "default"

# Protein-feed calibration: per-country exogenous monogastric/ruminant
@@ -673,7 +690,7 @@ grazing:
exogenous_feed_calibration:
enabled: true
generate: false
exogenous_feed: "data/curated/calibration/exogenous_feed.csv"
exogenous_feed: "data/curated/calibration/{calibration_source}/exogenous_feed.csv"
scenario: "default"

# Food waste calibration: a per-food-group multiplier on (1 - waste_fraction)
@@ -685,7 +702,7 @@ exogenous_feed_calibration:
food_loss_waste_calibration:
enabled: true
generate: false
calibration_file: "data/curated/calibration/food_waste.yaml"
calibration_file: "data/curated/calibration/{calibration_source}/food_waste.yaml"
food_groups:
# Groups with documented FBS-vs-GDD gap that the SDG-based defaults
# under- or over-state. The SDG global all-foods 10% waste rate fits
@@ -716,7 +733,7 @@ food_loss_waste_calibration:
food_demand_calibration:
enabled: true
generate: false
calibration_file: "data/curated/calibration/food_demand.csv"
calibration_file: "data/curated/calibration/{calibration_source}/food_demand.csv"
# Bounds on the per-food multiplier. Tight on purpose: anything that
# would fall outside flags a structural data issue worth investigating
# rather than being silently absorbed.
@@ -1687,9 +1704,9 @@ cost_calibration:
enabled: true # Apply calibration corrections to production costs
generate: false # Generate calibration from solved model (breaks DAG cycle when true)
scenario: "calibration" # Scenario name used for calibration solve
crop_correction_csv: "data/curated/calibration/crop_cost.csv"
grassland_correction_csv: "data/curated/calibration/grassland_cost.csv"
animal_correction_csv: "data/curated/calibration/animal_cost.csv"
crop_correction_csv: "data/curated/calibration/{calibration_source}/crop_cost.csv"
grassland_correction_csv: "data/curated/calibration/{calibration_source}/grassland_cost.csv"
animal_correction_csv: "data/curated/calibration/{calibration_source}/animal_cost.csv"

# --- section: solving ---
solving:
15 changes: 15 additions & 0 deletions config/schemas/config.schema.yaml
@@ -14,6 +14,7 @@ required:
- currency_base_year
- downloads
- paths
- calibration
- netcdf
- validation
- consumer_values
@@ -122,6 +123,20 @@ properties:
minLength: 1
description: "Root directory for Snakemake benchmark TSV files"

  calibration:
    type: object
    required: [source, accept_provenance_mismatch]
    additionalProperties: false
    description: "Selection of the calibration artefact set under data/curated/calibration/<source>/"
    properties:
      source:
        type: string
        pattern: "^[a-zA-Z0-9_-]+$"
        description: "Name of the calibration artefact set to read (and write, for generation runs)"
      accept_provenance_mismatch:
        type: boolean
        description: "Downgrade a calibration provenance mismatch from an error to a warning"

netcdf:
type: object
description: "NetCDF export settings for PyPSA network files"
5 changes: 5 additions & 0 deletions config/tutorial/01_ghg_prices.yaml
@@ -13,6 +13,11 @@

name: "tutorial_01"

# Tutorials knowingly reuse the default calibration artefacts at a coarser
# regional resolution.
calibration:
  accept_provenance_mismatch: true

# Reduced spatial resolution so the tutorial completes in a few minutes on a
# laptop after the one-off data download. 200 is the smallest value the
# default country list admits without enabling cross-border clustering.
5 changes: 5 additions & 0 deletions config/tutorial/02_consumer_values.yaml
@@ -24,6 +24,11 @@

name: "tutorial_02"

# Tutorials knowingly reuse the default calibration artefacts at a coarser
# regional resolution.
calibration:
  accept_provenance_mismatch: true

# See config/tutorial/01_ghg_prices.yaml for the rationale on target_count.
aggregation:
regions: