Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
119 changes: 119 additions & 0 deletions CONTEXT.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
# ModelSkill

Internal domain glossary for ModelSkill — a Python package for evaluating model skill by comparing simulation output with observations, primarily for MIKE 21/3 (Flexible Mesh) hydrodynamic and oceanographic models. The user-facing version lives in `docs/user-guide/terminology.qmd`; this file is the dev-facing canonical reference and additionally documents internal/alpha vocabulary that is not yet stable enough for the public docs.

Upstream vocabulary (file formats, geometry concepts, layered meshes) lives in [mikeio's CONTEXT.md](../mikeio/CONTEXT.md). Where the two glossaries overlap, mikeio is authoritative — modelskill aligns rather than diverges.

## Language

### Core entities

**Observation**:
Measured data from a real instrument or station, used as the reference truth. Concrete types: `PointObservation`, `TrackObservation` (plus `VerticalObservation`, alpha).
_Avoid_: ground truth, target, label.

**ModelResult**:
Simulation output to be evaluated against an Observation. Concrete types: `PointModelResult`, `TrackModelResult`, `GridModelResult`, `DfsuModelResult`, `DummyModelResult` (plus `VerticalModelResult`, alpha).
_Avoid_: prediction, forecast, model output.

**Comparer**:
The pairing of one Observation with one or more ModelResults after spatial and temporal matching. Holds the matched data alongside the raw model data.
_Avoid_: comparison, matched dataset.

**ComparerCollection**:
A collection of Comparers, typically one per station, used for cross-station skill assessment.
_Avoid_: multi-comparer, comparer set.

**Baseline**:
A reference ModelResult representing a naive prediction (mean, persistence, climatology). Realized as `DummyModelResult`. Anchors skill scores like NSE that quantify "is the real model better than predicting nothing clever".
_Avoid_: null model, dummy model (use "Baseline" in prose; `DummyModelResult` is the class).

### Geometry types (gtype)

**Geometry type (gtype)**:
The label describing the spatial structure of an Observation or ModelResult. Determines match-compatibility. Values: `point`, `track`, `grid`, `unstructured` (flexible mesh), `vertical` (alpha), `node` / `reach` (future). Accessible via `.gtype`.

**Point**:
A fixed (x, y) location, single z, timeseries of values. e.g. tide gauge, single-depth mooring.

**Track**:
A moving (x, y) location, single z, timeseries of values. e.g. satellite altimeter, ship transect.

**Grid**:
A regular axis-aligned spatial extent with a value at each (x, y) per timestep. ModelResult-only (no GridObservation). Backed by `GridModelResult` (xarray / nc / dfs2).

**Flexible Mesh** (FM):
An unstructured spatial extent of Nodes + Elements with a value per Element per timestep. ModelResult-only. Backed by `DfsuModelResult` (dfsu files). Use **flexible mesh** for the geometry concept and **dfsu** for the file format — never "unstructured grid". Aligns with mikeio's vocabulary.

**Vertical** (alpha):
A fixed (x, y) location with values varying along z and time — a water column at one station. e.g. CTD cast, moored profiler, modeled column extracted from a 3D dfsu.
_Avoid_: profile, water column, column, depth profile (all overloaded).

### Skill assessment

**Skill**:
The ability of a model to reproduce observations. As an API, `Comparer.skill()` returns a `SkillTable` of metrics grouped by (observation, model, variable). Conceptually a Comparer-level concept; also available on ComparerCollection.

**Score**:
A single numerical value summarizing model performance for one metric, computed as a weighted average across all time-steps, observations and variables. Conceptually a ComparerCollection-level concept; also available on `Comparer.score` as the degenerate single-observation case.

**Metric**:
A mathematical expression evaluated on matched (obs, model) pairs (bias, RMSE, R², NSE, …). Defined in `modelskill.metrics`. Both circular and scalar metrics are supported.

**SkillTable**:
Skill metrics aggregated by categorical groupings (model, observation, variable). DataFrame-like.

**SkillGrid**:
Skill metrics binned in horizontal space (x, y).

**SkillProfile** (alpha):
Skill metrics binned in z. Single-station today; will gain an `observation` dim for multi-station before release.

### Physical quantities

**Quantity**:
A physical quantity (name + unit, e.g. *Water Level [m]*) attached to an Observation or ModelResult. Compatibility gates matching — water level cannot be matched against wind speed. When reading dfs files via mikeio, derived from `EUMType` + `EUMUnit`. Class: `Quantity`.

**Field**:
Data defined over a spatial extent (Grid or Flexible Mesh) with a value at each location and timestep — i.e. the gtype=grid or gtype=unstructured shape. Contrasts with **timeseries** (point/track). Not directly comparable to observations; `match()` extracts a timeseries per observation location before producing a Comparer.

**Timeseries**:
A sequence of values in time at a single location (point) or a single moving location (track). Univariate by convention in ModelSkill; multivariate timeseries are assessed one variable at a time.

## Relationships

- An **Observation** and a **ModelResult** of compatible **gtype** match into a **Comparer** (or **ComparerCollection** for multiple observations).
- A **Field** (Grid / FM ModelResult) is *not* directly comparable to an Observation; `match()` extracts a **Timeseries** at each observation location.
- **Skill** is naturally per-Comparer (per-(obs, model) breakdown); **Score** is naturally per-ComparerCollection (cross-observation weighted average). Both methods exist on both classes for convenience.
- A **Comparer** of `gtype="vertical"` exposes a `.vertical` accessor for column-specific operations (`skill`, `plot.profile`, `plot.hovmoller`). *Alpha.*
- A **ComparerCollection** of vertical members will expose `cc.vertical.skill()` (release blocker, follow-up PR) returning a **SkillProfile** with `(z, observation, model)` dims. *Alpha.*

## Conventions

**z-axis direction**:
Default `positive="up"` (MIKE 3 convention — z=0 at datum, below-surface negative). Read from `z.attrs["positive"]` if set; user can override via the `positive` constructor kwarg. Plotter inverts the y-axis iff `positive="down"`. mikeio-sourced datasets do not currently carry this attribute.

**Flexible Mesh vs dfsu**:
*Flexible mesh* (or FM) is the geometry/engine concept; *dfsu* is the binary file format (extension `.dfsu`). Class names anchored to the file use `Dfsu*`; prose about the geometry uses "flexible mesh". Aligns with mikeio.

## Flagged ambiguities

- **"performance" vs "skill"**: Treated as aliases in user docs (terminology.qmd folds "performance" into Skill). Internally, prefer "skill".
- **"unstructured" vs "flexible mesh"**: The code's `GeometryType.UNSTRUCTURED` enum value is a legacy name. Prose and docs say "flexible mesh"; the enum rename was deferred (flag in API reviews, do not change unprompted).
- **(x, y) on a VerticalObservation vs VerticalModelResult**: legitimately differ. Obs holds the true CTD/instrument position; model holds the nearest mesh element center. No equality check is performed at match time — the mismatch is expected, not a bug. *Alpha.*
- **"profile"**: Rejected as an entity name — overloaded across `SkillProfile`, `plot.profile()`, Taylor profiles, and (informally) CTD output. The entity is **Vertical**; "profile" survives only when describing the *output* (e.g., `SkillProfile` reads as "skill as a function of depth"). *Alpha.*
- **3D dfsu → VerticalModelResult extraction**: not provided in-package. Extraction is an offline preprocessing step because it is slow (mesh navigation, sigma-z layer reconstruction). Users extract one column with mikeio, write a dfs0, and pass that to `VerticalModelResult`. *Alpha.*

## Example dialogue

> **Dev:** "I have a tide-gauge timeseries and a dfsu of surface elevation. What's the workflow?"
>
> **Maintainer:** "Wrap the gauge as a `PointObservation` and the dfsu as a `DfsuModelResult` (gtype `unstructured`, i.e. flexible mesh). Call `match()` — it'll extract a timeseries from the mesh at the gauge location and hand you back a `Comparer`. Then `cmp.skill()` gives you a `SkillTable` of metrics, or `cmp.score(metric=...)` for a single number."
>
> **Dev:** "And if I have ten gauges?"
>
> **Maintainer:** "`match()` with a list of observations returns a `ComparerCollection`. `cc.score(...)` is the natural one-number summary — it weights across observations. `cc.skill()` still gives you the per-observation breakdown."
>
> **Dev:** "Is my model better than just predicting the mean?"
>
> **Maintainer:** "Add a `DummyModelResult` as a baseline alongside your real model, run `match()`, and compare scores. NSE is the metric that's literally defined as 'how much better than the mean'."
Loading
Loading