
Pre-ranking for multivariate calibration assessment #1064

@sbfnk

Description


Summary

Implement pre-ranking methods from Gneiting et al. (2008) to assess calibration of multivariate forecasts via univariate rank histograms.

Motivation

When forecasting multiple correlated quantities (e.g., multiple locations, age groups, or forecast horizons), standard univariate calibration checks miss dependency structure. A forecast could be perfectly calibrated marginally (each location's PIT is uniform) but miss the correlation structure entirely. Gneiting et al. (2008) propose "pre-ranking" approaches that reduce multivariate calibration assessment to univariate ranks, which can then be visualised via rank histograms. This could complement the new forecast_multivariate_sample class by providing calibration diagnostics.

Pre-ranking methods

The paper describes two main pre-ranking approaches:

  1. Multivariate rank (Section 2.1): Based on pre-ranks using an orthant semi-ordering. Pool the observation with the samples and, for each point, count how many points (including itself) lie at or below it in every dimension; this count is the point's pre-rank. The observation's pre-rank is then ranked among all pre-ranks, which serves as the calibration metric. Ties are resolved at random.

  2. Minimum spanning tree (MST) rank (Section 2.2): Remove each point in turn and compute the MST length of the remaining points. The observation is then ranked by where its leave-one-out MST length falls among those of the samples.
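For concreteness, the two pre-ranking methods above can be sketched in a few lines. This is an illustrative Python sketch of the algorithms (function names are ours, not a proposed API), assuming `obs` is a length-d vector and `samples` an m x d matrix:

```python
# Illustrative sketch of the two pre-ranking methods in Gneiting et al. (2008).
# Function names are hypothetical; this is not the scoringutils API.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def multivariate_rank(obs, samples, rng=None):
    """Orthant pre-rank: rank of `obs` among the m + 1 pooled points (1..m+1)."""
    rng = np.random.default_rng() if rng is None else rng
    pts = np.vstack([obs, samples])                          # (m + 1, d)
    # le[i, j] is True iff pts[i] <= pts[j] in every dimension
    le = np.all(pts[:, None, :] <= pts[None, :, :], axis=2)
    prerank = le.sum(axis=0)                                 # counts include the point itself
    below = np.sum(prerank < prerank[0])
    ties = np.sum(prerank == prerank[0])
    return below + rng.integers(1, ties + 1)                 # ties resolved at random


def mst_rank(obs, samples, rng=None):
    """MST pre-rank: leave each point out, compare leave-one-out MST lengths."""
    rng = np.random.default_rng() if rng is None else rng
    pts = np.vstack([obs, samples])
    dist = squareform(pdist(pts))
    lengths = np.array([
        minimum_spanning_tree(np.delete(np.delete(dist, i, 0), i, 1)).sum()
        for i in range(len(pts))
    ])
    below = np.sum(lengths < lengths[0])
    ties = np.sum(lengths == lengths[0])
    return below + rng.integers(1, ties + 1)
```

Under calibration, either rank should be uniform on 1..m+1 across forecast units, which is what the rank histogram checks.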

Design options

This could be implemented in two ways:

Option A: Include in score()

Add coverage columns to score.forecast_multivariate_sample() output:

  • multivariate_coverage_50, multivariate_coverage_90, etc.
  • Each is 0/1 per forecast unit (like univariate interval_coverage_*)
  • Aggregation via summarise_scores() gives coverage proportions
  • Existing visualisation tools can display coverage deviations

This parallels how univariate interval coverage works and feels most natural.
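One plausible way to derive such 0/1 indicators from a multivariate rank (a sketch under our own assumptions; the function and column names are hypothetical, not part of scoringutils):

```python
# Sketch: turn a multivariate rank into central-coverage indicators.
# `rank` is the observation's rank among n = m + 1 pooled points (1..n),
# as produced by a pre-ranking method. Names here are hypothetical.
def multivariate_coverage(rank, n, level):
    """Return 1 if the rank falls in the central `level` region, else 0."""
    lower = n * (1 - level) / 2
    upper = n - lower
    return int(lower < rank <= upper)


# Example: per-unit 0/1 columns for a rank of 5 among 20 points
indicators = {
    f"multivariate_coverage_{int(100 * level)}": multivariate_coverage(5, 20, level)
    for level in (0.5, 0.9)
}
```

Averaging these indicators across forecast units (as summarise_scores() does for interval_coverage_*) would then give coverage proportions to compare against the nominal levels.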

Option B: Standalone function

get_multivariate_ranks(forecast_multivariate, method = "average_rank")

Returns the observation's rank among samples (1 to M+1). This can be passed to get_pit_histogram() for visualisation. Useful for detailed calibration diagnostics beyond coverage.
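To illustrate the downstream use, ranks collected across forecast units can be binned into a rank histogram; this is a plain numpy sketch of that step, not the package's get_pit_histogram():

```python
# Sketch: bin ranks from many forecast units into a rank histogram.
# Under calibration the relative frequencies should be roughly uniform.
import numpy as np


def rank_histogram(ranks, n_ranks):
    """Relative frequency of each rank 1..n_ranks across forecast units."""
    counts = np.bincount(np.asarray(ranks), minlength=n_ranks + 1)[1:]
    return counts / counts.sum()


# e.g. three forecast units, ranks among m + 1 = 4 pooled points
freq = rank_histogram([1, 4, 4], n_ranks=4)
```

A U-shaped histogram would indicate underdispersion (or missed correlation), a hump-shaped one overdispersion, mirroring the univariate PIT interpretation.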

Reference

Gneiting, T., Stanberry, L. I., Grimit, E. P., Held, L., & Johnson, N. A. (2008). Assessing probabilistic forecasts of multivariate quantities, with an application to ensemble predictions of surface winds. Test, 17(2), 211-235. https://doi.org/10.1007/s11749-008-0114-x -- available at https://stat.uw.edu/sites/default/files/files/reports/2008/tr537.pdf
