Rewrite clim benchmark with 366-DOY forecast-style climatology#33
Rewrite clim benchmark with 366-DOY forecast-style climatology#33colonesej wants to merge 3 commits into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #33 +/- ##
==========================================
Coverage 100.00% 100.00%
==========================================
Files 2 5 +3
Lines 108 385 +277
==========================================
+ Hits 108 385 +277 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hi @andreas-grafberger, can I ask you to review the main requirements and test coverages in this PR!? Do they deviate too much from the previous implementation? Is it a good starting point in reviewing the |
There was a problem hiding this comment.
Pull request overview
This PR rewrites the hyve-clim-benchmark tool into a modular package centered on an explicit 366-DOY calendar model and forecast-like output structure, replacing the legacy monolithic script and adding targeted unit + end-to-end tests.
Changes:
- Replaced
src/hyve/tools/clim_benchmark.pywith a package implementation (config,dates,sampling,percentiles,io,cli) built around deterministic 366-DOY pooling and(doy, issue_hour, ensemble, *space)outputs. - Added CLI/config validation for stride/window, issue-frequency, and percentile selection; preserved input variable name in output.
- Added unit tests for date/index logic and end-to-end tests for key operational scenarios and leap-year semantics.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_clim_benchmark_end_to_end.py | End-to-end coverage for daily + sub-daily use-cases, output shape/attrs, leap-year behavior, and CLI smoke. |
| tests/test_clim_benchmark_dates.py | Unit coverage for timestep inference, DOY mapping/pools, issue-hour splitting, and config validation. |
| src/hyve/tools/clim_benchmark/sampling.py | Builds (doy, issue_hour) slots and gathers time samples for per-slot computations. |
| src/hyve/tools/clim_benchmark/percentiles.py | Computes per-slot quantiles and assembles the final climatology array. |
| src/hyve/tools/clim_benchmark/io.py | Adds auxiliary (month, day) coords and writes NetCDF with metadata attrs. |
| src/hyve/tools/clim_benchmark/dates.py | Core DOY logic: DOY mapping, pool construction with wrap-around, issue-hour splitting. |
| src/hyve/tools/clim_benchmark/config.py | Pydantic config + validation for window/stride, issue frequency, and percentiles. |
| src/hyve/tools/clim_benchmark/cli.py | CLI argument parsing and orchestration of the pipeline. |
| src/hyve/tools/clim_benchmark/init.py | Exposes main for the hyve-clim-benchmark entrypoint. |
| src/hyve/tools/clim_benchmark.py | Removes the legacy script implementation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| delayed_results: dict[Slot, xr.DataArray] = {} | ||
| for slot_key, indices in slots.items(): | ||
| sub = gather(da, indices) | ||
| delayed_results[slot_key] = dask.delayed(_quantile_slot)(sub, quantiles) | ||
|
|
||
| computed_list = dask.compute(*delayed_results.values()) | ||
| computed: dict[Slot, xr.DataArray] = dict(zip(delayed_results.keys(), computed_list)) | ||
|
|
||
| # Build the full output array: (doy, issue_hour, ensemble, *space). | ||
| issue_hours = sorted({ih for (_, ih) in slots.keys()}) | ||
| space_dims = [d for d in da.dims if d != "time"] | ||
| space_sizes = {d: da.sizes[d] for d in space_dims} | ||
| space_coords = {d: da.coords[d] for d in space_dims if d in da.coords} | ||
|
|
||
| out_shape = (366, len(issue_hours), len(percentiles), *space_sizes.values()) | ||
| out_dtype = next(iter(computed.values())).dtype if computed else da.dtype | ||
| data = np.full(out_shape, np.nan, dtype=out_dtype) |
| space_coords = {d: da.coords[d] for d in space_dims if d in da.coords} | ||
|
|
||
| out_shape = (366, len(issue_hours), len(percentiles), *space_sizes.values()) | ||
| out_dtype = next(iter(computed.values())).dtype if computed else da.dtype |
| ds_in = xr.open_dataset(input_path, chunks={}) | ||
| da = _select_variable(ds_in, variable) | ||
| if config.start_date is not None or config.end_date is not None: | ||
| da = da.sel(time=slice(config.start_date, config.end_date)) | ||
|
|
||
| time_index = pd.DatetimeIndex(da["time"].values) | ||
| timestep = infer_timestep(time_index) | ||
| timestep_hours = timestep.total_seconds() / 3600.0 | ||
| config.validate_against_data(timestep_hours) | ||
|
|
||
| logger.info( | ||
| "timestep=%s, window_days=%d, stride=%s, issue_frequency_hours=%d", | ||
| timestep, | ||
| config.window_days, | ||
| config.stride, | ||
| config.issue_frequency_hours, | ||
| ) | ||
|
|
||
| slots = build_slots( | ||
| da, | ||
| window_days=config.window_days, | ||
| stride=config.stride, | ||
| issue_frequency_hours=config.issue_frequency_hours, | ||
| ) | ||
| logger.info("Built %d (doy, issue_hour) slots", len(slots)) | ||
|
|
||
| clim_da = compute_climatology(da, slots, config.percentiles) | ||
| ds_out = build_output_dataset(clim_da, config, timestep) | ||
|
|
||
| if output_path: | ||
| logger.info("Writing %s", output_path) | ||
| write_netcdf(ds_out, output_path) | ||
| return ds_out |
| buckets: dict[int, list[int]] = {d: [] for d in range(1, 367)} | ||
| for i, d in enumerate(doys): | ||
| buckets[int(d)].append(i) | ||
| # Duplicate non-leap Dec 31 into DOY 366. | ||
| non_leap_dec31 = np.where( | ||
| (~is_leap_year(years)) & (months == 12) & (days == 31) | ||
| )[0] | ||
| for i in non_leap_dec31: | ||
| buckets[366].append(int(i)) | ||
|
|
||
| return {d: np.asarray(sorted(ix), dtype=np.int64) for d, ix in buckets.items()} |
| src_doy = ((target_doy - 1 + int(off)) % 366) + 1 | ||
| parts.append(doy_to_idx[src_doy]) | ||
| if parts: | ||
| pools[target_doy] = np.concatenate(parts) |
…-doy-rewrite Reduce memory usage and add better error message
What (edited)
The main goal of this script is to create an ensemble forecast-like dataset from the reanalysis dataset. This forecast represents the climatology for each pixel grid where reanalysis is available. to build an ensemble, different climatology percentiles are used, for intance with 10 ensemble members, we would have their values as the climatological 10th, 20th, ...., 100th quantile.
The script should build one forecast for different day-of-years.
Sampling is determined by
windowandstride, where the first represents the number of steps around the targetdoyto sample from, and stripe represents the frequency inside this window to sample (every day, week or month)For sub-daily data (like EFAS), we can sample centering on the usual forecast issue time (00 and 12) then, producing slightly different windows for each.
Computationally the strategy is to manually define computing blocks of required indices for each targeted step, then computing each block separately and keep memory footprint under control.
The simplest case, is daily data, daily window, 1 issue time, at 00, one quantile (median) and one leadtime step. This should produce the climatological median for each day of the reanalysis. It can be verified using GloFAS data.
Lead years: We are changing a bit the leap-year convention from the original script, meaning this should always produce a 366 days output, where 29thFeb is assigned its natural
doyof 60 (will match 1st of March in non-leap years). the 366thdoyalways receive the value valid for the 31st of December, either a unique value or a duplicate for 365th in non-leap years.Why
The previous
hyve-clim-benchmarkdate handling was brittle (leap-day behavior, hardcoded output calendar assumptions, stride semantics tied to raw timestep indexing). This PR rewrites the tool around explicit 366-day DOY logic and forecast-like output structure.Original plan (executed)
(doy, issue_hour, ensemble, *space).What changed
src/hyve/tools/clim_benchmark.pywith package modules:src/hyve/tools/clim_benchmark/config.pysrc/hyve/tools/clim_benchmark/dates.pysrc/hyve/tools/clim_benchmark/sampling.pysrc/hyve/tools/clim_benchmark/percentiles.pysrc/hyve/tools/clim_benchmark/io.pysrc/hyve/tools/clim_benchmark/cli.pysrc/hyve/tools/clim_benchmark/__init__.pyDate and calendar model
doy=1..366.Dec 31contributes to both DOY 365 and DOY 366 pool (fallback semantics).CLI and configuration
--stride {daily,weekly,monthly}.dailyrequireswindow_days >= 1weeklyrequireswindow_days >= 7monthlyrequireswindow_days >= 30--issue-frequency(hours, divisor of 24) and validation against inferred input timestep.--percentileslist.dis).Output semantics
(doy, issue_hour, ensemble, *space).monthanddaycoordinates on DOY (Option A decision).time_unit,window_days,stride,issue_frequency_hours,percentiles.Computation strategy
dask.delayedpercentile computation withchunk({'time': -1})for bounded memory (Option A decision).Tests added
tests/test_clim_benchmark_dates.pytests/test_clim_benchmark_end_to_end.pyCoverage includes:
Validation
62 passed.Notes
hyve-clim-benchmarkconsistent with the rewrite goals.