[1438] Adding padding to end_date to avoid duplicate samples by enssow · Pull Request #1749 · ecmwf/WeatherGenerator

enssow · 2026-01-29T16:17:58Z

Description

TimeWindowHandler doesn't produce enough forecast initialisation times to draw samples from when running inference on a model trained with $n_{fstep}$ forecast steps and $n_{samples} \cdot dt \geq t_{end} - t_{start}$,
where $n_{fstep}=$--forecast_steps, $n_{samples}=$--samples, $dt=$--step_hours, $t_{start}=$--start, $t_{end}=$--end.
See #1438 and #1085 for more info.

This PR adds the padding by working out how many distinct initialisation times are available and adjusting the end of the time window to accommodate them, also taking into account the extra time needed for the forecast steps in the rollout.
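The padding idea can be illustrated with a minimal sketch. This is not the code in this PR: the function name `padded_end` and the convention that each sample starts one `step` after the previous one and then rolls out `n_fsteps` further steps are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def padded_end(start, end, n_samples, n_fsteps, step):
    """Hypothetical helper: return an end time late enough for
    n_samples distinct initialisation times plus the rollout horizon."""
    # Assumed convention: initialisation times are start, start + step, ...,
    # so n_samples of them plus n_fsteps rollout steps need this much time.
    required_end = start + (n_samples + n_fsteps) * step
    # Only ever extend the window, never shrink it.
    return max(end, required_end)


new_end = padded_end(
    start=datetime(2022, 1, 1),
    end=datetime(2022, 1, 2),
    n_samples=8,
    n_fsteps=2,
    step=timedelta(hours=6),
)
print(new_end)
```

If the requested window is already long enough, the function returns `end` unchanged, so the adjustment is a no-op for well-specified runs.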

Issue Number

Closes #1438

Checklist before asking for review

I have performed a self-review of my code
My changes comply with basic sanity checks:
- I have fixed formatting issues with ./scripts/actions.sh lint
- I have run unit tests with ./scripts/actions.sh unit-test
- I have documented my code and I have updated the docstrings.
- I have added unit tests, if relevant
I have tried my changes with data and code:
- I have run the integration tests with ./scripts/actions.sh integration-test
- (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
- (bigger changes and experiments) I have shared a hedgedoc in the github issue with all the configurations and runs for these experiments
I have informed and aligned with people impacted by my change:
- for config changes: the MatterMost channels and/or a design doc
- for changes of dependencies: the MatterMost software development channel

enssow · 2026-01-29T16:44:56Z

Tested on SANTIS for:

uv run inference --from-run-id f4duf5ji --samples 254 --streams-output ERA5 --options training_config.forecast.num_steps=5
uv run inference --from-run-id f4duf5ji --samples 254 --streams-output ERA5
Both now run without the duplication warning, and inference_id cixysv6l ran to completion successfully.

ankitpatnala · 2026-02-03T13:56:47Z

Thanks @enssow for handling this issue.
I tested the code using
srun uv run inference --from-run-id f4duf5ji --samples 10 -start="2022-10-01" -end="2022-10-02" --streams-output ERA5 --options training_config.forecast.num_steps=5
The code behaved as described.

But I still do not know which strategy is better: should we pad with available dates, or reduce num_samples to fit the defined date range? What if there is no data after the defined end_date? It will either throw an error or return empty tensors.
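For comparison, the alternative raised here (reducing the number of samples instead of padding the end date) could look roughly like the sketch below. The helper name `clamp_samples` is hypothetical, and it assumes that a run with `n` samples and `n_fsteps` forecast steps needs `(n + n_fsteps) * step` of data starting at `start`; neither is taken from the WeatherGenerator code.

```python
from datetime import datetime, timedelta


def clamp_samples(requested, start, end, n_fsteps, step):
    """Hypothetical helper: largest sample count that fits in [start, end]
    without reading past end, capped at the requested count."""
    # Time left for initialisation after reserving the rollout horizon.
    usable = (end - start) - n_fsteps * step
    # timedelta // timedelta yields an int; negative values mean nothing fits.
    fits = max(0, usable // step)
    return min(requested, fits)


n = clamp_samples(
    requested=10,
    start=datetime(2022, 10, 1),
    end=datetime(2022, 10, 2),
    n_fsteps=2,
    step=timedelta(hours=6),
)
print(n)
```

Clamping never reads data beyond `end_date`, at the cost of silently returning fewer samples than requested, so a warning would probably be needed in that case.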

ankitpatnala · 2026-02-03T14:26:49Z

Can you run inference with these options:
srun uv run inference --from-run-id f4duf5ji --samples 10 -start=2022-10-01 -end=2022-10-02 --streams-output ERA5 --options training_config.forecast.num_steps=10 training_config.forecast.time_step=03:00:00

I saw some unwanted behaviour there

Logging set up. Logs are in ./output/uv3yi4ac
DDP initialization: rank=0, world_size=1
Using adjusted end date 2022-10-03T00:00:00.000000000 instead of 2022-10-02T00:00:00.000000000
TimeWindowHandler: start=2022-10-01T00:00:00.000000000, end=2022-10-03T00:00:00.000000000, len=06:00:00, step=06:00:00

adding padding to end_date to avoid duplicate samples

6c26d0f

github-project-automation bot added this to WeatherGen-dev Jan 29, 2026

enssow wants to merge 1 commit into ecmwf:develop from enssow:sorcha/dev/1438
