[1438] Adding padding to end_date to avoid duplicate samples #1749
enssow wants to merge 1 commit into ecmwf:develop
Conversation
Tested on SANTIS for:
Thanks @enssow for handling this issue. But I still do not know which strategy is better: should we pad with available dates, or decrease num_samples to fit the defined date range? What if there is no data after the defined end_date? It will throw an error or return empty tensors.
Can you run some inference with these options? I saw some unwanted behaviour there.
Description
TimeWindowHandler doesn't produce enough available forecast initialisation times to choose samples from when running inference on a model trained with $n_{fstep}$ forecast steps and $n_{samples} \cdot dt \geq t_{end} - t_{start}$.
Where $n_{fstep}=$ --forecast_steps, $n_{samples}=$ --samples, $dt=$ --step_hours, $t_{start}=$ --start, $t_{end}=$ --end.
See #1438 and #1085 for more info.
This PR provides this padding by working out how many individual initialisation times are available and adjusting the end of the time window to accommodate them, taking into account the extra time needed for the number of forecast steps to roll out to.
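A minimal sketch of the padding idea, not the code in this PR. The function name pad_end_date and its arguments are hypothetical, and the padding rule (extend the window to $t_{start} + (n_{samples} + n_{fstep}) \cdot dt$ whenever $n_{samples} \cdot dt \geq t_{end} - t_{start}$) is an assumption made for illustration. It does not check whether data actually exists after the original end date.

```python
from datetime import datetime, timedelta


def pad_end_date(t_start, t_end, step_hours, n_samples, n_fsteps):
    """Hypothetical helper: return an end time late enough for n_samples
    distinct initialisation times plus the forecast rollout."""
    dt = timedelta(hours=step_hours)
    # Condition from the description: the window is too short when
    # n_samples * dt >= t_end - t_start. Otherwise leave it unchanged.
    if n_samples * dt < t_end - t_start:
        return t_end
    # Assumed rule: one dt per sample, plus n_fsteps * dt for the rollout.
    return t_start + (n_samples + n_fsteps) * dt


start = datetime(2020, 1, 1)
end = datetime(2020, 1, 2)
# 8 samples at 6-hourly steps need 48 h, but the window is only 24 h long.
print(pad_end_date(start, end, step_hours=6, n_samples=8, n_fsteps=2))
# prints 2020-01-03 12:00:00
```

With n_samples=2 the same call returns the original end date, since 12 h of samples already fit inside the 24 h window.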
Issue Number
Closes #1438
Checklist before asking for review
./scripts/actions.sh lint
./scripts/actions.sh unit-test
./scripts/actions.sh integration-test
launch-slurm.py --time 60