Description
@DCaviedesV, @s-poll, @kvrigor, @jjokella
While running this test case, I observed a lack of reproducibility between identical runs when using a binary built with intel.psmpi. Two runs performed using JUBE with the same configuration and model_id = icon-eclm produce different results. The same behaviour is observed for model_id = icon-eclm-parflow, where repeated runs with identical settings (using JUBE) also produce different outputs.
Below, I present the maximum absolute error for each variable, computed by comparing the outputs of two identical runs for model_id=icon-eclm-parflow and model_id=icon-eclm:
| variable | model_id=icon-eclm-parflow max. abs. error | model_id=icon-eclm max. abs. error |
|---|---|---|
| ICON | ||
| w error = | 1.0482156277 | 1.1639280319 |
| theta_v error = | 5.6016540527 | 5.4000854492 |
| qv error = | 0.0038429732 | 0.0035402337 |
| shfl_s error = | 47.2150611877 | 131.6409301758 |
| lhfl_s error = | 60.2069587708 | 414.4877624512 |
| eCLM | ||
| TWS error = | 9.8613281250 | 10.4727783203 |
| H2OSOI error = | 0.0715191215 | 0.1653197557 |
| TSOI error = | 3.6534118652 | 5.7122192383 |
| TG error = | 3.6873474121 | 6.1456604004 |
| EFLX_LH_TOT error = | 192.7704467773 | 197.8242187500 |
| FSH error = | 148.3424072266 | 218.8023071289 |
| FSA error = | 271.3027954102 | 328.3160400391 |
| FSR error = | 131.1349182129 | 141.3157348633 |
| FIRA error = | 51.1596603394 | 75.3006668091 |
| Rnet error = | 220.1431274414 | 254.9673156738 |
| EFLX_SOIL_GRND error = | 108.2931365967 | 113.7866668701 |
| ParFlow | ||
| pressure error = | 3460.3571839130 | - |
| saturation error = | 0.2303589024 | - |
| evaptrans error = | 0.5114949301 | - |
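
For reference, the max. abs. error values above can be thought of as the largest pointwise difference between the same variable in two runs. The following is only a minimal sketch of that comparison, not the script used to produce the table: the helper names `max_abs_error` and `compare_runs` are hypothetical, the fields are assumed to be already read from the output files and flattened into sequences of floats, and the sample values are made up for illustration.

```python
import math
from typing import Dict, Sequence


def max_abs_error(run_a: Sequence[float], run_b: Sequence[float]) -> float:
    """Largest |a - b| over paired values; 0.0 means no difference was found."""
    if len(run_a) != len(run_b):
        raise ValueError("fields must have the same number of values")
    worst = 0.0
    for x, y in zip(run_a, run_b):
        if math.isnan(x) and math.isnan(y):
            continue  # assumption: matching fill values count as identical
        if math.isnan(x) or math.isnan(y):
            return math.inf  # a NaN in only one run is treated as a mismatch
        worst = max(worst, abs(x - y))
    return worst


def compare_runs(
    run_a: Dict[str, Sequence[float]], run_b: Dict[str, Sequence[float]]
) -> Dict[str, float]:
    """Max. abs. error per variable, for variables present in both runs."""
    shared = sorted(run_a.keys() & run_b.keys())
    return {name: max_abs_error(run_a[name], run_b[name]) for name in shared}


# Illustrative values only, not taken from the actual model output.
first = {"w": [0.0, 1.0, 2.0], "qv": [0.25, 0.5]}
second = {"w": [0.0, 1.5, 2.0], "qv": [0.25, 0.5]}
for name, err in compare_runs(first, second).items():
    print(f"{name} error = {err:.10f}")
```

A result of exactly 0.0 for every variable is what the reproducible builds would be expected to give under this measure.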
We need some insight into whether the magnitude of these errors is acceptable and expected for this type of scenario.
ICON units: w = [m/s], theta_v = [K], qv = [kg/kg], shfl_s/lhfl_s = [W/m2]
eCLM variables units here
NOTE: Tests were carried out on JURECA-DC with Stages/2025.
NOTE1: This does not occur for model_id = eclm-parflow, or when running only one component (eclm, icon, or parflow).
NOTE2: In contrast, this non-reproducibility is not observed when using executables built with gnu.openmpi or gnu.psmpi. For these builds, repeated runs with identical configurations are reproducible for both model_id values.
NOTE3: Try another test case.