Non-reproducibility when using icon + oasis and compiled with intel.psmpi #20

@AGonzalezNicolas

@DCaviedesV, @s-poll, @kvrigor, @jjokella

While running this test case, I observed a lack of reproducibility between identical runs when using a binary built with intel.psmpi. Two runs performed using JUBE with the same configuration and model_id = icon-eclm produce different results. The same behaviour is observed for model_id = icon-eclm-parflow, where repeated runs with identical settings (using JUBE) also produce different outputs.

Below, I present the maximum absolute error for each variable, computed by comparing the outputs of two identical runs for model_id = icon-eclm-parflow and model_id = icon-eclm:

| Component | Variable | Max. abs. error (model_id = icon-eclm-parflow) | Max. abs. error (model_id = icon-eclm) |
|---|---|---|---|
| ICON | w | 1.0482156277 | 1.1639280319 |
| ICON | theta_v | 5.6016540527 | 5.4000854492 |
| ICON | qv | 0.0038429732 | 0.0035402337 |
| ICON | shfl_s | 47.2150611877 | 131.6409301758 |
| ICON | lhfl_s | 60.2069587708 | 414.4877624512 |
| eCLM | TWS | 9.8613281250 | 10.4727783203 |
| eCLM | H2OSOI | 0.0715191215 | 0.1653197557 |
| eCLM | TSOI | 3.6534118652 | 5.7122192383 |
| eCLM | TG | 3.6873474121 | 6.1456604004 |
| eCLM | EFLX_LH_TOT | 192.7704467773 | 197.8242187500 |
| eCLM | FSH | 148.3424072266 | 218.8023071289 |
| eCLM | FSA | 271.3027954102 | 328.3160400391 |
| eCLM | FSR | 131.1349182129 | 141.3157348633 |
| eCLM | FIRA | 51.1596603394 | 75.3006668091 |
| eCLM | Rnet | 220.1431274414 | 254.9673156738 |
| eCLM | EFLX_SOIL_GRND | 108.2931365967 | 113.7866668701 |
| ParFlow | pressure | 3460.3571839130 | - |
| ParFlow | saturation | 0.2303589024 | - |
| ParFlow | evaptrans | 0.5114949301 | - |
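
For reference, the metric reported above can be sketched as follows. This is only a minimal illustration, assuming each variable has already been read from the two runs' output files and flattened into a sequence of floats; the function name `max_abs_error` and the handling of NaN fill values are assumptions, not the script actually used for this issue.

```python
import math
from typing import Iterable


def max_abs_error(run_a: Iterable[float], run_b: Iterable[float]) -> float:
    """Return max |a - b| over paired values from two runs.

    Pairs where both values are NaN (e.g. matching fill values) are skipped.
    If only one value of a pair is NaN, the result is NaN.
    """
    a = list(run_a)
    b = list(run_b)
    if len(a) != len(b):
        raise ValueError("both runs must provide the same number of values")

    worst = 0.0
    for x, y in zip(a, b):
        if math.isnan(x) and math.isnan(y):
            continue  # masked in both runs: nothing to compare
        diff = abs(x - y)
        if math.isnan(diff):
            return math.nan  # masked in only one run: flag it
        worst = max(worst, diff)
    return worst
```

A result of exactly `0.0` for every variable would indicate that the two runs agree in every compared value; any non-zero value means the runs have diverged.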

We would appreciate some insight into whether errors of this magnitude are acceptable and expected for this type of scenario.
ICON units: w [m/s], theta_v [K], qv [kg/kg], shfl_s/lhfl_s [W/m2]
eCLM variables units here

NOTE: Tests were carried out on JURECA-DC with Stages/2025.

NOTE1: This does not occur for model_id = eclm-parflow, or when running only one component (eclm, icon, or parflow).

NOTE2: In contrast, this non-reproducibility is not observed when using executables built with gnu.openmpi or gnu.psmpi. For these builds, repeated runs with identical configurations are reproducible for both model_id values.

NOTE3: Try another test case.
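
A quick way to check whether two runs are reproducible, before computing per-variable errors, is to compare the output files byte by byte. The sketch below is only an illustration: the helper names `file_digest` and `differing_files` and the `*.nc` file pattern are assumptions, and a byte-level comparison may also flag files that differ only in metadata (for example creation timestamps), so a mismatch should be followed up with a per-variable comparison.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def differing_files(dir_a, dir_b, pattern="*.nc"):
    """Return sorted names of files that are missing in one run directory
    or whose contents differ between the two run directories."""
    a = {p.name: p for p in Path(dir_a).glob(pattern)}
    b = {p.name: p for p in Path(dir_b).glob(pattern)}

    # Files present in only one of the two directories.
    mismatched = set(a) ^ set(b)

    # Files present in both but with different contents.
    for name in set(a) & set(b):
        if file_digest(a[name]) != file_digest(b[name]):
            mismatched.add(name)
    return sorted(mismatched)
```

An empty list means every matched file is byte-identical between the two run directories.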
