System Tests#

CI System Tests on External Machines by Builder

Machine: balfrin	alps_mch_test_cpu	alps_mch_test_gpu	balfrin_cpu_nvidia	balfrin_cpu_nvidia_mixed	balfrin_gpu_nvidia	balfrin_gpu_nvidia_mixed
`check.c2sm_clm_r13b03_seaice`	✔︎	✔︎	✔︎	✔︎	✔︎	✔︎
`check.mch_ch_r04b09_dace`			✔︎		✔︎
`check.mch_icon-ch1_small`	✔︎	✔︎	✔︎	✔︎	✔︎	✔︎
`check.mch_icon-ch2_small`	✔︎	✔︎	✔︎	✔︎	✔︎	✔︎
`check.mch_kenda-ch1_dev2_small`	✔︎	✔︎	✔︎		✔︎
`check.mch_kenda-ch1_dev_small`	✔︎	✔︎	✔︎		✔︎
`check.mch_kenda-ch1_small`	✔︎	✔︎	✔︎	✔︎	✔︎	✔︎
`check.mch_opr_r04b07`			✔︎		✔︎
`check.mch_opr_r04b07_lhn_00`			✔︎		✔︎
`check.mch_opr_r04b07_lhn_12_emvorado`			✔︎		✔︎
`check.mch_opr_r04b07_nest`			✔︎		✔︎
`check.mch_opr_r04b07_performance`						✔︎
`check.mch_opr_r04b07_sstice_inst`	✔︎	✔︎	✔︎	✔︎	✔︎	✔︎
`check.mch_opr_r19b07_lpi`			✔︎		✔︎
`check.mch_opr_r19b07_midnight`	✔︎		✔︎	✔︎
`check.mch_opr_r19b07_turb`			✔︎		✔︎
`check.mch_pollen_test`	✔︎	✔︎	✔︎	✔︎	✔︎	✔︎

Machine: breeze

breeze_gcc

breeze_intel

breeze_nag

check.atm_qubicc_short

✔︎

exp.icon-testbed_sync

✔︎

exp.ocean_omip_short_r2b4

✔︎

Machine: dwd_nec	dwd_nec_coupling	dwd_nec_hybrid	dwd_nec_oper
`check.ICON_CLM`		✔︎
`check.atm_nwp_jsbach_test`		✔︎	✔︎
`check.atm_tracer_df4`			✔︎
`check.dwd_art_dustrad_R2B4N5`			✔︎
`check.waves_R2B4_global_no_forcing`		✔︎
`exp.check_externals_DWD.run`	✔︎
`exp.check_icon`	✔︎
`nwpexp.build_comin_plugins_DWD_nec`	✔︎
`exp.icon-testbed_sync`	✔︎
`nwpexp.run_ICON-ART_01_R3B08_lam_initmode7_pollen`			✔︎
`nwpexp.run_ICON_01_R3B9_lam`			✔︎
`nwpexp.run_ICON_02_R2B13_lam`			✔︎
`nwpexp.run_ICON_03_R19B7N8-ID2_ID1_lam`			✔︎
`nwpexp.run_ICON_05_R02B06N08_ifsinit_restarttest`			✔︎
`nwpexp.run_ICON_06_R02B06N07_UPATMO_ifsinit_restarttest`			✔︎
`nwpexp.run_ICON_07_R02B04N06M_restarttest`			✔︎
`nwpexp.run_ICON_08_R19B7-ID2_oper`			✔︎
`nwpexp.run_ICON_09_R2B6N7_oper_EPS`			✔︎
`nwpexp.run_ICON_11_R3B08_lam_initmode7_restarttest`		✔︎	✔︎
`nwpexp.run_ICON_12_R3B08_lam_initmode4`			✔︎
`nwpexp.run_ICON_13_R2B08-dkltest`			✔︎
`nwpexp.run_ICON_14_R2B6N7_oper_IAU_and_restarttest`		✔︎	✔︎
`nwpexp.run_ICON_15_R19B7-ID2_ass`			✔︎
`nwpexp.run_ICON_16_R19B7-ID2_ass`			✔︎
`nwpexp.run_ICON_17_R2B4_AO_coupled`	✔︎
`nwpexp.run_ICON_18_R2B4_waves_adv_nophys`		✔︎
`nwpexp.run_ICON_19_R2B4_cmip_forcing`		✔︎	✔︎
`nwpexp.run_ICON_20_R2B4_R2B6_AO_coupled`	✔︎
`nwpexp.run_ICON_21_R2B4_waves_standalone_restart`		✔︎
`nwpexp.run_ICON_22_R3B08_lam_SBM_initmode4`			✔︎
`nwpexp.run_ICON_23_R2B4_atmo_waves_coupled`	✔︎
`nwpexp.run_ICON_25_R19B7_RUC_fc`			✔︎
`nwpexp.run_ICON_26_R2B4_R2B6_AO-HD_coupled`	✔︎
`nwpexp.run_ICON_27_R2B6_mvstream_qbudget`			✔︎
`nwpexp.run_ICON_28_R2B6_IAU_atmo_waves_coupled`	✔︎
`nwpexp.run_ICON_29_R2B5_R2B6_AO_coupled_gribInitOce`	✔︎
`exp.run_ICON-SCM_01_BOMEX.run`			✔︎
`exp.run_ICON-SCM_02_REAL.run`			✔︎
`exp.run_ICON-SCM_03_LANFEX.run`			✔︎

Machine: horeka	horeka_cpu_nvhpc	horeka_gcc	horeka_gpu_nvhpc	horeka_intel
`check.dwd_art_dustrad_R2B4N5`	✔︎	✔︎		✔︎
`check.dwd_run_ICON_09_R2B4N5_EPS`	✔︎		✔︎
`check.mch_ch_r04b09_dace_synsat`	✔︎		✔︎
`exp.check_icon`	✔︎	✔︎	✔︎	✔︎
`exp.dwd_run_ICON_09_R2B4N5_EPS`		✔︎		✔︎

Machine: levante	levante_aurora	levante_cpu_nvhpc	levante_gcc	levante_gcc_hybrid	levante_gpu_nvhpc	levante_gpu_validation	levante_intel	levante_intel_hybrid	levante_intel_hybrid_mixed	levante_intel_pio	levante_nag	levante_nag_serial
`exp.art_levante_test.run`			✔︎
`exp.art_levante_test_short.run`			✔︎
`bubble.config`								✔︎
`exp.test_coupled_160kmNestedAtm_40kmOce`				✔︎							✔︎
`test_amip-aes_R2B6_gpu.config`					✔︎
`check.ICON_CLM`		✔︎		✔︎	✔︎		✔︎
`clmexp.ICON_CLM_global_mean_no_boundary`				✔︎				✔︎
`clmexp.ICON_CLM_output_coupling`				✔︎			✔︎
`check.aes_amip_pfts_test`			✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.aes_amip_test`			✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.atm_ape`			✔︎	✔︎			✔︎	✔︎	✔︎		✔︎
`check.atm_bubble_land_test`		✔︎	✔︎	✔︎			✔︎	✔︎	✔︎		✔︎	✔︎
`check.atm_bubble_test`		✔︎	✔︎	✔︎			✔︎	✔︎	✔︎		✔︎	✔︎
`check.atm_cph_nest_R2B4_test`			✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.atm_heldsuarez`			✔︎	✔︎			✔︎	✔︎	✔︎		✔︎	✔︎
`check.atm_nwp_jsbach_test`			✔︎		✔︎		✔︎				✔︎
`check.atm_qubicc`		✔︎	✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.atm_qubicc_grb`		✔︎	✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.atm_qubicc_nofor`		✔︎	✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.atm_qubicc_onlyfor`		✔︎	✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.atm_qubicc_pfts`		✔︎	✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.atm_qubicc_test_aero`		✔︎	✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.atm_qubicc_tmx`		✔︎	✔︎	✔︎	✔︎		✔︎	✔︎	✔︎		✔︎
`check.atm_sma`			✔︎	✔︎			✔︎	✔︎			✔︎
`check.atm_tracer_df4`								✔︎			✔︎
`check.land_quincy_canopy_test`			✔︎		✔︎			✔︎			✔︎
`check.waves_R2B4_global_no_forcing`											✔︎
`exp.aes_bubble_test.run`	✔︎
`exp.host_and_vector_only_tests.run`	✔︎
`exp.check_externals_LEVANTE.run`		✔︎	✔︎	✔︎			✔︎	✔︎	✔︎	✔︎	✔︎	✔︎
`exp.check_externals_LEVANTE_gpu.run`					✔︎
`exp.check_icon`		✔︎	✔︎	✔︎	✔︎		✔︎	✔︎	✔︎	✔︎	✔︎	✔︎
`exp.atm_tracer_Hadley_comin_portability`		✔︎			✔︎
`exp.build_comin_plugins_LEVANTE.run`			✔︎	✔︎
`exp.run_ICON_17_R2B4_AO_coupled_LEVANTE.run`			✔︎					✔︎
`exp.atm_memLog`			✔︎					✔︎
`exp.atm_memLog_AsyncIO`			✔︎					✔︎
`exp.oce_memLog`			✔︎					✔︎
`exp.mkexp_log_monitoring.run`								✔︎
`exp.atm_amip_R2B4_1day`										✔︎
`exp.atm_amip_R2B4_1day_pio`										✔︎
`exp.esm_bb_ruby0_check_output_LEVANTE.run`			✔︎
`exp.hamocc_omip_10days`										✔︎
`exp.icon-testbed_communication_orig`					✔︎			✔︎
`exp.icon-testbed_communication_yaxt`					✔︎			✔︎
`exp.icon-testbed_sync`	✔︎	✔︎	✔︎		✔︎			✔︎			✔︎
`exp.ac3_les_20210211.run`								✔︎
`exp.test_atm_nwp_yac_o3coupling.run`							✔︎
`nwpexp.run_ICON_18_R2B4_waves_adv_nophys`								✔︎			✔︎
`nwpexp.run_ICON_19_R2B4_cmip_forcing`				✔︎				✔︎
`nwpexp.run_ICON_21_R2B4_waves_standalone_restart`								✔︎
`nwpexp.run_ICON_23_R2B4_atmo_waves_coupled`			✔︎				✔︎
`exp.ocean_WilliamsonTestCase2_Hex`			✔︎	✔︎			✔︎	✔︎			✔︎	✔︎
`exp.ocean_omip_R2B4_V0_GM0`		✔︎			✔︎
`exp.ocean_omip_R2B4_V0_GM0_comparison.run`						✔︎
`exp.ocean_omip_R2B4_V0_GM0_cpurun.run`						✔︎
`exp.ocean_omip_R2B4_V0_GM0_cpurun_lvec.run`						✔︎
`exp.ocean_omip_R2B4_V0_GM0_gpurun.run`						✔︎
`exp.ocean_omip_R2B4_V0_GM0_gpurun_lvec.run`						✔︎
`exp.ocean_omip_R2B4_V0_GM0_icono`		✔︎			✔︎
`exp.ocean_omip_R2B4_V1_GM0`		✔︎			✔︎
`exp.ocean_omip_R2B4_V1_GM0_comparison.run`						✔︎
`exp.ocean_omip_R2B4_V1_GM0_cpurun.run`						✔︎
`exp.ocean_omip_R2B4_V1_GM0_cpurun_lvec.run`						✔︎
`exp.ocean_omip_R2B4_V1_GM0_gpurun.run`						✔︎
`exp.ocean_omip_R2B4_V1_GM0_gpurun_lvec.run`						✔︎
`exp.test_concurrent_hamocc_omip_10days`			✔︎	✔︎			✔︎	✔︎			✔︎
`exp.test_hamocc_omip_10days`			✔︎	✔︎			✔︎	✔︎			✔︎
`exp.test_ocean_newice_omip_10days`			✔︎	✔︎			✔︎	✔︎			✔︎
`exp.test_ocean_omip_10days`			✔︎	✔︎			✔︎	✔︎			✔︎
`exp.test_ocean_zstar_omip_10days`			✔︎	✔︎			✔︎	✔︎			✔︎
`exp.ocean_omip_ptest`			✔︎	✔︎			✔︎	✔︎			✔︎
`exp.test_multioutput_model_40km`			✔︎				✔︎				✔︎
`exp.test_ocean_omip_technical`			✔︎	✔︎			✔︎	✔︎			✔︎
`exp.atm_tracer_Hadley`		✔︎			✔︎
`exp.esm_bb_ruby0`			✔︎	✔︎			✔︎	✔︎	✔︎	✔︎
`exp.esm_bb_ruby0_pio`										✔︎
`exp.seamless_bb-ecradmin`			✔︎	✔︎			✔︎	✔︎
`exp.seamless_bb-proto1`			✔︎	✔︎			✔︎	✔︎			✔︎
`exp.seamless_bb-proto2`			✔︎	✔︎			✔︎	✔︎
`exp.test_nwp_R02B04N06multi`			✔︎	✔︎			✔︎	✔︎	✔︎
`exp.test_nwp_R02B04_R02B05_nest`			✔︎	✔︎			✔︎	✔︎	✔︎
`exp.test_nwp_R02B04_R02B05_nest_comin`			✔︎	✔︎
`exp.test_nwp_R02B04_R02B05_nest_comin_python`			✔︎	✔︎
`test_nextGEMS.config`								✔︎
`test_yaxt_xchange.config`								✔︎

Machine: lumi	lumi_cpu	lumi_gpu
`check.aes_amip_pfts_test`	✔︎	✔︎
`check.aes_amip_test`	✔︎	✔︎
`check.atm_bubble_test`	✔︎	✔︎
`check.atm_heldsuarez`	✔︎	✔︎
`check.atm_qubicc`	✔︎	✔︎
`check.atm_qubicc_grb`	✔︎	✔︎
`check.atm_qubicc_nofor`	✔︎	✔︎
`check.atm_qubicc_onlyfor`	✔︎	✔︎
`check.atm_qubicc_pfts`	✔︎	✔︎
`check.atm_qubicc_test_aero`	✔︎	✔︎
`check.atm_qubicc_tmx`	✔︎	✔︎
`exp.check_icon`	✔︎	✔︎
`exp.icon-testbed_sync`	✔︎	✔︎
`exp.ocean_WilliamsonTestCase2_Hex`	✔︎
`exp.test_ocean_zstar_omip_10days`	✔︎
`exp.atm_tracer_Hadley_R2B4`	✔︎	✔︎

Machine: mpimac	mpimac_gcc
`check.atm_qubicc_short`	✔︎
`exp.ocean_omip_short_r2b4`	✔︎

Machine: santis	santis_cpu_nvhpc	santis_gpu_nvhpc
`check.c2sm_clm_r13b03_seaice`	✔︎	✔︎
`check.dwd_run_ICON_09_R2B4N5_EPS`	✔︎	✔︎
`check.exclaim_ape_R02B04`	✔︎	✔︎
`check.mch_icon-ch1_small`	✔︎	✔︎
`check.mch_icon-ch2_small`	✔︎	✔︎
`check.mch_opr_r04b07_sstice_inst`	✔︎	✔︎
`check.mch_opr_r19b07_2m_expl`	✔︎	✔︎
`check.mch_opr_r19b07_2m_impl`	✔︎	✔︎
`check.mch_opr_r19b07_lpi`	✔︎	✔︎
`check.mch_opr_r19b07_turb`	✔︎	✔︎

ICON Development Checksuite#

The ICON development checksuite (`icon-dev.checksuite`) defines a set of generic system tests that can be applied to any ICON configuration or experiment. One or more checksuite flags (defined below), set in the `check.<exp-name>` file, indicate which system tests are applied to the experiment. The `check.<exp-name>` file may also be used to supply experiment-specific input. See a list of checksuite experiments here.

Note: The checksuite is currently compatible only with make_runscripts, not with mkexp.

The central element of the checksuite is the base (b) test, which runs a simulation of the experiment and fails only if the simulation crashes (i.e., a smoke test). Most of the other tests use this first simulation (referred to hereafter as the base simulation) as a reference for comparing results; see, for example, the restart test below. Note that:

If check.<exp-name> does not define 'b' among its flags, the base simulation still runs in order to have a reference for other tests.
If there are several tests defined in check.<exp-name>, the base simulation runs once and the same base simulation is used as reference for all tests.
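
The two rules above can be sketched as a small planning function. This is only an illustration, not the checksuite's actual implementation: the function name `plan_checks`, the returned dictionary layout, and the idea of passing the flags as one string are assumptions; the flag letters are the ones used in the test descriptions in this document.

```python
# Flag letters as used in the test descriptions of this document.
KNOWN_FLAGS = set("bptugcrmno")


def plan_checks(flags: str) -> dict:
    """Sketch: decide what to run for one experiment (hypothetical helper)."""
    unknown = sorted(set(flags) - KNOWN_FLAGS)
    if unknown:
        raise ValueError(f"unknown checksuite flags: {unknown}")
    # Drop duplicate letters while keeping their order.
    requested = list(dict.fromkeys(flags))
    return {
        # The base simulation runs exactly once, even if 'b' is not requested.
        "base_simulation_runs": 1,
        # Its crash/no-crash outcome counts as a test only if 'b' is set.
        "base_counts_as_test": "b" in requested,
        # Every other requested test shares that single base simulation.
        "other_tests": [f for f in requested if f != "b"],
    }
```

For example, `plan_checks("rn")` still schedules one base simulation, does not report it as a test, and lists the restart and nproma tests as its consumers.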

Performance Tests#

Performance (p) test: Measures the total runtime of the simulation and compares it to a stored reference value. The test fails if the current run is more than 10% slower than the reference.
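
As a rough illustration of the pass/fail rule, the check below reads "more than 10% slower" as "runtime exceeds the reference runtime by more than 10%". That reading, the function name, and the use of seconds as the unit are assumptions made for this sketch.

```python
def performance_test_passes(runtime_s: float, reference_s: float,
                            max_slowdown: float = 0.10) -> bool:
    """Sketch: pass unless the run is more than max_slowdown slower."""
    if reference_s <= 0:
        raise ValueError("reference runtime must be positive")
    # Allowed runtime is the reference plus the permitted slowdown.
    return runtime_s <= reference_s * (1.0 + max_slowdown)
```

With a reference of 100 s, a run of 108 s would pass under this reading and a run of 115 s would fail.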

Regression Tests#

Tolerance (t) test: Uses probtest to generate statistics on the output of the base simulation. These statistics are then compared to a set of stored reference values, and the test checks that all deviations fall within predefined tolerance intervals. Both the reference values and their associated tolerances are stored in advance.
Update (u) test: Checks bit-identity of results of the base simulation with stored reference values.
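
The difference between the two regression checks can be sketched as follows. This is not probtest's API: the function names, the dictionaries keyed by variable name, and the use of an absolute deviation are assumptions chosen to keep the example short.

```python
import struct


def tolerance_failures(stats: dict, reference: dict, tolerance: dict) -> dict:
    """Sketch of a tolerance check: return the variables that deviate too much."""
    failures = {}
    for name, ref_value in reference.items():
        deviation = abs(stats[name] - ref_value)
        if deviation > tolerance[name]:
            failures[name] = deviation
    return failures


def bit_identical(a: float, b: float) -> bool:
    """Sketch of a bit-identity check on two double-precision values."""
    # Compare the raw 8-byte representation, not the numeric value.
    return struct.pack("<d", a) == struct.pack("<d", b)
```

A tolerance check accepts small deviations, whereas bit-identity is stricter than numeric equality: `0.0 == -0.0` is true in Python, but `bit_identical(0.0, -0.0)` is false because the sign bit differs.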

Runtime Configuration Tests#

CUDA Graph (g) test: Runs the simulation with CUDA Graphs enabled and compares the output to the base simulation to verify correctness. This tests compatibility and stability of the code when GPU graph execution is active.

Sanitizer Tests#

Compute Sanitizer (c) test: Runs the simulation under NVIDIA’s Compute Sanitizer to detect memory access errors, uninitialized memory usage, synchronization issues, and potential data races on the GPU. This test is useful for debugging low-level GPU issues but can significantly increase runtime and produce very large log files.

Technical Feature Tests#

Restart (r) test: Tests the “restart” feature by writing a checkpoint file at the midpoint of the base simulation and then running a second simulation that starts from that checkpoint. The test then checks that the results of the second simulation are bit-identical to those of the second half of the base simulation.
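
The logic of the restart test can be illustrated with a toy deterministic model. Everything here is hypothetical (the `step` update rule, the integer state, and the function names); it only mirrors the procedure described above, not how ICON writes or reads checkpoints.

```python
def step(state: int) -> int:
    # Toy deterministic "model": one integer update per time step.
    return (state * 1103515245 + 12345) % 2**31


def run(state: int, n_steps: int) -> list:
    """Advance the toy model and return the state after every step."""
    outputs = []
    for _ in range(n_steps):
        state = step(state)
        outputs.append(state)
    return outputs


def restart_test(initial_state: int, n_steps: int) -> bool:
    if n_steps < 2:
        raise ValueError("need at least two steps to restart at the midpoint")
    half = n_steps // 2
    base_outputs = run(initial_state, n_steps)   # base simulation
    checkpoint = base_outputs[half - 1]          # state saved at the midpoint
    restarted = run(checkpoint, n_steps - half)  # second simulation
    # Compare the restarted run with the second half of the base run.
    return restarted == base_outputs[half:]
```

Because the toy model's state is fully captured by the checkpoint, the comparison succeeds; a real restart test fails when some part of the model state is missing from the checkpoint file.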

Technical Decomposition Tests#

mpi (m) test: Runs a simulation with modified MPI settings. Specifically, it reduces the number of MPI processes per node by 1 if the number is greater than 1; otherwise, it reduces the number of nodes by 1. After running the simulation, it checks bit-identity of results between this and the base simulation.
nproma (n) test: Runs a simulation with modified nproma settings (nproma modified = nproma base + 1), then checks bit-identity of results between this and the base simulation.
omp (o) test: Runs a simulation with modified OpenMP settings (one thread fewer per MPI process), then checks bit-identity of results between this and the base simulation.
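
The three modifications can be summarised in a short sketch. The `RunConfig` class and its field names are hypothetical, and the error raised when a setting cannot be reduced any further is an assumption; the text above does not say how the checksuite handles that case.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RunConfig:
    # Hypothetical container for the settings the decomposition tests modify.
    nodes: int
    mpi_procs_per_node: int
    omp_threads: int
    nproma: int


def mpi_variant(cfg: RunConfig) -> RunConfig:
    # One process fewer per node if possible, otherwise one node fewer.
    if cfg.mpi_procs_per_node > 1:
        return replace(cfg, mpi_procs_per_node=cfg.mpi_procs_per_node - 1)
    if cfg.nodes <= 1:
        raise ValueError("cannot reduce MPI settings further")  # assumption
    return replace(cfg, nodes=cfg.nodes - 1)


def nproma_variant(cfg: RunConfig) -> RunConfig:
    # nproma modified = nproma base + 1
    return replace(cfg, nproma=cfg.nproma + 1)


def omp_variant(cfg: RunConfig) -> RunConfig:
    # One thread fewer per MPI process.
    if cfg.omp_threads <= 1:
        raise ValueError("cannot reduce OpenMP threads further")  # assumption
    return replace(cfg, omp_threads=cfg.omp_threads - 1)
```

Each variant changes a single aspect of the decomposition and leaves the rest of the base configuration untouched, so a difference in results can be attributed to that one setting.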
