System Tests#

CI System Tests on External Machines by Builder

Machine: balfrin

alps_mch_test_cpu
alps_mch_test_gpu
balfrin_cpu_nvidia
balfrin_cpu_nvidia_mixed
balfrin_gpu_nvidia
balfrin_gpu_nvidia_mixed

check.c2sm_clm_r13b03_seaice

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.mch_ch_r04b09_dace

✔︎

✔︎

check.mch_icon-ch1_small

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.mch_icon-ch2_small

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.mch_kenda-ch1_dev2_small

✔︎

✔︎

✔︎

✔︎

check.mch_kenda-ch1_dev_small

✔︎

✔︎

✔︎

✔︎

check.mch_kenda-ch1_small

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.mch_opr_r04b07

✔︎

✔︎

check.mch_opr_r04b07_lhn_00

✔︎

✔︎

check.mch_opr_r04b07_lhn_12_emvorado

✔︎

✔︎

check.mch_opr_r04b07_nest

✔︎

✔︎

check.mch_opr_r04b07_performance

✔︎

check.mch_opr_r04b07_sstice_inst

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.mch_opr_r19b07_lpi

✔︎

✔︎

check.mch_opr_r19b07_midnight

✔︎

✔︎

✔︎

check.mch_opr_r19b07_turb

✔︎

✔︎

check.mch_pollen_test

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

Machine: breeze

breeze_gcc
breeze_intel
breeze_nag

check.atm_qubicc_short

✔︎

✔︎

✔︎

exp.icon-testbed_sync

✔︎

✔︎

exp.ocean_omip_short_r2b4

✔︎

✔︎

✔︎

Machine: dwd_nec

dwd_nec_coupling
dwd_nec_hybrid
dwd_nec_oper

check.ICON_CLM

✔︎

check.atm_nwp_jsbach_test

✔︎

✔︎

check.atm_tracer_df4

✔︎

check.dwd_art_dustrad_R2B4N5

✔︎

check.waves_R2B4_global_no_forcing

✔︎

exp.check_externals_DWD.run

✔︎

exp.check_icon

✔︎

nwpexp.build_comin_plugins_DWD_nec

✔︎

exp.icon-testbed_sync

✔︎

nwpexp.run_ICON-ART_01_R3B08_lam_initmode7_pollen

✔︎

nwpexp.run_ICON_01_R3B9_lam

✔︎

nwpexp.run_ICON_02_R2B13_lam

✔︎

nwpexp.run_ICON_03_R19B7N8-ID2_ID1_lam

✔︎

nwpexp.run_ICON_05_R02B06N08_ifsinit_restarttest

✔︎

nwpexp.run_ICON_06_R02B06N07_UPATMO_ifsinit_restarttest

✔︎

nwpexp.run_ICON_07_R02B04N06M_restarttest

✔︎

nwpexp.run_ICON_08_R19B7-ID2_oper

✔︎

nwpexp.run_ICON_09_R2B6N7_oper_EPS

✔︎

nwpexp.run_ICON_11_R3B08_lam_initmode7_restarttest

✔︎

✔︎

nwpexp.run_ICON_12_R3B08_lam_initmode4

✔︎

nwpexp.run_ICON_13_R2B08-dkltest

✔︎

nwpexp.run_ICON_14_R2B6N7_oper_IAU_and_restarttest

✔︎

✔︎

nwpexp.run_ICON_15_R19B7-ID2_ass

✔︎

nwpexp.run_ICON_16_R19B7-ID2_ass

✔︎

nwpexp.run_ICON_17_R2B4_AO_coupled

✔︎

nwpexp.run_ICON_18_R2B4_waves_adv_nophys

✔︎

nwpexp.run_ICON_19_R2B4_cmip_forcing

✔︎

✔︎

nwpexp.run_ICON_20_R2B4_R2B6_AO_coupled

✔︎

nwpexp.run_ICON_21_R2B4_waves_standalone_restart

✔︎

nwpexp.run_ICON_22_R3B08_lam_SBM_initmode4

✔︎

nwpexp.run_ICON_23_R2B4_atmo_waves_coupled

✔︎

nwpexp.run_ICON_25_R19B7_RUC_fc

✔︎

nwpexp.run_ICON_26_R2B4_R2B6_AO-HD_coupled

✔︎

nwpexp.run_ICON_27_R2B6_mvstream_qbudget

✔︎

nwpexp.run_ICON_28_R2B6_IAU_atmo_waves_coupled

✔︎

nwpexp.run_ICON_29_R2B5_R2B6_AO_coupled_gribInitOce

✔︎

exp.run_ICON-SCM_01_BOMEX.run

✔︎

exp.run_ICON-SCM_02_REAL.run

✔︎

exp.run_ICON-SCM_03_LANFEX.run

✔︎

Machine: horeka

horeka_cpu_nvhpc
horeka_gcc
horeka_gpu_nvhpc
horeka_intel

check.dwd_art_dustrad_R2B4N5

✔︎

✔︎

✔︎

check.dwd_run_ICON_09_R2B4N5_EPS

✔︎

✔︎

check.mch_ch_r04b09_dace_synsat

✔︎

✔︎

exp.check_icon

✔︎

✔︎

✔︎

✔︎

exp.dwd_run_ICON_09_R2B4N5_EPS

✔︎

✔︎

Machine: levante

levante_aurora
levante_cpu_nvhpc
levante_gcc
levante_gcc_hybrid
levante_gpu_nvhpc
levante_intel
levante_intel_hybrid
levante_intel_hybrid_mixed
levante_intel_pio
levante_nag
levante_nag_serial

exp.art_levante_test.run

✔︎

exp.art_levante_test_short.run

✔︎

bubble.config

✔︎

exp.test_coupled_160kmNestedAtm_40kmOce

✔︎

✔︎

test_amip-aes_R2B6_gpu.config

✔︎

check.ICON_CLM

✔︎

✔︎

✔︎

✔︎

clmexp.ICON_CLM_global_mean_no_boundary

✔︎

✔︎

clmexp.ICON_CLM_output_coupling

✔︎

✔︎

check.aes_amip_pfts_test

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.aes_amip_test

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_ape

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_bubble_land_test

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_bubble_test

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_cph_nest_R2B4_test

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_heldsuarez

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_nwp_jsbach_test

✔︎

✔︎

✔︎

✔︎

check.atm_qubicc

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_qubicc_grb

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_qubicc_nofor

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_qubicc_onlyfor

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_qubicc_pfts

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_qubicc_test_aero

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_qubicc_tmx

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_sma

✔︎

✔︎

✔︎

✔︎

✔︎

check.atm_tracer_df4

✔︎

✔︎

check.land_quincy_canopy_test

✔︎

✔︎

✔︎

✔︎

check.waves_R2B4_global_no_forcing

✔︎

exp.aes_bubble_test.run

✔︎

exp.host_and_vector_only_tests.run

✔︎

exp.check_externals_LEVANTE.run

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

exp.check_externals_LEVANTE_gpu.run

✔︎

exp.check_icon

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

exp.atm_tracer_Hadley_comin_portability

✔︎

✔︎

exp.build_comin_plugins_LEVANTE.run

✔︎

✔︎

exp.run_ICON_17_R2B4_AO_coupled_LEVANTE.run

✔︎

✔︎

exp.atm_memLog

✔︎

✔︎

exp.atm_memLog_AsyncIO

✔︎

✔︎

exp.oce_memLog

✔︎

✔︎

exp.mkexp_log_monitoring.run

✔︎

exp.atm_amip_R2B4_1day

✔︎

exp.atm_amip_R2B4_1day_pio

✔︎

exp.esm_bb_ruby0_check_output_LEVANTE.run

✔︎

exp.hamocc_omip_10days

✔︎

exp.icon-testbed_communication_orig

✔︎

✔︎

exp.icon-testbed_communication_yaxt

✔︎

✔︎

exp.icon-testbed_sync

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

exp.ac3_les_20210211.run

✔︎

exp.test_atm_nwp_yac_o3coupling.run

✔︎

nwpexp.run_ICON_18_R2B4_waves_adv_nophys

✔︎

✔︎

nwpexp.run_ICON_19_R2B4_cmip_forcing

✔︎

✔︎

nwpexp.run_ICON_21_R2B4_waves_standalone_restart

✔︎

nwpexp.run_ICON_23_R2B4_atmo_waves_coupled

✔︎

✔︎

exp.ocean_WilliamsonTestCase2_Hex

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

exp.ocean_omip_R2B4_V0_GM0

✔︎

✔︎

exp.ocean_omip_R2B4_V0_GM0_icono

✔︎

✔︎

exp.ocean_omip_R2B4_V1_GM0

✔︎

✔︎

exp.test_concurrent_hamocc_omip_10days

✔︎

✔︎

✔︎

✔︎

✔︎

exp.test_hamocc_omip_10days

✔︎

✔︎

✔︎

✔︎

✔︎

exp.test_ocean_newice_omip_10days

✔︎

✔︎

✔︎

✔︎

✔︎

exp.test_ocean_omip_10days

✔︎

✔︎

✔︎

✔︎

✔︎

exp.test_ocean_zstar_omip_10days

✔︎

✔︎

✔︎

✔︎

✔︎

exp.ocean_omip_ptest

✔︎

✔︎

✔︎

✔︎

✔︎

exp.test_multioutput_model_40km

✔︎

✔︎

✔︎

exp.test_ocean_omip_technical

✔︎

✔︎

✔︎

✔︎

✔︎

exp.atm_tracer_Hadley

✔︎

✔︎

exp.esm_bb_ruby0

✔︎

✔︎

✔︎

✔︎

✔︎

✔︎

exp.esm_bb_ruby0_pio

✔︎

exp.seamless_bb-ecradmin

✔︎

✔︎

✔︎

✔︎

exp.seamless_bb-proto1

✔︎

✔︎

✔︎

✔︎

✔︎

exp.seamless_bb-proto2

✔︎

✔︎

✔︎

✔︎

exp.test_nwp_R02B04N06multi

✔︎

✔︎

✔︎

✔︎

✔︎

exp.test_nwp_R02B04_R02B05_nest

✔︎

✔︎

✔︎

✔︎

✔︎

exp.test_nwp_R02B04_R02B05_nest_comin

✔︎

✔︎

exp.test_nwp_R02B04_R02B05_nest_comin_python

✔︎

✔︎

test_nextGEMS.config

✔︎

test_yaxt_xchange.config

✔︎

Machine: lumi

lumi_cpu
lumi_gpu

check.aes_amip_pfts_test

✔︎

✔︎

check.aes_amip_test

✔︎

✔︎

check.atm_bubble_test

✔︎

✔︎

check.atm_heldsuarez

✔︎

✔︎

check.atm_qubicc

✔︎

✔︎

check.atm_qubicc_grb

✔︎

✔︎

check.atm_qubicc_nofor

✔︎

✔︎

check.atm_qubicc_onlyfor

✔︎

✔︎

check.atm_qubicc_pfts

✔︎

✔︎

check.atm_qubicc_test_aero

✔︎

✔︎

check.atm_qubicc_tmx

✔︎

✔︎

exp.check_icon

✔︎

✔︎

exp.icon-testbed_sync

✔︎

✔︎

exp.ocean_WilliamsonTestCase2_Hex

✔︎

exp.test_ocean_zstar_omip_10days

✔︎

exp.atm_tracer_Hadley_R2B4

✔︎

✔︎

Machine: mpimac

mpimac_gcc

check.atm_qubicc_short

✔︎

exp.ocean_omip_short_r2b4

✔︎

Machine: santis

santis_cpu_nvhpc
santis_gpu_nvhpc

check.c2sm_clm_r13b03_seaice

✔︎

✔︎

check.dwd_run_ICON_09_R2B4N5_EPS

✔︎

✔︎

check.exclaim_ape_R02B04

✔︎

✔︎

check.mch_icon-ch1_small

✔︎

✔︎

check.mch_icon-ch2_small

✔︎

✔︎

check.mch_opr_r04b07_sstice_inst

✔︎

✔︎

check.mch_opr_r19b07_lpi

✔︎

✔︎

check.mch_opr_r19b07_turb

✔︎

✔︎

ICON Development Checksuite#

The ICON development checksuite (icon-dev.checksuite) defines a set of generic system tests that can be applied to any ICON configuration or experiment. One or more checksuite flags (defined below) set in the check.<exp-name> file indicate which system tests are applied to the experiment. The check.<exp-name> file may also be used to supply experiment-specific input. See a list of checksuite experiments here.

Note: The checksuite is currently compatible only with make_runscripts, not with mkexp.

The central element of the checksuite is the base (b) test, which runs a simulation of the experiment and fails only if the simulation crashes (i.e., a smoke test). Most of the other tests use this first simulation (referred to hereafter as the base simulation) as a reference for comparing results; see, e.g., the restart test below. Note that:

  • If check.<exp-name> does not define 'b' among its flags, the base simulation still runs in order to have a reference for other tests.

  • If there are several tests defined in check.<exp-name>, the base simulation runs once and the same base simulation is used as reference for all tests.
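The two rules above can be illustrated with a minimal sketch. It assumes, hypothetically, that the flags are given as a string of single letters; the function name plan_checks and the returned dictionary layout are invented for illustration and are not part of icon-dev.checksuite. Only the flag letters themselves come from the definitions on this page.

```python
# Hypothetical sketch: plan_checks and its return layout are illustrative only.
# The flag letters are the ones defined in this document.
KNOWN_FLAGS = {
    "b": "base", "p": "performance", "t": "tolerance", "u": "update",
    "g": "CUDA Graph", "c": "Compute Sanitizer", "r": "restart",
    "m": "mpi", "n": "nproma", "o": "omp",
}


def plan_checks(flags: str) -> dict:
    """Describe which tests would run for a given flag string (assumed format)."""
    unknown = sorted(set(flags) - set(KNOWN_FLAGS))
    if unknown:
        raise ValueError(f"unknown checksuite flags: {unknown}")
    requested = [f for f in KNOWN_FLAGS if f in flags]  # stable, documented order
    return {
        # The base simulation runs once and is shared by all tests,
        # even when 'b' itself is not among the flags.
        "base_runs": 1,
        "base_is_reported": "b" in requested,
        "tests": [KNOWN_FLAGS[f] for f in requested if f != "b"],
    }


plan = plan_checks("rn")
print(plan["base_runs"], plan["base_is_reported"], plan["tests"])
```

In this sketch, requesting only the restart and nproma tests still yields exactly one base simulation, which both tests would use as their reference.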

Performance Tests#

  • Performance (p) test: Measures the total runtime of the simulation and compares it to a stored reference value. The test fails if the current run is more than 10% slower than the reference.
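As a rough illustration of the 10% criterion, the comparison can be sketched as follows. The function name, the use of seconds as the unit, and the max_slowdown parameter are assumptions made for this example; how the checksuite actually stores and reads the reference value is not shown here.

```python
def performance_test_passes(runtime_s: float, reference_s: float,
                            max_slowdown: float = 0.10) -> bool:
    """Return True unless the run is more than max_slowdown slower than the reference.

    Hypothetical helper: times are assumed to be total runtimes in seconds.
    """
    if reference_s <= 0:
        raise ValueError("reference runtime must be positive")
    return runtime_s <= reference_s * (1.0 + max_slowdown)


print(performance_test_passes(109.0, 100.0))  # 9% slower than the reference
print(performance_test_passes(111.0, 100.0))  # 11% slower than the reference
```

Note that under this reading a faster run never fails; only slowdowns beyond the threshold do.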

Regression Tests#

  • Tolerance (t) test: Uses probtest to generate statistics on the output of the base simulation. These statistics are then compared to a set of stored reference values, and the test checks that all deviations fall within predefined tolerance intervals. Both the reference values and their associated tolerances are stored in advance.

  • Update (u) test: Checks bit-identity of results of the base simulation with stored reference values.
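The difference between the two regression criteria can be sketched in plain Python. This is not the probtest API: the function names, the variable names (temp_mean, pres_mean), and the choice of a relative deviation per variable are assumptions made only to contrast "within tolerance" with "bit-identical".

```python
import struct


def out_of_tolerance(stats: dict, reference: dict, tolerance: dict) -> list:
    """Return the names of variables whose relative deviation exceeds its tolerance."""
    failing = []
    for name, ref_value in reference.items():
        scale = abs(ref_value) if ref_value != 0.0 else 1.0
        rel_dev = abs(stats[name] - ref_value) / scale
        if rel_dev > tolerance[name]:
            failing.append(name)
    return failing


def bit_identical(stats: dict, reference: dict) -> bool:
    """Compare the raw 64-bit patterns of every value, not just their magnitudes."""
    if stats.keys() != reference.keys():
        return False
    return all(struct.pack("<d", stats[k]) == struct.pack("<d", reference[k])
               for k in reference)


# Hypothetical statistics for illustration only.
reference = {"temp_mean": 287.5, "pres_mean": 101325.0}
tolerance = {"temp_mean": 1e-9, "pres_mean": 1e-9}
stats = {"temp_mean": 287.5 + 1e-10, "pres_mean": 101325.0}

print(out_of_tolerance(stats, reference, tolerance))  # tolerance-style check
print(bit_identical(stats, reference))                # update-style check
```

A tiny perturbation such as the one in stats stays inside the tolerance interval but would still break a bit-identity check, which is why the two tests answer different questions.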

Runtime Configuration Tests#

  • CUDA Graph (g) test: Runs the simulation with CUDA Graphs enabled and compares the output to the base simulation to verify correctness. This tests compatibility and stability of the code when GPU graph execution is active.

Sanitizer Tests#

  • Compute Sanitizer (c) test: Runs the simulation under NVIDIA’s Compute Sanitizer to detect memory access errors, uninitialized memory usage, synchronization issues, and potential data races on the GPU. This test is useful for debugging low-level GPU issues but can significantly increase runtime and produce very large log files.

Technical Feature Tests#

  • Restart (r) test: Tests the “restart” feature by writing a checkpoint file at the midpoint of the base simulation and then running a second simulation starting from that checkpoint. The test then checks bit-identity of results between the second half of the base simulation and the second simulation.
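The logic of the restart test can be mimicked with a toy model. Everything in this sketch is hypothetical: step stands in for one model time step, the state dictionary stands in for the model state, and an in-memory copy stands in for the checkpoint file. It only shows the comparison being made, not how ICON writes or reads restart files.

```python
def step(state: dict) -> dict:
    """Toy deterministic update standing in for one model time step."""
    return {"t": state["t"] + 1, "x": state["x"] * 1.0001 + 0.5}


def run(state: dict, n_steps: int, checkpoint_after=None):
    """Advance the toy model; optionally keep a copy of the state as a checkpoint."""
    outputs, checkpoint = [], None
    for i in range(n_steps):
        state = step(state)
        outputs.append(state["x"])
        if checkpoint_after is not None and i + 1 == checkpoint_after:
            checkpoint = dict(state)  # stands in for writing a checkpoint file
    return outputs, checkpoint


n_steps = 10
half = n_steps // 2

# Base simulation: full length, checkpoint taken at the midpoint.
base_outputs, checkpoint = run({"t": 0, "x": 1.0}, n_steps, checkpoint_after=half)

# Second simulation: starts from the checkpoint and covers the second half only.
restart_outputs, _ = run(checkpoint, n_steps - half)

# The test passes only if the second halves agree exactly.
restart_test_passes = base_outputs[half:] == restart_outputs
print(restart_test_passes)
```

If the checkpoint omitted part of the state (for example, if x were not saved), the two second halves would diverge and the comparison would fail, which is the kind of defect this test is meant to expose.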

Technical Decomposition Tests#

  • mpi (m) test: Runs a simulation with modified MPI settings. Specifically, it reduces the number of MPI processes per node by 1 if the number is greater than 1; otherwise, it reduces the number of nodes by 1. After running the simulation, it checks bit-identity of results between this and the base simulation.

  • nproma (n) test: Runs a simulation with modified nproma settings (nproma modified = nproma base + 1), then checks bit-identity of results between this and the base simulation.

  • omp (o) test: Runs a simulation with modified OpenMP settings (one thread fewer per MPI process), then checks bit-identity of results between this and the base simulation.
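The setting changes described for the three decomposition tests can be written down as small helper functions. The function names are invented for this sketch. The text does not say what happens when a setting cannot be reduced any further (a single node with a single process, or a single OpenMP thread), so raising an error in those cases is an assumption of this example, not documented checksuite behavior.

```python
def modified_mpi_layout(nodes: int, procs_per_node: int) -> tuple:
    """mpi (m) test: one process per node fewer if possible, otherwise one node fewer."""
    if procs_per_node > 1:
        return nodes, procs_per_node - 1
    if nodes > 1:
        return nodes - 1, procs_per_node
    # Assumption: this case is not described in the documentation.
    raise ValueError("cannot reduce a layout of 1 node with 1 process")


def modified_nproma(nproma: int) -> int:
    """nproma (n) test: nproma modified = nproma base + 1."""
    return nproma + 1


def modified_omp_threads(threads: int) -> int:
    """omp (o) test: one thread fewer per MPI process."""
    if threads <= 1:
        # Assumption: this case is not described in the documentation.
        raise ValueError("cannot reduce below one OpenMP thread")
    return threads - 1


print(modified_mpi_layout(4, 8))   # processes per node reduced
print(modified_mpi_layout(4, 1))   # nodes reduced instead
print(modified_nproma(32))
print(modified_omp_threads(4))
```

In each case the modified run is then compared for bit-identity against the base simulation, so any result that depends on the decomposition shows up as a failure.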