A preliminary performance test was conducted on the Nestum cluster to estimate its parallel performance. The theoretical peak performance Rpeak was calculated as follows: Rpeak = Number of nodes [latex]\times[/latex] Number of cores per node [latex]\times[/latex] AVX2 base frequency (GHz) [latex]\times[/latex] Number of DP operations per cycle = 24 [latex]\times[/latex] 32 [latex]\times[/latex] 1.9 [latex]\times[/latex] 16 = 23347 Gflops. The standard LINPACK test from the HPL-2.2 package, performed with Intel oneAPI 2022 and OpenMPI-1.10.3 and the following parameters:
...
615936 Ns
1 # of NBs
192 NBs
...
1 # of process grids (P x Q)
24 Ps
32 Qs
...
gave a measured Rmax = 18931 Gflops and a parallel efficiency of 80.8 %. These results place Nestum as the third fastest supercomputer in Bulgaria.
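The arithmetic behind these figures can be reproduced with a short Python sketch. The variable names are illustrative and not part of HPL, and all numbers are the ones quoted in the text. Running one MPI rank per core is an assumption, suggested by the process grid P [latex]\times[/latex] Q matching the total core count. With these rounded inputs the measured-to-theoretical ratio comes out at roughly 81 %.

```python
# Hardware figures quoted in the text.
nodes = 24
cores_per_node = 32
avx2_base_ghz = 1.9        # AVX2 base frequency in GHz
dp_flops_per_cycle = 16    # double-precision operations per core per cycle

# Measured HPL result quoted in the text, in Gflops.
measured_gflops = 18931.0

# HPL.dat values quoted in the text.
n, nb = 615936, 192        # problem size and block size
p, q = 24, 32              # process grid

# Theoretical peak = nodes x cores x frequency x operations per cycle.
theoretical_peak_gflops = nodes * cores_per_node * avx2_base_ghz * dp_flops_per_cycle

# Parallel efficiency = measured / theoretical.
efficiency = measured_gflops / theoretical_peak_gflops

# Assumption: one MPI rank per core, so the grid should cover every core.
mpi_ranks = p * q
total_cores = nodes * cores_per_node

# The problem size is a whole multiple of the block size.
blocks_per_dim, remainder = divmod(n, nb)

print(f"theoretical peak : {theoretical_peak_gflops:.1f} Gflops")
print(f"efficiency       : {efficiency:.1%}")
print(f"MPI ranks / cores: {mpi_ranks} / {total_cores}")
print(f"blocks per dim   : {blocks_per_dim} (remainder {remainder})")
```

Choosing N as a multiple of NB and a process grid that covers every core are common choices when tuning HPL. The sketch only checks that the quoted parameters are consistent with each other.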