# Performance

A preliminary performance test was conducted on the Nestum cluster to estimate its parallel performance. The theoretical peak performance Rpeak was calculated as follows: Rpeak = Number of nodes $\times$ Number of cores per node $\times$ AVX2 base frequency $\times$ Number of DP operations per cycle = 24 $\times$ 32 $\times$ 1.9 GHz $\times$ 16 = 23347 Gflops. The standard LINPACK test from the HPL-2.2 package, run with Intel oneAPI 2022 and OpenMPI-1.10.3 and the following parameters:

    ...
    615936 Ns
    1 # of NBs
    192 NBs
    ...
    1 # of process grids (P x Q)
    24 Ps
    32 Qs
    ...

measured Rmax = 18931 Gflops, corresponding to a parallel efficiency of 80.8 %. These results place Nestum as the third-fastest supercomputer in Bulgaria.
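The arithmetic behind the theoretical peak, and the consistency of the HPL parameters quoted above, can be reproduced with a short script. This is only a sketch: the variable and function names are illustrative and not part of HPL, and the remark that the process grid corresponds to one MPI process per core is an inference from P $\times$ Q matching the core count, not something stated by the test setup.

```python
# Illustrative sketch; all names here are hypothetical, the numbers come from the text above.
NODES = 24                 # number of compute nodes
CORES_PER_NODE = 32        # cores per node
AVX2_BASE_GHZ = 1.9        # AVX2 base frequency in GHz
DP_FLOPS_PER_CYCLE = 16    # double-precision operations per cycle per core

# HPL.dat values quoted above
N, NB = 615936, 192        # problem size and block size
P, Q = 24, 32              # process grid


def theoretical_peak_gflops(nodes, cores_per_node, freq_ghz, flops_per_cycle):
    """Theoretical peak in Gflops: GHz times operations per cycle gives Gflops per core."""
    return nodes * cores_per_node * freq_ghz * flops_per_cycle


peak_gflops = theoretical_peak_gflops(
    NODES, CORES_PER_NODE, AVX2_BASE_GHZ, DP_FLOPS_PER_CYCLE
)
total_cores = NODES * CORES_PER_NODE

print(f"Theoretical peak: {peak_gflops:.1f} Gflops")
# Assumption: P x Q equal to the core count suggests one MPI process per core.
print(f"Process grid: {P} x {Q} = {P * Q} processes, {total_cores} cores")
# The problem size divides evenly into NB-sized blocks.
print(f"N / NB = {N // NB} blocks, remainder {N % NB}")
```

The process grid covers all 768 cores, and the problem size is an exact multiple of the block size, so no partial blocks are left at the matrix edge.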