# Performance

A preliminary performance test was conducted on the Nestum cluster to estimate its parallel performance. The theoretical peak performance Rpeak was calculated as follows: Rpeak = number of nodes $\times$ number of cores per node $\times$ AVX2 base frequency (GHz) $\times$ number of DP operations per cycle = 24 $\times$ 32 $\times$ 1.9 $\times$ 16 = 23347 Gflops. The standard LINPACK test from the HPL-2.2 package was run using Intel Compiler XE 2017 Developer Edition and OpenMPI-1.10.3, with the following parameters:

    ...
    615936       Ns
    1            # of NBs
    192          NBs
    ...
    1            # of process grids (P x Q)
    24           Ps
    32           Qs
    ...

The measured performance was Rmax = 19001 Gflops, which corresponds to a parallel efficiency of Rmax / Rpeak = 81.4 %. These results place Nestum as the second-fastest supercomputer in Bulgaria.
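The arithmetic behind these figures can be reproduced with a short script. The following is only a minimal sketch: the numbers are taken from the text above, while the variable and function names are illustrative and are not part of HPL or of any Nestum tooling. It recomputes the theoretical peak and the efficiency, and checks two properties of the quoted HPL parameters: that the P x Q process grid matches the total core count, and that the problem size is a multiple of the block size.

```python
# Values quoted in this section; all names below are illustrative assumptions.
NODES = 24
CORES_PER_NODE = 32
AVX2_BASE_GHZ = 1.9
DP_FLOPS_PER_CYCLE = 16
MEASURED_GFLOPS = 19001.0

# HPL.dat parameters quoted in this section.
N, NB, P, Q = 615936, 192, 24, 32


def theoretical_peak_gflops(nodes, cores_per_node, ghz, flops_per_cycle):
    """Theoretical peak in Gflops: cores x clock (GHz) x DP flops per cycle."""
    return nodes * cores_per_node * ghz * flops_per_cycle


def efficiency_percent(measured_gflops, peak_gflops):
    """Measured LINPACK performance as a percentage of the theoretical peak."""
    return 100.0 * measured_gflops / peak_gflops


def matrix_size_gib(n):
    """Memory needed for an n x n double-precision matrix, in GiB."""
    return n * n * 8 / 2**30


peak = theoretical_peak_gflops(NODES, CORES_PER_NODE, AVX2_BASE_GHZ, DP_FLOPS_PER_CYCLE)
eff = efficiency_percent(MEASURED_GFLOPS, peak)

print(f"Theoretical peak: {peak:.0f} Gflops")
print(f"Efficiency:       {eff:.1f} %")
print(f"Process grid:     {P} x {Q} = {P * Q} ranks on {NODES * CORES_PER_NODE} cores")
print(f"N / NB:           {N // NB} blocks, remainder {N % NB}")
print(f"Matrix footprint: {matrix_size_gib(N):.0f} GiB total, "
      f"{matrix_size_gib(N) / NODES:.0f} GiB per node")
```

The grid check reflects the fact that the quoted parameters give one MPI rank per core (24 x 32 = 768). The matrix footprint is included only as a sanity check on the problem size; the memory installed per node is not stated in this section.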