hlbl_optimisation

Optimised kernels for hadronic light-by-light calculations in cvc. The CPU version has an improved data layout and better OpenMP and MPI parallelisation. Correctness tests, i.e. comparisons against the original implementation, are included. A GPU version is in progress.

2 + 2 disconnected calculations:

Computation of Pi (contraction): the data layout and loop order were rearranged to expose the data parallelism in the problem.
Computation of P1 (integral over the volume): benefits from the adjusted data layout in the same way as Pi.
Computation of P2 and P3 (loop over n_y points and integral over the volume): parallelised with OpenMP threads over y, since the calculations for different y are independent. Thread-local arrays are used in an attempt to reduce sharing between threads (effectiveness still needs to be verified).
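
The pattern described for P2 and P3 can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function name `integrate_over_volume`, the constant `N_COMP`, the array layouts, and the simple weighted-sum integrand are all assumptions made for the example. Each y point is handled by one loop iteration, the inner volume sum runs over contiguous memory, and the per-y partial results are accumulated in an array that is private to the iteration before being written out once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical number of components accumulated per y point. */
#define N_COMP 4

/*
 * Assumed layouts (illustrative only):
 *   field[c * volume + ix]    component-major, contiguous in ix
 *   weight[iy * volume + ix]  one weight per (y, x) pair
 *   result[iy * N_COMP + c]   one block of N_COMP sums per y point
 */
void integrate_over_volume(const double *field, const double *weight,
                           size_t n_y, size_t volume, double *result)
{
    assert(field != NULL && weight != NULL && result != NULL);

    /* Iterations over y are independent, so they can be split across
       threads. Without OpenMP enabled the pragma is ignored and the
       loop simply runs serially. */
    #pragma omp parallel for schedule(static)
    for (long iy = 0; iy < (long)n_y; ++iy) {
        /* Declared inside the loop body, so each iteration (and hence
           each thread) has its own copy. */
        double local[N_COMP] = {0.0};
        const double *w = weight + (size_t)iy * volume;

        for (int c = 0; c < N_COMP; ++c) {
            const double *f = field + (size_t)c * volume;
            double sum = 0.0;
            /* Unit-stride sum over the volume. */
            for (size_t ix = 0; ix < volume; ++ix)
                sum += w[ix] * f[ix];
            local[c] = sum;
        }

        /* Write to the shared output once per y point. */
        for (int c = 0; c < N_COMP; ++c)
            result[(size_t)iy * N_COMP + c] = local[c];
    }
}
```

Compiling with an OpenMP flag (for example `-fopenmp` with GCC) enables the threading. Whether the private accumulator actually reduces sharing compared with writing directly into `result` would have to be measured for the real kernels.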
