Description
The small reproducers below are inspired by the resid routine in CPU2000/172.mgrid. LLVM is not able to fully eliminate redundant loads in loops that access adjacent elements, e.g. a(i-1)+a(i)+a(i+1).
Here is the simplest example, for a 1D array: https://godbolt.org/z/cacGdbhnr
Here, GVN is able to eliminate only one of the redundant loads, but fails to eliminate the second one.
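To make the 1D case concrete, here is a minimal C sketch of the access pattern and of the form I would hope the optimizer could reach. This is not the code behind the godbolt link. The function names stencil_naive and stencil_carried are made up for illustration, and restrict is my assumption, standing in for the no-alias guarantee that Fortran dummy arguments provide.

```c
#include <stddef.h>

/* Naive form: three loads per iteration. Two of them re-read values
   that the previous iteration already loaded. */
void stencil_naive(const double *restrict a, double *restrict r, size_t n) {
    for (size_t i = 1; i + 1 < n; ++i)
        r[i] = a[i - 1] + a[i] + a[i + 1];
}

/* Hand-optimized form: two values are carried across iterations, so
   each iteration issues only one new load, a[i + 1]. */
void stencil_carried(const double *restrict a, double *restrict r, size_t n) {
    if (n < 3)
        return; /* no interior elements to compute */
    double prev = a[0];
    double cur = a[1];
    for (size_t i = 1; i + 1 < n; ++i) {
        double next = a[i + 1];
        r[i] = prev + cur + next; /* same evaluation order as the naive form */
        prev = cur;
        cur = next;
    }
}
```

Both functions add the three terms in the same order, so they should produce identical results. The only difference is the number of loads per iteration.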
A more complex example with a 2D array: https://godbolt.org/z/EboxdG5ae
Here, GVN is not able to optimize any of the loads.
The actual 172.mgrid code uses 3D arrays.
I tried allowing Add operations with non-constant operands in canPHITrans, PHITransAddr::translateSubExpr and PHITransAddr::insertTranslatedSubExpr, but GVN bails out early during the memory dependency computation in the 2D case.
I would appreciate any suggestions or hints on how to tackle this problem!