You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[SelectionDAG] Fix unsafe cases for loop.dependence.{war/raw}.mask (#168565)
Both `LOOP_DEPENDENCE_WAR_MASK` and `LOOP_DEPENDENCE_RAW_MASK` are
currently hard to split correctly, and there are a number of incorrect
cases.
The difficulty comes from how the intrinsics are defined. For example,
take `LOOP_DEPENDENCE_WAR_MASK`.
It is defined as the OR of:
* `(ptrB - ptrA) <= 0`
* `elementSize * lane < (ptrB - ptrA)`
Now, if we want to split a loop dependence mask for the high half of the
mask we want to compute:
* `(ptrB - ptrA) <= 0`
* `elementSize * (lane + LoVT.getElementCount()) < (ptrB - ptrA)`
However, with the current opcode definitions, we can only modify ptrA or
ptrB, which may change the result of the first case, which should be
invariant to the lane.
This patch resolves these cases by adding a "lane offset" to the ISD
opcodes. The lane offset is always a constant. For scalable masks, it is
implicitly multiplied by vscale.
This makes splitting trivial as we increment the lane offset by
`LoVT.getElementCount()` now.
Note: In the AArch64 backend, we only support zero lane offsets (as
other cases are tricky to lower to whilewr/rw).
---------
Co-authored-by: Benjamin Maxwell <[email protected]>
0 commit comments