[LAYOUTS] Implement IR support for LinearLayouts (#5170)
We also exercise this in scale_dot, where we enable support for warps of
arbitrary shape (previously we only allowed `[num_warps, 1]`).
With this infrastructure in place, it should be rather easy to move from
the legacy layouts to using LinearLayouts (LLs) to represent all of our
layouts.
Something I'm concerned about is the amount of recomputation that
happens when calling methods like `getSizePerThread`, where we
recompute the result on every call. There might be an optimisation
opportunity here where we cache the results of all these functions, as
sketched below.
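A minimal sketch of that caching idea, assuming a pure derivation
function. The names `Layout` and `computeSizePerThread` below are
hypothetical stand-ins, not Triton's actual API; since MLIR attributes
are uniqued per context, the real cache could key on the attribute
itself:

```cpp
#include <cstdint>
#include <iostream>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical uniqued layout handle (stand-in for an MLIR attribute).
struct Layout {
  std::vector<int64_t> sizePerThread;
};

// Hypothetical expensive derivation (e.g. walking the layout's bases);
// here it just copies the stored vector.
static std::vector<int64_t> computeSizePerThread(const Layout *layout) {
  return layout->sizePerThread;
}

// Cached front end: each uniqued layout is derived at most once.
std::vector<int64_t> getSizePerThread(const Layout *layout) {
  static std::mutex m;
  static std::unordered_map<const Layout *, std::vector<int64_t>> cache;
  std::lock_guard<std::mutex> lock(m);
  auto [it, inserted] = cache.try_emplace(layout);
  if (inserted)
    it->second = computeSizePerThread(layout);
  return it->second;
}

int main() {
  Layout l{{4, 1}};
  auto a = getSizePerThread(&l); // computed
  auto b = getSizePerThread(&l); // served from the cache
  std::cout << (a == b) << "\n"; // prints 1
}
```

The trade-off is that the cache is never invalidated, which is safe
only because uniqued attributes are immutable for the lifetime of the
context.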
We choose to represent an LL in the IR via its canonical form plus a
`repOrder`, for several reasons:
- It's generally more compact
- It's easier to CSE, so it's easier to see when two layouts are in fact
the same (see the toy sketch after this list).
- A technical reason: the `toLinearLayout` function returns a tensor
with dimensions `dim0, ..., dim<rank-1>`; in other words, it "forgets"
the repetition order. Without the repetition order, we cannot recover
the tile size of the argument. In particular, we cannot recover
`getSizePerThread`. There is an argument to be made about whether
`getSizePerThread` is useful on its own, or whether `getElemsPerThread`
is the real useful abstraction here, but for now we keep both for
backwards compatibility.
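To make the canonical-form and CSE points concrete, here is a toy
sketch, not Triton's actual `LinearLayout` class, assuming the usual
linear-layout semantics where a hardware index maps to a logical
coordinate by XOR-ing the bases selected by its set bits:

```cpp
#include <array>
#include <cstdint>
#include <iostream>
#include <vector>

using Coord = std::array<int32_t, 2>; // (dim0, dim1)

// Toy linear layout over one hardware dimension ("register").
struct ToyLL {
  std::vector<Coord> regBases; // basis i is the image of register bit i
  std::array<int, 2> repOrder; // order in which repetitions tile dims

  // Apply the layout: XOR together the bases of the set bits of `reg`.
  Coord apply(uint32_t reg) const {
    Coord out{0, 0};
    for (size_t i = 0; i < regBases.size(); ++i)
      if (reg & (1u << i)) {
        out[0] ^= regBases[i][0];
        out[1] ^= regBases[i][1];
      }
    return out;
  }

  bool operator==(const ToyLL &o) const {
    return regBases == o.regBases && repOrder == o.repOrder;
  }
};

int main() {
  // Each register bit moves one step along one logical dim: each
  // thread owns a 2x2 tile.
  ToyLL ll{{{0, 1}, {1, 0}}, {1, 0}};
  for (uint32_t r = 0; r < 4; ++r) {
    Coord c = ll.apply(r);
    std::cout << "reg " << r << " -> (" << c[0] << ", " << c[1] << ")\n";
  }
  // Structural equality on the canonical form makes CSE trivial.
  std::cout << (ll == ToyLL{{{0, 1}, {1, 0}}, {1, 0}}) << "\n"; // 1
}
```

Because the bases are stored in a fixed canonical order and the
`repOrder` is explicit, two equal layouts are bit-for-bit identical, so
recognising duplicates reduces to a plain field-by-field comparison.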