Commit 4f6f768
[AMD] Reland sinking the 2nd tt.load after local_load's (#4935)
This PR adds more restrictions on when we apply
the sched-load optimization and un-reverts
triton-lang/triton#4823.
We only apply the optimization when all of the following are
satisfied:
1. pureMatmulProblem, i.e. 1 `tt.dot` in the main loop
2. two `tt.load`s in the main loop
3. 2nd `tt.load` is ahead of the `tt.dot`
4. 1st user of 2nd `tt.load` is after the `tt.dot`
5. tile size is large enough, i.e. nonKDim >= 128 and kDim >= 64
1 parent 6693ddd commit 4f6f768
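The five conditions in the commit message can be sketched as a single predicate. This is a minimal Python model, not the actual pass, which works on MLIR under third_party/amd/lib/TritonAMDGPUTransforms. The `Op` record, the `should_sink_second_load` name, the `first_user` field, and the `local_load` / `local_store` kind strings are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    """Hypothetical stand-in for one operation in the main loop body."""
    kind: str                          # e.g. "tt.load", "tt.dot" (labels only)
    first_user: Optional[int] = None   # body index of the first op using the result


def should_sink_second_load(body: List[Op], non_k_dim: int, k_dim: int) -> bool:
    """Return True only if all five conditions from the commit message hold."""
    dots = [i for i, op in enumerate(body) if op.kind == "tt.dot"]
    loads = [i for i, op in enumerate(body) if op.kind == "tt.load"]

    # 1. pureMatmulProblem: exactly one tt.dot in the main loop.
    if len(dots) != 1:
        return False
    # 2. Exactly two tt.loads in the main loop.
    if len(loads) != 2:
        return False

    dot_idx, second_load_idx = dots[0], loads[1]

    # 3. The 2nd tt.load is ahead of the tt.dot.
    if second_load_idx >= dot_idx:
        return False
    # 4. The 1st user of the 2nd tt.load is after the tt.dot.
    first_user = body[second_load_idx].first_user
    if first_user is None or first_user <= dot_idx:
        return False

    # 5. The tile is large enough.
    return non_k_dim >= 128 and k_dim >= 64


# Illustrative loop body: two global loads, two local loads, the dot,
# then the stores that consume the global loads.
example_body = [
    Op("tt.load", first_user=5),
    Op("tt.load", first_user=6),
    Op("local_load"),
    Op("local_load"),
    Op("tt.dot"),
    Op("local_store"),
    Op("local_store"),
]
```

In this model, `should_sink_second_load(example_body, 128, 64)` passes every check, while shrinking either tile dimension below its threshold or adding a second `tt.dot` rejects the loop.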
File tree
3 files changed, +297 -429 lines changed
- test/TritonGPU/amd
- third_party/amd/lib/TritonAMDGPUTransforms