Skip to content

Commit a19f324

Browse files
authored
Replace open-coded indirect load elimination loop (#4952)
Not only does this have issues with iterator invalidation, it also runs out of bounds if the last element is erased. We should just use the functions for doing this rather than worrying about how to implement it correctly ourselves. The core Triton is a small number of people, and we receive many PRs (thank you!). To help us review your code more quickly, **if you are a new contributor (less than 3 PRs merged) we ask that you complete the following tasks and include the filled-out checklist in your PR description.** Complete the following tasks before sending your PR, and replace `[ ]` with `[x]` to indicate you have done them. - [X] I am not making a trivial change, such as fixing a typo in a comment. - [X] I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how). - [X] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`. - Select one of the following. - [ ] I have added tests. - `/test` for `lit` tests - `/unittest` for C++ tests - `/python/test` for end-to-end tests - [ ] This PR does not need a test because `I don't actually know how to exercise this code`. - Select one of the following. - [X] I have not added any `lit` tests. - [ ] The `lit` tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)
1 parent 9d424e0 commit a19f324

File tree

1 file changed

+8
-11
lines changed

1 file changed

+8
-11
lines changed

lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp

Lines changed: 8 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -538,17 +538,14 @@ scheduleLoads(scf::ForOp forOp, tt::CoarseSchedule &schedule,
538538
if (loadOpToIndLevelAndUse.empty())
539539
return {};
540540

541-
for (auto iter = loadOpToIndLevelAndUse.begin();
542-
iter != loadOpToIndLevelAndUse.end();) {
543-
auto iterNext = iter + 1;
544-
if (std::get<1>(*iter) >= numStages - 1)
545-
// We assume loads with different dist are assigned to different stages.
546-
// If numStages is 2, we will have no stage available for indirect loads
547-
// with dist >= 1. In general, when dist is equal to numStages - 1, we
548-
// should not pipeline it.
549-
loadOpToIndLevelAndUse.erase(iter);
550-
iter = iterNext;
551-
}
541+
// We assume loads with different dist are assigned to different stages.
542+
// If numStages is 2, we will have no stage available for indirect loads
543+
// with dist >= 1. In general, when dist is equal to numStages - 1, we
544+
// should not pipeline it.
545+
auto it = llvm::remove_if(loadOpToIndLevelAndUse, [=](auto op) {
546+
return std::get<1>(op) >= numStages - 1;
547+
});
548+
loadOpToIndLevelAndUse.erase(it, loadOpToIndLevelAndUse.end());
552549

553550
// Check which loads are good for pipelining, and assign them
554551
// memory layouts.

0 commit comments

Comments
 (0)