-
Notifications
You must be signed in to change notification settings - Fork 1
[MLIR][Affine] Fixed nesting of loops in fusion-maximal (Issue #61820) #9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MLIR][Affine] Fixed nesting of loops in fusion-maximal (Issue #61820) #9
Conversation
|
Hi @bondhugula, a gentle reminder for a review. |
bondhugula
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API update isn't proper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else if
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't proper to do. computeSliceUnion shouldn't have anything to do with the fusion strategy or anything fusion related. You'll have to rethink what option to pass if the behavior has to change here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again this is an analysis utility and should not refer to any specific fusion-related things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For eg., one can add an option skipParallelismCheck.
Lower-level utilities shouldn't refer to or encode any semantics from higher-level/client utilities.
|
I see, this lower-level utility might be used by other mechanisms than loop fusion, and should be agnostic of that. So rather than passing the Will make the changes. |
cafe7bb to
6486e51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified signatures of getComputeSliceState() and computeSliceUnion() to also receive a bool skipParallismCheck, which is a flag to skip parallelism check on sibling loop if set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified FusionStrategy class to have member bool maximalFusion which represents if fusion-maximal is set or not. Also, added a method that returns it's set value.
SwapnilGhanshyala
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bondhugula, pushed the changes addressing previous comments.
Importantly, addressed the comment to make the slice computation agnostic of where it is called from.
bondhugula
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing test case!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heuristic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All inputs should appear before. sliceUnion is an output arg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maximal fusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add a code comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
6486e51 to
9049ded
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added unit test.
…1820) When maximal fusion is requested, slice computation can skip checking if the sibling loop is parallel or not, since maximal does not care about performance improvement. Modified slice computation to skip said check if skipParallelismCheck flag is set. Modified FusionStrategy class to have a boolean member representing maximal fusion and modified slice computation methods to pass the information about the selected strategy. Added test case in loop-fusion-4.mlir
9049ded to
bb7e75f
Compare
|
Hi @bondhugula, addressed the comments and added a unit test to loop-fusion-4.mlir. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bondhugula, a gentle reminder for review.
|
Hi @bondhugula , gentle reminder for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't the right fix or approach. The incorrect fusion has nothing specific to do with sequential loops or parallel loops nor to do with maximal fusion (it could also happen with non-maximal fusion where the compute tolerance allows sufficient redundant computation), nor is it specific to reduction nests. The invalid fusion here is due to cyclic dependences in the source nest. This is fixed comprehensively by llvm#128397
It also required a fix to a check on the compute tolerance.
When maximal fusion is requested, slice computation can skip checking if the sibling loop is parallel or not, since maximal does not care about performance improvement. Modified slice computation to skip said check if the maximal flag is set. Modified FusionStrategy class to have a boolean member representing maximal fusion and modified slice computation methods to pass the information about the selected strategy.
Issue: The Issue we are trying to solve is the unnecessary nesting of loops as a result of maximal fusion llvm#61820
Original Pipeline: If the sibling was sequential, then its bounds were set to be constants (the original bounds). This was forcing a nesting of the loops instead of fusion. In standard fusion (non-maximal), the strategy would have been rejected based on performance heuristics, and another iteration of bounds calculation would start. However, in maximal-fusion, this nested strategy would not be rejected.
Solution: When in maximal fusion, skip the check on the sibling to see if it is parallel or not. This prevents the bounds from being set as constants.
Reasoning: The fusion of parallel to sequential loops will produce sequential loops. It seems to be a performance-based decision to not let this happen. Hence maximal fusion can safely ignore this check.