[AMD] DCE/canonicalize true epilogue conditionals #6314

makslevental · 2025-03-26T19:59:59Z

~~Waiting on llvm/llvm-project#133151 upstream.~~

This PR adds a pattern that folds arith.cmpi operations that are provably always true into arith.constant true; e.g.

%c0 = arith.constant 0 : i32
%c1024_i32 = arith.constant 1024 : i32
%cmpsge = arith.cmpi sge, %c1024_i32, %c0 : i32

->

%cmpsge = arith.constant true

(after DCE).
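The folding condition can be illustrated with a small interval model. This is only a sketch of the idea, not the pattern added by this PR: `SignedRange` and `always_true` are hypothetical names, and the model assumes each operand is summarized by a closed signed interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedRange:
    """Closed interval [lo, hi] of possible signed values (hypothetical helper)."""
    lo: int
    hi: int


def always_true(pred: str, lhs: SignedRange, rhs: SignedRange) -> bool:
    """Return True only if `lhs pred rhs` holds for every pair of values."""
    if pred == "sge":
        return lhs.lo >= rhs.hi
    if pred == "sgt":
        return lhs.lo > rhs.hi
    if pred == "sle":
        return lhs.hi <= rhs.lo
    if pred == "slt":
        return lhs.hi < rhs.lo
    # Unknown predicate: stay conservative and do not fold.
    return False


# Constants have single-point ranges, as in the example above.
c0 = SignedRange(0, 0)
c1024 = SignedRange(1024, 1024)

fold_const = always_true("sge", c1024, c0)                   # 1024 >= 0 always holds
fold_unknown = always_true("sge", SignedRange(-5, 10), c0)   # may be negative, no fold
```

Only when the check succeeds for every value in both ranges would the comparison be replaced by a constant; otherwise it is left untouched.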

The specific use case is "unguarding" the epilogue in pipelined loops (e.g., as produced by tritonamdgpu-stream-pipeline). For example,

tt.func @assume_matmul(%arg0: index, %arg1: index, %arg2: index, %arg3: !tt.ptr<f16>, %arg4: !tt.ptr<f16>) -> tensor<128x128xf32, #mma> {
  ...
  %20:6 = scf.for ... {
    scf.yield ...
  }
  ...
  %27 = arith.cmpi sge, %26, %c1 : index
  %31 = scf.if %27 -> (tensor<128x128xf32, #mma>) {
    %33 = tt.dot %28, %30, %20#2
    scf.yield %33 : tensor<128x128xf32, #mma>
  } else {
    scf.yield %20#2 : tensor<128x128xf32, #mma>
  }
  %32 = arith.select %27, %31, %20#2 : tensor<128x128xf32, #mma>
  ttg.local_dealloc %10 : !ttg.memdesc<1x128x32xf16, #shared, #smem, mutable>
  ttg.local_dealloc %11 : !ttg.memdesc<1x32x128xf16, #shared1, #smem, mutable>
  tt.return %32 : tensor<128x128xf32, #mma>
}

becomes

tt.func @assume_matmul(%arg0: index, %arg1: index, %arg2: index, %arg3: !tt.ptr<f16>, %arg4: !tt.ptr<f16>) -> tensor<128x128xf32, #mma> {
  ...
  %20:6 = scf.for ... {
    scf.yield ... 
  }
  %21 = ttg.local_load %20#4
  %22 = ttg.local_load %20#5
  %23 = arith.mulf %22, %cst
  %24 = tt.dot %21, %23, %20#2
  ttg.local_dealloc %10 : !ttg.memdesc<1x128x32xf16, #shared, #smem, mutable>
  ttg.local_dealloc %11 : !ttg.memdesc<1x32x128xf16, #shared1, #smem, mutable>
  tt.return %24 : tensor<128x128xf32, #mma>
}

Notice that both the scf.if and the arith.select are canonicalized away.

Note that this usually requires tl.assume to hint/constrain the operands of the arith.cmpi; specifically, with respect to the original loop bounds, something like %stop // %step >= 1 (or whatever arithmetic on the loop bounds is needed).
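To see why the assumption matters, here is a minimal sketch in plain Python (not Triton or MLIR). It models an unknown 32-bit value as a `(lo, hi)` pair and shows that the epilogue guard only becomes provably true after the range is narrowed by an assumption of the form `x >= 1`. The helper names are hypothetical, and the real analysis tracks more than this.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def narrow_with_assume_sge(rng, bound):
    """Intersect [lo, hi] with [bound, +inf), modelling an assumption `x >= bound`."""
    lo, hi = rng
    return (max(lo, bound), hi)


def sge_always_true(lhs, rhs):
    """`lhs >= rhs` holds for all values iff min(lhs) >= max(rhs)."""
    return lhs[0] >= rhs[1]


unknown = (INT32_MIN, INT32_MAX)  # nothing is known about the loop-bound expression
one = (1, 1)                      # the constant the guard compares against

# Without an assumption the guard cannot be folded.
before = sge_always_true(unknown, one)

# With an assumption equivalent to `stop // step >= 1` the guard is always true.
after = sge_always_true(narrow_with_assume_sge(unknown, 1), one)
```

In this model `before` is `False` and `after` is `True`, which mirrors why the guard in the example above stays in place unless the kernel supplies a suitable assumption.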

~~Currently this is failing because I need to cherry-pick/PR an LLVM bump.~~

~~Waiting on #6334.~~

makslevental · 2025-03-29T01:50:14Z

There is some kind of bug around here: https://github.com/llvm/llvm-project/blob/8726e973459d93d34653946ba1e01ad198cdf11f/mlir/lib/Dialect/Arith/Transforms/IntRangeOptimizations.cpp#L56-L81, related to how the constant is materialized. Will figure it out next week.

makslevental · 2025-03-29T02:47:19Z

Upstream bug fix: llvm/llvm-project#133556

makslevental · 2025-03-29T16:01:12Z

Same failure as in #6343, related to a recent change @Mogball made upstream, also to range analysis.

Mogball · 2025-03-29T16:49:21Z

I put a fix in the branch. IntRangeAnalysis will now return a dummy result for non-integer values, because it has to return something.

makslevental force-pushed the makslevental/loop-epilogue-range-canon branch from ca93969 to 15df1ce Compare March 26, 2025 20:22

makslevental marked this pull request as ready for review March 28, 2025 20:05

makslevental requested review from antiagainst, ptillet and zhanglx13 as code owners March 28, 2025 20:05

makslevental force-pushed the makslevental/loop-epilogue-range-canon branch 2 times, most recently from dcbd67c to 987cfd6 Compare March 29, 2025 00:35

makslevental force-pushed the makslevental/loop-epilogue-range-canon branch from 1d4b9a6 to 9e80a42 Compare March 31, 2025 19:35

makslevental mentioned this pull request Mar 31, 2025

[Backend] Update to llvm/llvm-project@1d4801f22ab #6352

Merged

makslevental force-pushed the makslevental/loop-epilogue-range-canon branch 2 times, most recently from 27261f4 to 9803f17 Compare April 1, 2025 21:01

makslevental added 5 commits April 1, 2025 17:05

[AMD] DCE/canonicalize true epilogue conditionals

64eb6d9

add fold-true-cmpi pattern/test pass

95d7dc0

add fold-true-cmpi pattern to StreamPipeline.cpp

efefa95

add tests

0e55f55

special case in test_assume

9bc5cec

makslevental force-pushed the makslevental/loop-epilogue-range-canon branch from 9803f17 to 9bc5cec Compare April 1, 2025 21:05

antiagainst requested changes Apr 1, 2025

third_party/amd/include/Analysis/RangeAnalysis.h

third_party/amd/lib/Analysis/RangeAnalysis.cpp

address comments

e6703de

antiagainst approved these changes Apr 1, 2025

fix bitPosition error

74a5f63

antiagainst approved these changes Apr 1, 2025

antiagainst merged commit 0315d72 into triton-lang:main Apr 1, 2025
8 checks passed

makslevental deleted the makslevental/loop-epilogue-range-canon branch April 1, 2025 23:24

This was referenced Apr 8, 2025

[AMD][Pipeliner][Draft] Optimize compute logic of pipeliner through unguarding loads in the epilogue #6430

Closed

[AMD][Pipeliner][Draft] Optimize compute logic of pipeliner through unguarding loads in the epilogue #6432

Closed
