Commit fdd0512
[OptRed] Extend -tritonintelgpu-optimize-reduction-locality to support repCluster[0] > 2
Support `repCluster[0] > 2` by using 7-D tensors and adding a `convert_layout` operation before the final `reshape`.
See code for implementation details.
Signed-off-by: victor-eds <[email protected]>
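As a rough illustration of the idea in the message above (not the pass itself, which operates on MLIR tensors and layouts), the following Python sketch checks that viewing a 2-D tensor as a 7-D one and reducing the trailing axes one at a time gives the same result as a direct row-wise reduction. The 7-D shape, the helper names, and the split of three row axes and four column axes are all hypothetical, chosen only to keep the example small. The layout conversion is modeled as a no-op because it changes how data is distributed, not the values.

```python
from functools import reduce
from operator import mul


def reduce_last_axis(flat, shape):
    """Sum over the last axis of a row-major tensor stored as a flat list."""
    inner = shape[-1]
    out = [sum(flat[i:i + inner]) for i in range(0, len(flat), inner)]
    return out, shape[:-1]


def row_sums_direct(flat, rows, cols):
    """Reference: sum each row of a rows x cols row-major tensor."""
    return [sum(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


def row_sums_via_7d(flat, shape7, row_rank):
    """Reduce the trailing (column) axes of a 7-D view, one axis at a time.

    shape7 and row_rank are hypothetical parameters for this sketch: the
    first row_rank axes factor the rows, the remaining axes factor the columns.
    """
    assert len(shape7) == 7
    assert reduce(mul, shape7) == len(flat)
    data, shape = flat, tuple(shape7)
    while len(shape) > row_rank:
        data, shape = reduce_last_axis(data, shape)
    # A layout conversion would happen here; it leaves the values unchanged,
    # so this sketch treats it as a no-op.
    # The final reshape to 1-D is also a no-op on flat row-major storage.
    return data


ROWS, COLS = 8, 16
tensor = list(range(ROWS * COLS))
SHAPE_7D = (2, 2, 2, 2, 2, 2, 2)  # hypothetical: 3 row axes, 4 column axes

result = row_sums_via_7d(tensor, SHAPE_7D, row_rank=3)
expected = row_sums_direct(tensor, ROWS, COLS)
assert result == expected
```

The sketch only demonstrates that the higher-rank view preserves the reduction result; which axes the actual pass reduces, and in what order, is determined by the code in the commit.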
1 parent 5ed11cb commit fdd0512
3 files changed, +356 -191 lines changed
File tree
- test/TritonIntelGPU
- third_party/intel
  - include/Dialect/TritonIntelGPU/Transforms
  - lib/TritonIntelGPUTransforms