Commit fdd0512

[OptRed] Extend -tritonintelgpu-optimize-reduction-locality to support repCluster[0] > 2
Support `repCluster[0] > 2` by using 7-D tensors and adding a `convert_layout` operation before the final `reshape`. See code for implementation details.

Signed-off-by: victor-eds <[email protected]>
1 parent 5ed11cb commit fdd0512

3 files changed (+356 -191 lines changed)
