
Commit 6f74535

[OptRed] Extend -tritonintelgpu-optimize-reduction-locality to support repCluster[0] > 2
Support `repCluster[0] > 2` by using 7-D tensors and adding a `convert_layout` operation before the final `reshape`. See code for implementation details.

Signed-off-by: victor-eds <[email protected]>
1 parent a9ca5f0 commit 6f74535


3 files changed, +356 -191 lines changed


0 commit comments
