Commit 04a7b65

[OptRed] Extend -tritonintelgpu-optimize-reduction-locality to support repCluster[0] > 2 (#2533)
Support `repCluster[0] > 2` by using 7-D tensors and adding a `convert_layout` operation before the final `reshape`. See code for implementation details.

---------

Signed-off-by: victor-eds <[email protected]>
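The commit message leaves the transformation itself to the code. As a loose illustration only of the general idea behind reshaping before a reduction (split the reduced axis into factors, reduce the inner factor first, then the outer one), here is a minimal pure-Python sketch. The function name `staged_sum` and its `inner` parameter are hypothetical, and nothing here models the pass's actual 7-D layout, `repCluster`, or `convert_layout`.

```python
def staged_sum(row, inner):
    """Sum `row` in two stages over a hypothetical (outer, inner) reshape.

    Illustrative only: this is not the pass's algorithm, just the
    arithmetic fact that a staged reduction matches a direct one.
    """
    if inner <= 0 or len(row) % inner != 0:
        raise ValueError("row length must be a positive multiple of inner")
    # View the 1-D row as (len(row) // inner, inner) contiguous groups.
    groups = [row[i:i + inner] for i in range(0, len(row), inner)]
    # Stage 1: reduce along the inner axis, one partial sum per group.
    partial = [sum(group) for group in groups]
    # Stage 2: reduce the partial sums along the outer axis.
    return sum(partial)


row = list(range(16))
# The staged reduction agrees with the direct reduction.
assert staged_sum(row, 4) == sum(row)
```

Because addition is associative, the grouping does not change the result. In a compiler setting, such a split lets the first stage run over elements that are already close together before any cross-group step is needed.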
1 parent 6db3b52 commit 04a7b65

3 files changed: +376, -208 lines changed

0 commit comments
