Contributor

@dchigarev dchigarev commented Nov 21, 2024

This PR adds support for memrefs with any number of dimensions and for thread indices in all three dimensions (thread_idx_z was previously unsupported).

This PR fixes the failure of the ROPE module caused by the AllocsToSLM pass

Signed-off-by: dchigarev <[email protected]>
int64_t newX =
    originalShape[0] * blockSizes[0] * blockSizes[1] * blockSizes[2];
SmallVector<int64_t> newShape({newX});
newShape.append(originalShape.begin() + 1, originalShape.end());
Contributor Author

The logic that scales the allocation size by the total number of threads was reworked. Only dimension 0 of the memref is scaled now:

// num threads in the work group: X=2, Y=3, Z=4
// alloc before the pass
%slm_buff = memref.alloc() : memref<8x16xf16>

// alloc after the pass (only scaled the zero dimension)
%slm_buff_root = memref.alloc() : memref<192x16xf16, 3>
// %x_offset = slm_chunk_x_size * (X_thread_idx * Y_block_size * Z_block_size + Y_thread_idx * Z_block_size + Z_thread_idx)
//           = 8 * (X_thread_idx * 12 + Y_thread_idx * 4 + Z_thread_idx)
%slm_buff = memref.subview [%x_offset, 0] : memref<192x16xf16, 3> -> memref<8x16xf16, 3>
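
For readers who want to check the arithmetic, here is a minimal standalone sketch of the same computation in plain C++. It is not the code of the pass: `scaleZeroDim` and `chunkOffset` are hypothetical helper names, and `std::vector` / `std::array` stand in for the LLVM containers used in the diff.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): multiplies only dimension 0 of the
// shape by the total number of threads in the work group (X * Y * Z).
std::vector<int64_t> scaleZeroDim(const std::vector<int64_t> &originalShape,
                                  const std::array<int64_t, 3> &blockSizes) {
  std::vector<int64_t> newShape(originalShape);
  if (!newShape.empty())
    newShape[0] *= blockSizes[0] * blockSizes[1] * blockSizes[2];
  return newShape;
}

// Hypothetical helper (not from the PR): offset along dimension 0 of the
// chunk that belongs to one thread. The thread index is linearized with X
// as the slowest-varying and Z as the fastest-varying component, then
// multiplied by the original size of dimension 0.
int64_t chunkOffset(int64_t chunkSize,
                    const std::array<int64_t, 3> &blockSizes,
                    const std::array<int64_t, 3> &threadIdx) {
  for (int i = 0; i < 3; ++i)
    assert(threadIdx[i] >= 0 && threadIdx[i] < blockSizes[i]);
  int64_t linearIdx = threadIdx[0] * blockSizes[1] * blockSizes[2] +
                      threadIdx[1] * blockSizes[2] + threadIdx[2];
  return chunkSize * linearIdx;
}
```

With the numbers from the example above (block sizes 2, 3, 4 and an 8x16 buffer), `scaleZeroDim({8, 16}, {2, 3, 4})` gives the shape 192x16, and the last thread (1, 2, 3) gets offset 8 * (12 + 8 + 3) = 184, so its 8-row chunk ends exactly at row 192.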

@dchigarev dchigarev marked this pull request as ready for review November 21, 2024 09:24
Contributor

@kurapov-peter kurapov-peter left a comment

LG

@dchigarev dchigarev merged commit 9978725 into intel:main Nov 21, 2024
6 checks passed

3 participants