@chengjunlu chengjunlu commented Jul 2, 2025

This is experimental code to load a column-major matrix with 2D block IO for fp8.
We need to make sure it performs well for GEMM and attention kernels.

Note: this PR is not intended to be merged.

@chengjunlu chengjunlu changed the title load column major matrix with 2d block io [BACKEND]Load column major matrix with 2d block io Jul 2, 2025
@chengjunlu chengjunlu force-pushed the chengjun/load_column_major_matrix_2D_block_io branch from dc975e3 to 7f0ecf9 Compare July 2, 2025 07:51

@etiotto etiotto left a comment

The GenISA interface is not official, so we should not use it. Is this PR just for your experiment, or do you really want to merge it into the main branch?

@@ -2105,6 +2114,10 @@ struct LoadOpConversion
rewriter.eraseOp(load2dOp);
return failure();
}
#if 0

Enclose the traces in LLVM_DEBUG?

@etiotto etiotto marked this pull request as draft July 2, 2025 13:35
@chengjunlu

The GenISA interface is not official, so we should not use it. Is this PR just for your experiment, or do you really want to merge it into the main branch?

I am only using this PR to run the CI tests automatically.
The 2D block IO OCL interface for column-major matrix loading is missing.
E.g.:

L0 build module failed. Log:
error: undefined reference to `__internal_intel_sub_group_2d_block_read_transpose_32b_16r4x1c_cache_controls'
in function: '__internal_intel_sub_group_2d_block_read_transpose_32b_16r4x1c_cache_controls' called by kernel: 'test'

GenISA is a workaround to make it work for the experiment.

@etiotto etiotto changed the title [BACKEND]Load column major matrix with 2d block io [EXPERIMENTAL]: Load column major matrix with 2d block io Jul 9, 2025
@chengjunlu

Closing this PR. Use the new PR #4870 instead.

@chengjunlu chengjunlu closed this Aug 11, 2025

Successfully merging this pull request may close these issues.

[06-fused-attention] Determine if FP8 operand B can use 2d block load