Skip to content

Conversation

@YixingZhang007
Copy link
Contributor

@YixingZhang007 YixingZhang007 commented Nov 26, 2024

Description:
The support for combination 32x32x16 on DG2 has been implemented in commit intel/intel-graphics-compiler@6a06c93. This PR re-enables the Joint Matrix tests for the combination 32x32x16, which were previously unsupported.

This PR also adds XFAIL: gpu to two tests, joint_matrix_bf16_fill_k_cache_arg_dim.cpp and joint_matrix_bf16_fill_k_cache_runtime_dim.cpp, because the driver has not yet been updated to include the necessary changes in IGC to support these tests. As a result, these tests should not be tested in CI at this time.

@YixingZhang007 YixingZhang007 requested a review from a team as a code owner November 26, 2024 21:58
@YixingZhang007 YixingZhang007 changed the title [SYCL][Joint Matrix][E2E] Uncomment the Joint Matrix tests for combination 32x32x16 combination [SYCL][Joint Matrix][E2E] Uncomment the Joint Matrix tests for combination 32x32x16 Nov 26, 2024
@YixingZhang007 YixingZhang007 marked this pull request as draft November 26, 2024 21:58
@YixingZhang007 YixingZhang007 self-assigned this Nov 26, 2024
Copy link
Contributor

@dkhaldi dkhaldi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@YixingZhang007 YixingZhang007 force-pushed the uncomment_test_32x32x16 branch from 3eee097 to bf10daf Compare November 30, 2024 01:29
@YixingZhang007 YixingZhang007 marked this pull request as ready for review November 30, 2024 02:35
@YixingZhang007 YixingZhang007 requested a review from a team as a code owner November 30, 2024 02:35
@YixingZhang007
Copy link
Contributor Author

@intel/llvm-gatekeepers This PR has been approved and all the CI tests have passed. Could you please help merge the PR? Thanks for helping :)

// (i.e. reduce) the number and the list below.
//
// NUMBER-OF-XFAIL-WITHOUT-TRACKER: 77
// NUMBER-OF-XFAIL-WITHOUT-TRACKER: 79
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quoting from line 43 from this very same file:

// That number *must not* increase. Any PR which causes this number to grow
// should be rejected and it should be updated to either keep the number as-is
// or have it reduced (preferably, down to zero).

Please, read carefully the header of this file to understand what you need to do. Basically, you cannot add new XFAIL directives without a tracker to address it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the mistake! Thanks for pointing it out :)
I have removed the change in file no-xfail-without-tracker.cpp, and added XFAIL-TRACKER to the tests.


// -ffp-model=precise is added to not depend on compiler defaults.

// XFAIL: gpu
Copy link
Contributor

@dkhaldi dkhaldi Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add
// XFAIL-TRACKER: CMPLRLLVM-63710
so you don't increase number of xfail without tracker

Also, should you add the xfail on only !igc-dev?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! I have made the change.
These two tests are not supported in the igc-dev CI test because the commit in IGC has not yet been integrated into the driver. I will remove the XFAIL once the IGC change has been pull downed.

Copy link
Contributor

@YuriPlyakhin YuriPlyakhin Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YixingZhang007 ,
looking at the branch that is used for igc-dev here https://github.com/intel/llvm/blob/sycl/devops/dependencies-igc-dev.json

it should include your changes for runtime dimension support:
https://github.com/intel/intel-graphics-compiler/commits/6ee988a5948c6aaecbfe00605f52694c55cf1052/IGC/Compiler/Optimizer/OpenCLPasses/JointMatrixFuncsResolutionPass/JointMatrixFuncsResolutionPass.cpp

So, tests should be passing already with igc-dev in intel/llvm CI.
What am I missing?

Copy link
Contributor Author

@YixingZhang007 YixingZhang007 Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve noticed that the igc_dev CI test is not reflecting the IGC commits it's supposed to be testing. It seems to still be using an older version of IGC, likely the version set for the regular IGC.

I’m experiencing the same issue with another LLVM PR (#15880), where my IGC commit should already be integrated into igc_dev, but the tests get_coordinate_ops.cpp is still failing due to errors that should have been addressed by that IGC commit. As a temporary approach, I’ve added XFAIL to the test while we wait for this issue to be resolved or until the regular IGC is updated to include the changes.

Please let me know if this approach would work and if you have any suggestions! Thanks!

Copy link
Contributor

@YuriPlyakhin YuriPlyakhin Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YixingZhang007 ,
"while we wait for this issue to be resolved" - does it mean someone is working on resolving the issue? If not, could you please submit the issue to https://github.com/intel/llvm/issues, so that it is reported, and we can assign someone to work on it?
@jsji , @sarnex , FYI

I'm OK with adding XFAIL, since we have this problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are working on something called IGC 2 I think, so maybe they enabled it by default?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created an issue #16251 regarding this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, we need to make sure we softlink to libigc.so to libigc.so.2 for igc-dev now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah im gonna look at the issue and try to see what's going on, i gotta fix igc-dev containers first

// -ffp-model=precise is added to not depend on compiler defaults.

// XFAIL: gpu

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add
// XFAIL-TRACKER: CMPLRLLVM-63710
so you don't increase number of xfail without tracker

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have updated the test.

// CHECK-NEXT: Matrix/SG32/joint_matrix_rowmajorA_rowmajorB.cpp
// CHECK-NEXT: Matrix/SG32/joint_matrix_unaligned_k.cpp
// CHECK-NEXT: Matrix/joint_matrix_bf16_fill_k_cache_arg_dim.cpp
// CHECK-NEXT: Matrix/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove these two addition as you add the trackers in the tests

Copy link
Contributor

@dkhaldi dkhaldi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@YixingZhang007
Copy link
Contributor Author

YixingZhang007 commented Dec 3, 2024

This PR would be merged for now, marking the tests joint_matrix_bf16_fill_k_cache_runtime_dim.cpp and joint_matrix_bf16_fill_k_cache_arg_dim.cpp with XFAIL: gpu. Once the igc-dev CI issue is resolved, it will be updated to XFAIL: !igc-dev.

@YixingZhang007
Copy link
Contributor Author

@intel/llvm-gatekeepers This PR has been approved and all the CI tests have passed. Could you please help merge the PR? Thanks for helping :)

@sarnex sarnex merged commit 4881d6d into intel:sycl Dec 3, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants