Conversation

@JonathanC-ARM
Contributor

Description

  • Integration of the SME1 variant of the existing SME2 convolution kernel, kai_run_imatmul_clamp_f32_f32p2vlx1_f32p2vlx1b_2vlx2vl_sme_mopa, and its associated packing functions
  • Formatting changes in convolve_kleidiai.cpp
  • Addition of a proper SME2 gate for dynamic QGEMM
  • Update of the KleidiAI version to 1.14 (the first version that contains the required kernel)
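
For readers following the gating change, here is a minimal sketch of how a kernel choice can be separated from an SME2-only gate. It is not the code in this PR: `CpuFeatures`, `ConvPath`, `SelectConvPath`, and `DynamicQGemmAvailable` are hypothetical names, and the preference order (SME2 first, then SME1, then a generic path) is an assumption.

```cpp
#include <cassert>

// Hypothetical feature flags; a real build would fill these from runtime CPU detection.
struct CpuFeatures {
    bool has_sme;
    bool has_sme2;
};

// Hypothetical labels for the available convolution paths.
enum class ConvPath { Sme2Kernel, Sme1Kernel, Generic };

// Assumed preference order: SME2 kernel, then the SME1 variant, then a generic fallback.
inline ConvPath SelectConvPath(const CpuFeatures& cpu) {
    if (cpu.has_sme2) return ConvPath::Sme2Kernel;
    if (cpu.has_sme) return ConvPath::Sme1Kernel;
    return ConvPath::Generic;
}

// Dynamic QGEMM is gated on SME2 specifically, so SME1-only hardware must not take it.
inline bool DynamicQGemmAvailable(const CpuFeatures& cpu) {
    return cpu.has_sme2;
}

// Small self-check covering an SME1-only machine.
inline void CheckSme1OnlyMachine() {
    const CpuFeatures sme1_only{true, false};
    assert(SelectConvPath(sme1_only) == ConvPath::Sme1Kernel);
    assert(!DynamicQGemmAvailable(sme1_only));
}
```

The point of keeping the two checks separate is that an SME1-only machine can still use the new convolution kernel while dynamic QGEMM stays disabled.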

@hariharans29
Member

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@hariharans29
Member

Can you please rebase? @JonathanC-ARM

Signed-off-by: Jonathan Clohessy <[email protected]>
@JonathanC-ARM force-pushed the jclohess_sme1_convolution_integration branch from b49e63e to 20ce3f1 on November 11, 2025 15:20
@hariharans29
Member

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@hariharans29
Member

hariharans29 commented Nov 13, 2025

Can you please confirm whether the corresponding tests passed on an SME1 machine, or using SME1 mode on an SME2 machine?

@JonathanC-ARM
Contributor Author

Hi Hari, yes, I have one more push to come that fixes the SME2 checks. These are going to be changed subsequently by the GEMV PR anyway, but in the meantime you are correct: it's right to fix them now.

The SME1 tests also pass; we test on both SME2 and SME1. We typically force SME2 to be false to verify these kernels during development. Please see the attached output for ort test all and the MLAS tests. Both pass, and in the MLAS tests a few hundred tests are skipped with the message "MlasDynamicQGemmBatch() requires ARM64 SME2 but it was not detected. Skipping test."
ort_mlas_test.txt
ort_test_all.txt
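
As a rough illustration of the testing approach described above (forcing SME2 off and skipping SME2-only tests), here is a minimal sketch. It is not MLAS code: `DetectedCpu`, `ApplyDevOverride`, and `DynamicQGemmSkipReason` are hypothetical names, and only the skip message is taken from the test output quoted in this comment.

```cpp
#include <cassert>
#include <string>

// Hypothetical snapshot of detected CPU features.
struct DetectedCpu {
    bool sme;
    bool sme2;
};

// Hypothetical development override: pretend SME2 is absent so the SME1 paths run.
inline DetectedCpu ApplyDevOverride(DetectedCpu hw, bool force_sme2_off) {
    if (force_sme2_off) {
        hw.sme2 = false;
    }
    return hw;
}

// Returns the skip message when an SME2-only test cannot run, or an empty string otherwise.
inline std::string DynamicQGemmSkipReason(const DetectedCpu& cpu) {
    if (!cpu.sme2) {
        return "MlasDynamicQGemmBatch() requires ARM64 SME2 but it was not detected. "
               "Skipping test.";
    }
    return std::string();
}

// Self-check: an SME2 machine with the override applied behaves like an SME1 machine.
inline void CheckForcedSme1Mode() {
    const DetectedCpu forced = ApplyDevOverride(DetectedCpu{true, true}, true);
    assert(forced.sme && !forced.sme2);
    assert(!DynamicQGemmSkipReason(forced).empty());
}
```

With this shape, the same test binary can run on an SME2 machine in SME1 mode: the SME2-only tests report a skip reason rather than failing.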

@hariharans29
Member

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@hariharans29 merged commit 8fe4804 into microsoft:main on Nov 13, 2025
90 checks passed

3 participants