-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Implement FP32 kleidiai Gemv #26302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Implement FP32 kleidiai Gemv #26302
Changes from all commits
82480ad
4cf6ccd
733cd76
1ead7ca
4067e9b
d920aa1
5b4ff13
0b44a68
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,6 +15,10 @@ | |
| #include <algorithm> | ||
| #include <vector> | ||
|
|
||
| #if defined(USE_KLEIDIAI) && !defined(_MSC_VER) | ||
| #include "core/mlas/lib/kleidiai/mlasi_kleidiai.h" | ||
| #endif | ||
|
|
||
| namespace onnxruntime { | ||
| namespace contrib { | ||
|
|
||
|
|
@@ -215,7 +219,7 @@ class DynamicQuantizeMatMul final : public MatMulIntegerToFloatBase { | |
|
|
||
| // Currently, MlasDynamicQGemmBatch() and associated functions require SME2 or else they are no-ops. | ||
| // We check that here too before attempting to use them. | ||
| if (!CPUIDInfo::GetCPUIDInfo().HasArm_SME2()) { | ||
| if (!SMEInfo::CanUseSME2) { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we skip the qgemm.cpp / dynamic_quantize_matmul.cc / test_dynamic_qgemm.cpp changes in this PR ? #26598 is taking care of it with a new MLAS API for this.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure I can revert this change altogether in favor of #26598 |
||
| can_use_dynamic_quant_mlas_ = false; | ||
| } | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -51,9 +51,7 @@ | |
|
|
||
| namespace ArmKleidiAI { | ||
|
|
||
| // By default we should try for SME2 first before falling back to SME. | ||
| inline const bool UseSME2 = MLAS_CPUIDINFO::GetCPUIDInfo().HasArm_SME2(); | ||
|
|
||
| // | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Stray change ?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes that seems to be the case, removed the above code but looks like I left the initial // |
||
| // Buffer packing routines. | ||
| // | ||
| size_t | ||
|
|
@@ -77,6 +75,19 @@ MlasGemmPackB( | |
| void* PackedB | ||
| ); | ||
|
|
||
| bool | ||
| MLASCALL | ||
| MlasFp32Gemv( | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Consider renaming to
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Seems like a sensible suggestion, doesn't really impact anything other than making things consistent. So I'd be happy to make the change |
||
| CBLAS_TRANSPOSE TransA, | ||
| CBLAS_TRANSPOSE TransB, | ||
| size_t M, | ||
| size_t N, | ||
| size_t K, | ||
| const MLAS_SGEMM_DATA_PARAMS* Data, | ||
| size_t BatchSize | ||
| ); | ||
|
|
||
|
|
||
| bool | ||
| MLASCALL | ||
| MlasGemmBatch( | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please remind me - why do we need this cupinfo dependency update for this PR ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I made this change, it might be somehow an older version of the deps file. Will look into it.