
Commit b47b9c4

Clarify Batch Matrix Multiply operator usage
Updated the explanation of the Batch Matrix Multiply operator and clarified the instructions for constructing benchmark models.
1 parent 45b0c1d commit b47b9c4

1 file changed: +4 -5 lines changed

content/learning-paths/mobile-graphics-and-gaming/measure-kleidiai-kernel-performance-on-executorch/06-create-matrix-mul-model.md

Lines changed: 4 additions & 5 deletions
@@ -6,9 +6,9 @@ weight: 7
 layout: learningpathall
 ---
 
-In the previous section, we discussed that the Batch Matrix Multiply operator supports multiple GEMM (General Matrix Multiplication) variants.
+The Batch Matrix Multiply operator (torch.bmm) under XNNPACK lowers to GEMM and, when shapes and dtypes match supported patterns, can dispatch to KleidiAI micro-kernels on Arm.
 
-To evaluate the performance of these variants across different hardware platforms, we construct a set of benchmark models that utilize the batch matrix multiply operator with different GEMM implementations for comparative analysis.
+To evaluate the performance of these variants across different hardware platforms, you will construct a set of benchmark models that utilize the batch matrix multiply operator with different GEMM implementations for comparative analysis.
 
 
 ### Matrix multiply benchmark model
@@ -72,11 +72,10 @@ export_mutrix_mul_model(torch.float32,"matrix_mul_pf32_gemm")
 
 ```
 
-**NOTE:**
-
+{{%notice Note%}}
 When exporting models, the **generate_etrecord** option is enabled to produce the .etrecord file alongside the .pte model file.
 These ETRecord files are essential for subsequent model analysis and performance evaluation.
-
+{{%/notice%}}
 
 After running this script, both the PTE model file and the etrecord file are generated.
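
The added line states that a batch matrix multiply lowers to GEMM. As a rough illustration of what that means mathematically, the sketch below treats a batched multiply as one independent GEMM per batch entry. It is a conceptual pure-Python example only, not the XNNPACK or KleidiAI implementation, and the helper names `gemm` and `bmm` are hypothetical names chosen for this note.

```python
def gemm(a, b):
    """Plain matrix multiply: (n x k) times (k x m) gives (n x m)."""
    k = len(b)
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    m = len(b[0])
    return [
        [sum(a_row[p] * b[p][j] for p in range(k)) for j in range(m)]
        for a_row in a
    ]


def bmm(batch_a, batch_b):
    """Batched multiply: out[i] = gemm(batch_a[i], batch_b[i])."""
    if len(batch_a) != len(batch_b):
        raise ValueError("batch sizes differ")
    # Each batch entry is an independent GEMM problem.
    return [gemm(a, b) for a, b in zip(batch_a, batch_b)]


# Two 2x2 problems in one batch (illustrative values).
batch_a = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
batch_b = [[[5, 6], [7, 8]], [[9, 10], [11, 12]]]
result = bmm(batch_a, batch_b)
print(result)
```

Because each batch entry is independent, a backend is free to hand every entry to the same optimized GEMM routine, which is the sense in which the operator "lowers to GEMM" here.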
