Hello all,
I'm currently working on bfloat16 GEMM kernels for XDNA (Phoenix) and XDNA 2 (Strix Halo), and I'm looking for documentation on writing GEMM kernels in assembly.
I found the SoC documentation for XDNA/AIE-ML and AIE-ML v2, but not for XDNA 2. Is there any public documentation for the XDNA 2 SoC?
I am also looking for ISA documentation for XDNA and XDNA 2. Currently I use the assembly output of this compiler to infer the instructions and their restrictions on both targets. This was straightforward for XDNA, since the SoC documentation is available and the vmac.f operation performs a simple matrix-matrix multiplication. However, there is no equivalent operation for bfloat16 on XDNA 2 (see the AI Engine API User Guide), and the only high-performing operation is bfp16 x bfp16, which uses 576-bit registers (e.g. ex0) and a block format. Without an ISA reference for XDNA 2, this is a lot more tedious.