perf: add AArch64 GEMM/GEMV for q4_0. #104

yirongjie · 2024-07-30T07:19:05Z

What's new

Copy GEMM/GEMV from ggml
- SGEMM.hpp/cpp is copied from Improve cpu prompt eval speed ggml-org/llama.cpp#6414
- GEMM_Arch64.hpp/cpp is copied from Arm AArch64: optimized GEMV and GEMM kernels for q4_0_q8_0, and q8_0_q8_0 quantization ggml-org/llama.cpp#5780
You can download Q4_0_4x4 model here.
Optimize operators: Softmax, SiLU, RoPE
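
For readers unfamiliar with the q4_0 x q8_0 kernels mentioned above, here is a minimal scalar sketch of the arithmetic they accelerate. It is not the code from this PR or from llama.cpp: the struct and function names (`BlockQ4_0`, `BlockQ8_0`, `gemv_q4_0`, and so on) are hypothetical, the scale is stored as `float` rather than fp16 for brevity, and the plain row-major block layout shown here is an assumption that ignores the interleaved Q4_0_4x4 layout and the NEON dot-product instructions used by the real AArch64 kernels.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

constexpr int QK = 32; // elements per quantization block

// Simplified blocks (assumption: float scale instead of ggml's fp16 scale).
struct BlockQ4_0 { float d; uint8_t qs[QK / 2]; }; // 32 weights as 4-bit codes
struct BlockQ8_0 { float d; int8_t qs[QK]; };      // 32 activations as int8

// Quantize 32 floats to 4-bit codes in [0, 15]; the decoded value is (code - 8) * d.
BlockQ4_0 quantize_block_q4_0(const float *x) {
    float amax = 0.0f, maxv = 0.0f;
    for (int i = 0; i < QK; ++i) {
        if (std::fabs(x[i]) > amax) { amax = std::fabs(x[i]); maxv = x[i]; }
    }
    BlockQ4_0 b;
    b.d = maxv / -8.0f; // the largest-magnitude value maps to code 0
    const float id = (b.d != 0.0f) ? 1.0f / b.d : 0.0f;
    for (int j = 0; j < QK / 2; ++j) {
        // Low nibble holds element j, high nibble holds element j + 16.
        const int lo = std::min(15, static_cast<int>(x[j] * id + 8.5f));
        const int hi = std::min(15, static_cast<int>(x[j + QK / 2] * id + 8.5f));
        b.qs[j] = static_cast<uint8_t>(lo | (hi << 4));
    }
    return b;
}

// Quantize 32 floats to int8 with a per-block scale d = max|x| / 127.
BlockQ8_0 quantize_block_q8_0(const float *x) {
    float amax = 0.0f;
    for (int i = 0; i < QK; ++i) amax = std::max(amax, std::fabs(x[i]));
    BlockQ8_0 b;
    b.d = amax / 127.0f;
    const float id = (b.d != 0.0f) ? 1.0f / b.d : 0.0f;
    for (int i = 0; i < QK; ++i) b.qs[i] = static_cast<int8_t>(std::lround(x[i] * id));
    return b;
}

// Dot product of one quantized weight row with the quantized activation vector.
float dot_q4_0_q8_0(const BlockQ4_0 *w, const BlockQ8_0 *x, int nblocks) {
    float acc = 0.0f;
    for (int b = 0; b < nblocks; ++b) {
        int isum = 0; // integer accumulation inside a block
        for (int j = 0; j < QK / 2; ++j) {
            const int lo = (w[b].qs[j] & 0x0F) - 8;
            const int hi = (w[b].qs[j] >> 4) - 8;
            isum += lo * x[b].qs[j] + hi * x[b].qs[j + QK / 2];
        }
        acc += static_cast<float>(isum) * w[b].d * x[b].d; // apply both scales once per block
    }
    return acc;
}

// Reference GEMV: y = W * x, with W stored as rows * (cols / QK) q4_0 blocks.
std::vector<float> gemv_q4_0(const std::vector<BlockQ4_0> &W, int rows, int cols,
                             const std::vector<float> &x) {
    assert(cols % QK == 0 && static_cast<int>(x.size()) == cols);
    const int nb = cols / QK;
    assert(static_cast<int>(W.size()) == rows * nb);
    std::vector<BlockQ8_0> xq(nb);
    for (int b = 0; b < nb; ++b) xq[b] = quantize_block_q8_0(&x[b * QK]);
    std::vector<float> y(rows);
    for (int r = 0; r < rows; ++r) y[r] = dot_q4_0_q8_0(&W[r * nb], xq.data(), nb);
    return y;
}
```

The inner loop works entirely on small integers and applies the two scales once per block, which is why this pattern maps well onto integer dot-product instructions. A GEMM would apply the same per-row dot product to several activation vectors at once.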

Fix bugs

Fix a bug when loading models on ARM.

Signed-off-by: yirongjie <[email protected]>

yirongjie added 19 commits July 23, 2024 08:45

fix: typo in Module::load()

ad9320e

fix: Combine all mat_mul_xxx to mat_mul

ab61c27

fix: apply SGEMM when q*k & qk*v

4365fe6

fix: rename check_sgemm

5a34f49

fix: remove unused

6ac3e57

fix: use MultiHeadAttention function, patch KVCache if `#define LLAMAFILE_SGEMM` in `Type.hpp`

1c31d69

perf: omp in Softmax & Mask

a0daee5

fix: add llamafile_sgemm 's after quantize in matmul

92301f3

fix: CMakeLists.txt armv8.2-a+dotprod

8a9d9aa

Signed-off-by: yirongjie <[email protected]>

feat: add q4_0_4x4 gemm-aarch64

459838b

fix: demo imagebind 1mod

1ce7b8a

Signed-off-by: yirongjie <[email protected]>

Merge remote-tracking branch 'refs/remotes/origin/main'

e7ba402

fix: assert & some functions

6638038

Signed-off-by: yirongjie <[email protected]>

fix: change omp parallel

06d8e0a

Merge branch 'main' of https://github.com/yirongjie/mllm

53a9e8d

perf: Merge "softmax" & "causalMask", perf RoPE.

1bbb173

fix: rewrite quantize Q4_0_4X4

855e425

fix not Q_K quantizer

04f01b7

fix: rope

6e941b5

yirongjie requested a review from UbiquitousLearning July 30, 2024 07:33

UbiquitousLearning approved these changes Jul 30, 2024

yirongjie merged commit affe946 into UbiquitousLearning:main Jul 30, 2024


2 participants
