
Conversation

@jeffbolznv (Collaborator) commented:

Perf on RTX 4070:

before:
  MUL_MAT(type_a=iq1_s,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 460 runs -  2174.26 us/run -  60.13 GFLOP/run -  27.66 TFLOPS
  MUL_MAT(type_a=iq1_m,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 288 runs -  3482.95 us/run -  60.13 GFLOP/run -  17.26 TFLOPS
  
after:
  MUL_MAT(type_a=iq1_s,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 726 runs -  1379.33 us/run -  60.13 GFLOP/run -  43.59 TFLOPS
  MUL_MAT(type_a=iq1_m,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 412 runs -  2428.05 us/run -  60.13 GFLOP/run -  24.76 TFLOPS
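For context, the GFLOP/run and TFLOPS columns follow directly from the test shape. Below is a small sketch of that arithmetic, assuming the conventional 2·m·n·k FLOP count for a dense matmul; the constants are copied from the iq1_s "after" line above, not taken from the test harness itself.

```cpp
#include <cstdio>

int main() {
    // Shape from the MUL_MAT test case above.
    const double m = 4096, n = 512, k = 14336;

    // Conventional FLOP count for a dense matmul: one multiply and one add
    // per (m, n, k) triple => 2*m*n*k ~= 60.13 GFLOP per run.
    const double gflop_per_run = 2.0 * m * n * k / 1e9;

    // Timing from the iq1_s "after" line.
    const double us_per_run = 1379.33;

    // Throughput: FLOP per run divided by seconds per run, in TFLOPS.
    const double tflops = gflop_per_run / (us_per_run * 1e-6) / 1e3;

    printf("%.2f GFLOP/run, %.2f TFLOPS\n", gflop_per_run, tflops);  // ~60.13, ~43.59
    return 0;
}
```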

jeffbolznv requested a review from 0cc4m on March 17, 2025.
github-actions bot added the Vulkan (issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels on Mar 17, 2025.
@0cc4m (Collaborator) commented on Mar 19, 2025:

Master:

| model                       | size     | params | backend | ngl | test  | t/s             |
| --------------------------- | -------- | ------ | ------- | --- | ----- | --------------- |
| llama 8B IQ1_S - 1.5625 bpw | 1.87 GiB | 8.03 B | Vulkan  | 99  | pp512 | 2609.84 ± 16.94 |
| llama 8B IQ1_S - 1.5625 bpw | 1.87 GiB | 8.03 B | Vulkan  | 99  | tg128 | 65.52 ± 0.36    |

PR:

| model                       | size     | params | backend | ngl | test  | t/s             |
| --------------------------- | -------- | ------ | ------- | --- | ----- | --------------- |
| llama 8B IQ1_S - 1.5625 bpw | 1.87 GiB | 8.03 B | Vulkan  | 99  | pp512 | 3461.18 ± 47.70 |
| llama 8B IQ1_S - 1.5625 bpw | 1.87 GiB | 8.03 B | Vulkan  | 99  | tg128 | 66.59 ± 1.57    |
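A quick read of those (presumably llama-bench) numbers, as a sketch using only the t/s values from the two tables above: prompt processing improves by roughly a third end to end, smaller than the kernel-level gain above, presumably because pp512 time also includes work outside these shaders, while token generation stays flat within the reported error bars.

```cpp
#include <cstdio>

int main() {
    // pp512 / tg128 throughput (t/s) copied from the two tables above.
    const double pp512_master = 2609.84, pp512_pr = 3461.18;
    const double tg128_master = 65.52,   tg128_pr = 66.59;

    printf("pp512 speedup: %.2fx\n", pp512_pr / pp512_master);  // ~1.33x
    printf("tg128 speedup: %.2fx\n", tg128_pr / tg128_master);  // ~1.02x, within noise
    return 0;
}
```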

0cc4m merged commit a9b5928 into ggml-org:master on Mar 19, 2025; all 43 checks passed.