vulkan: handle large sizes for get_rows #15686

jeffbolznv · 2025-08-30T19:15:13Z

I'll push tests in a separate PR because I suspect they'll break other backends.

0cc4m · 2025-08-31T08:06:46Z

I see no issues. Looks good.

gpu_info	backends	model_type	model_size	test	avg_ts(master)	avg_ts(pr)	%
AMD Radeon (TM) Pro VII (RADV VEGA20)	Vulkan	gpt-oss 20B Q8_0	11.27 GiB	pp512	573.07	572.15	-0.2%
AMD Radeon (TM) Pro VII (RADV VEGA20)	Vulkan	gpt-oss 20B Q8_0	11.27 GiB	tg128	108.78	109.03	+0.2%
AMD Radeon (TM) Pro VII (RADV VEGA20)	Vulkan	llama 13B Q4_0	12.56 GiB	pp512	260.80	260.26	-0.2%
AMD Radeon (TM) Pro VII (RADV VEGA20)	Vulkan	llama 13B Q4_0	12.56 GiB	tg128	26.62	26.55	-0.2%
AMD Radeon (TM) Pro VII (RADV VEGA20)	Vulkan	llama 7B Q4_0	3.56 GiB	pp512	835.60	831.11	-0.5%
AMD Radeon (TM) Pro VII (RADV VEGA20)	Vulkan	llama 7B Q4_0	3.56 GiB	tg128	80.48	80.30	-0.2%
AMD Radeon (TM) Pro VII (RADV VEGA20)	Vulkan	llama 8B Q4_K - Small	4.36 GiB	pp512	293.51	292.27	-0.4%
AMD Radeon (TM) Pro VII (RADV VEGA20)	Vulkan	llama 8B Q4_K - Small	4.36 GiB	tg128	71.97	71.54	-0.6%
Intel(R) Arc(tm) A770 Graphics (DG2)	Vulkan	gpt-oss 20B Q8_0	11.27 GiB	pp512	184.80	184.76	-0.0%
Intel(R) Arc(tm) A770 Graphics (DG2)	Vulkan	gpt-oss 20B Q8_0	11.27 GiB	tg128	20.03	20.02	-0.1%
Intel(R) Arc(tm) A770 Graphics (DG2)	Vulkan	llama 13B Q4_0	12.56 GiB	pp512	276.56	273.86	-1.0%
Intel(R) Arc(tm) A770 Graphics (DG2)	Vulkan	llama 13B Q4_0	12.56 GiB	tg128	16.47	16.49	+0.1%
Intel(R) Arc(tm) A770 Graphics (DG2)	Vulkan	llama 7B Q4_0	3.56 GiB	pp512	658.03	657.06	-0.1%
Intel(R) Arc(tm) A770 Graphics (DG2)	Vulkan	llama 7B Q4_0	3.56 GiB	tg128	46.83	46.72	-0.2%
Intel(R) Arc(tm) A770 Graphics (DG2)	Vulkan	llama 8B Q4_K - Small	4.36 GiB	pp512	100.39	100.39	-0.0%
Intel(R) Arc(tm) A770 Graphics (DG2)	Vulkan	llama 8B Q4_K - Small	4.36 GiB	tg128	30.55	30.52	-0.1%
NVIDIA GeForce RTX 3090	Vulkan	gpt-oss 20B Q8_0	11.27 GiB	pp512	3832.77	3816.03	-0.4%
NVIDIA GeForce RTX 3090	Vulkan	gpt-oss 20B Q8_0	11.27 GiB	tg128	147.77	146.95	-0.6%
NVIDIA GeForce RTX 3090	Vulkan	llama 13B Q4_0	12.56 GiB	pp512	1722.12	1718.14	-0.2%
NVIDIA GeForce RTX 3090	Vulkan	llama 13B Q4_0	12.56 GiB	tg128	52.11	51.55	-1.1%
NVIDIA GeForce RTX 3090	Vulkan	llama 7B Q4_0	3.56 GiB	pp512	4406.60	4371.66	-0.8%
NVIDIA GeForce RTX 3090	Vulkan	llama 7B Q4_0	3.56 GiB	tg128	143.50	143.31	-0.1%
NVIDIA GeForce RTX 3090	Vulkan	llama 8B Q4_K - Small	4.36 GiB	pp512	4492.99	4488.97	-0.1%
NVIDIA GeForce RTX 3090	Vulkan	llama 8B Q4_K - Small	4.36 GiB	tg128	118.45	117.96	-0.4%
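
For readers checking the table: the comment does not say how the `%` column is computed. It appears consistent with the relative change of `avg_ts(pr)` against `avg_ts(master)`, rounded to one decimal place. A minimal sketch under that assumption (the helper name `percent_change` is hypothetical and not part of llama.cpp or its tooling):

```python
def percent_change(master: float, pr: float) -> float:
    """Relative change of the PR build versus master, in percent.

    Assumption: this is how the table's % column is derived; the
    source does not state the formula.
    """
    return (pr - master) / master * 100.0


# A few (label, avg_ts(master), avg_ts(pr)) rows copied from the table above.
rows = [
    ("Pro VII, gpt-oss 20B Q8_0, pp512", 573.07, 572.15),
    ("Pro VII, gpt-oss 20B Q8_0, tg128", 108.78, 109.03),
    ("A770, llama 13B Q4_0, pp512", 276.56, 273.86),
    ("RTX 3090, llama 13B Q4_0, tg128", 52.11, 51.55),
]

for label, master, pr in rows:
    # Signed, one decimal place, matching the table's formatting.
    print(f"{label}: {percent_change(master, pr):+.1f}%")
```

Under this reading, every listed delta is within about 1% of master in either direction.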

vulkan: handle large sizes for get_rows

763be1a

jeffbolznv requested a review from 0cc4m as a code owner August 30, 2025 19:15

github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Aug 30, 2025

This was referenced Aug 30, 2025

tests: large sizes for get_rows #15687

Merged

ggml: add ops for WAN video model (cuda && cpu) #15669

Merged

0cc4m approved these changes Aug 31, 2025

0cc4m merged commit bbbf5ec into ggml-org:master Aug 31, 2025
46 of 48 checks passed

walidbr pushed a commit to walidbr/llama.cpp that referenced this pull request Sep 7, 2025

vulkan: handle large sizes for get_rows (ggml-org#15686)

e16026b
