
Conversation

@netrunnereve
Collaborator

@netrunnereve netrunnereve commented May 9, 2025

Basically GCN 3 and 4 chips support FP16, but they're unable to process two values at once like GCN 5 can. Since there's apparently no performance benefit, shaderFloat16 is disabled in the drivers, even though the chips fully support it and RADV is able to generate those instructions.

While the actual FMAs won't run any faster, having FP16 means that I can use a quarter of the shared memory for mul mat and save a little bit of memory bandwidth when reading the B matrix. As a result I get a small improvement in prompt processing on my RX 470.
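To make the storage saving concrete, here is a rough, self-contained sketch; the tile dimensions are hypothetical and the real mul_mat shader tiling is not reproduced here. Per element, 16-bit storage halves the cached B tile, and the overall figure above presumably also reflects layout changes in the shader itself:

```cpp
// Rough illustration only: hypothetical tile sizes, not the actual ggml
// mul_mat shader tiling. Shows the shared-memory footprint of a cached
// B-matrix tile at FP32 vs FP16 storage (uint16_t stands in for a half float).
#include <cstdint>
#include <cstdio>

constexpr size_t BN = 64, BK = 16; // hypothetical tile dimensions

int main() {
    size_t tile_fp32 = BN * BK * sizeof(float);    // 4 bytes per element
    size_t tile_fp16 = BN * BK * sizeof(uint16_t); // 2 bytes per element
    printf("B tile: %zu bytes at FP32, %zu bytes at FP16\n", tile_fp32, tile_fp16);
    return 0;
}
```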

PR:

| model | size | params | backend | ngl | test | t/s |
| ------------- | --------: | -----: | ------- | --: | ----- | ------------: |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 99 | pp512 | 195.34 ± 0.70 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 99 | tg128 | 33.48 ± 0.53 |

Master:

| model | size | params | backend | ngl | test | t/s |
| ------------- | --------: | -----: | ------- | --: | ----- | ------------: |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 99 | pp512 | 188.32 ± 0.38 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 99 | tg128 | 33.62 ± 0.28 |

I'm leaving this as a draft for now as it's a bit hacky and I'm not sure if the proprietary drivers support this. The good thing, though, is that it lets me work on and test the FP16 shaders using my old card.

@github-actions github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels on May 9, 2025
Collaborator

@0cc4m 0cc4m left a comment


This can only be merged if it passes validation, which I doubt. If it does not, I would require disabling it by default and hiding it behind an environment variable. The backend has to follow the Vulkan specification.
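A minimal sketch of the kind of opt-in gate described here; the variable name GGML_VK_FORCE_FP16 is invented for illustration and is not an actual backend option:

```cpp
// Hypothetical opt-in gate: the FP16 override stays off unless the user sets
// an environment variable. GGML_VK_FORCE_FP16 is a made-up name for illustration.
#include <cstdlib>

static bool fp16_override_enabled() {
    const char * env = std::getenv("GGML_VK_FORCE_FP16");
    return env != nullptr && env[0] != '\0' && env[0] != '0';
}
```

The GCN 3/4 special case in the diff below would then only take effect when this returns true.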


```diff
-device->fp16 = device->fp16 && vk12_features.shaderFloat16;
+// GCN 3 and 4 chips support FP16 at regular speed, but the drivers don't indicate it
+device->fp16 = device->fp16 && (vk12_features.shaderFloat16 || (device->architecture == AMD_GCN34));
```

It's against the spec to enable this if support is not indicated with the shaderFloat16 feature, which it is not for these GPUs. Did you try running this with validation layers enabled? I'm relatively sure this should throw validation issues.
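For reference, a minimal sketch of the spec-compliant way to check for FP16 shader support on a Vulkan 1.2 device, using only the standard feature-query API:

```cpp
// Query VkPhysicalDeviceVulkan12Features::shaderFloat16 through
// vkGetPhysicalDeviceFeatures2; FP16 shader pipelines are only legal if this
// feature is set (or if VK_AMD_gpu_shader_half_float is available).
#include <vulkan/vulkan.h>

bool device_supports_shader_float16(VkPhysicalDevice physical_device) {
    VkPhysicalDeviceVulkan12Features vk12_features = {};
    vk12_features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_2_FEATURES;

    VkPhysicalDeviceFeatures2 features2 = {};
    features2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
    features2.pNext = &vk12_features;

    vkGetPhysicalDeviceFeatures2(physical_device, &features2);
    return vk12_features.shaderFloat16 == VK_TRUE;
}
```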

@netrunnereve
Collaborator Author

I finally got the validation layers working and it fails with `vkCreateComputePipelines(): pCreateInfos[0].stage SPIR-V Capability Float16 was declared, but one of the following requirements is required (VkPhysicalDeviceVulkan12Features::shaderFloat16 OR VK_AMD_gpu_shader_half_float)`. So yeah, Vulkan isn't happy with how I'm using FP16 when the driver says it doesn't support it.

Considering how small the improvement is, I don't think it's worth having a special environment variable and all that, so I'm just going to close this.

@netrunnereve netrunnereve deleted the fp16 branch May 10, 2025 00:33
