
Conversation


@gyf304 gyf304 commented Oct 4, 2024

Some Vulkan devices (namely integrated graphics) have multiple memory heaps: a smaller dedicated heap and a larger heap shared with system memory.

ggml uses the first usable memory type, which usually resides on the smaller dedicated heap, and this can cause allocation failures. This patch adds an environment variable that forces allocation from a specific memory heap.
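The heap-forcing idea can be sketched with plain structs standing in for the real Vulkan types (the variable name GGML_VK_FORCE_HEAP_INDEX and the helper names below are illustrative assumptions, not the actual patch):

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Simplified stand-in for VkMemoryType entries in
// VkPhysicalDeviceMemoryProperties::memoryTypes.
struct MemoryType {
    uint32_t property_flags; // e.g. DEVICE_LOCAL, HOST_VISIBLE bits
    uint32_t heap_index;     // which heap this type allocates from
};

// Returns the index of the first memory type matching required_flags,
// optionally restricted to a single heap (-1 means "any heap").
int find_properties(const std::vector<MemoryType>& types,
                    uint32_t required_flags, int force_heap_index) {
    for (size_t i = 0; i < types.size(); i++) {
        if ((types[i].property_flags & required_flags) != required_flags) {
            continue;
        }
        if (force_heap_index >= 0 &&
            types[i].heap_index != (uint32_t)force_heap_index) {
            continue;
        }
        return (int)i;
    }
    return -1;
}

// Reads the heap override from the environment; the variable name is a
// hypothetical example. Returns -1 when unset (no override).
int read_force_heap_index() {
    const char* s = std::getenv("GGML_VK_FORCE_HEAP_INDEX");
    return s ? std::atoi(s) : -1;
}
```

With the override unset the search behaves as before; with it set, only memory types backed by the chosen heap are candidates.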

@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Oct 4, 2024
@0cc4m 0cc4m self-requested a review October 4, 2024 19:20
```diff
-memory_type_index = find_properties(&mem_props, &mem_req, fallback_flags);
+memory_type_index = find_properties(&mem_props, &mem_req, fallback_flags, device->force_heap_index);
 buf->memory_property_flags = fallback_flags;
 }
```
Collaborator
Shouldn't this fallback already handle that situation? That's basically what it was introduced for. You have a smaller (DEVICE_LOCAL) and a larger heap (DEVICE_LOCAL, HOST_VISIBLE and HOST_COHERENT), so once the small one is exhausted you fall back to the larger one.
Can you add the vulkaninfo of the device where this fallback does not work and describe why?

Author

@gyf304 gyf304 Oct 4, 2024

vulkaninfo.txt

The problem is that find_properties doesn't take memory usage into account (vk::PhysicalDeviceMemoryProperties reports total memory, not available memory).
On boot, with a few Chrome tabs open, my laptop already uses 2.8 of the 4 GiB in the smaller heap, so an attempt to allocate 2 GiB there will fail.

One could argue the better fix is to either

  • use VK_EXT_memory_budget to query actual memory usage in find_properties, so it doesn't select a memory type that lacks enough available memory
  • retry in a loop while allocating: if the allocation fails with ErrorOutOfDeviceMemory, try the next available memory type
  • or a combination of both

I went with the minimum viable solution here: force allocation from a known larger heap (in my case, heap 1).
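The retry idea from the second bullet could be sketched roughly like this (try_alloc and the surrounding names are hypothetical stand-ins for the real Vulkan allocation call):

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical allocation callback: returns false on an
// ErrorOutOfDeviceMemory-style failure for the given memory type.
using TryAlloc = std::function<bool(uint32_t memory_type_index)>;

// Walk all memory types whose flags satisfy required_flags and return
// the first one that actually allocates, instead of failing on the
// first candidate whose heap happens to be exhausted.
int alloc_with_fallback(const std::vector<uint32_t>& type_flags,
                        uint32_t required_flags,
                        const TryAlloc& try_alloc) {
    for (size_t i = 0; i < type_flags.size(); i++) {
        if ((type_flags[i] & required_flags) != required_flags) {
            continue;
        }
        if (try_alloc((uint32_t)i)) {
            return (int)i; // allocation succeeded on this type
        }
        // else: this heap is exhausted, try the next matching type
    }
    return -1; // no memory type could satisfy the allocation
}
```

This keeps the existing flag-based preference order and only moves on when the device actually reports an out-of-memory failure.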

Collaborator

@gyf304 Thank you.

Take a look at ggml-org/whisper.cpp#2451. This might be the kind of fallback that you described in your second point. Can you check whether that would fix your issue?

Member

FYI, I just pulled ggml-org/whisper.cpp#2451 and will soon sync it upstream with ggml and llama.cpp. Not sure if it fixes the problem here, but it worked for a couple of whisper.cpp-related issues.

