Name and Version
Hello all, I've been having issues with Vulkan lately on my Intel hardware (Iris Plus iGPU). I love Vulkan because it really speeds up image processing.
broken.txt
inxi.txt
vulkaninfo.txt
works.txt
I did a git bisect to determine the first broken commit and it is the following:
439342e is the first bad commit
commit 439342e (HEAD, tag: b7065)
Author: Jeff Bolz [email protected]
Date: Sat Nov 15 04:56:15 2025 -0600
vulkan: Use ggml_vk_tensor_subbuffer in mul_mat_vec(id) paths (#17244)
* vulkan: Use ggml_vk_tensor_subbuffer in mul_mat_vec(id) paths
* set allow_misalign
ggml/src/ggml-vulkan/ggml-vulkan.cpp | 631 +++++++++++++++++++++++++++++++++++----------------------------------------------------------------
1 file changed, 220 insertions(+), 411 deletions(-)
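For reference, the bisect session looked roughly like this (the good ref and the exact rebuild steps are illustrative, reconstructed from memory; the build directory matches the paths in the logs below):

git bisect start
git bisect bad b7065                   # first tag where the crash appears for me
git bisect good <last-known-good-ref>  # an earlier build that still worked
# at each bisect step: rebuild, then re-run the llama-mtmd-cli command below
cmake -B build_vulkan -DGGML_VULKAN=ON && cmake --build build_vulkan -j
git bisect good    # or: git bisect bad, depending on whether it crashed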
I am using the following command:
llama-mtmd-cli -ngl 0 -m /vlms/SmolVLM2-2.2B-Instruct-f16.gguf --mmproj /SmolVLM2-2.2B-Instruct-mmproj-f16.gguf --image ./tmp6zsf6lbv.jpg -p "Describe. It should be more than 10 words but less than 50 words."
I am also attaching logs from a run that works and a run that doesn't, along with inxi and vulkaninfo output. Please let me know if you need additional information.
The CPU-only build from the same commit works fine, and reverting this commit also makes the crash go away. The failing run dies with vk::DeviceLostError from vk::Queue::submit while decoding the image batch (full backtrace below).
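In case it helps, this is roughly how I verified the revert (illustrative; assumes the same build directory as above):

git revert --no-edit 439342e
cmake --build build_vulkan -j
# re-run the same llama-mtmd-cli command as above; with the revert it completes normally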
Tagging @jeffbolznv for visibility
Operating systems
Linux
GGML backends
Vulkan
Hardware
Graphics:
Device-1: Intel Iris Plus Graphics G7 driver: i915 v: kernel
API: Vulkan v: 1.4.321 drivers: intel,llvmpipe surfaces: N/A
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Iris(R) Plus Graphics (ICL GT2) (Intel open-source Mesa driver) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 65536 | int dot: 0 | matrix cores: none
build: xxxx (xxxx) with cc (GCC) 15.2.1 20250813 for x86_64-pc-linux-gnu
llama_model_load_from_file_impl: using device Vulkan0 (Intel(R) Iris(R) Plus Graphics (ICL GT2)) (0000:00:02.0) - 10494 MiB free
Models
I tried with SmolVLM, but I'm also seeing it with other models such as LFM2.
Problem description & steps to reproduce
llama-mtmd-cli -ngl 0 -m /vlms/SmolVLM2-2.2B-Instruct-f16.gguf --mmproj /SmolVLM2-2.2B-Instruct-mmproj-f16.gguf --image ./tmp6zsf6lbv.jpg -p "Describe. It should be more than 10 words but less than 50 words."
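If a fuller trace would help, the crash can also be captured directly in gdb batch mode (standard gdb usage, nothing llama.cpp-specific; debug symbols assumed):

gdb --batch -ex run -ex bt --args ./build_vulkan/bin/llama-mtmd-cli -ngl 0 -m /vlms/SmolVLM2-2.2B-Instruct-f16.gguf --mmproj /SmolVLM2-2.2B-Instruct-mmproj-f16.gguf --image ./tmp6zsf6lbv.jpg -p "Describe. It should be more than 10 words but less than 50 words."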
First Bad Commit
439342e (tag: b7065): vulkan: Use ggml_vk_tensor_subbuffer in mul_mat_vec(id) paths (#17244)
Relevant log output
warmup: flash attention is enabled
main: loading model: /vlms/SmolVLM2-2.2B-Instruct-f16.gguf
encoding image slice...
image slice encoded in 16314 ms
decoding image batch 1/1, n_tokens_batch = 81
Batch offset=0xc000 len=0x0 on queue 0 (aperture: 1218.7Mb, 0.0Mb VRAM only)
BO: addr=0x000000058a200000-0x000000058b794fff size= 22100KB handle=00014 capture=0 vram_only=0 name=user
BO: addr=0x0000000540000000-0x00000005524bffff size= 299776KB handle=00010 capture=0 vram_only=0 name=user
BO: addr=0x0000000587800000-0x000000058a07ffff size= 41472KB handle=00013 capture=0 vram_only=0 name=user
BO: addr=0x00000003c0000000-0x00000003c01fffff size= 2048KB handle=00002 capture=1 vram_only=0 name=dynamic pool
BO: addr=0x0000000300000000-0x00000003001fffff size= 2048KB handle=00003 capture=1 vram_only=0 name=instruction pool
BO: addr=0x00000000c0000000-0x00000000c01fffff size= 2048KB handle=00006 capture=1 vram_only=0 name=binding table pool
BO: addr=0x0000000140000000-0x00000001401fffff size= 2048KB handle=00005 capture=1 vram_only=0 name=bindless surface state pool
BO: addr=0x00000002c0000000-0x00000002c01fffff size= 2048KB handle=00007 capture=1 vram_only=0 name=indirect push descriptor pool
BO: addr=0x0000000100000000-0x00000001001fffff size= 2048KB handle=00004 capture=1 vram_only=0 name=internal surface state pool
BO: addr=0x0000000000200000-0x00000000003fffff size= 2048KB handle=00001 capture=1 vram_only=0 name=general pool
BO: addr=0x0000000552600000-0x0000000553605fff size= 16408KB handle=00011 capture=0 vram_only=0 name=user
BO: addr=0x0000000553800000-0x00000005877ddfff size= 851832KB handle=00012 capture=0 vram_only=0 name=user
BO: addr=0xffffeffeffe00000-0xffffeffeffffffff size= 2048KB handle=00008 capture=1 vram_only=0 name=slab_parent
[New LWP 80882]
[New LWP 80881]
[New LWP 80880]
[New LWP 80879]
[New LWP 80878]
[New LWP 80876]
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.archlinux.org>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
0x00007fb610e9f042 in ?? () from /usr/lib/libc.so.6
#0 0x00007fb610e9f042 in ?? () from /usr/lib/libc.so.6
#1 0x00007fb610e931ac in ?? () from /usr/lib/libc.so.6
#2 0x00007fb610e931f4 in ?? () from /usr/lib/libc.so.6
#3 0x00007fb610f03dcf in wait4 () from /usr/lib/libc.so.6
#4 0x00007fb61456cb6b in ggml_print_backtrace () from /llama.cpp/build_vulkan/bin/libggml-base.so.0
#5 0x00007fb61457f379 in ggml_uncaught_exception() () from /llama.cpp/build_vulkan/bin/libggml-base.so.0
#6 0x00007fb6112b1eba in __cxxabiv1::__terminate (handler=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
warning: 48 /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc: No such file or directory
#7 0x00007fb6112975d9 in std::terminate () at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
58 in /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc
#8 0x00007fb6112b2176 in __cxxabiv1::__cxa_throw (obj=<optimized out>, tinfo=0x7fb6144d7778 <typeinfo for vk::DeviceLostError>, dest=0x7fb61179cd80 <vk::DeviceLostError::~DeviceLostError()>) at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_throw.cc:98
warning: 98 /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_throw.cc: No such file or directory
#9 0x00007fb61168639d in ggml_vk_submit(std::shared_ptr<vk_context_struct>&, vk::Fence) [clone .cold] () from /llama.cpp/build_vulkan/bin/libggml-vulkan.so.0
#10 0x00007fb61176d7a4 in ggml_vk_buffer_write_2d(std::shared_ptr<vk_buffer_struct>&, unsigned long, void const*, unsigned long, unsigned long, unsigned long) [clone .constprop.0] () from /llama.cpp/build_vulkan/bin/libggml-vulkan.so.0
#11 0x00007fb61176e406 in ggml_backend_vk_buffer_set_tensor(ggml_backend_buffer*, ggml_tensor*, void const*, unsigned long, unsigned long) () from /llama.cpp/build_vulkan/bin/libggml-vulkan.so.0
#12 0x00007fb6145878b4 in ggml_backend_sched_graph_compute_async () from /llama.cpp/build_vulkan/bin/libggml-base.so.0
#13 0x00007fb61469afa0 in llama_context::graph_compute(ggml_cgraph*, bool) () from /llama.cpp/build_vulkan/bin/libllama.so.0
#14 0x00007fb61469ce63 in llama_context::process_ubatch(llama_ubatch const&, llm_graph_type, llama_memory_context_i*, ggml_status&) () from /llama.cpp/build_vulkan/bin/libllama.so.0
#15 0x00007fb6146a202f in llama_context::decode(llama_batch const&) () from /llama.cpp/build_vulkan/bin/libllama.so.0
#16 0x00007fb6146a2fce in llama_decode () from /llama.cpp/build_vulkan/bin/libllama.so.0
#17 0x00007fb614be6903 in mtmd_helper_decode_image_chunk () from /llama.cpp/build_vulkan/bin/libmtmd.so.0
#18 0x00007fb614be7c9d in mtmd_helper_eval_chunk_single () from /llama.cpp/build_vulkan/bin/libmtmd.so.0
#19 0x00007fb614be800c in mtmd_helper_eval_chunks () from /llama.cpp/build_vulkan/bin/libmtmd.so.0
#20 0x0000562078ae902c in eval_message(mtmd_cli_context&, common_chat_msg&) ()
#21 0x0000562078ae519a in main ()
[Inferior 1 (process 80874) detached]
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
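Happy to rerun with extra instrumentation if useful; a hedged sketch of what I'd try (I believe ggml-vulkan exposes GGML_VULKAN_DEBUG and GGML_VULKAN_VALIDATE CMake options, but correct me if the names are off):

cmake -B build_vulkan -DGGML_VULKAN=ON -DGGML_VULKAN_DEBUG=ON -DGGML_VULKAN_VALIDATE=ON
cmake --build build_vulkan -j
# then re-run the failing command above to get per-op Vulkan debug output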