Port of Qwen3-VL support from mainline #883
Conversation
- convert_hf_to_gguf.py: not touched; use llama.cpp to convert the model instead
- SYCL and Metal support for imrope not added
- Vulkan support for imrope not tested
- Code not tested
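For context, imrope is the interleaved multimodal RoPE used by Qwen3-VL. Below is a minimal sketch of the idea, assuming the interleaving works as in mainline llama.cpp's ggml rope kernels: classic MRoPE assigns contiguous blocks of rotary dimension pairs to the temporal/height/width position components, while imrope alternates the three axes across pairs so every frequency band covers all axes. All names here (`pick_theta_*`, `sections`, `thetas_t`) are illustrative, not actual ggml symbols.

```c
// Sketch: how interleaved MRoPE ("imrope") picks a rotary angle per
// dimension pair, contrasted with the contiguous sections of classic MRoPE.
// Hypothetical helper names; not the real ggml implementation.
#include <stdio.h>

// One position-derived angle per axis: temporal, height, width, extra.
typedef struct { float t, h, w, e; } thetas_t;

// Classic MRoPE: contiguous blocks, e.g. sections = {24, 20, 20, 0}
// -> pairs [0,24) use t, [24,44) use h, [44,64) use w.
static float pick_theta_mrope(int pair, const int sections[4], thetas_t th) {
    if (pair < sections[0])                               return th.t;
    if (pair < sections[0] + sections[1])                 return th.h;
    if (pair < sections[0] + sections[1] + sections[2])   return th.w;
    return th.e;
}

// Interleaved MRoPE: pair % 3 selects the axis (t/h/w alternate),
// capped by each section's size; anything left over uses the extra angle.
static float pick_theta_imrope(int pair, const int sections[4], thetas_t th) {
    if (pair % 3 == 1 && pair < 3 * sections[1]) return th.h;
    if (pair % 3 == 2 && pair < 3 * sections[2]) return th.w;
    if (pair % 3 == 0 && pair < 3 * sections[0]) return th.t;
    return th.e;
}

int main(void) {
    const int sections[4] = {24, 20, 20, 0};      // illustrative split, 64 pairs
    const thetas_t th = {1.0f, 2.0f, 3.0f, 4.0f}; // dummy per-axis angles
    for (int pair = 0; pair < 8; ++pair) {
        printf("pair %d: mrope=%.0f imrope=%.0f\n", pair,
               pick_theta_mrope(pair, sections, th),
               pick_theta_imrope(pair, sections, th));
    }
    return 0;
}
```

With the dummy angles above, classic MRoPE prints `1` for the first 24 pairs (all temporal), while imrope cycles `1 2 3 1 2 3 ...`, which is the interleaving the PR has to support in each backend.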
I tried to build this branch with cpu-only and CUDA but it kept failing: CUDA

@ranilongxi, thank you; should be fixed now.
Apart from the comments, LGTM.
But we need to get this tested by someone before merging.
Thank you @ikawrakow, I'll do my best to resolve your comments. There are still some compilation issues, which I'll resolve as well.

@ikawrakow please let me know if you are happy with how I've addressed your comments:
Source: ggml-org/llama.cpp#16780