Integrates Qwen3-VL and Qwen3VL-MoE architecture support from upstream.
Implements IMROPE (interleaved MRoPE, the interleaved variant of multimodal RoPE) for vision models.
Adds deepstack layer support for visual feature processing.
Changes include:
- New architecture types: LLM_ARCH_QWEN3VL, LLM_ARCH_QWEN3VLMOE
- IMROPE rope type for vision position encoding
- Deepstack visual feature handling in clip.cpp
- GGML CUDA kernels for IMROPE
- Tensor mappings for Qwen3VL architecture
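
To illustrate what "interleaved" means for the IMROPE rope type listed above, here is a minimal sketch, not the upstream kernel. It assumes each rotary pair is driven by one of three position components (temporal, height, width), with a per-component budget given by a hypothetical `sections` array, and that pairs beyond a component's budget fall back to an "extra" component. The names `Axis`, `mrope_axis` and `imrope_axis` are made up for this example.

```cpp
#include <array>
#include <cassert>

// Position components that a rotary pair can rotate with (illustrative names).
enum class Axis { Temporal = 0, Height = 1, Width = 2, Extra = 3 };

// Contiguous layout for comparison: [t t t ... | h h ... | w w ... | extra].
Axis mrope_axis(int pair, const std::array<int, 3>& sections) {
    assert(pair >= 0);
    if (pair < sections[0]) return Axis::Temporal;
    if (pair < sections[0] + sections[1]) return Axis::Height;
    if (pair < sections[0] + sections[1] + sections[2]) return Axis::Width;
    return Axis::Extra;
}

// Interleaved layout: t h w t h w ...; an axis drops out once its budget
// of sections[axis] pairs is used up (assumption: such pairs map to Extra).
Axis imrope_axis(int pair, const std::array<int, 3>& sections) {
    assert(pair >= 0);
    const int axis = pair % 3;  // 0 = temporal, 1 = height, 2 = width
    if (pair < 3 * sections[axis]) return static_cast<Axis>(axis);
    return Axis::Extra;
}
```

With illustrative budgets `{4, 2, 2}`, the contiguous layout assigns pairs 0 to 3 to the temporal component, while the interleaved layout cycles temporal, height, width from pair 0 onward, so every component covers both low and high rotary frequencies.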
Upstream PR: ggml-org/llama.cpp#16780
Contributors: @JJJYmmm @yairpatch @Thireus @LETS-BEE