v0.3.9-cu126-AVX2-win-20250525
Implement mtmd_cpp.py, based on tools/mtmd/mtmd.h #MTMD_API
Note: llava_cpp.py will be removed after llama_chat_format.py is adjusted.
It can no longer load llava.dll (the library is now mtmd.dll).
Sync kv-cache : add SWA support
Update llama.cpp API code 20250513
Sync context : remove logits_all flag and update API
Update LLAVA_API code in llava_cpp.py
Sync llava_cpp code: Update clip.h function API
Sync quantize: Handle user-defined quantization levels for additional tensors (#12511)
Sync llama : Support llama 4 text-only
Update llama : add option to override model tensor buffers
Sync llama-vocab : add SuperBPE pre-tokenizer
class LlamaSampler: add add_xtc(), add_top_n_sigma() and add_dry() methods
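
For readers unfamiliar with the top-n-sigma sampler referenced by add_top_n_sigma(), here is a minimal pure-Python sketch of the filtering idea: keep only tokens whose logit lies within n standard deviations of the maximum logit. This is an illustration only, not the binding's code. The function name `top_n_sigma_filter` is hypothetical, and the use of the population standard deviation is an assumption.

```python
import math


def top_n_sigma_filter(logits, n=1.0):
    """Return indices of tokens whose logit is >= max(logits) - n * std(logits).

    Hypothetical helper for illustration; assumes population standard deviation.
    """
    if not logits:
        return []
    max_logit = max(logits)
    mean = sum(logits) / len(logits)
    variance = sum((x - mean) ** 2 for x in logits) / len(logits)
    threshold = max_logit - n * math.sqrt(variance)
    # Tokens below the threshold would be masked out before sampling.
    return [i for i, x in enumerate(logits) if x >= threshold]


example_logits = [5.0, 4.5, 1.0, 0.0, -2.0]
kept = top_n_sigma_filter(example_logits, n=1.0)
```

A smaller n keeps fewer candidates (n = 0 keeps only the tokens tied for the maximum), while a larger n lets more of the distribution through.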