Releases: JamePeng/llama-cpp-python
v0.3.16-cu124-AVX2-win-20250913
v0.3.16-cu124-AVX2-linux-20250913
v0.3.16-cu128-AVX2-win-20250831
feat: Update Submodule vendor/llama.cpp 6c442f4..bbbf5ec
feat: Sync llama : remove KV cache defragmentation logic
feat: Sync model : jina-embeddings-v3 support
feat: Sync llama: use FA + max. GPU layers by default. The flash_attn parameter in context_params has been removed and replaced by flash_attn_type, which defaults to auto (see the migration sketch below).
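
Since flash_attn is gone from context_params, callers that passed it explicitly will need to switch to flash_attn_type. A minimal migration sketch, assuming this fork's Llama constructor forwards flash_attn_type to llama.cpp and mirrors upstream's llama_flash_attn_type enum (-1 = auto, 0 = disabled, 1 = enabled); the parameter spelling and enum values are assumptions drawn from upstream, not confirmed against this build:

```python
from llama_cpp import Llama

# Before (no longer accepted): Llama(model_path=..., flash_attn=True)
# After: pass flash_attn_type instead. Assumption: -1 means auto, which is
# also the default, so omitting the argument gives the same behavior.
llm = Llama(
    model_path="./models/model.gguf",  # placeholder path
    n_gpu_layers=-1,                   # offload all layers; FA + max GPU layers is now the upstream default
    flash_attn_type=-1,                # -1 = auto (assumed enum value); 0 = disabled, 1 = enabled
)
print(llm("Q: What is flash attention? A:", max_tokens=32)["choices"][0]["text"])
```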
v0.3.16-cu128-AVX2-linux-20250831
feat: Update Submodule vendor/llama.cpp 6c442f4..bbbf5ec
feat: Sync llama : remove KV cache defragmentation logic
feat: Sync model : jina-embeddings-v3 support (see the embedding sketch below)
feat: Sync llama: use FA + max. GPU layers by default. The flash_attn parameter in context_params has been removed and replaced by flash_attn_type, which defaults to auto.
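
With jina-embeddings-v3 synced from upstream, a GGUF conversion of the model should now load in embedding mode. A minimal sketch using llama-cpp-python's existing embedding API; the model filename is a placeholder, not a file shipped with this release:

```python
from llama_cpp import Llama

# Placeholder filename: assumes a local GGUF conversion of jina-embeddings-v3.
embedder = Llama(
    model_path="./models/jina-embeddings-v3.gguf",
    embedding=True,  # run the model in embedding mode
)

# create_embedding returns an OpenAI-style payload: {"data": [{"embedding": [...]}], ...}
result = embedder.create_embedding("The quick brown fox jumps over the lazy dog.")
vector = result["data"][0]["embedding"]
print(f"embedding dimension: {len(vector)}")
```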
v0.3.16-cu126-AVX2-win-20250831
feat: Update Submodule vendor/llama.cpp 6c442f4..bbbf5ec
feat: Sync llama : remove KV cache defragmentation logic
feat: Sync model : jina-embeddings-v3 support
feat: Sync llama: use FA + max. GPU layers by default. The flash_attn parameter in context_params has been removed and replaced by flash_attn_type, which defaults to auto.
v0.3.16-cu126-AVX2-linux-20250831
feat: Update Submodule vendor/llama.cpp 6c442f4..bbbf5ec
feat: Sync llama : remove KV cache defragmentation logic
feat: Sync model : jina-embeddings-v3 support
feat: Sync llama: use FA + max. GPU layers by default. The flash_attn parameter in context_params has been removed and replaced by flash_attn_type, which defaults to auto.
v0.3.16-cu124-AVX2-win-20250831
feat: Update Submodule vendor/llama.cpp 6c442f4..bbbf5ec
feat: Sync llama : remove KV cache defragmentation logic
feat: Sync model : jina-embeddings-v3 support
feat: Sync llama: use FA + max. GPU layers by default. The flash_attn parameter in context_params has been removed and replaced by flash_attn_type, which defaults to auto.
v0.3.16-cu124-AVX2-linux-20250831
feat: Update Submodule vendor/llama.cpp 6c442f4..bbbf5ec
feat: Sync llama : remove KV cache defragmentation logic
feat: Sync model : jina-embeddings-v3 support
feat: Sync llama: use FA + max. GPU layers by default. The flash_attn parameter in context_params has been removed and replaced by flash_attn_type, which defaults to auto.