v0.3.16-cu128-AVX2-win-20250831

github-actions released this 31 Aug 18:35
· 25 commits to main since this release

feat: Update Submodule vendor/llama.cpp 6c442f4..bbbf5ec
feat: Sync llama : remove KV cache defragmentation logic
feat: Sync model : jina-embeddings-v3 support
feat: Sync llama: use FA + max. GPU layers by default. The flash_attn parameter in context_params has been removed and replaced by flash_attn_type, which defaults to auto.
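
Code that still passes the removed boolean flash_attn option needs to switch to flash_attn_type. The following is a minimal migration sketch, not part of this package: the `FlashAttnType` class and the `migrate_context_kwargs` helper are hypothetical names, and the integer values (-1 auto, 0 disabled, 1 enabled) are assumed to mirror the upstream llama.cpp `llama_flash_attn_type` enum, so check them against the vendored `llama.h` before relying on them.

```python
from enum import IntEnum
from typing import Any, Dict


class FlashAttnType(IntEnum):
    """Hypothetical Python mirror of the upstream llama_flash_attn_type enum.

    The values are an assumption; verify them against the vendored llama.h.
    """
    AUTO = -1
    DISABLED = 0
    ENABLED = 1


def migrate_context_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs with the legacy flash_attn bool replaced.

    - An explicit flash_attn_type always wins and is left untouched.
    - flash_attn=True/False maps to ENABLED/DISABLED.
    - If neither key is present, fall back to AUTO (the new default).
    """
    new_kwargs = dict(kwargs)
    legacy = new_kwargs.pop("flash_attn", None)  # drop the removed option

    if "flash_attn_type" in new_kwargs:
        return new_kwargs

    if legacy is None:
        new_kwargs["flash_attn_type"] = int(FlashAttnType.AUTO)
    else:
        chosen = FlashAttnType.ENABLED if legacy else FlashAttnType.DISABLED
        new_kwargs["flash_attn_type"] = int(chosen)
    return new_kwargs


if __name__ == "__main__":
    print(migrate_context_kwargs({"n_ctx": 4096, "flash_attn": True}))
```

Whether the high-level Python wrapper in this build accepts flash_attn_type under that exact keyword is also an assumption; the helper only rewrites a plain dictionary and does not call the library.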