* master: (113 commits)
webui: updated the chat service to only include max_tokens in the req… (ggml-org#16489)
cpu : optimize the ggml NORM operation (ggml-org#15953)
server : host-memory prompt caching (ggml-org#16391)
No markdown in cot (ggml-org#16483)
model-conversion : add support for SentenceTransformers (ggml-org#16387)
ci: add ARM64 Kleidiai build and test support (ggml-org#16462)
CANN: Improve ACL graph matching (ggml-org#16166)
kleidiai: kernel interface refactoring (ggml-org#16460)
[SYCL] refactor soft_max, add soft_max_back (ggml-org#16472)
model: EmbeddingGemma Adding Support for SentenceTransformers Dense Modules (ggml-org#16367)
refactor: centralize CoT parsing in backend for streaming mode (ggml-org#16394)
Disable CUDA host buffers on integrated GPUs (ggml-org#16308)
server : fix cancel pending task (ggml-org#16467)
metal : mark FA blocks (ggml-org#16372)
server : improve context checkpoint logic (ggml-org#16440)
ggml webgpu: profiling, CI updates, reworking of command submission (ggml-org#16452)
llama : support LiquidAI LFM2-MoE hybrid model (ggml-org#16464)
server : add `/v1/health` endpoint (ggml-org#16461)
webui : added download action (ggml-org#13552) (ggml-org#16282)
presets : fix pooling param for embedding models (ggml-org#16455)
...