Releases: ggml-org/llama.cpp
b6058
model : add hunyuan dense (#14878)
* support hunyuan_v1_dense
* update hunyuan_moe to hunyuan_v1_moe
* fix rope alpha assert and bos token
* add blank line
* Revert "update hunyuan_moe to hunyuan_v1_moe" (This reverts commit aa973ca21913aba77f6e81a935270ef7be222e75.)
* use hunyuan_dense instead of hunyuan_v1_dense
* fix hunyuan_moe chat template
* remove leftover code
* update hunyuan dense chat template
* fix hunyuan dense vocab and chat template
Signed-off-by: stevenkuang
b6057
opencl: add f16 for `add`, `sub`, `mul`, `div` (#14984)
b6056
ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)
* Initial Q2_K Block Interleaving Implementation
* Addressed review comments and clean up of the code
* Post rebase fixes
* Initial CI/CD fixes
* Update declarations in arch-fallback.h
* Changes for GEMV Q2_K in arch-fallback.h
* Enable repacking only on AVX-512 machines
* Update comments in repack.cpp
* Address q2k comments
Co-authored-by: Manogna-Sree
b6055
graph : fix equal_seq() check (#14986) ggml-ci
b6054
docker : add cann build pipeline (#14591)
* docker: add cann build pipeline
* docker: fix cann devops
* cann : fix multi card hccl
* Update ggml/src/ggml-cann/ggml-cann.cpp
* Update ggml-cann.cpp
Co-authored-by: Georgi Gerganov
Co-authored-by: Xuan-Son Nguyen
b6052
quantize : skip tensor override when in fallback mode (#14995)
b6051
llama : add simple option to enable CPU for MoE weights (--cpu-moe) (…
b6050
Fix params bug in diffusion example (#14993)
b6049
llama : allow other bufts when overriding to CPU, add --no-repack opt…
b6048
Vulkan: Fix minor debug mode issues (#14899)
* vulkan: fix debug mode issues
* vulkan: remove broken check_results GGML_OP_SET_ROWS support