Releases: ggml-org/llama.cpp

b6058

01 Aug 14:00
0f5ccd6
model : add hunyuan dense (#14878)

* support hunyuan_v1_dense

Signed-off-by: stevenkuang <[email protected]>

* update hunyuan_moe to hunyuan_v1_moe

Signed-off-by: stevenkuang <[email protected]>

* fix rope alpha assert and bos token

Signed-off-by: stevenkuang <[email protected]>

* add blank line

Signed-off-by: stevenkuang <[email protected]>

* Revert "update hunyuan_moe to hunyuan_v1_moe"

This reverts commit aa973ca21913aba77f6e81a935270ef7be222e75.

* use hunyuan_dense instead of hunyuan_v1_dense

Signed-off-by: stevenkuang <[email protected]>

* fix hunyuan_moe chat template

Signed-off-by: stevenkuang <[email protected]>

* remove leftover code

Signed-off-by: stevenkuang <[email protected]>

* update hunyuan dense chat template

Signed-off-by: stevenkuang <[email protected]>

* fix hunyuan dense vocab and chat template

Signed-off-by: stevenkuang <[email protected]>

---------

Signed-off-by: stevenkuang <[email protected]>

b6057

01 Aug 11:32
1c872f7
opencl: add f16 for `add`, `sub`, `mul`, `div` (#14984)

b6056

01 Aug 06:37
baad948
ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)

* Initial Q2_K Block Interleaving Implementation

* Addressed review comments and clean up of the code

* Post rebase fixes

* Initial CI/CD fixes

* Update declarations in arch-fallback.h

* Changes for GEMV Q2_K in arch-fallback.h

* Enable repacking only on AVX-512 machines

* Update comments in repack.cpp

* Address q2k comments

---------

Co-authored-by: Manogna-Sree <[email protected]>

b6055

01 Aug 04:05
ba42794
graph : fix equal_seq() check (#14986)

ggml-ci

b6054

01 Aug 02:22
2860d47
docker : add cann build pipeline (#14591)

* docker: add cann build pipeline

* docker: add cann build pipeline

* docker: fix cann devops

* cann : fix multi card hccl

* Update ggml/src/ggml-cann/ggml-cann.cpp

Co-authored-by: Xuan-Son Nguyen <[email protected]>

* Update ggml-cann.cpp

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: Xuan-Son Nguyen <[email protected]>

b6052

31 Jul 20:04
daf2dd7
quantize : skip tensor override when in fallback mode (#14995)

b6051

31 Jul 19:09
a06ed5f
llama : add simple option to enable CPU for MoE weights (--cpu-moe) (…

b6050

31 Jul 18:29
7845240
Fix params bug in diffusion example (#14993)

b6049

31 Jul 17:45
d6818d0
llama : allow other bufts when overriding to CPU, add --no-repack opt…

b6048

31 Jul 17:49
e08a988
Vulkan: Fix minor debug mode issues (#14899)

* vulkan: fix debug mode issues

* vulkan: remove broken check_results GGML_OP_SET_ROWS support