Releases · ggml-org/llama.cpp

01 Aug 14:00

0f5ccd6

b6058

model : add hunyuan dense (#14878)

* support hunyuan_v1_dense

Signed-off-by: stevenkuang <[email protected]>

* update hunyuan_moe to hunyuan_v1_moe

Signed-off-by: stevenkuang <[email protected]>

* fix rope alpha assert and bos token

Signed-off-by: stevenkuang <[email protected]>

* add blank line

Signed-off-by: stevenkuang <[email protected]>

* Revert "update hunyuan_moe to hunyuan_v1_moe"

This reverts commit aa973ca21913aba77f6e81a935270ef7be222e75.

* use hunyuan_dense instead of hunyuan_v1_dense

Signed-off-by: stevenkuang <[email protected]>

* fix hunyuan_moe chat template

Signed-off-by: stevenkuang <[email protected]>

* remove leftover code

Signed-off-by: stevenkuang <[email protected]>

* update hunyuan dense chat template

Signed-off-by: stevenkuang <[email protected]>

* fix hunyuan dense vocab and chat template

Signed-off-by: stevenkuang <[email protected]>

---------

Signed-off-by: stevenkuang <[email protected]>

Assets 15

01 Aug 11:32

github-actions

b6057

1c872f7

opencl: add f16 for `add`, `sub`, `mul`, `div` (#14984)

Assets 15

01 Aug 06:37

github-actions

b6056

baad948

ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)

* Initial Q2_K Block Interleaving Implementation

* Addressed review comments and clean up of the code

* Post rebase fixes

* Initial CI/CD fixes

* Update declarations in arch-fallback.h

* Changes for GEMV Q2_K in arch-fallback.h

* Enable repacking only on AVX-512 machines

* Update comments in repack.cpp

* Address q2k comments

---------

Co-authored-by: Manogna-Sree <[email protected]>

Assets 15

01 Aug 04:05

github-actions

b6055

ba42794

graph : fix equal_seq() check (#14986)

ggml-ci

Assets 15

01 Aug 02:22

github-actions

b6054

2860d47

docker : add cann build pipeline (#14591)

* docker: add cann build pipeline

* docker: add cann build pipeline

* docker: fix cann devops

* cann : fix multi card hccl

* Update ggml/src/ggml-cann/ggml-cann.cpp

Co-authored-by: Xuan-Son Nguyen <[email protected]>

* Update ggml-cann.cpp

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: Xuan-Son Nguyen <[email protected]>

Assets 15

31 Jul 20:04

github-actions

b6052

daf2dd7

quantize : skip tensor override when in fallback mode (#14995)

Assets 15

31 Jul 19:09

github-actions

b6051

a06ed5f

llama : add simple option to enable CPU for MoE weights (--cpu-moe) (…

Assets 15

31 Jul 18:29

github-actions

b6050

7845240

Fix params bug in diffusion example (#14993)

Assets 15

31 Jul 17:45

github-actions

b6049

d6818d0

llama : allow other bufts when overriding to CPU, add --no-repack opt…

Assets 15

31 Jul 17:49

github-actions

b6048

e08a988

Vulkan: Fix minor debug mode issues (#14899)

* vulkan: fix debug mode issues

* vulkan: remove broken check_results GGML_OP_SET_ROWS support

Assets 15
