forked from ggml-org/llama.cpp
Sync master with upstream release b4980 #33
Merged: jan-service-account merged 14 commits into dev from update-dev-from-master-2025-03-28-00-08 on Mar 28, 2025
Conversation
* SYCL: implement memset ggml backend buffer interface
* use GGML_ABORT macro
* Do not wait for all queues to finish for memset operation
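A minimal sketch of the idea behind the memset commit above, assuming the SYCL backend fills a device buffer with sycl::queue::memset and then waits only on that single event rather than synchronizing every queue. The function name and arguments are illustrative, not the actual ggml-sycl interface.

```cpp
#include <sycl/sycl.hpp>
#include <cstdint>

void buffer_memset(sycl::queue & q, void * base, uint8_t value,
                   size_t offset, size_t size) {
    // enqueue the fill on this buffer's queue only ...
    sycl::event ev = q.memset(static_cast<uint8_t *>(base) + offset, value, size);
    // ... and wait for just that operation, not for all queues to finish
    ev.wait();
}
```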
* llama : make loras compatible with repacking ggml-ci
* cont : simplify ggml-ci
* cont : add TODO [no ci]
* ggml : add 128-bit RVV support
* ggml : revert to old RVV 256+ q2_K, q3_K, q4_K, q6_K impl
* remove trailing whitespaces
* restructure vector length selection code
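A self-contained sketch of what runtime vector-length selection on RISC-V can look like, assuming dispatch is based on VLEN read from the vlenb CSR (vector register width in bytes). It only prints which path would be taken; the real ggml kernels and their restructured selection code are not reproduced here.

```cpp
#include <cstdio>
#include <cstddef>

static size_t riscv_vlenb(void) {
    size_t vlenb = 0;
#if defined(__riscv)
    __asm__ volatile("csrr %0, vlenb" : "=r"(vlenb));
#endif
    return vlenb;
}

int main(void) {
    const size_t vlenb = riscv_vlenb();
    if (vlenb >= 32) {
        std::printf("VLEN >= 256 bits: keep the wide q2_K/q3_K/q4_K/q6_K path\n");
    } else if (vlenb >= 16) {
        std::printf("VLEN == 128 bits: take the new 128-bit RVV path\n");
    } else {
        std::printf("no vector unit: fall back to scalar kernels\n");
    }
    return 0;
}
```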
This change upstreams llamafile's CPU matrix multiplication kernels for the ppc64le ISA using MMA builtins. This patch handles matrix multiplication between quantised datatypes, block_q4_0 and block_q8_0. This change results in a 5%-50% improvement in total speed (i.e., all tokens/total time) across various batch sizes. The patch is tested with Meta-Llama-3-8B, Mistral-7B, and Llama-2-7B-chat-hf models on an IBM POWER10 machine. Signed-off-by: Amrita H S <[email protected]>
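A scalar reference for the block_q4_0 x block_q8_0 dot product that kernels like the ones above accelerate, assuming the standard ggml block layout (32 weights per block, one scale each). float stands in for ggml_half so the sketch stays self-contained; it is not the ppc64le MMA kernel itself.

```cpp
#include <cstdint>
#include <cstddef>

#define QK 32

struct block_q4_0 { float d; uint8_t qs[QK / 2]; };  // 4-bit weights packed two per byte
struct block_q8_0 { float d; int8_t  qs[QK];     };  // 8-bit activations

// dot product over n values, n a multiple of QK
float vec_dot_q4_0_q8_0(size_t n, const block_q4_0 * x, const block_q8_0 * y) {
    float sum = 0.0f;
    for (size_t i = 0; i < n / QK; ++i) {
        int32_t isum = 0;
        for (int j = 0; j < QK / 2; ++j) {
            const int v0 = (x[i].qs[j] & 0x0F) - 8;  // low nibble  -> element j
            const int v1 = (x[i].qs[j] >> 4)   - 8;  // high nibble -> element j + QK/2
            isum += v0 * y[i].qs[j] + v1 * y[i].qs[j + QK / 2];
        }
        sum += x[i].d * y[i].d * isum;  // apply both per-block scales
    }
    return sum;
}
```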
ggml-ci
ggml-ci
* add edgellm model arch [conversation feature doesn't work]
* remove output.weight layer for edgellm arch
* [Model] update the name of the model
* update the name of model arch in convert gguf
* [Model] Refactor the model arch into llama-model
* [Bug] Fix the bug in create attn kv
* [Code] Fix editorconfig errors
* [Code] Remove trailing whitespace
* [Code] Remove trailing whitespace
* [Code] Change the order of model arch in list
* [Code] Fix flake8 lint errors
* Remove trailing whitespace
* [Code] Remove call in model arch
…g#12600)
* opencl: add `im2col`
* opencl: add `gelu_quick`
* opencl: add mrope
* opencl: add vision rope
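A plain C++ reference for the gelu_quick activation that the OpenCL commit above adds a kernel for, assuming the usual "quick GELU" approximation x * sigmoid(1.702 * x); the actual OpenCL kernel source is not shown here.

```cpp
#include <cmath>

// quick GELU approximation: x * sigmoid(1.702 * x)
inline float gelu_quick(float x) {
    return x * (1.0f / (1.0f + std::exp(-1.702f * x)));
}
```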
* server : Bump cpp-httplib to include AF_UNIX Windows support
  Signed-off-by: Piotr Stankiewicz <[email protected]>
* server : Allow running the server example on a unix socket
  Signed-off-by: Piotr Stankiewicz <[email protected]>
---------
Signed-off-by: Piotr Stankiewicz <[email protected]>
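A POSIX sketch of what running the server on a unix socket means for a client: connect over AF_UNIX instead of TCP and speak HTTP over the file descriptor. The socket path is an illustrative placeholder, not one defined by this PR.

```cpp
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <cstring>
#include <cstdio>

int main(void) {
    const char * path = "/tmp/llama-server.sock";  // assumed example path

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    sockaddr_un addr{};
    addr.sun_family = AF_UNIX;
    std::strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    if (connect(fd, reinterpret_cast<const sockaddr *>(&addr), sizeof(addr)) < 0) {
        perror("connect");
        close(fd);
        return 1;
    }
    // from here, an HTTP request/response exchange works exactly as over TCP
    close(fd);
    return 0;
}
```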
Updates dev branch with latest release (b4980) from ggml-org/llama.cpp