
Conversation

@jan-service-account

Updates dev branch with latest release (b5142) from ggml-org/llama.cpp

bachelor-dou and others added 10 commits April 15, 2025 10:04
Multiple optional memory pools are provided for CANN, including VMM, priority-queue-based, and traditional memory pools.
1. When the memory pool is available and GGML_CANN_DISABLE_VMM_POOL is not defined, the VMM pool is selected by default.
2. Otherwise, if GGML_CANN_ENABLE_BUF_PRIO_POOL is defined, the priority-queue-based memory pool is used.
3. If neither condition is met, the default memory pool is used.
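For illustration, a minimal C++ sketch of that selection precedence; the pool class names and the `vmm_supported()` check are hypothetical stand-ins, not the CANN backend's actual symbols:

```cpp
#include <memory>

// Hypothetical stand-ins for the three pool implementations; the real
// classes live in the CANN backend and are not reproduced here.
struct cann_pool          { virtual ~cann_pool() = default; };
struct cann_pool_vmm      : cann_pool {};
struct cann_pool_buf_prio : cann_pool {};
struct cann_pool_buf      : cann_pool {};

// Placeholder capability check, assumed for this sketch.
static bool vmm_supported(int /*device*/) { return true; }

// Selection precedence as described in the commit message above.
std::unique_ptr<cann_pool> make_cann_pool(int device) {
    (void) device; // may be unused depending on the macros below
#ifndef GGML_CANN_DISABLE_VMM_POOL
    // 1. Prefer the VMM pool when available and not disabled at compile time.
    if (vmm_supported(device)) {
        return std::make_unique<cann_pool_vmm>();
    }
#endif
#ifdef GGML_CANN_ENABLE_BUF_PRIO_POOL
    // 2. Otherwise use the priority-queue-based pool if it is enabled.
    return std::make_unique<cann_pool_buf_prio>();
#else
    // 3. Fall back to the traditional buffer pool.
    return std::make_unique<cann_pool_buf>();
#endif
}
```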
* [CANN] Optimize ROPE

* [CANN] Code-style adjustment

* [CANN] Fix the ROPE precision issue

* [CANN] Code-style fix

* [CANN] Add ROPE unsupported case

Signed-off-by: noemotiovon <[email protected]>
* Add AVX512 implementation of GEMM - q4kx8

* Update changes to remove unnecessary whitespace
* Merged using squash to remove noisy commit messages

* Force flash attention off for `LLM_ARCH_DEEPSEEK2` - embedding too large

* Removed 3 conts (2x RoPE and 1x RMS-norm)

* Changed to use `<cmath>` instead of `<math.h>`

* Reverted removal of the 3 conts

* Used `reshape` in `llm_graph_context::build_attn_mha()`

* Use `k_pe = ggml_reshape` (see the sketch after this list)

* Removed the 3 conts again

* Removed the 3D views of `wk_b` and `wv_b`, and just save them as 3D in GGUF

* Removed MQA optimisation from `build_attn_mha()` as it no longer shows gains

* Simplified `is_mla` branch in `llm_build_deepseek2()`

* Removed `build_attn_mla` and added `nullptr` to all `build_attn` calls

* Fixed call to `build_attn` in `llm_build_t5_enc`
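For context, a minimal sketch of the `ggml_reshape` idea referenced above, using the public ggml API; the tensor name and shapes are illustrative, not the exact ones in `build_attn_mha()`:

```cpp
#include "ggml.h"

// Relabel a contiguous k_pe tensor as 3D without copying. ggml_reshape_3d
// requires a contiguous input, and in exchange avoids the extra copy that a
// ggml_cont followed by a view would introduce. Shapes are illustrative.
static struct ggml_tensor * reshape_k_pe(struct ggml_context * ctx,
                                         struct ggml_tensor  * k_pe,
                                         int64_t n_embd_head_qk_rope,
                                         int64_t n_tokens) {
    return ggml_reshape_3d(ctx, k_pe, n_embd_head_qk_rope, 1, n_tokens);
}
```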
* SYCL: Add ROPE vision kernel

* Add comment about rope mode
…2934)

Replace compile-time `GGML_HIP_UMA` with environment variable `GGML_CUDA_ENABLE_UNIFIED_MEMORY`. This unifies the usage on NVIDIA and AMD GPUs, and allows a single binary to be shared between integrated and dedicated GPUs.
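As a rough sketch of how such a runtime switch can be read (the variable name comes from the commit above; the helper itself is illustrative, not the backend's actual code):

```cpp
#include <cstdlib>

// Returns true when the user set GGML_CUDA_ENABLE_UNIFIED_MEMORY in the
// environment. Checking at runtime lets a single binary serve both
// integrated and dedicated GPUs, unlike the old compile-time GGML_HIP_UMA
// switch, which baked the choice into the build.
static bool unified_memory_requested() {
    return std::getenv("GGML_CUDA_ENABLE_UNIFIED_MEMORY") != nullptr;
}
```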
* CANN: Add x86 build CI

* CANN: fix code format
…org#12886)

* opencl: refactor - split the kernel files

---------

Co-authored-by: Shangqing Gu <[email protected]>

* opencl: split more kernels into separate files

* opencl: specify subgroup size instead of querying it

* opencl: refine Adreno cl compiler version parsing (see the sketch after this list)

* opencl: skip some kernels not used by Adreno on old compilers

* opencl: refine logic for selecting Adreno kernels

* opencl: refine Adreno cl compiler version

* opencl: cleanup preprocessor for kernels

* opencl: consider Adreno CL compiler on Windows

* opencl: add final newline for `mul_mv_f16_f16.cl`

---------

Co-authored-by: Shangqing Gu <[email protected]>
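As a loose illustration of the driver-version parsing the Adreno commits above refer to: the version-string token and the gating threshold below are assumptions for the sketch, not the backend's actual format or cut-off.

```cpp
#include <cstdio>
#include <cstring>

// Sketch: pull a "major.minor" compiler version out of an OpenCL driver
// string such as "... Compiler E031.45.02.16 ..." (format assumed here).
static bool parse_adreno_compiler_version(const char * driver_version,
                                          int * major, int * minor) {
    const char * p = std::strstr(driver_version, "Compiler E");
    if (p == nullptr) {
        return false;
    }
    return std::sscanf(p, "Compiler E%d.%d", major, minor) == 2;
}

// Example gate: skip kernels that old compilers cannot build. The cut-off
// version is a made-up placeholder, not the backend's actual threshold.
static bool supports_new_kernels(int major, int minor) {
    return major > 38 || (major == 38 && minor >= 11);
}
```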
@vansangpfiev merged commit 9cd30a0 into dev Apr 16, 2025
7 checks passed
@vansangpfiev deleted the update-dev-from-master-2025-04-16-00-08 branch April 16, 2025 02:47
