Sync master with upstream release b5121 #54

jan-service-account · 2025-04-12T00:08:12Z

Updates dev branch with latest release (b5121) from ggml-org/llama.cpp

* ggml: fixes ggml-org#12846 compilation error
* ggml: add documentation for code change
* ggml: refactor to type-cast and update documentation
* ggml: update documentation to provide full issue link

Signed-off-by: Aaron Teo
Co-authored-by: Aleksei Nikiforov

This commit adds a check for the visionOS build version used with vtool in build-xcframework.sh. The script now checks the Xcode version to determine whether to use "xros" or "visionos" for the build version. It also invokes vtool through xcrun, so that the vtool shipped with the Xcode command line tools is used instead of the one on the system path. Refs: ggml-org/whisper.cpp#2994 (comment)

* SYCL: Add fp16 support to some elementwise OP kernels
* remove comment ggml-ci
* Use static_cast directly
* remove not needed cast from tanh
* Use static cast and remove unneeded castings
* Adjust device_support_op for unary OPs
* Use cast_data and typed_data struct to deduplicate casting code

* support download from modelscope
* support login
* remove comments
* add arguments
* fix code
* fix win32
* test passed
* fix readme
* revert readme
* change to MODEL_ENDPOINT
* revert tail line
* fix readme
* refactor model endpoint
* remove blank line
* fix header
* fix as comments
* update comment
* update readme

Co-authored-by: tastelikefeet <yuze.zyz@alibaba-inc/com>
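As an illustration of the `MODEL_ENDPOINT` change listed above, the following is a minimal sketch of how a download base URL might be resolved from that environment variable. The function names, the Hugging Face default URL, and the trailing-slash normalization are assumptions made for this example; they are not taken from the upstream code.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not the upstream function): choose the base URL for
// model downloads. Assumption: a non-empty override replaces the default hub.
std::string resolve_model_endpoint(const char * env_value) {
    std::string endpoint = "https://huggingface.co/";  // assumed default
    if (env_value != nullptr && env_value[0] != '\0') {
        endpoint = env_value;
        // Normalize so callers can append a repository path directly.
        if (endpoint.back() != '/') {
            endpoint += '/';
        }
    }
    return endpoint;
}

// Reads the MODEL_ENDPOINT variable named in the commit list above.
std::string model_endpoint_from_env() {
    return resolve_model_endpoint(std::getenv("MODEL_ENDPOINT"));
}
```

Under these assumptions, setting `MODEL_ENDPOINT` to a ModelScope URL would redirect downloads there, while leaving it unset keeps the default.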

taronaeo and others added 15 commits April 11, 2025 08:20

llama : correct rms norm for llama 4 (ggml-org#12882)

8b91d53

convert : proper tensor name mapping for llama4 (ggml-org#12870)

5b1f13c

* Llama-4 mapping
* remove hacky renaming

Co-authored-by: Daniel Han

ci : Replace freediskspace to free_disk_space in docker.yml (ggml-org#12861)

8ac9f5d

Signed-off-by: Xiaodong Ye

convert : Llama4 RoPE fix (ggml-org#12889)

ec6c09d

clip : use smart pointer (⚠️ breaking change) (ggml-org#12869)

0c50923

* clip : use smart pointers
* fix warmup
* add forward declaration
* missing include
* fix include (2)
* composite
* simplify batch ptr
* fix conflict
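For readers unfamiliar with the pattern behind the smart-pointer change, here is a minimal sketch of wrapping a C-style context in `std::unique_ptr` with a custom deleter. All names (`demo_ctx`, `demo_init`, `demo_free`, `demo_ctx_ptr`) are hypothetical stand-ins and are not the actual clip API.

```cpp
#include <memory>

// Stand-in for an opaque C-style context that must be released manually.
struct demo_ctx {
    int * live_count;  // tracks how many contexts are alive (for the demo only)
};

demo_ctx * demo_init(int * live_count) {
    ++*live_count;
    return new demo_ctx{live_count};
}

void demo_free(demo_ctx * ctx) {
    if (ctx != nullptr) {
        --*ctx->live_count;
        delete ctx;
    }
}

// Deleter that forwards to the C-style free function.
struct demo_ctx_deleter {
    void operator()(demo_ctx * ctx) const { demo_free(ctx); }
};

// Owning handle: the context is freed automatically when it leaves scope.
using demo_ctx_ptr = std::unique_ptr<demo_ctx, demo_ctx_deleter>;

// Returns the number of live contexts after the owning handle is destroyed.
int live_after_scope() {
    int live = 0;
    {
        demo_ctx_ptr ctx(demo_init(&live));  // one context alive inside this block
    }
    return live;
}
```

A change like this is typically breaking because callers that previously held and freed raw pointers must switch to the owning handle type.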

llama-model : add Glm4Model implementation for GLM-4-0414 (ggml-org#12867)

06bb53a

* GLM-4-0414
* use original one
* Using with tensor map
* fix bug
* change order
* change order
* format with flake8

sycl: Support sycl_ext_oneapi_limited_graph (ggml-org#12873)

578754b

The current usage of the SYCL-Graph extension checks for the `sycl_ext_oneapi_graph` device aspect. However, it is also possible to support `sycl_ext_oneapi_limited_graph` devices that don't support update.

common : Define cache directory on FreeBSD (ggml-org#12892)

68b08f3
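The following is a minimal sketch of what defining a cache directory on FreeBSD could look like, assuming FreeBSD is treated like Linux and follows the XDG convention (`XDG_CACHE_HOME`, falling back to `$HOME/.cache`). The function names and the `llama.cpp/` subdirectory are assumptions made for this example, not the upstream implementation.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: build a per-user cache path from XDG-style inputs.
// Returns an empty string when neither location is known.
std::string pick_cache_dir(const char * xdg_cache_home, const char * home) {
    std::string dir;
    if (xdg_cache_home != nullptr && xdg_cache_home[0] != '\0') {
        dir = xdg_cache_home;
    } else if (home != nullptr && home[0] != '\0') {
        dir = std::string(home) + "/.cache";
    } else {
        return "";
    }
    if (dir.back() != '/') {
        dir += '/';
    }
    return dir + "llama.cpp/";  // assumed application subdirectory
}

// Assumption: FreeBSD shares the Linux branch; other platforms are out of
// scope for this sketch and yield an empty string.
std::string cache_dir_from_env() {
#if defined(__linux__) || defined(__FreeBSD__)
    return pick_cache_dir(std::getenv("XDG_CACHE_HOME"), std::getenv("HOME"));
#else
    return "";
#endif
}
```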

tool-call: fix non-tool-calling grammar crashes w/ Qwen / Hermes 2 templates (ggml-org#12900)

b6930eb

* `tool-call`: don't call common_chat_params_init_hermes_2_pro when there aren't tools (or when there's a schema)
* test all chat formats w/o tools
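The guard described in this commit can be sketched roughly as follows: the Hermes 2 Pro path is only taken when tools are present and no JSON schema was requested. The types, the function name, and the `<tool_call>` template check are hypothetical and only illustrate the decision; the upstream logic may differ.

```cpp
#include <string>

// Hypothetical stand-ins for the chat format selection inputs and result.
enum class demo_chat_format { content_only, hermes_2_pro };

struct demo_chat_inputs {
    std::string template_src;      // raw chat template text
    bool has_tools = false;        // caller supplied tool definitions
    bool has_json_schema = false;  // caller requested schema-constrained output
};

demo_chat_format pick_chat_format(const demo_chat_inputs & in) {
    // Assumption: Hermes-style templates are recognized by a <tool_call> marker.
    const bool hermes_like =
        in.template_src.find("<tool_call>") != std::string::npos;

    // Only use the tool-calling handler when there are tools and no schema.
    if (hermes_like && in.has_tools && !in.has_json_schema) {
        return demo_chat_format::hermes_2_pro;
    }
    return demo_chat_format::content_only;
}
```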

rpc : Set cache directory in rpc-server.cpp on FreeBSD (ggml-org#12903)

e8a6263

server : add VSCode's Github Copilot Chat support (ggml-org#12896)

c94085d

* server : add VSCode's Github Copilot Chat support
* cont : update handler name

jan-service-account merged commit 932e858 into dev Apr 12, 2025
15 checks passed

jan-service-account deleted the update-dev-from-master-2025-04-12-00-08 branch April 12, 2025 00:18

14 participants
