
Conversation

@jan-service-account

Updates dev branch with latest release (b6490) from ggml-org/llama.cpp

am17an and others added 8 commits September 16, 2025 10:38
* ci : update macos-latest* jobs to use macos-latest

This commit updates the jobs that are named macos-latest* to use the
macos-latest label instead of explicit versions.

The motivation for this is that there is currently a mixture of
versions in this workflow, and some jobs are failing because
they require a newer version.

Refs: https://github.com/ggml-org/llama.cpp/actions/runs/17644792595/job/50140010907#step:5:1759

* ci : add xcodebuild -downloadPlatform iOS command

clang-format previously wrapped long CUDA macros (e.g. __launch_bounds__)
across unreadable line breaks inside template declarations, such as:

  template<int D, int ncols, int nwarps, int VKQ_stride,
           typename KQ_acc_t, bool use_logit_softcap>
      __launch_bounds__(nwarps*ggml_cuda_get_physical_warp_size(), 1)

This change adjusts formatting rules so that CUDA macros remain consistent
and aligned with the surrounding template syntax.
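
For illustration, here is a minimal sketch of the layout the adjusted rules
aim for, using a dummy kernel rather than the actual flash-attention
declaration: the __launch_bounds__ macro stays on a single, unbroken line at
the same indentation as the template it belongs to.

  template <int D, int ncols, int nwarps>
  __launch_bounds__(nwarps * 32, 1)  // macro kept on its own, unbroken line
  static __global__ void dummy_copy_kernel(const float * __restrict__ src,
                                           float       * __restrict__ dst) {
      const int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < D) {
          dst[i] = src[i];
      }
  }
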
…6010)

This commit updates the github workflows build.yml file to include steps
for uploading and downloading the xcframework artifact. The
macos-latest-swift job now depends on the ios-xcode-build job and
downloads the xcframework artifact produced by it.

The motivation for this change is that it takes a long time to build
the xcframework and we are currently doing this twice in the workflow.
With this change, we only build it once and reuse the artifact.

* ggml : remove adding extra dim timestep embedding

This commit updates the ggml_timestep_embedding function to no longer
add an extra dimension when the specified dimension is odd.

The motivation for this change is that the extra dimension is unnecessary
when the specified dimension is odd, and it caused issues in kernels that
were not expecting it, resulting in uninitialized memory for the second to
last dimension.
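
A minimal CPU sketch of the behaviour after this change (illustrative only,
not the actual ggml code), assuming the usual half-cosine / half-sine layout:
with an odd dim the output keeps its size and only the final element is left
as zero padding, instead of an extra dimension being allocated.

  #include <cmath>
  #include <vector>

  // Sinusoidal timestep embedding of size `dim`; for an odd `dim` only the
  // last element remains zero padding, no extra dimension is added.
  static std::vector<float> timestep_embedding(float t, int dim, int max_period = 10000) {
      std::vector<float> emb(dim, 0.0f);        // emb[dim - 1] stays 0 for odd dim
      const int half = dim / 2;
      for (int j = 0; j < half; ++j) {
          const float freq = std::exp(-std::log((float) max_period) * (float) j / (float) half);
          emb[j]        = std::cos(t * freq);   // first half: cosines
          emb[j + half] = std::sin(t * freq);   // second half: sines
      }
      return emb;
  }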

* ggml-cuda : fix padding in timestep embedding kernel

This commit removes the zeroing out of the last dimension now that we
are not adding the extra padding dimension.
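
A simplified sketch of such a kernel (illustrative, not the actual ggml-cuda
code), assuming one block per timestep: with no extra padding dimension, only
the single trailing element of an odd dim has to be zeroed.

  __global__ void timestep_embedding_kernel(const float * timesteps, float * dst,
                                            int dim, int max_period) {
      const int   row  = blockIdx.x;            // one timestep per block
      const float t    = timesteps[row];
      float *     out  = dst + (size_t) row * dim;
      const int   half = dim / 2;
      for (int j = threadIdx.x; j < half; j += blockDim.x) {
          const float freq = expf(-logf((float) max_period) * (float) j / (float) half);
          out[j]        = cosf(t * freq);
          out[j + half] = sinf(t * freq);
      }
      if (dim % 2 != 0 && threadIdx.x == 0) {
          out[dim - 1] = 0.0f;                  // zero-pad the single odd element
      }
  }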

* ggml-metal : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-opencl : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-sycl : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-vulkan : fix padding in timestep embedding kernel

This commit fixes the zero padding for odd dimensions in
the timestep embedding kernel.

* ggml-cpu : fix padding in timestep embedding function

This commit removes the zeroing out of the last dimension now that we
are not adding the extra padding dimension.

This commit updates the runs-on field for the macOS arm64 webgpu build
job to use macos-latest instead of just latest.

The motivation for this is that this job can wait for a runner to pick
up the job for a very long time, sometimes over 7 hours. This is an
attempt to see if this change can help reduce the wait time.

Refs: https://github.com/ggml-org/llama.cpp/actions/runs/17754163447/job/50454257570?pr=16004

* llama-bench: add --n-cpu-moe support

Support --n-cpu-moe in llama-bench the same way it is supported by
llama-server.
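
As a rough illustration (not the actual llama-bench code; the tensor-name
pattern is an assumption), the option can be thought of as expanding into
per-layer overrides that keep the MoE expert tensors of the first N layers
in CPU memory:

  #include <cstdio>
  #include <string>
  #include <vector>

  // Hypothetical sketch: expand an --n-cpu-moe value into per-layer
  // tensor-name regexes whose matches would stay on the CPU buffer type.
  static std::vector<std::string> n_cpu_moe_patterns(int n_layers) {
      std::vector<std::string> patterns;
      for (int il = 0; il < n_layers; ++il) {
          // assumed naming of the per-layer MoE expert weights
          patterns.push_back("blk\\." + std::to_string(il) + "\\.ffn_(up|down|gate)_exps");
      }
      return patterns;
  }

  int main() {
      for (const auto & p : n_cpu_moe_patterns(2)) {
          std::printf("%s -> CPU\n", p.c_str());
      }
      return 0;
  }
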
jan-service-account merged commit e4912fc into dev Sep 17, 2025
3 checks passed
jan-service-account deleted the update-dev-from-master-2025-09-17-00-33 branch September 17, 2025 00:34