forked from ggml-org/llama.cpp
Sync master with upstream release b6052 #185
Merged: jan-service-account merged 13 commits into dev from update-dev-from-master-2025-08-01-00-14 on Aug 1, 2025.
Conversation
* graph : avoid creating redundant s_copy views
* graph : comment the s_copy views

…t. (ggml-org#14985)
* CANN: Improve loading efficiency after converting weights to NZ format.
* CANN: fix typo

* Add support for Llada-8b: diffusion model
* Add README
* Fix README and convert_hf_to_gguf
* convert_hf_to_gguf.py: address review comments
* Make everything in a single example
* Remove model-specific sampling
* Remove unused argmax
* Remove braced initializers, improve README.md a bit
* Add diffusion-specific gguf params in set_vocab, remove setting rope_theta and rms_norm_eps
* Remove adding the mask token
* Move add_add_bos_token to set_vocab
* use add_bool in gguf_writer.py (see the GGUFWriter sketch after the commit list)
Signed-off-by: Lukas Straub <[email protected]>

* llama-server : implement universal assisted decoding
* Erase prompt tail for kv-cache
* set vocab_dft_compatible in common_speculative
* rename ctx_main to ctx_tgt
* move vocab_dft_compatible to spec struct
* clear mem_dft, remove mem
* detokenize id_last for incompatible models
* update comment
* add --spec-replace flag
* accept special tokens when translating between draft/main models
* Escape spec-replace
* clamp draft result size to params.n_draft
* fix comment
* clean up code
* restore old example
* log common_speculative_are_compatible in speculative example
* fix
* Update common/speculative.cpp (Co-authored-by: Georgi Gerganov <[email protected]>)
* Update common/speculative.cpp (Co-authored-by: Georgi Gerganov <[email protected]>)
* Update common/speculative.cpp (Co-authored-by: Georgi Gerganov <[email protected]>)

Co-authored-by: Georgi Gerganov <[email protected]>

* MODEL_TENSOR.SSM_DT_NORM was defined twice, and the second definition overwrote the Jamba model's layer name (see the duplicate-key sketch after the commit list)
* correct order

* support minicpm-v 4
* add md
* support MiniCPM-o 4.0
* add default location
* temp rm MiniCPM-o 4.0
* fix code
* fix "minicpmv_projector" default path

* vulkan: fix debug mode issues
* vulkan: remove broken check_results GGML_OP_SET_ROWS support
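The Llada-8b commit above moves its diffusion-specific parameters to `add_bool` in `gguf_writer.py`. Below is a minimal sketch of that pattern using gguf-py's `GGUFWriter`; the architecture string and key name are illustrative placeholders, not the actual values the commit writes:

```python
# Sketch: writing a boolean, model-specific KV pair with gguf-py's
# GGUFWriter.add_bool. "llada" and the key below are placeholders.
from gguf import GGUFWriter

writer = GGUFWriter("model.gguf", arch="llada")
writer.add_bool("example.diffusion.enabled", True)  # hypothetical key name

# Standard GGUFWriter finalize sequence
writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()  # no tensors were added in this sketch
writer.close()
```

Typed helpers like `add_bool` carry the value type into the GGUF KV metadata, which is why the commit prefers them over ad-hoc encoding.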
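The Jamba fix above is an instance of a classic Python pitfall: in a dict literal, a duplicated key is legal, and the last occurrence silently wins. A minimal sketch of the failure mode, with stand-in names rather than the real gguf-py tables:

```python
# Sketch of the bug class: duplicate dict keys raise no error in Python;
# the second entry silently overwrites the first.
from enum import Enum, auto

class MODEL_TENSOR(Enum):  # stand-in for gguf-py's enum
    SSM_DT = auto()
    SSM_DT_NORM = auto()

TENSOR_NAMES = {
    MODEL_TENSOR.SSM_DT:      "blk.{bid}.ssm_dt",
    MODEL_TENSOR.SSM_DT_NORM: "blk.{bid}.ssm_dt_norm",  # intended entry
    MODEL_TENSOR.SSM_DT_NORM: "blk.{bid}.ssm_dt",       # duplicate silently wins
}

# The intended mapping is gone; nothing warns about it.
print(TENSOR_NAMES[MODEL_TENSOR.SSM_DT_NORM])  # -> blk.{bid}.ssm_dt
```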
Updates the dev branch with the latest release (b6052) from ggml-org/llama.cpp.