Releases: ngxson/llama.cpp
b6215
chat: handle gpt-oss return/end token inconsistency (#15421)

This commit addresses an inconsistency during inference by adding a new member to the `templates_params` struct to indicate whether the chat is in inference mode. This allows the gpt-oss specific function `common_chat_params_init_gpt_oss` to check this flag together with the `add_generation_prompt` flag to determine whether it should replace the `<|return|>` token with the `<|end|>` token in the prompt.

The motivation for this change is to ensure that the formatted prompt of past messages in `common_chat_format_single` matches the output of the formatted new message. The issue is that the gpt-oss template returns different end tags: `<|return|>` when `add_generation_prompt` is false, and `<|end|>` when `add_generation_prompt` is true. This causes the substring function to start at an incorrect position, resulting in tokenization starting with 'tart|>' instead of '<|start|>'.

Resolves: https://github.com/ggml-org/llama.cpp/issues/15417
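A minimal C++ sketch of the end-tag normalization described above; the struct layout, the member name `is_inference`, and the standalone helper are illustrative assumptions, not the actual code in llama.cpp:

```cpp
#include <string>

// Sketch only: the flag name `is_inference` and this helper function are
// assumptions for illustration, not the actual llama.cpp implementation.
struct templates_params_sketch {
    bool add_generation_prompt = true;
    bool is_inference          = true; // assumed name for the new member
};

// When formatting past messages during inference (add_generation_prompt is
// false), the gpt-oss template ends the prompt with <|return|>; replace it
// with <|end|> so it matches the tag produced for a new message.
static std::string normalize_gpt_oss_end_tag(std::string prompt,
                                             const templates_params_sketch & params) {
    const std::string ret_tag = "<|return|>";
    const std::string end_tag = "<|end|>";
    if (params.is_inference && !params.add_generation_prompt) {
        const size_t pos = prompt.rfind(ret_tag);
        if (pos != std::string::npos && pos + ret_tag.size() == prompt.size()) {
            prompt.replace(pos, ret_tag.size(), end_tag);
        }
    }
    return prompt;
}
```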
b6213
cmake : fix target include directories (#15450)

* Update docker.yml

  Modify docker.yml so that the workflow no longer runs periodically; if you want to run it, the workflow can be started manually.

* feat: Modify the header file include path

  1. There is no llava directory in the tools directory.
  2. Because the command `target_include_directories(mtmd PUBLIC .)` is used in the `mtmd` CMakeLists.txt file, other targets that link against `mtmd` automatically include the `mtmd` directory as a search path for header files. Therefore, you can remove `target_include_directories(${TARGET} PRIVATE ../llava)`, or use `target_include_directories(${TARGET} PRIVATE ../mtmd)` to explicitly require the `llama-server` target to use header files from `mtmd`.

* Restore the docker.yml file
b6210
musa: fix build warnings (#15258)

* musa: fix build warnings

  Signed-off-by: Xiaodong Ye <[email protected]>

* fix warning: comparison of integers of different signs: 'const int' and 'unsigned int' [-Wsign-compare]

  Signed-off-by: Xiaodong Ye <[email protected]>

---------

Signed-off-by: Xiaodong Ye <[email protected]>
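As a generic illustration of the `-Wsign-compare` warning class mentioned here (not the actual MUSA code touched by this commit), a signed loop index compared against an unsigned bound, and the usual fix of making the types agree:

```cpp
#include <vector>

// Illustrative only; not the MUSA kernel code from this commit.
void halve(std::vector<float> & data, unsigned int limit) {
    // `for (int i = 0; i < limit; ++i)` would warn: comparison of integers
    // of different signs ('int' vs 'unsigned int'). Using an unsigned index
    // (or casting the bound) keeps the comparison well-defined and silent.
    for (unsigned int i = 0; i < limit && i < data.size(); ++i) {
        data[i] *= 0.5f;
    }
}
```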
b6209
opencl: mark `argsort` unsupported if cols exceed workgroup limit (#1…
b6208
model : add gpt-oss type strings (#15424)
b6206
server : disable context shift by default (#15416)

* server : disable context shift by default

  ggml-ci

* server : make scope of test parameters local
b6203
ggml-cpu: add mxfp4 VSX intrinsics for Power9+ (ppc64le) hardware (#1…
b6202
chat : clarify the meaning of reasoning_format (#15408)

* chat : clarify the meaning of reasoning_format
* add link to this PR
b6201
server : remove swa_full warning (#15399)
b6199
mtmd : clean up clip_n_output_tokens (#15391)