Releases: ngxson/llama.cpp

b6215

20 Aug 12:48
657b8a7
chat: handle gpt-oss return/end token inconsistency (#15421)

This commit addresses an inconsistency during inference by adding a new
member to the `templates_params` struct to indicate whether the chat is
in inference mode. This allows the gpt-oss specific function
`common_chat_params_init_gpt_oss` to check this flag and the
`add_generation_prompt` flag to determine if it should replace the
`<|return|>` token with the `<|end|>` token in the prompt.

The motivation for this change is to ensure that the formatted prompt of
past messages in `common_chat_format_single` matches the output of the
formatted new message. The issue is that the gpt-oss template returns
different end tags: `<|return|>` when `add_generation_prompt` is false,
and `<|end|>` when `add_generation_prompt` is true. This causes the
substring function to start at an incorrect position, resulting in
tokenization starting with 'tart|>' instead of '<|start|>'.
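
The off-by-three effect can be reproduced with a small sketch. The helpers `render_past`, `new_message_delta`, and `normalize_end_tag` below are hypothetical stand-ins written for illustration; they are not the functions in llama.cpp, and the message layout is a simplified assumption. The point is only that `<|return|>` is three characters longer than `<|end|>`, so a prefix length taken from the first rendering cuts three characters into the `<|start|>` of the second.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the template: renders one past assistant message.
// The end tag depends on add_generation_prompt, as described in the commit message.
static std::string render_past(const std::string & content, bool add_generation_prompt) {
    const std::string end_tag = add_generation_prompt ? "<|end|>" : "<|return|>";
    return "<|start|>assistant<|message|>" + content + end_tag;
}

// Cuts the already-formatted past prompt off the front of the full prompt,
// leaving only the text that belongs to the new message.
static std::string new_message_delta(const std::string & fmt_past, const std::string & fmt_full) {
    assert(fmt_full.size() >= fmt_past.size());
    return fmt_full.substr(fmt_past.size());
}

// Hypothetical fix: rewrite a trailing <|return|> to <|end|> so both
// renderings of the past messages have the same length.
static std::string normalize_end_tag(std::string prompt) {
    const std::string ret = "<|return|>";
    const std::string end = "<|end|>";
    if (prompt.size() >= ret.size() &&
        prompt.compare(prompt.size() - ret.size(), ret.size(), ret) == 0) {
        prompt.replace(prompt.size() - ret.size(), ret.size(), end);
    }
    return prompt;
}
```

With a past prompt rendered without the generation prompt and a full prompt rendered with it, `new_message_delta` returns text beginning with `tart|>`; passing the past prompt through `normalize_end_tag` first makes the delta begin with `<|start|>` again.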

Resolves: https://github.com/ggml-org/llama.cpp/issues/15417

b6213

20 Aug 11:01
1a99c2d
cmake : fix target include directories (#15450)

* Update docker.yml

Modify docker.yml so that the workflow no longer runs periodically; it can still be started manually when needed.

* feat: Modify the header file include path

1. There is no `llava` directory in the `tools` directory.
2. The `mtmd` CMakeLists.txt file calls `target_include_directories(mtmd PUBLIC .)`, so any target that links against `mtmd` automatically gets the `mtmd` directory as a header search path. You can therefore either remove `target_include_directories(${TARGET} PRIVATE ../llava)` or replace it with `target_include_directories(${TARGET} PRIVATE ../mtmd)` to explicitly require the `llama-server` target to use header files from `mtmd`.

* Restore the docker.yml file

b6210

20 Aug 03:02
a094f38
musa: fix build warnings (#15258)

* musa: fix build warnings

Signed-off-by: Xiaodong Ye

* fix warning: comparison of integers of different signs: 'const int' and 'unsigned int' [-Wsign-compare]

Signed-off-by: Xiaodong Ye

---------

Signed-off-by: Xiaodong Ye
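
For context, `-Wsign-compare` fires when a signed and an unsigned integer are compared directly, because the signed operand is converted to unsigned first. The sketch below shows one common way to silence it safely; `fits_in_block` and its parameter names are hypothetical and are not taken from the MUSA backend, and the actual change in this commit may use a different approach.

```cpp
// Hypothetical helper illustrating a sign-safe comparison.
// Writing `n_cols <= max_block_size` directly would convert n_cols to
// unsigned int, trigger -Wsign-compare, and treat a negative n_cols as a
// very large value.
static bool fits_in_block(const int n_cols, const unsigned int max_block_size) {
    if (n_cols < 0) {
        return false; // reject negatives before converting
    }
    return static_cast<unsigned int>(n_cols) <= max_block_size;
}
```

Checking the sign before the cast keeps the comparison well defined for every input, which a bare cast alone would not.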

b6209

19 Aug 18:49
fb22dd0
opencl: mark `argsort` unsupported if cols exceed workgroup limit (#1…

b6208

19 Aug 17:22
9ef6b0b
model : add gpt-oss type strings (#15424)

b6206

19 Aug 14:33
d2fcd91
server : disable context shift by default (#15416)

* server : disable context shift by default

ggml-ci

* server : make scope of test parameters local

b6203

19 Aug 09:13
6424594
ggml-cpu: add mxfp4 VSX intrinsics for Power9+ (ppc64le) hardware (#1…

b6202

19 Aug 08:51
e9288e8
Compare
Choose a tag to compare
chat : clarify the meaning of reasoning_format (#15408)

* chat : clarify the meaning of reasoning_format

* add link to this PR

b6201

19 Aug 06:01
9d262f4
server : remove swa_full warning (#15399)

b6199

18 Aug 21:16
f08c4c0
mtmd : clean up clip_n_output_tokens (#15391)