What happened?
When using DeepSeek, Qwen (thinking mode), or GLM in thinking mode, the response starts without the <think> token.
I tried both the unsloth and ubergarm GGUFs; all have this problem.
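A minimal reproduction check, as a sketch: the endpoint, payload, and response shape are taken from the curl call in the log below, while the script itself and its two checks are my own addition and assume the server is already running on port 1024.

# check_think_tag.py - send one chat request and report whether the
# assistant content starts with the opening <think> tag.
import json
import urllib.request

payload = {
    "messages": [{"role": "user", "content": "hi"}],
    "stream": False,
    "cache_prompt": False,
}
req = urllib.request.Request(
    "http://127.0.0.1:1024/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

content = body["choices"][0]["message"]["content"]
print("starts with <think>:", content.startswith("<think>"))
print("contains </think>  :", "</think>" in content)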
Name and Version
For the last two weeks I have rebuilt at every new commit; all builds have this problem.
What operating system are you seeing the problem on?
Linux
Relevant log output
llama-server --port 1024 -a a --threads 40 --threads-batch 80 --no-mmap -vq --no-display-prompt -m unsloth/DeepSeek-V3.1-UD-IQ1_M-00001-of-00005.gguf --jinja --chat-template-kwargs '{"thinking":true}' --temp 0.6 --top-p 0.95 -c 65536 -np 1 -mla 3 -fmoe -ctk q8_0 -fa -ub 4096 -b 4096
curl -v -N http://127.0.0.1:1024/v1/chat/completions -H 'Connection: keep-alive' -H 'Content-Type: application/json' --data-raw '{"messages":[{"role":"user","content":"hi"}],"stream":false,"cache_prompt":false,"timings_per_token":false}'
* Trying 127.0.0.1:1024...
* Connected to 127.0.0.1 (127.0.0.1) port 1024 (#0)
> POST /v1/chat/completions HTTP/1.1
> Host: 127.0.0.1:1024
> User-Agent: curl/7.88.1
> Accept: */*
> Connection: keep-alive
> Content-Type: application/json
> Content-Length: 107
>
< HTTP/1.1 200 OK
< Access-Control-Allow-Origin:
< Content-Length: 1011
< Content-Type: application/json; charset=utf-8
< Keep-Alive: timeout=5, max=5
< Server: llama.cpp
<
{"choices":[{"finish_reason":"stop","index":0,"message":{"role":"assistant","content":"Hmm, the user just said \"hi\" which is a simple greeting. No additional context or request provided. \n\nSince it's a casual opening, a warm and friendly response would be appropriate. No need to overcomplicate it. \n\nI can keep it simple with a cheerful greeting and an open-ended offer to help. That way the user can choose to elaborate or ask something specific. \n\nAdding a bit of emoji might make it feel more natural and engaging.</think>Hi! How can I help you today? π"}}],"created":1757612171,"model":"gpt-3.5-turbo-0613","object":"chat.completion","usage":{"completion_tokens":108,"prompt_tokens":5,"total_tokens":113},"id":"chatcmpl-sQiTyk5PlTbcQwji39vAqpblZvECeGHG","timings":{"prompt_n":5,"prompt_ms":450.934,"prompt_per_token_ms":90.1868,"prompt_per_second":11.088097149471984,"predicted_n":108,"predicted_ms":12810.345,"predicted_per_token_ms":118.61430555555555,"predicted_per_second":8.43068629299211}}* Connection #0 to host 127.0.0.1 left intact