Description
I've noticed that Model Runner no longer outputs reasoning content (i.e., the <think>...</think> block) for models such as qwen3 or deepseek-r1-distill-llama. This started after one of the more recent updates; before that, the models did output this type of content.
For example, here is the output from Qwen3 and DeepSeek-R1-Distill-Llama in Model Runner.
Qwen3
docker model run ai/qwen3
Interactive chat mode started. Type '/bye' to exit.
> How many Rs in strawberry?
The word **"strawberry"** contains **three** instances of the letter **R**.
Here's the breakdown:
**S-T-R-A-W-B-E-R-R-Y**
- **R** at position 3
- **R** at position 8
- **R** at position 9
**Answer:** 3 R's.
DeepSeek-R1-Distill-Llama
docker model run ai/deepseek-r1-distill-llama
Interactive chat mode started. Type '/bye' to exit.
> How many Rs in strawberry?
The word "strawberry" contains three Rs.
**Step-by-Step Explanation:**
1. **Write Out the Word:** S-T-R-A-W-B-E-R-R-Y.
2. **Identify Each Letter:** S, T, R, A, W, B, E, R, R, Y.
3. **Count the Rs:**
- The third letter is R (1).
- The eighth letter is R (2).
- The ninth letter is R (3).
4. **Total Count:** 3 Rs.
**Answer:** There are three Rs in "strawberry."
For contrast, this is the output from Qwen3 running in Ollama.
Qwen3 (in Ollama)
ollama run qwen3
>>> How many Rs in strawberry?
Thinking...
Okay, the user is asking how many times the letter 'R' appears in the word "strawberry." Let me break this down
step by step.
First, I should write out the word clearly: S-T-R-A-W-B-E-R-R-Y. Wait, let me make sure I have the spelling right.
Strawberry is spelled S-T-R-A-W-B-E-R-R-Y. Let me count each letter one by one.
Starting with the first letter: S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). Wait,
that's 10 letters. Let me check again. S-T-R-A-W-B-E-R-R-Y. Yes, that's correct. So the letters are S, T, R, A, W,
B, E, R, R, Y.
Now, I need to count how many 'R's are there. Let's go through each letter:
1. S – not R
2. T – not R
3. R – that's one
4. A – not R
5. W – not R
6. B – not R
7. E – not R
8. R – that's two
9. R – that's three
10. Y – not R
So there are three 'R's in "strawberry." Wait, but let me double-check. Sometimes people might miscount. Let me
write them out again:
Letters in strawberry: S, T, R, A, W, B, E, R, R, Y. So positions 3, 8, and 9 are R. That's three Rs.
Hmm, but I've heard before that sometimes people might miss a letter. Let me confirm once more. The word is
spelled S-T-R-A-W-B-E-R-R-Y. Yes, the third letter is R, then after the E, there are two more Rs. So three Rs in
total.
I think that's correct. So the answer should be 3.
...done thinking.
The word **strawberry** contains **3** instances of the letter **R**.
**Breakdown:**
S-T-R-A-W-B-E-R-R-Y
Positions: 3rd, 8th, and 9th letters are **R**.
**Answer:** 3 Rs.
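To check mechanically whether a response carries a reasoning block, rather than eyeballing the terminal, something like the following could be used. It is only a sketch: `split_reasoning` is a hypothetical helper, and it assumes the raw model text wraps its reasoning in literal `<think>...</think>` tags as described above. It does not talk to Model Runner or Ollama; it only inspects a string you have already captured.

```python
import re

# Assumption: reasoning is delimited by literal <think>...</think> tags.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output.

    reasoning is an empty string when no <think> block is present.
    """
    blocks = THINK_RE.findall(text)
    reasoning = "\n".join(block.strip() for block in blocks)
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


def has_reasoning(text: str) -> bool:
    """True if the output contains a non-empty <think> block."""
    reasoning, _ = split_reasoning(text)
    return bool(reasoning)
```

For the Model Runner transcripts above, `has_reasoning` would return `False`, since the captured text contains no `<think>` tags at all.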
Inference also seems significantly slower since the latest updates, though this may be subjective because I didn't pay attention to it previously. Qwen3 is certainly much slower in Model Runner than in Ollama (by more than 100%), but I'm not certain that the two models are exactly the same. For comparison, here are the details of each:
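To put a number on the slowdown instead of relying on impression, the same prompt could be timed against both runtimes. This is a minimal sketch under stated assumptions: `time_call` and `percent_slower` are hypothetical helpers, and the callable you pass in (whatever sends the prompt to each runtime) is left to you. Under this definition, "more than 100% slower" means taking more than twice as long.

```python
import time
from typing import Callable


def time_call(fn: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by a single call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def percent_slower(baseline_s: float, candidate_s: float) -> float:
    """How much slower the candidate is than the baseline, in percent.

    100.0 means the candidate took twice as long as the baseline.
    """
    if baseline_s <= 0:
        raise ValueError("baseline_s must be positive")
    return (candidate_s - baseline_s) / baseline_s * 100.0
```

For example, if a prompt takes 10 seconds in one runtime and 25 seconds in the other, `percent_slower(10.0, 25.0)` gives `150.0`. Note that a run which emits a reasoning block generates more tokens, so wall-clock time alone is not a like-for-like comparison; tokens per second would be fairer if both runtimes report token counts.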
https://hub.docker.com/r/ai/qwen3
docker model inspect ai/qwen3
{
    "id": "sha256:79fa56c07429f64f41950fe2f524937cf3ae9ea9bd3d7ada72170b036ea3cc85",
    "tags": [
        "ai/qwen3"
    ],
    "created": 1746022478,
    "config": {
        "format": "gguf",
        "quantization": "IQ2_XXS/Q4_K_M",
        "parameters": "8.19 B",
        "architecture": "qwen3",
        "size": "4.68 GiB"
    }
}
https://ollama.com/library/qwen3:8b
ollama show qwen3
  Model
    architecture        qwen3
    parameters          8.2B
    context length      40960
    embedding length    4096
    quantization        Q4_K_M

  Capabilities
    completion
    tools
    thinking

  Parameters
    top_k             20
    top_p             0.95
    repeat_penalty    1
    stop              "<|im_start|>"
    stop              "<|im_end|>"
    temperature       0.6

  License
    Apache License
    Version 2.0, January 2004