Model runner no longer outputs reasoning tokens?

I've noticed that Model Runner no longer outputs the reasoning content (i.e. the `<think>...</think>` block) for models such as **qwen3** or **deepseek-r1-distill-llama**. This started after one of the more recent updates; previously, these models did output this type of content.
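To check for this programmatically rather than by eye, a minimal sketch along these lines could separate the reasoning block from the final answer. It assumes the reasoning, when present, is wrapped in a single `<think>...</think>` pair in the raw response text; `split_reasoning` is a hypothetical helper name, not part of Model Runner or Ollama.

```python
import re

# Assumption: reasoning is delimited by one <think>...</think> pair in the raw text.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty when no think block is found."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the think block is treated as the final answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    reasoning, answer = split_reasoning("<think>\ncount letters\n</think>\n\n3 Rs.")
    print("has reasoning:", bool(reasoning))
    print("answer:", answer)
```

An empty first element would correspond to the behavior reported here, where only the final answer comes back.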

For example, below is the output from Qwen3 and DeepSeek-R1-Distill-Llama in Model Runner.

## Qwen3

```
docker model run ai/qwen3
Interactive chat mode started. Type '/bye' to exit.
> How many Rs in strawberry?
The word **"strawberry"** contains **three** instances of the letter **R**.

Here's the breakdown:
**S-T-R-A-W-B-E-R-R-Y**
- **R** at position 3
- **R** at position 8
- **R** at position 9

**Answer:** 3 R's.
```

## DeepSeek-R1-Distill-Llama

```
docker model run ai/deepseek-r1-distill-llama
Interactive chat mode started. Type '/bye' to exit.
> How many Rs in strawberry?
The word "strawberry" contains three Rs.

**Step-by-Step Explanation:**

1. **Write Out the Word:** S-T-R-A-W-B-E-R-R-Y.
2. **Identify Each Letter:** S, T, R, A, W, B, E, R, R, Y.
3. **Count the Rs:**
   - The third letter is R (1).
   - The eighth letter is R (2).
   - The ninth letter is R (3).
4. **Total Count:** 3 Rs.

**Answer:** There are three Rs in "strawberry."
```

For contrast, this is the output from Qwen3 running in Ollama.

## Qwen3 (in Ollama)

```
ollama run qwen3
>>> How many Rs in strawberry?
Thinking...
Okay, the user is asking how many times the letter 'R' appears in the word "strawberry." Let me break this down
step by step.

First, I should write out the word clearly: S-T-R-A-W-B-E-R-R-Y. Wait, let me make sure I have the spelling right.
Strawberry is spelled S-T-R-A-W-B-E-R-R-Y. Let me count each letter one by one.

Starting with the first letter: S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). Wait,
that's 10 letters. Let me check again. S-T-R-A-W-B-E-R-R-Y. Yes, that's correct. So the letters are S, T, R, A, W,
B, E, R, R, Y.

Now, I need to count how many 'R's are there. Let's go through each letter:

1. S – not R
2. T – not R
3. R – that's one
4. A – not R
5. W – not R
6. B – not R
7. E – not R
8. R – that's two
9. R – that's three
10. Y – not R

So there are three 'R's in "strawberry." Wait, but let me double-check. Sometimes people might miscount. Let me
write them out again:

Letters in strawberry: S, T, R, A, W, B, E, R, R, Y. So positions 3, 8, and 9 are R. That's three Rs.

Hmm, but I've heard before that sometimes people might miss a letter. Let me confirm once more. The word is
spelled S-T-R-A-W-B-E-R-R-Y. Yes, the third letter is R, then after the E, there are two more Rs. So three Rs in
total.

I think that's correct. So the answer should be 3.
...done thinking.

The word **strawberry** contains **3** instances of the letter **R**.

**Breakdown:**
S-T-R-A-W-B-E-R-R-Y
Positions: 3rd, 8th, and 9th letters are **R**.

**Answer:** 3 Rs.
```

Also, inference seems to be significantly slower after the latest updates, though this could be subjective since I didn't pay attention to speed previously. Qwen3 in Model Runner is certainly a lot slower (by more than 100%) than the same model running in Ollama, but again, I'm not certain that both models are exactly the same.
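To make the speed comparison less subjective, the slowdown could be computed from measured wall-clock times. The following is only a sketch: `time_call` and `slowdown_percent` are hypothetical helper names, and the callable passed to `time_call` stands in for whatever sends the same prompt to each runtime. Under this definition, "slower by more than 100%" means the run takes more than twice as long as the baseline.

```python
import time
from typing import Callable


def time_call(fn: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by one call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def slowdown_percent(baseline_s: float, candidate_s: float) -> float:
    """Percentage by which candidate_s exceeds baseline_s (100.0 means twice as long)."""
    if baseline_s <= 0:
        raise ValueError("baseline_s must be positive")
    return (candidate_s - baseline_s) / baseline_s * 100.0
```

For a fair comparison, both runs would need the same prompt and comparable output lengths, since a run that includes reasoning tokens generates far more text.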

https://hub.docker.com/r/ai/qwen3
```
docker model inspect ai/qwen3
{
    "id": "sha256:79fa56c07429f64f41950fe2f524937cf3ae9ea9bd3d7ada72170b036ea3cc85",
    "tags": [
        "ai/qwen3"
    ],
    "created": 1746022478,
    "config": {
        "format": "gguf",
        "quantization": "IQ2_XXS/Q4_K_M",
        "parameters": "8.19 B",
        "architecture": "qwen3",
        "size": "4.68 GiB"
    }
}
```

https://ollama.com/library/qwen3:8b
```
ollama show qwen3
  Model
    architecture        qwen3
    parameters          8.2B
    context length      40960
    embedding length    4096
    quantization        Q4_K_M

  Capabilities
    completion
    tools
    thinking

  Parameters
    top_k             20
    top_p             0.95
    repeat_penalty    1
    stop              "<|im_start|>"
    stop              "<|im_end|>"
    temperature       0.6

  License
    Apache License
    Version 2.0, January 2004
```

