It's pretty difficult to say why 6-bit is better than bf16. It might just be luck. If you try a few different PRNG seeds, do the results change much? That would be a good first step to see if it's just variance in the results. It's pretty unlikely, but not impossible, that there is a bug with bf16 that makes it worse than 6-bit. You could also check 5-bit and 8-bit to get a couple more data points.
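The seed-variance check above can be sketched roughly like this. `run_benchmark` is a hypothetical placeholder for whatever eval harness is being used; the fake scores exist only so the sketch runs standalone:

```python
# Sketch: estimate run-to-run variance across PRNG seeds before attributing
# a score gap to quantization. `run_benchmark` is a hypothetical stand-in
# for the real eval; swap in your actual harness.
import random
import statistics

def run_benchmark(seed: int) -> float:
    # Placeholder: fake accuracy scores around 62% with ~1% noise.
    rng = random.Random(seed)
    return 0.62 + rng.gauss(0, 0.01)

seeds = [1001, 1002, 1003, 1004, 1005]
scores = [run_benchmark(s) for s in seeds]
mean = statistics.mean(scores)
stdev = statistics.stdev(scores)
print(f"mean={mean:.3f} stdev={stdev:.3f}")
# Rule of thumb: if the bf16-vs-6-bit gap is within ~2 stdev of seed noise,
# it may not be a real effect.
```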
Does luck apply that much to greedy decoding (temp=0)? Today I converted bf16 to fp16, and fp16 performs better. Over the next few days I will do more tests, but I have already done plenty with bf16 and the 8-bit, 6-bit, and 4-bit DWQ models, and I will do more with fp16 too. For each run I use seeds like 1001, 1002, 1003. Is that random enough?
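On the greedy-decoding point: with temperature 0 the token choice is an argmax, so the seed never enters the picture, while any temperature above 0 makes the seed matter. A minimal sketch (toy logits, stdlib only, nothing model-specific assumed):

```python
# Sketch: greedy (temp=0) token selection ignores the seed entirely;
# temperature sampling does not. Toy 4-token logit vector for illustration.
import math
import random

def pick_token(logits, temperature, seed):
    if temperature == 0:
        # Greedy: pure argmax, fully deterministic, seed unused.
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = random.Random(seed)
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [1.2, 3.4, 0.5, 2.9]
greedy = {pick_token(logits, 0, s) for s in (1001, 1002, 1003)}
sampled = {pick_token(logits, 1.5, s) for s in (1001, 1002, 1003)}
print(greedy)   # a single token id: greedy never varies with the seed
print(sampled)  # may contain more than one token id
```

So for strict temp=0 runs, seed choice should not change the output at all; seeds only matter once sampling is opened up.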
I've been running a large set of benchmarks on Qwen3-30B-A3B (hybrid, no think), and I keep seeing a surprising pattern:
The 6-bit model outperforms the BF16 model in actual task accuracy.
With strict greedy decoding (temp=0, rep_pen=1), results are the same.
But as soon as I switch to more open sampling settings, the 6-bit version consistently does better.
This is counter-intuitive — BF16 should be the “clean”, full-precision reference.
So now I'm wondering:
Why is 6-bit beating BF16?
Is there some issue with the BF16 weights, or with the qwen3_moe implementation?
Also, I think I saw someone on X highlight the same issue for a different model.
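One way to check whether the 6-bit-vs-BF16 gap described above is real rather than sampling noise is a paired sign test on per-question correctness. The pass/fail lists below are hypothetical; the idea is to feed in the real per-item results from both runs:

```python
# Sketch: paired sign test on per-item correctness for two model variants.
# Only the discordant items (exactly one model right) carry information;
# under the null hypothesis each model wins a discordant item with p=0.5.
from math import comb

def sign_test_p(bf16_correct, q6_correct):
    wins_q6 = sum(1 for a, b in zip(bf16_correct, q6_correct) if b and not a)
    wins_bf16 = sum(1 for a, b in zip(bf16_correct, q6_correct) if a and not b)
    n = wins_q6 + wins_bf16
    if n == 0:
        return 1.0  # no disagreements at all
    k = max(wins_q6, wins_bf16)
    # Two-sided binomial tail probability under p=0.5.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical per-question results (1 = correct, 0 = wrong):
bf16 = [1, 0, 1, 1, 0, 0, 1, 0]
q6   = [1, 1, 1, 1, 1, 0, 1, 1]
print(sign_test_p(bf16, q6))  # 0.25 here: far from significant
```

With only a handful of discordant questions the p-value stays large, which is the seed-variance point from the first reply in statistical form: a consistent gap across many items is needed before blaming the BF16 path or the qwen3_moe implementation.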