fix(fish_qwen3_omni): support loading quantized/converted models in sanitize() #584
sanitize() was skipping all keys that did not start with
"text_model.model." or "audio_decoder.", including keys already in
MLX format ("model.*"). This caused quantized models produced by
mlx_audio.convert to fail to load with a "Missing N parameters" error,
because sanitize() returned an empty dict for such models.
Fix: pass through "model.*" keys unchanged so that both the original
HuggingFace weights and previously converted/quantized MLX weights
are handled correctly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
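A minimal sketch of the passthrough described in this commit message. The key prefixes come from the PR; the HuggingFace remapping logic is omitted, and the function body is an assumption about the surrounding code, not the actual diff:

```python
def sanitize(weights: dict) -> dict:
    """Sketch: keep MLX-format keys, remap HF-format keys (remap omitted)."""
    sanitized = {}
    for key, value in weights.items():
        if key.startswith("model."):
            # Already converted/quantized MLX weights (e.g. output of
            # mlx_audio.convert): pass through unchanged instead of dropping.
            sanitized[key] = value
        elif key.startswith(("text_model.model.", "audio_decoder.")):
            # Original HuggingFace weights: the existing remapping logic
            # goes here (not shown in this PR excerpt); kept as-is for the sketch.
            sanitized[key] = value
        # all other keys are skipped, as before
    return sanitized
```

Before the fix, the first branch was missing, so a previously converted checkpoint fell through to the skip path and every key was dropped.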
@yoshphys Thanks! Did you test audio generation and does it sound good to your ear? Your agent can't hear very well so it needs human verification :)
@lucasnewman Thank you for paying attention to this PR. Sure. I wished I could make my agent hear the sound, but it can't, so I listened to some audio outputs myself after modifying the code. I generated a couple of sets of audio with different quantized parameter sets, 4-bit and 8-bit respectively. With 4-bit, I noticed that a few parts of the audio with inline control had lower quality, but I guess that was because of the quantization. With the 8-bit parameter set there was no problem and the audio quality was good enough.
Blaizzy left a comment
Hey @yoshphys
Thanks for your contribution!
I would add a quant predicate to skip the codec, embedding, and projection layers during quantization, or use a higher-precision quant for them. This would ensure high quality even at low quant precision.
I'm thinking about how we can automatically scan for and suggest this in PRs such as #490, but for now it has to be manual.
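A hypothetical predicate in the spirit of this suggestion. The layer-name substrings are assumptions, not taken from fish_qwen3_omni, and `mlx.nn.quantize`'s `class_predicate` hook is the assumed integration point:

```python
def quant_predicate(path: str, module) -> bool:
    # Keep codec, embedding, and projection layers in full precision
    # (substrings are illustrative guesses at the layer names).
    if any(part in path for part in ("codec", "embed", "proj")):
        return False
    # Quantize only modules that actually support quantization.
    return hasattr(module, "to_quantized")

# Usage sketch:
#   import mlx.nn as nn
#   nn.quantize(model, group_size=64, bits=4, class_predicate=quant_predicate)
```

Returning a dict of quantization kwargs instead of `False` would be the way to give sensitive layers a higher bit width rather than skipping them entirely.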
cc: @lucasnewman
The conversion actually works correctly, so that part should be good. I tested your change with an 8-bit conversion and it looks good, thank you!
…tized model loading

sanitize() drops weight keys not found in the model's current parameter shapes. Since the model isn't quantized yet at sanitize time, quantization metadata keys (.scales, .biases) are silently removed. Later, apply_quantization() checks for these keys to decide which layers to quantize, finds nothing, skips quantization, and loading fails with a shape mismatch.

Preserve .scales and .biases keys through sanitization, matching the existing pattern in chatterbox/s3gen. Same class of bug as Blaizzy#584 (fish_qwen3_omni sanitize fix).
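A minimal sketch of the preservation pattern described in this commit message. The shape check is simplified to name membership, and the function shape is an assumption about the surrounding sanitize() code:

```python
def sanitize(weights: dict, current_param_names: set) -> dict:
    """Sketch: keep quantization metadata even though the model
    isn't quantized yet at sanitize time (so those names aren't in
    current_param_names)."""
    sanitized = {}
    for key, value in weights.items():
        if key.endswith((".scales", ".biases")):
            # Quantization metadata: preserve it so apply_quantization()
            # can later detect which layers were quantized.
            sanitized[key] = value
        elif key in current_param_names:
            sanitized[key] = value
        # anything else is dropped, as before
    return sanitized
```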
Problem
`sanitize()` in `fish_qwen3_omni/fish_speech.py` skips any weight key that does not start with `text_model.model.` or `audio_decoder.`, via `else: continue`. This means that when loading a model that was already converted to MLX format (e.g., produced by `python -m mlx_audio.convert --quantize`), all weight keys (which start with `model.`) are silently dropped, resulting in an empty dict. The downstream effect:
- `apply_quantization()` finds no `.scales` keys → skips quantization entirely
- `load_weights()` is called with an empty dict → raises `ValueError: Missing N parameters`

This is the root cause of the bug reported in #578 ("Fish s2 pro Breaks when quantizing using convert").
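The first step of that failure chain can be illustrated with a small sketch; the function and key names are assumptions about the loading pipeline, not the actual code:

```python
def apply_quantization(model, weights: dict) -> bool:
    """Sketch: decide which layers to quantize by looking for .scales keys."""
    quantized_layers = {
        key.rsplit(".scales", 1)[0]
        for key in weights
        if key.endswith(".scales")
    }
    if not quantized_layers:
        # Nothing to do: sanitize() already dropped every key, so the
        # model is never converted to its quantized layout and the
        # subsequent load_weights() call fails.
        return False
    # ... quantize the matching layers here (omitted) ...
    return True
```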
Fix

Add a `model.*` passthrough case before the existing conditions so that keys already in MLX format are preserved as-is.

How to reproduce
Testing
After the fix, quantized models load and run correctly.
🤖 Generated with Claude Code