
fix(fish_qwen3_omni): support loading quantized/converted models in sanitize()#584

Merged
lucasnewman merged 1 commit into Blaizzy:main from
yoshphys:fix/fish-qwen3-omni-sanitize-quantized-weights
Mar 20, 2026

Conversation

@yoshphys
Contributor

Problem

sanitize() in fish_qwen3_omni/fish_speech.py skips any weight key that does not start with text_model.model. or audio_decoder., via else: continue. This means that when loading a model that was already converted to MLX format (e.g., produced by python -m mlx_audio.convert --quantize), all weight keys (which start with model.) are silently dropped, resulting in an empty dict.

The downstream effect:

  • apply_quantization() finds no .scales keys → skips quantization entirely
  • load_weights() is called with an empty dict → raises ValueError: Missing N parameters

This is the root cause of the bug reported in #578 ("Fish s2 pro Breaks when quantizing using convert").

Fix

Add a model.* passthrough case before the existing conditions so that keys already in MLX format are preserved as-is:

if key.startswith("model."):
    # Already in MLX format (e.g., previously converted/quantized model)
    remapped[key] = value
elif key.startswith("text_model.model."):
    ...

How to reproduce

python -m mlx_audio.convert \
  --hf-path mlx-community/fish-audio-s2-pro-bf16 \
  --mlx-path ./fish-audio-s2-pro-4bit \
  -q --q-bits 4 --q-group-size 64 --model-domain tts

python -c "
from mlx_audio.tts.utils import load_model
model = load_model('./fish-audio-s2-pro-4bit')  # raises ValueError before fix
"

Testing

After the fix, quantized models load and run correctly.

🤖 Generated with Claude Code

…anitize()

sanitize() was skipping all keys that did not start with
"text_model.model." or "audio_decoder.", including keys already in
MLX format ("model.*"). This caused quantized models produced by
mlx_audio.convert to fail to load with "Missing N parameters" error,
because sanitize() returned an empty dict for such models.

Fix: pass through "model.*" keys unchanged so that both the original
HuggingFace weights and previously converted/quantized MLX weights
are handled correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lucasnewman
Collaborator

@yoshphys Thanks! Did you test audio generation and does it sound good to your ear? Your agent can't hear very well so it needs human verification :)

@yoshphys
Contributor Author

@lucasnewman Thank you for paying attention to this PR. Sure! I wish I could make my agent hear the sound, but it can't, so I listened to the audio outputs myself after modifying the code. I generated a couple of sets of audio with different quantization parameters, 4-bit and 8-bit. In the 4-bit case, I noticed that a few parts of the audio with inline control had lower quality, but I guess that was because of the quantization. With the 8-bit parameter set, there were no problems and the audio quality was good.

Owner

@Blaizzy left a comment


Hey @yoshphys

Thanks for your contribution!

I would add a quant predicate to skip the codec, embedding, and projection layers during quantization, or use a higher quant for them. This would ensure high quality even at low quant precision.

I'm thinking about how we can automatically scan and suggest this with PRs such as #490, but for now it has to be manual.

cc: @lucasnewman
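
The quant-predicate idea above could be sketched roughly as follows. MLX's nn.quantize accepts a class_predicate callback that decides per-layer whether to quantize; the substrings used to identify the codec/embedding/projection layers below are illustrative assumptions, not this model's actual module names.

```python
# Illustrative skip list: substrings of layer paths that should stay
# in full precision (actual module names in fish_qwen3_omni may differ).
SKIP_SUBSTRINGS = ("audio_decoder", "codec", "embed", "lm_head")

def quant_predicate(path: str, module: object) -> bool:
    """Return True if the layer at `path` should be quantized.

    Intended for use as the class_predicate argument of mlx's
    nn.quantize, which calls it with (path, module) for each layer.
    """
    return not any(s in path for s in SKIP_SUBSTRINGS)
```

Usage would then be along the lines of `nn.quantize(model, group_size=64, bits=4, class_predicate=quant_predicate)`, keeping the sensitive layers at full precision while the rest of the model is quantized.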

@lucasnewman
Collaborator

The conversion actually works correctly, so that part should be good. I tested your change with an 8-bit conversion and it looks good, thank you!

@lucasnewman lucasnewman merged commit bdfaaf3 into Blaizzy:main Mar 20, 2026
10 checks passed
korale77 added a commit to korale77/mlx-audio that referenced this pull request Mar 25, 2026
…tized model loading

sanitize() drops weight keys not found in the model's current parameter
shapes. Since the model isn't quantized yet at sanitize time, quantization
metadata keys (.scales, .biases) are silently removed. Later,
apply_quantization() checks for these keys to decide which layers to
quantize -- finds nothing -- skips quantization -- and loading fails with
a shape mismatch.

Preserve .scales and .biases keys through sanitization, matching the
existing pattern in chatterbox/s3gen.

Same class of bug as Blaizzy#584 (fish_qwen3_omni sanitize fix).
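
That pattern could be sketched as follows. This is a hypothetical simplification based on the commit message (the real sanitize() and the chatterbox/s3gen code it mirrors may differ): keys matching a current parameter are kept, and quantization metadata keys are kept unconditionally even though they match no parameter shape at sanitize time.

```python
def sanitize(weights: dict, current_params: set) -> dict:
    """Keep keys present in the model, plus quantization metadata (sketch).

    At sanitize time the model is not yet quantized, so ".scales" and
    ".biases" keys have no matching parameter -- but dropping them would
    make apply_quantization() a no-op and break loading later.
    """
    out = {}
    for key, value in weights.items():
        if key in current_params or key.endswith((".scales", ".biases")):
            out[key] = value
    return out
```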