[New Model]: XiaomiMiMo/MiMo-Audio-7B-Instruct support #750
hsliuustc0106 merged 269 commits into vllm-project:main
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com> Co-authored-by: wangyu31577 <wangyu31577@hundsun.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
Signed-off-by: hsliu <liuhongsheng4@huawei.com> Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
Signed-off-by: GG-li <3226868735@qq.com> Signed-off-by: Sihao Li <111170255+GG-li@users.noreply.github.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
…llm-project#721) Signed-off-by: XU Mingshi <91017482+mxuax@users.noreply.github.com> Signed-off-by: mxuax <mxuax@connect.ust.hk> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
68eca4f to 56535a4
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 734774bb95
Co-authored-by: Dovis01(shijin zhang) <zsj1364226740@gmail.com> Signed-off-by: Baoyuan Qi <qibaoyuan@126.com>
Signed-off-by: hsliu <liuhongsheng4@huawei.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
sync Signed-off-by: Baoyuan Qi <qibaoyuan@126.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
…ct#718) Signed-off-by: wuzhongjian <wuzhongjian_yewu@cmss.chinamobile.com> Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com> Co-authored-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
…lm-project#735) Signed-off-by: dongbo910220 <1275604947@qq.com> Signed-off-by: dongbo910220 <32610838+dongbo910220@users.noreply.github.com> Signed-off-by: Jiangyun Zhu <riverclouds.zhu@qq.com> Co-authored-by: Jiangyun Zhu <riverclouds.zhu@qq.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
…vllm-project#697) Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
vllm_omni/worker/gpu_model_runner.py:100 The model loading path should include validation. Add a helpful error message if the model is not found or incompatible.
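One way to act on this suggestion, as a minimal sketch only: the helper below checks that a local model directory exists and contains the files a loader would need, and raises an error that says what is wrong. The function name `validate_model_dir` and the `expected_files` default are hypothetical and are not taken from `gpu_model_runner.py`; the sketch also assumes the model is a local directory rather than a Hugging Face repo ID.

```python
from pathlib import Path


def validate_model_dir(model_path: str, expected_files=("config.json",)) -> Path:
    """Hypothetical helper: fail early with a readable message.

    Assumes `model_path` is a local directory; a hub repo ID would need
    a different check.
    """
    path = Path(model_path)
    if not path.is_dir():
        raise FileNotFoundError(
            f"Model directory not found: {path}. "
            "Check the path or download the weights first."
        )
    # Collect every missing file so the user sees the full list at once.
    missing = [name for name in expected_files if not (path / name).is_file()]
    if missing:
        raise ValueError(
            f"Model directory {path} looks incompatible; missing: {', '.join(missing)}"
        )
    return path
```

Calling such a check before the weights are loaded turns a late, low-level failure into a message that names the missing path or file.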
Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com> Co-authored-by: shijin zhang <75300765+Dovis01@users.noreply.github.com> Co-authored-by: ning ding <nndding@gmail.com>
Signed-off-by: Shijin Zhang <75300765+Dovis01@users.noreply.github.com> Co-authored-by: 齐保元 <qibaoyuan@xiaomi.com> Co-authored-by: ning ding <nndding@gmail.com>
@qibaoyuan
fixed
Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
done for #151
The model to consider
LLM Model Weights: https://huggingface.co/XiaomiMiMo/MiMo-Audio-7B-Instruct
Audio Tokenizer Weights: https://huggingface.co/XiaomiMiMo/MiMo-Audio-Tokenizer
Model Code: https://github.com/XiaomiMiMo/MiMo-Audio
Model description
This PR enables stage-based deployment for the MiMo-Audio model, aligning it with the vllm-omni architecture. Specific changes include:
- Added Stage Configuration: Introduced vllm_omni/model_executor/stage_configs/mimo_audio.yaml to define the multi-stage pipeline.
- Refactored Model Structure: Split MiMo-Audio into two stages:
  - Stage 0 (LLM+LocalForward): MiMoAudioLLMForConditionalGeneration (AR mode) for multimodal understanding and text generation.
  - Stage 1 (code2wav): MiMoAudioToken2WavForConditionalGenerationVLLM for audio waveform generation.
Test plan
NOTE: The MIMO_AUDIO_TOKENIZER_PATH environment variable is mandatory due to the specialized architecture.
offline
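Because MIMO_AUDIO_TOKENIZER_PATH is mandatory, a launcher script can fail fast before either the offline or the online run. The following is a minimal sketch under that assumption; the helper name `require_tokenizer_path` is hypothetical and is not part of this PR, and the check only confirms that the variable is set and non-empty, not that it points at valid tokenizer weights.

```python
import os

ENV_VAR = "MIMO_AUDIO_TOKENIZER_PATH"


def require_tokenizer_path(environ=None) -> str:
    """Hypothetical helper: return the tokenizer path or raise a clear error."""
    # Allow a custom mapping to be passed in so the check is easy to test.
    environ = os.environ if environ is None else environ
    value = environ.get(ENV_VAR, "").strip()
    if not value:
        raise RuntimeError(
            f"{ENV_VAR} is not set. Export it to the location of the "
            "MiMo-Audio-Tokenizer weights before starting."
        )
    return value
```

Calling `require_tokenizer_path()` at the top of a test script surfaces a missing variable immediately rather than partway through model initialization.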
online
server side:
client:
Test result
TTS(with reference audio)
0_6ca65429-1027-4797-963e-963a1de6c286.wav
Audio understanding
text:
audio:
0_43fce45f-3809-48f8-b059-351072a4743c.wav