Add Qwen3-TTS benchmark scripts #1573

Open
linyueqian wants to merge 4 commits into vllm-project:main from linyueqian:feat/qwen3-tts-benchmark-scripts
@linyueqian
Contributor

Summary

  • Add benchmark scripts that compare vLLM-Omni streaming serving against Hugging Face Transformers offline inference for Qwen3-TTS
  • Measure time to first packet (TTFP), end-to-end (E2E) latency, real-time factor (RTF), and throughput across configurable concurrency levels
  • Include an orchestration script (run_benchmark.sh) configured through environment variables for GPU, model, batch size, and concurrency
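For readers unfamiliar with the metrics listed above, the following is a minimal sketch of how they are commonly derived from per-request timestamps. It is not taken from the scripts in this PR: the `RequestTiming` structure, the function names, and the two throughput definitions (requests per second and seconds of audio per wall-clock second) are assumptions made for illustration. RTF is computed in the usual way, as processing time divided by generated audio duration, so values below 1 mean faster than real time.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestTiming:
    """Timestamps (in seconds) for one TTS request. Hypothetical structure."""
    start: float          # when the request was sent
    first_packet: float   # when the first audio chunk arrived
    end: float            # when the last audio chunk arrived
    audio_seconds: float  # duration of the generated audio


def ttfp(t: RequestTiming) -> float:
    """Time to first packet: delay before any audio is available."""
    return t.first_packet - t.start


def e2e(t: RequestTiming) -> float:
    """End-to-end latency: time until the full audio has been received."""
    return t.end - t.start


def rtf(t: RequestTiming) -> float:
    """Real-time factor: processing time per second of generated audio."""
    return e2e(t) / t.audio_seconds


def summarize(timings: List[RequestTiming]) -> Dict[str, float]:
    """Aggregate per-request metrics over one benchmark run."""
    n = len(timings)
    # Wall-clock span of the whole run, from first send to last completion.
    wall = max(t.end for t in timings) - min(t.start for t in timings)
    return {
        "mean_ttfp_s": sum(ttfp(t) for t in timings) / n,
        "mean_e2e_s": sum(e2e(t) for t in timings) / n,
        "mean_rtf": sum(rtf(t) for t in timings) / n,
        # Assumed throughput definitions; the PR's scripts may differ.
        "requests_per_s": n / wall,
        "audio_s_per_s": sum(t.audio_seconds for t in timings) / wall,
    }
```

Under concurrency, per-request RTF typically gets worse while aggregate audio throughput improves, which is why it can help to report both.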

Files

  • bench_tts_serve.py - Streaming serving benchmark client
  • bench_tts_hf.py - HuggingFace offline baseline
  • run_benchmark.sh - Orchestration script
  • plot_results.py - Comparison plotting utility
  • configs/qwen3_tts_bs1.yaml - batch_size=1 stage config
  • configs/qwen3_tts_bs4.yaml - batch_size=4 stage config
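As a rough illustration of what a streaming benchmark client does at a given concurrency level, here is a self-contained sketch. It does not reproduce bench_tts_serve.py: `fake_tts_stream` is a hypothetical stand-in for the real streaming response from the server, and the function names, chunk sizes, and delays are placeholders. The sketch only shows the general pattern of capping in-flight requests with a semaphore and recording the first-chunk and last-chunk times.

```python
import asyncio
import time
from typing import AsyncIterator, Dict, List, Tuple


async def fake_tts_stream(text: str) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response (3 audio chunks)."""
    for _ in range(3):
        await asyncio.sleep(0.01)  # simulated generation delay
        yield b"\x00" * 320        # simulated chunk of PCM bytes


async def timed_request(text: str, sem: asyncio.Semaphore) -> Tuple[float, float, int]:
    """Run one request and return (ttfp_s, e2e_s, total_bytes)."""
    async with sem:  # at most `concurrency` requests are in flight
        start = time.perf_counter()
        first = None
        total = 0
        async for chunk in fake_tts_stream(text):
            if first is None:
                first = time.perf_counter()  # first audio chunk arrived
            total += len(chunk)
        end = time.perf_counter()
    if first is None:
        raise RuntimeError("stream produced no audio")
    return first - start, end - start, total


async def run_level(prompts: List[str], concurrency: int) -> List[Tuple[float, float, int]]:
    """Send all prompts while allowing `concurrency` simultaneous requests."""
    sem = asyncio.Semaphore(concurrency)
    return list(await asyncio.gather(*(timed_request(p, sem) for p in prompts)))


def run_sweep(prompts: List[str], levels: List[int]) -> Dict[int, List[Tuple[float, float, int]]]:
    """Repeat the same prompt set at each concurrency level."""
    return {c: asyncio.run(run_level(prompts, c)) for c in levels}
```

Calling `run_sweep(["hello", "world"], [1, 4])` returns one list of per-request tuples per concurrency level. Note that the timer starts after the semaphore is acquired, so queueing time on the client side is excluded from the latencies.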

Usage

# Full benchmark (vllm-omni + HF)
bash run_benchmark.sh

# vllm-omni only, 1.7B model
MODEL=Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice bash run_benchmark.sh --async-only

# Custom GPU and concurrency
GPU_DEVICE=1 CONCURRENCY="1 4" bash run_benchmark.sh
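The `CONCURRENCY="1 4"` form above suggests a space-separated list of levels. A minimal sketch of how such a value could be parsed and validated on the Python side follows. This is an assumption for illustration only: `parse_concurrency` is a hypothetical helper, the default of `"1"` is a placeholder rather than the script's actual default, and run_benchmark.sh may handle the variable entirely in shell.

```python
import os
from typing import List, Mapping


def parse_concurrency(env: Mapping[str, str], default: str = "1") -> List[int]:
    """Parse a space-separated CONCURRENCY value into a list of positive ints.

    Hypothetical helper; the default is a placeholder, not the PR's default.
    """
    raw = env.get("CONCURRENCY", default)
    levels = [int(token) for token in raw.split()]
    if not levels or any(level < 1 for level in levels):
        raise ValueError(f"invalid CONCURRENCY value: {raw!r}")
    return levels


if __name__ == "__main__":
    # Reads the real environment when run as a script.
    print(parse_concurrency(os.environ))
```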

Test plan

  • Verified on an H200 with the 0.6B model, the bs1 config, and concurrency levels 1 and 4
  • Reviewer runs on their own hardware to validate

Related: #938

…nce comparison

Signed-off-by: linyueqian <linyueqian@outlook.com>
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0b573d1ab9
