
[Bugfix][Qwen3TTS]#1289

Merged
hsliuustc0106 merged 7 commits into vllm-project:main from JuanPZuluaga:qwen3tts.fix-bug
Feb 26, 2026

Conversation

@JuanPZuluaga
Contributor

@JuanPZuluaga JuanPZuluaga commented Feb 9, 2026


Purpose

PR #891 introduced an issue in metrics computation when sending a TTS request with Qwen3TTS:

(APIServer pid=987948) INFO 02-09 12:37:54 [serving_speech.py:236] TTS speech request speech-bf3b5480759cbfef: text='Hello, how are you?', task_type=CustomVoice
(APIServer pid=987948) INFO 02-09 12:37:54 [async_omni.py:315] [AsyncOrchestrator] Entering scheduling loop: stages=1, final_stage=0
(Worker pid=989530) [Stage-0] INFO 02-09 12:37:54 [configuration_qwen3_tts.py:441] code_predictor_config is None. Initializing code_predictor model with default values
(Worker pid=989530) Setting `pad_token_id` to `eos_token_id`:2150 for open-end generation.
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310] Speech generation failed: tuple index out of range
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310] Traceback (most recent call last):
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310]   File "/home/pablo/agigo/claude_code/agigo_tts_offline/dialog-platform/tts/agigo-tts-offline/vllm-omni/vllm_omni/entrypoints/openai/serving_speech.py", line 253, in create_speech
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310]     async for res in generator:
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310]   File "/home/pablo/agigo/claude_code/agigo_tts_offline/dialog-platform/tts/agigo-tts-offline/vllm-omni/vllm_omni/entrypoints/async_omni.py", line 331, in generate
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310]     async for output in self._process_sequential_results(
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310]   File "/home/pablo/agigo/claude_code/agigo_tts_offline/dialog-platform/tts/agigo-tts-offline/vllm-omni/vllm_omni/entrypoints/async_omni.py", line 431, in _process_sequential_results
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310]     metrics.record_audio_generated_frames(
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310]   File "/home/pablo/agigo/claude_code/agigo_tts_offline/dialog-platform/tts/agigo-tts-offline/vllm-omni/vllm_omni/metrics/stats.py", line 229, in record_audio_generated_frames
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310]     nframes = int(multimodal_output[-1].shape[0])
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310]                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
(APIServer pid=987948) ERROR 02-09 12:37:55 [serving_speech.py:310] IndexError: tuple index out of range
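The failure mode can be reproduced in isolation with a small stand-in for a tensor (a hypothetical sketch; the real payload is a torch.Tensor): when the audio payload is a single 1-D waveform, `multimodal_output[-1]` selects one scalar sample, whose shape is the empty tuple, so `shape[0]` raises.

```python
# Hypothetical minimal reproduction of the crash above. FakeTensor stands in
# for torch.Tensor; only .shape and 1-D indexing are modeled.
class FakeTensor:
    def __init__(self, shape):
        self.shape = tuple(shape)

    def __getitem__(self, idx):
        # Indexing a 1-D tensor drops the leading dimension (a scalar has
        # shape == ()), mirroring torch semantics.
        return FakeTensor(self.shape[1:])

audio = FakeTensor((86400,))  # one waveform of 86,400 frames
error = None
try:
    nframes = int(audio[-1].shape[0])  # audio[-1] is a scalar: shape == ()
except IndexError as exc:
    error = str(exc)  # "tuple index out of range"
```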

Additionally, the Qwen3 Speech tokenizer was updated in the upstream repo: QwenLM/Qwen3-TTS@6cafe55

Test Plan

Test Result




@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 368d815bcf


    and (multimodal_output := output_to_yield.multimodal_output.get("audio")) is not None
):
-    nframes = int(multimodal_output[-1].shape[0])
+    nframes = int(multimodal_output.shape[0])


P1: Guard list audio payloads before reading tensor shape

When audio models return more than one waveform per prompt, DiffusionEngine.step stores multimodal_output["audio"] as a Python list (audio_payload = outputs when len(outputs) > 1), but this code now unconditionally calls .shape[0] on that value. In that scenario, record_audio_generated_frames raises AttributeError: 'list' object has no attribute 'shape', aborting the request path instead of just updating metrics.
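A guard along the lines Codex describes might look like the following (a sketch, not the code that landed; `count_frames` and `FakeTensor` are illustrative names):

```python
class FakeTensor:
    """Illustrative stand-in for torch.Tensor, modeling only .shape."""
    def __init__(self, shape):
        self.shape = tuple(shape)

def count_frames(multimodal_output):
    """Sum frame counts whether the payload is one tensor or a list of them."""
    if multimodal_output is None:
        return 0
    chunks = (multimodal_output
              if isinstance(multimodal_output, (list, tuple))
              else [multimodal_output])
    return sum(int(chunk.shape[0]) for chunk in chunks)

# Single-waveform and multi-waveform payloads both work:
count_frames(FakeTensor((86400,)))                      # 86400
count_frames([FakeTensor((100,)), FakeTensor((200,))])  # 300
```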


@LJH-LBJ
Contributor

LJH-LBJ commented Feb 9, 2026

Thanks for your changes—this is a more robust approach. What motivated the change to audio_lengths?

Contributor

Copilot AI left a comment


Pull request overview

Fixes regressions introduced by PR #891 affecting Qwen3-TTS audio metrics computation, and updates Qwen3-TTS tokenizer padding/decoding behavior to align with upstream tokenizer changes.

Changes:

  • Update Qwen3-TTS tokenizer decode paths to support -1 padding (via clamping before decode).
  • Change Qwen3-TTS tokenizer wrapper padding from 0 to -1 for padded batches.
  • Make audio frame metrics aggregation handle non-list audio payloads by summing frames across chunks.
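The padding change in the list above can be illustrated with a plain-Python stand-in for torch's pad_sequence (`pad_batch` is a hypothetical helper; the real code uses torch.nn.utils.rnn.pad_sequence):

```python
def pad_batch(seqs, pad_value=-1):
    """Right-pad every sequence to the longest length, batch-first."""
    width = max(len(s) for s in seqs)
    return [list(s) + [pad_value] * (width - len(s)) for s in seqs]

batch = pad_batch([[5, 0, 9], [2150]])
# batch == [[5, 0, 9], [2150, -1, -1]]: a later (code > -1) test recovers
# the true lengths 3 and 1, which a pad value of 0 could not, since 0 is a
# valid audio code.
```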

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

  • vllm_omni/model_executor/models/qwen3_tts/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py: Adjusts decode to compute lengths before clamping padded codes.
  • vllm_omni/model_executor/models/qwen3_tts/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py: Same as above for the 12Hz tokenizer variant.
  • vllm_omni/model_executor/models/qwen3_tts/qwen3_tts_tokenizer.py: Changes batch padding value for audio codes from 0 to -1.
  • vllm_omni/metrics/stats.py: Updates the audio-generated-frames metric computation to sum across chunked outputs.
Comments suppressed due to low confidence (2)

vllm_omni/model_executor/models/qwen3_tts/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:1000

  • Same issue as V1: audio_lengths is computed before clamping (to ignore -1 padding) but then overwritten after torch.clamp() using (audio_codes[..., 0] > 0), which can miscompute lengths once pads have been clamped to 0 (and if 0 is a valid code). Use the pre-clamp length for trimming and drop or correct the post-clamp recomputation.
        audio_lengths = (audio_codes[..., 0] > -1).sum(1) * self.decode_upsample_rate

        audio_codes = torch.clamp(audio_codes, min=0)
        audio_values = self.decoder.chunked_decode(audio_codes.transpose(1, 2)).squeeze(1)

        audio_lengths = (audio_codes[..., 0] > 0).sum(1) * self.decode_upsample_rate
        audio_values = [a[:length] for a, length in zip(audio_values, audio_lengths)]

vllm_omni/model_executor/models/qwen3_tts/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:1519

  • In decode(), audio_lengths is computed using the pre-clamp codes (to account for -1 padding), but then it is overwritten using (audio_codes > 0) after torch.clamp(). This defeats the purpose of using -1 padding (pads become 0) and can truncate outputs incorrectly when code 0 is valid. Keep and use the pre-clamp length (e.g., based on audio_codes > -1 / >= 0) and remove or adjust the second audio_lengths assignment.
        audio_lengths = (audio_codes > -1).sum(1) * self.decode_upsample_rate

        audio_codes = torch.clamp(audio_codes, min=0)
        audio_values = self.decoder(code=audio_codes, reference_mel=ref_mels, conditioning=xvectors)

        audio_lengths = (audio_codes > 0).sum(1) * self.decode_upsample_rate
        audio_values = [a[:length] for a, length in zip(audio_values, audio_lengths)]
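The miscount both comments describe is easy to demonstrate with plain Python in place of torch:

```python
# Why the length must be taken before clamping: once -1 pads are clamped to
# 0, a "> 0" test also discards genuine code-0 tokens.
codes = [7, 0, 3, -1, -1]                     # 3 real codes (0 is valid), 2 pads
pre_clamp_len = sum(c > -1 for c in codes)    # 3: counts every real code
clamped = [max(c, 0) for c in codes]          # pads become 0
post_clamp_len = sum(c > 0 for c in clamped)  # 2: wrongly drops the code 0
```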


Comment on lines 229 to 232
nframes = sum(
    int(t.shape[0])
    for t in (multimodal_output if isinstance(multimodal_output, list) else [multimodal_output])
)

Copilot AI Feb 9, 2026


record_audio_generated_frames() still isn’t robust to the audio payload types that triggered the original failure. If multimodal_output is an empty tuple / tuple of chunks, the current logic wraps it as a single element and then accesses t.shape, which will raise. Also, using t.shape[0] undercounts for common audio shapes like [1, N] (e.g., Qwen3-Omni code2wav returns reshape(1, -1)), so the metric becomes 1 instead of N. Consider handling (list, tuple) as sequences, gracefully treating empty sequences as 0 frames, and counting frames via shape[-1] (or numel() for 1D) to match how audio chunks are concatenated along the last dimension elsewhere.
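A version incorporating those suggestions might look like this (a hedged sketch under the comment's assumptions; `robust_frame_count` and `FakeTensor` are illustrative names, not the merged code):

```python
class FakeTensor:
    """Illustrative stand-in for torch.Tensor, modeling only .shape."""
    def __init__(self, shape):
        self.shape = tuple(shape)

def robust_frame_count(payload):
    """Accept a tensor, list, or tuple; treat empty sequences as 0 frames;
    count along the last dimension so a [1, N] waveform contributes N."""
    if payload is None:
        return 0
    chunks = payload if isinstance(payload, (list, tuple)) else [payload]
    total = 0
    for chunk in chunks:
        shape = getattr(chunk, "shape", None)
        if not shape:  # scalar shape () or no shape at all: nothing to count
            continue
        total += int(shape[-1])
    return total
```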

Comment on lines 229 to 232
nframes = sum(
    int(t.shape[0])
    for t in (multimodal_output if isinstance(multimodal_output, list) else [multimodal_output])
)

Copilot AI Feb 9, 2026


There are existing unit tests for OrchestratorAggregator (see tests/metrics/test_stats.py), but the updated record_audio_generated_frames() behavior isn’t covered. Adding tests for at least these cases would prevent regressions: audio as a single tensor, audio as a list/tuple of chunk tensors, and empty list/tuple (should record 0 without raising).

Collaborator

@Gaohan123 Gaohan123 left a comment


Thanks for the nice catch. Could you please post some test results?

@Gaohan123 Gaohan123 added this to the v0.16.0 milestone Feb 10, 2026
@JuanPZuluaga
Contributor Author

JuanPZuluaga commented Feb 10, 2026

Thanks for the nice catch. Could you please post some test results?

Hi, do you mean the stats? Like this one:

(APIServer pid=3129320) INFO 02-10 06:29:15 [serving_speech.py:236] TTS speech request speech-806e9155062abcf3: text='Hi Gao Han, vllm-omni is a very cool project!', task_type=CustomVoice
(APIServer pid=3129320) INFO 02-10 06:29:15 [async_omni.py:315] [AsyncOrchestrator] Entering scheduling loop: stages=1, final_stage=0
(Worker pid=3130826) [Stage-0] INFO 02-10 06:29:15 [configuration_qwen3_tts.py:441] code_predictor_config is None. Initializing code_predictor model with default values
(Worker pid=3130826) Setting `pad_token_id` to `eos_token_id`:2150 for open-end generation.
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429]
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] [Overall Summary]
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] +-----------------------------+-----------+
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] | Field                       |     Value |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] +-----------------------------+-----------+
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] | e2e_requests                |         1 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] | e2e_wall_time_ms            | 2,596.641 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] | e2e_total_tokens            |        25 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] | e2e_avg_time_per_request_ms | 2,596.641 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] | e2e_avg_tokens_per_s        |     9.628 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] | e2e_stage_0_wall_time_ms    | 2,596.608 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:429] +-----------------------------+-----------+
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:455]
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:455] [RequestE2EStats [request_id=speech-806e9155062abcf3]]
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:455] +------------------+-----------+
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:455] | Field            |     Value |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:455] +------------------+-----------+
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:455] | e2e_total_ms     | 2,596.607 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:455] | e2e_total_tokens |        25 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:455] +------------------+-----------+
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508]
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] [StageRequestStats [request_id=speech-806e9155062abcf3]]
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] +------------------------+-----------+
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] | Field                  |         0 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] +------------------------+-----------+
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] | audio_generated_frames |    86,400 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] | batch_id               |         1 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] | batch_size             |         1 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] | num_tokens_in          |        25 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] | stage_gen_time_ms      | 2,591.946 |
(APIServer pid=3129320) INFO 02-10 06:29:18 [stats.py:508] +------------------------+-----------+

Contributor

@linyueqian linyueqian left a comment


Thanks for syncing with the upstream Qwen3-TTS tokenizer fix and addressing the metrics crash!

Metrics fix (stats.py): This change is no longer needed — the crash was already fixed on main via #1206 with a more defensive implementation (try/except guard, len check, scalar tensor handling). Since this branch was forked before that merge, the fix here is outdated. I'd suggest dropping the stats.py changes and rebasing on latest main.

Tokenizer padding fix: The padding_value=-1 and torch.clamp changes are correct and align with upstream, but there's a critical issue — see inline comments.

Signed-off-by: pablo <juanz9312@gmail.com>
@JuanPZuluaga
Contributor Author

Thanks @linyueqian, it's now aligned with upstream.

@JuanPZuluaga JuanPZuluaga deleted the qwen3tts.fix-bug branch February 21, 2026 10:57
@JuanPZuluaga JuanPZuluaga restored the qwen3tts.fix-bug branch February 23, 2026 14:47
@JuanPZuluaga JuanPZuluaga reopened this Feb 23, 2026
@JuanPZuluaga
Contributor Author

@Gaohan123 @linyueqian I closed this by mistake. Could we merge this?

@linyueqian
Contributor

LGTM @hsliuustc0106

@hsliuustc0106
Collaborator

@vllm-omni-reviewer

@github-actions

🤖 VLLM-Omni PR Review

Code Review: [Bugfix][Qwen3TTS]

1. Overview

This PR fixes a bug introduced in PR #891 that caused an IndexError: tuple index out of range when computing metrics for TTS requests with Qwen3TTS. The fix involves:

  1. Changing the padding value from 0 to -1 in the audio codes padding sequence
  2. Computing audio lengths before decoding using the -1 sentinel value
  3. Adding torch.clamp(audio_codes, min=0) to ensure negative padding values don't propagate to the decoder

The changes align with an upstream update in the Qwen3-TTS repository.

Overall Assessment: Positive - The fix addresses a real bug and follows the upstream implementation pattern.

2. Code Quality

Strengths

  • Consistent application: The fix is applied consistently across both tokenizer versions (12Hz and 25Hz)
  • Minimal changes: The diff is focused and targeted on the specific issue
  • Follows upstream: Aligns with the referenced Qwen3-TTS commit

Potential Issues

Logic correctness of padding value change:
The change from padding_value=0 to padding_value=-1 is correct because 0 could be a valid audio code token. Using -1 as a sentinel value is a common pattern for masking/padding.

Order of operations:
The reordering to compute audio_lengths before decoding is important because after torch.clamp, the -1 padding values become 0, which would make the length calculation incorrect if done after clamping.

# Correct order:
audio_lengths = (audio_codes[..., 0] > -1).sum(1) * self.decode_upsample_rate  # Before clamp
audio_codes = torch.clamp(audio_codes, min=0)  # Then clamp
audio_values = self.decoder.chunked_decode(...)  # Then decode

3. Architecture & Design

Integration

  • The changes integrate well with the existing codebase structure
  • Both tokenizer variants (12Hz and 25Hz) receive the same fix, maintaining consistency

Design Consideration

The use of -1 as a padding sentinel and subsequent clamping is a reasonable design pattern for handling variable-length sequences in tensor operations.

4. Security & Safety

No significant security concerns. The changes are:

  • Memory-safe (no new allocations that could cause issues)
  • Numerically safe (clamping prevents negative indices/values)

5. Testing & Documentation

Concerns

Missing Test Plan and Results:
The PR description has empty "Test Plan" and "Test Result" sections. Given the error traceback provided, the PR should include:

  1. Verification that the specific error case now works
  2. Confirmation that audio output is correctly generated
  3. Metrics are properly recorded

Recommended test verification:

# Example test command that should be documented
python -c "
from vllm_omni import AsyncLLMEngine
# ... test code that reproduces the original error scenario
"

6. Specific Suggestions

qwen3_tts_tokenizer.py:331

audio_codes_padded = pad_sequence(audio_codes_list, batch_first=True, padding_value=-1).to(self.device)

Suggestion: Consider adding a comment explaining why -1 is used as the padding value:

# Use -1 as padding value since 0 can be a valid audio code token
audio_codes_padded = pad_sequence(audio_codes_list, batch_first=True, padding_value=-1).to(self.device)

tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:994-997

The logic is correct, but consider adding a brief comment:

# Compute length before clamping since -1 is used as padding marker
audio_lengths = (audio_codes[..., 0] > -1).sum(1) * self.decode_upsample_rate
# Clamp to ensure padding values don't cause issues in decoder
audio_codes = torch.clamp(audio_codes, min=0)

tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:1513-1516

Same suggestion as above for consistency.

7. Approval Status

LGTM with suggestions

The fix is technically correct and addresses the root cause of the bug. The changes follow the upstream implementation and are applied consistently across both tokenizer variants.

Before merging, please:

  1. Fill in the Test Plan and Test Results sections in the PR description
  2. Verify the fix resolves the original error with a concrete test case
  3. Consider adding the suggested comments for code clarity (optional but recommended)

The core logic changes are sound and ready to merge once testing is documented.


This review was generated automatically by the VLLM-Omni PR Reviewer Bot
using glm-5.

@Gaohan123 Gaohan123 added the ready label to trigger buildkite CI label Feb 25, 2026
Collaborator

@hsliuustc0106 hsliuustc0106 left a comment


lgtm

@hsliuustc0106 hsliuustc0106 merged commit 66457c3 into vllm-project:main Feb 26, 2026
6 of 7 checks passed
@JuanPZuluaga JuanPZuluaga deleted the qwen3tts.fix-bug branch February 26, 2026 13:54
xuechendi pushed a commit to xuechendi/vllm-omni that referenced this pull request Feb 26, 2026
Signed-off-by: pablo <juanz9312@gmail.com>
Co-authored-by: Gao Han <gaohan19@huawei.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

Labels

ready label to trigger buildkite CI
