Skip to content

Commit 15173d5

Browse files
authored
Add a functional test of frame stacking (NVIDIA-NeMo#15424)
* Add a test for 4x-stacked Magpie-TTS with local transformer inference. Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com> * Tighten pass/fail thresholds for frame stacking test. Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com> * Add frame stacking test to CI. Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com> * Cleanup Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com> * Disable attention prior for now (debugging a test failure) Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com> * Update tests 1. Enable attention prior in frame stacking test. Did some basic tuning to get the attention prior to be functional with this checkpoints. 2. Decrease the SSIM target for the longform MOE test since it was sporadically failing wiht SSIM just below the threshold. Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com> * Disabling attention prior It would need to be tuned further to work with 4x stacking. Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com> --------- Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
1 parent 1692a8f commit 15173d5

File tree

3 files changed

+37
-1
lines changed

3 files changed

+37
-1
lines changed

.github/workflows/cicd-main-speech.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -199,6 +199,8 @@ jobs:
199199
script: L2_TTS_InferEvaluate_Magpietts_MoE_ZeroShot
200200
- runner: self-hosted-azure
201201
script: L2_TTS_InferEvaluatelongform_Magpietts_MoE_ZeroShot
202+
- runner: self-hosted-azure
203+
script: L2_TTS_InferEvaluate_Magpietts_FrameStacking
202204
needs: [unit-tests]
203205
runs-on: ${{ matrix.runner }}
204206
name: ${{ matrix.is-optional && 'PLEASEFIXME_' || '' }}${{ matrix.script }}
Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
# Copyright (c) 2020-2025, NVIDIA CORPORATION.
2+
#
3+
# Licensed under the Apache License, Version 2.0 (the "License");
4+
# you may not use this file except in compliance with the License.
5+
# You may obtain a copy of the License at
6+
#
7+
# http://www.apache.org/licenses/LICENSE-2.0
8+
#
9+
# Unless required by applicable law or agreed to in writing, software
10+
# distributed under the License is distributed on an "AS IS" BASIS,
11+
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
# See the License for the specific language governing permissions and
13+
# limitations under the License.
14+
15+
# Tests a 4x-stacked model with local transformer inference.
16+
17+
TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD=1 coverage run -a --data-file=/workspace/.coverage --source=/workspace/nemo examples/tts/magpietts_inference.py \
18+
--codecmodel_path /home/TestData/tts/21fps_causal_codecmodel.nemo \
19+
--datasets_json_path examples/tts/evalset_config.json \
20+
--datasets an4_val_ci \
21+
--out_dir ./mp_fs_4x_0 \
22+
--batch_size 4 \
23+
--use_cfg \
24+
--cfg_scale 2.5 \
25+
--num_repeats 1 \
26+
--temperature 0.6 \
27+
--hparams_files /home/TestData/tts/2602_FrameStacking4x/hparams.yaml \
28+
--checkpoint_files /home/TestData/tts/2602_FrameStacking4x/frame-stacking-4x-english-nanocodec.ckpt \
29+
--run_evaluation \
30+
--clean_up_disk \
31+
--cer_target 0.11 \
32+
--ssim_target 0.6 \
33+
--use_local_transformer \
34+
--longform_mode never

tests/functional_tests/L2_TTS_InferEvaluatelongform_Magpietts_MoE_ZeroShot.sh

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,4 +24,4 @@ TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD=1 coverage run -a --data-file=/workspace/.cover
2424
--run_evaluation \
2525
--clean_up_disk \
2626
--cer_target 0.08 \
27-
--ssim_target 0.50
27+
--ssim_target 0.40

0 commit comments

Comments
 (0)