[TTS] MagpieTTS inference: Add command line option to select a subset of datasets to run inference on by rfejgin · Pull Request #15212 · NVIDIA-NeMo/NeMo

rfejgin · 2025-12-20T00:33:38Z

Added a command line option to select a subset of datasets to run inference on.

Reasoning: For day-to-day work, we need a way to run inference on only a subset of datasets. Editing the JSON file directly makes local testing non-reproducible, because the file gets changed over and over with no record of the edits. This PR therefore adds a programmatic way to choose datasets. The new command line argument is entirely optional; if it is not specified, all datasets in the JSON file will be processed.

New command line argument format: --datasets <dataset1,dataset2,...> where
dataset1, dataset2, ... are the names of datasets to process in the
datasets_json_path file.

If not specified, all datasets in the datasets_json_path will be processed.
If specified, only the datasets in the list will be processed.
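
The behavior described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the helper names `parse_dataset_names`, `select_datasets`, and `build_parser` are hypothetical, and the sketch assumes that the file at `datasets_json_path` is a JSON object whose top-level keys are the dataset names. Raising an error for unknown names is a design choice of this sketch, not something the PR description states.

```python
import argparse
import json


def parse_dataset_names(raw):
    """Split a comma-separated string into a list of names.

    Returns None when the option was not given, which means "no filtering".
    """
    if raw is None:
        return None
    # Tolerate spaces around commas and ignore empty items such as "a,,b".
    return [name.strip() for name in raw.split(",") if name.strip()]


def select_datasets(all_datasets, selected):
    """Return only the requested entries, in the order they were requested."""
    if selected is None:
        return dict(all_datasets)
    missing = [name for name in selected if name not in all_datasets]
    if missing:
        # Assumption of this sketch: fail loudly instead of silently skipping.
        raise ValueError(f"Unknown dataset(s): {missing}. Available: {sorted(all_datasets)}")
    return {name: all_datasets[name] for name in selected}


def build_parser():
    parser = argparse.ArgumentParser(description="Dataset selection sketch")
    parser.add_argument("--datasets_json_path", required=True)
    parser.add_argument(
        "--datasets",
        type=str,
        default=None,
        help="Comma-separated dataset names to process. Default: all datasets in the JSON file.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    with open(args.datasets_json_path) as f:
        all_datasets = json.load(f)  # assumed layout: {dataset_name: dataset_config}
    chosen = select_datasets(all_datasets, parse_dataset_names(args.datasets))
    print("Datasets to process:", list(chosen))
```

Saved under a hypothetical file name, the sketch would be run with `--datasets_json_path <file> --datasets dataset1,dataset2`; leaving out `--datasets` selects every entry in the file.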

New command line argument: --datasets <dataset1,dataset2,...> where dataset1, dataset2, ... are the names of the datasets to process in the datasets_json_path file. If not specified, all datasets in the datasets_json_path will be processed. If specified, only the datasets in the list will be processed.

Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>

* Correctly handle comma-separated list of dataset names in the --datasets argument.
* Help text

Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>

blisc · 2025-12-31T15:56:35Z

Closing in favour of #15242

blisc · 2025-12-31T17:18:38Z

Will merge this instead of #15242

* move inference params to checkpoint and make do_tts apply prior
  Signed-off-by: Jason <jasoli@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: blisc <blisc@users.noreply.github.com>
* Enable LT in do_tts; add docstrings
  Signed-off-by: Jason <jasoli@nvidia.com>
* update defaults; merge inference dataclasses
  Signed-off-by: Jason <jasoli@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: blisc <blisc@users.noreply.github.com>
* update epsilon value
  Signed-off-by: Jason <jasoli@nvidia.com>
* add defaults for inference; fix longform mode; fix bug introduced in #15212
  Signed-off-by: Jason <jasoli@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: blisc <blisc@users.noreply.github.com>
* fix field key usage
  Signed-off-by: Jason <jasoli@nvidia.com>

---------

Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: blisc <blisc@users.noreply.github.com>
Co-authored-by: blisc <blisc@users.noreply.github.com>

rfejgin added 2 commits December 19, 2025 15:57

Refined datasets filtering in the inference script

10a838b

github-actions bot added the TTS label Dec 20, 2025

rfejgin changed the title ~~[TTS] MagpieTTS inference: Add command line to select a subset of datasets to run inference on~~ [TTS] MagpieTTS inference: Add command line option to select a subset of datasets to run inference on Dec 20, 2025

Merge branch 'main' into magpietts_datasets_inferece_command_line

650ab32

rfejgin requested review from blisc and subhankar-ghosh December 20, 2025 00:42

rfejgin added the Run CICD label Dec 20, 2025

blisc approved these changes Dec 30, 2025

blisc closed this Dec 31, 2025

blisc reopened this Dec 31, 2025

chtruong814 added Run CICD and removed Run CICD labels Dec 31, 2025

blisc merged commit 316aea2 into NVIDIA-NeMo:main Dec 31, 2025
108 of 111 checks passed

chtruong814 temporarily deployed to test December 31, 2025 17:19 — with GitHub Actions Inactive

blisc added a commit that referenced this pull request Jan 7, 2026

add defaults for inference; fix longform mode; fix bug introduced in #…

9f65e6b

…15212 Signed-off-by: Jason <jasoli@nvidia.com>

blisc mentioned this pull request Jan 7, 2026

Update MagpieTTS' Inference Parameter Configuration #15254

Merged

3 tasks

blisc merged 3 commits into NVIDIA-NeMo:main from rfejgin:magpietts_datasets_inferece_command_line

3 participants
