Commit 7630336

clean up and Merge branch 'main' into heh/eou_pr

Signed-off-by: stevehuang52 <[email protected]>
2 parents: 99b980e + 6442018

File tree

613 files changed: +21120 −55841 lines


.github/workflows/cicd-main-nemo2.yml

Lines changed: 1 addition & 8 deletions

```diff
@@ -39,6 +39,7 @@ jobs:
           runner: self-hosted-azure
         - script: L2_NeMo_2_llama3_pretraining_recipe
           runner: self-hosted-azure
+          is-optional: true
         # - script: L2_NeMo_2_llama3_pytorch_profiler
         #   runner: self-hosted-azure
         #   timeout: 20
@@ -162,23 +163,15 @@ jobs:
           runner: self-hosted-azure
         - script: L2_NEMO_2_LoRA_MERGE
           runner: self-hosted-azure
-        - script: L2_NEMO_2_LoRA_Export
-          runner: self-hosted-azure-gpus-1
         - script: L2_NEMO_2_LoRA_Inference
           runner: self-hosted-azure-gpus-1
         - script: L2_NeMo_2_NeMo_Mcore_Mixtral_bitexact
           runner: self-hosted-azure
           is-optional: true
-        - script: L2_NeMo_2_Automodel_PTQ_trtllm
-          runner: self-hosted-azure
-        - script: L2_NeMo_2_Automodel_PTQ_hf
-          runner: self-hosted-azure
         - script: L2_NeMo_2_PTQ_Llama2_FP8_trtllm
           runner: self-hosted-azure
         - script: L2_NeMo_2_PTQ_Llama2_FP8_nemo
           runner: self-hosted-azure
-        - script: L2_NeMo_2_PTQ_Unified_Export
-          runner: self-hosted-azure
         - script: L2_NeMo_2_Distill_Llama3_TP1PP2
           runner: self-hosted-azure
         - script: L2_NeMo_2_Prune_Llama_TP1PP2
```

.github/workflows/cicd-main-speech.yml

Lines changed: 12 additions & 6 deletions

```diff
@@ -164,12 +164,6 @@ jobs:
           script: L2_Speaker_dev_run_Neural_Diarizer_Inference
         - runner: self-hosted-azure
           script: L2_Speaker_dev_run_Multispeaker_ASR_Data_Simulation
-        - runner: self-hosted-azure
-          script: L2_Segmentation_Tool_Parallel_ctc_segmentation_test_L2_Eng_CitriNet_with_wav
-        - runner: self-hosted-azure
-          script: L2_Segmentation_Tool_Parallel_ctc_segmentation_test_L2_Ru_QN_with_mp3
-        - script: L2_HF_Transformer_SpeechLM_SFT_2gpu
-          runner: self-hosted-azure
         - script: L2_SpeechLM_LoRA_TP1PP1_MBS2
           runner: self-hosted-azure
         - runner: self-hosted-azure-gpus-1
@@ -189,6 +183,18 @@ jobs:
         - runner: self-hosted-azure
           script: SPEECHLM_HF_Training_SALM
           timeout: 20
+        - runner: self-hosted-azure
+          script: L2_TTS_Fast_dev_runs_Magpietts_DecoderContext
+        - runner: self-hosted-azure
+          script: L2_TTS_Fast_dev_runs_Magpietts_MultiEncoder
+        - runner: self-hosted-azure
+          script: L2_TTS_Fast_dev_runs_Magpietts_OnlinePO
+        - runner: self-hosted-azure
+          script: L2_TTS_InferEvaluate_Magpietts_ZeroShot
+        - runner: self-hosted-azure
+          script: L2_TTS_InferEvaluate_Magpietts_SeenSpeakers
+        - runner: self-hosted-azure
+          script: L2_TTS_InferEvaluatelongform_Magpietts_ZeroShot
     needs: [unit-tests]
     runs-on: ${{ matrix.runner }}
     name: ${{ matrix.is-optional && 'PLEASEFIXME_' || '' }}${{ matrix.script }}
```
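The `is-optional: true` entries in these test matrices feed the `name:` template at the bottom of the hunk: optional jobs are displayed with a `PLEASEFIXME_` prefix so a red check can be recognized as advisory rather than blocking. A minimal sketch of the pattern, with hypothetical script names (only `is-optional` and the `name:` expression are taken from the workflow):

```yaml
jobs:
  l2-tests:
    strategy:
      matrix:
        include:
          - script: some_required_test     # hypothetical job name
            runner: self-hosted-azure
          - script: some_known_flaky_test  # hypothetical job name
            runner: self-hosted-azure
            is-optional: true              # mark this job as non-blocking
    runs-on: ${{ matrix.runner }}
    # The && / || chain works as a ternary: when is-optional is truthy the
    # name gets the PLEASEFIXME_ prefix; otherwise the prefix is ''.
    name: ${{ matrix.is-optional && 'PLEASEFIXME_' || '' }}${{ matrix.script }}
    steps:
      - run: echo "running ${{ matrix.script }}"
```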

.github/workflows/code-linting.yml

Lines changed: 2 additions & 1 deletion

```diff
@@ -42,7 +42,8 @@ jobs:
             "!nemo/collections/audio/**/*.py",
             "!nemo/collections/multimodal/speech_llm/**/*.py",
             "!nemo/collections/speechlm/**/*.py",
-            "!nemo/collections/speechlm2/**/*.py"
+            "!nemo/collections/speechlm2/**/*.py",
+            "!nemo/export/**/*.py"
           ] | join(",")')
         fi
```
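The exclusion list in this hunk is a jq array of negated glob patterns that the workflow collapses into one comma-separated string with `join(",")`. A sketch of that step shape (the step name and output variable are assumptions, and only two of the patterns are reproduced):

```yaml
steps:
  - name: Build lint path filter   # hypothetical step name
    id: filter
    run: |
      # jq -n starts from no input; -r emits the joined string raw,
      # without surrounding JSON quotes.
      FILTERS=$(jq -nr '[
        "!nemo/collections/speechlm2/**/*.py",
        "!nemo/export/**/*.py"
      ] | join(",")')
      echo "filters=$FILTERS" >> "$GITHUB_OUTPUT"
```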

.github/workflows/update-buildcache.yml

Lines changed: 0 additions & 2 deletions

```diff
@@ -93,8 +93,6 @@ jobs:
           image-name: nemo_container_speech
         - dockerfile: docker/Dockerfile.ci
           image-name: nemo_container
-        - dockerfile: docker/Dockerfile.ci.export_deploy
-          image-name: nemo_container_export_deploy
     with:
       image-name: ${{ matrix.image-name }}
       dockerfile: ${{ matrix.dockerfile }}
```
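This hunk trims a build matrix that fans a single container-build workflow out over Dockerfile/image pairs; dropping the `export_deploy` pair stops refreshing the cache for an image that now lives in the separate Export-Deploy repo. A sketch of the call shape (the reusable-workflow path and the speech dockerfile pairing are assumptions, since the diff shows only the matrix entries):

```yaml
jobs:
  build-cache:
    strategy:
      matrix:
        include:
          - dockerfile: docker/Dockerfile.ci.speech  # assumed pairing
            image-name: nemo_container_speech
          - dockerfile: docker/Dockerfile.ci
            image-name: nemo_container
    uses: ./.github/workflows/_build.yml             # hypothetical path
    with:
      image-name: ${{ matrix.image-name }}
      dockerfile: ${{ matrix.dockerfile }}
```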

CHANGELOG.md

Lines changed: 222 additions & 0 deletions

```diff
@@ -1,6 +1,228 @@
 # Changelog
 
 <!-- Next changelog -->
+## NVIDIA Neural Modules 2.6.0
+
+### Highlights
+
+- Speech
+  - Add Timestamps to streaming ASR [PR](https://github.com/NVIDIA-NeMo/NeMo/pull/14766)
+  - Add Streaming decoding policies (Wait-K and AlignAtt) for Canary model [PR](https://github.com/NVIDIA-NeMo/NeMo/pull/14765)
+  - Add NeMo Voice Agent [PR](https://github.com/NVIDIA-NeMo/NeMo/pull/14325)
+  - Hybrid RNNT-CTC Prompted Parakeet Model support [PR](https://github.com/NVIDIA-NeMo/NeMo/pull/14561)
+  - [New] MT-Parakeet Streaming Models [release](https://huggingface.co/nvidia/multitalker-parakeet-streaming-0.6b-v1)
+- Removed the Automodel module. Automodel is available in the repo https://github.com/NVIDIA-NeMo/Automodel.
+- Removed the Deploy module. Export & Deploy is available in the repo https://github.com/NVIDIA-NeMo/Export-Deploy.
+- Non-Speech NeMo 2.0 collections are deprecated and will be removed in a later release. Their functionality is available in the Megatron Bridge repo at https://github.com/NVIDIA-NeMo/Megatron-Bridge.
+
+### Known Issues
+
+- NeMo voice agent pipecat connecting issues
+
+### Detailed Changelogs:
+
+#### ASR
+
+<details><summary>Changelog</summary>
+
+- fixing kernel restarting when transcribing by @weiqingw4ng :: PR: #14665
+- Downgrade "datasets" library version in ASR tutorial to ensure compatibility with HF Datasets used by @KunalDhawan :: PR: #14679
+- Fixing Sortformer training tutorial notebook by @tango4j :: PR: #14680
+- Fix for "EncDecRNNTBPEModel transcribe() failed with TypeError" by @andrusenkoau :: PR: #14698
+- Force activations and weights cast to FP32 Jasper Encoder Squeeze-Excite (merge to main) by @erastorgueva-nv :: PR: #14743
+- Use lhotse dataloader for ASR models to support in-manifest channel selection for multichannel recordings by @racoiaws :: PR: #14586
+- add transducer timestamps without alignments, timestamps to streaming by @lilithgrigoryan :: PR: #14766
+- Adding bf16 Sortformer train and inference by @tango4j :: PR: #14627
+- Replace texterrors with kaldialign library by @andrusenkoau :: PR: #14775
+- fix: Use shutil.copy fallback to handle file metadata permission errors by @vipnydav :: PR: #14639
+- Add Customization Capabilities to Cache-Aware Models by @artbataev :: PR: #14757
+- Documentation for gpu-based phrase boosting by @andrusenkoau :: PR: #14800
+- Streaming decoding policies (Wait-K and AlignAtt) for Canary model by @andrusenkoau :: PR: #14765
+- Add tests for streaming buffered and cache-aware transducer models by @artbataev :: PR: #14823
+- Merge updates of Multi-Talker Parakeet Model, Modules, Dataloader and Utils PR 01 by @weiqingw4ng :: PR: #14905
+- Merge updates of Multi-Talker Parakeet - Unit tests and CI tests PR 02 by @weiqingw4ng :: PR: #14932
+- Add Parakeet Hybrid RNNT CTC BPE Model with Prompt support by @ealbasiri :: PR: #14561
+- fix notebooks by @nithinraok :: PR: #15079
+- cherry pick #15070 by @nithinraok :: PR: #15082
+
+</details>
+
+#### TTS
+
+<details><summary>Changelog</summary>
+
+- Remove outdated TTS Tutorials by @blisc :: PR: #14660
+- Add KokoroTTS support for voice agent framework by @tango4j :: PR: #14910
+- remove language_modeling by @dimapihtar :: PR: #14192
+
+</details>
+
+#### NLP / NMT
+
+<details><summary>Changelog</summary>
+
+- Add gpt-oss by @cuichenx :: PR: #14457
+- Fix sequence packing loss calculation by @rayandasoriya :: PR: #14437
+- [Perf script] Llama and GPT3 perf script use mlp cast fusion by @guyueh1 :: PR: #14575
+- Delete tutorials/llm/llama/biomedical-qa directory by @cuichenx :: PR: #14653
+- Add gpt-oss lora exporter by @cuichenx :: PR: #14589
+- Replace MegatronTokenizer with MegatronLegacyTokenizer by @chtruong814 :: PR: #14721
+- Update ModelCommPGs API from megatron-core by @yaoyu-33 :: PR: #14578
+- feat: Compatibility modification of megatron-fsdp by @shjwudp :: PR: #14593
+- imported get_moe_layer_wise_logging_tracker from megatron core moe_utils by @prathamk-tw :: PR: #14694
+- Fix gpt-oss yarn_original_max_position_embeddings value by @cuichenx :: PR: #14706
+- Update docs per guidance by @pablo-garay :: PR: #14841
+- Fixing three mcore links by @aschilling-nv :: PR: #14839
+- Documentation for gpu-based phrase boosting by @andrusenkoau :: PR: #14800
+- Update gpt-oss configs by @cuichenx :: PR: #14674
+- remove language_modeling by @dimapihtar :: PR: #14192
+- cp: `remove ExportDeploy` into `r2.6.0` by @pablo-garay :: PR: #15053
+- cherry pick #15070 by @nithinraok :: PR: #15082
+
+</details>
+
+#### Export
+
+<details><summary>Changelog</summary>
+
+- fix: fix missing rope scaling in exporting llama embedding model by @ZhiyuLi-Nvidia :: PR: #14523
+- Add gpt-oss lora exporter by @cuichenx :: PR: #14589
+- Skip trt-llm and vllm install in install test by @chtruong814 :: PR: #14663
+- Fix deepseek export dtype by @cuichenx :: PR: #14307
+- Remove export-deploy, automodel, and eval tutorials by @chtruong814 :: PR: #14790
+- cp: `remove ExportDeploy` into `r2.6.0` by @pablo-garay :: PR: #15053
+
+</details>
+
+#### Uncategorized:
+
+<details><summary>Changelog</summary>
+
+- Version bump to `2.6.0rc0.dev0` by @github-actions[bot] :: PR: #14512
+- [Audio]: added conformer U-Net model for SE by @nasretdinovr :: PR: #14442
+- hyena/evo2: Make sure to convert to real after fp32 conversion by @antonvnv :: PR: #14515
+- Force-set restore path for student in KD mode by @AAnoosheh :: PR: #14532
+- Skip PTQ if PTQ model path exists by @jenchen13 :: PR: #14536
+- Support QwenVL for inference API by @meatybobby :: PR: #14534
+- Hyena: Allow to use unfused RMSNorm + TELinear to restore accuracy and some speed by @antonvnv :: PR: #14542
+- [Audio]: added streaming mode to SpectrogramToAudio by @nasretdinovr :: PR: #14524
+- Update evo2 defaults so converted checkpoints have the right parameters by @jstjohn :: PR: #14514
+- deprecate t0 scripts by @dimapihtar :: PR: #14585
+- cfg typo correction by @malay-nagda :: PR: #14588
+- [Perf script] Add use_te_activation_func and activation_func_fp8_input_store flags by @guyueh1 :: PR: #14522
+- Modify logging message to signal that RestoreConfig will be used by @balvisio :: PR: #14469
+- Bump TE and Mcore by @chtruong814 :: PR: #14568
+- Avoid host-device sync in PTL logging by @WanZzzzzz :: PR: #14489
+- Integrate implicit filter kernel with Hyena layer by @farhadrgh :: PR: #14621
+- Fix kv_channels configuration for Gemma2 27b by @ananthsub :: PR: #14590
+- [Flux] small fixes by @CarlosGomes98 :: PR: #14333
+- [Flux] Add MXFP8 Support by @alpha0422 :: PR: #14473
+- Use hugginface_hub for downloading the FLUX checkpoint by @suiyoubi :: PR: #14638
+- Fine-tune embedding models (E5-Large-V2 and LLaMA-3.2-1B) on the allnli triplet dataset with NeMo Framework by @girihemant19 :: PR: #14584
+- remove service launch scripts by @dimapihtar :: PR: #14647
+- Warn instead of error when chat template doesn't contain generation keyword by @jenchen13 :: PR: #14641
+- Fix function calling notebook by @cuichenx :: PR: #14643
+- [Audio]: fixed bug in conformer unet by @nasretdinovr :: PR: #14626
+- Fix code checkout during test by @chtruong814 :: PR: #14658
+- Fix Flux seed as optional Arg by @suiyoubi :: PR: #14652
+- Remove PEFT scheme condition from recipe by @JRD971000 :: PR: #14661
+- Add NeMo Voice Agent by @stevehuang52 :: PR: #14325
+- Update get_tensor_shapes function whose signature was refactored by @AAnoosheh :: PR: #14594
+- Delete nemo1 notebooks by @cuichenx :: PR: #14677
+- Bump latest Mcore 020abf01 by @chtruong814 :: PR: #14676
+- [Flux] correct vae_downscale_factor by @CarlosGomes98 :: PR: #14425
+- Bump modelopt to 0.35.0 and remove `safe_import("modelopt")` in llm collection by @kevalmorabia97 :: PR: #14656
+- Canary tutorial fix by @nune-tadevosyan :: PR: #14699
+- Add option for LoRA with Transformer Engine op fuser by @timmoon10 :: PR: #14411
+- add load-in-4bit param by @dimapihtar :: PR: #14636
+- Support NVFP4 recipe by @WanZzzzzz :: PR: #14625
+- Fix broken link in Reasoning-SFT.ipynb by @cuichenx :: PR: #14716
+- Remove artificial block to vortex fp8 TP by @jstjohn :: PR: #14684
+- Drop speech_llm example suite by @yaoyu-33 :: PR: #14683
+- remove env var by @malay-nagda :: PR: #14739
+- detach arg option for run scripts by @malay-nagda :: PR: #14722
+- Randomized shard slicing for tarred data by @pzelasko :: PR: #14558
+- Data prediction objective for flow matching speech enhancement models by @racoiaws :: PR: #14749
+- Fix Some Failures by @alpha0422 :: PR: #14763
+- Support additional Slurm parameters (#14701) by @bdubauski :: PR: #14742
+- [Flux] Remove Redundant Host & Device Sync by @alpha0422 :: PR: #14711
+- [Flux] Full Iteration CUDA Graph by @alpha0422 :: PR: #14744
+- Update prune-distill notebooks to Qwen3 + simplify + mmlu eval by @kevalmorabia97 :: PR: #14785
+- ci: Automodel deprecation warning by @thomasdhc :: PR: #14787
+- Bug in MXFP8 recipe by @adityavavreNVDA :: PR: #14793
+- feat: Disable blank Issues by @pablo-garay :: PR: #14788
+- ci: Add community label bot by @chtruong814 :: PR: #14796
+- Add mistral small3 24B config and recipe by @eagle705 :: PR: #14784
+- Update changelog for `r2.3.0` by @github-actions[bot] :: PR: #14812
+- QWEN2.5-VL 7B FP8 Recipe by @tomlifu :: PR: #14801
+- Feat: Disk space management: for nemo install test by @pablo-garay :: PR: #14822
+- Evo2 address rare over-masking in 1m context dataset by @jstjohn :: PR: #14821
+- Update cherry-pick workflow to use version 0.63.0 by @pablo-garay :: PR: #14832
+- Removing automodel items by @aschilling-nv :: PR: #14840
+- Update changelog for `v2.4.1` by @github-actions[bot] :: PR: #14828
+- Fix lm_eval installation in pruning tutorial for 25.09 container by @kevalmorabia97 :: PR: #14865
+- Add nemotron-nano-v2 support to voice agent by @stevehuang52 :: PR: #14704
+- Update changelog for 2.5.0 by @chtruong814 :: PR: #14890
+- [Qwen3] Fix the flop cal for Qwen3 by @gdengk :: PR: #14897
+- [lhotse][aistore] added support input_cfg.yaml directly from aistore bucket by @XuesongYang :: PR: #14891
+- Harden _is_target_allowed by adding runtime class validation on top of prefix checks to prevent unsafe target resolution by @KunalDhawan :: PR: #14540
+- Enable simplified DistOpt checkpoint formats by @mikolajblaz :: PR: #14428
+- Fix the load checkpointing issue -- onelogger callback gets called multiple time in some case. by @liquor233 :: PR: #14945
+- Revert "new changelog-build" by @pablo-garay :: PR: #14949
+- feat: new changelog-build by @pablo-garay :: PR: #14950
+- Update llama4 utils kwargs by @yaoyu-33 :: PR: #14924
+- Update README.md by @snowmanwwg :: PR: #14917
+- Update all outdated NeMo Curator links by @sarahyurick :: PR: #14760
+- Freeze tags in in `r2.6.0` by @github-actions[bot] :: PR: #14957
+- cp: `Bump MCore, TE, Pytorch, and modelopt for 25.11 (14946)` into `r2.6.0` by @chtruong814 :: PR: #14976
+- cp: `Update ctc-segmentation (14991)` into `r2.6.0` by @chtruong814 :: PR: #14998
+- cherry-pick of #14962 by @dimapihtar :: PR: #15000
+- cp: `Pass timeout when running speech functional tests (15012)` into `r2.6.0` by @chtruong814 :: PR: #15013
+- cp: `check asr models (14989)` into `r2.6.0` by @chtruong814 :: PR: #15002
+- cp: `Enable EP in PTQ (15015)` into `r2.6.0` by @chtruong814 :: PR: #15026
+- cp: `Update numba to numba-cuda and update cuda python bindings usage (15018)` into `r2.6.0` by @chtruong814 :: PR: #15024
+- cp: `Add import guards for mcore lightning module (14970)` into `r2.6.0` by @chtruong814 :: PR: #14981
+- cp: `fix loading of hyb ctc rnnt bpe models when using from pretrained (15042)` into `r2.6.0` by @chtruong814 :: PR: #15045
+- cp: `fix: fix update-buildcache workflow after ED remove (15051)` into `r2.6.0` by @chtruong814 :: PR: #15052
+- cp: `chore: update Lightning requirements version (15004)` into `r2.6.0` by @chtruong814 :: PR: #15049
+- cp: `update notebook (15093)` into `r2.6.0` by @chtruong814 :: PR: #15094
+- cp: `Fix: Obsolete Attribute [SDE] (15105)` into `r2.6.0` by @chtruong814 :: PR: #15106
+- cp: `Upgrade NeMo ASR tutorials from Mozilla/CommonVoice to Google/FLEURS (15103)` into `r2.6.0` by @chtruong814 :: PR: #15107
+- cp: `chore: Remove Automodel module (15044)` into `r2.6.0` by @chtruong814 :: PR: #15084
+- cp: `Add deprecation notice to modules (15050)` into `r2.6.0` by @chtruong814 :: PR: #15110
+
+</details>
+
+## NVIDIA Neural Modules 2.5.3
+
+### Highlights
+
+- This release addresses known security issues. For the latest NVIDIA Vulnerability Disclosure Information visit <https://www.nvidia.com/en-us/security/>, for acknowledgement please reach out to the NVIDIA PSIRT team at <[email protected]>
+- Update nv-one-logger
+- Update ctc-segmentation
+
+### Detailed Changelogs:
+
+#### Text Normalization / Inverse Text Normalization
+
+<details><summary>Changelog</summary>
+
+- chore: update Lightning requirement by @liquor233 :: PR: #15005
+
+</details>
+
+#### Uncategorized:
+
+<details><summary>Changelog</summary>
+
+- cp: `Update ctc-segmentation (14991)` into `r2.5.0` by @chtruong814 :: PR: #15020
+- Bump to 2.5.3 by @chtruong814 :: PR: #15022
+
+</details>
+
 ## NVIDIA Neural Modules 2.5.2
 
 ### Detailed Changelogs:
```

MANIFEST.in

Lines changed: 0 additions & 1 deletion

```diff
@@ -1,2 +1 @@
 include requirements/*
-include tools/ctc_segmentation/requirements.txt
```
