# Changelog

<!-- Next changelog -->
## NVIDIA Neural Modules 2.6.0

### Highlights

- Speech
  - Add timestamps to streaming ASR (usage sketch after this list) [PR](https://github.com/NVIDIA-NeMo/NeMo/pull/14766)
  - Add streaming decoding policies (Wait-K and AlignAtt) for the Canary model [PR](https://github.com/NVIDIA-NeMo/NeMo/pull/14765)
  - Add NeMo Voice Agent [PR](https://github.com/NVIDIA-NeMo/NeMo/pull/14325)
  - Add Hybrid RNNT-CTC Prompted Parakeet model support [PR](https://github.com/NVIDIA-NeMo/NeMo/pull/14561)
  - [New] MT-Parakeet streaming models [release](https://huggingface.co/nvidia/multitalker-parakeet-streaming-0.6b-v1)
- Removed the Automodel module. Automodel is now maintained at https://github.com/NVIDIA-NeMo/Automodel.
- Removed the Deploy module. Export & Deploy is now maintained at https://github.com/NVIDIA-NeMo/Export-Deploy.
- Non-speech NeMo 2.0 collections are deprecated and will be removed in a later release. Their functionality is available in the Megatron Bridge repo at https://github.com/NVIDIA-NeMo/Megatron-Bridge.
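
As background for the timestamp highlight above, here is a minimal sketch of the timestamp-enabled `transcribe()` path that the streaming support builds on. The checkpoint name is illustrative, and the exact keys of the returned timestamp dictionary may vary between NeMo versions, so treat them as assumptions rather than a guaranteed interface:

```python
# Minimal sketch: word-level timestamps from a pretrained NeMo ASR model.
# The checkpoint name below is illustrative; any ASR model with timestamp
# support should work. Timestamp dict keys are assumptions and may differ
# across NeMo versions.
import nemo.collections.asr as nemo_asr

model = nemo_asr.models.ASRModel.from_pretrained("nvidia/parakeet-tdt-0.6b-v2")

# timestamps=True asks transcribe() to return timing info with each hypothesis.
hypotheses = model.transcribe(["audio.wav"], timestamps=True)

# Print assumed start/end times (in seconds) for every recognized word.
for stamp in hypotheses[0].timestamp["word"]:
    print(f'{stamp["word"]}: {stamp["start"]:.2f}s - {stamp["end"]:.2f}s')
```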

### Known Issues

- NeMo Voice Agent has known connection issues with pipecat

### Detailed Changelogs:

#### ASR

<details><summary>Changelog</summary>

- fixing kernel restarting when transcribing by @weiqingw4ng :: PR: #14665
- Downgrade "datasets" library version in ASR tutorial to ensure compatibility with HF Datasets used by @KunalDhawan :: PR: #14679
- Fixing Sortformer training tutorial notebook by @tango4j :: PR: #14680
- Fix for "EncDecRNNTBPEModel transcribe() failed with TypeError" by @andrusenkoau :: PR: #14698
- Force activations and weights cast to FP32 Jasper Encoder Squeeze-Excite (merge to main) by @erastorgueva-nv :: PR: #14743
- Use lhotse dataloader for ASR models to support in-manifest channel selection for multichannel recordings by @racoiaws :: PR: #14586
- add transducer timestamps without alignments, timestamps to streaming by @lilithgrigoryan :: PR: #14766
- Adding bf16 Sortformer train and inference by @tango4j :: PR: #14627
- Replace texterrors with kaldialign library by @andrusenkoau :: PR: #14775
- fix: Use shutil.copy fallback to handle file metadata permission errors by @vipnydav :: PR: #14639
- Add Customization Capabilities to Cache-Aware Models by @artbataev :: PR: #14757
- Documentation for gpu-based phrase boosting by @andrusenkoau :: PR: #14800
- Streaming decoding policies (Wait-K and AlignAtt) for Canary model by @andrusenkoau :: PR: #14765
- Add tests for streaming buffered and cache-aware transducer models by @artbataev :: PR: #14823
- Merge updates of Multi-Talker Parakeet Model, Modules, Dataloader and Utils PR 01 by @weiqingw4ng :: PR: #14905
- Merge updates of Multi-Talker Parakeet - Unit tests and CI tests PR 02 by @weiqingw4ng :: PR: #14932
- Add Parakeet Hybrid RNNT CTC BPE Model with Prompt support by @ealbasiri :: PR: #14561
- fix notebooks by @nithinraok :: PR: #15079
- cherry pick #15070 by @nithinraok :: PR: #15082

</details>

#### TTS

<details><summary>Changelog</summary>

- Remove outdated TTS Tutorials by @blisc :: PR: #14660
- Add KokoroTTS support for voice agent framework by @tango4j :: PR: #14910
- remove language_modeling by @dimapihtar :: PR: #14192

</details>

#### NLP / NMT

<details><summary>Changelog</summary>

- Add gpt-oss by @cuichenx :: PR: #14457
- Fix sequence packing loss calculation by @rayandasoriya :: PR: #14437
- [Perf script] Llama and GPT3 perf script use mlp cast fusion by @guyueh1 :: PR: #14575
- Delete tutorials/llm/llama/biomedical-qa directory by @cuichenx :: PR: #14653
- Add gpt-oss lora exporter by @cuichenx :: PR: #14589
- Replace MegatronTokenizer with MegatronLegacyTokenizer by @chtruong814 :: PR: #14721
- Update ModelCommPGs API from megatron-core by @yaoyu-33 :: PR: #14578
- feat: Compatibility modification of megatron-fsdp by @shjwudp :: PR: #14593
- imported get_moe_layer_wise_logging_tracker from megatron core moe_utils by @prathamk-tw :: PR: #14694
- Fix gpt-oss yarn_original_max_position_embeddings value by @cuichenx :: PR: #14706
- Update docs per guidance by @pablo-garay :: PR: #14841
- Fixing three mcore links by @aschilling-nv :: PR: #14839
- Documentation for gpu-based phrase boosting by @andrusenkoau :: PR: #14800
- Update gpt-oss configs by @cuichenx :: PR: #14674
- remove language_modeling by @dimapihtar :: PR: #14192
- cp: `remove ExportDeploy` into `r2.6.0` by @pablo-garay :: PR: #15053
- cherry pick #15070 by @nithinraok :: PR: #15082

</details>

#### Export

<details><summary>Changelog</summary>

- fix: fix missing rope scaling in exporting llama embedding model by @ZhiyuLi-Nvidia :: PR: #14523
- Add gpt-oss lora exporter by @cuichenx :: PR: #14589
- Skip trt-llm and vllm install in install test by @chtruong814 :: PR: #14663
- Fix deepseek export dtype by @cuichenx :: PR: #14307
- Remove export-deploy, automodel, and eval tutorials by @chtruong814 :: PR: #14790
- cp: `remove ExportDeploy` into `r2.6.0` by @pablo-garay :: PR: #15053

</details>

#### Uncategorized:

<details><summary>Changelog</summary>

- Version bump to `2.6.0rc0.dev0` by @github-actions[bot] :: PR: #14512
- [Audio]: added conformer U-Net model for SE by @nasretdinovr :: PR: #14442
- hyena/evo2: Make sure to convert to real after fp32 conversion by @antonvnv :: PR: #14515
- Force-set restore path for student in KD mode by @AAnoosheh :: PR: #14532
- Skip PTQ if PTQ model path exists by @jenchen13 :: PR: #14536
- Support QwenVL for inference API by @meatybobby :: PR: #14534
- Hyena: Allow using unfused RMSNorm + TELinear to restore accuracy and some speed by @antonvnv :: PR: #14542
- [Audio]: added streaming mode to SpectrogramToAudio by @nasretdinovr :: PR: #14524
- Update evo2 defaults so converted checkpoints have the right parameters by @jstjohn :: PR: #14514
- deprecate t0 scripts by @dimapihtar :: PR: #14585
- cfg typo correction by @malay-nagda :: PR: #14588
- [Perf script] Add use_te_activation_func and activation_func_fp8_input_store flags by @guyueh1 :: PR: #14522
- Modify logging message to signal that RestoreConfig will be used by @balvisio :: PR: #14469
- Bump TE and Mcore by @chtruong814 :: PR: #14568
- Avoid host-device sync in PTL logging by @WanZzzzzz :: PR: #14489
- Integrate implicit filter kernel with Hyena layer by @farhadrgh :: PR: #14621
- Fix kv_channels configuration for Gemma2 27b by @ananthsub :: PR: #14590
- [Flux] small fixes by @CarlosGomes98 :: PR: #14333
- [Flux] Add MXFP8 Support by @alpha0422 :: PR: #14473
- Use huggingface_hub for downloading the FLUX checkpoint by @suiyoubi :: PR: #14638
- Fine-tune embedding models (E5-Large-V2 and LLaMA-3.2-1B) on the allnli triplet dataset with NeMo Framework by @girihemant19 :: PR: #14584
- remove service launch scripts by @dimapihtar :: PR: #14647
- Warn instead of error when chat template doesn't contain generation keyword by @jenchen13 :: PR: #14641
- Fix function calling notebook by @cuichenx :: PR: #14643
- [Audio]: fixed bug in conformer unet by @nasretdinovr :: PR: #14626
- Fix code checkout during test by @chtruong814 :: PR: #14658
- Fix Flux seed as optional Arg by @suiyoubi :: PR: #14652
- Remove PEFT scheme condition from recipe by @JRD971000 :: PR: #14661
- Add NeMo Voice Agent by @stevehuang52 :: PR: #14325
- Update get_tensor_shapes function whose signature was refactored by @AAnoosheh :: PR: #14594
- Delete nemo1 notebooks by @cuichenx :: PR: #14677
- Bump latest Mcore 020abf01 by @chtruong814 :: PR: #14676
- [Flux] correct vae_downscale_factor by @CarlosGomes98 :: PR: #14425
- Bump modelopt to 0.35.0 and remove `safe_import("modelopt")` in llm collection by @kevalmorabia97 :: PR: #14656
- Canary tutorial fix by @nune-tadevosyan :: PR: #14699
- Add option for LoRA with Transformer Engine op fuser by @timmoon10 :: PR: #14411
- add load-in-4bit param by @dimapihtar :: PR: #14636
- Support NVFP4 recipe by @WanZzzzzz :: PR: #14625
- Fix broken link in Reasoning-SFT.ipynb by @cuichenx :: PR: #14716
- Remove artificial block to vortex fp8 TP by @jstjohn :: PR: #14684
- Drop speech_llm example suite by @yaoyu-33 :: PR: #14683
- remove env var by @malay-nagda :: PR: #14739
- detach arg option for run scripts by @malay-nagda :: PR: #14722
- Randomized shard slicing for tarred data by @pzelasko :: PR: #14558
- Data prediction objective for flow matching speech enhancement models by @racoiaws :: PR: #14749
- Fix Some Failures by @alpha0422 :: PR: #14763
- Support additional Slurm parameters (#14701) by @bdubauski :: PR: #14742
- [Flux] Remove Redundant Host & Device Sync by @alpha0422 :: PR: #14711
- [Flux] Full Iteration CUDA Graph by @alpha0422 :: PR: #14744
- Update prune-distill notebooks to Qwen3 + simplify + mmlu eval by @kevalmorabia97 :: PR: #14785
- ci: Automodel deprecation warning by @thomasdhc :: PR: #14787
- Bug in MXFP8 recipe by @adityavavreNVDA :: PR: #14793
- feat: Disable blank Issues by @pablo-garay :: PR: #14788
- ci: Add community label bot by @chtruong814 :: PR: #14796
- Add mistral small3 24B config and recipe by @eagle705 :: PR: #14784
- Update changelog for `r2.3.0` by @github-actions[bot] :: PR: #14812
- QWEN2.5-VL 7B FP8 Recipe by @tomlifu :: PR: #14801
- Feat: Disk space management for nemo install test by @pablo-garay :: PR: #14822
- Evo2: address rare over-masking in 1m context dataset by @jstjohn :: PR: #14821
- Update cherry-pick workflow to use version 0.63.0 by @pablo-garay :: PR: #14832
- Removing automodel items by @aschilling-nv :: PR: #14840
- Update changelog for `v2.4.1` by @github-actions[bot] :: PR: #14828
- Fix lm_eval installation in pruning tutorial for 25.09 container by @kevalmorabia97 :: PR: #14865
- Add nemotron-nano-v2 support to voice agent by @stevehuang52 :: PR: #14704
- Update changelog for 2.5.0 by @chtruong814 :: PR: #14890
- [Qwen3] Fix the FLOP calculation for Qwen3 by @gdengk :: PR: #14897
- [lhotse][aistore] added support for input_cfg.yaml directly from aistore bucket by @XuesongYang :: PR: #14891
- Harden _is_target_allowed by adding runtime class validation on top of prefix checks to prevent unsafe target resolution by @KunalDhawan :: PR: #14540
- Enable simplified DistOpt checkpoint formats by @mikolajblaz :: PR: #14428
- Fix the checkpoint loading issue -- onelogger callback gets called multiple times in some cases by @liquor233 :: PR: #14945
- Revert "new changelog-build" by @pablo-garay :: PR: #14949
- feat: new changelog-build by @pablo-garay :: PR: #14950
- Update llama4 utils kwargs by @yaoyu-33 :: PR: #14924
- Update README.md by @snowmanwwg :: PR: #14917
- Update all outdated NeMo Curator links by @sarahyurick :: PR: #14760
- Freeze tags in `r2.6.0` by @github-actions[bot] :: PR: #14957
- cp: `Bump MCore, TE, Pytorch, and modelopt for 25.11 (14946)` into `r2.6.0` by @chtruong814 :: PR: #14976
- cp: `Update ctc-segmentation (14991)` into `r2.6.0` by @chtruong814 :: PR: #14998
- cherry-pick of #14962 by @dimapihtar :: PR: #15000
- cp: `Pass timeout when running speech functional tests (15012)` into `r2.6.0` by @chtruong814 :: PR: #15013
- cp: `check asr models (14989)` into `r2.6.0` by @chtruong814 :: PR: #15002
- cp: `Enable EP in PTQ (15015)` into `r2.6.0` by @chtruong814 :: PR: #15026
- cp: `Update numba to numba-cuda and update cuda python bindings usage (15018)` into `r2.6.0` by @chtruong814 :: PR: #15024
- cp: `Add import guards for mcore lightning module (14970)` into `r2.6.0` by @chtruong814 :: PR: #14981
- cp: `fix loading of hyb ctc rnnt bpe models when using from pretrained (15042)` into `r2.6.0` by @chtruong814 :: PR: #15045
- cp: `fix: fix update-buildcache workflow after ED remove (15051)` into `r2.6.0` by @chtruong814 :: PR: #15052
- cp: `chore: update Lightning requirements version (15004)` into `r2.6.0` by @chtruong814 :: PR: #15049
- cp: `update notebook (15093)` into `r2.6.0` by @chtruong814 :: PR: #15094
- cp: `Fix: Obsolete Attribute [SDE] (15105)` into `r2.6.0` by @chtruong814 :: PR: #15106
- cp: `Upgrade NeMo ASR tutorials from Mozilla/CommonVoice to Google/FLEURS (15103)` into `r2.6.0` by @chtruong814 :: PR: #15107
- cp: `chore: Remove Automodel module (15044)` into `r2.6.0` by @chtruong814 :: PR: #15084
- cp: `Add deprecation notice to modules (15050)` into `r2.6.0` by @chtruong814 :: PR: #15110

</details>

## NVIDIA Neural Modules 2.5.3

### Highlights

- This release addresses known security issues. For the latest NVIDIA vulnerability disclosure information, visit <https://www.nvidia.com/en-us/security/>; for acknowledgement, please reach out to the NVIDIA PSIRT team at <[email protected]>
- Update nv-one-logger
- Update ctc-segmentation

### Detailed Changelogs:
#### Text Normalization / Inverse Text Normalization

<details><summary>Changelog</summary>

- chore: update Lightning requirement by @liquor233 :: PR: #15005

</details>

#### Uncategorized:

<details><summary>Changelog</summary>

- cp: `Update ctc-segmentation (14991)` into `r2.5.0` by @chtruong814 :: PR: #15020
- Bump to 2.5.3 by @chtruong814 :: PR: #15022

</details>

## NVIDIA Neural Modules 2.5.2

### Detailed Changelogs: