diff --git a/README.md b/README.md index 4d0b2cac73a2..757c8f68a3f5 100644 --- a/README.md +++ b/README.md @@ -13,19 +13,24 @@
- NVIDIA-Nemotron-3-Nano-30B-A3B is out with full reproducible script and recipes! Checkout NeMo Megatron-Bridge, NeMo AutoModel, NeMo-RL and NGC container to try them!(2025-12-15) + NVIDIA-Nemotron-3-Nano-30B-A3B is out with fully reproducible scripts and recipes! Check out NeMo Megatron-Bridge, NeMo AutoModel, NeMo-RL, and the NGC container to try them! (2025-12-15)
-
- Pivot notice: This repo will pivot to focus on speech models collections only. Please refer to NeMo Framework Github Org for the complete list of repos under NeMo Framework - NeMo 2.0, with its support for LLMs and VLMs will be deprecated by 25.11, and replaced by NeMo Megatron-Bridge and NeMo Automodel. More details can be find in the NeMo Framework github org readme. (2025-10-10) + ⚠️ Pivot notice: This repo will pivot to focus on speech model collections only. Please refer to the NeMo Framework GitHub org for the complete list of repos under NeMo Framework. + NeMo 2.0, with its support for LLMs and VLMs, will be deprecated by 25.11 and replaced by NeMo Megatron-Bridge and NeMo AutoModel. More details can be found in the NeMo Framework GitHub org readme. (2025-10-10) + + The following collections are deprecated and will be removed in a later release; please use previous versions if you depend on them: + - nlp + - llm + - vlm + - vision
Pretrain and finetune :hugs:Hugging Face models via AutoModel - Nemo Framework's latest feature AutoModel enables broad support for :hugs:Hugging Face models, with 25.04 focusing on + NeMo Framework's latest feature, AutoModel, enables broad support for :hugs:Hugging Face models, with 25.04 focusing on - AutoModelForCausalLM in the Text Generation category @@ -35,7 +40,7 @@ More Details in Blog:
- Training on Blackwell using Nemo + Training on Blackwell using NeMo NeMo Framework has added Blackwell support, with performance benchmarks on GB200 & B200. More optimizations to come in the upcoming releases. (2025-05-19)
@@ -82,7 +87,7 @@ More Details in Blog: (2024-11-06) - NVIDIA recently announced significant enhancements to the NeMo platform, focusing on multimodal generative AI models. The update includes NeMo Curator and the Cosmos tokenizer, which streamline the data curation process and enhance the quality of visual data. These tools are designed to handle large-scale data efficiently, making it easier to develop high-quality AI models for various applications, including robotics and autonomous driving. The Cosmos tokenizers, in particular, efficiently map visual data into compact, semantic tokens, which is crucial for training large-scale generative models. The tokenizer is available now on the NVIDIA/cosmos-tokenizer GitHub repo and on Hugging Face. + NVIDIA recently announced significant enhancements to the NeMo platform, focusing on multimodal generative AI models. The update includes NeMo Curator and the Cosmos tokenizer, which streamline the data curation process and enhance the quality of visual data. These tools are designed to handle large-scale data efficiently, making it easier to develop high-quality AI models for various applications, including robotics and autonomous driving. The Cosmos tokenizers, in particular, efficiently map visual data into compact, semantic tokens, which is crucial for training large-scale generative models. The tokenizer is available now on the NVIDIA/cosmos-tokenizer GitHub repo and on Hugging Face.

@@ -216,22 +221,14 @@ NVIDIA NeMo 2.0 introduces several significant improvements over its predecessor Overall, these enhancements make NeMo 2.0 a powerful, scalable, and user-friendly framework for AI model development. -> [!IMPORTANT] -> NeMo 2.0 is currently supported by the LLM (large language model) and VLM (vision language model) collections. - ### Get Started with NeMo 2.0 - Refer to the [Quickstart](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemo-2.0/quickstart.html) for examples of using NeMo-Run to launch NeMo 2.0 experiments locally and on a slurm cluster. - For more information about NeMo 2.0, see the [NeMo Framework User Guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemo-2.0/index.html). -- [NeMo 2.0 Recipes](https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/llm/recipes) contains additional examples of launching large-scale runs using NeMo 2.0 and NeMo-Run. - For an in-depth exploration of the main features of NeMo 2.0, see the [Feature Guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemo-2.0/features/index.html#feature-guide). - To transition from NeMo 1.0 to 2.0, see the [Migration Guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemo-2.0/migration/index.html#migration-guide) for step-by-step instructions. -### Get Started with Cosmos - -NeMo Curator and NeMo Framework support video curation and post-training of the Cosmos World Foundation Models, which are open and available on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/cosmos/collections/cosmos) and [Hugging Face](https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6). For more information on video datasets, refer to [NeMo Curator](https://developer.nvidia.com/nemo-curator). To post-train World Foundation Models using the NeMo Framework for your custom physical AI tasks, see the [Cosmos Diffusion models](https://github.com/NVIDIA/Cosmos/blob/main/cosmos1/models/diffusion/nemo/post_training/README.md) and the [Cosmos Autoregressive models](https://github.com/NVIDIA/Cosmos/blob/main/cosmos1/models/autoregressive/nemo/post_training/README.md). - -## LLMs and MMs Training, Alignment, and Customization +## Training and Customization All NeMo models are trained with [Lightning](https://github.com/Lightning-AI/lightning). Training is @@ -246,55 +243,15 @@ include Tensor Parallelism (TP), Pipeline Parallelism (PP), Fully Sharded Data Parallelism (FSDP), Mixture-of-Experts (MoE), and Mixed Precision Training with BFloat16 and FP8, as well as others. -NeMo Transformer-based LLMs and MMs utilize [NVIDIA Transformer -Engine](https://github.com/NVIDIA/TransformerEngine) for FP8 training on -NVIDIA Hopper GPUs, while leveraging [NVIDIA Megatron -Core](https://github.com/NVIDIA/Megatron-LM/tree/main/megatron/core) for -scaling Transformer model training. - -NeMo LLMs can be aligned with state-of-the-art methods such as SteerLM, -Direct Preference Optimization (DPO), and Reinforcement Learning from -Human Feedback (RLHF). See [NVIDIA NeMo -Aligner](https://github.com/NVIDIA/NeMo-Aligner) for more information. - In addition to supervised fine-tuning (SFT), NeMo also supports the latest parameter efficient fine-tuning (PEFT) techniques such as LoRA, -P-Tuning, Adapters, and IA3. Refer to the [NeMo Framework User -Guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/sft_peft/index.html) -for the full list of supported models and techniques. 
- -## LLMs and MMs Deployment and Optimization - -NeMo LLMs and MMs can be deployed and optimized with [NVIDIA NeMo -Microservices](https://developer.nvidia.com/nemo-microservices-early-access). +P-Tuning, Adapters, and IA3. ## Speech AI NeMo ASR and TTS models can be optimized for inference and deployed for production use cases with [NVIDIA Riva](https://developer.nvidia.com/riva). -## NeMo Framework Launcher - -> [!IMPORTANT] -> NeMo Framework Launcher is compatible with NeMo version 1.0 only. [NeMo-Run](https://github.com/NVIDIA/NeMo-Run) is recommended for launching experiments using NeMo 2.0. - -[NeMo Framework -Launcher](https://github.com/NVIDIA/NeMo-Megatron-Launcher) is a -cloud-native tool that streamlines the NeMo Framework experience. It is -used for launching end-to-end NeMo Framework training jobs on CSPs and -Slurm clusters. - -The NeMo Framework Launcher includes extensive recipes, scripts, -utilities, and documentation for training NeMo LLMs. It also includes -the NeMo Framework [Autoconfigurator](https://github.com/NVIDIA/NeMo-Megatron-Launcher#53-using-autoconfigurator-to-find-the-optimal-configuration), -which is designed to find the optimal model parallel configuration for -training on a specific cluster. - -To get started quickly with the NeMo Framework Launcher, please see the -[NeMo Framework -Playbooks](https://docs.nvidia.com/nemo-framework/user-guide/latest/playbooks/index.html). -The NeMo Framework Launcher does not currently support ASR and TTS -training, but it will soon. ## Get Started with NeMo Framework @@ -323,11 +280,9 @@ multi-GPU/multi-node training. ## Key Features -- [Large Language Models](nemo/collections/nlp/README.md) - [Multimodal](nemo/collections/multimodal/README.md) - [Automatic Speech Recognition](nemo/collections/asr/README.md) - [Text to Speech](nemo/collections/tts/README.md) -- [Computer Vision](nemo/collections/vision/README.md) ## Requirements @@ -396,7 +351,7 @@ To install nemo_toolkit from such a wheel, use the following installation method pip install "nemo_toolkit[all]" ``` -If a more specific version is desired, we recommend a Pip-VCS install. From [NVIDIA/NeMo](github.com/NVIDIA/NeMo), fetch the commit, branch, or tag that you would like to install. +If a more specific version is desired, we recommend a Pip-VCS install. From [NVIDIA/NeMo](https://github.com/NVIDIA/NeMo), fetch the commit, branch, or tag that you would like to install. 
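For example, a minimal sketch of choosing a reference (the tag shown is hypothetical; substitute any commit SHA, branch, or tag that actually exists in the repo):

```bash
# List the tags published on NVIDIA/NeMo and pick one as $REF.
git ls-remote --tags https://github.com/NVIDIA/NeMo

# Hypothetical choice; any existing commit SHA, branch, or tag works.
export REF=v2.3.0
```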
To install nemo_toolkit from this Git reference `$REF`, use the following installation method: ```bash @@ -415,18 +370,16 @@ following domain-specific commands: ```bash pip install nemo_toolkit['all'] # or pip install "nemo_toolkit['all']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}" pip install nemo_toolkit['asr'] # or pip install "nemo_toolkit['asr']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}" -pip install nemo_toolkit['nlp'] # or pip install "nemo_toolkit['nlp']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}" pip install nemo_toolkit['tts'] # or pip install "nemo_toolkit['tts']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}" -pip install nemo_toolkit['vision'] # or pip install "nemo_toolkit['vision']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}" pip install nemo_toolkit['multimodal'] # or pip install "nemo_toolkit['multimodal']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}" ``` ### NGC PyTorch container -**NOTE: The following steps are supported beginning with 24.04 (NeMo-Toolkit 2.3.0)** +**NOTE: The following steps are supported beginning with 25.09 (NeMo-Toolkit 2.6.0)** We recommend that you start with a base NVIDIA PyTorch container: -nvcr.io/nvidia/pytorch:25.01-py3. +nvcr.io/nvidia/pytorch:25.09-py3. If starting with a base NVIDIA PyTorch container, you must first launch the container: @@ -439,10 +392,10 @@ docker run \ --shm-size=16g \ --ulimit memlock=-1 \ --ulimit stack=67108864 \ - nvcr.io/nvidia/pytorch:${NV_PYTORCH_TAG:-'nvcr.io/nvidia/pytorch:25.01-py3'} + ${NV_PYTORCH_TAG:-'nvcr.io/nvidia/pytorch:25.09-py3'} ``` -From [NVIDIA/NeMo](github.com/NVIDIA/NeMo), fetch the commit/branch/tag that you want to install. +From [NVIDIA/NeMo](https://github.com/NVIDIA/NeMo), fetch the commit/branch/tag that you want to install. To install nemo_toolkit including all of its dependencies from this Git reference `$REF`, use the following installation method: ```bash @@ -458,9 +411,9 @@ pip install ".[all]" NeMo containers are launched concurrently with NeMo version updates. NeMo Framework now supports LLMs, MMs, ASR, and TTS in a single -consolidated Docker container. You can find additional information about -released containers on the [NeMo releases -page](https://github.com/NVIDIA/NeMo/releases). +consolidated Docker container. The latest container is based on NeMo 2.6.0. +You can find additional information about released containers on the +[NeMo releases page](https://github.com/NVIDIA/NeMo/releases). To use a pre-built container, run the following code: @@ -472,14 +425,9 @@ docker run \ --shm-size=16g \ --ulimit memlock=-1 \ --ulimit stack=67108864 \ - nvcr.io/nvidia/pytorch:${NV_PYTORCH_TAG:-'nvcr.io/nvidia/nemo:25.02'} + nvcr.io/nvidia/nemo:25.11.01 ``` -## Future Work - -The NeMo Framework Launcher does not currently support ASR and TTS -training, but it will soon. - ## Discussions Board FAQ can be found on the NeMo [Discussions @@ -503,53 +451,6 @@ to the `gh-pages-src` branch of this repository. For detailed information, please consult the README located at the [gh-pages-src branch](https://github.com/NVIDIA/NeMo/tree/gh-pages-src#readme). -## Blogs - - -
- Large Language Models and Multimodal Models -
- - - Bria Builds Responsible Generative AI for Enterprises Using NVIDIA NeMo, Picasso - (2024/03/06) - - Bria, a Tel Aviv startup at the forefront of visual generative AI for enterprises now leverages the NVIDIA NeMo Framework. - The Bria.ai platform uses reference implementations from the NeMo Multimodal collection, trained on NVIDIA Tensor Core GPUs, to enable high-throughput and low-latency image generation. - Bria has also adopted NVIDIA Picasso, a foundry for visual generative AI models, to run inference. -

-
-
- - - New NVIDIA NeMo Framework Features and NVIDIA H200 - (2023/12/06) - - NVIDIA NeMo Framework now includes several optimizations and enhancements, - including: - 1) Fully Sharded Data Parallelism (FSDP) to improve the efficiency of training large-scale AI models, - 2) Mix of Experts (MoE)-based LLM architectures with expert parallelism for efficient LLM training at scale, - 3) Reinforcement Learning from Human Feedback (RLHF) with TensorRT-LLM for inference stage acceleration, and - 4) up to 4.2x speedups for Llama 2 pre-training on NVIDIA H200 Tensor Core GPUs. -

- - H200-NeMo-performance -

-
-
- - - NVIDIA now powers training for Amazon Titan Foundation models - (2023/11/28) - - NVIDIA NeMo Framework now empowers the Amazon Titan foundation models (FM) with efficient training of large language models (LLMs). - The Titan FMs form the basis of Amazon’s generative AI service, Amazon Bedrock. - The NeMo Framework provides a versatile framework for building, customizing, and running LLMs. -

-
-
- - ## Licenses NeMo is licensed under the [Apache License 2.0](https://github.com/NVIDIA/NeMo?tab=Apache-2.0-1-ov-file).