Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
145 changes: 23 additions & 122 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,19 +13,24 @@

<!-- markdownlint-disable -->
<details open>
<summary><b> <a href=https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16>NVIDIA-Nemotron-3-Nano-30B-A3B</a> is out with full reproducible script and recipes! Checkout <a href=https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/nano-v3>NeMo Megatron-Bridge</a>, <a href=https://github.com/NVIDIA-NeMo/Automodel/blob/main/examples/llm_finetune/nemotron/nemotron_nano_v3_squad.yaml>NeMo AutoModel</a>, <a href=https://github.com/NVIDIA-NeMo/RL>NeMo-RL</a> and <a href=https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo?version=25.11.nemotron_3_nano>NGC container</a> to try them!(2025-12-15)
<summary><b><a href=https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16>NVIDIA-Nemotron-3-Nano-30B-A3B</a> is out with full reproducible script and recipes! Check out <a href=https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/nano-v3>NeMo Megatron-Bridge</a>, <a href=https://github.com/NVIDIA-NeMo/AutoModel/blob/main/examples/llm_finetune/nemotron/nemotron_nano_v3_squad.yaml>NeMo AutoModel</a>, <a href=https://github.com/NVIDIA-NeMo/RL>NeMo-RL</a> and <a href=https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo?version=25.11.nemotron_3_nano>NGC container</a> to try them!</b> (2025-12-15)
</details>



<details open>
<summary><b>Pivot notice: This repo will pivot to focus on speech models collections only. Please refer to <a href=https://github.com/NVIDIA-NeMo>NeMo Framework Github Org</a> for the complete list of repos under NeMo Framework</b></summary>
NeMo 2.0, with its support for LLMs and VLMs will be deprecated by 25.11, and replaced by <a href=https://github.com/NVIDIA-NeMo/Megatron-Bridge>NeMo Megatron-Bridge</a> and <a href=https://github.com/NVIDIA-NeMo/Automodel>NeMo Automodel</a>. More details can be find in the <a href=https://github.com/NVIDIA-NeMo>NeMo Framework github org readme</a>. (2025-10-10)
<summary><b>⚠️ Pivot notice: This repo will pivot to focus on speech models collections only. Please refer to <a href=https://github.com/NVIDIA-NeMo>NeMo Framework Github Org</a> for the complete list of repos under NeMo Framework</b></summary>
NeMo 2.0, with its support for LLMs and VLMs will be deprecated by 25.11, and replaced by <a href=https://github.com/NVIDIA-NeMo/Megatron-Bridge>NeMo Megatron-Bridge</a> and <a href=https://github.com/NVIDIA-NeMo/AutoModel>NeMo AutoModel</a>. More details can be found in the <a href=https://github.com/NVIDIA-NeMo>NeMo Framework GitHub org readme</a>. (2025-10-10)

Following collections are deprecated and will be removed in a later release, please use previous versions if you are using them:
- nlp
- llm
- vlm
- vision
</details>

<details closed>
<summary><b>Pretrain and finetune :hugs:Hugging Face models via AutoModel</b></summary>
Nemo Framework's latest feature AutoModel enables broad support for :hugs:Hugging Face models, with 25.04 focusing on
NeMo Framework's latest feature AutoModel enables broad support for :hugs:Hugging Face models, with 25.04 focusing on


- <a href=https://huggingface.co/transformers/v3.5.1/model_doc/auto.html#automodelforcausallm>AutoModelForCausalLM</a> in the <a href="https://huggingface.co/models?pipeline_tag=text-generation&sort=trending">Text Generation</a> category
Expand All @@ -35,7 +40,7 @@ More Details in Blog: <a href=https://developer.nvidia.com/blog/run-hugging-face
</details>

<details closed>
<summary><b>Training on Blackwell using Nemo</b></summary>
<summary><b>Training on Blackwell using NeMo</b></summary>
NeMo Framework has added Blackwell support, with <a href=https://docs.nvidia.com/nemo-framework/user-guide/latest/performance/performance_summary.html>performance benchmarks on GB200 & B200</a>. More optimizations to come in the upcoming releases.(2025-05-19)
</details>

Expand Down Expand Up @@ -82,7 +87,7 @@ More Details in Blog: <a href=https://developer.nvidia.com/blog/run-hugging-face
State-of-the-Art Multimodal Generative AI Model Development with NVIDIA NeMo
</a> (2024-11-06)
</summary>
NVIDIA recently announced significant enhancements to the NeMo platform, focusing on multimodal generative AI models. The update includes NeMo Curator and the Cosmos tokenizer, which streamline the data curation process and enhance the quality of visual data. These tools are designed to handle large-scale data efficiently, making it easier to develop high-quality AI models for various applications, including robotics and autonomous driving. The Cosmos tokenizers, in particular, efficiently map visual data into compact, semantic tokens, which is crucial for training large-scale generative models. The tokenizer is available now on the <a href=http://github.com/NVIDIA/cosmos-tokenizer/NVIDIA/cosmos-tokenizer>NVIDIA/cosmos-tokenizer</a> GitHub repo and on <a href=https://huggingface.co/nvidia/Cosmos-Tokenizer-CV8x8x8>Hugging Face</a>.
NVIDIA recently announced significant enhancements to the NeMo platform, focusing on multimodal generative AI models. The update includes NeMo Curator and the Cosmos tokenizer, which streamline the data curation process and enhance the quality of visual data. These tools are designed to handle large-scale data efficiently, making it easier to develop high-quality AI models for various applications, including robotics and autonomous driving. The Cosmos tokenizers, in particular, efficiently map visual data into compact, semantic tokens, which is crucial for training large-scale generative models. The tokenizer is available now on the <a href=https://github.com/NVIDIA/cosmos-tokenizer>NVIDIA/cosmos-tokenizer</a> GitHub repo and on <a href=https://huggingface.co/nvidia/Cosmos-Tokenizer-CV8x8x8>Hugging Face</a>.
<br><br>
</details>
<details>
Expand Down Expand Up @@ -216,22 +221,14 @@ NVIDIA NeMo 2.0 introduces several significant improvements over its predecessor

Overall, these enhancements make NeMo 2.0 a powerful, scalable, and user-friendly framework for AI model development.

> [!IMPORTANT]
> NeMo 2.0 is currently supported by the LLM (large language model) and VLM (vision language model) collections.

### Get Started with NeMo 2.0

- Refer to the [Quickstart](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemo-2.0/quickstart.html) for examples of using NeMo-Run to launch NeMo 2.0 experiments locally and on a slurm cluster.
- For more information about NeMo 2.0, see the [NeMo Framework User Guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemo-2.0/index.html).
- [NeMo 2.0 Recipes](https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/llm/recipes) contains additional examples of launching large-scale runs using NeMo 2.0 and NeMo-Run.
- For an in-depth exploration of the main features of NeMo 2.0, see the [Feature Guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemo-2.0/features/index.html#feature-guide).
- To transition from NeMo 1.0 to 2.0, see the [Migration Guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemo-2.0/migration/index.html#migration-guide) for step-by-step instructions.

### Get Started with Cosmos

NeMo Curator and NeMo Framework support video curation and post-training of the Cosmos World Foundation Models, which are open and available on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/cosmos/collections/cosmos) and [Hugging Face](https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6). For more information on video datasets, refer to [NeMo Curator](https://developer.nvidia.com/nemo-curator). To post-train World Foundation Models using the NeMo Framework for your custom physical AI tasks, see the [Cosmos Diffusion models](https://github.com/NVIDIA/Cosmos/blob/main/cosmos1/models/diffusion/nemo/post_training/README.md) and the [Cosmos Autoregressive models](https://github.com/NVIDIA/Cosmos/blob/main/cosmos1/models/autoregressive/nemo/post_training/README.md).

## LLMs and MMs Training, Alignment, and Customization
## Training and Customization

All NeMo models are trained with
[Lightning](https://github.com/Lightning-AI/lightning). Training is
Expand All @@ -246,55 +243,15 @@ include Tensor Parallelism (TP), Pipeline Parallelism (PP), Fully
Sharded Data Parallelism (FSDP), Mixture-of-Experts (MoE), and Mixed
Precision Training with BFloat16 and FP8, as well as others.

NeMo Transformer-based LLMs and MMs utilize [NVIDIA Transformer
Engine](https://github.com/NVIDIA/TransformerEngine) for FP8 training on
NVIDIA Hopper GPUs, while leveraging [NVIDIA Megatron
Core](https://github.com/NVIDIA/Megatron-LM/tree/main/megatron/core) for
scaling Transformer model training.

NeMo LLMs can be aligned with state-of-the-art methods such as SteerLM,
Direct Preference Optimization (DPO), and Reinforcement Learning from
Human Feedback (RLHF). See [NVIDIA NeMo
Aligner](https://github.com/NVIDIA/NeMo-Aligner) for more information.

In addition to supervised fine-tuning (SFT), NeMo also supports the
latest parameter efficient fine-tuning (PEFT) techniques such as LoRA,
P-Tuning, Adapters, and IA3. Refer to the [NeMo Framework User
Guide](https://docs.nvidia.com/nemo-framework/user-guide/latest/sft_peft/index.html)
for the full list of supported models and techniques.

## LLMs and MMs Deployment and Optimization

NeMo LLMs and MMs can be deployed and optimized with [NVIDIA NeMo
Microservices](https://developer.nvidia.com/nemo-microservices-early-access).
P-Tuning, Adapters, and IA3.

## Speech AI

NeMo ASR and TTS models can be optimized for inference and deployed for
production use cases with [NVIDIA Riva](https://developer.nvidia.com/riva).

## NeMo Framework Launcher

> [!IMPORTANT]
> NeMo Framework Launcher is compatible with NeMo version 1.0 only. [NeMo-Run](https://github.com/NVIDIA/NeMo-Run) is recommended for launching experiments using NeMo 2.0.

[NeMo Framework
Launcher](https://github.com/NVIDIA/NeMo-Megatron-Launcher) is a
cloud-native tool that streamlines the NeMo Framework experience. It is
used for launching end-to-end NeMo Framework training jobs on CSPs and
Slurm clusters.

The NeMo Framework Launcher includes extensive recipes, scripts,
utilities, and documentation for training NeMo LLMs. It also includes
the NeMo Framework [Autoconfigurator](https://github.com/NVIDIA/NeMo-Megatron-Launcher#53-using-autoconfigurator-to-find-the-optimal-configuration),
which is designed to find the optimal model parallel configuration for
training on a specific cluster.

To get started quickly with the NeMo Framework Launcher, please see the
[NeMo Framework
Playbooks](https://docs.nvidia.com/nemo-framework/user-guide/latest/playbooks/index.html).
The NeMo Framework Launcher does not currently support ASR and TTS
training, but it will soon.

## Get Started with NeMo Framework

Expand Down Expand Up @@ -323,11 +280,9 @@ multi-GPU/multi-node training.

## Key Features

- [Large Language Models](nemo/collections/nlp/README.md)
- [Multimodal](nemo/collections/multimodal/README.md)
- [Automatic Speech Recognition](nemo/collections/asr/README.md)
- [Text to Speech](nemo/collections/tts/README.md)
- [Computer Vision](nemo/collections/vision/README.md)

## Requirements

Expand Down Expand Up @@ -396,7 +351,7 @@ To install nemo_toolkit from such a wheel, use the following installation method
pip install "nemo_toolkit[all]"
```

If a more specific version is desired, we recommend a Pip-VCS install. From [NVIDIA/NeMo](github.com/NVIDIA/NeMo), fetch the commit, branch, or tag that you would like to install.
If a more specific version is desired, we recommend a Pip-VCS install. From [NVIDIA/NeMo](https://github.com/NVIDIA/NeMo), fetch the commit, branch, or tag that you would like to install.
To install nemo_toolkit from this Git reference `$REF`, use the following installation method:

```bash
Expand All @@ -415,18 +370,16 @@ following domain-specific commands:
```bash
pip install nemo_toolkit['all'] # or pip install "nemo_toolkit['all']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}"
pip install nemo_toolkit['asr'] # or pip install "nemo_toolkit['asr']@git+https://github.com/NVIDIA/NeMo@$REF:-'main'}"
pip install nemo_toolkit['nlp'] # or pip install "nemo_toolkit['nlp']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}"
pip install nemo_toolkit['tts'] # or pip install "nemo_toolkit['tts']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}"
pip install nemo_toolkit['vision'] # or pip install "nemo_toolkit['vision']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}"
pip install nemo_toolkit['multimodal'] # or pip install "nemo_toolkit['multimodal']@git+https://github.com/NVIDIA/NeMo@${REF:-'main'}"
```

### NGC PyTorch container

**NOTE: The following steps are supported beginning with 24.04 (NeMo-Toolkit 2.3.0)**
**NOTE: The following steps are supported beginning with 25.09 (NeMo-Toolkit 2.6.0)**

We recommended that you start with a base NVIDIA PyTorch container:
nvcr.io/nvidia/pytorch:25.01-py3.
nvcr.io/nvidia/pytorch:25.09-py3.

If starting with a base NVIDIA PyTorch container, you must first launch
the container:
Expand All @@ -439,10 +392,10 @@ docker run \
--shm-size=16g \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
nvcr.io/nvidia/pytorch:${NV_PYTORCH_TAG:-'nvcr.io/nvidia/pytorch:25.01-py3'}
${NV_PYTORCH_TAG:-'nvcr.io/nvidia/pytorch:25.09-py3'}
```

From [NVIDIA/NeMo](github.com/NVIDIA/NeMo), fetch the commit/branch/tag that you want to install.
From [NVIDIA/NeMo](https://github.com/NVIDIA/NeMo), fetch the commit/branch/tag that you want to install.
To install nemo_toolkit including all of its dependencies from this Git reference `$REF`, use the following installation method:

```bash
Expand All @@ -458,9 +411,9 @@ pip install ".[all]"

NeMo containers are launched concurrently with NeMo version updates.
NeMo Framework now supports LLMs, MMs, ASR, and TTS in a single
consolidated Docker container. You can find additional information about
released containers on the [NeMo releases
page](https://github.com/NVIDIA/NeMo/releases).
consolidated Docker container. The latest container is based on NeMo 2.6.0.
You can find additional information about released containers on the
[NeMo releases page](https://github.com/NVIDIA/NeMo/releases).

To use a pre-built container, run the following code:

Expand All @@ -472,14 +425,9 @@ docker run \
--shm-size=16g \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
nvcr.io/nvidia/pytorch:${NV_PYTORCH_TAG:-'nvcr.io/nvidia/nemo:25.02'}
nvcr.io/nvidia/nemo:25.11.01
```

## Future Work

The NeMo Framework Launcher does not currently support ASR and TTS
training, but it will soon.

## Discussions Board

FAQ can be found on the NeMo [Discussions
Expand All @@ -503,53 +451,6 @@ to the `gh-pages-src` branch of this repository. For detailed
information, please consult the README located at the [gh-pages-src
branch](https://github.com/NVIDIA/NeMo/tree/gh-pages-src#readme).

## Blogs

<!-- markdownlint-disable -->
<details open>
<summary><b>Large Language Models and Multimodal Models</b></summary>
<details>
<summary>
<a href="https://blogs.nvidia.com/blog/bria-builds-responsible-generative-ai-using-nemo-picasso/">
Bria Builds Responsible Generative AI for Enterprises Using NVIDIA NeMo, Picasso
</a> (2024/03/06)
</summary>
Bria, a Tel Aviv startup at the forefront of visual generative AI for enterprises now leverages the NVIDIA NeMo Framework.
The Bria.ai platform uses reference implementations from the NeMo Multimodal collection, trained on NVIDIA Tensor Core GPUs, to enable high-throughput and low-latency image generation.
Bria has also adopted NVIDIA Picasso, a foundry for visual generative AI models, to run inference.
<br><br>
</details>
<details>
<summary>
<a href="https://developer.nvidia.com/blog/new-nvidia-nemo-framework-features-and-nvidia-h200-supercharge-llm-training-performance-and-versatility/">
New NVIDIA NeMo Framework Features and NVIDIA H200
</a> (2023/12/06)
</summary>
NVIDIA NeMo Framework now includes several optimizations and enhancements,
including:
1) Fully Sharded Data Parallelism (FSDP) to improve the efficiency of training large-scale AI models,
2) Mix of Experts (MoE)-based LLM architectures with expert parallelism for efficient LLM training at scale,
3) Reinforcement Learning from Human Feedback (RLHF) with TensorRT-LLM for inference stage acceleration, and
4) up to 4.2x speedups for Llama 2 pre-training on NVIDIA H200 Tensor Core GPUs.
<br><br>
<a href="https://developer.nvidia.com/blog/new-nvidia-nemo-framework-features-and-nvidia-h200-supercharge-llm-training-performance-and-versatility">
<img src="https://github.com/sbhavani/TransformerEngine/blob/main/docs/examples/H200-NeMo-performance.png" alt="H200-NeMo-performance" style="width: 600px;"></a>
<br><br>
</details>
<details>
<summary>
<a href="https://blogs.nvidia.com/blog/nemo-amazon-titan/">
NVIDIA now powers training for Amazon Titan Foundation models
</a> (2023/11/28)
</summary>
NVIDIA NeMo Framework now empowers the Amazon Titan foundation models (FM) with efficient training of large language models (LLMs).
The Titan FMs form the basis of Amazon’s generative AI service, Amazon Bedrock.
The NeMo Framework provides a versatile framework for building, customizing, and running LLMs.
<br><br>
</details>
</details>
<!-- markdownlint-enable -->

## Licenses

NeMo is licensed under the [Apache License 2.0](https://github.com/NVIDIA/NeMo?tab=Apache-2.0-1-ov-file).
Loading