|
1 | | -# NeMo RL: A Scalable and Efficient Post-Training Library |
| 1 | +<div align="center"> |
| 2 | + |
| 3 | + # NeMo RL: A Scalable and Efficient Post-Training Library |
2 | 4 |
|
3 | 5 | [](https://github.com/NVIDIA-NeMo/RL/actions/workflows/cicd-main.yml) |
| 6 | +[](https://github.com/NVIDIA-NeMo/RL/stargazers/) |
| 7 | + |
| 8 | +[Documentation](https://docs.nvidia.com/nemo/rl/latest/index.html) | [Discussions](https://github.com/NVIDIA-NeMo/RL/discussions/categories/announcements) | [Contributing](https://github.com/NVIDIA-NeMo/RL/blob/main/CONTRIBUTING.md) |
| 9 | + |
| 10 | +</div> |
4 | 11 |
|
5 | 12 | ## 📣 News |
| 13 | +* [12/1/2025] [Release v0.4.0!](https://github.com/NVIDIA-NeMo/RL/releases/tag/v0.4.0) |
| 14 | + * First release with official NGC Container [nvcr.io/nvidia/nemo-rl:v0.4.0](https://registry.ngc.nvidia.com/orgs/nvidia/containers/nemo-rl/tags). |
| 15 | + * 📊 View the release run metrics on [Google Colab](https://colab.research.google.com/drive/1u5lmjHOsYpJqXaeYstjw7Qbzvbo67U0v?usp=sharing) to get a head start on your experimentation. |
6 | 16 | * [10/10/2025] **DAPO Algorithm Support** |
7 | 17 | NeMo RL now supports the [Decoupled Clip and Dynamic Sampling Policy Optimization (DAPO)](https://arxiv.org/pdf/2503.14476) algorithm.
8 | 18 | DAPO extends GRPO with **Clip-Higher**, **Dynamic Sampling**, **Token-Level Policy Gradient Loss**, and **Overlong Reward Shaping** for more stable and efficient RL training. See the [DAPO guide](docs/guides/dapo.md) for more details. |
9 | | -* [9/30/2025][Accelerated RL on GCP with NeMo RL!](https://discuss.google.dev/t/accelerating-reinforcement-learning-on-google-cloud-using-nvidia-nemo-rl/269579/4) |
| 19 | +* [9/30/2025] [Accelerated RL on GCP with NeMo RL!](https://discuss.google.dev/t/accelerating-reinforcement-learning-on-google-cloud-using-nvidia-nemo-rl/269579/4) |
10 | 20 | * [9/27/2025] [FP8 Quantization in NeMo RL](https://github.com/NVIDIA-NeMo/RL/discussions/1216) |
11 | 21 | * [9/25/2025] On-policy Distillation |
12 | 22 | * Student generates on-policy sequences and aligns logits to a larger teacher via KL, achieving near-larger-model quality at lower cost than RL. See [On-policy Distillation](#on-policy-distillation). |
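The DAPO bullet above names Clip-Higher and token-level policy-gradient loss. As a rough illustration (not NeMo RL's actual implementation; function name, signature, and default epsilons are hypothetical), Clip-Higher simply uses an asymmetric clipping range, with the upper bound `1 + eps_high` looser than the lower bound `1 - eps_low`, and the surrogate is averaged over tokens rather than sequences:

```python
import math

def clip_higher_token_loss(logprobs_new, logprobs_old, advantages,
                           eps_low=0.2, eps_high=0.28):
    """Sketch of a token-level PPO-style surrogate with DAPO's
    asymmetric Clip-Higher clipping (eps_high > eps_low).
    Hypothetical example, not the library's API."""
    total, n = 0.0, 0
    for lp_new, lp_old, adv in zip(logprobs_new, logprobs_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # importance ratio per token
        # asymmetric clip: looser upper bound encourages exploration
        clipped = max(min(ratio, 1.0 + eps_high), 1.0 - eps_low)
        # pessimistic (min) surrogate, negated to form a loss
        total += -min(ratio * adv, clipped * adv)
        n += 1
    return total / n  # token-level mean, not a per-sequence mean
```

With `eps_high > eps_low`, tokens whose probability the policy wants to raise are clipped later than in standard GRPO/PPO, which the DAPO paper argues stabilizes training against entropy collapse.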
@@ -96,7 +106,7 @@ For detailed information on backend selection, configuration, and examples, see |
96 | 106 | |-|-|-| |
97 | 107 | |[GRPO](#grpo)|[GRPO Single Node](#grpo-single-node)|[GRPO Multi-node](#grpo-multi-node): [GRPO Qwen2.5-32B](#grpo-qwen25-32b), [GRPO Multi-Turn](#grpo-multi-turn)| |
98 | 108 | |[On-policy Distillation](#on-policy-distillation)|[Distillation Single Node](#on-policy-distillation-single-node)|[Distillation Multi-node](#on-policy-distillation-multi-node)| |
99 | | - |[Supervised Fine-Tuning (SFT)](#supervised-fine-tuning-sft)|[SFT Single Node](#sft-single-node)|[SFT Multi-node](#sft-multi-node)| |
| 109 | + |[SFT](#supervised-fine-tuning-sft)|[SFT Single Node](#sft-single-node)|[SFT Multi-node](#sft-multi-node)| |
100 | 110 | |[DPO](#dpo)|[DPO Single Node](#dpo-single-node)|[DPO Multi-node](#dpo-multi-node)| |
101 | 111 | |[RM](#rm)|[RM Single Node](#rm-single-node)|[RM Multi-node](#rm-multi-node)| |
102 | 112 |
|
@@ -609,9 +619,13 @@ note = {GitHub repository}, |
609 | 619 | } |
610 | 620 | ``` |
611 | 621 |
|
612 | | -## Contributing |
| 622 | +## Acknowledgement and Contribution Guide |
| 623 | + |
| 624 | +NeMo RL gratefully acknowledges adoption of and contributions from the following community partners: Google, Argonne National Laboratory, Atlassian, Camfer, Domyn, Future House, Inflection AI, Lila, PayPal, Pegatron, PyTorch, Radical AI, Samsung, SB Intuitions, Shanghai AI Lab, Speakleash, Sword Health, TII, the NVIDIA Nemotron team, and many others.
| 625 | + |
| 626 | +NeMo RL is the re-architected successor to [NeMo Aligner](https://github.com/NVIDIA/NeMo-Aligner), one of the earliest LLM reinforcement learning libraries, which has inspired other open-source libraries such as [VeRL](https://github.com/volcengine/verl) and [ROLL](https://github.com/alibaba/ROLL).
613 | 627 |
|
614 | | -We welcome contributions to NeMo RL\! Please see our [Contributing Guidelines](https://github.com/NVIDIA-NeMo/RL/blob/main/CONTRIBUTING.md) for more information on how to get involved. |
| 628 | +We welcome contributions to NeMo RL! Please see our [Contributing Guidelines](https://github.com/NVIDIA-NeMo/RL/blob/main/CONTRIBUTING.md) for more information on how to get involved. |
615 | 629 |
|
616 | 630 | ## Licenses |
617 | 631 |
|
|