Commit df290d8

Merge branch 'main' into feature/embodied_unify_interface
2 parents bd27cb0 + 4d9d1f4 commit df290d8

44 files changed, 2030 additions and 52 deletions

README.md

Lines changed: 4 additions & 2 deletions
@@ -30,6 +30,8 @@ RLinf is a flexible and scalable open-source infrastructure designed for post-tr
 
 
 ## What's NEW!
+- [2025/11] 🔥 RLinf supports reinforcement learning fine-tuning for [IsaacLab](https://github.com/isaac-sim/IsaacLab). Doc: [RL on IsaacLab](https://rlinf.readthedocs.io/en/latest/rst_source/examples/isaaclab.html)
+- [2025/11] 🔥 RLinf supports reinforcement learning fine-tuning for [Behavior 1k](https://github.com/StanfordVL/BEHAVIOR-1K). Doc: [RL on Behavior 1k](https://rlinf.readthedocs.io/en/latest/rst_source/examples/behavior.html)
 - [2025/11] 🔥 RLinf supports reinforcement learning fine-tuning for [GR00T-N1.5](https://github.com/NVIDIA/Isaac-GR00T). Doc: [RL on GR00T-N1.5](https://rlinf.readthedocs.io/en/latest/rst_source/examples/gr00t.html).
 - [2025/11] 🔥 RLinf supports reinforcement learning fine-tuning for [Metaworld](https://github.com/Farama-Foundation/Metaworld). Doc: [RL on Metaworld](https://rlinf.readthedocs.io/en/latest/rst_source/examples/metaworld.html).
 - [2025/11] 🔥 RLinf supports reinforcement learning fine-tuning for [Behavior 1k](https://github.com/StanfordVL/BEHAVIOR-1K). Doc: [RL on Behavior 1k](https://rlinf.readthedocs.io/en/latest/rst_source/examples/behavior.html).
@@ -66,7 +68,7 @@ RLinf is a flexible and scalable open-source infrastructure designed for post-tr
 <li>RoboVerse</li>
 <li><a href="https://rlinf.readthedocs.io/en/latest/rst_source/examples/behavior.html">BEHAVIOR</a> ✅</li>
 <li><a href="https://rlinf.readthedocs.io/en/latest/rst_source/examples/metaworld.html">MetaWorld</a> ✅</li>
-<li>IsaacLab</li>
+<li><a href="https://rlinf.readthedocs.io/en/latest/rst_source/examples/isaaclab.html">IsaacLab</a> ✅</li>
 <li>RoboCasa</li>
 <li>More...</li>
 </ul>
@@ -199,7 +201,7 @@ and exhibits greater stability.
 <td style="text-align:center;">39.10%</td>
 </tr>
 <tr>
-<td style="text-align:center;"><a href="https://huggingface.co/gen-robot/openvla-7b-rlvla-warmup"><img src="docs/source-en/_static/svg/hf-logo.svg" alt="HF" width="16" height="16" style="vertical-align: middle;">RL4VLA (PPO)</a></td>
+<td style="text-align:center;"><a href="https://huggingface.co/gen-robot/openvla-7b-rlvla-rl"><img src="docs/source-en/_static/svg/hf-logo.svg" alt="HF" width="16" height="16" style="vertical-align: middle;">RL4VLA (PPO)</a></td>
 <td style="text-align:center;">93.75%</td>
 <td style="text-align:center;">80.47%</td>
 <td style="text-align:center;">75.00%</td>

README.zh-CN.md

Lines changed: 4 additions & 2 deletions
@@ -30,6 +30,8 @@ RLinf 是一个灵活且可扩展的开源框架,专为利用强化学习进
 
 
 ## 最新动态
+- [2025/11] 🔥 基于[IsaacLab](https://github.com/isaac-sim/IsaacLab)的强化学习微调已经上线! 文档:[RL on IsaacLab](https://rlinf.readthedocs.io/zh-cn/latest/rst_source/examples/isaaclab.html)
+- [2025/11] 🔥 基于[Behavior 1k](https://github.com/StanfordVL/BEHAVIOR-1K)的强化学习微调已经上线! 文档:[RL on Behavior 1k](https://rlinf.readthedocs.io/zh-cn/latest/rst_source/examples/behavior.html)
 - [2025/11] 🔥 RLinf现在已经支持强化学习微调[GR00T-N1.5](https://github.com/NVIDIA/Isaac-GR00T)!文档:[RL on GR00T-N1.5](https://rlinf.readthedocs.io/zh-cn/latest/rst_source/examples/gr00t.html)
 - [2025/11] 🔥 基于[Metaworld](https://github.com/Farama-Foundation/Metaworld)的强化学习微调已经上线! 文档:[RL on Metaworld](https://rlinf.readthedocs.io/zh-cn/latest/rst_source/examples/metaworld.html)
 - [2025/11] 🔥 基于[Behavior 1k](https://github.com/StanfordVL/BEHAVIOR-1K)的强化学习微调已经上线! 文档:[RL on Behavior 1k](https://rlinf.readthedocs.io/zh-cn/latest/rst_source/examples/behavior.html)
@@ -66,7 +68,7 @@ RLinf 是一个灵活且可扩展的开源框架,专为利用强化学习进
 <li>RoboVerse</li>
 <li><a href="https://rlinf.readthedocs.io/zh-cn/latest/rst_source/examples/behavior.html">BEHAVIOR</a> ✅</li>
 <li><a href="https://rlinf.readthedocs.io/zh-cn/latest/rst_source/examples/metaworld.html">MetaWorld</a> ✅</li>
-<li>IsaacLab</li>
+<li><a href="https://rlinf.readthedocs.io/zh-cn/latest/rst_source/examples/isaaclab.html">IsaacLab</a> ✅</li>
 <li>RoboCasa</li>
 <li>More...</li>
 </ul>
@@ -197,7 +199,7 @@ RLinf 是一个灵活且可扩展的开源框架,专为利用强化学习进
 <td style="text-align:center;">39.10%</td>
 </tr>
 <tr>
-<td style="text-align:center;"><a href="https://huggingface.co/gen-robot/openvla-7b-rlvla-warmup"><img src="docs/source-en/_static/svg/hf-logo.svg" alt="HF" width="16" height="16" style="vertical-align: middle;">RL4VLA (PPO)</a></td>
+<td style="text-align:center;"><a href="https://huggingface.co/gen-robot/openvla-7b-rlvla-rl"><img src="docs/source-en/_static/svg/hf-logo.svg" alt="HF" width="16" height="16" style="vertical-align: middle;">RL4VLA (PPO)</a></td>
 <td style="text-align:center;">93.75%</td>
 <td style="text-align:center;">80.47%</td>
 <td style="text-align:center;">75.00%</td>

docs/source-en/index.rst

Lines changed: 2 additions & 1 deletion
@@ -40,7 +40,7 @@ RLinf is a flexible and scalable open-source infrastructure designed for post-tr
 - Embodied Agent Support
 
 - Fast adaptation support for mainstream VLA models: `OpenVLA`_, `OpenVLA-OFT`_, `π₀`_, `GR00T-N1.5`_
-- Support for mainstream CPU & GPU-based simulators via standardized RL interfaces: `ManiSkill3`_, `LIBERO`_
+- Support for mainstream CPU & GPU-based simulators via standardized RL interfaces: `ManiSkill3`_, `LIBERO`_, `IsaacLab`_
 - Enabling the first RL fine-tuning of the π₀ model family with a flow-matching action expert.
 
 **RLinf is fast with:**
@@ -73,6 +73,7 @@ RLinf is a flexible and scalable open-source infrastructure designed for post-tr
 .. _IsaacLab: https://github.com/isaac-sim/IsaacLab
 .. _ManiSkill3: https://github.com/haosulab/ManiSkill
 .. _LIBERO: https://github.com/Lifelong-Robot-Learning/LIBERO
+.. _IsaacLab: https://github.com/isaac-sim/IsaacLab
 .. _π₀: https://github.com/Physical-Intelligence/openpi
 .. _Megatron-LM: https://github.com/NVIDIA/Megatron-LM
 .. _SGLang: https://github.com/sgl-project/sglang

docs/source-en/rst_source/examples/index.rst

Lines changed: 12 additions & 0 deletions
@@ -74,6 +74,17 @@ as well as reinforcement learning training examples on real robots.
 </p>
 </div>
 
+<div style="flex: 1 1 30%; max-width: 300px; text-align: center;">
+<img src="https://github.com/RLinf/misc/raw/main/pic/IsaacLab.png"
+style="width: 100%; height: 200px; object-fit: cover; border-radius: 8px; box-shadow: 0 2px 6px rgba(0,0,0,0.15);" />
+<p style="margin-top: 8px; font-size: 14px; line-height: 1.4;">
+<a href="https://rlinf.readthedocs.io/en/latest/rst_source/examples/isaaclab.html" target="_blank" style="text-decoration: underline; color: blue;">
+<b>RL with IsaacLab Benchmark</b>
+</a><br>
+Support IsaacLab+gr00t+PPO training
+</p>
+</div>
+
 <div style="flex: 1 1 30%; max-width: 300px; text-align: center;">
 <img src="https://github.com/RLinf/misc/raw/main/pic/gr00t.png"
 style="width: 100%; height: 200px; object-fit: cover; border-radius: 8px; box-shadow: 0 2px 6px rgba(0,0,0,0.15);" />
@@ -235,6 +246,7 @@ Thanks to this decoupled design, workers can be flexibly and dynamically schedul
 libero
 behavior
 metaworld
+isaaclab
 pi0
 gr00t
 reasoning
