## 🚀 What's New
### 🚀 **Command-Line Training Interface** - Train RL Agents Without Writing Code! (Experimental)
TorchRL now provides a **powerful command-line interface** that lets you train state-of-the-art RL agents with simple bash commands! No Python scripting required - just run training with customizable parameters:
- ⚙️ **Full Customization**: Override any parameter via the command line, e.g. `trainer.total_frames=2000000 optimizer.lr=0.0003` (see the example after this list)
- 🌍 **Multi-Environment Support**: Switch between Gym, Brax, DM Control, and more with `env=gym training_env.create_env_fn.base_env.env_name=HalfCheetah-v4`
- 📊 **Built-in Logging**: TensorBoard, Weights & Biases, CSV logging out of the box
- 🔧 **Hydra-Powered**: Leverages Hydra's powerful configuration system for maximum flexibility
- 🏃‍♂️ **Production Ready**: Same robust training pipeline as our SOTA implementations
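
For example, a run combining the overrides above might look like the following (a sketch using the PPO trainer entry point documented below; any other key in the Hydra config tree can be overridden the same way):

```bash
# Train PPO on HalfCheetah with a custom frame budget and learning rate.
# Each key=value pair overrides the corresponding entry in the Hydra config.
python sota-implementations/ppo_trainer/train.py \
    env=gym \
    training_env.create_env_fn.base_env.env_name=HalfCheetah-v4 \
    trainer.total_frames=2000000 \
    optimizer.lr=0.0003
```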
**Perfect for**: Researchers, practitioners, and anyone who wants to train RL agents without diving into implementation details.
⚠️ **Note**: This is an experimental feature. The API may change in future versions. We welcome feedback and contributions to help improve this implementation!
📋 **Prerequisites**: The training interface requires Hydra for configuration management. Install with:
```bash
pip install "torchrl[utils]"
# or manually:
pip install hydra-core omegaconf
```
Check out the [complete CLI documentation](https://github.com/pytorch/rl/tree/main/sota-implementations/ppo_trainer) to get started!
### LLM API - Complete Framework for Language Model Fine-tuning
TorchRL also includes a comprehensive **LLM API** for post-training and fine-tuning of language models! This new framework provides everything you need for RLHF, supervised fine-tuning, and tool-augmented training:
- 🤖 **Unified LLM Wrappers**: Seamless integration with Hugging Face models and vLLM inference engines - more to come!
- 💬 **Conversation Management**: Advanced [`History`](torchrl/data/llm/history.py) class for multi-turn dialogue with automatic chat template detection
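
As a rough illustration of the conversation API, a multi-turn exchange might be assembled as below. This is a sketch only: the import path follows the linked module, and the `from_chats` / `apply_chat_template` names are assumptions to be checked against it.

```python
from transformers import AutoTokenizer
from torchrl.data.llm import History  # path per the linked module (re-export assumed)

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# Build a batch holding one two-turn conversation (constructor name assumed).
history = History.from_chats([[
    {"role": "user", "content": "What is TorchRL?"},
    {"role": "assistant", "content": "A PyTorch-first RL library."},
]])

# Render the conversation with the tokenizer's chat template, relying on the
# automatic template detection described above (method name assumed).
prompt = history.apply_chat_template(tokenizer=tokenizer, add_generation_prompt=True)
```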
</details>
### 🧪 PPOTrainer (Experimental) - High-Level Training Interface
TorchRL now includes an **experimental PPOTrainer** that provides a complete, configurable PPO training solution! This prototype feature combines TorchRL's modular components into a cohesive training system with sensible defaults:
- 🎯 **Complete Training Pipeline**: Handles environment setup, data collection, loss computation, and optimization automatically
- ⚙️ **Extensive Configuration**: Comprehensive Hydra-based config system for easy experimentation and hyperparameter tuning (see the config sketch after this list)
- 📊 **Built-in Logging**: Automatic tracking of rewards, actions, episode completion rates, and training statistics
- 🔧 **Modular Design**: Built on existing TorchRL components (collectors, losses, replay buffers) for maximum flexibility
- 📝 **Minimal Code**: Complete SOTA implementation in [just ~20 lines](sota-implementations/ppo_trainer/train.py)!
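
To give a feel for the configuration layout, here is a hypothetical sketch of the config tree implied by the override paths shown in this section (the actual config files live in the directory linked below):

```yaml
# Hypothetical Hydra config sketch; key paths mirror the overrides above.
trainer:
  total_frames: 1000000     # total environment frames to collect
optimizer:
  lr: 0.0003                # learning rate for policy/value updates
training_env:
  create_env_fn:
    base_env:
      env_name: Pendulum-v1 # default task of the working example below
```

Any of these values can be changed from the command line, e.g. `optimizer.lr=0.001`.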
**Working Example**: See [`sota-implementations/ppo_trainer/`](sota-implementations/ppo_trainer/) for a complete, working PPO implementation that trains on Pendulum-v1 with full Hydra configuration support.
**Prerequisites**: Requires Hydra for configuration management: `pip install "torchrl[utils]"`
<details>
<summary>Complete Training Script (sota-implementations/ppo_trainer/train.py)</summary>
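
The linked file holds the actual script; structurally it amounts to a Hydra entry point that instantiates the fully configured trainer and runs it. The sketch below conveys the shape only (the config path/name and the `trainer` key are assumptions):

```python
import hydra
from hydra.utils import instantiate


@hydra.main(config_path=".", config_name="config", version_base=None)
def main(cfg):
    # Build the trainer (collector, loss, optimizer, logger, ...) from the
    # Hydra config tree, then run the full training loop.
    trainer = instantiate(cfg.trainer)
    trainer.train()


if __name__ == "__main__":
    main()
```

</details>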
**Future Plans**: Additional algorithm trainers (SAC, TD3, DQN) and full integration of all TorchRL components within the configuration system are planned for upcoming releases.
## Key features
- 🐍 **Python-first**: Designed with Python as the primary language for ease of use and flexibility
Or create a conda environment where the packages will be installed.
```
conda create --name torchrl python=3.10
conda activate torchrl
```
Depending on your use case, you may want to install the latest (nightly) PyTorch release or the latest stable version of PyTorch.
See [here](https://pytorch.org/get-started/locally/) for a detailed list of commands,
including `pip3` or other special installation instructions.
TorchRL offers a few pre-defined dependencies such as `"torchrl[tests]"`, `"torchrl[atari]"`, `"torchrl[utils]"` etc.
For the experimental training interface and configuration system, install:
```bash
pip3 install "torchrl[utils]"  # includes hydra-core and other utilities
```
#### TorchRL
Importantly, the nightly builds require the nightly builds of PyTorch too.
Also, a local build of torchrl with the nightly build of tensordict may fail - install both nightlies or both local builds but do not mix them.
**Disclaimer**: As of today, TorchRL requires Python 3.10+ and is roughly compatible with any PyTorch version >= 2.1. Installing it will not
directly require a newer version of PyTorch to be installed. Indirectly though, tensordict still requires the latest
PyTorch to be installed and we are working hard to loosen that requirement.
The C++ binaries of TorchRL (mainly for prioritized replay buffers) will only work with PyTorch 2.7.0 and above.
0 commit comments