Commit 544f22e

docs: update README.md
1 parent 3091442 commit 544f22e

1 file changed: +41 -9 lines changed

README.md

Lines changed: 41 additions & 9 deletions
@@ -9,7 +9,7 @@ We are committed to regularly updating our exploration directions and results in

 We warmly welcome contributions from the broader community—join us in pushing the boundaries of agent reasoning and tool integration!

-Code and dataset coming soon! Stay tuned!
+Code and dataset are now available! The `verl` submodule has been integrated for enhanced RL training capabilities.

 <div style="display: flex; justify-content: center;">
 <div style="width: 100; transform: scale(1.0);">
@@ -146,11 +146,18 @@ Agents are equipped with action-space awareness, employing systematic exploratio
 ### Integration with RL Tuning Frameworks
 We integrate insights and methodologies from leading RL tuning frameworks, including:

-- **Verl**
+- **Verl** - **Integrated as Git Submodule** - Our primary RL framework, providing advanced training capabilities for agent optimization
 - **TinyZero**
 - **OpenR1**
 - **Trlx**

+### Verl Integration
+The `verl` submodule is fully integrated into OpenManus-RL, providing:
+- **Advanced RL Algorithms** - PPO, DPO, and custom reward modeling
+- **Efficient Training** - Optimized for large language model fine-tuning
+- **Flexible Configuration** - Easy customization of training parameters
+- **Production Ready** - Battle-tested framework from Bytedance
+
 Through these frameworks, agents can effectively balance exploration and exploitation, optimize reasoning processes, and adapt dynamically to novel environments.

 In summary, our method systematically integrates advanced reasoning paradigms, diverse rollout strategies, sophisticated reward modeling, and robust RL frameworks, significantly advancing the capability and adaptability of reasoning-enhanced LLM agents.
@@ -208,6 +215,18 @@ We are still laboriously developing this part, welcome feedback.

 ## Installation

+### Prerequisites
+This project uses git submodules. After cloning the repository, make sure to initialize and update the submodules:
+
+```bash
+# Clone the repository with submodules
+git clone --recursive https://github.com/OpenManus/OpenManus-RL.git
+
+# Or if already cloned, initialize and update submodules
+git submodule update --init --recursive
+```
+
+### Environment Setup
 First, create a conda environment and activate it:

 ```bash
@@ -277,17 +296,17 @@ webshop --port 36001

 Note: The WebShop environment requires specific versions of Python, PyTorch, Faiss, and Java. The setup script will handle these dependencies automatically.

-## Quick start
+## Quick Start

-Train a reasoning + search LLM on NQ dataset with e5 as the retriever and wikipedia as the corpus.
+### 1. Environment Setup
+Make sure you have the required environments set up (see Environment Setup section above).

-(1) Download the indexing and corpus.
+### 2. Data Preparation
+Download the OpenManus-RL dataset from [Hugging Face](https://huggingface.co/datasets/CharlieDreemur/OpenManus-RL).

-From https://huggingface.co/datasets/CharlieDreemur/OpenManus-RL
+### 3. Training Examples

-(3) Launch a local AgentGym server.
-
-(4) Run RL training (PPO).
+#### ALFWorld RL Training (PPO)
 ```bash
 conda activate openmanus-rl
 bash scripts/ppo_train/train_alfworld.sh
@@ -379,6 +398,19 @@ Please cite the following paper if you find OpenManus helpful!
 </a>
 </p>

+## Project Structure
+
+```
+OpenManus-RL/
+├── verl/ # Verl RL framework submodule
+├── openmanus_rl/ # Main OpenManus-RL library
+├── scripts/ # Training and evaluation scripts
+├── configs/ # Configuration files
+├── environments/ # Agent environment implementations
+├── docs/ # Documentation
+└── examples/ # Usage examples
+```
+
 ## Documentation
 - [Development Guide (English)](docs/DEVELOPMENT_GUIDE_EN.md)
 - [Development Guide (Chinese)](docs/DEVELOPMENT_GUIDE_ZH.md)
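
The Prerequisites section added in this commit depends on the `verl` submodule actually being checked out after cloning. The following is a minimal sketch of a way to confirm that; it is not part of the commit, and the helper name `count_uninitialized` is hypothetical. It relies on `git submodule status` prefixing uninitialized entries with `-`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): reads `git submodule status`
# output on stdin and prints how many entries are uninitialized.
# Git marks an uninitialized submodule with a leading "-"; initialized entries
# start with a space, "+" (checked-out commit differs from the index),
# or "U" (merge conflict).
count_uninitialized() {
  # grep -c prints the count but exits non-zero on zero matches, hence `|| true`.
  grep -c '^-' || true
}

# Only query git when this runs inside a work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  missing=$(git submodule status --recursive | count_uninitialized)
  if [ "$missing" -gt 0 ]; then
    echo "$missing submodule(s) not initialized; run: git submodule update --init --recursive"
  else
    echo "all submodules initialized"
  fi
fi
```

Running this from the repository root after a plain `git clone` (without `--recursive`) should report the missing submodule, and should report none once `git submodule update --init --recursive` has completed.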
