Commit d60f2cd

Update README.md
1 parent e7a1e0c commit d60f2cd

1 file changed (+6, -2 lines changed)

README.md

Lines changed: 6 additions & 2 deletions
@@ -37,6 +37,8 @@ Embodied Planner-R1 enables LLM agents to learn causal relationships between act
 
 
 ## 🚀 Installation
+We separate the VERL training framework from the environment and wrap the environment into a [server](verl/alfworld_server/server) for interaction.
+
 1. Embodied-Planner-R1 is based on verl with vLLM>=0.8
 ```
 # Create the conda environment
@@ -54,7 +56,7 @@ pip3 install flash-attn --no-build-isolation
 pip3 install tensorboard
 ```
 
-2. Prepare environment for ALFWorld
+2. Prepare the environment for ALFWorld
 ```
 conda create -n alfworld python=3.9
 conda activate alfworld
@@ -66,7 +68,7 @@ pip install uvicorn
 alfworld-download --data-dir ./get_data/alfworld
 ```
 
-3. Prepare environment for ScienceWorld
+3. Prepare the environment for ScienceWorld
 ```
 conda create --name scienceworld python=3.8
 conda activate scienceworld
@@ -86,6 +88,8 @@ bash get_data_for_training.sh
 ```
 
 ## 🕹️ Quick Start
+In our experimental setup, we used a 1×8 A100 (80GB) for training, with detailed training parameters provided in [examples/grpo_trainer/alf.sh](examples/grpo_trainer/alf.sh).
+
 ```
 # Remember to replace the path in the shell script with your local path
 bash cmd/alf.sh
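
The Quick Start comment in the last hunk asks readers to replace the path in the shell script with a local path before running it. The following is a minimal sketch of one way to do that with `sed`, not the authors' procedure. The placeholder string `/path/to/your/project`, the variable names `MODEL_PATH` and `DATA_DIR`, and the `LOCAL_ROOT` variable are all assumptions made for illustration; the real script may use different names. The sketch works on a scratch copy so that no real file is modified.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical placeholder and target path; adjust both to match the real script.
PLACEHOLDER="/path/to/your/project"
LOCAL_ROOT="${LOCAL_ROOT:-$PWD}"

# Build a stand-in script in a scratch directory (its contents are illustrative only).
workdir="$(mktemp -d)"
script="$workdir/alf.sh"
cat > "$script" <<'EOF'
MODEL_PATH=/path/to/your/project/models/base
DATA_DIR=/path/to/your/project/get_data/alfworld
EOF

# Replace every occurrence of the placeholder.
# Using '|' as the sed delimiter avoids having to escape '/' in paths.
# Writing to a new file sidesteps the GNU/BSD differences of 'sed -i'.
sed "s|$PLACEHOLDER|$LOCAL_ROOT|g" "$script" > "$script.local"

# Show the rewritten script for a quick visual check.
cat "$script.local"
```

This assumes `LOCAL_ROOT` contains neither `|` nor `&`, which `sed` would otherwise interpret specially. Reviewing the rewritten copy before running it is a cheap way to catch a path that was missed.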
