`examples/droid/README.md`
# DROID Policies in openpi

We offer instructions for:
- [Running inference for our best $\pi_{0.5}$-DROID policy](./README.md#running-droid-inference)
- [Running inference for other pre-trained DROID policies ($\pi_0$, $\pi_0$-FAST, ...)](./README.md#running-roboarena-baseline-policies)
- [Pre-training *generalist* policies on the *full* DROID dataset](./README_train.md#training-on-droid)
- [Fine-tuning expert $\pi_{0.5}$ on your custom DROID dataset](./README_train.md#fine-tuning-on-custom-droid-datasets)
## Running DROID Inference

This example shows how to run the fine-tuned $\pi_{0.5}$-DROID model on the [DROID robot platform](https://github.com/droid-dataset/droid). Based on the [public RoboArena benchmark](https://robo-arena.github.io/leaderboard), this is currently our strongest generalist DROID policy.

### Step 1: Start a policy server

Since the DROID control laptop does not have a powerful GPU, we will start a remote policy server on a different machine with a more powerful GPU and then query it from the DROID control laptop during inference.

1. On a machine with a powerful GPU (~NVIDIA 4090), clone and install the `openpi` repository following the instructions in the [README](https://github.com/Physical-Intelligence/openpi).
2. Start the OpenPI server via the following command:

```bash
uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi05_droid --policy.dir=gs://openpi-assets/checkpoints/pi05_droid
```

You can also run the equivalent command below:

```bash
uv run scripts/serve_policy.py --env=DROID
```
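Once the server is running, the DROID side queries it over the network. Below is a minimal, hypothetical sketch of what such a client-side query can look like, assuming the `openpi_client` package from this repository; the observation key names, host address, and port are illustrative placeholders rather than the exact interface:

```python
import numpy as np


def build_observation(ext_img: np.ndarray, wrist_img: np.ndarray,
                      joints: np.ndarray, prompt: str) -> dict:
    """Pack camera images, joint state, and the language prompt into one
    observation dict. The exact key names expected by the server depend on
    the policy config; the ones below are illustrative placeholders."""
    return {
        "observation/exterior_image_1_left": ext_img,
        "observation/wrist_image_left": wrist_img,
        "observation/joint_position": joints,
        "prompt": prompt,
    }


if __name__ == "__main__":
    # Querying the remote server requires the openpi_client package and a
    # running policy server; the host/port here are made up.
    from openpi_client import websocket_client_policy

    policy = websocket_client_policy.WebsocketClientPolicy(host="192.168.1.100", port=8000)
    obs = build_observation(
        ext_img=np.zeros((224, 224, 3), dtype=np.uint8),
        wrist_img=np.zeros((224, 224, 3), dtype=np.uint8),
        joints=np.zeros(7, dtype=np.float32),
        prompt="pick up the marker",
    )
    # The server returns a chunk of future actions to execute.
    action_chunk = policy.infer(obs)["actions"]
```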

### Step 2: Run the DROID robot

1. Make sure you have the most recent version of the DROID package installed on both the DROID control laptop and the NUC.
2. On the control laptop, activate your DROID conda environment.
The script will ask you to enter a free-form language instruction for the robot to follow. Make sure to point the cameras at the scene you want the robot to interact with. You _do not_ need to carefully control camera angle, object positions, etc. The policy is fairly robust in our experience. Happy prompting!

## Troubleshooting

| Issue | Solution |
|-------|----------|
| Policy does not perform the task well | In our experiments, the policy could perform simple table top manipulation tasks (pick-and-place) across a wide range of environments, camera positions, and lighting conditions. If the policy does not perform the task well, you can try modifying the scene or object placement to make the task easier. Also make sure that the camera view you are passing to the policy can see all relevant objects in the scene (the policy is only conditioned on a single external camera + wrist camera, make sure you are feeding the desired camera to the policy). Use `ZED_Explore` to check that the camera view you are passing to the policy can see all relevant objects in the scene. Finally, the policy is far from perfect and will fail on more complex manipulation tasks, but it usually makes a decent effort. :) |

## Running Other Policies

We provide configs for running the baseline DROID policies from the [RoboArena](https://robo-arena.github.io/) paper. Simply run the commands below to start inference servers for the respective policies. Then follow the instructions above to run evaluation on the DROID robot.

```bash
# Trained from pi0-FAST, using the FAST tokenizer
uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_fast_droid --policy.dir=gs://openpi-assets/checkpoints/pi0_fast_droid

# Trained from pi0, using flow matching
uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_droid --policy.dir=gs://openpi-assets/checkpoints/pi0_droid

# Trained from PaliGemma, using an RT-2 / OpenVLA-style binning tokenizer
uv run scripts/serve_policy.py policy:checkpoint --policy.config=paligemma_binning_droid --policy.dir=gs://openpi-assets/checkpoints/roboarena/paligemma_binning_droid
```
`examples/droid/README_train.md`
# Training on DROID

Here we describe how to fine-tune the pi0.5 model on the *full* DROID dataset. This is an approximate open-source reproduction of the pi05-DROID training pipeline (with small differences in data loading and the action space used). For a tutorial on how to fine-tune your model with a smaller, custom dataset collected on the DROID platform, see below.

In contrast to the rest of openpi, which uses LeRobot for data loading, we need to use RLDS as the data format for full DROID training (since at the moment LeRobot isn't scalable enough for larger datasets like DROID -- they are working on improving it though). Below, we provide instructions for updating your openpi environment for RLDS data loading and where to download the DROID dataset.
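For orientation, RLDS represents a dataset as episodes, each a sequence of per-timestep steps with standardized fields. A schematic sketch in plain Python (the step fields follow the general RLDS convention; the observation contents and instruction string are made up for illustration):

```python
# Schematic of one RLDS episode: a dict holding a sequence of steps.
def make_step(observation, action, is_first=False, is_last=False):
    return {
        "observation": observation,      # e.g. camera images + joint state
        "action": action,                # robot action for this timestep
        "is_first": is_first,            # episode-boundary flags
        "is_last": is_last,
        "language_instruction": "put the spoon in the bowl",  # made-up example
    }


episode = {
    "steps": [
        make_step({"joint_position": [0.0] * 7}, [0.01] * 8, is_first=True),
        make_step({"joint_position": [0.01] * 7}, [0.02] * 8),
        make_step({"joint_position": [0.02] * 7}, [0.0] * 8, is_last=True),
    ],
}
```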
## Install
First, change the `rlds_data_dir` path in your `TrainConfig` to the directory that contains the downloaded DROID dataset.

Then, compute normalization statistics (this will take ~10 minutes):
```bash
uv run --group rlds scripts/compute_norm_stats.py --config-name pi05_full_droid_finetune --max-frames 10_000_000
```
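Conceptually, the normalization statistics are per-dimension statistics over the dataset's states and actions, later used to standardize model inputs and outputs. A toy numpy sketch of the idea (the real script's internals and output format may differ):

```python
import numpy as np


def compute_norm_stats(actions: np.ndarray) -> dict:
    """Per-dimension mean/std over a stack of action vectors, shape (N, action_dim)."""
    return {
        "mean": actions.mean(axis=0),
        "std": actions.std(axis=0) + 1e-6,  # epsilon avoids division by zero
    }


def normalize(action: np.ndarray, stats: dict) -> np.ndarray:
    """Standardize one action vector with the precomputed statistics."""
    return (action - stats["mean"]) / stats["std"]


# Example: statistics over two 3-dimensional actions.
stats = compute_norm_stats(np.array([[0.0, 2.0, -1.0], [2.0, 2.0, 1.0]]))
# stats["mean"] -> [1.0, 2.0, 0.0]
```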

Run training:
```bash
XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run --group rlds scripts/train.py pi05_full_droid_finetune --exp-name=my_experiment --overwrite
```

**Note**: The original pi0.5-DROID model was trained with joint velocity actions.
Joint velocity actions are not compatible with simulated evaluation environments (much harder to simulate).
Thus, we do not recommend training with joint velocity actions and instead use joint position actions here.
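To make the contrast concrete: a chunk of joint velocity actions can be turned into absolute joint position targets by integrating from the current joint state. A hedged numpy sketch of that relationship (illustrative only, not the conversion code used in openpi; `dt` is the assumed control period):

```python
import numpy as np


def velocities_to_position_targets(q0: np.ndarray, velocities: np.ndarray, dt: float) -> np.ndarray:
    """Integrate a (horizon, dof) chunk of joint velocities, starting from the
    current joint configuration q0, into one absolute position target per step."""
    return q0 + np.cumsum(velocities * dt, axis=0)


q0 = np.zeros(2)                               # current joint angles (2-dof toy robot)
vels = np.array([[1.0, 0.0], [1.0, 2.0]])      # two velocity actions
targets = velocities_to_position_targets(q0, vels, dt=0.5)
# targets -> [[0.5, 0.0], [1.0, 1.0]]
```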
By default, our openpi training recipe implements the same idle filter used to train the DROID policies.

Consider submitting your DROID policies to the [RoboArena benchmark](https://robo-arena.github.io/), which allows you to evaluate your policies on diverse tasks & scenes, **in the real world**! :)
If you have questions about RoboArena, please email [karl.pertsch@gmail.com](mailto:karl.pertsch@gmail.com).

# Fine-Tuning on Custom DROID Datasets

Here we describe how to fine-tune a model on a custom (smaller) dataset collected on the DROID platform. As with other datasets, we will first convert the custom DROID dataset to LeRobot and then fine-tune a model (pi05-droid) on it.

Note: We use LeRobot here, since we assume the custom DROID fine-tuning dataset is relatively small (tens of hours or less). For larger datasets (like the full DROID dataset), we recommend using RLDS for its better efficiency (see the example above).

## Step 1: Converting your custom DROID dataset to LeRobot
We will use a small subset of the real DROID dataset for this example. This is a subset of just 30 demonstrations -- we assume that you will use your own dataset instead, but here is the command to download our subset (1.6GB):

We will also download the language annotations for the DROID dataset so we can pair our demonstrations with language instructions. Again, for your own data you can manually enter your language instructions and don't need to download our annotations. To download the DROID language annotations (12MB), run:

For your own dataset, make sure that each episode's directory contains a folder called `recordings/MP4` -- if not, you need to first run the MP4 video extraction (from SVO files) using the script [here](https://github.com/droid-dataset/droid/blob/main/scripts/convert/svo_to_mp4.py).

Now, we will use the `convert_droid_data_to_lerobot.py` script to create a LeRobot version of this dataset (takes <5 min for the 30 demonstrations):
```bash
uv run examples/droid/convert_droid_data_to_lerobot.py --data_dir <your_target_path>
```

## Step 2: Run fine-tuning with your custom dataset
Now we can run fine-tuning with our converted custom dataset. We provide an example config for fine-tuning `pi05_droid` on the custom dataset we created.
You can easily modify the config to work with other base models or to use your own custom DROID dataset; see `config.py` (search for `pi05_droid_finetune`).

To launch training:
```bash
uv run scripts/train.py pi05_droid_finetune --exp-name=my_experiment --overwrite
```

Once trained, you can follow the instructions in [`examples/droid/README.md`](examples/droid/README.md) to serve the policy and run it on the robot.