Commit e061c09: Pi05 + PyTorch support (#634)
2 parents: e458066 + 9b2fe72

43 files changed (+8857 / -2811 lines)

README.md

Lines changed: 156 additions & 27 deletions (large diff not rendered by default)

examples/convert_jax_model_to_pytorch.py

Lines changed: 587 additions & 0 deletions (large diff not rendered by default)

examples/droid/README.md

Lines changed: 21 additions & 7 deletions
@@ -1,17 +1,25 @@
-# Run DROID
+# DROID Policies in openpi
 
-This example shows how to run the fine-tuned $\pi_0$-FAST-DROID model on the [DROID robot platform](https://github.com/droid-dataset/droid). We also offer a $\pi_0$-DROID model that is fine-tuned from $\pi_0$ and uses flow action decoding. You can use it by replacing `pi0_fast_droid` with `pi0_droid` in the commands below. In practice, we find that out-of-the-box, the $\pi_0$-FAST-DROID model is better at following language commands, so we recommend it as the default checkpoint for DROID evaluation. If you want to fine-tune on a DROID task that requires a fast-to-inference policy, you may still want to consider using the $\pi_0$-DROID model, since it decodes faster. For more details, please see the [FAST paper](https://pi.website/research/fast).
+We offer instructions for:
+- [Running inference for our best $\pi_{0.5}$-DROID policy](./README.md#running-droid-inference)
+- [Running inference for other pre-trained DROID policies ($\pi_0$, $\pi_0$-FAST, ...)](./README.md#running-roboarena-baseline-policies)
+- [Pre-training *generalist* policies on the *full* DROID dataset](./README_train.md#training-on-droid)
+- [Fine-tuning expert $\pi_{0.5}$ on your custom DROID dataset](./README_train.md#fine-tuning-on-custom-droid-datasets)
 
+## Running DROID Inference
 
-## Step 1: Start a policy server
+This example shows how to run the fine-tuned $\pi_{0.5}$-DROID model on the [DROID robot platform](https://github.com/droid-dataset/droid). Based on the [public RoboArena benchmark](https://robo-arena.github.io/leaderboard), this is currently our strongest generalist DROID policy.
+
+
+### Step 1: Start a policy server
 
 Since the DROID control laptop does not have a powerful GPU, we will start a remote policy server on a different machine with a more powerful GPU and then query it from the DROID control laptop during inference.
 
 1. On a machine with a powerful GPU (e.g., an NVIDIA 4090), clone and install the `openpi` repository following the instructions in the [README](https://github.com/Physical-Intelligence/openpi).
 2. Start the openpi server via the following command:
 
 ```bash
-uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_fast_droid --policy.dir=gs://openpi-assets/checkpoints/pi0_fast_droid
+uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi05_droid --policy.dir=gs://openpi-assets/checkpoints/pi05_droid
 ```
 
 You can also run the equivalent command below:
@@ -20,7 +28,7 @@ You can also run the equivalent command below:
 uv run scripts/serve_policy.py --env=DROID
 ```
 
-## Step 2: Run the DROID robot
+### Step 2: Run the DROID robot
 
 1. Make sure you have the most recent version of the DROID package installed on both the DROID control laptop and the NUC.
 2. On the control laptop, activate your DROID conda environment.
@@ -36,7 +44,7 @@ python3 scripts/main.py --remote_host=<server_ip> --remote_port=<server_port> --
 
 The script will ask you to enter a free-form language instruction for the robot to follow. Make sure to point the cameras at the scene you want the robot to interact with. You _do not_ need to carefully control camera angle, object positions, etc. The policy is fairly robust in our experience. Happy prompting!
 
-# Troubleshooting
+## Troubleshooting
 
 | Issue | Solution |
 |-------|----------|
@@ -46,11 +54,17 @@ The script will ask you to enter a free-form language instruction for the robot
 | Policy does not perform the task well | In our experiments, the policy could perform simple table-top manipulation tasks (pick-and-place) across a wide range of environments, camera positions, and lighting conditions. If the policy does not perform the task well, you can try modifying the scene or object placement to make the task easier. Also make sure that the camera view you are passing to the policy can see all relevant objects in the scene (the policy is conditioned on only a single external camera plus the wrist camera, so make sure you are feeding the desired camera to the policy); you can use `ZED_Explore` to verify this. Finally, the policy is far from perfect and will fail on more complex manipulation tasks, but it usually makes a decent effort. :) |
 
 
-# Running RoboArena Baseline Policies
+## Running Other Policies
 
 We provide configs for running the baseline DROID policies from the [RoboArena](https://robo-arena.github.io/) paper. Simply run the commands below to start inference servers for the respective policies. Then follow the instructions above to run evaluation on the DROID robot.
 
 ```
+# Trained from pi0-FAST, using the FAST tokenizer
+uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_fast_droid --policy.dir=gs://openpi-assets/checkpoints/pi0_fast_droid
+
+# Trained from pi0, using flow matching
+uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_droid --policy.dir=gs://openpi-assets/checkpoints/pi0_droid
+
 # Trained from PaliGemma, using RT-2 / OpenVLA style binning tokenizer.
 uv run scripts/serve_policy.py policy:checkpoint --policy.config=paligemma_binning_droid --policy.dir=gs://openpi-assets/checkpoints/roboarena/paligemma_binning_droid
 
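Once a policy server is running (using any of the commands above), the DROID-side client sends one flat observation dict per control step and receives back a chunk of actions. The sketch below shows building such a payload; the key names, and the `openpi_client` call shown in comments, are illustrative assumptions rather than the exact openpi spec, which is defined by each policy's config.

```python
# Illustrative only: in a real control loop the images are HxWx3 uint8 camera
# frames and the joint state comes from the robot driver.
def make_droid_observation(ext_img, wrist_img, joints, gripper, prompt):
    """Bundle one control step into the flat dict a DROID policy server expects.

    The key names here are placeholders; check the policy config in openpi for
    the exact observation spec of the checkpoint you are serving.
    """
    return {
        "observation/exterior_image_1_left": ext_img,
        "observation/wrist_image_left": wrist_img,
        "observation/joint_position": joints,
        "observation/gripper_position": gripper,
        "prompt": prompt,
    }

obs = make_droid_observation(
    ext_img=[[[0, 0, 0]]],        # stand-in for an external camera frame
    wrist_img=[[[0, 0, 0]]],      # stand-in for a wrist camera frame
    joints=[0.0] * 7,             # 7-DoF arm joint positions
    gripper=[0.0],                # gripper state
    prompt="pick up the cup",     # free-form language instruction
)

# With a server reachable at <server_ip>, the query itself would be roughly:
#   from openpi_client import websocket_client_policy
#   policy = websocket_client_policy.WebsocketClientPolicy(host="<server_ip>", port=8000)
#   action_chunk = policy.infer(obs)["actions"]
print(sorted(obs))
```

In practice `scripts/main.py` handles all of this for you; the sketch is only meant to show the shape of the client/server exchange.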

examples/droid/README_train.md

Lines changed: 46 additions & 6 deletions
@@ -1,9 +1,9 @@
 # Training on DROID
 
-Here we describe how to fine-tune the pi0-FAST model on the DROID dataset. This is an approximate open-source reproduction of the pi0-FAST-DROID training pipeline.
-(small differences in data loading and the used action space)
+Here we describe how to fine-tune the pi0.5 model on the *full* DROID dataset. This is an approximate open-source reproduction of the pi05-DROID training pipeline
+(with small differences in data loading and the action space used). For a tutorial on fine-tuning a model on a smaller, custom dataset collected on the DROID platform, see below.
 
-In contrast to the rest of openpi, which uses LeRobot for data loading, we need to use RLDS as the data format for DROID training (since atm LeRobot isn't scalable enough
+In contrast to the rest of openpi, which uses LeRobot for data loading, we need to use RLDS as the data format for full DROID training (since at the moment LeRobot isn't scalable enough
 for larger datasets like DROID -- they are working on improving it though). Below, we provide instructions for updating your openpi environment for RLDS data loading and where to download the DROID dataset.
 
 ## Install
@@ -30,15 +30,15 @@ First, change the `rlds_data_dir` path in your `TrainConfig` to the directory th
 
 Then, compute normalization statistics (this will take ~10 minutes):
 ```bash
-uv run --group rlds scripts/compute_norm_stats.py --config-name pi0_fast_droid_finetune
+uv run --group rlds scripts/compute_norm_stats.py --config-name pi05_full_droid_finetune --max-frames 10_000_000
 ```
 
 Run training:
 ```bash
-XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run --group rlds scripts/train.py pi0_fast_droid_finetune --exp-name=my_experiment --overwrite
+XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run --group rlds scripts/train.py pi05_full_droid_finetune --exp-name=my_experiment --overwrite
 ```
 
-**Note**: The original pi0-FAST-DROID model was trained with joint velocity actions.
+**Note**: The original pi0.5-DROID model was trained with joint velocity actions.
 Joint velocity actions are not compatible with simulated evaluation environments (much harder to simulate).
 Thus, we do not recommend training with joint velocity actions and instead use joint position actions here.
 
@@ -64,3 +64,43 @@ By default, our openpi training recipe implements the same idle filter used to t
 Consider submitting your DROID policies to the [RoboArena benchmark](https://robo-arena.github.io/), which allows you to evaluate your policies on diverse tasks & scenes, **in the real world**! :)
 
 If you have questions about RoboArena, please email [karl.pertsch@gmail.com](mailto:karl.pertsch@gmail.com).
+
+
+# Fine-Tuning on Custom DROID Datasets
+
+Here we describe how to fine-tune a model on a custom (smaller) dataset collected on the DROID platform. As with other datasets, we will first convert the custom DROID dataset to LeRobot and then fine-tune a model (pi05-droid) on it.
+
+Note: We use LeRobot here, since we assume the custom DROID fine-tuning dataset is relatively small (tens of hours or less). For larger datasets (like the full DROID dataset) we recommend using RLDS for its better efficiency (see the example above).
+
+
+## Step 1: Converting your custom DROID dataset to LeRobot
+
+We will use a small subset of the real DROID dataset for this example. It contains just 30 demonstrations -- we assume that you will use your own dataset instead, but here is the command to download our subset (1.6GB):
+```
+gsutil -m cp -r gs://gresearch/robotics/droid_raw/1.0.1/IRIS/success/2023-12-04 <your_target_path>
+```
+
+We will also download the language annotations for the DROID dataset so we can pair our demonstrations with language instructions. Again, for your own data you can enter your language instructions manually and don't need to download our annotations. To download the DROID language annotations (12MB), run:
+```
+gsutil -m cp -r gs://gresearch/robotics/droid_raw/1.0.1/aggregated-annotations-030724.json <your_target_dir>
+```
+
+For your own dataset, make sure that each episode's directory contains a folder called `recordings/MP4` -- if not, you need to first run the MP4 video extraction (from SVO files) using the script [here](https://github.com/droid-dataset/droid/blob/main/scripts/convert/svo_to_mp4.py).
+
+Now, we will use the `convert_droid_data_to_lerobot.py` script to create a LeRobot version of this dataset (takes <5min for the 30 demonstrations):
+```
+uv run examples/droid/convert_droid_data_to_lerobot.py --data_dir <your_target_path>
+```
+
+## Step 2: Run fine-tuning with your custom dataset
+
+Now we can run fine-tuning with our converted custom dataset. We provide an example config for fine-tuning `pi05_droid` on the custom dataset we created.
+You can easily modify the config to work with other base models, or to use your own custom DROID dataset, in `config.py` (search for `pi05_droid_finetune`).
+
+To launch training:
+```
+uv run scripts/train.py pi05_droid_finetune --exp-name=my_experiment --overwrite
+```
+
+Once trained, you can follow the instructions in [`examples/droid/README.md`](examples/droid/README.md) to serve the policy and run it on the robot.
+

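Pairing demonstrations with instructions from the downloaded annotations file can be sketched as follows. The schema shown here (episode id mapped to instruction fields) is an assumption for illustration only; inspect the real `aggregated-annotations-030724.json` for its actual structure.

```python
import json

# Stand-in for the aggregated annotations file; both the episode ids and the
# "language_instruction1" field name are hypothetical.
raw = """
{
  "episode_0001": {"language_instruction1": "put the marker in the cup"},
  "episode_0002": {"language_instruction1": "close the drawer"}
}
"""
annotations = json.loads(raw)

def instruction_for(episode_id):
    """Return the first language instruction for an episode, if annotated."""
    entry = annotations.get(episode_id, {})
    return entry.get("language_instruction1")

print(instruction_for("episode_0001"))
```

The conversion script handles this lookup for the real dataset; for your own data you would simply write the instruction strings directly.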