README.md (6 additions & 6 deletions)
@@ -80,7 +80,7 @@ Non-edge components (e.g., Policy Training, Model Arithmetic) have been tested o
### Hardware
- For real-robot deployment (dual-arm setup, cameras, and table layout), see **[Hardware Setup & 3D Print Files](setup/README.md)**. That document covers supported platforms (Agilex Piper for FlattenFold / TeeShirtSort, ARX X5 for HangCloth), Intel RealSense D435i camera placement, 3D-printed grippers and mounts with usage notes, and the inference host GPU (RTX 4090 on Ubuntu 20.04).
+ For real-robot deployment (dual-arm setup, cameras, and table layout), see **[Hardware Setup & 3D Print Files](setup/README.md)**. That document covers supported platforms (Agilex Piper for Task_A / Task_B, ARX X5 for Task_C), Intel RealSense D435i camera placement, 3D-printed grippers and mounts with usage notes, and the inference host GPU (RTX 4090 on Ubuntu 20.04).
## Installation
@@ -116,11 +116,11 @@ Download the Kai0 dataset so it is available under `./data` for training and eva
```bash
python scripts/download_dataset.py
```
- This fetches the full dataset from [Hugging Face](https://huggingface.co/datasets/OpenDriveLab-org/Kai0) into `./data` (FlattenFold, HangCloth, TeeShirtSort). To download only specific tasks or use a custom path, see the [dataset docs](docs/dataset.md#step-1-download-the-dataset).
+ This fetches the full dataset from [Hugging Face](https://huggingface.co/datasets/OpenDriveLab-org/Kai0) into `./data` (Task_A, Task_B, Task_C). To download only specific tasks or use a custom path, see the [dataset docs](docs/dataset.md#step-1-download-the-dataset).
### 2. Download checkpoints (optional, for testing)
- We provide **one best model per task** (FlattenFold, HangCloth, TeeShirtSort) in the [Kai0 repo on Hugging Face](https://huggingface.co/OpenDriveLab-org/Kai0/tree/main).
+ We provide **one best model per task** (Task_A, Task_B, Task_C) in the [Kai0 repo on Hugging Face](https://huggingface.co/OpenDriveLab-org/Kai0/tree/main).
From the repository root, you can download all best-model checkpoints to `./checkpoints` with:
After download, set `weight_loader` in the training config to the path of the corresponding checkpoint directory (see step 3 below). You can also use openpi’s pretrained π₀.5 checkpoint instead.
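As a hedged illustration of that step, assuming openpi exposes a `CheckpointWeightLoader` in `openpi.training.weight_loaders` (the loader class and the checkpoint path below are assumptions, not taken from this repo's config):

```python
# Hypothetical fragment for a training-config entry in src/openpi/training/config.py.
# Both the loader class and the path are assumptions; check your checkout.
from openpi.training import weight_loaders

weight_loader = weight_loaders.CheckpointWeightLoader(
    "./checkpoints/pi05_flatten_fold/params"  # best-model dir downloaded in step 2
)
```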
@@ -144,7 +144,7 @@ After the dataset is in `./data`, you can run **normal π₀.₅ full fine-tunin
Edit [`src/openpi/training/config.py`](src/openpi/training/config.py) (around lines 1173–1226) for the task(s) you need:
- - **`repo_id`**: set to the **absolute path** to the dataset subset, e.g. `<path_to_repo_root>/data/FlattenFold/base`, `<path_to_repo_root>/data/TeeShirtSort/base`, or `<path_to_repo_root>/data/HangCloth/base`.
+ - **`repo_id`**: set to the **absolute path** to the dataset subset, e.g. `<path_to_repo_root>/data/Task_A/base`, `<path_to_repo_root>/data/Task_B/base`, or `<path_to_repo_root>/data/Task_C/base`.
- **`weight_loader`**: set to the path of your **π₀.₅ base checkpoint** — either the best model you downloaded in step 2 above, or openpi’s pretrained π₀.₅ checkpoint.
Config names to use: e.g. `pi05_flatten_fold_normal`
@@ -300,7 +300,7 @@ For a ready-to-use script with environment setup (conda/venv activation, DDP con
**Stage 2 — Advantage Estimation on New Data**: Use the trained estimator to label datasets with predicted advantage values.
```bash
- uv run python stage_advantage/annotation/eval.py Flatten-Fold KAI0 /path/to/dataset
+ uv run python stage_advantage/annotation/eval.py Task-A KAI0 /path/to/dataset
```
For a ready-to-use script with environment setup and status logging, see `stage_advantage/annotation/eval.sh`.
stage_advantage/README.md (7 additions & 7 deletions)
@@ -235,18 +235,18 @@ Examples:
```bash
# KAI0 (two-timestep) on a dataset
- uv run python stage_advantage/annotation/eval.py Flatten-Fold KAI0 /path/to/dataset
+ uv run python stage_advantage/annotation/eval.py Task-A KAI0 /path/to/dataset
# PI06 (single-timestep)
- uv run python stage_advantage/annotation/eval.py Flatten-Fold PI06 /path/to/dataset
+ uv run python stage_advantage/annotation/eval.py Task-A PI06 /path/to/dataset
```
- `<model_type>` is a key in `eval.py`'s `MODELS_CONFIG_MAP` (e.g. `Flatten-Fold`); `<model_name>` is `PI06` or `KAI0`; `<repo_id>` is the path to the LeRobot dataset. Results are written under `<repo_id>/data_<model_name>_<ckpt_steps>/`.
+ `<model_type>` is a key in `eval.py`'s `MODELS_CONFIG_MAP` (e.g. `Task-A`); `<model_name>` is `PI06` or `KAI0`; `<repo_id>` is the path to the LeRobot dataset. Results are written under `<repo_id>/data_<model_name>_<ckpt_steps>/`.
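The results-path convention just described can be sketched in a few lines (the helper name `advantage_output_dir` is hypothetical, not part of `eval.py`):

```python
from pathlib import Path

def advantage_output_dir(repo_id: str, model_name: str, ckpt_steps: int) -> Path:
    """Build the documented results directory <repo_id>/data_<model_name>_<ckpt_steps>/."""
    return Path(repo_id) / f"data_{model_name}_{ckpt_steps}"

print(advantage_output_dir("/path/to/dataset", "KAI0", 100000))
# → /path/to/dataset/data_KAI0_100000
```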
For a ready-to-use script with environment setup (conda/venv activation, environment variables) and status logging, see **`annotation/eval.sh`**:
@@ -274,13 +274,13 @@ The output parquets can then be used in Stage 3 (AWBC) or fed back into Stage 0
**Goal**: Train a policy using **Advantage-Weighted Behavior Cloning (AWBC)**. The advantage labels (from Stage 0 + Stage 2) are stored as `task_index` per frame and as prompt strings in `meta/tasks.jsonl`. By setting **`prompt_from_task=True`** in the data config, each sample’s prompt is taken from that mapping, so the policy is conditioned on the advantage-derived label (e.g. high vs low advantage) and effectively does advantage-weighted behavior cloning via the language channel.
- **Configs** (in `src/openpi/training/config.py`): `pi05_flatten_fold_awbc`, `pi05_tee_shirt_sort_awbc`, `pi05_hang_cloth_awbc`. Each uses `LerobotAgilexDataConfig` or `LerobotARXDataConfig` with `base_config=DataConfig(prompt_from_task=True)` and `repo_id` pointing to the **advantage** dataset (e.g. `.../data/FlattenFold/advantage`).
+ **Configs** (in `src/openpi/training/config.py`): `pi05_flatten_fold_awbc`, `pi05_tee_shirt_sort_awbc`, `pi05_hang_cloth_awbc`. Each uses `LerobotAgilexDataConfig` or `LerobotARXDataConfig` with `base_config=DataConfig(prompt_from_task=True)` and `repo_id` pointing to the **advantage** dataset (e.g. `.../data/Task_A/advantage`).
### What the policy sees as prompt (training)
The prompt is read from the dataset’s **`meta/tasks.jsonl`**: each frame’s `task_index` is mapped to a task string, and that string is passed to the policy as the language prompt. **`gt_label.py`** (Stage 0) writes these strings when it builds the advantage-labeled dataset.
- - **Binary mode** (typical): `task_index=0` → `"<task>, Advantage: negative"`, `task_index=1` → `"<task>, Advantage: positive"`. The `<task>` text is set in `gt_label.py` (e.g. `"fold the cloth"` for FlattenFold).
+ - **Binary mode** (typical): `task_index=0` → `"<task>, Advantage: negative"`, `task_index=1` → `"<task>, Advantage: positive"`. The `<task>` text is set in `gt_label.py` (e.g. `"fold the cloth"` for Task_A).
So during AWBC training the model is conditioned on prompts that explicitly include the advantage label (e.g. `"fold the cloth, Advantage: positive"` or `"fold the cloth, Advantage: negative"`).
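The binary-mode lookup above can be sketched as a toy illustration (the in-memory `tasks.jsonl` stand-in and the `frame` dict are fabricated for the example; the real strings come from the dataset's `meta/tasks.jsonl` written by `gt_label.py`):

```python
import json

# Toy stand-in for meta/tasks.jsonl: one {"task_index": ..., "task": ...} record per line.
tasks_jsonl = "\n".join([
    json.dumps({"task_index": 0, "task": "fold the cloth, Advantage: negative"}),
    json.dumps({"task_index": 1, "task": "fold the cloth, Advantage: positive"}),
])

# Map each task_index to its task string, as prompt_from_task=True does per frame.
index_to_prompt = {
    rec["task_index"]: rec["task"]
    for rec in (json.loads(line) for line in tasks_jsonl.splitlines())
}

frame = {"task_index": 1}  # hypothetical frame record
print(index_to_prompt[frame["task_index"]])
# → fold the cloth, Advantage: positive
```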
@@ -299,7 +299,7 @@ At **inference** time you must use the **same prompt format** as in training. To
### Before training
- 1. **Produce the advantage dataset:** Run Stage 2 (eval) on your dataset so it has `data_PI06_100000/` or `data_KAI0_100000/`. Then run Stage 0 (e.g. `gt_labeling.sh`) with `DATA_PATH` = that repo and source subdirs `data_PI06_100000` / `data_KAI0_100000`; the script outputs a directory with `data/` (parquets with `task_index`), `meta/tasks.jsonl`, and `videos/`. Use that directory as the advantage dataset (e.g. copy or link it to `./data/FlattenFold/advantage`).
+ 1. **Produce the advantage dataset:** Run Stage 2 (eval) on your dataset so it has `data_PI06_100000/` or `data_KAI0_100000/`. Then run Stage 0 (e.g. `gt_labeling.sh`) with `DATA_PATH` = that repo and source subdirs `data_PI06_100000` / `data_KAI0_100000`; the script outputs a directory with `data/` (parquets with `task_index`), `meta/tasks.jsonl`, and `videos/`. Use that directory as the advantage dataset (e.g. copy or link it to `./data/Task_A/advantage`).
2. In `config.py`, set **`repo_id`** to that advantage dataset path and **`weight_loader`** to your π₀.5 base checkpoint for the AWBC config(s) you use.
3. **Compute norm stats:**
`uv run python scripts/compute_norm_states_fast.py --config-name pi05_flatten_fold_awbc`
Each uses `base_config=DataConfig(prompt_from_task=True)` so that the dataset’s `task_index` column and `meta/tasks.jsonl` supply the prompt (advantage-derived label) per frame.
@@ -24,11 +24,11 @@ Each uses `base_config=DataConfig(prompt_from_task=True)` so that the dataset’
To build your own advantage dataset instead:
- Run **Stage 2** (eval) on your dataset → get `data_PI06_100000/` or `data_KAI0_100000/` with advantage columns.
- Run **Stage 0** on that output: `gt_label.py --advantage-source absolute_advantage` (or `gt_labeling.sh` with `DATA_PATH` = the eval repo). The resulting directory (with `data/`, `meta/tasks.jsonl`, `videos/`) is your advantage dataset.
- - Place or link it at e.g. `./data/FlattenFold/advantage` and set `repo_id` in config to that path.
+ - Place or link it at e.g. `./data/Task_A/advantage` and set `repo_id` in config to that path.
2. **Config paths**
In `src/openpi/training/config.py`, for the AWBC config(s) you use:
- - Set **`repo_id`** to the **absolute path** of the advantage dataset (e.g. `<path_to_repo_root>/data/FlattenFold/advantage`).
+ - Set **`repo_id`** to the **absolute path** of the advantage dataset (e.g. `<path_to_repo_root>/data/Task_A/advantage`).
- Set **`weight_loader`** to your **π₀.5 base checkpoint** path.
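Combining the two settings, an AWBC config entry might be sketched as plain assignments (illustration only; the real config uses openpi's config dataclasses, and the checkpoint placeholder path is an assumption):

```python
# Illustrative sketch only; mirror these values into the actual AWBC
# config entries in src/openpi/training/config.py.
repo_id = "<path_to_repo_root>/data/Task_A/advantage"     # absolute path to the advantage dataset
weight_loader = "<path_to_checkpoints>/pi05_base/params"  # π₀.5 base checkpoint (hypothetical path)
```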