README.md: 60 additions & 16 deletions
@@ -21,7 +21,7 @@
 - **[Model Arithmetic](#model-arithmetic)**: A weight-space merging strategy that combines models trained on different data subsets, efficiently capturing diverse knowledge without architectural complexity. **[Released]**
 - **[Stage Advantage](#stage-advantage)**: A stage-aware advantage estimator that provides stable, dense progress signals for policy training. **[Released]**
-- **[Train-Deploy Alignment](#train-deploy-alignment-coming-soon)**: Bridges the distribution gap via spatio-temporal augmentation, heuristic DAgger corrections, and temporal chunk-wise smoothing. **[Coming Soon]**
+- **[Train-Deploy Alignment](#train-deploy-alignment)**: Bridges the distribution gap via spatio-temporal augmentation, heuristic DAgger corrections, and temporal chunk-wise smoothing. **[Released]**

 χ₀ enables two sets of dual-arm robots to collaboratively orchestrate long-horizon garment manipulation — flattening, folding, and hanging — surpassing the state-of-the-art $\pi_{0.5}$ baseline by approximately 250% in success rate, with `only 20 hours of data and 8 A100 GPUs`.
+- [Feb 15 2026] Stage Advantage **advantage labels** (`Task_A/advantage/`) released on [Hugging Face](https://huggingface.co/datasets/OpenDriveLab-org/Kai0) and [ModelScope](https://www.modelscope.cn/datasets/OpenDriveLab/Kai0).
+- [Feb 15 2026] Release of the **Train-Deploy Alignment** module: data augmentation (time scaling, space mirroring), DAgger data collection, inference with temporal smoothing/ensembling and RTC, and HDF5-to-LeRobot conversion.
 - [Feb 14 2026] Release of the **Stage Advantage** module: advantage estimator training, evaluation, GT labeling, and AWBC training pipeline.
 - [Feb 10 2026] Initial release of the **Model Arithmetic** module with support for both JAX and PyTorch checkpoints (not tested thoroughly).
 - [Feb 10 2026] χ₀ paper released.
@@ -80,7 +82,7 @@ Non-edge components (e.g., Policy Training, Model Arithmetic) have been tested o
 ### Hardware

-For real-robot deployment (dual-arm setup, cameras, and table layout), see **[Hardware Setup & 3D Print Files](setup/README.md)**. That document covers supported platforms (Agilex Piper for FlattenFold / TeeShirtSort, ARX X5 for HangCloth), Intel RealSense D435i camera placement, 3D-printed grippers and mounts with usage notes, and inference host GPU (RTX 4090 in Ubuntu 20.04).
+For real-robot deployment (dual-arm setup, cameras, and table layout), see **[Hardware Setup & 3D Print Files](setup/README.md)**. That document covers supported platforms (Agilex Piper for Task_A / Task_B, ARX X5 for Task_C), Intel RealSense D435i camera placement, 3D-printed grippers and mounts with usage notes, and inference host GPU (RTX 4090 in Ubuntu 20.04).

 ## Installation
@@ -116,11 +118,11 @@ Download the Kai0 dataset so it is available under `./data` for training and eva
 python scripts/download_dataset.py
 ```

-This fetches the full dataset from [Hugging Face](https://huggingface.co/datasets/OpenDriveLab-org/Kai0) into `./data` (FlattenFold, HangCloth, TeeShirtSort). To download only specific tasks or use a custom path, see the [dataset docs](docs/dataset.md#step-1-download-the-dataset).
+This fetches the full dataset from [Hugging Face](https://huggingface.co/datasets/OpenDriveLab-org/Kai0) into `./data` (Task_A, Task_B, Task_C). To download only specific tasks or use a custom path, see the [dataset docs](docs/dataset.md#step-1-download-the-dataset).

 ### 2. Download checkpoints (optional, for testing)

-We provide **one best model per task** (FlattenFold, HangCloth, TeeShirtSort) in the [Kai0 repo on Hugging Face](https://huggingface.co/OpenDriveLab-org/Kai0/tree/main).
+We provide **one best model per task** (Task_A, Task_B, Task_C) in the [Kai0 repo on Hugging Face](https://huggingface.co/OpenDriveLab-org/Kai0/tree/main).

 From the repository root, you can download all best-model checkpoints to `./checkpoints` with:
After download, set `weight_loader` in the training config to the path of the corresponding checkpoint directory (see step 3 below). You can also use openpi’s pretrained π₀.₅ checkpoint instead.
@@ -144,7 +146,7 @@ After the dataset is in `./data`, you can run **normal π₀.₅ full fine-tunin
 Edit [`src/openpi/training/config.py`](src/openpi/training/config.py) (around lines 1173–1226) for the task(s) you need:

-- **`repo_id`**: set to the **absolute path** to the dataset subset, e.g. `<path_to_repo_root>/data/FlattenFold/base`, `<path_to_repo_root>/data/TeeShirtSort/base`, or `<path_to_repo_root>/data/HangCloth/base`.
+- **`repo_id`**: set to the **absolute path** to the dataset subset, e.g. `<path_to_repo_root>/data/Task_A/base`, `<path_to_repo_root>/data/Task_B/base`, or `<path_to_repo_root>/data/Task_C/base`.
 - **`weight_loader`**: set to the path of your **π₀.₅ base checkpoint** — either the best model you downloaded in step 2 above, or openpi’s pretrained π₀.₅ checkpoint.

 Config names to use: e.g. `pi05_flatten_fold_normal`
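
Because `repo_id` must be an absolute path, it can help to compute it rather than type it by hand. The snippet below is a minimal sketch: `dataset_repo_id` is a hypothetical helper (it is not part of this repository or of openpi), and it only assumes the `data/<task>/<subset>` layout described above.

```python
from pathlib import Path


def dataset_repo_id(repo_root: str, task: str, subset: str = "base") -> str:
    """Build an absolute dataset path suitable for `repo_id` (hypothetical helper)."""
    # resolve() turns a relative repo root such as "." into an absolute path.
    return str(Path(repo_root).resolve() / "data" / task / subset)


# Run from the repository root; paste the printed value into config.py.
repo_id = dataset_repo_id(".", "Task_A")
print(repo_id)
```

The printed value is what would go into the `repo_id` field for the Task_A config; swap in `Task_B` or `Task_C` for the other tasks.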
@@ -210,8 +212,8 @@ Checkpoints are written to the config’s checkpoint directory. You can then use
 - [x] kai0 oracle: training and inference code with non-advantage data of three tasks
 - [x] Model Arithmetic: code of different baselines for weight-space interpolation
 - [x] Stage Advantage: code, data (advantage labels), and checkpoints
-- [ ] HuggingFace & ModelScope: upload Stage Advantage data and checkpoints — **Feb 14**
+- [x] HuggingFace & ModelScope: Stage Advantage data (`Task_A/advantage/`) and checkpoints uploaded

 ## Model Arithmetic
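
As a rough illustration of weight-space merging, the sketch below averages several checkpoints parameter by parameter. It is framework-free and makes several assumptions for the sake of the example: the `merge_checkpoints` helper, the flat name-to-values checkpoint layout, and the mixing weights are all hypothetical, and this is plain weighted averaging rather than the repository's actual merging code.

```python
from typing import Dict, List, Sequence

# Hypothetical flat layout: parameter name -> list of values.
Checkpoint = Dict[str, List[float]]


def merge_checkpoints(checkpoints: Sequence[Checkpoint],
                      weights: Sequence[float]) -> Checkpoint:
    """Return the weighted average of several checkpoints in weight space."""
    if not checkpoints or len(checkpoints) != len(weights):
        raise ValueError("need one mixing weight per checkpoint")
    total = sum(weights)
    if total <= 0:
        raise ValueError("mixing weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalise so the weights sum to 1

    merged: Checkpoint = {}
    for name, ref in checkpoints[0].items():
        for ckpt in checkpoints:
            if name not in ckpt or len(ckpt[name]) != len(ref):
                raise ValueError(f"parameter {name!r} is missing or has a different size")
        # Average each scalar entry across all checkpoints.
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(norm, checkpoints))
            for i in range(len(ref))
        ]
    return merged


# Two toy "models" trained on different data subsets, merged with equal weight.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = merge_checkpoints([model_a, model_b], weights=[0.5, 0.5])
```

With real JAX or PyTorch checkpoints the same loop would run over arrays or tensors instead of Python lists; the merge itself needs no architectural change, which is the point of the approach.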
@@ -300,7 +302,7 @@ For a ready-to-use script with environment setup (conda/venv activation, DDP con
 **Stage 2 — Advantage Estimation on New Data**: Use the trained estimator to label datasets with predicted advantage values.

 ```bash
-uv run python stage_advantage/annotation/eval.py Flatten-Fold KAI0 /path/to/dataset
+uv run python stage_advantage/annotation/eval.py Task-A KAI0 /path/to/dataset
 ```

 For a ready-to-use script with environment setup and status logging, see `stage_advantage/annotation/eval.sh`.
@@ -315,14 +317,56 @@ For a ready-to-use script with environment setup and automatic log management, s
 For the full pipeline details, configuration instructions, and all parameters, see [`stage_advantage/README.md`](stage_advantage/README.md).

-## Train-Deploy Alignment (Coming Soon)
+## Train-Deploy Alignment

-Train-Deploy Alignment bridges the distribution gap between training and real-world deployment through:
-- **Spatio-temporal augmentation**: Data augmentation including space mirroring and time scaling for dual-arm setups.
-- **Heuristic DAgger corrections**: Interactive on-robot data collection for iterative policy improvement.
-- **Temporal chunk-wise smoothing**: Smoothed action execution to reduce jitter during deployment.
+Train-Deploy Alignment bridges the distribution gap between training and real-world deployment through three sub-modules:

-**This module is currently under refinement and will be released soon.**
+- **Data Augmentation** (`train_deploy_alignment/data_augment/`): Time scaling (frame extraction at configurable rates), space mirroring (left/right arm swap + video flip), dataset merging, and HDF5-to-LeRobot format conversion.
+- **DAgger** (`train_deploy_alignment/dagger/`): Policy-in-the-loop data collection for both Agilex Piper and ARX X5 platforms. Operators run inference, switch to DAgger mode for human corrections, and save episodes (HDF5 + optional videos + intervention labels).
+- **Inference** (`train_deploy_alignment/inference/`): Deployment code for Agilex and ARX robots with multiple execution modes — synchronous, temporal smoothing, temporal ensembling, and **RTC (real-time chunking)**. Uses a two-machine setup (GPU policy server + robot IPC client).
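
The two augmentations named in the Data Augmentation bullet can be sketched in a few lines. This is a toy illustration under stated assumptions, not the code in `train_deploy_alignment/data_augment/`: the frame layout (an `image` as rows of pixels plus an `action` laid out as left-arm values followed by right-arm values), the integer `rate`, and the helper names `time_scale` and `mirror_frame` are all hypothetical. A real dual-arm mirror would likely also need joint sign conventions and camera swaps, which are omitted here.

```python
from typing import Dict, List

# Hypothetical frame: {"image": rows of pixels, "action": [left arm..., right arm...]}
Frame = Dict[str, list]


def time_scale(frames: List[Frame], rate: int) -> List[Frame]:
    """Keep every `rate`-th frame, so the episode plays `rate` times faster."""
    if rate < 1:
        raise ValueError("rate must be a positive integer")
    return frames[::rate]


def mirror_frame(frame: Frame, arm_dof: int) -> Frame:
    """Swap the left/right halves of the action and flip the image horizontally."""
    action = frame["action"]
    if len(action) != 2 * arm_dof:
        raise ValueError("action must hold arm_dof values for each of the two arms")
    left, right = action[:arm_dof], action[arm_dof:]
    flipped = [row[::-1] for row in frame["image"]]  # reverse each pixel row
    return {"image": flipped, "action": right + left}


# Toy episode: 6 frames, a 1x2 "image", and one value per arm.
episode = [{"image": [[t, t + 1]], "action": [t, -t]} for t in range(6)]
fast = time_scale(episode, rate=2)              # keeps frames 0, 2, 4
mirrored = mirror_frame(episode[1], arm_dof=1)  # swaps arms, flips the image
```

Both functions return new data and leave the input episode untouched, so augmented copies can be merged with the original dataset afterwards.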
For full setup instructions (IPC environment, CAN, ROS/ROS2, platform-specific details), see [`train_deploy_alignment/README.md`](train_deploy_alignment/README.md).
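
To give a feel for the temporal ensembling mode, here is a minimal sketch of one common formulation: every control step the policy predicts a chunk of future actions, and the action actually executed is an exponentially weighted average of all chunks that cover the current step. The `TemporalEnsembler` class, the `decay` hyperparameter, and the choice of exponential weights with the oldest chunk weighted highest are assumptions made for illustration; the repository's inference code may differ.

```python
import math
from collections import deque
from typing import Deque, List, Sequence, Tuple


class TemporalEnsembler:
    """Blend overlapping action chunks into one action per control step."""

    def __init__(self, chunk_size: int, decay: float = 0.1) -> None:
        if chunk_size < 1:
            raise ValueError("chunk_size must be at least 1")
        self.chunk_size = chunk_size
        self.decay = decay  # assumed hyperparameter; larger values favour older chunks
        self.step = 0
        # Each entry is (step at which the chunk was predicted, the chunk itself).
        self.chunks: Deque[Tuple[int, Sequence[Sequence[float]]]] = deque()

    def update(self, chunk: Sequence[Sequence[float]]) -> List[float]:
        """Add the chunk predicted at the current step and return the blended action."""
        if len(chunk) != self.chunk_size:
            raise ValueError("unexpected chunk length")
        self.chunks.append((self.step, chunk))
        # Drop chunks that no longer cover the current step.
        while self.step - self.chunks[0][0] >= self.chunk_size:
            self.chunks.popleft()

        # The oldest chunk gets weight exp(0) = 1; newer ones decay exponentially.
        weights = [math.exp(-self.decay * i) for i in range(len(self.chunks))]
        total = sum(weights)
        dim = len(chunk[0])
        action = [0.0] * dim
        for w, (start, past) in zip(weights, self.chunks):
            predicted = past[self.step - start]  # that chunk's prediction for "now"
            for d in range(dim):
                action[d] += (w / total) * predicted[d]
        self.step += 1
        return action


# With decay=0 the blend is a plain average of the overlapping predictions.
ens = TemporalEnsembler(chunk_size=2, decay=0.0)
first = ens.update([[0.0], [2.0]])   # only one chunk covers step 0
second = ens.update([[4.0], [6.0]])  # averages 2.0 (old chunk) and 4.0 (new chunk)
```

Averaging across chunks trades a little responsiveness for smoother motion, which is why it sits alongside synchronous execution and temporal smoothing as a selectable mode rather than replacing them.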