Commit e8a54a7

ChonghaoSima and claude committed
[UPDATE]: document advantage label release on HuggingFace & ModelScope
Update the stage_advantage READMEs to note that `Task_A/advantage/` is available in both the HuggingFace and ModelScope dataset repos. Check off the HuggingFace & ModelScope to-do item and add an update log entry. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 9d350ca commit e8a54a7

File tree

3 files changed: +4 −3 lines changed


README.md

Lines changed: 2 additions & 1 deletion
@@ -54,6 +54,7 @@ https://github.com/user-attachments/assets/3f5f0c48-ff3f-4b9b-985b-59ad0b2ea97c
 
 ## Update
 
+- [Feb 15 2026] Stage Advantage **advantage labels** (`Task_A/advantage/`) released on [Hugging Face](https://huggingface.co/datasets/OpenDriveLab-org/Kai0) and [ModelScope](https://www.modelscope.cn/datasets/OpenDriveLab/Kai0).
 - [Feb 15 2026] Release of the **Train-Deploy Alignment** module: data augmentation (time scaling, space mirroring), DAgger data collection, inference with temporal smoothing/ensembling and RTC, and HDF5-to-LeRobot conversion.
 - [Feb 14 2026] Release of the **Stage Advantage** module: advantage estimator training, evaluation, GT labeling, and AWBC training pipeline.
 - [Feb 10 2026] Initial release of the **Model Arithmetic** module with support for both JAX and PyTorch checkpoints (not tested thoroughly).
@@ -212,7 +213,7 @@ Checkpoints are written to the config’s checkpoint directory. You can then use
 - [x] Model Arithmetic: code of different baselines for weight-space interpolation
 - [x] Stage Advantage: code, data (advantage labels), and checkpoints
 - [x] Train-Deploy Alignment: data augmentation, DAgger, inference (temporal smoothing, ensembling, RTC)
-- [ ] HuggingFace & ModelScope: upload Stage Advantage data and checkpoints
+- [x] HuggingFace & ModelScope: Stage Advantage data (`Task_A/advantage/`) and checkpoints uploaded
 
 ## Model Arithmetic

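The Model Arithmetic to-do item above refers to weight-space interpolation between checkpoints. As a rough illustration only (this is not the repo's actual implementation, and the function name is ours), linear interpolation of two compatible state dicts can be sketched as:

```python
def interpolate_weights(sd_a, sd_b, alpha=0.5):
    """Linearly interpolate two state dicts with identical keys and shapes.

    Works for PyTorch tensors or NumPy arrays alike, since both support
    scalar multiplication and elementwise addition. Plain floats are used
    in the test below for the same reason.
    """
    if sd_a.keys() != sd_b.keys():
        raise ValueError("checkpoints have mismatched parameter names")
    # alpha=0 reproduces checkpoint A, alpha=1 reproduces checkpoint B
    return {k: (1.0 - alpha) * sd_a[k] + alpha * sd_b[k] for k in sd_a}
```

Real weight-space merging baselines vary (uniform averaging, task arithmetic, per-layer coefficients); this sketch covers only the simplest linear case.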
stage_advantage/README.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ This module implements a pipeline for training an **Advantage Estimator** and us
 
 **End-to-end order for AWBC:** (1) Stage 0 on data with `progress` → optional for Stage 1. (2) Stage 1 → train estimator. (3) Stage 2 → run eval on your dataset so it gets `data_PI06_100000/` or `data_KAI0_100000/` with advantage columns. (4) Run Stage 0 again with `--advantage-source absolute_advantage` on that dataset (e.g. via `gt_labeling.sh` with `DATA_PATH` = the repo you ran eval on, and source subdirs `data_PI06_100000` / `data_KAI0_100000`). (5) Point AWBC config `repo_id` at the resulting advantage-labeled directory and run Stage 3 training.
 
-**Pre-annotated data:** The downloaded dataset includes **`data/Task_A/advantage`**, a fully annotated advantage dataset that can be used **directly for AWBC training** (Stage 3) without running Stage 0–2. Set the AWBC config `repo_id` to that path and run training.
+**Pre-annotated data:** The released dataset includes **`Task_A/advantage/`**, a fully annotated advantage dataset that can be used **directly for AWBC training** (Stage 3) without running Stage 0–2. It is available in both the [Hugging Face](https://huggingface.co/datasets/OpenDriveLab-org/Kai0) and [ModelScope](https://www.modelscope.cn/datasets/OpenDriveLab/Kai0) dataset repos. After downloading (e.g. via `scripts/download_dataset.py`), set the AWBC config `repo_id` to the local path (e.g. `<repo_root>/data/Task_A/advantage`) and run training.
 
 ---

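The new README text assumes the dataset has been downloaded via `scripts/download_dataset.py`, whose exact flags are not shown in this diff. As an alternative sketch, `huggingface_hub` can pull just the advantage labels directly (the helper names here are ours, not part of the repo; ModelScope hosts the same files):

```python
from pathlib import Path


def advantage_repo_id(repo_root):
    """Local path to the pre-annotated advantage dataset after download.

    The AWBC config's `repo_id` should point here.
    """
    return str(Path(repo_root) / "data" / "Task_A" / "advantage")


def fetch_advantage_labels(local_dir="data"):
    """Download only Task_A/advantage/ from the Kai0 dataset repo on Hugging Face.

    Requires `pip install huggingface_hub`; imported lazily so the path
    helper above works without the library installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="OpenDriveLab-org/Kai0",
        repo_type="dataset",
        allow_patterns=["Task_A/advantage/*"],  # skip the rest of the dataset
        local_dir=local_dir,
    )
```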
stage_advantage/awbc/README.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ Each uses `base_config=DataConfig(prompt_from_task=True)` so that the dataset’
 1. **Advantage dataset**
 The data must have `task_index` in each parquet and `meta/tasks.jsonl` (prompt strings per `task_index`).
 
-**Pre-annotated data:** The downloaded dataset includes **`data/Task_A/advantage`**, a fully annotated advantage dataset that can be used **directly for AWBC training** (no need to run Stage 0–2 first). Set the AWBC config `repo_id` to that path and run the training commands below.
+**Pre-annotated data:** The released dataset includes **`Task_A/advantage/`**, a fully annotated advantage dataset that can be used **directly for AWBC training** (no need to run Stage 0–2 first). It is available in both the [Hugging Face](https://huggingface.co/datasets/OpenDriveLab-org/Kai0) and [ModelScope](https://www.modelscope.cn/datasets/OpenDriveLab/Kai0) dataset repos. After downloading, set the AWBC config `repo_id` to the local path (e.g. `<repo_root>/data/Task_A/advantage`) and run the training commands below.
 
 To build your own advantage dataset instead:
 - Run **Stage 2** (eval) on your dataset → get `data_PI06_100000/` or `data_KAI0_100000/` with advantage columns.

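The layout requirements in the awbc README above (per-parquet `task_index` plus `meta/tasks.jsonl` with prompt strings) can be sanity-checked before launching training. A minimal sketch, assuming the LeRobot-style `tasks.jsonl` record shape `{"task_index": ..., "task": ...}` (the helper name is ours):

```python
import json
from pathlib import Path


def check_advantage_layout(root):
    """Lightweight structural check of an advantage dataset directory.

    Verifies that meta/tasks.jsonl exists and parses into a
    {task_index: prompt} map, and that at least one parquet file is
    present. Column-level checks (e.g. `task_index` inside each parquet)
    would need a parquet reader and are left out of this sketch.
    """
    root = Path(root)
    tasks_file = root / "meta" / "tasks.jsonl"
    if not tasks_file.is_file():
        raise FileNotFoundError(f"missing {tasks_file}")
    prompts = {}
    with tasks_file.open() as f:
        for line in f:
            rec = json.loads(line)
            prompts[rec["task_index"]] = rec["task"]
    parquets = sorted(root.rglob("*.parquet"))
    if not parquets:
        raise FileNotFoundError(f"no parquet files under {root}")
    return prompts, parquets
```

Running this against the downloaded `Task_A/advantage/` directory before pointing the AWBC config `repo_id` at it catches a missing or truncated download early.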