---
license: cc-by-nc-sa-4.0
---
- Advantage labels are coming soon.
- About the Dataset
- Step 1: Download the Dataset
- Step 2: Load the Dataset
- Dataset Structure
- License and Citation
~134 hours of real-world scenarios
## Main Tasks
- FlattenFold
- Single task
- Initial state: T-shirts are randomly tossed onto the table, presenting random crumpled configurations
- Manipulation task: Operate the robotic arm to unfold the garment, then fold it
- HangCloth
- Single task
- Initial state: Hanger is randomly placed, garment is randomly positioned on the table
- Manipulation task: Operate the robotic arm to thread the hanger through the garment, then hang it on the rod
- TeeShirtSort
- Garment classification and arrangement task
- Initial state: Randomly pick a garment from the laundry basket
- Classification: Determine whether the garment is a T-shirt or a dress shirt
- Manipulation task:
- If it is a T-shirt, fold the garment
- If it is a dress shirt, expose the collar, then push it to one side of the table
## Dataset Counts
| Task | Base (episodes / hours) | DAgger (episodes / hours) | Total (episodes / hours) |
|---|---|---|---|
| FlattenFold | 3,055 / ~42 h | 3,457 / ~13 h | 6,512 / ~55 h |
| HangCloth | 6,954 / ~61 h | 686 / ~12 h | 7,640 / ~73 h |
| TeeShirtSort | 5,988 / ~31 h | 769 / ~22 h | 6,757 / ~53 h |
| **Total** | 15,997 / ~134 h | 4,912 / ~47 h | 20,909 / ~181 h |
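As a quick sanity check, the per-task episode counts sum to the totals row; a minimal sketch:

```python
# Episode counts from the table above: (base, dagger) per task.
counts = {
    "FlattenFold": (3_055, 3_457),
    "HangCloth": (6_954, 686),
    "TeeShirtSort": (5_988, 769),
}

base_total = sum(b for b, _ in counts.values())
dagger_total = sum(d for _, d in counts.values())

assert base_total == 15_997                  # Base column total
assert dagger_total == 4_912                 # DAgger column total
assert base_total + dagger_total == 20_909   # overall episode count
```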
## Step 1: Download the Dataset

Recommended (one command, downloads to `./data`): from the repository root of kai0, run:
```shell
pip install huggingface_hub
python scripts/download_dataset.py
```

The dataset is saved under `./data` (`FlattenFold`, `HangCloth`, `TeeShirtSort`). Training and evaluation scripts expect this path by default.
Optional: download only specific tasks, or to a custom directory:

```shell
python scripts/download_dataset.py --tasks FlattenFold HangCloth --local-dir /path/to/output
```

Manual download (Hugging Face):
```shell
# Full dataset to a directory of your choice
hf download OpenDriveLab-org/Kai0 --repo-type dataset --local-dir /path/to/output
```

Or in Python:
```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="OpenDriveLab-org/Kai0",
    repo_type="dataset",
    local_dir="/path/to/output",
)
```

This dataset is in the LeRobot format (v2.1).
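If you only need some of the task folders via the Python API, `snapshot_download` also accepts `allow_patterns` glob filters. A sketch, assuming each task sits in a top-level folder of the repo (the helper names are ours, not part of any library):

```python
def task_patterns(tasks):
    """Build allow_patterns globs matching everything under the given task folders."""
    return [f"{task}/**" for task in tasks]


def download_tasks(tasks, local_dir):
    """Fetch only the selected task subsets of the Kai0 dataset."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id="OpenDriveLab-org/Kai0",
        repo_type="dataset",
        local_dir=local_dir,
        allow_patterns=task_patterns(tasks),
    )


# Example (requires network access):
# download_tasks(["FlattenFold", "HangCloth"], "/path/to/output")
```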
## Step 2: Load the Dataset

The import path of `LeRobotDataset` depends on the installed `lerobot` version:

| Version | Import |
|---|---|
| ≤ 0.1.0 | `from lerobot.common.datasets.lerobot_dataset import LeRobotDataset` |
| > 0.1.0 and < 0.4.0 | `from lerobot.datasets.lerobot_dataset import LeRobotDataset` |
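Since the import path moved between releases, a small helper can hide the difference (a sketch; the function name is ours, not part of `lerobot`):

```python
def import_lerobot_dataset():
    """Return the LeRobotDataset class, trying the newer module path first."""
    try:
        from lerobot.datasets.lerobot_dataset import LeRobotDataset  # lerobot > 0.1.0
    except ImportError:
        from lerobot.common.datasets.lerobot_dataset import LeRobotDataset  # lerobot <= 0.1.0
    return LeRobotDataset
```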
```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset  # adjust the import to your version

# After running scripts/download_dataset.py (default ./data)
dataset = LeRobotDataset("path/to/kai0/repo/data/FlattenFold/base")  # or a local path to a task subset
```

For `lerobot` ≥ 0.4.0, migrate v2.1 → v3.0 first (LeRobot dataset v3 migration):
```shell
python -m lerobot.datasets.v30.convert_dataset_v21_to_v30 --repo-id=OpenDriveLab-org/Kai0
```

## Dataset Structure

Under each task directory, data is partitioned into two subsets: `base` and `dagger`.
- `base` — original demonstration trajectories for garment manipulation.
- `dagger` — on-policy recovery trajectories from iterative DAgger (failure-recovery modes).
```
Kai0-data/
├── FlattenFold/
│   ├── base/
│   │   ├── data/
│   │   │   ├── chunk-000/
│   │   │   │   ├── episode_000000.parquet
│   │   │   │   ├── episode_000001.parquet
│   │   │   │   └── ...
│   │   │   └── ...
│   │   ├── videos/
│   │   │   ├── chunk-000/
│   │   │   │   ├── observation.images.hand_left/
│   │   │   │   │   ├── episode_000000.mp4
│   │   │   │   │   ├── episode_000001.mp4
│   │   │   │   │   └── ...
│   │   │   │   ├── observation.images.hand_right/
│   │   │   │   │   ├── episode_000000.mp4
│   │   │   │   │   ├── episode_000001.mp4
│   │   │   │   │   └── ...
│   │   │   │   ├── observation.images.top_head/
│   │   │   │   │   ├── episode_000000.mp4
│   │   │   │   │   ├── episode_000001.mp4
│   │   │   │   │   └── ...
│   │   │   │   └── ...
│   │   │   └── ...
│   │   └── meta/
│   │       ├── info.json
│   │       ├── episodes.jsonl
│   │       ├── tasks.jsonl
│   │       └── episodes_stats.jsonl
│   └── dagger/
├── HangCloth/
│   ├── base/
│   └── dagger/
├── TeeShirtSort/
│   ├── base/
│   └── dagger/
└── README.md
```
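Given this layout, per-subset episode counts can be tallied straight from the parquet files; a minimal sketch using only the standard library (the helper name is ours):

```python
from pathlib import Path


def count_episodes(task_dir):
    """Count episode parquet files in the base/ and dagger/ subsets of one task directory."""
    counts = {}
    for subset in ("base", "dagger"):
        data_dir = Path(task_dir) / subset / "data"
        # Episodes live under data/chunk-*/episode_*.parquet
        counts[subset] = (
            sum(1 for _ in data_dir.rglob("episode_*.parquet")) if data_dir.exists() else 0
        )
    return counts


# Example: count_episodes("data/FlattenFold") -> {"base": ..., "dagger": ...}
```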
The basic structure of `info.json`:

```jsonc
{
    "codebase_version": "v2.1",
    "robot_type": "agilex",
    "total_episodes": ...,  // total number of episodes in the dataset
    "total_frames": ...,    // total number of frames in any single camera perspective
    "total_tasks": ...,     // total number of tasks
    "total_videos": ...,    // total number of videos across all camera perspectives
    "total_chunks": ...,    // number of chunks in the dataset
    "chunks_size": ...,     // maximum number of episodes per chunk
    "fps": ...,             // video frame rate (frames per second)
    "splits": {             // how the dataset is split
        "train": ...
    },
    "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet",
    "video_path": "videos/chunk-{episode_chunk:03d}/{video_key}/episode_{episode_index:06d}.mp4",
    "features": {
        "observation.images.top_head": {  // camera perspective
            "dtype": "video",
            "shape": [480, 640, 3],
            "names": ["height", "width", "channel"],
            "info": {
                "video.height": 480,
                "video.width": 640,
                "video.codec": "av1",
                "video.pix_fmt": "yuv420p",
                "video.is_depth_map": false,
                "video.fps": 30,
                "video.channels": 3,
                "has_audio": false
            }
        },
        "observation.images.hand_left": {  // camera perspective
            ...
        },
        "observation.images.hand_right": {  // camera perspective
            ...
        },
        "observation.state": { "dtype": "float32", "shape": [14], "names": null },
        "action":            { "dtype": "float32", "shape": [14], "names": null },
        "timestamp":         { "dtype": "float32", "shape": [1],  "names": null },
        "frame_index":       { "dtype": "int64",   "shape": [1],  "names": null },
        "episode_index":     { "dtype": "int64",   "shape": [1],  "names": null },
        "index":             { "dtype": "int64",   "shape": [1],  "names": null },
        "task_index":        { "dtype": "int64",   "shape": [1],  "names": null }
    }
}
```

| Field Name | Shape | Meaning |
|---|---|---|
| observation.state | [N, 14] | Joint angles: left arm `[:, :6]`, right arm `[:, 7:13]`; gripper opening: left `[:, 6]`, right `[:, 13]` |
| action | [N, 14] | Same layout as `observation.state`: joint angles for left `[:, :6]` / right `[:, 7:13]` arms, gripper opening for left `[:, 6]` / right `[:, 13]` |
| timestamp | [N, 1] | Time elapsed since the start of the episode (in seconds) |
| frame_index | [N, 1] | Index of this frame within the current episode (0-indexed) |
| episode_index | [N, 1] | Index of the episode this frame belongs to |
| index | [N, 1] | Global unique index across all frames in the dataset |
| task_index | [N, 1] | Index identifying the task type being performed |
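To make the 14-dim layout concrete, here is how a single state (or action) vector splits into per-arm joints and grippers (the helper name is ours; plain Python, no dependencies):

```python
def split_state(vec):
    """Split a 14-dim state/action vector into per-arm joint angles and gripper openings."""
    if len(vec) != 14:
        raise ValueError("expected a 14-dim vector")
    return {
        "left_joints": vec[:6],     # left arm joint angles
        "left_gripper": vec[6],     # left gripper opening
        "right_joints": vec[7:13],  # right arm joint angles
        "right_gripper": vec[13],   # right gripper opening
    }
```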
`tasks.jsonl` contains the task language prompts (natural-language instructions); each entry maps a `task_index` to its description, for language-conditioned policy training.
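Assuming the LeRobot v2.1 `tasks.jsonl` convention of one JSON object per line with `task_index` and `task` fields, the mapping can be loaded like this:

```python
import json


def load_task_prompts(path):
    """Map task_index -> natural-language prompt from a tasks.jsonl file."""
    prompts = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                rec = json.loads(line)
                prompts[rec["task_index"]] = rec["task"]
    return prompts
```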
## License and Citation

All data and code in this repo are released under CC BY-NC-SA 4.0. Please consider citing our project if it helps your research.
```bibtex
@misc{,
  title={},
  author={},
  howpublished={\url{}},
  year={}
}
```