Ticket Type
🐛 Bug Report (Something isn't working)
Environment & System Info
- LeRobot version: 0.4.4
- Platform: Linux-5.15.0-1072-nvidia-x86_64-with-glibc2.35
- Python version: 3.10.16
- Huggingface Hub version: 0.35.3
- Datasets version: 4.1.1
- Numpy version: 2.2.6
- FFmpeg version: 4.4.2-0ubuntu0.22.04.1
- PyTorch version: 2.7.1+cu126
- Is PyTorch built with CUDA support?: True
- Cuda version: 12.6
- GPU model: NVIDIA A100-SXM4-80GB
- lerobot scripts: ['lerobot-calibrate', 'lerobot-dataset-viz', 'lerobot-edit-dataset', 'lerobot-eval', 'lerobot-find-cameras', 'lerobot-find-joint-limits', 'lerobot-find-port', 'lerobot-imgtransform-viz', 'lerobot-info', 'lerobot-record', 'lerobot-replay', 'lerobot-setup-can', 'lerobot-setup-motors', 'lerobot-teleoperate', 'lerobot-train', 'lerobot-train-tokenizer']
Description
I am following the documentation from the Pi0 FAST page introduced in PR #2734. I am running the following command, taken from that page:
lerobot-train \
--dataset.repo_id=lerobot/libero \
--output_dir=outputs/libero_pi0fast \
--job_name=libero_pi0fast \
--policy.path=lerobot/pi0fast-base \
--policy.dtype=bfloat16 \
--steps=10000 \
--save_freq=2000 \
--batch_size=32 \
--policy.device=cuda \
--policy.compile_model=true \
--policy.scheduler_warmup_steps=4000 \
--policy.scheduler_decay_steps=100000 \
--policy.scheduler_decay_lr=1e-5 \
--policy.gradient_checkpointing=true \
--policy.chunk_size=10 \
--policy.n_action_steps=10 \
--policy.max_action_tokens=256 \
--policy.empty_cameras=1 \
--policy.push_to_hub=false \
--dataset.use_imagenet_stats=false
However, the code errors out because the Pi0FAST config expects image keys like so:
{
'observation.images.base_0_rgb': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224)),
'observation.images.left_wrist_0_rgb': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224)),
'observation.images.right_wrist_0_rgb': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224)),
'observation.images.empty_camera_0': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224))}
But the lerobot/libero dataset given in the command has only:
'observation.images.image', 'observation.images.image2'
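For reference, this is how I listed the dataset's image keys (a minimal sketch; the import path is what I use with LeRobot 0.4.x and may differ in other versions):

# Sketch: print the visual feature keys of lerobot/libero so they can be
# compared against the keys the Pi0FAST config expects (base_0_rgb, left_wrist_0_rgb, ...).
from lerobot.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("lerobot/libero")
for key, feature in dataset.features.items():
    if key.startswith("observation.images"):
        print(key, feature)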
The number of image streams expected by the Pi0FAST model is greater than what the dataset provides. Also, the lerobot/libero dataset given in the command seems to be a partial version, since it does not have videos for all episodes: the videos folder contains only 36 mp4 files, while the dataset reports 1693 episodes in total.
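For context, this is roughly how I compared the video files on disk against the episode count (a sketch; dataset.root and dataset.num_episodes are the attributes I relied on, so adjust if they differ in your version):

# Sketch: count downloaded .mp4 files and compare to the reported episode count.
from pathlib import Path
from lerobot.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("lerobot/libero")
print("episodes:", dataset.num_episodes)
print("mp4 files:", len(list(Path(dataset.root).rglob("*.mp4"))))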
Could you please provide the exact command for reproducing the training run, along with the dataset used? If possible, could you also share the Wandb logs so I can compare my run against them?
As a side note, did you enable dataset.use_imagenet_stats for the training run? It was throwing an error, so I disabled it.
Context & Reproduction
No response
Relevant logs or stack trace
Checklist
- I have searched existing tickets to ensure this isn't a duplicate.
- I am using the latest version of the main branch.
- I have verified this is not an environment-specific problem.
Additional Info / Workarounds
No response