Unable to run Pi0 Fast Training #2890

@shivakanthsujit

Ticket Type

🐛 Bug Report (Something isn't working)

Environment & System Info

- LeRobot version: 0.4.4
- Platform: Linux-5.15.0-1072-nvidia-x86_64-with-glibc2.35
- Python version: 3.10.16
- Huggingface Hub version: 0.35.3
- Datasets version: 4.1.1
- Numpy version: 2.2.6
- FFmpeg version: 4.4.2-0ubuntu0.22.04.1
- PyTorch version: 2.7.1+cu126
- Is PyTorch built with CUDA support?: True
- Cuda version: 12.6
- GPU model: NVIDIA A100-SXM4-80GB
- lerobot scripts: ['lerobot-calibrate', 'lerobot-dataset-viz', 'lerobot-edit-dataset', 'lerobot-eval', 'lerobot-find-cameras', 'lerobot-find-joint-limits', 'lerobot-find-port', 'lerobot-imgtransform-viz', 'lerobot-info', 'lerobot-record', 'lerobot-replay', 'lerobot-setup-can', 'lerobot-setup-motors', 'lerobot-teleoperate', 'lerobot-train', 'lerobot-train-tokenizer']

Description

I am following the documentation on the Pi0 FAST page introduced in PR #2734, and I am running the following command, taken from that page.

lerobot-train \
  --dataset.repo_id=lerobot/libero \
  --output_dir=outputs/libero_pi0fast \
  --job_name=libero_pi0fast \
  --policy.path=lerobot/pi0fast-base \
  --policy.dtype=bfloat16 \
  --steps=10000 \
  --save_freq=2000 \
  --batch_size=32 \
  --policy.device=cuda \
  --policy.compile_model=true \
  --policy.scheduler_warmup_steps=4000 \
  --policy.scheduler_decay_steps=100000 \
  --policy.scheduler_decay_lr=1e-5 \
  --policy.gradient_checkpointing=true \
  --policy.chunk_size=10 \
  --policy.n_action_steps=10 \
  --policy.max_action_tokens=256 \
  --policy.empty_cameras=1 \
  --policy.push_to_hub=false \
  --dataset.use_imagenet_stats=false

However, training errors out because the Pi0Fast config expects the following image keys:

{
'observation.images.base_0_rgb': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224)), 
'observation.images.left_wrist_0_rgb': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224)), 
'observation.images.right_wrist_0_rgb': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224)), 
'observation.images.empty_camera_0': PolicyFeature(type=<FeatureType.VISUAL: 'VISUAL'>, shape=(3, 224, 224))}

But the lerobot/libero dataset given in the command has only these two:

'observation.images.image', 'observation.images.image2'
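To make the mismatch explicit, here is a minimal sketch that compares the two sets of keys quoted above. It uses only plain Python sets; `diff_image_keys` is a hypothetical helper written for this report, not a LeRobot function.

```python
# Image keys copied from the Pi0Fast config dump quoted in this report.
expected_keys = {
    "observation.images.base_0_rgb",
    "observation.images.left_wrist_0_rgb",
    "observation.images.right_wrist_0_rgb",
    "observation.images.empty_camera_0",
}

# Image keys found in the lerobot/libero dataset, as listed in this report.
dataset_keys = {
    "observation.images.image",
    "observation.images.image2",
}


def diff_image_keys(expected, available):
    """Hypothetical helper: return (missing, unexpected) as sorted lists.

    missing    -> keys the policy config expects but the dataset lacks
    unexpected -> keys the dataset has but the policy config does not list
    """
    expected, available = set(expected), set(available)
    return sorted(expected - available), sorted(available - expected)


missing, unexpected = diff_image_keys(expected_keys, dataset_keys)
print("Expected by config, missing from dataset:", missing)
print("In dataset, not expected by config:", unexpected)
```

None of the key names overlap, so all four expected keys are reported as missing and both dataset keys as unexpected.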

The FAST model expects more images than the dataset provides. Also, the lerobot/libero dataset given in the command seems to be a partial version, since it does not appear to have images for all the episodes: the videos folder contains only 36 mp4 files, but the dataset has 1693 episodes in total.
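The file count above can be reproduced on a local copy of the dataset with a short script. This is a sketch under the assumption that the dataset has already been downloaded to disk; `count_mp4_files` is a hypothetical helper, and the directory you pass to it depends on where your copy lives.

```python
from pathlib import Path


def count_mp4_files(root):
    """Hypothetical helper: count .mp4 files anywhere under `root`."""
    return sum(1 for path in Path(root).rglob("*.mp4") if path.is_file())
```

Calling `count_mp4_files(...)` on the local videos folder gives the number to compare against the episode count reported by the dataset metadata.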

Could you please provide the exact command for reproducing the training run, with the dataset used? If possible could you also provide Wandb logs so I can compare my run?

As a side note, did you enable dataset.use_imagenet_stats for the training run? It threw an error for me, so I disabled it.

Context & Reproduction

No response

Relevant logs or stack trace

Checklist

  • I have searched existing tickets to ensure this isn't a duplicate.
  • I am using the latest version of the main branch.
  • I have verified this is not an environment-specific problem.

Additional Info / Workarounds

No response

Metadata

Assignees

No one assigned

Labels

  • bug: Something isn’t working correctly
  • configuration: Problems with configuration files or settings
  • dataset: Issues regarding data inputs, processing, or datasets
  • documentation: Improvements or fixes to the project’s docs
  • performance: Issues aimed at improving speed or resource usage
  • policies: Items related to robot policies
  • training: Issues related at training time
