[Bug] Diffusion Policy: Missing image resize logic for high-res datasets (e.g., aloha_sim) #2918

@fxdllr

Ticket Type

🐛 Bug Report (Something isn't working)

Environment & System Info

LeRobot : 0.4.3
Python : 3.13.11

Description

Summary:
I'm training Diffusion Policy using the official aloha_sim_transfer_cube_scripted dataset (640x480 resolution).
The default DiffusionConfig sets crop_shape=(84, 84), but there is no automatic resize step in the pipeline.
The Problem:
The validation logic in configuration_diffusion.py only checks if crop_shape < image_shape. Since 84 < 480, it passes.
However, during training, RandomCrop(84) is applied directly to the 640x480 image. This means the model sees only a tiny crop (about 2.3% of the image: 84 × 84 = 7,056 of 640 × 480 = 307,200 pixels), which often misses the robot arm or the object entirely.
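The coverage of a single crop can be checked with a few lines of arithmetic. This is a standalone sketch: `crop_area_fraction` is a hypothetical helper written for this illustration, not a LeRobot function, and the 96x96 comparison size is taken from the example in this report.

```python
def crop_area_fraction(image_hw, crop_hw):
    """Return the fraction of the image's pixels covered by one crop."""
    img_h, img_w = image_hw
    crop_h, crop_w = crop_hw
    return (crop_h * crop_w) / (img_h * img_w)


# A 640x480 frame (width x height) is (480, 640) as (H, W).
raw = crop_area_fraction((480, 640), (84, 84))
# For comparison: the same crop taken from a frame first resized to 96x96.
resized = crop_area_fraction((96, 96), (84, 84))

print(f"84x84 crop of a 640x480 frame: {raw:.1%}")      # 2.3%
print(f"84x84 crop of a 96x96 frame:   {resized:.1%}")  # 76.6%
```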
Expected Behavior:

  1. The code should warn the user if image_shape is significantly larger than crop_shape.
  2. Or, ImageCropResizeProcessorStep should be automatically configured to resize images to something close to crop_shape (e.g., 96x96) before cropping.

Impact:
Users training on official datasets get very poor success rates because the model is trained on crops that often contain neither the robot arm nor the object.
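The two options under Expected Behavior could look roughly like the sketch below. Everything here is an assumption made for illustration: `check_crop_coverage` and `resize_target` are hypothetical names, the 25% warning threshold is arbitrary, and the 96/84 scale simply reproduces the 96x96 example above. It does not reflect how configuration_diffusion.py or ImageCropResizeProcessorStep are actually implemented.

```python
import warnings


def check_crop_coverage(image_hw, crop_hw, min_fraction=0.25):
    """Option 1: warn when a crop covers less than `min_fraction` of the image.

    Returns the covered fraction. `min_fraction` is an arbitrary, assumed threshold.
    """
    img_h, img_w = image_hw
    crop_h, crop_w = crop_hw
    if crop_h > img_h or crop_w > img_w:
        raise ValueError(f"crop_shape {crop_hw} does not fit inside image shape {image_hw}")
    fraction = (crop_h * crop_w) / (img_h * img_w)
    if fraction < min_fraction:
        warnings.warn(
            f"crop_shape {crop_hw} covers only {fraction:.1%} of a {image_hw} image; "
            "consider resizing before cropping.",
            UserWarning,
            stacklevel=2,
        )
    return fraction


def resize_target(crop_hw, scale=96 / 84):
    """Option 2: pick a resize size slightly larger than the crop (assumed scale)."""
    return tuple(round(side * scale) for side in crop_hw)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_crop_coverage((480, 640), (84, 84))                # warns: crop covers ~2.3%
    check_crop_coverage(resize_target((84, 84)), (84, 84))   # silent: crop covers ~76.6%

print(resize_target((84, 84)))  # (96, 96)
print(len(caught))              # 1
```

A warning keeps existing configurations working while making the mismatch visible; an automatic resize changes what the model sees, so it would probably need to be opt-in or clearly documented.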

Context & Reproduction

No response

Relevant logs or stack trace

Checklist

  • I have searched existing tickets to ensure this isn't a duplicate.
  • I am using the latest version of the main branch.
  • I have verified this is not an environment-specific problem.

Additional Info / Workarounds

No response

Metadata

Labels

  • bug: Something isn’t working correctly
  • dataset: Issues regarding data inputs, processing, or datasets
  • policies: Items related to robot policies
  • processor: Issue related to processor
  • training: Issues related at training time
