Add LeRobot Data Handler for Dataset Export #41

Merged
yuecideng merged 32 commits into main from yhn/to_dataset
Jan 12, 2026

@yhnsu (Collaborator) commented Dec 17, 2025

Description

This PR introduces a new dataset functor architecture in EmbodiChain, enabling efficient export of environment episodes to the LeRobot dataset format. The design adopts a functor-based manager pattern, supporting multi-modal data and flexible configuration.

Key Features

  • Functor Pattern: Dataset export logic is encapsulated in functor classes (e.g., LeRobotRecorder)
  • Unified Manager: DatasetManager orchestrates functor calls per environment step, supporting multiple dataset formats
  • Multi-Modal Support: Handles RGB images (including stereo), proprioceptive states, and actions
  • Flexible Configuration: All parameters (e.g., video/image mode, metadata) are configurable via gym config
  • Consistent API: Usage matches event/observation manager patterns
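The functor and manager roles listed above can be illustrated with a minimal sketch. It is not the EmbodiChain implementation: `ListRecorder`, `ToyDatasetManager`, and the `(obs, action)` call signature are hypothetical names chosen for illustration, and the real `LeRobotRecorder` and `DatasetManager` interfaces may differ.

```python
from typing import Any, Callable, Dict, List


class ListRecorder:
    """Hypothetical stand-in for a recorder functor such as LeRobotRecorder."""

    def __init__(self) -> None:
        self.frames: List[Dict[str, Any]] = []

    def __call__(self, obs: Any, action: Any) -> None:
        # A real recorder would buffer images, proprioceptive state, and actions.
        self.frames.append({"obs": obs, "action": action})


class ToyDatasetManager:
    """Hypothetical manager: calls every registered functor once per step."""

    def __init__(self, functors: Dict[str, Callable[[Any, Any], None]]) -> None:
        self.functors = functors

    def step(self, obs: Any, action: Any) -> None:
        for functor in self.functors.values():
            functor(obs, action)


recorder = ListRecorder()
manager = ToyDatasetManager({"toy": recorder})
manager.step(obs=[0.0, 1.0], action=[0.5])
manager.step(obs=[0.1, 1.1], action=[0.4])
```

Because the manager only requires a callable, a second export format would be one more entry in the dictionary, which is presumably what makes the pattern extensible.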

Usage Example

# In gym config (JSON):
"dataset": {
    "lerobot": {
        "func": "LeRobotRecorder",
        "mode": "save",
        "params": {
            "save_path": "/tmp/lerobot_data",
            "robot_meta": {...},
            "extra": {"scene_type": "kitchen"},
            "use_videos": true
        }
    }
}
# In the environment loop (Python):
obs, info = env.reset()
for step in range(max_steps):
    action = ...
    # Assumes a Gymnasium-style API, as the (obs, info) reset suggests
    obs, reward, terminated, truncated, info = env.step(action)
    # DatasetManager automatically records and saves episodes
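
One way a config entry like the one above could be turned into a functor instance is a name-to-class registry. The sketch below is an assumption about the mechanism, not the project's code: `REGISTRY`, `register`, `build_functors`, and `DummyRecorder` are hypothetical, the `robot_meta` entry is left out because its contents are elided in the example, and the `mode` field is read past without being interpreted.

```python
import json
from typing import Any, Dict, Optional, Type

# Hypothetical registry mapping the "func" string to a functor class.
REGISTRY: Dict[str, Type] = {}


def register(cls: Type) -> Type:
    """Class decorator that records a functor class under its own name."""
    REGISTRY[cls.__name__] = cls
    return cls


@register
class DummyRecorder:
    """Hypothetical recorder that only stores the parameters it receives."""

    def __init__(self, save_path: str, use_videos: bool = False,
                 extra: Optional[Dict[str, Any]] = None) -> None:
        self.save_path = save_path
        self.use_videos = use_videos
        self.extra = extra or {}


def build_functors(dataset_cfg: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Instantiate one functor per config entry from its "func" and "params"."""
    functors = {}
    for name, entry in dataset_cfg.items():
        cls = REGISTRY[entry["func"]]
        functors[name] = cls(**entry.get("params", {}))
    return functors


raw_cfg = """
{"dataset": {"lerobot": {"func": "DummyRecorder", "mode": "save",
  "params": {"save_path": "/tmp/lerobot_data",
             "extra": {"scene_type": "kitchen"},
             "use_videos": true}}}}
"""
functors = build_functors(json.loads(raw_cfg)["dataset"])
```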

Technical Details

  • Efficient Storage: Data is buffered and saved per episode, supporting parallel environments
  • Extensible: New dataset formats can be added by implementing functor classes
  • Configurable: All functor parameters are set via config
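
Per-episode buffering across parallel environments might look roughly like the sketch below. `EpisodeBuffer` and its methods are hypothetical, and the "save" is simulated by appending to an in-memory list; an actual exporter would write the episode to disk in the LeRobot format at that point.

```python
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List

Frame = Dict[str, Any]


class EpisodeBuffer:
    """Hypothetical per-environment frame buffer flushed at episode end."""

    def __init__(self) -> None:
        self._frames: DefaultDict[int, List[Frame]] = defaultdict(list)
        self.saved_episodes: List[List[Frame]] = []

    def add(self, env_id: int, frame: Frame) -> None:
        # Frames are kept apart per environment so episodes never interleave.
        self._frames[env_id].append(frame)

    def pending(self, env_id: int) -> int:
        # Number of buffered frames not yet saved for this environment.
        return len(self._frames.get(env_id, []))

    def end_episode(self, env_id: int) -> int:
        # Flush one environment's frames as a single episode.
        frames = self._frames.pop(env_id, [])
        if frames:
            self.saved_episodes.append(frames)
        return len(frames)


buf = EpisodeBuffer()
for t in range(3):
    for env_id in range(2):
        buf.add(env_id, {"t": t})
    if t == 1:
        buf.end_episode(0)  # environment 0 finishes early
```

Here environment 0 ends its episode after two steps while environment 1 keeps accumulating frames, so each episode is written as one unit regardless of when its neighbors finish.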

Dependencies

  • Requires lerobot package for LeRobot export: pip install lerobot
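
Since `lerobot` is only needed for this export path, code that uses it may want to check for the package before enabling the recorder. The helper below is a sketch using the standard library's `importlib.util.find_spec`; the name `has_package` is hypothetical, and whether EmbodiChain performs such a check is not stated in this PR.

```python
import importlib.util


def has_package(name: str) -> bool:
    """Return True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(name) is not None


if not has_package("lerobot"):
    # The install hint mirrors the command given in this PR description.
    install_hint = "LeRobot export requires: pip install lerobot"
else:
    install_hint = ""
```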

Type of Change

  • New feature (LeRobot export)

Benefits

  • Performance: Episode data is buffered and saved once per episode
  • Extensibility: Easy to add new dataset formats
  • Consistency: Unified manager/functor pattern across events, observations, datasets
  • Interoperability: Direct export to LeRobot format for visualization and training

@yhnsu yhnsu requested a review from yuecideng December 17, 2025 08:29
@yuecideng yuecideng changed the title draft: Add LeRobot Data Handler for Dataset Export [Draft]: Add LeRobot Data Handler for Dataset Export Dec 17, 2025
@yuecideng yuecideng marked this pull request as draft December 17, 2025 08:32
@yuecideng yuecideng changed the title [Draft]: Add LeRobot Data Handler for Dataset Export Add LeRobot Data Handler for Dataset Export Jan 11, 2026
@yuecideng yuecideng marked this pull request as ready for review January 11, 2026 08:34
@yuecideng yuecideng merged commit a0cf9c7 into main Jan 12, 2026
9 of 10 checks passed
@yuecideng yuecideng deleted the yhn/to_dataset branch January 12, 2026 13:42
yangchen73 pushed a commit that referenced this pull request Jan 20, 2026
Co-authored-by: yuanhaonan <yuanhaonan@dexforce.top>
Co-authored-by: Yueci Deng <dengyueci@qq.com>