@jadechoghari jadechoghari commented Dec 30, 2025

Title

feat(policies): add autoregressive VLAs with tokenization PiFast

This PR brings autoregressive Vision-Language-Action (VLA) models back to LeRobot, alongside the existing flow-matching–based policies.

Unlike flow matching, which predicts actions in parallel over a horizon, autoregressive VLAs model actions sequentially as discrete tokens.
As a first step toward supporting multiple action tokenizers, this PR introduces PiFast together with a training script for FAST tokenization. Together they provide a concrete reference implementation for autoregressive action modeling in LeRobot.
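
For reference, sequential modeling over discrete tokens is usually written as the standard autoregressive factorization, trained with a token-level cross-entropy loss. The notation is assumed here for illustration and is not taken from the PR: $o$ is the image, language, and state prefix, $a_t$ is the $t$-th action token, and $T$ is the number of action tokens.

$$
p_\theta(a_{1:T} \mid o) = \prod_{t=1}^{T} p_\theta(a_t \mid a_{<t}, o),
\qquad
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta(a_t \mid a_{<t}, o)
$$

Each factor conditions on all previously generated tokens, which is why decoding is sequential, in contrast to predicting the whole horizon in parallel.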

Future work will extend this framework to additional tokenizers and autoregressive variants.

TODO:
2- Provide PiFast pretrained checkpoints and unveil the new HF LeRobot AR VLA work.
3- Add testing and docs.

DONE:
1- Trained and evaluated successfully on LIBERO; we will share the checkpoints along with the results.
2- Added KV-caching support for faster inference (a must for this PR): https://mett29.github.io/posts/kv-cache/
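
To illustrate why KV-caching matters for autoregressive decoding, here is a minimal sketch in plain Python. It is not the code in this PR: the single-head attention, the toy weight matrices `W_Q`, `W_K`, `W_V`, and the function names are all hypothetical. It only shows that caching keys and values gives the same outputs as recomputing them for the whole prefix at every step.

```python
import math

def project(x, matrix):
    """Multiply vector x by a weight matrix stored as a list of rows."""
    return [sum(xi * w for xi, w in zip(x, row)) for row in matrix]

def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]

# Hypothetical toy projection weights (2x2), chosen only for illustration.
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[0.5, 0.1], [-0.2, 0.8]]
W_V = [[0.3, -0.4], [0.7, 0.2]]

def decode_without_cache(tokens):
    """Recompute keys and values for the whole prefix at every step."""
    outputs = []
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(x, W_K) for x in prefix]
        values = [project(x, W_V) for x in prefix]
        outputs.append(attend(project(tokens[t], W_Q), keys, values))
    return outputs

def decode_with_cache(tokens):
    """Project each token once and reuse the stored keys and values."""
    cached_keys, cached_values = [], []
    outputs = []
    for x in tokens:
        cached_keys.append(project(x, W_K))
        cached_values.append(project(x, W_V))
        outputs.append(attend(project(x, W_Q), cached_keys, cached_values))
    return outputs

# Toy "token embeddings" standing in for already-embedded action tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
slow = decode_without_cache(tokens)
fast = decode_with_cache(tokens)
```

Without the cache, the number of key/value projections grows quadratically with the sequence length; with it, each token is projected once.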

Copilot AI review requested due to automatic review settings December 30, 2025 15:59
@jadechoghari jadechoghari added the policies Items related to robot policies label Dec 30, 2025
@github-actions github-actions bot added the processor Issue related to processor label Dec 30, 2025
Copilot AI left a comment

Pull request overview

This PR introduces autoregressive Vision-Language-Action (VLA) models to LeRobot, implementing PiFast alongside the existing flow-matching policies. Unlike flow matching, which predicts actions in parallel over a horizon, this implementation models actions sequentially as discrete tokens using the FAST (Frequency-space Action Sequence Tokenization) tokenizer. The PR provides a complete reference implementation, including model architecture, training scripts, and processor pipelines.
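
As a rough illustration of how a frequency-space tokenizer turns a continuous action chunk into discrete symbols, the sketch below applies a DCT to a one-dimensional trajectory and rounds the scaled coefficients to integers. This is a simplified stand-in, not the tokenizer shipped in this PR: the `SCALE` value, the example trajectory, and the function names are assumptions, and the BPE compression stage is omitted entirely.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(norm * total)
    return coeffs

def idct(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += norm * c * math.cos(math.pi * (i + 0.5) * k / n)
        signal.append(total)
    return signal

SCALE = 10.0  # hypothetical quantization scale; larger means finer resolution

def encode(actions):
    """Map a continuous action chunk to integer symbols (DCT, scale, round)."""
    return [round(SCALE * c) for c in dct(actions)]

def decode(symbols):
    """Approximately recover the action chunk from the integer symbols."""
    return idct([s / SCALE for s in symbols])

# A smooth one-dimensional action trajectory over an 8-step chunk (made-up values).
chunk = [0.00, 0.10, 0.19, 0.27, 0.33, 0.37, 0.39, 0.40]
symbols = encode(chunk)
restored = decode(symbols)
max_error = max(abs(a - b) for a, b in zip(chunk, restored))
```

Because the transform is orthonormal, the reconstruction error is bounded by the rounding error on the coefficients, so `SCALE` trades sequence compressibility against precision. In a full pipeline the integer symbols would then be compressed into tokens before being mapped into the language model's vocabulary.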

Key Changes:

  • Implements PI0Fast policy with autoregressive action token prediction using cross-entropy loss
  • Adds FAST tokenizer integration for converting continuous actions to discrete tokens via DCT coefficients and BPE
  • Introduces custom attention masking patterns supporting bidirectional attention for images/language and causal attention for action tokens
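
The masking pattern in the last bullet can be sketched as a prefix-LM style boolean mask. This is a minimal illustration under assumptions, not the masking code in this PR: the function name is hypothetical, and it assumes image/language prefix tokens never attend to action tokens.

```python
def build_prefix_lm_mask(num_prefix, num_action):
    """Return mask[query][key], True where the query may attend to the key.

    Prefix (image/language) tokens attend to each other bidirectionally.
    Action tokens attend to the whole prefix and causally to earlier actions.
    """
    total = num_prefix + num_action
    mask = []
    for query in range(total):
        row = []
        for key in range(total):
            if key < num_prefix:
                allowed = True  # every token may see the prefix
            else:
                allowed = query >= key  # action keys are visible only to the same or later positions
            row.append(allowed)
        mask.append(row)
    return mask

mask = build_prefix_lm_mask(num_prefix=3, num_action=2)
for row in mask:
    print(" ".join("1" if allowed else "0" for allowed in row))
# prints:
# 1 1 1 0 0
# 1 1 1 0 0
# 1 1 1 0 0
# 1 1 1 1 0
# 1 1 1 1 1
```

The upper-left block is fully visible (bidirectional prefix), while the lower-right block is lower-triangular (causal action tokens).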

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| `src/lerobot/utils/constants.py` | Adds constants for action tokens and token masks |
| `src/lerobot/processor/tokenizer_processor.py` | Implements ActionTokenizerProcessorStep for tokenizing actions using FAST with PaliGemma token space conversion |
| `src/lerobot/processor/__init__.py` | Exports ActionTokenizerProcessorStep for use in pipelines |
| `src/lerobot/policies/pi0_fast/train_fast_tokenizer.py` | Provides training script for FAST tokenizer with delta transforms, normalization, and compression statistics |
| `src/lerobot/policies/pi0_fast/processor_pi0_fast.py` | Creates pre/post-processor pipelines including state discretization and language tokenization |
| `src/lerobot/policies/pi0_fast/modeling_pi0_fast.py` | Implements core PI0FastPytorch model with PaliGemma+Gemma expert architecture and autoregressive decoding |
| `src/lerobot/policies/pi0_fast/configuration_pi0_fast.py` | Defines PI0FastConfig with model hyperparameters and training settings |
| `src/lerobot/policies/pi0_fast/__init__.py` | Exports PI0Fast components for module access |
| `src/lerobot/policies/factory.py` | Registers PI0FastPolicy in the policy factory |
| `src/lerobot/policies/__init__.py` | Exports PI0FastConfig at package level |

@jadechoghari jadechoghari self-assigned this Dec 30, 2025
@github-actions github-actions bot added the tests Problems with test coverage, failures, or improvements to testing label Jan 6, 2026
@github-actions github-actions bot added the documentation Improvements or fixes to the project’s docs label Jan 6, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
