Skip to content

v0.1.1 Release

Latest

Choose a tag to compare

@puyuan1996 puyuan1996 released this 23 Jan 07:02
· 13 commits to main since this release

This release expands LightRFT into the video domain, optimizes evaluation pipelines for reward models, and ensures compatibility with the latest upstream inference and training libraries (specifically sglang).

✨ Multimodal & Advanced Training

  • Video Support: Expanded multimodal capabilities with the addition of Video reinforcement fine-tuning (#4).
  • Enhanced Evaluation: Implemented and optimized evaluation logic for SRM (Step-wise Reward Model) and GRM (Generative Reward Model) trainers.
  • Exploration: Added a high entropy token selection mechanism to improve generation diversity during training (#6).

📊 Benchmarks & Metrics

  • New Eval Datasets: Added support for AIME24/25 and GPQA Diamond evaluations (#27), facilitating stronger reasoning capability assessment.
  • Trajectory Metrics: Added comprehensive analysis metrics to saved trajectories for deeper insight into training dynamics (#5).
  • T2I Benchmarks: Updated GRM on T2I (Text-to-Image) benchmark results and analysis in best practices.

⚙️ Core Optimization & Compatibility

  • Library Updates: Adapted framework to support the latest versions of sglang (#24) and ensured compatibility with upstream ecosystem updates.
  • Transformers Compatibility: Standardized configuration by renaming dtype to torch_dtype for seamless integration with Hugging Face Transformers.
  • Code Polish: Removed redundant tuple nesting in prepare_reward_model return values and polished return logic.

🐛 Bug Fixes

  • Video Metadata: Resolved compatibility bugs related to video metadata handling (#25).
  • GRM Formatting: Fixed bugs in GRM dataset message formatting and evaluation logic.
  • General Fixes: Fixed file path handling bugs and resolved various linting/style issues (flake8, yapf).

📚 Documentation & Community

  • Community Health: Added standard Issue and PR templates to streamline contributions (#20).
  • Docs Automation: Established automated documentation deployment actions (#18).
  • API Docs: Polished API docstrings and Python typing for better developer experience (#21).

Full Changelog: v0.1.0...v0.1.1

Contributors: OpenDILab, System Platform Center and Safe and Trustworthy AI Center at Shanghai AI Laboratory.