Releases: EvolvingLMMs-Lab/LLaVA-OneVision-1.5

Tag: 1.5, released 26 Dec 02:32

Release v1.5

This is the first official release of LLaVA-OneVision-1.5. It focuses on documentation improvements, more reliable data packing and preprocessing, better training-configuration defaults, new evaluation content, and multiple bug fixes.

Highlights

  • Improved documentation and onboarding experience (README, training TODOs, LICENSE updates)
  • More robust data packing / WebDataset conversion with support for mixed packing
  • Added SFT data preprocessing guidance and materials
  • Introduced evaluation-related content
  • Improved training defaults (e.g., dtype bfloat16, default 4B merge, auto config updates)
  • Numerous bug fixes across packing, filtering, and demo workflows

What’s Changed

Documentation & Project Maintenance

Data Processing / Packing / Filtering

Training Defaults & Configuration

  • set processor.image_processor.max_pixels = 1600*1600 by @yiyexy in #8
  • default merge 4b by @yiyexy in #13
  • auto update config by @yiyexy in #15
  • update default dtype to bfloat16 by @yiyexy in #18
  • Remove the hosts Settings for multiple machines and only support sing… by @chengzheng345 in #56
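The two default changes above (max_pixels from PR #8 and the bfloat16 dtype from PR #18) can be sketched as follows. This is a minimal, hedged illustration: the `SimpleNamespace` stand-ins are hypothetical, and only the attribute path `processor.image_processor.max_pixels`, the `1600*1600` value, and the `bfloat16` default come from the release notes, not the repo's actual configuration code.

```python
# Hedged sketch of the updated training defaults (PRs #8 and #18).
# The objects below are stand-ins for the repo's processor and training
# config; only the attribute path and values are taken from the notes.
from types import SimpleNamespace

# Stand-in for the image processor and training config objects.
processor = SimpleNamespace(image_processor=SimpleNamespace(max_pixels=None))
train_config = SimpleNamespace(dtype=None)

# PR #8: cap input images at 1600*1600 pixels.
processor.image_processor.max_pixels = 1600 * 1600

# PR #18: default training dtype changed to bfloat16.
train_config.dtype = "bfloat16"

print(processor.image_processor.max_pixels)  # 2560000
```

In practice these values would be set on (or loaded into) the real processor and trainer configuration rather than plain namespaces.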

Evaluation

SFT Data

Bug Fixes & Robustness Improvements

  • fix a bug by @yiyexy in #30
  • fix_issue#31 by @killTheHostage in #33
  • Remove duplicate dependency py-cpuinfo libraries. by @Lornatang in #50
  • Fixed the issue that the model_path name does not correspond to the HF, causing the model to fail to load. by @Lornatang in #51
  • refactor(merge_model): Enhanced file implementation robustness by @Lornatang in #52
  • fix Stage 1.5 Mid-Training demo error. by @Lornatang in #57

New Contributors

Full Changelog: https://github.com/EvolvingLMMs-Lab/LLaVA-OneVision-1.5/commits/1.5