v0.5.1

Latest

pan-x-c released this 12 Feb 10:16

c3d356c

Overview

Enhanced support for multi-modal models, including the Qwen2.5 VL, Qwen3 VL, and Kimi-VL-A3B-Thinking series.
Refactored the trinity command-line interface using Typer.
Added a log management tool and fixed bugs in the logging system.
Added Jensen-Shannon Divergence for on-policy distillation.
Fixed bugs in model weight synchronization and over-rollout.
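
For readers unfamiliar with the Jensen-Shannon Divergence (JSD) named in the overview, here is a minimal sketch of the quantity itself, in pure Python. It is not the implementation from #499: the function names and the teacher/student example distributions are assumptions made for illustration only. JSD averages the KL divergence of each distribution against their midpoint, which makes it symmetric and bounded by ln 2.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    # Midpoint mixture: nonzero wherever p or q is nonzero,
    # so both KL terms below are always finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical next-token distributions over a 3-token vocabulary.
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]

jsd = js_divergence(teacher, student)
print(f"JSD(teacher, student) = {jsd:.4f}")
```

Unlike plain KL, the value is the same whichever distribution is passed first, and it stays finite even when one distribution assigns zero probability to a token the other supports.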

What's Changed

Update algorithm List in README by @pan-x-c in #498
Refactor Launcher with Typer by @pan-x-c in #502
Fix memory resume by @hiyuchang in #505
Fix logger in debug mode by @pan-x-c in #504
Fix Logger in Workflow by @pan-x-c in #506
Fix over_rollout by @luyi256 in #500
Enhance support for VL models by @chenyushuo in #501
jsd implement for opd by @kokolerk in #499
Add log manager to track experiment logs by @pan-x-c in #507

New Contributors

@luyi256 made their first contribution in #500

Full Changelog: v0.5.0...v0.5.1

Contributors

chenyushuo, pan-x-c, and 3 other contributors
