Releases: EvolvingLMMs-Lab/lmms-engine
[v0.1.3] More model and parallelism coverage, with features and bug fixes
What's Changed
- doc: fix github link by @mwxely in #65
- fix: Fix llava ov batched image padding issue by @kcz358 in #72
- [test] add test for qwen2.5omni, fix qwen2.5omni example by @ngquangtrung57 in #71
- [feat] support BAGEL training with Liger Kernel by @pufanyi in #74
- [docs] Fix BAGEL model packing status in README by @pufanyi in #79
- Enhance MFU reference document introduction by @kcz358 in #82
- add linux uv sync script with automatic platform detection by @oneScotch in #81
- [feat] Qwen3 MoE EP Support by @Jinghao-Guo in #75
- [docs] Add Docker usage instructions to README by @pangyyyyy in #85
- add llada and dream arch examples for dllm training by @JinjieNi in #84
- [feat] Qwen 3 Omni MOE with EP support by @ngquangtrung57 in #88
- [docs] correct doc: diffusion language model by @KemingWu in #89
- Update README.md by @kcz358 in #90
- [fix]: Remove rank == 0 in all makedirs (#93) by @VietCT04 in #94
- [feat] Qwen 3 VL MOE with EP support by @ngquangtrung57 in #92
- [feat] Qwen3 Training support by @yiyexy in #95
- [feat] SP loss better alignment and patch qwen3 vl conv implementation to linear by @kcz358 in #96
- [fix] Handle router logits in Qwen 3 moe and Qwen 3 omni moe for aux loss by @ngquangtrung57 in #98
- [feat] LLaVA-Video Training support by @nssmd in #97
- [feat] Gradient accumulation by @pufanyi in #103
- Remove linear patch for conv3d for now for precision issue by @kcz358 in #105
- [feat] Allow bagel to output logits and logprobs for sde, fix collator padding for padded images by @kcz358 in #109
- [fix] Update Hydra command for multi-node training by @pufanyi in #108
- [fix] Fix some training mismatch in qwen3 vl and rfc parallel logic by @kcz358 in #106
- Add projects using LMMs-Engine to README by @KemingWu in #111
- Fix badge formatting for LongVT project link by @mwxely in #112
- LLaVAOneVision1_5 Support by @Jinghao-Guo in #101
- [feat] Add map style dataset for qwen3 vl by @kcz358 in #115
- [fix] Align better bagel original eval with option to align with flow-grpo sde settings by @kcz358 in #117
- Update section title and project descriptions in README by @mwxely in #118
- [feat] enable freeze submodules by @gathierry in #119
- [feat] Better imports utils for lmms-engine by @kcz358 in #122
- [feat] add EMA (Exponential Moving Average) support for FSDP2 training by @KemingWu in #120
- [fix] relax overwrite_config typing to support non-string config overrides by @KemingWu in #124
- Add Bagel Trainer and fix config, bagel data processor by @KemingWu in #126
- [fix] Applied different rnd seed in bagel so that the noise would be sample… by @kcz358 in #129
- [fix]: use valid labels for SP loss normalization by @kcz358 in #130
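Among the features above, #120 adds EMA (Exponential Moving Average) support for FSDP2 training. As a reminder of what an EMA of model weights does, here is a minimal sketch of the update rule in plain Python; the function name and shapes are illustrative only and do not reflect the lmms-engine API.

```python
# Minimal sketch of an exponential moving average (EMA) of model weights.
# Names are hypothetical; this is not the lmms-engine implementation.

def ema_update(ema_params, params, decay=0.999):
    """Blend current params into the EMA copy: ema = decay * ema + (1 - decay) * p."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]

# Example: track a single scalar "weight" over three optimizer steps.
# With decay=0.5 the EMA moves 0.0 -> 0.5 -> 0.75 -> 0.875, approaching 1.0.
ema = [0.0]
for step_value in [1.0, 1.0, 1.0]:
    ema = ema_update(ema, [step_value], decay=0.5)
print(ema)  # [0.875]
```

In training frameworks the same rule is typically applied to every parameter tensor after each optimizer step, and the EMA copy is used for evaluation or checkpointing.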
New Contributors
- @oneScotch made their first contribution in #81
- @pangyyyyy made their first contribution in #85
- @JinjieNi made their first contribution in #84
- @KemingWu made their first contribution in #89
- @VietCT04 made their first contribution in #94
- @yiyexy made their first contribution in #95
- @nssmd made their first contribution in #97
- @gathierry made their first contribution in #119
Full Changelog: v0.1.2...v0.1.3
[v0.1.2] First official release with new models and feature support
What's Changed
- feat: Bagel Image Understanding by @pufanyi in #43
- fix: Bagel Docs Data Format by @pufanyi in #44
- fix: Allow training bagel on understanding dataset when visual_gen=True by @pufanyi in #46
- feat: Bagel naive implementation of sparse attention by @kcz358 in #45
- feat: Better merge and print batch input by @kcz358 in #48
- fix: Merge fsdp by @kcz358 in #49
- Add single-GPU Muon & fix some bugs by @BIGKnight in #53
- [v0.1.2] release: hydra launch config, sit, and rae training by @kcz358 in #50
- fix: Fix launch from cli using config examples by @kcz358 in #54
- feat: Support Qwen2.5 Omni Thinker by @kcz358 in #56
- feat: Add llava_ov, bagel and better cicd readme and control by @kcz358 in #57
- docs: Add an auto docs build, may be deprecated by @kcz358 in #58
- feat: Support Qwen3-VL ulysses sequence parallel operation by @kcz358 in #59
- fix: Fix random shuffle seed on same dp rank to prevent sp hang by @kcz358 in #60
- docs: improve documentation accuracy and add Qwen-VL training guide by @mwxely in #62
- Fix/reorg examples by @Luodian in #61
- Dev/readme by @Luodian in #63
- docs: Fix some examples error and better documentation on implementing new class by @kcz358 in #64
Full Changelog: v0.1.1...v0.1.2
[v0.1.1] Bagel, WanVideo, stream packing and refactor for better repo structure
What's Changed
- feat: Custom FSDP2 trainer by @kcz358 in #8
- feat: Add Save and Load logic for fsdp2 trainer by @kcz358 in #9
- Dev/bo 0809 by @Luodian in #10
- feat: Add flash-attn and liger-kernel dependencies by @Luodian in #11
- feat: Support Qwen2 for remove padding training by @kcz358 in #14
- [feat] enable dllm training by @BIGKnight in #15
- feat: Add cicd by @kcz358 in #16
- feat: LLaVA-Ov ops and liger-kernel rfc by @kcz358 in #17
- rfc: Better base dataset abstract class and flexible args for kwargs by @kcz358 in #20
- test: Multi-gpu cicd test for robustness by @kcz358 in #22
- Dev/wan by @BIGKnight in #23
- feat: Add Qwen2 ulysses sequence parallel by @kcz358 in #24
- rfc: Refactor video loading logic and processor by @kcz358 in #25
- rfc: Train implementation, monkey patch logic by @kcz358 in #28
- Add efficient loss for dllms by @yshenaw in #27
- feat: Add profiler by @kcz358 in #30
- fix: profile error by @kcz358 in #31
- feat: Support stream packing by @kcz358 in #32
- Dev/muon by @BIGKnight in #34
- fix: Force iterable max steps by @kcz358 in #35
- feat: Support bagel training by @kcz358 in #33
- fix: Image tensor size error by @pufanyi in #40
Full Changelog: v0.1.0...v0.1.1
[v0.1.0.post1] Stable release before world model and video gen
What's Changed
- feat: Custom FSDP2 trainer by @kcz358 in #8
- feat: Add Save and Load logic for fsdp2 trainer by @kcz358 in #9
- Dev/bo 0809 by @Luodian in #10
- feat: Add flash-attn and liger-kernel dependencies by @Luodian in #11
- feat: Support Qwen2 for remove padding training by @kcz358 in #14
- [feat] enable dllm training by @BIGKnight in #15
- feat: Add cicd by @kcz358 in #16
- feat: LLaVA-Ov ops and liger-kernel rfc by @kcz358 in #17
Full Changelog: v0.1.0...v0.1.0.post1
[v0.1.0] Framework Init and support Sequence Parallel
What's Changed
- [Feat] simplification by @Luodian in #2
- Dev/fla by @kcz358 in #5
- feat: Refactor Kernels structure and init sp logic by @kcz358 in #1
- fix: Fsdp save and merge by @kcz358 in #6
- feat!: Support ulysses sequence parallel for qwen2_5_vl model by @kcz358 in #7
Full Changelog: https://github.com/EvolvingLMMs-Lab/lmms-engine-mini/commits/v0.1.0