English Version

New Features:

Support for training with Liger on models such as LLaMA, Qwen, and Mistral, which reduces memory usage by 10% to 60%.
Support for training with custom loss functions through a registration mechanism.
Trained models can now be pushed to ModelScope and HuggingFace.
Support for the freeze_vit parameter, which controls the behavior of full-parameter training for multimodal models.
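
To illustrate what a registration mechanism for custom loss functions generally looks like, here is a minimal, library-independent sketch. It is not the project's actual API: the names `LOSS_REGISTRY`, `register_loss`, and `get_loss` are hypothetical, and the losses operate on plain Python floats rather than tensors so the example stays self-contained.

```python
from typing import Callable, Dict, Sequence

# Signature assumed for this sketch: (predictions, targets) -> scalar loss.
LossFn = Callable[[Sequence[float], Sequence[float]], float]

# Hypothetical registry mapping a loss name to its implementation.
LOSS_REGISTRY: Dict[str, LossFn] = {}


def register_loss(name: str) -> Callable[[LossFn], LossFn]:
    """Decorator that stores a loss function under `name` (hypothetical helper)."""
    def decorator(fn: LossFn) -> LossFn:
        if name in LOSS_REGISTRY:
            raise ValueError(f"loss {name!r} is already registered")
        LOSS_REGISTRY[name] = fn
        return fn
    return decorator


def get_loss(name: str) -> LossFn:
    """Look up a registered loss by name, e.g. from a training config value."""
    try:
        return LOSS_REGISTRY[name]
    except KeyError:
        known = sorted(LOSS_REGISTRY)
        raise KeyError(f"unknown loss {name!r}; registered: {known}") from None


@register_loss("mse")
def mse_loss(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error over two equal-length sequences."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


@register_loss("mae")
def mae_loss(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Mean absolute error over two equal-length sequences."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


if __name__ == "__main__":
    loss_fn = get_loss("mse")  # the name would normally come from configuration
    print(loss_fn([1.0, 2.0], [1.0, 4.0]))
```

The benefit of this pattern is that a training loop only needs the loss name as a string, so new losses can be added without modifying the loop itself.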

New Models:

Qwen2-VL series, including GPTQ/AWQ quantized models. For best practices, see here.
InternVL2 AWQ quantized models.

New Datasets:

qwen2-pro series

What's Changed

compat with vllm==0.5.5 by @Jintao-Huang in #1812
Support zero2 offload by @Jintao-Huang in #1814
fix mp+ddp & resume_from_checkpoint by @Jintao-Huang in #1815
fix preprocess_num_proc by @Jintao-Huang in #1818
Support liger by @tastelikefeet in #1819
fix dora deployment by @tastelikefeet in #1821
Support register loss func by @Jintao-Huang in #1822
use default-lora by @Jintao-Huang in #1823
fix minicpm-v 2.6 infer device_map by @Jintao-Huang in #1832
Fix code by @tastelikefeet in #1824
fix inject by @tastelikefeet in #1835
support qwen2-pro dataset by @Jintao-Huang in #1834
add ddp_timeout parameter by @tastelikefeet in #1836
fix internlm-xcomposer rlhf by @hjh0119 in #1838
Support eval_nproc by @tastelikefeet in #1843
support qwen2-vl by @Jintao-Huang in #1842
Add internvl2 awq models by @tastelikefeet in #1846
Fix some datasets for streaming by @tastelikefeet in #1848
Fix Pissa and OLoRA by @tastelikefeet in #1852
Support qwen2 vl grounding by @tastelikefeet in #1854
support qwen2-vl & video finetune by @Jintao-Huang in #1849
Update new datasets by @tastelikefeet in #1855
update qwen2-vl docs by @Jintao-Huang in #1856
update qwen2-vl docs by @Jintao-Huang in #1858
fix qwen2-vl docs by @Jintao-Huang in #1861
fix requirements by @Jintao-Huang in #1864
update docs qwen2-vl by @Jintao-Huang in #1869
Support faster data map by @tastelikefeet in #1871
[TorchAcc] fix several bugs for torchacc FSDP. by @baoleai in #1872
Add train record by @tastelikefeet in #1873
Fix num_proc by @Jintao-Huang in #1874
Fix neftune doc by @tastelikefeet in #1875
add duet by @tastelikefeet in #1877
use model.generation_config by @Jintao-Huang in #1850
Support freeze vit by @Jintao-Huang in #1880
support qwen2-vl gptq awq by @Jintao-Huang in #1884
Refactor push_to_hub by @tastelikefeet in #1883
Fix push to hub logic by @tastelikefeet in #1888
add vllm lmdeploy benchmark by @Jintao-Huang in #1889
Add some warnings and fix RLHF by @tastelikefeet in #1890

Full Changelog: v2.3.2...v2.4.0
