Commit 89432b5

Fix internvl2.5/3 deepspeed packing (#3855)
1 parent 51b3718 commit 89432b5

6 files changed, 6 additions and 6 deletions

docs/source/Instruction/命令行参数.md

Lines changed: 1 addition & 1 deletion
@@ -584,7 +584,7 @@ qwen2_5_omni除了包含qwen2_5_vl和qwen2_audio的模型特定参数外,还
 - MAX_NUM: 默认为12
 - INPUT_SIZE: 默认为448

-### internvl2, internvl2_phi3, internvl2_5
+### internvl2, internvl2_phi3, internvl2_5, internvl3
 参数含义可以查看[这里](https://modelscope.cn/models/OpenGVLab/InternVL2_5-2B)
 - MAX_NUM: 默认为12
 - INPUT_SIZE: 默认为448

docs/source_en/Instruction/Command-line-parameters.md

Lines changed: 1 addition & 1 deletion
@@ -596,7 +596,7 @@ For the meaning of the arguments, please refer to [here](https://modelscope.cn/m
 - MAX_NUM: Default is 12
 - INPUT_SIZE: Default is 448

-### internvl2, internvl2_phi3, internvl2_5
+### internvl2, internvl2_phi3, internvl2_5, internvl3
 For the meaning of the arguments, please refer to [here](https://modelscope.cn/models/OpenGVLab/InternVL2_5-2B)
 - MAX_NUM: Default is 12
 - INPUT_SIZE: Default is 448

examples/train/packing/qwen2_5_omni.sh

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 # 4 * 32GB
-# Multimodal packing currently only supports qwen2_vl, qwen2_5_vl, qwen2_5_omni, internvl2_5
+# Multimodal packing currently only supports qwen2_vl, qwen2_5_vl, qwen2_5_omni, internvl2_5/3
 # A demo for four modalities that can be run directly
 # For local datasets, it is recommended to use streaming: `--streaming true` (save memory)
 pip uninstall transformers

examples/train/packing/qwen2_5_vl.sh

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 # 4 * 36GB
-# Multimodal packing currently only supports qwen2_vl, qwen2_5_vl, qwen2_5_omni, internvl2_5
+# Multimodal packing currently only supports qwen2_vl, qwen2_5_vl, qwen2_5_omni, internvl2_5/3
 # Efficiency: With packing: 10 minutes; Without packing: >=1 hour
 # For local datasets, it is recommended to use streaming: `--streaming true` (save memory)
 NPROC_PER_NODE=4 \
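
The script comments above mention packing and its speed-up but do not say how packing works. As a rough illustration only, the sketch below shows one generic way to pack variable-length samples into fixed-size training rows (greedy first-fit). It is an assumption for explanatory purposes, not the algorithm ms-swift uses; `pack_sequences`, `lengths`, and `max_length` are hypothetical names.

```python
def pack_sequences(lengths, max_length):
    """Greedy first-fit packing (illustrative, not ms-swift's implementation).

    Returns a list of bins; each bin is a list of sample indices whose
    token lengths sum to at most max_length.
    """
    bins = []       # sample indices per packed row
    remaining = []  # free token budget per packed row
    for idx, length in enumerate(lengths):
        if length > max_length:
            raise ValueError(f"sample {idx} is longer than max_length")
        for b, free in enumerate(remaining):
            if length <= free:
                # The sample fits into an existing row.
                bins[b].append(idx)
                remaining[b] -= length
                break
        else:
            # No existing row has room, so open a new one.
            bins.append([idx])
            remaining.append(max_length - length)
    return bins


lengths = [700, 300, 512, 512, 100]
packed = pack_sequences(lengths, max_length=1024)
print(packed)
```

Fewer, fuller rows mean fewer forward passes and less padding, which is the general reason packing can shorten training time.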

swift/hub/hub.py

Lines changed: 1 addition & 1 deletion
@@ -295,7 +295,7 @@ def load_dataset(cls,
                 version=revision,
                 download_mode=download_mode,
                 use_streaming=streaming,
-                trust_remote_code=True)
+            )

     @classmethod
     def download_model(cls,

swift/trainers/trainers.py

Lines changed: 1 addition & 1 deletion
@@ -201,7 +201,7 @@ def compute_loss(self, model, inputs, return_outputs=False, num_items_in_batch=N
         if loss_scale is not None:
             loss_kwargs['loss_scale'] = loss_scale

-        with self.template.compute_loss_context(model, inputs):
+        with self.template.compute_loss_context(self.model, inputs):
             outputs = model(**inputs)
         # Save past state if it exists
         # TODO: this needs to be fixed and made cleaner later.