Commit a5e9c7d

revert vllm video

1 parent 3b5086c commit a5e9c7d

4 files changed: +10 −19 lines changed


docs/source/Instruction/支持的模型和数据集.md

Lines changed: 4 additions & 4 deletions
@@ -845,10 +845,10 @@
 |[OpenGVLab/InternVL3_5-241B-A28B](https://modelscope.cn/models/OpenGVLab/InternVL3_5-241B-A28B)|internvl3_5|internvl3_5|transformers>=4.37.2, timm|&#x2714;|vision, video|[OpenGVLab/InternVL3_5-241B-A28B](https://huggingface.co/OpenGVLab/InternVL3_5-241B-A28B)|
 |[OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview](https://modelscope.cn/models/OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview)|internvl3_5_gpt|internvl3_5_gpt|transformers>=4.37.2, timm|&#x2718;|vision, video|[OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview](https://huggingface.co/OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview)|
 |[OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview-HF](https://modelscope.cn/models/OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview-HF)|internvl_gpt_hf|internvl_hf|transformers>=4.55.0, timm|&#x2718;|vision, video|[OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview-HF](https://huggingface.co/OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview-HF)|
-|[Shanghai_AI_Laboratory/Intern-S1-mini](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini)|interns1|interns1|transformers>=4.55.2|&#x2718;|vision, video|[internlm/Intern-S1-mini](https://huggingface.co/internlm/Intern-S1-mini)|
-|[Shanghai_AI_Laboratory/Intern-S1](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1)|interns1|interns1|transformers>=4.55.2|&#x2718;|vision, video|[internlm/Intern-S1](https://huggingface.co/internlm/Intern-S1)|
-|[Shanghai_AI_Laboratory/Intern-S1-mini-FP8](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini-FP8)|interns1|interns1|transformers>=4.55.2|&#x2718;|vision, video|[internlm/Intern-S1-mini-FP8](https://huggingface.co/internlm/Intern-S1-mini-FP8)|
-|[Shanghai_AI_Laboratory/Intern-S1-FP8](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-FP8)|interns1|interns1|transformers>=4.55.2|&#x2718;|vision, video|[internlm/Intern-S1-FP8](https://huggingface.co/internlm/Intern-S1-FP8)|
+|[Shanghai_AI_Laboratory/Intern-S1-mini](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini)|interns1|interns1|transformers>=4.55.2,<4.56|&#x2718;|vision, video|[internlm/Intern-S1-mini](https://huggingface.co/internlm/Intern-S1-mini)|
+|[Shanghai_AI_Laboratory/Intern-S1](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1)|interns1|interns1|transformers>=4.55.2,<4.56|&#x2718;|vision, video|[internlm/Intern-S1](https://huggingface.co/internlm/Intern-S1)|
+|[Shanghai_AI_Laboratory/Intern-S1-mini-FP8](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini-FP8)|interns1|interns1|transformers>=4.55.2,<4.56|&#x2718;|vision, video|[internlm/Intern-S1-mini-FP8](https://huggingface.co/internlm/Intern-S1-mini-FP8)|
+|[Shanghai_AI_Laboratory/Intern-S1-FP8](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-FP8)|interns1|interns1|transformers>=4.55.2,<4.56|&#x2718;|vision, video|[internlm/Intern-S1-FP8](https://huggingface.co/internlm/Intern-S1-FP8)|
 |[Shanghai_AI_Laboratory/internlm-xcomposer2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2-7b)|xcomposer2|ixcomposer2|-|&#x2718;|vision|[internlm/internlm-xcomposer2-7b](https://huggingface.co/internlm/internlm-xcomposer2-7b)|
 |[Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b)|xcomposer2_4khd|ixcomposer2|-|&#x2718;|vision|[internlm/internlm-xcomposer2-4khd-7b](https://huggingface.co/internlm/internlm-xcomposer2-4khd-7b)|
 |[Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b)|xcomposer2_5|xcomposer2_5|decord|&#x2718;|vision|[internlm/internlm-xcomposer2d5-7b](https://huggingface.co/internlm/internlm-xcomposer2d5-7b)|

docs/source_en/Instruction/Supported-models-and-datasets.md

Lines changed: 4 additions & 4 deletions
@@ -845,10 +845,10 @@ The table below introduces the models integrated with ms-swift:
 |[OpenGVLab/InternVL3_5-241B-A28B](https://modelscope.cn/models/OpenGVLab/InternVL3_5-241B-A28B)|internvl3_5|internvl3_5|transformers>=4.37.2, timm|&#x2714;|vision, video|[OpenGVLab/InternVL3_5-241B-A28B](https://huggingface.co/OpenGVLab/InternVL3_5-241B-A28B)|
 |[OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview](https://modelscope.cn/models/OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview)|internvl3_5_gpt|internvl3_5_gpt|transformers>=4.37.2, timm|&#x2718;|vision, video|[OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview](https://huggingface.co/OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview)|
 |[OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview-HF](https://modelscope.cn/models/OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview-HF)|internvl_gpt_hf|internvl_hf|transformers>=4.55.0, timm|&#x2718;|vision, video|[OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview-HF](https://huggingface.co/OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview-HF)|
-|[Shanghai_AI_Laboratory/Intern-S1-mini](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini)|interns1|interns1|transformers>=4.55.2|&#x2718;|vision, video|[internlm/Intern-S1-mini](https://huggingface.co/internlm/Intern-S1-mini)|
-|[Shanghai_AI_Laboratory/Intern-S1](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1)|interns1|interns1|transformers>=4.55.2|&#x2718;|vision, video|[internlm/Intern-S1](https://huggingface.co/internlm/Intern-S1)|
-|[Shanghai_AI_Laboratory/Intern-S1-mini-FP8](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini-FP8)|interns1|interns1|transformers>=4.55.2|&#x2718;|vision, video|[internlm/Intern-S1-mini-FP8](https://huggingface.co/internlm/Intern-S1-mini-FP8)|
-|[Shanghai_AI_Laboratory/Intern-S1-FP8](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-FP8)|interns1|interns1|transformers>=4.55.2|&#x2718;|vision, video|[internlm/Intern-S1-FP8](https://huggingface.co/internlm/Intern-S1-FP8)|
+|[Shanghai_AI_Laboratory/Intern-S1-mini](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini)|interns1|interns1|transformers>=4.55.2,<4.56|&#x2718;|vision, video|[internlm/Intern-S1-mini](https://huggingface.co/internlm/Intern-S1-mini)|
+|[Shanghai_AI_Laboratory/Intern-S1](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1)|interns1|interns1|transformers>=4.55.2,<4.56|&#x2718;|vision, video|[internlm/Intern-S1](https://huggingface.co/internlm/Intern-S1)|
+|[Shanghai_AI_Laboratory/Intern-S1-mini-FP8](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-mini-FP8)|interns1|interns1|transformers>=4.55.2,<4.56|&#x2718;|vision, video|[internlm/Intern-S1-mini-FP8](https://huggingface.co/internlm/Intern-S1-mini-FP8)|
+|[Shanghai_AI_Laboratory/Intern-S1-FP8](https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1-FP8)|interns1|interns1|transformers>=4.55.2,<4.56|&#x2718;|vision, video|[internlm/Intern-S1-FP8](https://huggingface.co/internlm/Intern-S1-FP8)|
 |[Shanghai_AI_Laboratory/internlm-xcomposer2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2-7b)|xcomposer2|ixcomposer2|-|&#x2718;|vision|[internlm/internlm-xcomposer2-7b](https://huggingface.co/internlm/internlm-xcomposer2-7b)|
 |[Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b)|xcomposer2_4khd|ixcomposer2|-|&#x2718;|vision|[internlm/internlm-xcomposer2-4khd-7b](https://huggingface.co/internlm/internlm-xcomposer2-4khd-7b)|
 |[Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b)|xcomposer2_5|xcomposer2_5|decord|&#x2718;|vision|[internlm/internlm-xcomposer2d5-7b](https://huggingface.co/internlm/internlm-xcomposer2d5-7b)|

swift/llm/model/model/internlm.py

Lines changed: 1 addition & 1 deletion
@@ -495,7 +495,7 @@ def get_model_tokenizer_internvl_hf(*args, **kwargs):
         get_model_tokenizer_interns1,
         architectures=['InternS1ForConditionalGeneration'],
         model_arch=ModelArch.interns1,
-        requires=['transformers>=4.55.2'],
+        requires=['transformers>=4.55.2,<4.56'],
         tags=['vision', 'video'],
     ))
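The tightened pin (`transformers>=4.55.2,<4.56`) follows PEP 440 range semantics. A minimal sketch of which versions it admits, using the third-party `packaging` library (assumed available; it is not part of ms-swift's `requires` machinery, just an illustration):

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# The requirement string from the updated `requires` entry.
spec = SpecifierSet(">=4.55.2,<4.56")

# 4.55.x releases from 4.55.2 onward are admitted; 4.56.0 is excluded,
# since PEP 440 treats 4.56.0 as equal to the 4.56 upper bound.
for v in ["4.55.1", "4.55.2", "4.55.4", "4.56.0"]:
    print(v, Version(v) in spec)
```

Running this prints `False`, `True`, `True`, `False` for the four versions: the range rejects both pre-4.55.2 and 4.56+ releases of transformers.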

swift/llm/template/base.py

Lines changed: 1 addition & 10 deletions
@@ -825,16 +825,7 @@ def replace_tag(self, media_type: Literal['image', 'video', 'audio'], index: int
                 return [[-100]]
             return self.image_placeholder
         elif media_type == 'video':
-            if self.mode == 'vllm':
-                # https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/vision_language.py
-                from vllm.assets.video import video_to_ndarrays, video_get_metadata
-                num_frames = get_env_args('vllm_num_frames', int, 16)
-                video_data = video_to_ndarrays(inputs.videos[index], num_frames)
-                video_metadatas = video_get_metadata(inputs.videos[index], num_frames)
-                inputs.videos[index] = [(video_data, video_metadatas)]
-                return self.video_placeholder
-            else:
-                return self.video_placeholder
+            return self.video_placeholder
         elif media_type == 'audio':
             return self.audio_placeholder
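The revert collapses the video branch to a single return: both the vLLM path and the fallback already returned `self.video_placeholder`, and the frame decoding via `vllm.assets.video` (plus the write into `inputs.videos[index]`) is dropped with it. A standalone sketch of the resulting dispatch, where the `MiniTemplate` class and placeholder values are illustrative stand-ins rather than ms-swift's actual `Template` API:

```python
# Illustrative reduction of replace_tag after the revert: each media
# type maps directly to its placeholder, with no mode-specific
# preprocessing for video.
class MiniTemplate:
    image_placeholder = ["<image>"]
    video_placeholder = ["<video>"]
    audio_placeholder = ["<audio>"]

    def replace_tag(self, media_type: str):
        if media_type == "image":
            return self.image_placeholder
        elif media_type == "video":
            # Pre-revert, a mode == 'vllm' check additionally decoded
            # frames into the inputs before returning this same
            # placeholder; the revert removes that side effect.
            return self.video_placeholder
        elif media_type == "audio":
            return self.audio_placeholder

print(MiniTemplate().replace_tag("video"))  # ['<video>']
```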
