Commit a7f8188

update support_vllm (#415)

1 parent 46a3b09

File tree

6 files changed: +47 −39 lines

docs/source/LLM/命令行参数.md

Lines changed: 3 additions & 3 deletions

```diff
@@ -96,10 +96,10 @@
 - `--top_p`: Default is `0.7`. This parameter only takes effect when `predict_with_generate` is set to True.
 - `--repetition_penalty`: Default is `1.`. This parameter only takes effect when `predict_with_generate` is set to True.
 - `--num_beams`: Default is `1`. This parameter only takes effect when `predict_with_generate` is set to True.
-- `--gpu_memory_fraction`: Default is None. This parameter runs training with only a specified maximum fraction of GPU memory made available; it is intended for stress testing.
-- `--train_dataset_mix_ratio`: Default is 0. This parameter defines how datasets are mixed for training. When it is specified, the training set is mixed with `train_dataset_mix_ratio` times its size of the general-knowledge dataset specified by `train_dataset_mix_ds`, so that the overall dataset length reaches `train_dataset_sample`.
+- `--gpu_memory_fraction`: Default is `None`. This parameter runs training with only a specified maximum fraction of GPU memory made available; it is intended for stress testing.
+- `--train_dataset_mix_ratio`: Default is `0`. This parameter defines how datasets are mixed for training. When it is specified, the training set is mixed with `train_dataset_mix_ratio` times its size of the general-knowledge dataset specified by `train_dataset_mix_ds`, so that the overall dataset length reaches `train_dataset_sample`.
 - `--train_dataset_mix_ds`: Default is `ms-bench`. A general-knowledge dataset used to prevent knowledge forgetting.
-- `--use_loss_scale`: Default is False. When enabled, the loss weight of the Agent-specific fields (the Action/Action Input parts) is increased to strengthen CoT; it has no effect in ordinary SFT scenarios.
+- `--use_loss_scale`: Default is `False`. When enabled, the loss weight of the Agent-specific fields (the Action/Action Input parts) is increased to strengthen CoT; it has no effect in ordinary SFT scenarios.
 
 ### AdaLoRA Fine-tuning Parameters
 
```
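The `--train_dataset_mix_ratio` behavior described above can be sketched as follows. This is a minimal illustration, not the ms-swift implementation; the function and variable names are invented:

```python
import random

def mix_datasets(train_set, mix_set, mix_ratio, seed=42):
    # Draw `mix_ratio * len(train_set)` samples (with replacement) from the
    # general-knowledge dataset and append them to the training set.
    rng = random.Random(seed)
    n_mix = int(len(train_set) * mix_ratio)
    extra = rng.choices(mix_set, k=n_mix)
    return train_set + extra

train = [f'task-{i}' for i in range(100)]
general = [f'ms-bench-{i}' for i in range(1000)]  # stand-in for train_dataset_mix_ds
mixed = mix_datasets(train, general, mix_ratio=2.0)
print(len(mixed))  # 300: 100 task samples + 200 general samples
```

With `mix_ratio=0` (the default) the training set is returned unchanged, which matches the documented default behavior.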

docs/source/LLM/支持的模型和数据集.md

Lines changed: 12 additions & 12 deletions

```diff
@@ -84,18 +84,18 @@
 |internlm-7b-chat-8k|[Shanghai_AI_Laboratory/internlm-chat-7b-8k](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b-8k/summary)|q_proj, k_proj, v_proj|internlm|✘|✔||
 |internlm-20b|[Shanghai_AI_Laboratory/internlm-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b/summary)|q_proj, k_proj, v_proj|default-generation-bos|✘|✔||
 |internlm-20b-chat|[Shanghai_AI_Laboratory/internlm-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-20b/summary)|q_proj, k_proj, v_proj|internlm|✘|✔||
-|internlm2-7b-base|[Shanghai_AI_Laboratory/internlm2-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-7b/summary)|wqkv|default-generation-bos|✔|✘||
-|internlm2-7b|[Shanghai_AI_Laboratory/internlm2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b/summary)|wqkv|default-generation-bos|✔|✘||
-|internlm2-7b-sft-chat|[Shanghai_AI_Laboratory/internlm2-chat-7b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b-sft/summary)|wqkv|internlm2|✔|✘||
-|internlm2-7b-chat|[Shanghai_AI_Laboratory/internlm2-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b/summary)|wqkv|internlm2|✔|✘||
-|internlm2-20b-base|[Shanghai_AI_Laboratory/internlm2-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-20b/summary)|wqkv|default-generation-bos|✔|✘||
-|internlm2-20b|[Shanghai_AI_Laboratory/internlm2-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-20b/summary)|wqkv|default-generation-bos|✔|✘||
-|internlm2-20b-sft-chat|[Shanghai_AI_Laboratory/internlm2-chat-20b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b-sft/summary)|wqkv|internlm2|✔|✘||
-|internlm2-20b-chat|[Shanghai_AI_Laboratory/internlm2-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b/summary)|wqkv|internlm2|✔|✘||
-|internlm2-math-7b|[Shanghai_AI_Laboratory/internlm2-math-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-base-7b/summary)|wqkv|default-generation-bos|✔|✘||
-|internlm2-math-7b-chat|[Shanghai_AI_Laboratory/internlm2-math-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-7b/summary)|wqkv|internlm2|✔|✘||
-|internlm2-math-20b|[Shanghai_AI_Laboratory/internlm2-math-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-base-20b/summary)|wqkv|default-generation-bos|✔|✘||
-|internlm2-math-20b-chat|[Shanghai_AI_Laboratory/internlm2-math-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-20b/summary)|wqkv|internlm2|✔|✘||
+|internlm2-7b-base|[Shanghai_AI_Laboratory/internlm2-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-7b/summary)|wqkv|default-generation-bos|✔|✔||
+|internlm2-7b|[Shanghai_AI_Laboratory/internlm2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b/summary)|wqkv|default-generation-bos|✔|✔||
+|internlm2-7b-sft-chat|[Shanghai_AI_Laboratory/internlm2-chat-7b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b-sft/summary)|wqkv|internlm2|✔|✔||
+|internlm2-7b-chat|[Shanghai_AI_Laboratory/internlm2-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b/summary)|wqkv|internlm2|✔|✔||
+|internlm2-20b-base|[Shanghai_AI_Laboratory/internlm2-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-base-20b/summary)|wqkv|default-generation-bos|✔|✔||
+|internlm2-20b|[Shanghai_AI_Laboratory/internlm2-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-20b/summary)|wqkv|default-generation-bos|✔|✔||
+|internlm2-20b-sft-chat|[Shanghai_AI_Laboratory/internlm2-chat-20b-sft](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b-sft/summary)|wqkv|internlm2|✔|✔||
+|internlm2-20b-chat|[Shanghai_AI_Laboratory/internlm2-chat-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b/summary)|wqkv|internlm2|✔|✔||
+|internlm2-math-7b|[Shanghai_AI_Laboratory/internlm2-math-base-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-base-7b/summary)|wqkv|default-generation-bos|✔|✔||
+|internlm2-math-7b-chat|[Shanghai_AI_Laboratory/internlm2-math-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-7b/summary)|wqkv|internlm2|✔|✔||
+|internlm2-math-20b|[Shanghai_AI_Laboratory/internlm2-math-base-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-base-20b/summary)|wqkv|default-generation-bos|✔|✔||
+|internlm2-math-20b-chat|[Shanghai_AI_Laboratory/internlm2-math-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-20b/summary)|wqkv|internlm2|✔|✔||
 |internlm-xcomposer2-7b-chat|[Shanghai_AI_Laboratory/internlm-xcomposer2-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2-7b/summary)|wqkv|internlm-xcomposer2|✔|✘||
 |deepseek-7b|[deepseek-ai/deepseek-llm-7b-base](https://modelscope.cn/models/deepseek-ai/deepseek-llm-7b-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✔||
 |deepseek-7b-chat|[deepseek-ai/deepseek-llm-7b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-llm-7b-chat/summary)|q_proj, k_proj, v_proj|deepseek|✔|✔||
```

docs/source/LLM/自我认知微调最佳实践.md

Lines changed: 2 additions & 2 deletions

````diff
@@ -223,7 +223,7 @@ I am Xiao Huang, an artificial intelligence assistant created by ModelScope.
 CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'qwen1half-4b-chat/vx-xxx/checkpoint-xxx'
 
 # Merge the LoRA incremental weights and run inference
-swift merge-lora --ckpt_dir 'xxx/vx_xxx/checkpoint-xxx'
+swift merge-lora --ckpt_dir 'qwen1half-4b-chat/vx-xxx/checkpoint-xxx'
 CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'qwen1half-4b-chat/vx-xxx/checkpoint-xxx-merged'
 ```
 
@@ -247,7 +247,7 @@ result = app_ui_main(app_ui_args)
 CUDA_VISIBLE_DEVICES=0 swift app-ui --ckpt_dir 'qwen1half-4b-chat/vx-xxx/checkpoint-xxx'
 
 # Merge the LoRA incremental weights and use app-ui
-swift merge-lora --ckpt_dir 'xxx/vx_xxx/checkpoint-xxx'
+swift merge-lora --ckpt_dir 'qwen1half-4b-chat/vx-xxx/checkpoint-xxx'
 CUDA_VISIBLE_DEVICES=0 swift app-ui --ckpt_dir 'qwen1half-4b-chat/vx-xxx/checkpoint-xxx-merged'
 ```
 
````

docs/source/index.rst

Lines changed: 5 additions & 0 deletions

```diff
@@ -11,6 +11,7 @@ Swift DOCUMENTATION
 
    GetStarted/快速使用.md
    GetStarted/SWIFT安装.md
+   GetStarted/界面训练推理.md
    GetStarted/使用tuners.md
    GetStarted/ResTuning.md
    GetStarted/在SWIFT内使用PEFT.md
@@ -21,11 +22,15 @@ Swift DOCUMENTATION
    :caption: LLM Training and Inference Example
 
    LLM/自我认知微调最佳实践.md
+   LLM/Agent微调最佳实践.md
    LLM/LLM推理文档.md
    LLM/LLM微调文档.md
+   LLM/LLM人类对齐训练文档.md
+   LLM/VLLM推理加速与部署.md
    LLM/支持的模型和数据集.md
    LLM/自定义与拓展.md
    LLM/命令行参数.md
+   LLM/Benchmark.md
 
 .. toctree::
    :maxdepth: 2
```

swift/llm/utils/model.py

Lines changed: 24 additions & 12 deletions

```diff
@@ -1237,79 +1237,91 @@ def get_model_tokenizer_qwen1half_intx(model_dir: str,
     'Shanghai_AI_Laboratory/internlm2-math-base-7b',
     LoRATM.internlm2,
     TemplateType.default_generation_bos,
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_math_20b,
     'Shanghai_AI_Laboratory/internlm2-math-base-20b',
     LoRATM.internlm2,
     TemplateType.default_generation_bos,
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_math_7b_chat,
     'Shanghai_AI_Laboratory/internlm2-math-7b',
     LoRATM.internlm2,
     TemplateType.internlm2,
     eos_token='<|im_end|>',
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_math_20b_chat,
     'Shanghai_AI_Laboratory/internlm2-math-20b',
     LoRATM.internlm2,
     TemplateType.internlm2,
     eos_token='<|im_end|>',
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_7b_sft_chat,
     'Shanghai_AI_Laboratory/internlm2-chat-7b-sft',
     LoRATM.internlm2,
     TemplateType.internlm2,
     eos_token='<|im_end|>',
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_7b_chat,
     'Shanghai_AI_Laboratory/internlm2-chat-7b',
     LoRATM.internlm2,
     TemplateType.internlm2,
     eos_token='<|im_end|>',
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_20b_sft_chat,
     'Shanghai_AI_Laboratory/internlm2-chat-20b-sft',
     LoRATM.internlm2,
     TemplateType.internlm2,
     eos_token='<|im_end|>',
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_20b_chat,
     'Shanghai_AI_Laboratory/internlm2-chat-20b',
     LoRATM.internlm2,
     TemplateType.internlm2,
     eos_token='<|im_end|>',
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_7b,
     'Shanghai_AI_Laboratory/internlm2-7b',
     LoRATM.internlm2,
     TemplateType.default_generation_bos,
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_7b_base,
     'Shanghai_AI_Laboratory/internlm2-base-7b',
     LoRATM.internlm2,
     TemplateType.default_generation_bos,
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_20b,
     'Shanghai_AI_Laboratory/internlm2-20b',
     LoRATM.internlm2,
     TemplateType.default_generation_bos,
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 @register_model(
     ModelType.internlm2_20b_base,
     'Shanghai_AI_Laboratory/internlm2-base-20b',
     LoRATM.internlm2,
     TemplateType.default_generation_bos,
-    support_flash_attn=True)
+    support_flash_attn=True,
+    support_vllm=True)
 def get_model_tokenizer_internlm2(model_dir: str,
                                   torch_dtype: Dtype,
                                   model_kwargs: Dict[str, Any],
```
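The stacked `@register_model` decorators above record capability flags such as `support_vllm` in a registry keyed by model type. The pattern can be sketched in isolation (simplified, invented signatures; not the actual ms-swift API):

```python
MODEL_MAPPING = {}

def register_model(model_type, model_id_or_path, template, **kwargs):
    # Store the model's metadata (including capability flags such as
    # support_vllm) under its model_type key, then return the function
    # unchanged so decorators can be stacked.
    def decorator(get_fn):
        MODEL_MAPPING[model_type] = {
            'model_id_or_path': model_id_or_path,
            'template': template,
            'get_function': get_fn,
            **kwargs,
        }
        return get_fn
    return decorator

@register_model('internlm2-20b-chat', 'Shanghai_AI_Laboratory/internlm2-chat-20b',
                'internlm2', support_flash_attn=True, support_vllm=True)
@register_model('internlm2-7b-chat', 'Shanghai_AI_Laboratory/internlm2-chat-7b',
                'internlm2', support_flash_attn=True, support_vllm=True)
def get_model_tokenizer_internlm2(model_dir):
    raise NotImplementedError  # model loading elided in this sketch

print(MODEL_MAPPING['internlm2-7b-chat']['support_vllm'])  # True
```

Because each decorator returns the function unchanged, one loader function can serve many model types, which is why the commit can flip `support_vllm` per registration without touching the loader.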

swift/llm/utils/vllm_utils.py

Lines changed: 1 addition & 10 deletions

```diff
@@ -35,9 +35,6 @@ def get_vllm_engine(model_type: str,
     if engine_kwargs is None:
         engine_kwargs = {}
     model_info = MODEL_MAPPING[model_type]
-    support_vllm = model_info.get('support_vllm', False)
-    if not support_vllm:
-        raise ValueError(f'vllm not support `{model_type}`')
     model_id_or_path = model_info['model_id_or_path']
     ignore_file_pattern = model_info['ignore_file_pattern']
     model_dir = kwargs.get('model_dir', None)
@@ -84,13 +81,7 @@
         pass
     # fix HTTPError bug (use model_dir)
     os.environ.pop('VLLM_USE_MODELSCOPE', None)
-    try:
-        llm_engine = llm_engine_cls.from_engine_args(engine_args)
-    except ValueError:
-        logger.warning(
-            f'The current version of VLLM does not support {model_type}. '
-            'Please upgrade VLLM or specify `--infer_backend pt`.')
-        raise
+    llm_engine = llm_engine_cls.from_engine_args(engine_args)
    llm_engine.engine_args = engine_args
    llm_engine.model_dir = model_dir
    llm_engine.model_type = model_type
```
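With the hard `support_vllm` check removed from `get_vllm_engine`, unsupported models now surface errors from vLLM itself rather than from ms-swift. A caller that still wants an early capability check can consult the registry flag directly; a minimal sketch, with the registry shape assumed for illustration:

```python
MODEL_MAPPING = {
    # Shape assumed for illustration; mirrors the support_vllm flag that
    # @register_model records in swift/llm/utils/model.py.
    'internlm2-7b-chat': {'support_vllm': True},
    'internlm-xcomposer2-7b-chat': {'support_vllm': False},
}

def vllm_supported(model_type: str) -> bool:
    # Default to False for unknown or unregistered model types.
    return MODEL_MAPPING.get(model_type, {}).get('support_vllm', False)

print(vllm_supported('internlm2-7b-chat'))            # True
print(vllm_supported('internlm-xcomposer2-7b-chat'))  # False
```

This keeps the fail-fast behavior available to callers while letting the engine constructor itself stay permissive, which is the direction the commit takes.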
