
Commit 1f8cd73

[bugfix] fix SglangEngine (#5828)

1 parent 647e81d · commit 1f8cd73

6 files changed (+10 −7 lines)

README.md

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ Running Environment:
 | vllm | >=0.5.1 | 0.10.1.1 | Inference/Deployment |
 | sglang | >=0.4.6 | 0.4.10.post2 | Inference/Deployment |
 | lmdeploy | >=0.5 | 0.9.2.post1 | Inference/Deployment |
-| evalscope | >=0.11 | | Evaluation |
+| evalscope | >=1.0 | | Evaluation |
 | gradio | | 5.32.1 | Web-UI/App |
 
 For more optional dependencies, you can refer to [here](https://github.com/modelscope/ms-swift/blob/main/requirements/install_all.sh).

README_CN.md

Lines changed: 1 addition & 1 deletion
@@ -130,7 +130,7 @@ pip install -e .
 | vllm | >=0.5.1 | 0.10.1.1 | 推理/部署 |
 | sglang | >=0.4.6 | 0.4.10.post2 | 推理/部署 |
 | lmdeploy | >=0.5 | 0.9.2.post1 | 推理/部署 |
-| evalscope | >=0.11 | | 评测 |
+| evalscope | >=1.0 | | 评测 |
 | gradio | | 5.32.1 | Web-UI/App |
 
 更多可选依赖可以参考[这里](https://github.com/modelscope/ms-swift/blob/main/requirements/install_all.sh)

docs/source/GetStarted/SWIFT安装.md

Lines changed: 1 addition & 1 deletion
@@ -103,7 +103,7 @@ modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu2
 | vllm | >=0.5.1 | 0.10.1.1 | 推理/部署 |
 | sglang | >=0.4.6 | 0.4.10.post2 | 推理/部署 |
 | lmdeploy | >=0.5 | 0.9.2.post1 | 推理/部署 |
-| evalscope | >=0.11 | | 评测 |
+| evalscope | >=1.0 | | 评测 |
 | gradio | | 5.32.1 | Web-UI/App |
 
 更多可选依赖可以参考[这里](https://github.com/modelscope/ms-swift/blob/main/requirements/install_all.sh)

docs/source_en/GetStarted/SWIFT-installation.md

Lines changed: 1 addition & 1 deletion
@@ -104,7 +104,7 @@ More images can be found [here](https://modelscope.cn/docs/intro/environment-set
 | vllm | >=0.5.1 | 0.10.1.1 | Inference/Deployment |
 | sglang | >=0.4.6 | 0.4.10.post2 | Inference/Deployment |
 | lmdeploy | >=0.5 | 0.9.2.post1 | Inference/Deployment |
-| evalscope | >=0.11 | | Evaluation |
+| evalscope | >=1.0 | | Evaluation |
 | gradio | | 5.32.1 | Web-UI/App |
 
 

requirements/install_all.sh

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ pip install "vllm>=0.5.1,<0.10.2" "transformers<4.57" "trl<0.21" -U
 pip install "lmdeploy>=0.5" -U
 pip install autoawq -U --no-deps
 pip install auto_gptq optimum bitsandbytes "gradio<5.33" -U
-pip install git+https://github.com/modelscope/ms-swift.git
+pip install git+https://github.com/modelscope/ms-swift.git#egg=ms-swift[all]
 pip install timm deepspeed -U
 pip install qwen_vl_utils qwen_omni_utils keye_vl_utils -U
 pip install decord librosa icecream soundfile -U

swift/llm/infer/infer_engine/sglang_engine.py

Lines changed: 5 additions & 2 deletions
@@ -222,13 +222,16 @@ async def _infer_embedding_async(self, template: Template, inputs: Dict[str, Any
 
     async def _infer_full_async(self, template: Template, inputs: Dict[str, Any], generation_config: Dict[str, Any],
                                 request_config: RequestConfig) -> ChatCompletionResponse:
-        output = await self.engine.async_generate(**inputs, sampling_params=generation_config)
+        engine_inputs = {k: v for k, v in inputs.items() if k != 'template_inputs'}
+        output = await self.engine.async_generate(**engine_inputs, sampling_params=generation_config)
         output['prompt_token_ids'] = inputs['input_ids']
         return self._create_chat_completion_response(output, inputs, template, request_config.return_details)
 
     async def _infer_stream_async(self, template: Template, inputs: Dict[str, Any], generation_config: Dict[str, Any],
                                   **kwargs) -> AsyncIterator[ChatCompletionStreamResponse]:
-        result_generator = await self.engine.async_generate(**inputs, sampling_params=generation_config, stream=True)
+        engine_inputs = {k: v for k, v in inputs.items() if k != 'template_inputs'}
+        result_generator = await self.engine.async_generate(
+            **engine_inputs, sampling_params=generation_config, stream=True)
         infer_streamer = InferStreamer(template)
         async for output in result_generator:
             res = self._create_chat_completion_stream_response(output, template, infer_streamer)
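For context on the change above: the template-prepared `inputs` dict carries a bookkeeping entry, `template_inputs`, that is still needed when the chat-completion response is built, but the commit drops it before forwarding the dict to SGLang's `async_generate`, presumably because that call rejects unknown keyword arguments. The snippet below is a minimal, self-contained sketch of this pattern, not the actual ms-swift or SGLang code; `async_generate`, `infer`, and the payload contents are stand-ins.

```python
# Minimal sketch (assumed names): filter bookkeeping keys out of a kwargs dict
# before forwarding it to a callee with a fixed signature.
import asyncio
from typing import Any, Dict, List


async def async_generate(input_ids: List[int], sampling_params: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for the engine call: it only accepts the keyword arguments it knows about.
    return {'text': '...', 'meta_info': {}, 'prompt_token_ids': None}


async def infer(inputs: Dict[str, Any], generation_config: Dict[str, Any]) -> Dict[str, Any]:
    # Drop the key the engine does not understand before spreading the dict as kwargs.
    engine_inputs = {k: v for k, v in inputs.items() if k != 'template_inputs'}
    output = await async_generate(**engine_inputs, sampling_params=generation_config)
    # The original dict is left untouched, so it can still be used afterwards.
    output['prompt_token_ids'] = inputs['input_ids']
    return output


if __name__ == '__main__':
    inputs = {'input_ids': [1, 2, 3], 'template_inputs': object()}  # hypothetical payload
    print(asyncio.run(infer(inputs, {'max_new_tokens': 8})))
```

Building a separate `engine_inputs` dict, rather than popping the key from `inputs`, keeps the original dict intact for the response-construction step that follows, which is why the diff introduces a new variable instead of mutating `inputs`.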
