
Commit 4bacd3f

[template] add_retry (#6138)
1 parent 0dcd6c1 commit 4bacd3f

5 files changed, +22 -11 lines changed


docs/source/Instruction/命令行参数.md

Lines changed: 1 addition & 0 deletions
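@@ -810,4 +810,5 @@ In addition to the model-specific parameters of qwen2_5_vl and qwen2_audio, qwen2_5_omni also
 - LOG_LEVEL: The log level, default is 'INFO'. You can set it to 'WARNING', 'ERROR', etc.
 - SWIFT_DEBUG: When set to '1' during `engine.infer(...)`, PtEngine prints the contents of input_ids and generate_ids to facilitate debugging and alignment.
 - VLLM_USE_V1: Used to switch vLLM between the V0 and V1 versions.
+- SWIFT_TIMEOUT: (ms-swift>=3.10) If the multimodal dataset contains image URLs, this parameter controls the timeout for fetching images. The default is 20s.
 - ROOT_IMAGE_DIR: (ms-swift>=3.8) The root directory for image (multimodal) resources. Setting this parameter allows the dataset to use paths relative to `ROOT_IMAGE_DIR`. By default, paths are relative to the working directory.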

docs/source/Instruction/常见问题整理.md

Lines changed: 3 additions & 3 deletions
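@@ -759,8 +759,8 @@ RAY_memory_monitor_refresh_ms=0 CUDA_VISIBLE_DEVICES=1 nohup swift deploy --ckpt
 ```
 The client needs to pass the parameters: `request_config = RequestConfig(..., logprobs=True, top_logprobs=2)`
 
-### Q12: wift3.0 deployment inference: can a request timeout be set? If the image URL is invalid, it just hangs there
-Set the `TIMEOUT` environment variable; the default is 300 seconds. Alternatively, parameters can be passed in `InferClient`.
+### Q12: swift3.0 deployment inference: can a request timeout be set? If the image URL is invalid, it just hangs there
+Set the `SWIFT_TIMEOUT` environment variable. Alternatively, parameters can be passed in `InferClient`.
 
 ### Q13: Why can't a model deployed with swift stream its output? stream is set to True on the server and on the client, but it still won't stream
 This is controlled by the client; see [examples/deploy/client](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client)
@@ -837,7 +837,7 @@ swift eval --model_type 'qwen2_5-1_5b-instruct' --eval_dataset no --custom_eval_
 This depends on the nltk package, and nltk's tokenizer needs to download a punkt_tab zip file; in some environments in China the download is unstable or fails outright. The code has been changed to add a fallback that works around this problem; see [issue](https://github.com/nltk/nltk/issues/3293)
 
 ### Q6: When evaluating a fine-tuned model, it always stops at a fixed percentage, but the vllm service appears to be running normally throughout. The larger the model, the earlier it disconnects.
-Set the `TIMEOUT` environment variable to -1.
+Set the `SWIFT_TIMEOUT` environment variable to -1.
 
 ### Q7: Does evalscope support multi-model comparison?
 See the [documentation](https://evalscope.readthedocs.io/zh-cn/latest/user_guides/arena.html) for details.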

docs/source_en/Instruction/Command-line-parameters.md

Lines changed: 1 addition & 0 deletions
@@ -834,4 +834,5 @@ The meanings of the following parameters can be found in the example code [here]
 - LOG_LEVEL: The log level, default is 'INFO'. You can set it to 'WARNING', 'ERROR', etc.
 - SWIFT_DEBUG: When set to `'1'` during `engine.infer(...)`, PtEngine will print the contents of `input_ids` and `generate_ids` to facilitate debugging and alignment.
 - VLLM_USE_V1: Used to switch between V0 and V1 versions of vLLM.
+- SWIFT_TIMEOUT: (ms-swift >= 3.10) If the multimodal dataset contains image URLs, this parameter controls the timeout for fetching images, defaulting to 20 seconds.
 - ROOT_IMAGE_DIR: (ms-swift>=3.8) The root directory for image (multimodal) resources. By setting this parameter, relative paths in the dataset can be interpreted relative to `ROOT_IMAGE_DIR`. By default, paths are relative to the current working directory.
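
The `SWIFT_TIMEOUT` entry can be read alongside the `vision_utils.py` change in this commit: the value is parsed as a float, and a non-positive value (such as the `-1` suggested in the FAQ) means no timeout is passed to the request. The following is a minimal sketch of that parsing logic using only the standard library. The helper name `resolve_request_kwargs` is hypothetical and does not exist in ms-swift.

```python
import os
from typing import Dict


def resolve_request_kwargs(env_name: str = 'SWIFT_TIMEOUT', default: str = '20') -> Dict[str, float]:
    """Hypothetical helper mirroring the env-var handling shown in this commit.

    A positive value becomes the request timeout in seconds; zero or a
    negative value yields no `timeout` key, so the request is sent without one.
    """
    timeout = float(os.getenv(env_name, default))
    return {'timeout': timeout} if timeout > 0 else {}


if __name__ == '__main__':
    os.environ['SWIFT_TIMEOUT'] = '-1'
    # Non-positive value: no timeout keyword is produced.
    print(resolve_request_kwargs())
```

Returning an empty dict rather than `{'timeout': None}` keeps the call site a plain `session.get(path, **request_kwargs)`, which matches the shape of the code in the diff.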

docs/source_en/Instruction/Frequently-asked-questions.md

Lines changed: 2 additions & 2 deletions
@@ -760,7 +760,7 @@ RAY_memory_monitor_refresh_ms=0 CUDA_VISIBLE_DEVICES=1 nohup swift deploy --ckpt
 Parameters need to be passed from the client side, `request_config = RequestConfig(..., logprobs=True, top_logprobs=2)`.
 
 ### Q12: Can we set request timeout time for Swift3.0 deployment inference? What happens if the image URL is invalid?
-You can set the `TIMEOUT` environment variable, which defaults to 300 seconds. Alternatively, you can pass parameters in `InferClient`.
+You can set the `SWIFT_TIMEOUT` environment variable. Alternatively, you can pass parameters in `InferClient`.
 
 ### Q13: Why can't I get streaming generation with Swift deployed models? I've set stream to True on both server and client side, but it's still not streaming
 It's controlled by the client side. Please check [examples/deploy/client](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client).
@@ -840,7 +840,7 @@ swift eval --model_type 'qwen2_5-1_5b-instruct' --eval_dataset no --custom_eval_
 This relies on the nltk package, which needs to download a punkt_tab zip file. Some environments in China have unstable or failed downloads. The code has been modified to handle this issue; reference [issue](https://github.com/nltk/nltk/issues/3293).
 
 ### Q6: The model after eval fine-tuning keeps stopping at a fixed percentage, but the vllm service seems to be running normally. The larger the model, the sooner it disconnects.
-Set the `TIMEOUT` environment variable to -1.
+Set the `SWIFT_TIMEOUT` environment variable to -1.
 
 ### Q7: Does evalscope support multi-model comparison?
 See the [documentation](https://evalscope.readthedocs.io/en/latest/user_guides/arena.html) for details.

swift/llm/template/vision_utils.py

Lines changed: 15 additions & 6 deletions
@@ -10,6 +10,8 @@
 import requests
 import torch
 from PIL import Image
+from requests.adapters import HTTPAdapter
+from urllib3.util.retry import Retry
 
 from swift.utils import get_env_args
 
@@ -105,12 +107,19 @@ def load_file(path: Union[str, bytes, _T]) -> Union[BytesIO, _T]:
     if isinstance(path, str):
         path = path.strip()
         if path.startswith('http'):
-            request_kwargs = {}
-            timeout = float(os.getenv('TIMEOUT', '300'))
-            if timeout > 0:
-                request_kwargs['timeout'] = timeout
-            content = requests.get(path, **request_kwargs).content
-            res = BytesIO(content)
+            retries = Retry(total=3, backoff_factor=1, allowed_methods=['GET'])
+            with requests.Session() as session:
+                session.mount('http://', HTTPAdapter(max_retries=retries))
+                session.mount('https://', HTTPAdapter(max_retries=retries))
+
+                timeout = float(os.getenv('SWIFT_TIMEOUT', '20'))
+                request_kwargs = {'timeout': timeout} if timeout > 0 else {}
+
+                response = session.get(path, **request_kwargs)
+                response.raise_for_status()
+                content = response.content
+                res = BytesIO(content)
+
         elif os.path.exists(path) or (not path.startswith('data:') and len(path) <= 200):
             ROOT_IMAGE_DIR = get_env_args('ROOT_IMAGE_DIR', str, None)
             if ROOT_IMAGE_DIR is not None and not os.path.exists(path):
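
The retry pattern introduced in `load_file` can be tried in isolation. The sketch below is a standalone approximation, not the ms-swift implementation: the function names `build_retry_session` and `fetch_bytes` are hypothetical, and it assumes `requests` with urllib3 >= 1.26 (the version that introduced the `allowed_methods` argument).

```python
import os
from io import BytesIO

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_retry_session(total: int = 3, backoff_factor: float = 1.0) -> requests.Session:
    """Hypothetical helper: a Session whose GET requests are retried."""
    retries = Retry(total=total, backoff_factor=backoff_factor, allowed_methods=['GET'])
    adapter = HTTPAdapter(max_retries=retries)
    session = requests.Session()
    # Mount the adapter for both schemes so every URL uses the retry policy.
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session


def fetch_bytes(url: str) -> BytesIO:
    """Hypothetical helper: download `url` and return its body as BytesIO."""
    timeout = float(os.getenv('SWIFT_TIMEOUT', '20'))
    # A non-positive value disables the timeout, as in the diff above.
    request_kwargs = {'timeout': timeout} if timeout > 0 else {}
    with build_retry_session() as session:
        response = session.get(url, **request_kwargs)
        # Raise on 4xx/5xx instead of returning an error page as image bytes.
        response.raise_for_status()
        return BytesIO(response.content)
```

Since no `status_forcelist` is given, my understanding is that the retries here mainly cover connection-level and read failures, while an HTTP error status surfaces through `raise_for_status()` instead of being retried. Treat that as an assumption to verify against the urllib3 version in use.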
