Conversation

@yrk111222
This PR adds support for the ModelScope API. Users can now run inference with models available through the ModelScope API by setting an API key.

How to obtain an API key

Note: Please bind your Alibaba Cloud account before using this feature.

Copilot AI review requested due to automatic review settings November 28, 2025 06:49
Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 11 comments.


model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", 'content': 'Just respond with "Hello"!'},
Copilot AI Nov 28, 2025

Inconsistent quote style: This line uses single quotes while the corresponding line in the else branch (line 58) uses double quotes. For consistency, both should use the same quote style.

Suggested change
{"role": "user", 'content': 'Just respond with "Hello"!'},
{"role": "user", "content": "Just respond with \"Hello\"!"},

)
return True, response.choices[0].message.content
# Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
if "modelscope" in base_url:
Copilot AI Nov 28, 2025

The ModelScope detection logic relies on a simple substring check of "modelscope" in the base_url. This approach is fragile and could match unintended URLs (e.g., "example.com/modelscope-test"). Consider using a more robust detection mechanism, such as checking against the configured ModelScope base URL from the config, or using the LLMServiceEnum to properly identify the service type.
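A hostname-based check along the lines this comment suggests would avoid false matches on URLs that merely contain the word "modelscope". This is a minimal sketch, not code from the PR; the `MODELSCOPE_BASE` constant and the helper name are illustrative, and the real code would read the base URL from the project's config:

```python
from urllib.parse import urlparse

# Assumed to come from the project config in real code.
MODELSCOPE_BASE = "https://api-inference.modelscope.cn/v1"

def is_modelscope(base_url: str) -> bool:
    """Match on the URL's hostname rather than a raw substring of the whole URL."""
    host = urlparse(str(base_url)).hostname or ""
    expected = urlparse(MODELSCOPE_BASE).hostname
    return host == expected or host.endswith(".modelscope.cn")
```

Unlike the substring check, `is_modelscope("https://example.com/modelscope-test")` returns False here, while the real ModelScope inference endpoint still matches.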

for chunk in response_stream:
if chunk.choices and chunk.choices[0].delta.content:
full_answer += chunk.choices[0].delta.content

Copilot AI Nov 28, 2025

Missing error handling for empty streaming responses. If the streaming response yields no content chunks (full_answer remains empty), this will return an empty string. Consider adding validation to ensure a meaningful response was received, similar to how the non-streaming path validates response.choices[0].message.content.

Suggested change
if not full_answer.strip():
return False, "No content received from streaming response."

Comment on lines 159 to 161
fake_message = SimpleNamespace(content=full_content)
fake_choice = SimpleNamespace(message=fake_message)
response = SimpleNamespace(choices=[fake_choice])
Copilot AI Nov 28, 2025

The SimpleNamespace approach creates a fake response object that may not fully conform to the OpenAI API response structure. This could break if the validation logic or downstream code expects additional attributes or methods from the actual ChatCompletion object. Consider using the official OpenAI response classes or ensuring all necessary attributes are present on the SimpleNamespace objects.

Suggested change
fake_message = SimpleNamespace(content=full_content)
fake_choice = SimpleNamespace(message=fake_message)
response = SimpleNamespace(choices=[fake_choice])
import time
fake_message = SimpleNamespace(content=full_content)
fake_choice = SimpleNamespace(message=fake_message, finish_reason="stop", index=0)
response = SimpleNamespace(
id="chatcmpl-fake-modelscope",
object="chat.completion",
created=int(time.time()),
model=model,
choices=[fake_choice],
)

for chunk in response_stream:
if chunk.choices and chunk.choices[0].delta.content:
full_content += chunk.choices[0].delta.content

Copilot AI Nov 28, 2025

Missing error handling for empty streaming responses. If the streaming response yields no content chunks (full_content remains empty), the validation at lines 171-179 may fail or pass with an empty string. Consider adding explicit validation after the streaming loop to ensure meaningful content was received.

Suggested change
if not full_content:
raise ValueError("Streaming response yielded no content")

Comment on lines 31 to 50
extra_body = {"enable_thinking": True}
response_stream = openai.OpenAI(
base_url=base_url, api_key=api_key, timeout=60
).chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", 'content': 'Just respond with "Hello"!'},
],
stream=True,
extra_body=extra_body,
timeout=30,
)

full_answer = ""
for chunk in response_stream:
if chunk.choices and chunk.choices[0].delta.content:
full_answer += chunk.choices[0].delta.content

return True, full_answer.strip()
Copilot AI Nov 28, 2025

The comment states "some models require the stream and enable_thinking parameters", but the code unconditionally applies these settings to all ModelScope API calls. If only certain models require these parameters, this could cause errors or unexpected behavior for models that don't support them. Consider making this behavior configurable per model or checking model compatibility before applying these parameters.

Suggested change
extra_body = {"enable_thinking": True}
response_stream = openai.OpenAI(
base_url=base_url, api_key=api_key, timeout=60
).chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", 'content': 'Just respond with "Hello"!'},
],
stream=True,
extra_body=extra_body,
timeout=30,
)
full_answer = ""
for chunk in response_stream:
if chunk.choices and chunk.choices[0].delta.content:
full_answer += chunk.choices[0].delta.content
return True, full_answer.strip()
# List of ModelScope models that require stream and enable_thinking
models_require_thinking = [
"qwen-72b-chat", "qwen-7b-chat", "qwen-14b-chat", "qwen-1.8b-chat"
# Add other models as needed
]
if model in models_require_thinking:
extra_body = {"enable_thinking": True}
response_stream = openai.OpenAI(
base_url=base_url, api_key=api_key, timeout=60
).chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", 'content': 'Just respond with "Hello"!'},
],
stream=True,
extra_body=extra_body,
timeout=30,
)
full_answer = ""
for chunk in response_stream:
if chunk.choices and chunk.choices[0].delta.content:
full_answer += chunk.choices[0].delta.content
return True, full_answer.strip()
else:
response = openai.OpenAI(
base_url=base_url, api_key=api_key, timeout=60
).chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", 'content': 'Just respond with "Hello"!'},
],
timeout=30,
)
return True, response.choices[0].message.content

"""
client = get_llm_client()
# Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
if "modelscope" in str(client.base_url):
Copilot AI Nov 28, 2025

The ModelScope detection logic relies on a simple substring check of "modelscope" in the base_url. This approach is fragile and could match unintended URLs (e.g., "example.com/modelscope-test"). Consider using a more robust detection mechanism, such as checking against the configured ModelScope base URL from the config, or using the LLMServiceEnum to properly identify the service type.

Comment on lines 140 to 161
if "modelscope" in str(client.base_url):
logger.info("Detected ModelScope API, using stream mode with enable_thinking=True.")

extra_body = {"enable_thinking": True}

response_stream = client.chat.completions.create(
model=model,
messages=messages, # pyright: ignore[reportArgumentType]
temperature=temperature,
stream=True,
extra_body=extra_body,
**kwargs,
)

response = client.chat.completions.create(
model=model,
messages=messages, # pyright: ignore[reportArgumentType]
temperature=temperature,
**kwargs,
)
full_content = ""
for chunk in response_stream:
if chunk.choices and chunk.choices[0].delta.content:
full_content += chunk.choices[0].delta.content

fake_message = SimpleNamespace(content=full_content)
fake_choice = SimpleNamespace(message=fake_message)
response = SimpleNamespace(choices=[fake_choice])
Copilot AI Nov 28, 2025

There is significant code duplication between this ModelScope-specific logic and the similar implementation in check_llm.py (lines 30-50). Consider extracting this into a shared helper function to improve maintainability and ensure consistent behavior across both locations.
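The shared helper the comment proposes could look roughly like this. It is a sketch under assumptions: the helper name `collect_stream_content` is hypothetical, and the chunks are assumed to follow the OpenAI streaming shape (`chunk.choices[0].delta.content`) that both call sites already iterate over:

```python
from types import SimpleNamespace

def collect_stream_content(response_stream) -> str:
    """Accumulate delta content from an OpenAI-style streaming response.

    A single shared helper keeps both call sites consistent, including
    the empty-response handling flagged elsewhere in this review.
    """
    full_content = ""
    for chunk in response_stream:
        if chunk.choices and chunk.choices[0].delta.content:
            full_content += chunk.choices[0].delta.content
    if not full_content.strip():
        raise ValueError("Streaming response yielded no content")
    return full_content

# Quick self-check with fake chunks shaped like streaming deltas:
_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=part))])
    for part in ("Hel", "lo")
]
demo = collect_stream_content(_chunks)  # "Hello"
```

Both `check_llm.py` and the client module could then call the helper instead of duplicating the accumulation loop.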

ValueError: If response is invalid (empty choices or content)
"""
client = get_llm_client()
# Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
Copilot AI Nov 28, 2025

Incorrect indentation: This comment has an extra leading space. It should be aligned with the code at the same indentation level (4 spaces, not 5).

Suggested change
     # Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
    # Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters

Comment on lines 139 to 143
# Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
if "modelscope" in str(client.base_url):
logger.info("Detected ModelScope API, using stream mode with enable_thinking=True.")

extra_body = {"enable_thinking": True}
Copilot AI Nov 28, 2025

The comment states "some models require the stream and enable_thinking parameters", but the code unconditionally applies these settings to all ModelScope API calls. If only certain models require these parameters, this could cause errors or unexpected behavior for models that don't support them. Consider making this behavior configurable per model or checking model compatibility before applying these parameters.

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.



README.md Outdated
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| SiliconCloud | [SiliconCloud official site](https://cloud.siliconflow.cn/i/onCHcaDx); for setup see the [configuration docs](./docs/llm_config.md)<br>Concurrency on this platform is low; keep the thread count at 5 or below. |
| DeepSeek | [DeepSeek official site](https://platform.deepseek.com); the `deepseek-v3` model is recommended,<br>though the official service has seemed somewhat unstable recently. |
| ModelScope | [ModelScope official site](https://modelscope.cn/models?filter=inference_type&page=1&tabKey=task); for setup see the [configuration docs](https://modelscope.cn/docs/model-service/API-Inference/intro)<br>Concurrency on this platform is low; keep the thread count at 5 or below. |
Copilot AI Nov 28, 2025

[nitpick] Inconsistent spacing in the table. "ModelScope" has extra spaces before the pipe character compared to other rows. The cell should be formatted consistently:

Change:

| ModelScope   | [ModelScope official site]...

To match the pattern of other rows like:

| ModelScope     | [ModelScope official site]...

This ensures proper alignment when the markdown is rendered.

Suggested change
| ModelScope   | [ModelScope official site](https://modelscope.cn/models?filter=inference_type&page=1&tabKey=task); for setup see the [configuration docs](https://modelscope.cn/docs/model-service/API-Inference/intro)<br>Concurrency on this platform is low; keep the thread count at 5 or below. |
| ModelScope     | [ModelScope official site](https://modelscope.cn/models?filter=inference_type&page=1&tabKey=task); for setup see the [configuration docs](https://modelscope.cn/docs/model-service/API-Inference/intro)<br>Concurrency on this platform is low; keep the thread count at 5 or below. |


from app.core.utils.cache import get_llm_cache, memoize
from app.core.utils.logger import setup_logger
from types import SimpleNamespace
Copilot AI Nov 28, 2025

Import statement is not in the standard position. According to PEP 8, standard library imports should be grouped together at the top of the file. The from types import SimpleNamespace import should be moved up to line 5-6, grouped with other standard library imports like os and threading, rather than being placed after third-party imports.

"api_base_cfg": cfg.modelscope_api_base,
"model_cfg": cfg.modelscope_model,
"default_base": "https://api-inference.modelscope.cn/v1",
"default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3.1"],
Copilot AI Nov 28, 2025

The model name "Qwen/Qwen3-30B-A3B-Instruct-2507" appears to be non-standard. Based on Qwen's naming conventions, this should likely be verified:

  • Standard Qwen3 models follow patterns like "Qwen/Qwen2.5-72B-Instruct" or "Qwen/Qwen3-8B"
  • The "A3B" designation is unusual and may be a typo or placeholder
  • The "-2507" suffix (likely representing July 2025) seems far in the future

Please verify this model name exists in ModelScope's model registry, or update to a valid model name like "Qwen/Qwen2.5-32B-Instruct" if this was intended.

Suggested change
"default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3.1"],
"default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen2.5-32B-Instruct", "deepseek-ai/DeepSeek-V3.1"],

"api_base_cfg": cfg.modelscope_api_base,
"model_cfg": cfg.modelscope_model,
"default_base": "https://api-inference.modelscope.cn/v1",
"default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3.1"],
Copilot AI Nov 28, 2025

The model name "deepseek-ai/DeepSeek-V3.1" may be incorrect. Based on DeepSeek's release history and the documentation in this repository (which references "deepseek-ai/DeepSeek-V3"), the current latest version is V3, not V3.1.

Please verify this model exists in ModelScope's registry. If it doesn't exist, consider using "deepseek-ai/DeepSeek-V3" instead, which is the verified model name used in the documentation.

Suggested change
"default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3.1"],
"default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3"],

)
return True, response.choices[0].message.content
# Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameter
if "modelscope" in base_url:
Owner

Don't add special-case checks like this inside this function; it's inelegant. Besides, this is only a validation function, so passing the check is enough.

Author

Don't add special-case checks like this inside this function; it's inelegant. Besides, this is only a validation function, so passing the check is enough.

OK, I'll revise it and remove this part of the change.


extra_body = {"enable_thinking": True}

response_stream = client.chat.completions.create(
Owner

Why use streaming here? Surely it supports non-streaming as well?

Author

Why use streaming here? Surely it supports non-streaming as well?

Yes. From my testing, only the QwQ-32B model seems to force streaming; some other models force the enable_thinking parameter to False for non-streaming responses. If we add an extra_body setting option to the config, I think we could drop the QwQ-32B model, and the change would be simpler and cleaner. Would that be acceptable? 😊

if "modelscope" in str(client.base_url):
logger.info("Detected ModelScope API, using stream mode with enable_thinking=True.")

extra_body = {"enable_thinking": True}
Owner

I'd advise against special-casing inside this function. Instead, add an extra_body setting option to the config, pass in extra_body, and **kwargs will receive it.

Author

I'd advise against special-casing inside this function. Instead, add an extra_body setting option to the config, pass in extra_body, and **kwargs will receive it.

Hello, I've updated the extra_body setting. Please take a look when you have time 🙏, and feel free to reach out if there are any issues.
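The config-driven approach discussed above, where extra_body flows through **kwargs instead of a provider check inside the call function, could be sketched as follows. The `llm_extra_body` config key and the `build_llm_kwargs` helper are hypothetical names for illustration, not code from the PR:

```python
import json

def build_llm_kwargs(cfg: dict) -> dict:
    """Collect optional provider-specific settings from config.

    `llm_extra_body` is a hypothetical config key holding either a dict or a
    JSON string, e.g. '{"enable_thinking": true}' for ModelScope models that
    need it. Providers that don't need extra_body simply leave it unset.
    """
    kwargs = {}
    raw = cfg.get("llm_extra_body")
    if raw:
        kwargs["extra_body"] = json.loads(raw) if isinstance(raw, str) else raw
    return kwargs

# The call site stays free of provider special-casing; extra_body simply
# arrives through **kwargs:
#   client.chat.completions.create(model=model, messages=messages,
#                                  **build_llm_kwargs(cfg))
```

This keeps the LLM call function generic: ModelScope-specific behavior lives entirely in configuration, which is what the reviewer asked for.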
