Feature/modelscope support #898
base: master
Conversation
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 11 comments.
app/core/llm/check_llm.py
Outdated
model=model,
messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", 'content': 'Just respond with "Hello"!'},
Copilot AI · Nov 28, 2025
Inconsistent quote style: This line uses single quotes while the corresponding line in the else branch (line 58) uses double quotes. For consistency, both should use the same quote style.
| {"role": "user", 'content': 'Just respond with "Hello"!'}, | |
| {"role": "user", "content": "Just respond with \"Hello\"!"}, |
app/core/llm/check_llm.py
Outdated
)
return True, response.choices[0].message.content
# Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
if "modelscope" in base_url:
Copilot AI · Nov 28, 2025
The ModelScope detection logic relies on a simple substring check of "modelscope" in the base_url. This approach is fragile and could match unintended URLs (e.g., "example.com/modelscope-test"). Consider using a more robust detection mechanism, such as checking against the configured ModelScope base URL from the config, or using the LLMServiceEnum to properly identify the service type.
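A minimal sketch of the host-based check this comment is pointing toward; the helper name and the exact ModelScope domain are assumptions for illustration, not code from this PR:

```python
from urllib.parse import urlparse


def is_modelscope_endpoint(base_url: str) -> bool:
    """Match on the URL's hostname instead of a raw substring, so URLs such as
    "https://example.com/modelscope-test" are not misclassified."""
    host = urlparse(base_url).hostname or ""
    return host == "modelscope.cn" or host.endswith(".modelscope.cn")
```

Checking against the configured ModelScope base URL or an LLMServiceEnum value, as the comment suggests, would be even more direct if that information is available at the call site.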
app/core/llm/check_llm.py
Outdated
for chunk in response_stream:
    if chunk.choices and chunk.choices[0].delta.content:
        full_answer += chunk.choices[0].delta.content
Copilot AI · Nov 28, 2025
Missing error handling for empty streaming responses. If the streaming response yields no content chunks (full_answer remains empty), this will return an empty string. Consider adding validation to ensure a meaningful response was received, similar to how the non-streaming path validates response.choices[0].message.content.
Suggested change:
if not full_answer.strip():
    return False, "No content received from streaming response."
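For illustration, the guard could live next to the streaming loop it protects; the sketch below is a hypothetical helper that reuses the variable names from the excerpt above and is not part of the PR:

```python
from typing import Iterable, Tuple


def collect_stream_answer(response_stream: Iterable) -> Tuple[bool, str]:
    """Concatenate streamed delta content and fail the check explicitly when
    the stream finishes without yielding any content, mirroring the validation
    done on the non-streaming path."""
    full_answer = ""
    for chunk in response_stream:
        if chunk.choices and chunk.choices[0].delta.content:
            full_answer += chunk.choices[0].delta.content
    if not full_answer.strip():
        return False, "No content received from streaming response."
    return True, full_answer.strip()
```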
app/core/llm/client.py
Outdated
fake_message = SimpleNamespace(content=full_content)
fake_choice = SimpleNamespace(message=fake_message)
response = SimpleNamespace(choices=[fake_choice])
Copilot AI · Nov 28, 2025
The SimpleNamespace approach creates a fake response object that may not fully conform to the OpenAI API response structure. This could break if the validation logic or downstream code expects additional attributes or methods from the actual ChatCompletion object. Consider using the official OpenAI response classes or ensuring all necessary attributes are present on the SimpleNamespace objects.
Suggested change:
- fake_message = SimpleNamespace(content=full_content)
- fake_choice = SimpleNamespace(message=fake_message)
- response = SimpleNamespace(choices=[fake_choice])
+ import time
+ fake_message = SimpleNamespace(content=full_content)
+ fake_choice = SimpleNamespace(message=fake_message, finish_reason="stop", index=0)
+ response = SimpleNamespace(
+     id="chatcmpl-fake-modelscope",
+     object="chat.completion",
+     created=int(time.time()),
+     model=model,
+     choices=[fake_choice],
+ )
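The comment also mentions the official OpenAI response classes as an alternative to a hand-rolled namespace. A sketch of that approach, assuming the openai v1 Python SDK's pydantic models (field requirements can vary across SDK versions, and the id below is a placeholder):

```python
import time

from openai.types.chat import ChatCompletion, ChatCompletionMessage
from openai.types.chat.chat_completion import Choice


def completion_from_stream_text(full_content: str, model: str) -> ChatCompletion:
    """Wrap streamed text in a real ChatCompletion so downstream code that
    expects the SDK's response type keeps working unchanged."""
    return ChatCompletion(
        id="chatcmpl-modelscope-stream",  # placeholder, not an API-issued id
        object="chat.completion",
        created=int(time.time()),
        model=model,
        choices=[
            Choice(
                index=0,
                finish_reason="stop",
                logprobs=None,
                message=ChatCompletionMessage(role="assistant", content=full_content),
            )
        ],
    )
```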
app/core/llm/client.py
Outdated
for chunk in response_stream:
    if chunk.choices and chunk.choices[0].delta.content:
        full_content += chunk.choices[0].delta.content
Copilot AI · Nov 28, 2025
Missing error handling for empty streaming responses. If the streaming response yields no content chunks (full_content remains empty), the validation at lines 171-179 may fail or pass with an empty string. Consider adding explicit validation after the streaming loop to ensure meaningful content was received.
Suggested change:
if not full_content:
    raise ValueError("Streaming response yielded no content")
app/core/llm/check_llm.py
Outdated
extra_body = {"enable_thinking": True}
response_stream = openai.OpenAI(
    base_url=base_url, api_key=api_key, timeout=60
).chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", 'content': 'Just respond with "Hello"!'},
    ],
    stream=True,
    extra_body=extra_body,
    timeout=30,
)

full_answer = ""
for chunk in response_stream:
    if chunk.choices and chunk.choices[0].delta.content:
        full_answer += chunk.choices[0].delta.content

return True, full_answer.strip()
Copilot AI · Nov 28, 2025
The comment states "some models require the stream and enable_thinking parameters", but the code unconditionally applies these settings to all ModelScope API calls. If only certain models require these parameters, this could cause errors or unexpected behavior for models that don't support them. Consider making this behavior configurable per model or checking model compatibility before applying these parameters.
Suggested change (replace the unconditional streaming block shown above with a per-model check):

# List of ModelScope models that require stream and enable_thinking
models_require_thinking = [
    "qwen-72b-chat", "qwen-7b-chat", "qwen-14b-chat", "qwen-1.8b-chat"
    # Add other models as needed
]
if model in models_require_thinking:
    extra_body = {"enable_thinking": True}
    response_stream = openai.OpenAI(
        base_url=base_url, api_key=api_key, timeout=60
    ).chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", 'content': 'Just respond with "Hello"!'},
        ],
        stream=True,
        extra_body=extra_body,
        timeout=30,
    )
    full_answer = ""
    for chunk in response_stream:
        if chunk.choices and chunk.choices[0].delta.content:
            full_answer += chunk.choices[0].delta.content
    return True, full_answer.strip()
else:
    response = openai.OpenAI(
        base_url=base_url, api_key=api_key, timeout=60
    ).chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", 'content': 'Just respond with "Hello"!'},
        ],
        timeout=30,
    )
    return True, response.choices[0].message.content
app/core/llm/client.py
Outdated
| """ | ||
| client = get_llm_client() | ||
| # Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters | ||
| if "modelscope" in str(client.base_url): |
Copilot AI · Nov 28, 2025
The ModelScope detection logic relies on a simple substring check of "modelscope" in the base_url. This approach is fragile and could match unintended URLs (e.g., "example.com/modelscope-test"). Consider using a more robust detection mechanism, such as checking against the configured ModelScope base URL from the config, or using the LLMServiceEnum to properly identify the service type.
app/core/llm/client.py
Outdated
| if "modelscope" in str(client.base_url): | ||
| logger.info("Detected ModelScope API, using stream mode with enable_thinking=True.") | ||
|
|
||
| extra_body = {"enable_thinking": True} | ||
|
|
||
| response_stream = client.chat.completions.create( | ||
| model=model, | ||
| messages=messages, # pyright: ignore[reportArgumentType] | ||
| temperature=temperature, | ||
| stream=True, | ||
| extra_body=extra_body, | ||
| **kwargs, | ||
| ) | ||
|
|
||
| response = client.chat.completions.create( | ||
| model=model, | ||
| messages=messages, # pyright: ignore[reportArgumentType] | ||
| temperature=temperature, | ||
| **kwargs, | ||
| ) | ||
| full_content = "" | ||
| for chunk in response_stream: | ||
| if chunk.choices and chunk.choices[0].delta.content: | ||
| full_content += chunk.choices[0].delta.content | ||
|
|
||
| fake_message = SimpleNamespace(content=full_content) | ||
| fake_choice = SimpleNamespace(message=fake_message) | ||
| response = SimpleNamespace(choices=[fake_choice]) |
Copilot AI · Nov 28, 2025
There is significant code duplication between this ModelScope-specific logic and the similar implementation in check_llm.py (lines 30-50). Consider extracting this into a shared helper function to improve maintainability and ensure consistent behavior across both locations.
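One possible shape for such a helper; the function name, placement, and signature below are illustrative assumptions rather than what this PR implements:

```python
from typing import Any, Dict, List

import openai


def stream_modelscope_completion(
    client: openai.OpenAI,
    model: str,
    messages: List[Dict[str, Any]],
    **kwargs: Any,
) -> str:
    """Run a chat completion in streaming mode with enable_thinking set and
    return the concatenated text, so check_llm.py and client.py can share one
    code path (the empty-stream guard from the earlier comment could also live
    here)."""
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,
        extra_body={"enable_thinking": True},
        **kwargs,
    )
    full_content = ""
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            full_content += chunk.choices[0].delta.content
    return full_content
```

check_llm.py could then wrap the returned text in its (ok, message) tuple while client.py wraps it in a response object, keeping the streaming logic in a single place.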
app/core/llm/client.py
Outdated
    ValueError: If response is invalid (empty choices or content)
"""
client = get_llm_client()
 # Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
Copilot AI · Nov 28, 2025
Incorrect indentation: This comment has an extra leading space. It should be aligned with the code at the same indentation level (4 spaces, not 5).
Suggested change:
-  # Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
+ # Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
app/core/llm/client.py
Outdated
# Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameters
if "modelscope" in str(client.base_url):
    logger.info("Detected ModelScope API, using stream mode with enable_thinking=True.")

    extra_body = {"enable_thinking": True}
Copilot AI · Nov 28, 2025
The comment states "some models require the stream and enable_thinking parameters", but the code unconditionally applies these settings to all ModelScope API calls. If only certain models require these parameters, this could cause errors or unexpected behavior for models that don't support them. Consider making this behavior configurable per model or checking model compatibility before applying these parameters.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
README.md
Outdated
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| SiliconCloud | [SiliconCloud 官网](https://cloud.siliconflow.cn/i/onCHcaDx)配置方法请参考[配置文档](./docs/llm_config.md)<br>该并发较低,建议把线程设置为5以下。 |
| DeepSeek | [DeepSeek 官网](https://platform.deepseek.com),建议使用 `deepseek-v3` 模型,<br>官方网站最近服务好像并不太稳定。 |
| ModelScope | [ModelScope 官网](https://modelscope.cn/models?filter=inference_type&page=1&tabKey=task)配置方法请参考[配置文档](https://modelscope.cn/docs/model-service/API-Inference/intro)<br>该并发较低,建议把线程设置为5以下。 |
Copilot AI · Nov 28, 2025
[nitpick] Inconsistent spacing in the table. "ModelScope" has extra spaces before the pipe character compared to other rows. The cell should be formatted consistently.

Change:
| ModelScope | [ModelScope 官网]...

to match the pattern of other rows like:
| ModelScope | [ModelScope 官网]...

This ensures proper alignment when the markdown is rendered.
Suggested change (normalize the cell spacing to match the other rows):
| ModelScope | [ModelScope 官网](https://modelscope.cn/models?filter=inference_type&page=1&tabKey=task)配置方法请参考[配置文档](https://modelscope.cn/docs/model-service/API-Inference/intro)<br>该并发较低,建议把线程设置为5以下。 |
app/core/llm/client.py
Outdated
from app.core.utils.cache import get_llm_cache, memoize
from app.core.utils.logger import setup_logger
from types import SimpleNamespace
Copilot AI · Nov 28, 2025
Import statement is not in the standard position. According to PEP 8, standard library imports should be grouped together at the top of the file. The from types import SimpleNamespace import should be moved up to line 5-6, grouped with other standard library imports like os and threading, rather than being placed after third-party imports.
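For illustration, the grouping PEP 8 asks for would look roughly like this; the os and threading imports are taken from the comment's description of the file, and any third-party imports (for example openai) would form a middle group, so treat the exact list as an assumption:

```python
# Standard library imports, grouped first
import os
import threading
from types import SimpleNamespace

# Local application imports, grouped last
from app.core.utils.cache import get_llm_cache, memoize
from app.core.utils.logger import setup_logger
```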
| "api_base_cfg": cfg.modelscope_api_base, | ||
| "model_cfg": cfg.modelscope_model, | ||
| "default_base": "https://api-inference.modelscope.cn/v1", | ||
| "default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3.1"], |
Copilot AI · Nov 28, 2025
The model name "Qwen/Qwen3-30B-A3B-Instruct-2507" appears to be non-standard. Based on Qwen's naming conventions, this should likely be verified:
- Standard Qwen3 models follow patterns like "Qwen/Qwen2.5-72B-Instruct" or "Qwen/Qwen3-8B"
- The "A3B" designation is unusual and may be a typo or placeholder
- The "-2507" suffix (likely representing July 2025) seems far in the future
Please verify this model name exists in ModelScope's model registry, or update to a valid model name like "Qwen/Qwen2.5-32B-Instruct" if this was intended.
| "default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3.1"], | |
| "default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen2.5-32B-Instruct", "deepseek-ai/DeepSeek-V3.1"], |
| "api_base_cfg": cfg.modelscope_api_base, | ||
| "model_cfg": cfg.modelscope_model, | ||
| "default_base": "https://api-inference.modelscope.cn/v1", | ||
| "default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3.1"], |
Copilot AI · Nov 28, 2025
The model name "deepseek-ai/DeepSeek-V3.1" may be incorrect. Based on DeepSeek's release history and the documentation in this repository (which references "deepseek-ai/DeepSeek-V3"), the current latest version is V3, not V3.1.
Please verify this model exists in ModelScope's registry. If it doesn't exist, consider using "deepseek-ai/DeepSeek-V3" instead, which is the verified model name used in the documentation.
| "default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3.1"], | |
| "default_models": ["Qwen/Qwen3-8B", "Qwen/Qwen3-30B-A3B-Instruct-2507", "deepseek-ai/DeepSeek-V3"], |
app/core/llm/check_llm.py
Outdated
)
return True, response.choices[0].message.content
# Check whether it is the ModelScope platform, as some models require the stream and enable_thinking parameter
if "modelscope" in base_url:
Don't do this kind of special-case check inside this function; it is inelegant. Besides, this is only a check function, so it just needs to confirm the check passes.
> Don't do this kind of special-case check inside this function; it is inelegant. Besides, this is only a check function, so it just needs to confirm the check passes.

OK, I'll revise it and remove this part of the change.
app/core/llm/client.py
Outdated
extra_body = {"enable_thinking": True}

response_stream = client.chat.completions.create(
Why use streaming here? It surely supports non-streaming as well, right?
> Why use streaming here? It surely supports non-streaming as well, right?

Yes. In my testing, it seems only the QwQ-32B model forces streaming, while some other models force the enable_thinking parameter to False for non-streaming responses. If we add an extra_body setting option to the config, I think we can drop QwQ-32B, and the change would be simpler and cleaner. Would that be acceptable? 😊
app/core/llm/client.py
Outdated
| if "modelscope" in str(client.base_url): | ||
| logger.info("Detected ModelScope API, using stream mode with enable_thinking=True.") | ||
|
|
||
| extra_body = {"enable_thinking": True} |
I don't recommend handling this with a special-case check inside this function. Instead, add an extra_body setting option to the config and pass extra_body in, and **kwargs will receive it.
> I don't recommend handling this with a special-case check inside this function. Instead, add an extra_body setting option to the config and pass extra_body in, and **kwargs will receive it.

Hello, I've updated the extra_body setting. Please take a look when you have time 🙏, and feel free to contact me if there are any issues.
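A rough sketch of the config-driven wiring being discussed; the helper signature and the cfg.llm_extra_body attribute are hypothetical and only show how an extra_body value could flow through **kwargs without any ModelScope-specific branching:

```python
from typing import Any, Dict, List, Optional

import openai


def chat_completion(
    client: openai.OpenAI,
    model: str,
    messages: List[Dict[str, Any]],
    temperature: float = 0.0,
    extra_body: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
):
    """Forward an optional extra_body (e.g. {"enable_thinking": True} for the
    ModelScope models that need it) straight to the OpenAI-compatible client,
    so callers decide via configuration rather than URL sniffing."""
    if extra_body:
        kwargs["extra_body"] = extra_body
    return client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
        **kwargs,
    )


# Hypothetical call site: the value comes from configuration, e.g.
# response = chat_completion(client, cfg.model, messages, extra_body=cfg.llm_extra_body)
```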
This PR primarily adds support for the ModelScope API. Users can now utilize models supported by ModelScope API inference by setting an API key.
How to obtain an API key
Note: Please bind your Alibaba Cloud account before using this feature.