Conversation

@Dt8333 (Member) commented Nov 30, 2025

Fix #3770

Decide whether an image needs captioning based on the model's capabilities, instead of captioning every image.

Modifications

packages/astrbot/process_llm_request.py

  • Duplicated `_select_provider()` in ProcessLLMRequest to obtain the LLM provider
    (perhaps a global `_select_provider` would be better)

  • Added a check: when the provider's capabilities do not support image input, run `_ensure_img_caption()`

  • This is NOT a breaking change.
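The check described above can be sketched as follows. This is a hypothetical stand-in for illustration only: the function name and the plain-dict config are assumptions, not AstrBot's actual API.

```python
# Hypothetical sketch of the capability check added in this PR; the function
# name and dict-based config are stand-ins, not AstrBot's real objects.

def needs_image_caption(provider_config: dict, image_urls: list) -> bool:
    """Caption images only when the provider cannot accept image input."""
    # A missing "modalities" key defaults to ["image"], i.e. assume image
    # support, mirroring the `.get("modalities", ["image"])` default in the PR.
    modalities = provider_config.get("modalities", ["image"])
    return bool(image_urls) and "image" not in modalities
```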

Screenshots or Test Results


Checklist

  • 😊 If there are new features added in the PR, I have discussed them with the authors through issues/emails, etc.
  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Adjust image captioning behavior in LLM request processing to depend on the selected provider’s capabilities.

Bug Fixes:

  • Avoid unnecessary image caption generation when the selected provider already supports image input modalities.

Enhancements:

  • Introduce a helper to resolve the active LLM provider from the current event for capability-based decisions.

The auto-assign bot requested review from Fridemn and Raven95676 on November 30, 2025 02:53
@sourcery-ai bot (Contributor) left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The new _select_provider helper duplicates existing logic mentioned in the description; consider centralizing this provider-selection logic in a shared helper to avoid divergence between call sites.
  • In the image caption condition, self._select_provider(event) can return None (e.g., when an invalid selected_provider is set), which would cause an AttributeError when accessing .provider_config; add a defensive check or early return to handle this case safely.
  • You currently call _select_provider(event) inside the if condition; consider resolving the provider once, storing it in a local variable, and reusing it to avoid repeated lookup and make the logic clearer.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The new `_select_provider` helper duplicates existing logic mentioned in the description; consider centralizing this provider-selection logic in a shared helper to avoid divergence between call sites.
- In the `image caption` condition, `self._select_provider(event)` can return `None` (e.g., when an invalid `selected_provider` is set), which would cause an `AttributeError` when accessing `.provider_config`; add a defensive check or early return to handle this case safely.
- You currently call `_select_provider(event)` inside the `if` condition; consider resolving the provider once, storing it in a local variable, and reusing it to avoid repeated lookup and make the logic clearer.

## Individual Comments

### Comment 1
<location> `packages/astrbot/process_llm_request.py:118-126` </location>
<code_context>
             f"Cannot get image caption because provider `{provider_id}` is not exist.",
         )

+    def _select_provider(self, event: AstrMessageEvent):
+        """选择使用的 LLM 提供商"""
+        sel_provider = event.get_extra("selected_provider")
+        _ctx = self.ctx
+        if sel_provider and isinstance(sel_provider, str):
+            provider = _ctx.get_provider_by_id(sel_provider)
+            if not provider:
+                logger.error(f"未找到指定的提供商: {sel_provider}。")
+            return provider
+
+        return _ctx.get_using_provider(umo=event.unified_msg_origin)
</code_context>

<issue_to_address>
**issue (bug_risk):** When a selected provider ID is invalid, `_select_provider` returns `None` instead of falling back to the default provider, which can cause downstream errors.

In the `sel_provider` branch, if `get_provider_by_id` returns `None` you log an error but still return `None`, which will cause an exception in `process_llm_request` when `.provider_config` is accessed. Instead, either fall back to `_ctx.get_using_provider(umo=event.unified_msg_origin)` when the selected provider is invalid, or raise an explicit error here and handle it at the call site. This avoids the current silent failure mode and unexpected `None` downstream.
</issue_to_address>

### Comment 2
<location> `packages/astrbot/process_llm_request.py:130` </location>
<code_context>
    async def process_llm_request(self, event: AstrMessageEvent, req: ProviderRequest):
        """在请求 LLM 前注入人格信息、Identifier、时间、回复内容等 System Prompt"""
        cfg: dict = self.ctx.get_config(umo=event.unified_msg_origin)[
            "provider_settings"
        ]

        # prompt prefix
        if prefix := cfg.get("prompt_prefix"):
            # 支持 {{prompt}} 作为用户输入的占位符
            if "{{prompt}}" in prefix:
                req.prompt = prefix.replace("{{prompt}}", req.prompt)
            else:
                req.prompt = prefix + req.prompt

        # user identifier
        if cfg.get("identifier"):
            user_id = event.message_obj.sender.user_id
            user_nickname = event.message_obj.sender.nickname
            req.prompt = (
                f"\n[User ID: {user_id}, Nickname: {user_nickname}]\n{req.prompt}"
            )

        # group name identifier
        if cfg.get("group_name_display") and event.message_obj.group_id:
            group_name = event.message_obj.group.group_name
            if group_name:
                req.system_prompt += f"\nGroup name: {group_name}\n"

        # time info
        if cfg.get("datetime_system_prompt"):
            current_time = None
            if self.timezone:
                # 启用时区
                try:
                    now = datetime.datetime.now(zoneinfo.ZoneInfo(self.timezone))
                    current_time = now.strftime("%Y-%m-%d %H:%M (%Z)")
                except Exception as e:
                    logger.error(f"时区设置错误: {e}, 使用本地时区")
            if not current_time:
                current_time = (
                    datetime.datetime.now().astimezone().strftime("%Y-%m-%d %H:%M (%Z)")
                )
            req.system_prompt += f"\nCurrent datetime: {current_time}\n"

        img_cap_prov_id: str = cfg.get("default_image_caption_provider_id") or ""
        if req.conversation:
            # inject persona for this request
            await self._ensure_persona(req, cfg, event.unified_msg_origin)

            # image caption
            if (
                img_cap_prov_id
                and req.image_urls
                and "image"
                not in self._select_provider(event).provider_config.get(
                    "modalities", ["image"]
                )
            ):
                await self._ensure_img_caption(req, cfg, img_cap_prov_id)

        # quote message processing
        # 解析引用内容
        quote = None
        for comp in event.message_obj.message:
            if isinstance(comp, Reply):
                quote = comp
                break
        if quote:
            sender_info = ""
            if quote.sender_nickname:
                sender_info = f"(Sent by {quote.sender_nickname})"
            message_str = quote.message_str or "[Empty Text]"
            req.system_prompt += (
                f"\nUser is quoting a message{sender_info}.\n"
                f"Here are the information of the quoted message: Text Content: {message_str}.\n"
            )
            image_seg = None
            if quote.chain:
                for comp in quote.chain:
                    if isinstance(comp, Image):
                        image_seg = comp
                        break
            if image_seg:
                try:
                    prov = None
                    if img_cap_prov_id:
                        prov = self.ctx.get_provider_by_id(img_cap_prov_id)
                    if prov is None:
                        prov = self.ctx.get_using_provider(event.unified_msg_origin)
                    if prov and isinstance(prov, Provider):
                        llm_resp = await prov.text_chat(
                            prompt="Please describe the image content.",
                            image_urls=[await image_seg.convert_to_file_path()],
                        )
                        if llm_resp.completion_text:
                            req.system_prompt += (
                                f"Image Caption: {llm_resp.completion_text}\n"
                            )
                    else:
                        logger.warning("No provider found for image captioning.")
                except BaseException as e:
                    logger.error(f"处理引用图片失败: {e}")

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Use named expression to simplify assignment and conditional [×2] ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Use the built-in function `next` instead of a for-loop ([`use-next`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-next/))
- Low code quality found in ProcessLLMRequest.process\_llm\_request - 12% ([`low-code-quality`](https://docs.sourcery.ai/Reference/Default-Rules/comments/low-code-quality/))

<br/><details><summary>Explanation</summary>


The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

- Reduce the function length by extracting pieces of functionality out into
  their own functions. This is the most important thing you can do - ideally a
  function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts
  sits together within the function rather than being scattered.</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +118 to +126
    def _select_provider(self, event: AstrMessageEvent):
        """选择使用的 LLM 提供商"""
        sel_provider = event.get_extra("selected_provider")
        _ctx = self.ctx
        if sel_provider and isinstance(sel_provider, str):
            provider = _ctx.get_provider_by_id(sel_provider)
            if not provider:
                logger.error(f"未找到指定的提供商: {sel_provider}。")
            return provider
issue (bug_risk): When a selected provider ID is invalid, _select_provider returns None instead of falling back to the default provider, which can cause downstream errors.

In the sel_provider branch, if get_provider_by_id returns None you log an error but still return None, which will cause an exception in process_llm_request when .provider_config is accessed. Instead, either fall back to _ctx.get_using_provider(umo=event.unified_msg_origin) when the selected provider is invalid, or raise an explicit error here and handle it at the call site. This avoids the current silent failure mode and unexpected None downstream.
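One possible shape of the suggested fallback, sketched below. `Registry` is a hypothetical stand-in for AstrBot's context object (`self.ctx`); only the control flow mirrors the reviewer's suggestion.

```python
# Sketch of the suggested fallback behavior: on an invalid selected ID, fall
# through to the default provider instead of returning None. `Registry` is a
# hypothetical stand-in for AstrBot's context object.
class Registry:
    def __init__(self, providers: dict, default):
        self.providers = providers
        self.default = default

    def get_provider_by_id(self, pid):
        return self.providers.get(pid)

    def get_using_provider(self, umo=None):
        return self.default


def select_provider(ctx: Registry, selected_id, umo=None):
    """Resolve the selected provider; fall back to the default on an invalid ID."""
    if selected_id and isinstance(selected_id, str):
        provider = ctx.get_provider_by_id(selected_id)
        if provider is not None:
            return provider
        # Invalid ID: fall through to the default instead of returning None.
    return ctx.get_using_provider(umo=umo)
```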

@Dt8333 (Member, Author) replied:

Thinking about it: this was copied from astrbot\core\pipeline\process_stage\method\agent_sub_stages\internal.py. If we change it, I'd suggest changing both copies together.


        return _ctx.get_using_provider(umo=event.unified_msg_origin)

    async def process_llm_request(self, event: AstrMessageEvent, req: ProviderRequest):

issue (code-quality): We've found these issues:

  • Use named expression to simplify assignment and conditional [×2] (`use-named-expression`)
  • Use the built-in function `next` instead of a for-loop (`use-next`)
  • Low code quality found in ProcessLLMRequest.process_llm_request - 12% (`low-code-quality`)


Explanation

The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

  • Reduce the function length by extracting pieces of functionality out into
    their own functions. This is the most important thing you can do - ideally a
    function should be less than 10 lines.
  • Reduce nesting, perhaps by introducing guard clauses to return early.
  • Ensure that variables are tightly scoped, so that code using related concepts
    sits together within the function rather than being scattered.
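The first two refactorings can be sketched against the quote-parsing code: `next()` replaces the manual search loop, and a named expression folds the assignment into the condition. `Reply` here is a minimal hypothetical stand-in, not AstrBot's actual component class.

```python
# Sketch of the `use-next` and `use-named-expression` refactorings applied to
# the quote-parsing pattern. `Reply` is a minimal stand-in class.
class Reply:
    def __init__(self, message_str=""):
        self.message_str = message_str


def find_quote(message_chain):
    # use-next: return the first Reply component, or None if there is none
    return next((c for c in message_chain if isinstance(c, Reply)), None)


def quoted_text(message_chain):
    # use-named-expression: assign and test in a single condition
    if (quote := find_quote(message_chain)) is not None:
        return quote.message_str or "[Empty Text]"
    return None
```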


Development

Successfully merging this pull request may close these issues.

[Bug] Group-chat context-aware image understanding

2 participants