feat(imageCaption): Decide whether images need captioning based on model capabilities, instead of captioning all of them. #3849
base: master
Conversation
Hey there - I've reviewed your changes - here's some feedback:
- The new `_select_provider` helper duplicates existing logic mentioned in the description; consider centralizing this provider-selection logic in a shared helper to avoid divergence between call sites.
- In the `image caption` condition, `self._select_provider(event)` can return `None` (e.g., when an invalid `selected_provider` is set), which would cause an `AttributeError` when accessing `.provider_config`; add a defensive check or early return to handle this case safely.
- You currently call `_select_provider(event)` inside the `if` condition; consider resolving the provider once, storing it in a local variable, and reusing it to avoid repeated lookup and make the logic clearer.
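The last two points can be sketched together: resolve the provider once, keep it in a local variable, and guard against `None` before touching `provider_config`. This is a minimal sketch with stand-in classes, not the real AstrBot types:

```python
# Hypothetical sketch: resolve the provider once and guard against None.
# FakeProvider stands in for the real provider object.

class FakeProvider:
    def __init__(self, modalities):
        self.provider_config = {"modalities": modalities}


def needs_caption(provider, img_cap_prov_id, image_urls):
    """Return True when images should be captioned by a separate provider."""
    if not img_cap_prov_id or not image_urls:
        return False
    if provider is None:  # defensive check instead of risking AttributeError
        return False
    # an absent "modalities" key defaults to ["image"], i.e. assume image support
    return "image" not in provider.provider_config.get("modalities", ["image"])


text_only = FakeProvider(["text"])
vision = FakeProvider(["text", "image"])
```

With this shape, the image-caption branch becomes a single readable call such as `if needs_caption(provider, img_cap_prov_id, req.image_urls): ...`.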
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `_select_provider` helper duplicates existing logic mentioned in the description; consider centralizing this provider-selection logic in a shared helper to avoid divergence between call sites.
- In the `image caption` condition, `self._select_provider(event)` can return `None` (e.g., when an invalid `selected_provider` is set), which would cause an `AttributeError` when accessing `.provider_config`; add a defensive check or early return to handle this case safely.
- You currently call `_select_provider(event)` inside the `if` condition; consider resolving the provider once, storing it in a local variable, and reusing it to avoid repeated lookup and make the logic clearer.
## Individual Comments
### Comment 1
<location> `packages/astrbot/process_llm_request.py:118-126` </location>
<code_context>
f"Cannot get image caption because provider `{provider_id}` is not exist.",
)
+ def _select_provider(self, event: AstrMessageEvent):
+ """选择使用的 LLM 提供商"""
+ sel_provider = event.get_extra("selected_provider")
+ _ctx = self.ctx
+ if sel_provider and isinstance(sel_provider, str):
+ provider = _ctx.get_provider_by_id(sel_provider)
+ if not provider:
+ logger.error(f"未找到指定的提供商: {sel_provider}。")
+ return provider
+
+ return _ctx.get_using_provider(umo=event.unified_msg_origin)
</code_context>
<issue_to_address>
**issue (bug_risk):** When a selected provider ID is invalid, `_select_provider` returns `None` instead of falling back to the default provider, which can cause downstream errors.
In the `sel_provider` branch, if `get_provider_by_id` returns `None` you log an error but still return `None`, which will cause an exception in `process_llm_request` when `.provider_config` is accessed. Instead, either fall back to `_ctx.get_using_provider(umo=event.unified_msg_origin)` when the selected provider is invalid, or raise an explicit error here and handle it at the call site. This avoids the current silent failure mode and unexpected `None` downstream.
</issue_to_address>
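One possible shape for the suggested fix, sketched with stub objects in place of the real AstrBot context: fall back to the default provider when the selected ID cannot be resolved, rather than returning `None`.

```python
# Sketch only: StubCtx mimics the context lookup methods used by the helper.
import logging

logger = logging.getLogger(__name__)


class StubCtx:
    def __init__(self, providers, default):
        self._providers = providers
        self._default = default

    def get_provider_by_id(self, pid):
        return self._providers.get(pid)

    def get_using_provider(self, umo=None):
        return self._default


def select_provider(ctx, selected_provider, umo=None):
    """Resolve a provider, falling back to the default instead of None."""
    if selected_provider and isinstance(selected_provider, str):
        provider = ctx.get_provider_by_id(selected_provider)
        if provider:
            return provider
        # log and fall through to the default instead of returning None
        logger.error("Provider %r not found; falling back to default.", selected_provider)
    return ctx.get_using_provider(umo=umo)


ctx = StubCtx({"vision": "vision-provider"}, default="default-provider")
```

The alternative mentioned in the review — raising an explicit error here and handling it at the call site — trades the silent fallback for a louder failure; either removes the `None` path.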
### Comment 2
<location> `packages/astrbot/process_llm_request.py:130` </location>
<code_context>
async def process_llm_request(self, event: AstrMessageEvent, req: ProviderRequest):
"""在请求 LLM 前注入人格信息、Identifier、时间、回复内容等 System Prompt"""
cfg: dict = self.ctx.get_config(umo=event.unified_msg_origin)[
"provider_settings"
]
# prompt prefix
if prefix := cfg.get("prompt_prefix"):
# 支持 {{prompt}} 作为用户输入的占位符
if "{{prompt}}" in prefix:
req.prompt = prefix.replace("{{prompt}}", req.prompt)
else:
req.prompt = prefix + req.prompt
# user identifier
if cfg.get("identifier"):
user_id = event.message_obj.sender.user_id
user_nickname = event.message_obj.sender.nickname
req.prompt = (
f"\n[User ID: {user_id}, Nickname: {user_nickname}]\n{req.prompt}"
)
# group name identifier
if cfg.get("group_name_display") and event.message_obj.group_id:
group_name = event.message_obj.group.group_name
if group_name:
req.system_prompt += f"\nGroup name: {group_name}\n"
# time info
if cfg.get("datetime_system_prompt"):
current_time = None
if self.timezone:
# 启用时区
try:
now = datetime.datetime.now(zoneinfo.ZoneInfo(self.timezone))
current_time = now.strftime("%Y-%m-%d %H:%M (%Z)")
except Exception as e:
logger.error(f"时区设置错误: {e}, 使用本地时区")
if not current_time:
current_time = (
datetime.datetime.now().astimezone().strftime("%Y-%m-%d %H:%M (%Z)")
)
req.system_prompt += f"\nCurrent datetime: {current_time}\n"
img_cap_prov_id: str = cfg.get("default_image_caption_provider_id") or ""
if req.conversation:
# inject persona for this request
await self._ensure_persona(req, cfg, event.unified_msg_origin)
# image caption
if (
img_cap_prov_id
and req.image_urls
and "image"
not in self._select_provider(event).provider_config.get(
"modalities", ["image"]
)
):
await self._ensure_img_caption(req, cfg, img_cap_prov_id)
# quote message processing
# 解析引用内容
quote = None
for comp in event.message_obj.message:
if isinstance(comp, Reply):
quote = comp
break
if quote:
sender_info = ""
if quote.sender_nickname:
sender_info = f"(Sent by {quote.sender_nickname})"
message_str = quote.message_str or "[Empty Text]"
req.system_prompt += (
f"\nUser is quoting a message{sender_info}.\n"
f"Here are the information of the quoted message: Text Content: {message_str}.\n"
)
image_seg = None
if quote.chain:
for comp in quote.chain:
if isinstance(comp, Image):
image_seg = comp
break
if image_seg:
try:
prov = None
if img_cap_prov_id:
prov = self.ctx.get_provider_by_id(img_cap_prov_id)
if prov is None:
prov = self.ctx.get_using_provider(event.unified_msg_origin)
if prov and isinstance(prov, Provider):
llm_resp = await prov.text_chat(
prompt="Please describe the image content.",
image_urls=[await image_seg.convert_to_file_path()],
)
if llm_resp.completion_text:
req.system_prompt += (
f"Image Caption: {llm_resp.completion_text}\n"
)
else:
logger.warning("No provider found for image captioning.")
except BaseException as e:
logger.error(f"处理引用图片失败: {e}")
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Use named expression to simplify assignment and conditional [×2] ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Use the built-in function `next` instead of a for-loop ([`use-next`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-next/))
- Low code quality found in ProcessLLMRequest.process_llm_request - 12% ([`low-code-quality`](https://docs.sourcery.ai/Reference/Default-Rules/comments/low-code-quality/))
<br/><details><summary>Explanation</summary>
The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.
How can you solve this?
It might be worth refactoring this function to make it shorter and more readable.
- Reduce the function length by extracting pieces of functionality out into their own functions. This is the most important thing you can do - ideally a function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts sits together within the function rather than being scattered.</details>
</issue_to_address>
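The first two suggestions are mechanical; here is a minimal sketch applied to stand-in component classes rather than the real message types:

```python
# Sketch of the two refactors: next() instead of a scan-and-break loop,
# and a named expression (walrus) to bind and test in one step.
# Reply and Image are simplified stand-ins for the message components.

class Reply:
    pass


class Image:
    pass


message = [Image(), Reply(), Image()]

# use-next: replaces `for comp in message: if isinstance(comp, Reply): quote = comp; break`
quote = next((comp for comp in message if isinstance(comp, Reply)), None)

# use-named-expression: bind image_seg and branch on it in one expression
chain = [Reply(), Image()]
if (image_seg := next((comp for comp in chain if isinstance(comp, Image)), None)):
    caption_needed = True
else:
    caption_needed = False
```

Both patterns shrink the quote-parsing section of `process_llm_request`, which also helps the length and nesting complaints behind the low quality score.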
def _select_provider(self, event: AstrMessageEvent):
    """选择使用的 LLM 提供商"""
    sel_provider = event.get_extra("selected_provider")
    _ctx = self.ctx
    if sel_provider and isinstance(sel_provider, str):
        provider = _ctx.get_provider_by_id(sel_provider)
        if not provider:
            logger.error(f"未找到指定的提供商: {sel_provider}。")
        return provider
issue (bug_risk): When a selected provider ID is invalid, `_select_provider` returns `None` instead of falling back to the default provider, which can cause downstream errors.
In the `sel_provider` branch, if `get_provider_by_id` returns `None` you log an error but still return `None`, which will cause an exception in `process_llm_request` when `.provider_config` is accessed. Instead, either fall back to `_ctx.get_using_provider(umo=event.unified_msg_origin)` when the selected provider is invalid, or raise an explicit error here and handle it at the call site. This avoids the current silent failure mode and unexpected `None` downstream.
Thinking out loud: this was copied from astrbot\core\pipeline\process_stage\method\agent_sub_stages\internal.py.
If we change it, I'd suggest changing both copies together.
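Since the helper now exists in two places, one option (combined with the fallback fix suggested above, and hedged as a sketch rather than the project's actual layout) is to hoist the logic into a module-level function that both `internal.py` and `process_llm_request.py` import. The classes below are simplified stand-ins for the real event and context:

```python
# Hypothetical shared helper so both call sites stay in sync.

def select_provider(ctx, event):
    """Prefer an explicitly selected provider, else the session default."""
    sel = event.get_extra("selected_provider")
    if sel and isinstance(sel, str):
        provider = ctx.get_provider_by_id(sel)
        if provider:
            return provider
    # invalid or missing selection falls through to the default provider
    return ctx.get_using_provider(umo=event.unified_msg_origin)


class StubEvent:
    unified_msg_origin = "umo"

    def __init__(self, extra):
        self._extra = extra

    def get_extra(self, key):
        return self._extra.get(key)


class StubCtx:
    def get_provider_by_id(self, pid):
        return {"vision": "vision-provider"}.get(pid)

    def get_using_provider(self, umo=None):
        return "default-provider"
```

A free function (or a `@staticmethod` on a shared base class) keeps one definition to test and prevents the two copies from drifting apart.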
    return _ctx.get_using_provider(umo=event.unified_msg_origin)

async def process_llm_request(self, event: AstrMessageEvent, req: ProviderRequest):
Fix #3770
Decide whether images need captioning based on the model's capabilities, instead of captioning all of them.
Modifications / 改动点
packages\astrbot\process_llm_request.py
Duplicated `_select_provider()` in `ProcessLLMRequest` to obtain the LLM `Provider`
(perhaps providing a global `_select_provider` would be better)
Added a check: when the provider's capabilities do not include image support, run `_ensure_img_caption()`
This is NOT a breaking change. / 这不是一个破坏性变更。
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in `requirements.txt` and `pyproject.toml`.

Summary by Sourcery
Adjust image captioning behavior in LLM request processing to depend on the selected provider's capabilities.
Bug Fixes:
Enhancements: