Skip to content

fix: sanitize multimodal history before LLM context compression with text-only providers #8593

@langyun147

Description

@langyun147

What happened / 发生了什么

When context_limit_reached_strategy is set to llm_compress, context compression can fail if older conversation history contains image content blocks such as:

{"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}

This is especially visible when the compression provider is a text-only / non-multimodal model such as DeepSeek (modalities: text, tool_use). The normal chat path may work with a multimodal provider, but compression fails because the summarization request sends the old multimodal history directly to the compression provider.

Related existing issues about non-multimodal models receiving image_url: #8489, #8497, #7192. This issue is specifically about the LLM context compression path.

Reproduce / 如何复现?

  1. Configure a conversation with context_limit_reached_strategy = "llm_compress".
  2. Configure llm_compress_provider_id to a non-multimodal DeepSeek model, for example deepseek-v4-flash or deepseek-v4-pro, whose modalities do not include image.
  3. Have existing conversation history that contains an image block stored as image_url with base64 data (data:image/...;base64,...).
  4. Let the context exceed the compression threshold so LLMSummaryCompressor runs.
  5. The compression request is sent to DeepSeek with the image block still present, and the API rejects the payload.

Expected behavior / 期望行为

Before calling the compression provider, AstrBot should sanitize the history according to the compression provider's supported modalities. For text-only providers, image/audio content should be replaced with text placeholders such as [Image] / [Audio], or otherwise dropped safely, so summary compression can still proceed.

Actual behavior / 实际行为

LLMSummaryCompressor builds:

llm_payload = messages_to_summarize + [instruction_message]
response = await self.provider.text_chat(contexts=llm_payload)

without applying sanitize_contexts_by_modalities(...) for the compression provider. As a result, text-only providers can receive OpenAI-style multimodal image_url content and return an error similar to:

400 invalid_request_error: unknown variant `image_url`, expected `text`

Possible cause / 可能原因

In the normal agent request path, contexts can be sanitized based on provider modalities:

  • astrbot/core/provider/modalities.py::sanitize_contexts_by_modalities
  • astrbot/core/agent/runners/tool_loop_agent_runner.py::_sanitize_contexts_for_provider

However, astrbot/core/agent/context/compressor.py::LLMSummaryCompressor.__call__ directly calls self.provider.text_chat(contexts=llm_payload) and does not run the same modality sanitization for the compression provider.

A possible fix is to apply sanitize_contexts_by_modalities(llm_payload, self.provider.provider_config.get("modalities")) before text_chat(...), so compression providers that only support text never receive image_url / audio_url blocks.

AstrBot version, deployment method, provider used, and messaging platform used / AstrBot 版本、部署方式、提供商、消息平台

Observed from current local source behavior. Provider configuration example:

  • Main/chat provider: can be multimodal or text-only
  • Compression provider: DeepSeek (deepseek-v4-flash / deepseek-v4-pro), modalities = ["text", "tool_use"]
  • Context strategy: llm_compress
  • History contains base64 image content blocks

OS

Windows

Logs / 报错日志

Typical provider error:

Error code: 400 - {'error': {'message': 'Failed to deserialize the JSON body into the target type: messages[n]: unknown variant `image_url`, expected `text`', 'type': 'invalid_request_error', 'code': 'invalid_request_error'}}

Are you willing to submit a PR? / 你愿意提交 PR 吗?

  • Yes!

Code of Conduct

  • I have read and agree to abide by the project's Code of Conduct.

Metadata

Metadata

Assignees

No one assigned

    Labels

    area:providerThe bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions