
fix(core): filter unsupported file content types in memory per model capability#37043

Open: kuun993 wants to merge 2 commits into langgenius:main from kuun993:feat-file-context


kuun993 commented Jun 4, 2026

Fixes #37042

Summary

When a workflow LLM node switches to a model that does not support certain multimodal content types (e.g., from GPT-4o to GPT-4, which lacks vision, or to a model without document support), TokenBufferMemory still includes all file content (ImagePromptMessageContent, DocumentPromptMessageContent, etc.) in the history prompt messages. As a result, multimodal content that the model cannot accept is passed to the LLM invocation.

This PR adds content-type-level filtering in _build_prompt_message_with_files: after converting files to prompt message contents, each non-text content type is checked against the model schema, and unsupported types are filtered out.

Changes

  • api/core/memory/token_buffer_memory.py:
    • In _build_prompt_message_with_files, filter prompt_message_contents by model capability via model_schema.supports_prompt_content_type(), keeping only TEXT and content types the current model supports.
    • Import PromptMessageContentType for content type checking.
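
The filtering described above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the PR's actual diff: `ContentType`, `Content`, and `ModelSchema` are hypothetical stand-ins for `PromptMessageContentType`, the prompt message content classes, and the model schema, and the body of `supports_prompt_content_type()` here is only an assumption about how such a check might behave.

```python
from dataclasses import dataclass, field
from enum import Enum


class ContentType(Enum):
    """Hypothetical, simplified stand-in for PromptMessageContentType."""
    TEXT = "text"
    IMAGE = "image"
    DOCUMENT = "document"


@dataclass(frozen=True)
class Content:
    """Hypothetical stand-in for a prompt message content item."""
    type: ContentType
    data: str


@dataclass
class ModelSchema:
    """Hypothetical stand-in for the model schema; `supported` is an assumed field."""
    supported: set[ContentType] = field(default_factory=set)

    def supports_prompt_content_type(self, content_type: ContentType) -> bool:
        return content_type in self.supported


def filter_contents(contents: list[Content], schema: ModelSchema) -> list[Content]:
    """Always keep TEXT; keep other types only if the model supports them."""
    return [
        c for c in contents
        if c.type == ContentType.TEXT or schema.supports_prompt_content_type(c.type)
    ]


history = [
    Content(ContentType.TEXT, "What is in this picture?"),
    Content(ContentType.IMAGE, "<image bytes>"),
    Content(ContentType.DOCUMENT, "<pdf bytes>"),
]
vision_only = ModelSchema(supported={ContentType.IMAGE})
text_only = ModelSchema()

vision_result = filter_contents(history, vision_only)  # text + image survive
text_result = filter_contents(history, text_only)      # only text survives
```

Checking each content type separately, instead of gating on a single vision flag, lets a model that accepts images but not documents keep the image context while the document content is dropped.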

Test plan

  1. Create a workflow with an LLM node using a vision model (e.g., GPT-4o) and upload an image in an earlier conversation turn. Image context should still appear in the history prompts.
  2. Switch the LLM node to a non-vision model (e.g., GPT-4) and repeat. Image content should be filtered out so that only text remains.
  3. Verify no regression for Agent Chat / Completion / Chat app modes.
  4. Run the added unit test test_unsupported_content_type_filtered_out.
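
For orientation, a unit test along the lines of step 4 might look like the sketch below. It is not the test added in this PR: `keep_supported` is a hypothetical helper that mirrors the filter, content items are modelled as plain dicts, and the model schema is a `MagicMock` whose `supports_prompt_content_type()` is assumed to return a boolean.

```python
from unittest.mock import MagicMock


def keep_supported(contents, model_schema):
    """Hypothetical helper mirroring the filter: text always passes."""
    return [
        c for c in contents
        if c["type"] == "text" or model_schema.supports_prompt_content_type(c["type"])
    ]


def test_unsupported_content_type_filtered_out():
    # Mocked schema that rejects every non-text content type.
    schema = MagicMock()
    schema.supports_prompt_content_type.return_value = False

    contents = [
        {"type": "text", "data": "describe the image"},
        {"type": "image", "data": "<image>"},
    ]
    result = keep_supported(contents, schema)

    # Only the text item survives.
    assert result == [{"type": "text", "data": "describe the image"}]
    # The schema is consulted only for the non-text item.
    schema.supports_prompt_content_type.assert_called_once_with("image")


test_unsupported_content_type_filtered_out()
```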

kuun993 added 2 commits June 2, 2026 13:57
TokenBufferMemory.get_history_prompt_messages() unconditionally queried
MessageFile records and built ImagePromptMessageContent for every history
message, even when the current model does not support vision. This caused
unnecessary DB queries and sent invalid multimodal content to non-vision
LLM nodes.

Now checks model schema for VISION feature before including file context.
When vision is not supported, file loading is skipped entirely and plain
text messages are used instead.
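
A rough sketch of the coarser gate this first commit describes, where file loading is skipped entirely unless the model advertises vision. The names are illustrative assumptions: `ModelFeature` is a simplified stand-in enum, and `build_history_message` and `load_files` are hypothetical, with `load_files` standing in for the MessageFile query.

```python
from enum import Enum


class ModelFeature(Enum):
    """Simplified stand-in for the model feature flags."""
    VISION = "vision"
    TOOL_CALL = "tool-call"


def build_history_message(text, load_files, features):
    """Build a history message; call load_files only when VISION is supported."""
    if ModelFeature.VISION not in features:
        # Non-vision model: skip the file lookup and use plain text.
        return {"text": text, "files": []}
    return {"text": text, "files": load_files()}


query_count = []


def fake_load_files():
    """Hypothetical loader that records each call, standing in for a DB query."""
    query_count.append(1)
    return ["<image>"]


plain = build_history_message("hi", fake_load_files, set())
queries_after_plain = len(query_count)  # loader was not called
rich = build_history_message("hi", fake_load_files, {ModelFeature.VISION})
queries_after_rich = len(query_count)   # loader was called once
```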
…capability

TokenBufferMemory._build_prompt_message_with_files() unconditionally
included all file content (images, documents, etc.) in history prompt
messages, even when the current model does not support those content
types (e.g. non-vision models receiving image content).

Now checks each content type against model schema capabilities and
filters out unsupported types, keeping only text content for models
that cannot handle multimodal inputs.
dosubot (bot) added the size:S label ("This PR changes 10-29 lines, ignoring generated files.") on Jun 4, 2026

Labels

size:S This PR changes 10-29 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

TokenBufferMemory includes unsupported multimodal content after switching models
