fix(core): filter unsupported file content types in memory per model capability #37043
Open
kuun993 wants to merge 2 commits into
TokenBufferMemory.get_history_prompt_messages() unconditionally queried MessageFile records and built ImagePromptMessageContent for every history message, even when the current model does not support vision. This caused unnecessary DB queries and sent invalid multimodal content to non-vision LLM nodes. Now checks model schema for VISION feature before including file context. When vision is not supported, file loading is skipped entirely and plain text messages are used instead.
…capability TokenBufferMemory._build_prompt_message_with_files() unconditionally included all file content (images, documents, etc.) in history prompt messages, even when the current model does not support those content types (e.g. non-vision models receiving image content). Now checks each content type against model schema capabilities and filters out unsupported types, keeping only text content for models that cannot handle multimodal inputs.
Fixes #37042
Summary
When a workflow LLM node switches to a model that doesn't support certain multimodal content types (e.g., switching from GPT-4o to GPT-4, which drops vision support, or to a model without document support), TokenBufferMemory still includes all file content (ImagePromptMessageContent, DocumentPromptMessageContent, etc.) in history prompt messages. This passes invalid multimodal content to the LLM invocation.
This PR adds content-type-level filtering in _build_prompt_message_with_files: after files are converted to prompt message contents, each non-text content type is checked against the model schema, and unsupported types are filtered out.
Changes
api/core/memory/token_buffer_memory.py: in _build_prompt_message_with_files, filter prompt_message_contents by model capability via model_schema.supports_prompt_content_type(), keeping only TEXT and content types the current model supports.
PromptMessageContentType is used for content type checking.
Test plan
test_unsupported_content_type_filtered_out.
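For reviewers who want the filtering rule in isolation, here is a minimal sketch of the behavior described above. It is not the code in this PR: ContentType, Content, ModelSchema, and filter_contents are hypothetical stand-ins written for illustration, and only the method name supports_prompt_content_type() is taken from the description. The real classes in api/core may differ in shape and signature.

```python
from dataclasses import dataclass
from enum import Enum


class ContentType(Enum):
    """Hypothetical stand-in for PromptMessageContentType."""
    TEXT = "text"
    IMAGE = "image"
    DOCUMENT = "document"


@dataclass(frozen=True)
class Content:
    """Hypothetical stand-in for a single prompt message content item."""
    type: ContentType
    data: str


@dataclass(frozen=True)
class ModelSchema:
    """Hypothetical stand-in for the model schema; only the capability check is modeled."""
    supported: frozenset = frozenset()

    def supports_prompt_content_type(self, content_type: ContentType) -> bool:
        return content_type in self.supported


def filter_contents(contents: list, schema: ModelSchema) -> list:
    """Keep TEXT unconditionally; keep other types only if the model supports them."""
    return [
        c
        for c in contents
        if c.type is ContentType.TEXT or schema.supports_prompt_content_type(c.type)
    ]


# One history message carrying text, an image, and a document.
history = [
    Content(ContentType.TEXT, "Describe the attachment."),
    Content(ContentType.IMAGE, "chart.png"),
    Content(ContentType.DOCUMENT, "spec.pdf"),
]

text_only_model = ModelSchema()
vision_model = ModelSchema(supported=frozenset({ContentType.IMAGE}))

text_only_result = filter_contents(history, text_only_model)
vision_result = filter_contents(history, vision_model)
```

With the text-only schema, just the TEXT item survives. With the vision schema, the image is kept and the document is still dropped, which is the per-type behavior the test plan is meant to cover.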