fix(core): filter unsupported file content types in memory per model capability by kuun993 · Pull Request #37043 · langgenius/dify

kuun993 · 2026-06-04T04:07:36Z

Fixes #37042

Summary

When a workflow LLM node switches to a model that doesn't support certain multimodal content types (e.g., from GPT-4o to GPT-4, which lacks vision, or to a model without document support), TokenBufferMemory still includes all file content (ImagePromptMessageContent, DocumentPromptMessageContent, etc.) in history prompt messages. As a result, unsupported multimodal content is passed to the LLM invocation.

This PR adds content-type-level filtering in _build_prompt_message_with_files: after converting files to prompt message contents, each non-text content type is checked against the model schema, and unsupported types are filtered out.

Changes

api/core/memory/token_buffer_memory.py:
- In _build_prompt_message_with_files, filter prompt_message_contents by model capability via model_schema.supports_prompt_content_type(), keeping only TEXT and content types the current model supports.
- Import PromptMessageContentType for content type checking.
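
A minimal sketch of the filtering rule described above, not the code from this PR. The enum, the dataclasses, and the helper `filter_supported_contents` are hypothetical stand-ins so the example runs without Dify installed; only the names `PromptMessageContentType` and `supports_prompt_content_type()` come from the description, and their real signatures may differ.

```python
from dataclasses import dataclass, field
from enum import Enum


class PromptMessageContentType(Enum):
    # Stand-in for Dify's enum; the members listed here are assumptions.
    TEXT = "text"
    IMAGE = "image"
    DOCUMENT = "document"


@dataclass(frozen=True)
class PromptMessageContent:
    # Stand-in for ImagePromptMessageContent, DocumentPromptMessageContent, etc.
    type: PromptMessageContentType
    data: str


@dataclass
class ModelSchema:
    # Hypothetical schema: the set of non-text content types the model accepts.
    supported_types: set = field(default_factory=set)

    def supports_prompt_content_type(self, content_type):
        return content_type in self.supported_types


def filter_supported_contents(contents, model_schema):
    """Keep TEXT plus any content type the current model supports."""
    return [
        c
        for c in contents
        if c.type == PromptMessageContentType.TEXT
        or model_schema.supports_prompt_content_type(c.type)
    ]


history = [
    PromptMessageContent(PromptMessageContentType.TEXT, "What is in this picture?"),
    PromptMessageContent(PromptMessageContentType.IMAGE, "https://example.com/cat.png"),
    PromptMessageContent(PromptMessageContentType.DOCUMENT, "report.pdf"),
]

vision_model = ModelSchema({PromptMessageContentType.IMAGE})
text_only_model = ModelSchema()

vision_kept = filter_supported_contents(history, vision_model)
text_only_kept = filter_supported_contents(history, text_only_model)

print([c.type.value for c in vision_kept])     # ['text', 'image']
print([c.type.value for c in text_only_kept])  # ['text']
```

Filtering per content type, rather than gating on a single vision flag, lets a model that accepts images but not documents keep the image context while dropping only the document parts.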

Test plan

- Create a workflow with an LLM node using a vision model (e.g., GPT-4o) and upload an image in a previous conversation turn. The image context should still appear in history prompts.
- Switch the LLM node to a non-vision model (e.g., GPT-4) and repeat. The image content should be filtered out so that only text remains.
- Verify there is no regression for Agent Chat / Completion / Chat app modes.
- Run the added unit test test_unsupported_content_type_filtered_out.
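
The body of the added unit test is not shown on this page, so the following is only a guess at its shape. `keep_supported` is a hypothetical stand-in for the filtering step, and content types are plain strings here for brevity. The sketch uses `unittest.mock.Mock` to check that non-text content is dropped and that text content never triggers a capability lookup.

```python
from types import SimpleNamespace
from unittest.mock import Mock


def keep_supported(contents, model_schema):
    # Hypothetical stand-in for the filter inside _build_prompt_message_with_files.
    return [
        c
        for c in contents
        if c.type == "text" or model_schema.supports_prompt_content_type(c.type)
    ]


def test_unsupported_content_type_filtered_out():
    # A schema that rejects every non-text content type.
    schema = Mock()
    schema.supports_prompt_content_type.return_value = False

    contents = [
        SimpleNamespace(type="text", data="hello"),
        SimpleNamespace(type="image", data="cat.png"),
    ]

    kept = keep_supported(contents, schema)

    # Only the text part survives.
    assert [c.type for c in kept] == ["text"]
    # The schema is consulted for the image only, never for text.
    schema.supports_prompt_content_type.assert_called_once_with("image")


test_unsupported_content_type_filtered_out()
```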

TokenBufferMemory.get_history_prompt_messages() unconditionally queried MessageFile records and built ImagePromptMessageContent for every history message, even when the current model does not support vision. This caused unnecessary DB queries and sent invalid multimodal content to non-vision LLM nodes. It now checks the model schema for the VISION feature before including file context. When vision is not supported, file loading is skipped entirely and plain text messages are used instead.

…capability

TokenBufferMemory._build_prompt_message_with_files() unconditionally included all file content (images, documents, etc.) in history prompt messages, even when the current model does not support those content types (e.g., non-vision models receiving image content). It now checks each content type against the model schema capabilities and filters out unsupported types, keeping only text content for models that cannot handle multimodal inputs.

kuun993 added 2 commits June 2, 2026 13:57

kuun993 requested review from QuantumGhost and laipz8200 as code owners June 4, 2026 04:07

dosubot Bot added the size:S label (this PR changes 10-29 lines, ignoring generated files) Jun 4, 2026

kuun993 wants to merge 2 commits into langgenius:main from kuun993:feat-file-context
