feat: enhance MCP tool response handling to support image responses #5185
base: main
Conversation
Hey @shivangag,
I noticed this PR doesn't have a linked GitHub issue. As per the contributing guidelines, all PRs need to be linked to an approved issue. Could you create an issue describing this enhancement and link it to this PR? This helps us keep things traceable and ensures features are properly discussed before implementation.
I'll mark this PR as "In progress" for now until the dev team has a chance to review the issue.
Thank you @daniel-lxs for the thorough review and excellent suggestions! I've implemented all the recommended enhancements to make MCP image handling more robust and secure.
Security & Performance Controls
Robustness Improvements
Code Quality Enhancements
The implementation now safely handles potentially malicious or buggy MCP servers while maintaining full backward compatibility with existing functionality. I believe this addresses all the security, performance, and reliability concerns raised in the review. Also created GitHub issue #5233 to properly document this feature enhancement.
Hey @shivangag, Thank you for taking a look at my previous suggestions. I took another look and left some new suggestions from my review. Let me know if you have any questions!
@daniel-lxs please review!
This PR seems to have conflicts, but I'm not sure how many.
Maybe we get merge resolver on this?
**Changes:**
- Updated `processToolContent` to return both text and images from tool results.
- Modified `executeToolAndProcessResult` to handle and pass images to the response.
- Adjusted `combineCommandSequences` to preserve images from MCP server responses.
- Updated UI components to display images alongside text responses.

**Testing:**
- Added tests to verify correct handling of tool results with text and images.
- Ensured that image-only responses are processed correctly.

**Files Modified:**
- `src/core/prompts/responses.ts`
- `src/core/tools/useMcpToolTool.ts`
- `src/shared/combineCommandSequences.ts`
- `webview-ui/src/components/chat/McpExecution.tsx`
- `webview-ui/src/components/chat/ChatRow.tsx`
- Test files for MCP tool functionality.
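To make the text-plus-images return shape concrete, here is a minimal sketch of how a tool result could be split into the two parts. It is not the code from this PR: the item shapes are simplified versions of MCP text and image content items, and `splitToolContent`, `ProcessedContent`, the data URL representation, and the blank-line join are all assumptions made for illustration.

```typescript
// Simplified item shapes modelled on MCP text and image content (assumed).
type ToolContentItem =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string };

// Hypothetical return shape: combined text plus a list of image data URLs.
interface ProcessedContent {
  text: string;
  images: string[];
}

// Hypothetical helper: walks the items once, collecting text and images separately.
function splitToolContent(items: ToolContentItem[]): ProcessedContent {
  const textParts: string[] = [];
  const images: string[] = [];
  for (const item of items) {
    if (item.type === "text") {
      textParts.push(item.text);
    } else {
      // Encoding the image as a data URL is an assumption of this sketch.
      images.push(`data:${item.mimeType};base64,${item.data}`);
    }
  }
  return { text: textParts.join("\n\n"), images };
}
```

An image-only result yields an empty `text` string and a non-empty `images` array, which is the case the second test bullet above describes.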
**Changes:**
- Improved validation for base64 image data in `processToolContent` to handle invalid and non-string data gracefully.
- Added error handling to log warnings for corrupted images without interrupting processing.
- Updated tests to verify correct behavior when encountering invalid base64 data and non-string inputs.

**Files Modified:**
- `src/core/tools/useMcpToolTool.ts`
- `src/core/tools/__tests__/useMcpToolTool.spec.ts`
- `webview-ui/src/components/common/Thumbnails.tsx`
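One way to handle invalid and non-string data gracefully is a type guard that rejects anything that is not a well-formed, padded base64 string, and a loop that warns and moves on instead of throwing. The sketch below assumes that approach; `BASE64_PATTERN`, `isValidBase64`, and `keepValidImages` are hypothetical names, and the exact rules the PR applies may differ.

```typescript
// Base64 alphabet followed by at most two '=' padding characters.
const BASE64_PATTERN = /^[A-Za-z0-9+/]*={0,2}$/;

// Hypothetical guard: true only for non-empty, padded, well-formed base64 strings.
function isValidBase64(data: unknown): data is string {
  if (typeof data !== "string" || data.length === 0) {
    return false;
  }
  // Padded base64 always has a length that is a multiple of four.
  if (data.length % 4 !== 0) {
    return false;
  }
  return BASE64_PATTERN.test(data);
}

// Keeps valid entries and logs a warning for each rejected one,
// so a single corrupted image does not interrupt processing.
function keepValidImages(candidates: unknown[]): string[] {
  const valid: string[] = [];
  for (const candidate of candidates) {
    if (isValidBase64(candidate)) {
      valid.push(candidate);
    } else {
      console.warn("Skipping image with invalid base64 data");
    }
  }
  return valid;
}
```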
**Changes:**
- Introduced `mcpMaxImagesPerResponse` and `mcpMaxImageSizeMB` settings to control the maximum number of images and their size in MCP tool responses.
- Updated `processToolContent` to enforce these limits, logging warnings when they are exceeded.
- Enhanced UI components to allow users to configure these settings.
- Added tests to verify correct behavior under various image limits and sizes.

**Files Modified:**
- `packages/types/src/global-settings.ts`
- `src/core/tools/useMcpToolTool.ts`
- `src/core/webview/ClineProvider.ts`
- `src/core/webview/webviewMessageHandler.ts`
- `src/shared/ExtensionMessage.ts`
- `src/shared/WebviewMessage.ts`
- `webview-ui/src/components/mcp/McpView.tsx`
- `webview-ui/src/context/ExtensionStateContext.tsx`
- `webview-ui/src/context/__tests__/ExtensionStateContext.spec.tsx`
- Test files for MCP tool functionality.
**Changes:**
- Introduced a constant `SUPPORTED_IMAGE_TYPES` to define valid image MIME types, improving code readability and maintainability.
- Updated `processToolContent` to utilize the new constant for image type validation.

**Files Modified:**
- `src/core/tools/useMcpToolTool.ts`
…ypes

**Changes:**
- Added tests to ensure that the MCP tool correctly ignores images with unsupported MIME types and malformed content.
- Implemented console warnings for unsupported MIME types and missing properties in image content.

**Files Modified:**
- `src/core/tools/__tests__/useMcpToolTool.spec.ts`
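The MIME type check and the missing-property check could be combined in a single guard along the following lines. This is a sketch under assumptions: the contents of `SUPPORTED_IMAGE_TYPES` are not shown in this thread, so the four types listed here are a guess, and `isSupportedImage` is a hypothetical helper name.

```typescript
// Assumed allow-list; the real contents of SUPPORTED_IMAGE_TYPES are not shown in this PR thread.
const SUPPORTED_IMAGE_TYPES: readonly string[] = ["image/png", "image/jpeg", "image/gif", "image/webp"];

interface ImageItem {
  type: "image";
  data: string;
  mimeType: string;
}

// Hypothetical guard: accepts only image items that carry string data and an allowed MIME type.
function isSupportedImage(item: unknown): item is ImageItem {
  if (typeof item !== "object" || item === null) {
    return false;
  }
  const candidate = item as { type?: unknown; data?: unknown; mimeType?: unknown };
  if (candidate.type !== "image") {
    return false;
  }
  const data = candidate.data;
  const mimeType = candidate.mimeType;
  if (typeof data !== "string" || typeof mimeType !== "string") {
    console.warn("Ignoring image content with missing data or mimeType");
    return false;
  }
  if (SUPPORTED_IMAGE_TYPES.indexOf(mimeType) === -1) {
    console.warn(`Ignoring image with unsupported MIME type: ${mimeType}`);
    return false;
  }
  return true;
}
```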
- Add input validation for max images per response (1-100 range)
- Add input validation for max image size in MB (1-100 range)
- Display error messages for invalid inputs
- Internationalize all image settings labels and descriptions
- Update translations across all supported locales
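A range check of this kind might look like the sketch below, which validates a raw input string against the 1-100 range and returns either the parsed value or an error message to display. The function name, the result type, and the message wording are assumptions; the PR's UI code and its translated strings are not reproduced here.

```typescript
const MIN_SETTING = 1;
const MAX_SETTING = 100;

// Hypothetical result type: either the parsed value or a message for the UI.
type ValidationResult = { ok: true; value: number } | { ok: false; error: string };

// Hypothetical validator shared by both settings (images per response, image size in MB).
function validateImageSetting(raw: string, label: string): ValidationResult {
  const value = Number(raw);
  // Number("") is 0 and Number("abc") is NaN, so blank and non-numeric input are rejected here.
  if (raw.trim() === "" || !isFinite(value)) {
    return { ok: false, error: `${label} must be a number` };
  }
  if (value < MIN_SETTING || value > MAX_SETTING) {
    return { ok: false, error: `${label} must be between ${MIN_SETTING} and ${MAX_SETTING}` };
  }
  return { ok: true, value };
}
```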
- Add early size check to prevent memory spikes from oversized images
- Implement bounds checking for MCP settings (images per response, max size)
- Add image count tooltips and improved accessibility in UI
- Update translations for better user experience across EN/ES/FR
- Refactor base64 validation with reusable regex constant
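An early size check can work directly on the encoded string, because every four base64 characters decode to three bytes (minus padding), so the decoded size is known without allocating a buffer. The sketch below assumes that technique; `estimateDecodedBytes` and `exceedsSizeLimit` are hypothetical names, and the PR may compute the size differently.

```typescript
const BYTES_PER_MB = 1024 * 1024;

// Estimates the decoded size from the encoded length alone:
// 4 base64 characters carry 3 bytes, and each '=' of padding removes one byte.
function estimateDecodedBytes(base64: string): number {
  let padding = 0;
  if (base64.slice(-2) === "==") {
    padding = 2;
  } else if (base64.slice(-1) === "=") {
    padding = 1;
  }
  return Math.floor((base64.length * 3) / 4) - padding;
}

// Hypothetical check run before any decoding or validation work.
function exceedsSizeLimit(base64: string, maxSizeMB: number): boolean {
  return estimateDecodedBytes(base64) > maxSizeMB * BYTES_PER_MB;
}
```

Running this check before the regex validation means an oversized payload is rejected after a length comparison only.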
- Extract image validation into separate validateAndProcessImage function
- Replace sequential forEach with parallel Promise.all processing
- Pre-filter and limit images before validation for better efficiency
- Improve performance for MCP tools returning multiple images

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
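The pre-filter, limit, then validate-concurrently flow described in this commit could be sketched as follows. The real `validateAndProcessImage` signature is not shown in this thread, so the version here is a stand-in with an assumed signature (one image item in, a data URL or `null` out), and `collectImages` is a hypothetical wrapper.

```typescript
interface ImageItem {
  type: "image";
  data: string;
  mimeType: string;
}
type ContentItem = ImageItem | { type: "text"; text: string };

// Stand-in for the PR's validateAndProcessImage (assumed signature):
// resolves to a data URL, or null when the image is rejected.
async function validateAndProcessImage(item: ImageItem): Promise<string | null> {
  const wellFormed =
    item.data.length > 0 &&
    item.data.length % 4 === 0 &&
    /^[A-Za-z0-9+/]*={0,2}$/.test(item.data);
  return wellFormed ? `data:${item.mimeType};base64,${item.data}` : null;
}

// Hypothetical wrapper showing the order of operations.
async function collectImages(content: ContentItem[], maxImages: number): Promise<string[]> {
  // Pre-filter to image items and cap the count before doing any validation work.
  const candidates = content
    .filter((item): item is ImageItem => item.type === "image")
    .slice(0, maxImages);
  // Start all validations together; Promise.all resolves in input order.
  const results = await Promise.all(candidates.map((item) => validateAndProcessImage(item)));
  // Drop rejected images.
  return results.filter((result): result is string => result !== null);
}
```

Because the cap is applied before validation, an invalid image among the first `maxImages` candidates is not replaced by a later valid one; whether the PR behaves the same way is not stated here.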
- Fix image limiting warning logic by checking original array length before slicing
- Add missing getState mock in test to support MCP settings access
- Update test assertion to match actual function signature with images parameter

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
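The warning fix is easy to get wrong: once the array has been sliced, its length can never exceed the limit, so a check made after slicing never fires. A minimal sketch of the corrected ordering, with a hypothetical `limitImages` helper and result shape:

```typescript
// Hypothetical result shape: the retained images and how many were discarded.
interface LimitResult<T> {
  kept: T[];
  dropped: number;
}

function limitImages<T>(images: T[], maxImages: number): LimitResult<T> {
  // Record the original count first; comparing after slicing would always pass.
  const originalCount = images.length;
  const kept = images.slice(0, maxImages);
  if (originalCount > maxImages) {
    console.warn(`Received ${originalCount} images, keeping the first ${maxImages}`);
  }
  return { kept, dropped: originalCount - kept.length };
}
```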
Add missing thumbnail and image count translations for 15 languages:
- thumbnails.failedToLoad and thumbnails.altText in common.json
- execution.imageCountTooltip in mcp.json

This fixes the check-translations CI failure by providing proper localized strings for the new image handling features.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
01bb82d to e1e8e5f
@daniel-lxs I have resolved the conflicts after rebasing from main.
We are currently trying to get base64-string images out of the webview. They are a big contributor to the grey screen issue we have seen in some situations. You can see an example of this here: #8225. I would think that this PR will also benefit from a change like that. My recommendation is that we wait for PR #8225 to be merged, and then you can probably recycle some of the functionality to apply it here as well.
Related GitHub Issue
Fixes: #5233
Description
This PR enhances MCP (Model Context Protocol) tool response handling to support images alongside text content. The implementation allows MCP servers to return images in their tool responses, which are then properly displayed in the Roo Code UI.
Key implementation details:
- `processToolContent()` now extracts and returns both text and images from MCP tool results
- `executeToolAndProcessResult()` handles image data and passes it through the response pipeline
- `combineCommandSequences()` preserves images from MCP server responses during command sequence processing
- UI components (`McpExecution.tsx` and `ChatRow.tsx`) render images alongside text responses

Design choices:
Test Procedure
Unit Tests Added:
- `useMcpToolTool.spec.ts` to verify:
- `combineCommandSequences.spec.ts` to ensure images are preserved during command sequence processing

Manual Testing:
Reviewers can test by:
- `npm test -- useMcpToolTool.spec.ts`

Pre-Submission Checklist
Screenshots / Videos
Before: MCP tool responses displayed only text content; images were not supported.

After: MCP tool responses now display both text and images seamlessly in the chat interface.


`codicon-file-media` indicates whether images are present in the MCP response, and how many.
Images are presented using the `Thumbnail` component inside the `mcp-server-response` block.
Documentation Updates
Additional Notes
Improvements
Old chats loaded from memory show the tool call arguments inside the response.

Important
Enhances MCP tool response handling to support image responses, updating processing logic, UI components, and adding new settings for image handling.
- `processToolContent()` in `useMcpToolTool.ts` now extracts and returns both text and images from MCP tool results.
- `executeToolAndProcessResult()` in `useMcpToolTool.ts` handles image data and passes it through the response pipeline.
- `combineCommandSequences()` in `combineCommandSequences.ts` preserves images from MCP server responses.
- `McpExecution.tsx` and `ChatRow.tsx` updated to render images alongside text responses.
- Adds `mcpMaxImagesPerResponse` and `mcpMaxImageSizeMB` to `global-settings.ts` for image handling configuration.
- `useMcpToolTool.spec.ts` covers image extraction and handling.
- `combineCommandSequences.spec.ts` tests image preservation in command sequences.
- `webviewMessageHandler.ts` handles the new image settings.
- `Thumbnails.tsx` for UI display.

This description was created for a416090. It will automatically update as commits are pushed.