@shivangag shivangag commented Jun 27, 2025

Related GitHub Issue

Fixes: #5233

Description

This PR enhances MCP (Model Context Protocol) tool response handling to support images alongside text content. The implementation allows MCP servers to return images in their tool responses, which are then properly displayed in the Roo Code UI.

Key implementation details:

Design choices:

  • Images are handled as part of the tool result structure, maintaining backward compatibility with text-only responses
  • The implementation supports text-only, image-only, and combined text+image responses from MCP tools
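
As an illustration of the design choice above, here is a minimal sketch of splitting an MCP tool result into text and images. It is not the PR's code: `splitToolContent` and the two interfaces are hypothetical names, and the only assumption taken from MCP itself is that content items carry `type: "text"` with a `text` field or `type: "image"` with base64 `data` and a `mimeType`.

```typescript
// Loose shape of one MCP tool-result content item (illustrative, not the PR's type).
interface McpContentItem {
	type: string
	text?: string
	data?: string // base64 payload for image items
	mimeType?: string
}

// Hypothetical return shape: text and images travel together in one result.
interface ProcessedToolContent {
	text: string
	images: string[] // data URLs
}

function splitToolContent(content: McpContentItem[]): ProcessedToolContent {
	const textParts: string[] = []
	const images: string[] = []

	for (const item of content) {
		if (item.type === "text" && typeof item.text === "string") {
			textParts.push(item.text)
		} else if (item.type === "image" && typeof item.data === "string" && typeof item.mimeType === "string") {
			images.push(`data:${item.mimeType};base64,${item.data}`)
		}
		// Any other item type is ignored in this sketch.
	}

	// Text-only results yield an empty images array, so existing callers keep working.
	return { text: textParts.join("\n\n"), images }
}
```

A text-only response produces `images: []` and an image-only response produces `text: ""`, which is one way to keep the text path backward compatible.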

Test Procedure

Unit Tests Added:

  • Added comprehensive tests in useMcpToolTool.spec.ts to verify:
    • Correct extraction of both text and images from tool results
    • Proper handling of image-only responses
    • Backward compatibility with text-only responses
  • Updated tests in combineCommandSequences.spec.ts to ensure images are preserved during command sequence processing

Manual Testing:

  1. Connect to an MCP server that supports image responses (e.g., browser MCP server with screenshot capability)
  2. Execute a tool that returns images (e.g., take a screenshot)
  3. Verify that both text and images are displayed correctly in the chat interface
  4. Test with various MCP tools to ensure backward compatibility with text-only responses

Reviewers can test by:

  1. Running the test suite: npm test -- useMcpToolTool.spec.ts
  2. Setting up an MCP server with image capabilities and testing the integration manually

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes (if applicable).
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

Before: MCP tool responses displayed only text content; images were not supported.
Screenshot 2025-06-27 at 2 32 26 PM

After: MCP tool responses now display both text and images seamlessly in the chat interface.
The codicon-file-media icon indicates whether images are present in the MCP response and how many there are.
Screenshot 2025-06-27 at 2 18 30 PM
Images are presented using the Thumbnail component inside the mcp-server-response block.
Screenshot 2025-06-27 at 2 19 29 PM

Documentation Updates

  • No documentation updates are required.
  • Yes, documentation updates are required. The MCP integration documentation should be updated to mention image support capabilities for tool responses.

Additional Notes

  • This enhancement maintains full backward compatibility with existing MCP servers that only return text
  • The implementation follows the existing patterns for handling MCP tool responses
  • Images are processed and displayed using the same infrastructure as other image content in Roo Code

Improvements

Old chats loaded from memory show the tool call argument inside the response.
Screenshot 2025-06-27 at 2 19 44 PM


Important

Enhances MCP tool response handling to support image responses, updating processing logic, UI components, and adding new settings for image handling.

  • Behavior:
    • processToolContent() in useMcpToolTool.ts now extracts and returns both text and images from MCP tool results.
    • executeToolAndProcessResult() in useMcpToolTool.ts handles image data and passes it through the response pipeline.
    • combineCommandSequences() in combineCommandSequences.ts preserves images from MCP server responses.
    • UI components McpExecution.tsx and ChatRow.tsx updated to render images alongside text responses.
  • Settings:
    • Adds mcpMaxImagesPerResponse and mcpMaxImageSizeMB to global-settings.ts for image handling configuration.
  • Tests:
    • Adds tests in useMcpToolTool.spec.ts for image extraction and handling.
    • Updates combineCommandSequences.spec.ts to test image preservation in command sequences.
  • Misc:
    • Updates webviewMessageHandler.ts to handle new image settings.
    • Adds image handling logic to Thumbnails.tsx for UI display.

This description was created by Ellipsis for a416090.

@shivangag shivangag requested review from cte, jr and mrubens as code owners June 27, 2025 09:10
@dosubot dosubot bot added the size:L, documentation, and enhancement labels Jun 27, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage label Jun 27, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Jun 27, 2025
@hannesrudolph hannesrudolph added the PR - Needs Preliminary Review label and removed the Issue/PR - Triage label Jun 27, 2025
Member

@daniel-lxs daniel-lxs left a comment

Hey @shivangag,

I noticed this PR doesn't have a linked GitHub issue. As per the contributing guidelines, all PRs need to be linked to an approved issue. Could you create an issue describing this enhancement and link it to this PR? This helps us keep things traceable and ensures features are properly discussed before implementation.

I'll mark this PR as "In progress" for now until the dev team has a chance to review the issue.

@shivangag
Author

shivangag commented Jun 29, 2025

Thank you @daniel-lxs for the thorough review and excellent suggestions! I've implemented all the recommended enhancements to make MCP image handling more robust and secure.

Security & Performance Controls

  • Image size limits: Added a configurable mcpMaxImageSizeMB setting with a 10 MB default
  • Count limits: Added a configurable mcpMaxImagesPerResponse setting with a default of 20 images
  • User controls: Both settings are configurable through the MCP settings panel
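
For illustration only, the two limits above could be enforced along these lines. The defaults (20 images, 10 MB) come from this comment; `ImageLimits`, `approxDecodedBytes`, and `applyImageLimits` are hypothetical names, and estimating the decoded size from the base64 length is an assumption about how the size check might work, not a description of the PR's code.

```typescript
interface ImageLimits {
	maxImages: number // mirrors mcpMaxImagesPerResponse
	maxImageSizeMB: number // mirrors mcpMaxImageSizeMB
}

const DEFAULT_LIMITS: ImageLimits = { maxImages: 20, maxImageSizeMB: 10 }

// Estimate the decoded byte size from the base64 length without decoding,
// so an oversized payload is rejected before it is copied again in memory.
function approxDecodedBytes(base64: string): number {
	const padding = base64.slice(-2) === "==" ? 2 : base64.slice(-1) === "=" ? 1 : 0
	return Math.floor((base64.length * 3) / 4) - padding
}

function applyImageLimits(
	images: string[],
	limits: ImageLimits = DEFAULT_LIMITS,
): { kept: string[]; warnings: string[] } {
	const warnings: string[] = []

	// Compare against the original length, before any slicing.
	if (images.length > limits.maxImages) {
		warnings.push(`Received ${images.length} images; keeping the first ${limits.maxImages}`)
	}

	const maxBytes = limits.maxImageSizeMB * 1024 * 1024
	const kept: string[] = []

	images.slice(0, limits.maxImages).forEach((data, index) => {
		if (approxDecodedBytes(data) > maxBytes) {
			warnings.push(`Image ${index} exceeds ${limits.maxImageSizeMB} MB and was skipped`)
			return
		}
		kept.push(data)
	})

	return { kept, warnings }
}
```

In this sketch the count limit is applied first, so an oversized image among the first `maxImages` is dropped rather than replaced by a later one. Whether the PR makes the same trade-off is not stated in the thread.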

Robustness Improvements

  • Error handling: Wrapped image processing in try-catch blocks to prevent crashes from corrupted data
  • Validation: Added base64 regex validation and type checking for data integrity
  • Graceful degradation: Invalid images are skipped with detailed warning logs instead of failing

Code Quality Enhancements

  • Maintainability: Extracted SUPPORTED_IMAGE_TYPES as a module-level constant
  • Test coverage: Added comprehensive test suite covering all suggested edge cases including corrupted data, large images, unsupported MIME types, and malformed content
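
A rough sketch of the validation and graceful-degradation behavior described above follows. The constant name `SUPPORTED_IMAGE_TYPES` is from this comment, but the MIME types listed in it are an assumption, as are the regex, the `RawImage` shape, and the `toDataUrls` helper.

```typescript
// Assumed allow-list; the thread does not say which MIME types the PR accepts.
const SUPPORTED_IMAGE_TYPES: readonly string[] = ["image/png", "image/jpeg", "image/gif", "image/webp"]

// Standard base64 alphabet with up to two trailing padding characters.
const BASE64_PATTERN = /^[A-Za-z0-9+/]*={0,2}$/

function isValidBase64(value: unknown): value is string {
	return typeof value === "string" && value.length > 0 && value.length % 4 === 0 && BASE64_PATTERN.test(value)
}

// Fields are typed as unknown because a buggy server may send anything.
interface RawImage {
	mimeType?: unknown
	data?: unknown
}

function toDataUrls(items: RawImage[], warn: (msg: string) => void = console.warn): string[] {
	const urls: string[] = []

	for (const item of items) {
		try {
			const mimeType = item.mimeType
			const data = item.data

			if (typeof mimeType !== "string" || SUPPORTED_IMAGE_TYPES.indexOf(mimeType) === -1) {
				warn(`Skipping image with unsupported MIME type: ${String(mimeType)}`)
				continue
			}
			if (!isValidBase64(data)) {
				warn("Skipping image with invalid base64 data")
				continue
			}
			urls.push(`data:${mimeType};base64,${data}`)
		} catch (error) {
			// Defensive: one malformed item should not abort the whole response.
			warn(`Skipping image after unexpected error: ${String(error)}`)
		}
	}

	return urls
}
```

Invalid items produce a warning and are skipped, so the remaining images and any text still reach the UI.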

The implementation now safely handles potentially malicious or buggy MCP servers while maintaining full backward compatibility with existing functionality. I believe this addresses all the security, performance, and reliability concerns raised in the review.

Also created GitHub issue #5233 to properly document this feature enhancement.

@shivangag shivangag requested a review from daniel-lxs July 4, 2025 23:21
@shivangag shivangag marked this pull request as ready for review July 7, 2025 11:37
@dosubot dosubot bot added the size:XXL label and removed the size:L label Jul 7, 2025
@daniel-lxs daniel-lxs moved this from PR [Draft / In Progress] to PR [Needs Prelim Review] in Roo Code Roadmap Jul 7, 2025
Member

@daniel-lxs daniel-lxs left a comment

Hey @shivangag, thank you for taking a look at my previous suggestions. I took another look and left some new suggestions in my review. Let me know if you have any questions!

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Jul 8, 2025
@shivangag
Author

@daniel-lxs please review!

@shivangag shivangag requested a review from daniel-lxs September 9, 2025 10:42
@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Needs Prelim Review] in Roo Code Roadmap Sep 10, 2025
@daniel-lxs
Member

This PR seems to have conflicts, but I'm not sure how many

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Sep 11, 2025
@hannesrudolph
Collaborator

> This PR seems to have conflicts, but I'm not sure how many

Maybe we get merge resolver on this?

shivangag and others added 10 commits September 12, 2025 15:11
**Changes:**
- Updated `processToolContent` to return both text and images from tool results.
- Modified `executeToolAndProcessResult` to handle and pass images to the response.
- Adjusted `combineCommandSequences` to preserve images from MCP server responses.
- Updated UI components to display images alongside text responses.

**Testing:**
- Added tests to verify correct handling of tool results with text and images.
- Ensured that image-only responses are processed correctly.

**Files Modified:**
- `src/core/prompts/responses.ts`
- `src/core/tools/useMcpToolTool.ts`
- `src/shared/combineCommandSequences.ts`
- `webview-ui/src/components/chat/McpExecution.tsx`
- `webview-ui/src/components/chat/ChatRow.tsx`
- Test files for MCP tool functionality.
**Changes:**
- Improved validation for base64 image data in `processToolContent` to handle invalid and non-string data gracefully.
- Added error handling to log warnings for corrupted images without interrupting processing.
- Updated tests to verify correct behavior when encountering invalid base64 data and non-string inputs.

**Files Modified:**
- `src/core/tools/useMcpToolTool.ts`
- `src/core/tools/__tests__/useMcpToolTool.spec.ts`
- `webview-ui/src/components/common/Thumbnails.tsx`
**Changes:**
- Introduced `mcpMaxImagesPerResponse` and `mcpMaxImageSizeMB` settings to control the maximum number of images and their size in MCP tool responses.
- Updated `processToolContent` to enforce these limits, logging warnings when they are exceeded.
- Enhanced UI components to allow users to configure these settings.
- Added tests to verify correct behavior under various image limits and sizes.

**Files Modified:**
- `packages/types/src/global-settings.ts`
- `src/core/tools/useMcpToolTool.ts`
- `src/core/webview/ClineProvider.ts`
- `src/core/webview/webviewMessageHandler.ts`
- `src/shared/ExtensionMessage.ts`
- `src/shared/WebviewMessage.ts`
- `webview-ui/src/components/mcp/McpView.tsx`
- `webview-ui/src/context/ExtensionStateContext.tsx`
- `webview-ui/src/context/__tests__/ExtensionStateContext.spec.tsx`
- Test files for MCP tool functionality.
**Changes:**
- Introduced a constant `SUPPORTED_IMAGE_TYPES` to define valid image MIME types, improving code readability and maintainability.
- Updated `processToolContent` to utilize the new constant for image type validation.

**Files Modified:**
- `src/core/tools/useMcpToolTool.ts`
…ypes

**Changes:**
- Added tests to ensure that the MCP tool correctly ignores images with unsupported MIME types and malformed content.
- Implemented console warnings for unsupported MIME types and missing properties in image content.

**Files Modified:**
- `src/core/tools/__tests__/useMcpToolTool.spec.ts`
- Add input validation for max images per response (1-100 range)
- Add input validation for max image size in MB (1-100 range)
- Display error messages for invalid inputs
- Internationalize all image settings labels and descriptions
- Update translations across all supported locales
- Add early size check to prevent memory spikes from oversized images
- Implement bounds checking for MCP settings (images per response, max size)
- Add image count tooltips and improved accessibility in UI
- Update translations for better user experience across EN/ES/FR
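
The bounds checking mentioned in the commits above (a 1-100 range for both settings) might look roughly like the sketch below. `clampSetting` is a hypothetical helper; whether the PR clamps out-of-range values or rejects them, and whether it rounds to integers, is not stated, so both behaviors here are assumptions.

```typescript
const MIN_SETTING = 1
const MAX_SETTING = 100

// Coerce a user-supplied value into the 1-100 range, falling back to the
// given default when the input is missing or not a number.
function clampSetting(value: unknown, fallback: number): number {
	if (value === null || value === undefined || value === "") {
		return fallback
	}
	const parsed = Number(value)
	if (!isFinite(parsed)) {
		return fallback
	}
	// Rounding to an integer is an assumption made for this sketch.
	return Math.min(MAX_SETTING, Math.max(MIN_SETTING, Math.round(parsed)))
}
```

For example, `clampSetting(500, 20)` yields 100 and `clampSetting("abc", 20)` falls back to 20.
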
- Refactor base64 validation with reusable regex constant
- Extract image validation into separate validateAndProcessImage function
- Replace sequential forEach with parallel Promise.all processing
- Pre-filter and limit images before validation for better efficiency
- Improve performance for MCP tools returning multiple images

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
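
As a sketch of the refactor described in the commit above, the snippet below limits the list first and then validates the remaining images in parallel with `Promise.all`. The function name `validateAndProcessImage` appears in the commit message, but its real signature is not shown in this thread; the body, the `ImageItem` shape, and `processImages` are assumptions made for illustration.

```typescript
interface ImageItem {
	mimeType: string
	data: string // base64 payload
}

const BASE64_REGEX = /^[A-Za-z0-9+/]*={0,2}$/

// Stand-in for the PR's validateAndProcessImage: resolves to a data URL,
// or to null when the payload is not plausible base64.
async function validateAndProcessImage(item: ImageItem): Promise<string | null> {
	if (item.data.length === 0 || !BASE64_REGEX.test(item.data)) {
		return null
	}
	return `data:${item.mimeType};base64,${item.data}`
}

async function processImages(items: ImageItem[], maxImages: number): Promise<string[]> {
	// Warn based on the original length, before slicing.
	if (items.length > maxImages) {
		console.warn(`Received ${items.length} images; only the first ${maxImages} are processed`)
	}

	// Limit first so that discarded images are never validated.
	const limited = items.slice(0, maxImages)

	// Promise.all keeps results in input order even when validations finish out of order.
	const results = await Promise.all(limited.map((item) => validateAndProcessImage(item)))

	return results.filter((url): url is string => url !== null)
}
```

Because `Promise.all` preserves input order, the images reach the UI in the order the server returned them.
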
- Fix image limiting warning logic by checking original array length before slicing
- Add missing getState mock in test to support MCP settings access
- Update test assertion to match actual function signature with images parameter

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Add missing thumbnail and image count translations for 15 languages:
- thumbnails.failedToLoad and thumbnails.altText in common.json
- execution.imageCountTooltip in mcp.json

This fixes the check-translations CI failure by providing proper
localized strings for the new image handling features.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@shivangag shivangag force-pushed the feat/support-mcp-image branch from 01bb82d to e1e8e5f Compare September 12, 2025 09:49
@shivangag
Author

shivangag commented Sep 12, 2025

> This PR seems to have conflicts, but I'm not sure how many

@daniel-lxs I have resolved the conflicts after rebasing from main.

@hannesrudolph hannesrudolph moved this from PR [Changes Requested] to PR [Needs Prelim Review] in Roo Code Roadmap Sep 22, 2025
@daniel-lxs
Member

We are currently trying to get base64-string images out of the webview. They are a big contributor to the grey screen issue we have seen in some situations. You can see an example of this here: #8225.

I would think that this PR will also benefit from a change like that. My recommendation is that we wait for PR #8225 to be merged, and then you can probably recycle some of the functionality to apply it here as well.

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Sep 23, 2025

Labels

documentation, enhancement, PR - Changes Requested, size:XXL

Projects

Status: PR [Changes Requested]

Development

Successfully merging this pull request may close these issues.

Feature: Enhance MCP Image Handling with Image Support, Robustness, and Security Controls

4 participants