Description
What specific problem does this solve?
Currently, Roo Code does not support video uploads, which limits users' ability to provide rich context for UI-related issues. Developers working on graphical user interfaces (GUIs) often encounter problems that are difficult to describe with text alone.
Who is affected: All Roo Code users, especially those developing applications with a GUI.
When this happens: When a developer needs to troubleshoot a complex UI bug or explain a visual interaction.
Current behavior: Users can only upload images.
Expected behavior: Users should be able to upload video files (e.g., screen recordings) for analysis.
Impact: This limitation causes frustration and wastes significant time, as developers struggle to articulate visual problems in writing.
Contributing & Technical Analysis
- I'm interested in implementing this feature
- I understand this needs approval before implementation begins
Comprehensive Technical Analysis
Implementation Target
The goal is to enable video uploads for multimodal analysis in Roo Code, specifically for models that support it (e.g., Gemini 2.5 Pro). This will allow users to provide video context for UI troubleshooting and other development tasks.
Affected Components
- Primary File: webview-ui/src/components/chat/ChatView.tsx
Current Implementation Analysis
The codebase has foundational support for video handling. Recent commits have prepared the backend (gemini-format.ts) and UI components (ChatTextArea.tsx, MediaThumbnails.tsx) to process and display video files. The main blocker is the acceptedFileTypes logic in ChatView.tsx, which currently only allows image formats.
Proposed Implementation
Step 1: Update acceptedFileTypes
- File: webview-ui/src/components/chat/ChatView.tsx
- Changes: Modify the acceptedFileTypes useMemo hook to include video MIME types (e.g., mp4, mov) when a video-capable model is selected.
- Rationale: This is the primary change required to enable the feature.
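A minimal sketch of how Step 1 might look, written as a pure function so it can be tested outside React. The `ModelCapabilities` shape, the `supportsVideo` flag, and both helper names are hypothetical. This issue does not show how ChatView.tsx currently builds the list or whether it holds extensions or MIME types, so extensions are assumed here.

```typescript
// Hypothetical capability flags; the real model-info shape in Roo Code may differ.
interface ModelCapabilities {
	supportsImages: boolean
	supportsVideo: boolean
}

// Assumed type lists for illustration only.
const IMAGE_TYPES = ["png", "jpeg", "webp"] as const
const VIDEO_TYPES = ["mp4", "mov"] as const

// Returns the file types the chat input should accept for the given model.
function getAcceptedFileTypes(model: ModelCapabilities): string[] {
	const types: string[] = []
	if (model.supportsImages) types.push(...IMAGE_TYPES)
	if (model.supportsVideo) types.push(...VIDEO_TYPES)
	return types
}

// Checks a file name against the accepted list by extension (case-insensitive).
function isAcceptedFile(fileName: string, accepted: string[]): boolean {
	const dot = fileName.lastIndexOf(".")
	if (dot < 0) return false
	let ext = fileName.slice(dot + 1).toLowerCase()
	if (ext === "jpg") ext = "jpeg" // treat .jpg as an alias of jpeg
	return accepted.includes(ext)
}
```

Inside the component, the same function could be wrapped in the existing hook, for example `useMemo(() => getAcceptedFileTypes(model), [model])`, so the list is recomputed only when the selected model changes.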
Testing Requirements
- Unit Tests: No new unit tests are required as the change is a configuration update.
- Integration Tests:
- Verify that video files can be attached when a supported model is selected.
- Confirm that video files are rejected for unsupported models.
- Manual Tests:
- Drag-and-drop and paste a video file to ensure it's accepted.
- Send a prompt with a video to confirm the API call is successful and the response is relevant.
Performance Impact
- Expected performance change: Neutral for the UI, but processing large videos into base64 may be resource-intensive.
- Benchmarking needed: No.
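To put a number on the base64 cost mentioned above: base64 emits 4 characters for every 3 input bytes, so the encoded payload is about one third larger than the file. The sketch below computes the encoded length and checks it against a size cap. `MAX_ENCODED_BYTES` is a hypothetical value chosen for illustration, not a limit taken from Roo Code or from this issue.

```typescript
// Base64 encodes every 3 input bytes as 4 output characters, padding the last group.
function base64Length(byteLength: number): number {
	return 4 * Math.ceil(byteLength / 3)
}

// Hypothetical cap for illustration only; check the provider's documented limits.
const MAX_ENCODED_BYTES = 20 * 1024 * 1024

// Returns true when the encoded form of a file of `byteLength` bytes fits under `limit`.
function fitsInlineLimit(byteLength: number, limit: number = MAX_ENCODED_BYTES): boolean {
	return base64Length(byteLength) <= limit
}
```

For example, a 30,000,000-byte screen recording becomes 40,000,000 base64 characters, which would exceed the illustrative cap above. A check like this could reject oversized files before they are read into memory.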
Security Considerations
- None.
Migration Strategy
- Not applicable.
Rollback Plan
- Revert the change in ChatView.tsx.
Dependencies and Breaking Changes
- None.
Implementation Complexity
- Estimated effort: Small
- Risk level: Low
Acceptance Criteria
Given a user has selected a Gemini 2.5 Pro model,
When they drag and drop or paste a supported video file (e.g., MP4, MOV) into the chat text area,
Then the video file should be accepted and a video thumbnail should appear in the attachments area.
Given a user has attached a video file,
When they send a prompt asking for analysis of the video,
Then the message should be sent to the Gemini API with the video data correctly formatted.
Given the Gemini API processes the video successfully,
When the user receives a response,
Then the response should contain an accurate analysis of the video's content.
Given a user has selected a model that does not support video,
When they attempt to attach a video file,
Then the file should be rejected, and no thumbnail should appear.
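For the second scenario, "correctly formatted" could mean something like the sketch below. It assumes the attachment reaches the backend as a base64 data URL and that the request uses an inline-data part carrying a MIME type and a base64 payload. The function name and the `InlineDataPart` interface are hypothetical, and the actual conversion in gemini-format.ts is not shown in this issue and may differ.

```typescript
// Assumed shape of an inline media part: a MIME type plus a base64 payload.
interface InlineDataPart {
	inlineData: { mimeType: string; data: string }
}

// Splits a base64 data URL ("data:video/mp4;base64,AAAA...") into MIME type and payload.
// Returns undefined for anything that is not a base64 video data URL.
function videoDataUrlToPart(dataUrl: string): InlineDataPart | undefined {
	const comma = dataUrl.indexOf(",")
	if (comma < 0) return undefined
	// Only the short header is matched, so large payloads never pass through the regex.
	const header = dataUrl.slice(0, comma) // e.g. "data:video/mp4;base64"
	const match = /^data:(video\/[a-z0-9.+-]+);base64$/i.exec(header)
	if (!match) return undefined
	return { inlineData: { mimeType: match[1].toLowerCase(), data: dataUrl.slice(comma + 1) } }
}
```

Returning undefined for non-video input keeps the rejection path explicit, which lines up with the last scenario: a file that cannot be converted should never reach the request.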