Add screenshot as context option for SpeakMCP input #217

## Feature Request: Screenshot as Context Option

### Description
Let users attach a screenshot as context when sending input to SpeakMCP. This should include:

1. **Input UI Enhancement**: Add a checkbox in the input UI to enable screenshot capture
2. **Agent Settings**: Add settings for agents to configure screenshot behavior
3. **Multimodal Support**: Research and implement the request format needed to send screenshots to multimodal models

### Technical Requirements

#### UI Components
- [ ] Add checkbox in input UI for screenshot option
- [ ] Integrate with system screenshot capture
- [ ] Provide visual feedback when screenshot is captured

#### Agent Settings
- [ ] Add screenshot configuration options in agent settings
- [ ] Allow agents to enable/disable screenshot context
- [ ] Configure screenshot quality/format preferences
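
One possible shape for these settings is sketched below. Every name here (`ScreenshotSettings`, `resolveScreenshotSettings`, `shouldAttachScreenshot`) is hypothetical and not part of SpeakMCP today; the sketch only illustrates how a per-agent setting could combine with the input checkbox.

```typescript
// Hypothetical settings shape; none of these names exist in SpeakMCP today.
type ScreenshotFormat = "png" | "jpeg";

interface ScreenshotSettings {
  enabled: boolean; // whether the agent accepts screenshot context at all
  format: ScreenshotFormat;
  jpegQuality: number; // 1-100, only meaningful when format is "jpeg"
}

const DEFAULT_SCREENSHOT_SETTINGS: ScreenshotSettings = {
  enabled: false, // assumed default: opt-in
  format: "png",
  jpegQuality: 80,
};

// Merge stored overrides with defaults and clamp quality into 1-100.
function resolveScreenshotSettings(
  overrides: Partial<ScreenshotSettings> = {},
): ScreenshotSettings {
  const merged = { ...DEFAULT_SCREENSHOT_SETTINGS, ...overrides };
  const quality = Number.isFinite(merged.jpegQuality)
    ? merged.jpegQuality
    : DEFAULT_SCREENSHOT_SETTINGS.jpegQuality;
  return {
    ...merged,
    jpegQuality: Math.min(100, Math.max(1, Math.round(quality))),
  };
}

// A screenshot is attached only if the agent allows it AND the user
// ticked the checkbox for this particular input.
function shouldAttachScreenshot(
  settings: ScreenshotSettings,
  checkboxChecked: boolean,
): boolean {
  return settings.enabled && checkboxChecked;
}
```

Requiring both the agent setting and the checkbox keeps capture opt-in at two levels, which fits the privacy consideration listed further down.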

#### Multimodal Model Integration
- [ ] Research the standard request format for sending images to multimodal models through an OpenAI-compatible base URL
- [ ] Implement proper image encoding/formatting
- [ ] Ensure compatibility with various multimodal models
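
As a starting point for that research: the OpenAI Chat Completions API accepts a user message whose `content` is an array of parts, with images passed as an `image_url` part that may hold a base64 data URL. The sketch below builds such a message. The `buildUserMessage` helper and its parameter shape are hypothetical, and whether a given OpenAI-compatible server accepts image parts still needs to be verified per provider.

```typescript
type ImageMimeType = "image/png" | "image/jpeg";

// Content-part shapes used by the OpenAI Chat Completions API for vision input.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type UserMessage = { role: "user"; content: Array<TextPart | ImagePart> };

// Hypothetical helper: builds a user message, optionally with a screenshot.
// `base64` is the already-encoded image, e.g. from Buffer.toString("base64")
// in a Node or Electron main process.
function buildUserMessage(
  prompt: string,
  screenshot?: { base64: string; mimeType: ImageMimeType },
): UserMessage {
  const content: Array<TextPart | ImagePart> = [{ type: "text", text: prompt }];
  if (screenshot) {
    content.push({
      type: "image_url",
      image_url: {
        url: `data:${screenshot.mimeType};base64,${screenshot.base64}`,
      },
    });
  }
  return { role: "user", content };
}
```

When no screenshot is supplied the message carries a single text part, so the same code path can serve both checkbox states.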

### Research Questions

1. **Standard Formats**: What is the standard format for sending image data to multimodal models over OpenAI-compatible APIs?
2. **Encoding Methods**: Should we use base64 encoding or direct binary transmission?
3. **Size Limits**: What are the typical size limits for image data in API requests?
4. **Model Compatibility**: How do different multimodal models (GPT-4V, Claude, Llama) handle image input?
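
On the compatibility question, one known difference is that Anthropic's native Messages API does not use `image_url` parts; it expects an `image` block with a base64 `source`. Because JSON request bodies cannot carry raw binary, both formats end up relying on base64 text. The sketch below converts a data URL into the Anthropic block shape. The function name `dataUrlToAnthropicImage` is hypothetical, and how Llama-based servers behave depends on the serving stack, so that part remains open.

```typescript
// Image block shape used by Anthropic's Messages API.
type AnthropicImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
};

// Hypothetical adapter: splits a base64 data URL into media type and payload.
function dataUrlToAnthropicImage(dataUrl: string): AnthropicImageBlock {
  const match = /^data:(image\/[a-z0-9.+-]+);base64,(.+)$/i.exec(dataUrl);
  const mediaType = match?.[1];
  const data = match?.[2];
  if (mediaType === undefined || data === undefined) {
    throw new Error("Expected a base64 image data URL");
  }
  return {
    type: "image",
    source: { type: "base64", media_type: mediaType, data },
  };
}
```

Keeping one internal representation (for example the data URL) and adapting it per provider at the request boundary would limit how much provider-specific code the input UI needs to know about.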

### Implementation Considerations

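- **Performance**: Keep screenshot capture and transmission fast enough that sending input does not feel slower
- **Privacy**: Capture only with explicit user consent, and handle screenshot data securely
- **Compatibility**: Support the platforms SpeakMCP runs on and the multimodal models it can connect to
- **User Experience**: Make the feature intuitive and seamless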
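
For the performance point, it helps to know how large a screenshot becomes once encoded: base64 turns every 3 bytes into 4 characters, so the payload grows by roughly a third. The sketch below estimates the encoded size and checks it against a budget. Both function names are hypothetical, and `maxEncodedBytes` is a placeholder to be filled in from each provider's documented limits rather than a known value.

```typescript
// Length of the padded base64 encoding of `byteLength` raw bytes.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Hypothetical pre-flight check; `maxEncodedBytes` is an assumed,
// provider-specific budget, not a value taken from any API documentation.
function exceedsPayloadBudget(
  imageByteLength: number,
  maxEncodedBytes: number,
): boolean {
  return base64Length(imageByteLength) > maxEncodedBytes;
}
```

A check like this could run before the request is built, so an oversized capture can be downscaled or re-encoded instead of failing at the API.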

### Acceptance Criteria

- [ ] Users can capture screenshots via a checkbox in the input UI
- [ ] Agents can be configured to use screenshot context
- [ ] Screenshot data is properly formatted for multimodal models
- [ ] Feature works with major multimodal model providers
- [ ] Performance impact is minimal

### Priority
Medium - This feature would significantly enhance the multimodal capabilities of SpeakMCP and improve user experience for visual context.
