-
Notifications
You must be signed in to change notification settings - Fork 35
os level screenshots #76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Mesa DescriptionThis PR introduces the capability to take OS-level screenshots, capturing the entire desktop rather than just the browser content. This is useful for debugging and support, allowing users to capture system dialogs and other applications running within the environment. Changes
Testing
Description generated by Mesa. Update settings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performed full review of a730519...b51a533
Analysis
-
The implementation hardcodes display
:1, which may cause issues in environments with different display configurations or multiple displays. -
The use of external dependencies (ffmpeg, xdpyinfo) creates potential points of failure if these tools are unavailable or have version incompatibilities on the target system.
-
While PNG format provides lossless compression, it may result in large response payloads for high-resolution screenshots, potentially causing network or performance issues for clients.
-
The current implementation may not handle concurrent screenshot requests efficiently, as resource-intensive ffmpeg operations could impact overall system performance under load.
-
The design might lack proper authentication or authorization controls specific to screenshot functionality, which could expose sensitive visual information.
Tip
⚡ Quick Actions
This review was generated by Mesa.
Actions:
Slash Commands:
/review- Request a full code review/review latest- Review only changes since the last review/describe- Generate PR description. This will update the PR body or issue comment depending on your configuration/help- Get help with Mesa commands and configuration options
4 files reviewed | 0 comments | Review on Mesa | Edit Reviewer Settings
hiroTamada
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
What is the use case for click mouse + type endpoints? Is it just nice to have for abstracting out os level interaction?
|
@hiroTamada some people require more high-fidelity computer actions vs. what playwright does (synthetic DOM events) |
Note
Adds
/computer/screenshot(PNG via ffmpeg, optional region) and/computer/type, tightens mouse APIs with screen-bounds checks and STZ guards, makes resolution queries error-aware, updates OpenAPI/client, and adds e2e screenshot tests.POST /computer/screenshot: capture PNG (ffmpeg x11grab), optionalregioncrop; streams image.POST /computer/type: type arbitrary text with optional per-keystrokedelay.MoveMouse/ClickMouse: add display bounds validation (viagetCurrentResolution), STZ disable/enable guards, improved logging; return 400 on OOB coords.getCurrentResolutionnow returns(w,h,rate,error);PatchDisplayhandles errors.ScreenshotRequest/ScreenshotRegion,TypeTextRequest.isPNG).Written by Cursor Bugbot for commit 4094aca. This will update automatically on new commits. Configure here.