Description
SwiftAutoGUI provides powerful image recognition features that can locate images on the screen. This functionality is not currently exposed through the MCP interface but would enable visual-based automation workflows.
Implementation Details
Add new tools to enable image recognition:
- findImage - Locate a single image on the screen
- findImageCenter - Find the center coordinates of an image
- findAllImages - Find all occurrences of an image
SwiftAutoGUI Methods to Use:
- locateOnScreen(imageName: String, confidence?: Double, region?: CGRect) - Find first match
- locateCenterOnScreen(imageName: String, confidence?: Double, region?: CGRect) - Find center of first match
- locateAllOnScreen(imageName: String, confidence?: Double, region?: CGRect) - Find all matches
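One way the three tools could be routed to the methods listed above is sketched here. This is only an illustration: the `ImageLocating` protocol, the `ImageToolResult` enum, `runImageTool`, and `StubLocator` are hypothetical names, and the return types (`CGRect?`, `CGPoint?`, `[CGRect]`) are assumptions, since the method list above does not state them. Passing the tool's `imagePath` through as `imageName` is likewise an assumption.

```swift
import Foundation

// Hypothetical abstraction over the three SwiftAutoGUI calls listed above.
// The return types are assumptions for this sketch, not confirmed signatures.
protocol ImageLocating {
    func locateOnScreen(imageName: String, confidence: Double?, region: CGRect?) -> CGRect?
    func locateCenterOnScreen(imageName: String, confidence: Double?, region: CGRect?) -> CGPoint?
    func locateAllOnScreen(imageName: String, confidence: Double?, region: CGRect?) -> [CGRect]
}

// The three proposed MCP tool names.
enum ImageTool: String {
    case findImage, findImageCenter, findAllImages
}

// One case per tool, carrying whatever the matching locate call returned.
enum ImageToolResult: Equatable {
    case box(CGRect?)
    case center(CGPoint?)
    case boxes([CGRect])
}

// Routes a tool call to the corresponding locate method.
// Assumption: the tool's imagePath is forwarded as the imageName argument.
func runImageTool(_ tool: ImageTool, imagePath: String, confidence: Double? = nil,
                  region: CGRect? = nil, using locator: ImageLocating) -> ImageToolResult {
    switch tool {
    case .findImage:
        return .box(locator.locateOnScreen(imageName: imagePath, confidence: confidence, region: region))
    case .findImageCenter:
        return .center(locator.locateCenterOnScreen(imageName: imagePath, confidence: confidence, region: region))
    case .findAllImages:
        return .boxes(locator.locateAllOnScreen(imageName: imagePath, confidence: confidence, region: region))
    }
}

// Stand-in locator returning canned matches, so the routing can be exercised without a screen.
struct StubLocator: ImageLocating {
    let matches: [CGRect]

    func locateOnScreen(imageName: String, confidence: Double?, region: CGRect?) -> CGRect? {
        matches.first
    }

    func locateCenterOnScreen(imageName: String, confidence: Double?, region: CGRect?) -> CGPoint? {
        matches.first.map { CGPoint(x: $0.midX, y: $0.midY) }
    }

    func locateAllOnScreen(imageName: String, confidence: Double?, region: CGRect?) -> [CGRect] {
        matches
    }
}
```

Keeping the locator behind a protocol is a design choice for testability; a real implementation would presumably wrap SwiftAutoGUI directly.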
Parameters
Common Parameters
- imagePath: Path to the reference image file
- confidence: Optional confidence threshold (0.0-1.0, default: 0.9)
- region: Optional search region (x, y, width, height)
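The common parameters could be decoded and validated along the following lines. This is a minimal sketch, not the server's actual code: `Region`, `ImageSearchArguments`, and `ValidationError` are hypothetical names, and rejecting an out-of-range confidence (rather than clamping it) is an assumption.

```swift
import Foundation

// Optional search region, mirroring the x/y/width/height fields described above.
struct Region: Codable, Equatable {
    let x: Double
    let y: Double
    let width: Double
    let height: Double
}

// Hypothetical argument type shared by findImage, findImageCenter, and findAllImages.
struct ImageSearchArguments: Decodable, Equatable {
    let imagePath: String
    let confidence: Double
    let region: Region?

    enum CodingKeys: String, CodingKey {
        case imagePath, confidence, region
    }

    enum ValidationError: Error, Equatable {
        case confidenceOutOfRange(Double)
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        imagePath = try container.decode(String.self, forKey: .imagePath)
        // Fall back to the documented default of 0.9 when confidence is omitted.
        confidence = try container.decodeIfPresent(Double.self, forKey: .confidence) ?? 0.9
        region = try container.decodeIfPresent(Region.self, forKey: .region)
        // Assumption: values outside 0.0-1.0 are rejected instead of clamped.
        guard (0.0...1.0).contains(confidence) else {
            throw ValidationError.confidenceOutOfRange(confidence)
        }
    }
}
```

Decoding `{"imagePath": "/path/to/button.png"}` with `JSONDecoder` would then yield a confidence of 0.9 and a `nil` region.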
Return Values
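As a rough sketch, the three response shapes in this section might be modeled with `Encodable` types like the ones below. All type names and the `encodeJSON` helper are hypothetical. The not-found shape is an assumption: this proposal only shows successful results, so the sketch emits just `"found": false` and omits the coordinates. The center is taken as x + width / 2 and y + height / 2, which matches the sample values in this section (100 + 50 / 2 = 125 and 200 + 30 / 2 = 215).

```swift
import Foundation

// A single match, using the same field names as the JSON responses in this section.
struct BoundingBox: Codable, Equatable {
    let x: Int
    let y: Int
    let width: Int
    let height: Int

    // Center of the box using integer division.
    var centerX: Int { x + width / 2 }
    var centerY: Int { y + height / 2 }
}

// findImage response. Nil fields are omitted by the synthesized Encodable conformance.
struct FindImageResponse: Encodable {
    let found: Bool
    let x: Int?
    let y: Int?
    let width: Int?
    let height: Int?

    init(match: BoundingBox?) {
        found = match != nil
        x = match?.x
        y = match?.y
        width = match?.width
        height = match?.height
    }
}

// findImageCenter response.
struct FindImageCenterResponse: Encodable {
    let found: Bool
    let x: Int?
    let y: Int?

    init(match: BoundingBox?) {
        found = match != nil
        x = match?.centerX
        y = match?.centerY
    }
}

// findAllImages response.
struct FindAllImagesResponse: Encodable {
    let matches: [BoundingBox]
}

// Encodes with sorted keys so the output is stable across runs.
func encodeJSON<T: Encodable>(_ value: T) -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    guard let data = try? encoder.encode(value) else { return "{}" }
    return String(decoding: data, as: UTF8.self)
}
```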
findImage
Returns the bounding box of the found image:
{
  "found": true,
  "x": 100,
  "y": 200,
  "width": 50,
  "height": 30
}
findImageCenter
Returns the center coordinates:
{
  "found": true,
  "x": 125,
  "y": 215
}
findAllImages
Returns an array of all matches:
{
  "matches": [
    {"x": 100, "y": 200, "width": 50, "height": 30},
    {"x": 300, "y": 400, "width": 50, "height": 30}
  ]
}
Use Cases
- GUI test automation
- Finding and clicking buttons based on appearance
- Waiting for visual elements to appear
- Verifying UI states
- Automating applications without accessible APIs
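The "waiting for visual elements to appear" case above could be built on top of a single-shot lookup with a simple polling loop. The sketch below is an assumption about how a caller might do this, not part of the proposed tools: `waitForImage` is a hypothetical helper, and the `locate` closure stands in for whatever performs the actual search (for example a `findImage` call).

```swift
import Foundation

// Polls `locate` until it returns a match or the timeout elapses.
// `locate` is always tried at least once, even with a zero timeout.
func waitForImage(timeout: TimeInterval,
                  pollInterval: TimeInterval = 0.5,
                  locate: () -> CGRect?) -> CGRect? {
    let deadline = Date().addingTimeInterval(timeout)
    repeat {
        if let match = locate() {
            return match
        }
        // Pause between attempts so the screen is not searched in a tight loop.
        Thread.sleep(forTimeInterval: pollInterval)
    } while Date() < deadline
    return nil
}
```

Because each attempt searches the screen, a longer `pollInterval` trades responsiveness for lower overhead, which ties in with the performance concern noted under Priority.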
Example Usage
{
  "tool": "findImage",
  "arguments": {
    "imagePath": "/path/to/button.png",
    "confidence": 0.95,
    "region": {
      "x": 0,
      "y": 0,
      "width": 1920,
      "height": 1080
    }
  }
}
Priority
Medium - While powerful, image recognition is more advanced than basic automation features and may have performance implications.