Add Windows Support #23

@NakaokaRei

Overview

Add support for Windows platforms to SwiftAutoGUI, enabling automation capabilities on the world's most widely used desktop operating system.

Motivation

Windows has the largest desktop market share, and supporting it would make SwiftAutoGUI accessible to a vast number of developers and users. This would position SwiftAutoGUI as a comprehensive cross-platform automation solution.

Technical Requirements

1. Platform APIs

  • Provide Windows API equivalents for the CoreGraphics calls used by the macOS implementation
  • Utilize Windows API functions:
    • Mouse control (SetCursorPos, SendInput; the older mouse_event is superseded by SendInput)
    • Keyboard control (SendInput; the older keybd_event is superseded by SendInput)
    • Screen capture (BitBlt, Desktop Duplication API)

2. Dependencies

  • Swift on Windows compatibility
  • Windows SDK integration
  • OpenCV for Windows (already cross-platform)
  • Potential use of Windows.h and User32.dll

3. Implementation Tasks

Mouse Control

  • Implement mouse movement using SetCursorPos
  • Implement click events using SendInput
  • Implement drag operations
  • Implement scroll events using SendInput with MOUSEEVENTF_WHEEL (mouse_event is superseded by SendInput)
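
As a rough sketch of how the drag task could be approached: a drag can be modelled as a press, a series of interpolated moves, and a release. The `DragStep` type and the `dragPlan`, `sendMouseButton` and `performDrag` functions below are hypothetical names for illustration, not existing SwiftAutoGUI API. The planning part is plain Swift; only the guarded section touches WinSDK, and it assumes that SetCursorPos plus SendInput is enough for the target application to recognise the gesture.

```swift
/// One step of a synthetic drag gesture (hypothetical type for illustration).
enum DragStep: Equatable {
    case press(x: Int32, y: Int32)
    case move(x: Int32, y: Int32)
    case release(x: Int32, y: Int32)
}

/// Builds a press at `from`, `steps` linearly interpolated moves, and a release at `to`.
func dragPlan(from: (x: Int32, y: Int32),
              to: (x: Int32, y: Int32),
              steps: Int = 10) -> [DragStep] {
    let count = max(steps, 1)
    var plan: [DragStep] = [.press(x: from.x, y: from.y)]
    for i in 1...count {
        let t = Double(i) / Double(count)
        let x = Double(from.x) + Double(to.x - from.x) * t
        let y = Double(from.y) + Double(to.y - from.y) * t
        plan.append(.move(x: Int32(x.rounded()), y: Int32(y.rounded())))
    }
    plan.append(.release(x: to.x, y: to.y))
    return plan
}

#if os(Windows)
import WinSDK

/// Sends one left-button transition at the current cursor position.
private func sendMouseButton(_ flags: DWORD) {
    var input = INPUT()
    input.type = DWORD(INPUT_MOUSE)
    input.mi.dwFlags = flags
    _ = SendInput(1, &input, Int32(MemoryLayout<INPUT>.size))
}

/// Replays a plan; the 10 ms pause between moves is an arbitrary assumption.
func performDrag(_ plan: [DragStep]) {
    for step in plan {
        switch step {
        case let .press(x, y):
            _ = SetCursorPos(x, y)
            sendMouseButton(DWORD(MOUSEEVENTF_LEFTDOWN))
        case let .move(x, y):
            _ = SetCursorPos(x, y)
            Sleep(10)
        case let .release(x, y):
            _ = SetCursorPos(x, y)
            sendMouseButton(DWORD(MOUSEEVENTF_LEFTUP))
        }
    }
}
#endif
```

Keeping the plan separate from the dispatch means the interpolation can be unit-tested on macOS, while only the small guarded part needs a Windows machine.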

Keyboard Control

  • Map Swift Key enum to Windows Virtual-Key codes
  • Implement key press/release using SendInput
  • Support for keyboard shortcuts
  • Handle different keyboard layouts and locales
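
A minimal sketch of the key mapping and shortcut tasks, under stated assumptions: `DemoKey` is a hypothetical stand-in for the library's Key enum (whose actual cases may differ), and `virtualKeyCode`, `shortcutEvents` and `send` are illustrative names. The Virtual-Key values are the documented Windows constants. Layout-dependent characters are not handled here; something like VkKeyScanW would probably be needed for those.

```swift
/// Hypothetical subset of a key enum; the real SwiftAutoGUI `Key` cases may differ.
enum DemoKey: Hashable {
    case returnKey, tab, space, escape, shift, control, alt
    case leftArrow, upArrow, rightArrow, downArrow
    case letter(Character)
}

/// Returns the Windows Virtual-Key code for a key, or nil if it is not covered.
func virtualKeyCode(for key: DemoKey) -> UInt16? {
    switch key {
    case .tab:        return 0x09 // VK_TAB
    case .returnKey:  return 0x0D // VK_RETURN
    case .shift:      return 0x10 // VK_SHIFT
    case .control:    return 0x11 // VK_CONTROL
    case .alt:        return 0x12 // VK_MENU
    case .escape:     return 0x1B // VK_ESCAPE
    case .space:      return 0x20 // VK_SPACE
    case .leftArrow:  return 0x25 // VK_LEFT
    case .upArrow:    return 0x26 // VK_UP
    case .rightArrow: return 0x27 // VK_RIGHT
    case .downArrow:  return 0x28 // VK_DOWN
    case .letter(let c):
        // VK codes for 0-9 and A-Z equal their ASCII values.
        guard let ascii = c.asciiValue else { return nil }
        switch ascii {
        case 0x30...0x39, 0x41...0x5A: return UInt16(ascii)
        case 0x61...0x7A: return UInt16(ascii - 0x20) // a-z -> A-Z
        default: return nil
        }
    }
}

/// A key transition to inject (hypothetical type).
struct KeyEvent: Equatable {
    let vk: UInt16
    let isDown: Bool
}

/// Presses keys in order, then releases them in reverse order.
func shortcutEvents(_ keys: [DemoKey]) -> [KeyEvent]? {
    var codes: [UInt16] = []
    for key in keys {
        guard let vk = virtualKeyCode(for: key) else { return nil }
        codes.append(vk)
    }
    let presses = codes.map { KeyEvent(vk: $0, isDown: true) }
    let releases = codes.reversed().map { KeyEvent(vk: $0, isDown: false) }
    return presses + releases
}

#if os(Windows)
import WinSDK

/// Injects the events with a single SendInput call.
func send(_ events: [KeyEvent]) {
    var inputs = events.map { event -> INPUT in
        var input = INPUT()
        input.type = DWORD(INPUT_KEYBOARD)
        input.ki.wVk = event.vk
        input.ki.dwFlags = event.isDown ? 0 : DWORD(KEYEVENTF_KEYUP)
        return input
    }
    _ = SendInput(UINT(inputs.count), &inputs, Int32(MemoryLayout<INPUT>.size))
}
#endif
```

For example, `shortcutEvents([.control, .letter("c")])` yields Ctrl down, C down, C up, Ctrl up, which could then be passed to `send` on Windows.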

Screen Capture

  • Implement screenshot using BitBlt or Desktop Duplication API
  • Support for capturing specific windows
  • Handle multiple monitors and DPI scaling
  • Pixel color detection using GetPixel
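
For the pixel color task, GetPixel returns a COLORREF laid out as 0x00BBGGRR, so the channels have to be unpacked before they can be compared with colors from other platforms. The sketch below assumes a hypothetical `RGB` type and `decodeColorRef` / `pixelColor` helpers; none of these names come from SwiftAutoGUI.

```swift
/// Simple 8-bit-per-channel color (hypothetical type for illustration).
struct RGB: Equatable {
    let red, green, blue: UInt8
}

/// Decodes a COLORREF (0x00BBGGRR). Returns nil for CLR_INVALID (0xFFFFFFFF),
/// which GetPixel reports when the point lies outside the clipping region.
func decodeColorRef(_ value: UInt32) -> RGB? {
    guard value != 0xFFFF_FFFF else { return nil }
    return RGB(red: UInt8(value & 0xFF),
               green: UInt8((value >> 8) & 0xFF),
               blue: UInt8((value >> 16) & 0xFF))
}

#if os(Windows)
import WinSDK

/// Reads one pixel from the screen device context.
func pixelColor(x: Int32, y: Int32) -> RGB? {
    guard let screen = GetDC(nil) else { return nil }
    defer { _ = ReleaseDC(nil, screen) }
    return decodeColorRef(GetPixel(screen, x, y))
}
#endif
```

GetPixel is convenient for single points; for many pixels, reading from a BitBlt capture would likely be much faster.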

Image Recognition

  • Ensure OpenCV integration works on Windows
  • Handle Windows coordinate systems
  • Test with high-DPI displays
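
One coordinate issue worth illustrating: on multi-monitor setups the virtual desktop origin can be negative, so a match location inside a full-desktop screenshot has to be offset before it is used as a cursor position. The sketch below assumes the screenshot covers the whole virtual desktop and that the process is DPI-aware, so image pixels and cursor coordinates use the same units. `MatchRect`, `screenPoint` and `virtualScreenOrigin` are hypothetical names.

```swift
/// A template-match result in screenshot pixel coordinates (hypothetical type).
struct MatchRect {
    let x, y, width, height: Int32
}

/// Converts the centre of a match into screen coordinates by adding the
/// top-left corner of the virtual desktop.
func screenPoint(forMatch rect: MatchRect,
                 virtualOrigin: (x: Int32, y: Int32)) -> (x: Int32, y: Int32) {
    (x: virtualOrigin.x + rect.x + rect.width / 2,
     y: virtualOrigin.y + rect.y + rect.height / 2)
}

#if os(Windows)
import WinSDK

/// Top-left corner of the virtual desktop; negative when a monitor sits
/// left of or above the primary one.
func virtualScreenOrigin() -> (x: Int32, y: Int32) {
    (x: GetSystemMetrics(SM_XVIRTUALSCREEN),
     y: GetSystemMetrics(SM_YVIRTUALSCREEN))
}
#endif
```

With a secondary monitor to the left of a 1920-pixel-wide primary, a match at (100, 50) of size 20x10 would map to (-1810, 55).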

4. Build System

  • Update Package.swift for Windows targets
  • Handle Windows-specific linking requirements
  • Integration with Visual Studio build tools
  • CI/CD pipeline for Windows testing
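
A possible starting point for the Package.swift change, assuming a single library target named SwiftAutoGUI (the actual target layout, tools version and existing dependencies are not stated in this issue). The platform condition keeps the extra linker settings from affecting macOS builds. Whether explicit linking is needed at all may depend on the toolchain, since importing WinSDK can already pull in the common system libraries.

```swift
// swift-tools-version:5.5
import PackageDescription

let package = Package(
    name: "SwiftAutoGUI",
    products: [
        .library(name: "SwiftAutoGUI", targets: ["SwiftAutoGUI"]),
    ],
    targets: [
        .target(
            name: "SwiftAutoGUI",
            linkerSettings: [
                // Input and window functions (SendInput, SetCursorPos, GetDC).
                .linkedLibrary("User32", .when(platforms: [.windows])),
                // GDI functions used for capture (BitBlt, GetPixel).
                .linkedLibrary("Gdi32", .when(platforms: [.windows])),
            ]
        ),
    ]
)
```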

5. Testing

  • Port existing tests to Windows
  • Add Windows-specific test cases
  • Test on Windows 10 and Windows 11
  • Verify behavior with UAC and different privilege levels

Example Implementation Structure

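#if os(Windows)
import WinSDK

extension SwiftAutoGUI {
    /// Moves the cursor by (dx, dy) relative to its current position.
    public static func moveMouse(dx: Int32, dy: Int32) {
        var point = POINT()
        GetCursorPos(&point)
        SetCursorPos(point.x + dx, point.y + dy)
    }

    /// Sends a left-button press followed by a release at the current cursor position.
    public static func leftClick() {
        var input = INPUT()
        // The Win32 constants are imported as Int32, while the struct fields are DWORD.
        input.type = DWORD(INPUT_MOUSE)
        input.mi.dwFlags = DWORD(MOUSEEVENTF_LEFTDOWN)
        // SendInput expects the structure size as a C int.
        SendInput(1, &input, Int32(MemoryLayout<INPUT>.size))

        input.mi.dwFlags = DWORD(MOUSEEVENTF_LEFTUP)
        SendInput(1, &input, Int32(MemoryLayout<INPUT>.size))
    }
}
#endif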

Challenges

  1. Swift on Windows: Still evolving ecosystem with limited libraries
  2. API Complexity: Windows API can be more complex than macOS equivalents
  3. Security: Windows Defender and antivirus software may flag automation
  4. DPI Scaling: Handling different DPI settings across monitors
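  5. UAC: SendInput is subject to User Interface Privilege Isolation, so input can only be injected into applications running at an equal or lower integrity level; elevated windows will ignore automation from a non-elevated process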

Success Criteria

  • All core SwiftAutoGUI functions work on Windows 10/11
  • Tests pass on both Windows 10 and Windows 11
  • Proper handling of high-DPI displays
  • Documentation includes Windows-specific setup instructions
  • No false positives from Windows Defender

Implementation Priority

  1. Basic mouse movement and clicks
  2. Keyboard input
  3. Screenshot functionality
  4. Image recognition features
  5. Advanced features (multi-monitor support, etc.)

Labels: enhancement, help wanted
