feat(ios): add iOS automation support via screen mirroring #987

Status: Open (23 commits, base: main)


@lhuanyu lhuanyu commented Aug 4, 2025

Add iOS Automation Support via Screen Mirroring

This PR introduces comprehensive iOS automation capabilities to Midscene through screen mirroring and PyAutoGUI integration.

Demo

ios-2025-08-04_23-03-44-4xv2jvtw.html.zip

🚀 Key Features

  • iOS Device Control: Automate iOS devices through macOS screen mirroring (iPhone Mirroring)
  • Coordinate Mapping: Automatic transformation from iOS coordinates to macOS screen coordinates
  • PyAutoGUI Backend: Reliable macOS system control through Python server integration
  • YAML Script Support: Write iOS automation scripts using familiar YAML syntax
  • AI Integration: Use natural language to interact with iOS interfaces

📦 What's Added

  • New @midscene/ios package with complete iOS automation SDK
  • iOSDevice and iOSAgent classes for device control and AI interactions
  • PyAutoGUI server (auto_server.py) for system-level operations
  • AppleScript utility for automatic mirror window detection
  • Comprehensive documentation and examples
  • Unit tests and proper TypeScript support

🛠 Usage Example

ios:
  serverUrl: "http://localhost:1412"
  mirrorConfig:
    mirrorX: 692
    mirrorY: 161
    mirrorWidth: 344
    mirrorHeight: 764

tasks:
  - name: Search music
    flow:
      - aiAction: "Open music app"
      - aiInput: "Coldplay"
        locate: "Search box"
      - aiKeyboardPress: "Enter"
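
The coordinate mapping that mirrorConfig drives can be sketched as a scale-and-offset transform. This is only an illustration, not the package's actual code: the function name iosToScreen, the interfaces, and the 393x852 logical device size are assumptions made for the example; only the four mirrorConfig keys and their values come from the usage example above.

```typescript
// Hypothetical shape mirroring the mirrorConfig keys from the usage example.
interface MirrorConfig {
  mirrorX: number; // left edge of the mirror window on the macOS screen
  mirrorY: number; // top edge of the mirror window on the macOS screen
  mirrorWidth: number;
  mirrorHeight: number;
}

interface Size {
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Map a point in iOS logical coordinates to macOS screen coordinates:
// scale it into the mirror window, then offset by the window's top-left corner.
function iosToScreen(point: Point, device: Size, mirror: MirrorConfig): Point {
  return {
    x: Math.round(mirror.mirrorX + (point.x * mirror.mirrorWidth) / device.width),
    y: Math.round(mirror.mirrorY + (point.y * mirror.mirrorHeight) / device.height),
  };
}

const mirror: MirrorConfig = {
  mirrorX: 692,
  mirrorY: 161,
  mirrorWidth: 344,
  mirrorHeight: 764,
};

// Assumed logical device size, chosen only for this example.
const device: Size = { width: 393, height: 852 };

// The centre of the device maps to the centre of the mirror window.
const center = iosToScreen({ x: 196.5, y: 426 }, device, mirror);
console.log(center); // { x: 864, y: 543 }
```

Rounding to whole pixels is a choice made for this sketch; a real backend might keep fractional coordinates.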

This implementation enables seamless iOS automation while maintaining Midscene's core AI-driven approach and developer-friendly experience.

@Copilot Copilot AI review requested due to automatic review settings August 4, 2025 09:57

netlify bot commented Aug 4, 2025

Deploy Preview for midscene ready!

  • 🔨 Latest commit: 3451621
  • 🔍 Latest deploy log: https://app.netlify.com/projects/midscene/deploys/689b5ee1597f410008153a64
  • 😎 Deploy Preview: https://deploy-preview-987--midscene.netlify.app

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces comprehensive iOS automation support to Midscene.js through screen mirroring and PyAutoGUI integration. The implementation enables users to automate iOS devices on macOS through screen mirroring with automatic coordinate transformation.

  • Adds new @midscene/ios package with complete iOS automation SDK
  • Implements PyAutoGUI server for macOS system control and iOS device interaction
  • Extends YAML script support to include iOS configuration alongside existing web and Android options
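
As a rough illustration of what supporting an ios block alongside web and android could involve, the sketch below picks the target environment from a parsed script object. The names ScriptEnv and detectTarget are hypothetical, and the rule that exactly one environment block must be present is an assumption of this example, not something the PR states.

```typescript
type ScriptTarget = 'web' | 'android' | 'ios';

// Hypothetical view of a parsed YAML script: each environment block is optional.
interface ScriptEnv {
  web?: unknown;
  android?: unknown;
  ios?: unknown;
}

// Return the single environment block a script declares.
// Assumption: a script must declare exactly one of web, android, or ios.
function detectTarget(script: ScriptEnv): ScriptTarget {
  const targets = (['web', 'android', 'ios'] as const).filter(
    (key) => script[key] !== undefined,
  );
  if (targets.length !== 1) {
    throw new Error(
      `Expected exactly one of web, android, ios; found ${targets.length}`,
    );
  }
  return targets[0];
}

console.log(detectTarget({ ios: { serverUrl: 'http://localhost:1412' } })); // "ios"
```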

Reviewed Changes

Copilot reviewed 37 out of 40 changed files in this pull request and generated 6 comments.

Summary per file:

  • packages/web-integration/tests/unit-test/yaml/utils.test.ts: Adds comprehensive tests for iOS configuration parsing
  • packages/web-integration/src/yaml/utils.ts: Updates YAML parser to support iOS configuration alongside web/Android
  • packages/web-integration/src/common/tasks.ts: Extends Android page detection to include iOS for unified mobile handling
  • packages/ios/src/page/index.ts: Core iOS device implementation with coordinate mapping and PyAutoGUI integration
  • packages/ios/src/agent/index.ts: iOS agent wrapper providing high-level automation interface
  • packages/ios/idb/auto_server.py: Python server implementing PyAutoGUI automation with iOS-specific optimizations
  • packages/cli/src/create-yaml-player.ts: Extends CLI to support iOS device creation and configuration
  • packages/core/src/yaml.ts: Adds iOS environment types to core YAML schema

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (1)

packages/web-integration/src/common/tasks.ts:75

  • The function name isAndroidPage is misleading since it now also returns true for iOS pages. Consider renaming to isMobilePage or isAndroidOrIOSPage to better reflect its current behavior.
  return page.pageType === 'android' || page.pageType === 'ios';
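
One way the suggested rename might look is sketched below. The PageLike interface is a stand-in invented for this example (the real page type in tasks.ts is not shown in this PR view), and keeping the old name as a deprecated alias is just one possible migration choice.

```typescript
// Stand-in for the real page type, which is not shown here.
interface PageLike {
  pageType: string;
}

// Renamed predicate: true for both Android and iOS pages.
function isMobilePage(page: PageLike): boolean {
  return page.pageType === 'android' || page.pageType === 'ios';
}

/** @deprecated Use isMobilePage; kept so existing call sites keep compiling. */
const isAndroidPage = isMobilePage;

console.log(isMobilePage({ pageType: 'ios' })); // true
```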

CLAassistant commented Aug 4, 2025

CLA assistant check
All committers have signed the CLA.
