feat(ios): implement proper device pixel ratio detection (#1257)

quanru · claude · web-flow · commit 42b628067ded · 2025-09-28T15:28:36.000+08:00
* feat(ios): implement proper device pixel ratio detection

  Replace hardcoded scale=1 with real device pixel ratio from WebDriverAgent API.

  - Add getScreenScale() method to IOSWebDriverClient using /wda/screen endpoint
  - Add getDevicePixelRatio() method to IOSDevice for consistent DPR detection
  - Update getScreenSize() to use real scale instead of hardcoded value
  - Improve fallback values from scale 1 to scale 2 (more typical for iOS)
  - Add iOS-specific terms to dictionary for spell check

  This aligns iOS implementation with Android's approach of getting real DPR from system APIs, ensuring accurate coordinate calculations for different device densities.

  🤖 Generated with [Claude Code](https://claude.ai/code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(ios): streamline device pixel ratio retrieval and error handling

* perf(ios): cache device pixel ratio to avoid repeated API calls

  Optimize device pixel ratio handling by:

  - Add devicePixelRatioInitialized flag to cache the value
  - Implement initializeDevicePixelRatio() method for one-time initialization
  - Call initialization in getScreenSize() to ensure it's cached
  - Update description to use the correct scale value

  This prevents repeated calls to WebDriverAgent /wda/screen endpoint and improves performance, following the same pattern as Android implementation.

* perf(android): cache device pixel ratio to avoid repeated ADB calls

  Optimize Android device pixel ratio handling by:

  - Add devicePixelRatioInitialized flag to cache the DPR value
  - Implement initializeDevicePixelRatio() method for one-time initialization
  - Call initialization in size() method to ensure it's cached
  - Remove repeated getDisplayDensity() calls from size() method

  This prevents expensive ADB shell commands (dumpsys display) from being executed on every size() call, significantly improving performance. The getDisplayDensity() method involves complex regex parsing and multiple shell commands, making caching essential for good performance.

* fix(ios,android): throw error when device pixel ratio detection fails

  Remove safe default values and throw errors when DPR detection fails to:

  - Make device configuration issues visible immediately
  - Prevent silent failures that could lead to coordinate calculation errors
  - Ensure proper device setup before automation begins

  Changes:

  - iOS: Remove fallback to scale=2, throw error if WebDriverAgent API fails
  - Android: Remove try-catch wrapper in initializeDevicePixelRatio()
  - Both platforms now fail fast when DPR cannot be determined

  This ensures coordinate calculations are always based on accurate device information rather than potentially incorrect default values.

* refactor(android): remove unnecessary coordinate conversion and image resizing

  Remove redundant operations that were converting between physical and logical pixels:

  - Remove reverseAdjustCoordinates() method and its usage in size()
  - Return physical pixel dimensions directly from size() method
  - Remove resizeAndConvertImgBuffer() from screenshotBase64()
  - Remove unnecessary width/height parameters from screenshot method
  - Remove unused import resizeAndConvertImgBuffer

  The Android screenshot is already in physical pixels, and coordinate conversion is handled by adjustCoordinates() when needed for touch operations. This simplifies the code and avoids double conversions that could introduce errors.

* fix(android): restore logical pixel coordinate system for accurate touch

  Fix coordinate system to ensure accurate touch operations:

  - Convert physical pixels to logical pixels in size() method
  - Return logical dimensions that users expect (width/dpr, height/dpr)
  - adjustCoordinates() converts logical coordinates to physical pixels for touch
  - This maintains the expected coordinate system where user coordinates match screen dimensions

  Without this fix, touch coordinates were being double-scaled since size() returned physical pixels but adjustCoordinates() was still scaling them up.

* fix(playground): update NavActions props to disable tooltip and model name display

* test(ios): update unit tests for new device pixel ratio implementation

  Update iOS device tests to match the new DPR detection implementation:

  - Add getScreenScale mock method returning scale=2
  - Update expected DPR from 1 to 2 in all size() assertions
  - Add getScreenScale mock to additional test setups
  - Update test comments to reflect new DPR source

  Tests now correctly validate that device pixel ratio is obtained from WebDriverAgent's /wda/screen endpoint rather than being hardcoded.

* fix(ios): simplify device pixel ratio initialization and error handling

* fix(ios): replace error throwing with assertions for element location and device state

---------

Co-authored-by: Claude <noreply@anthropic.com>
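
The double-scaling issue described in the Android coordinate fix is easier to see with numbers. The following is a minimal sketch of the logical/physical round trip, not the code from this change: `densityToDpr`, `toLogicalSize`, and `toPhysicalPoint` are hypothetical helper names, and only the 160 baseline density and the `Math.round` conversions are taken from the diff.

```typescript
// Hypothetical helpers for illustration only; they are not part of the Midscene API.

interface Size2D {
  width: number;
  height: number;
}

interface Point2D {
  x: number;
  y: number;
}

// Android's baseline density is 160 dpi, so DPR = density / 160.
function densityToDpr(density: number): number {
  if (!(density > 0)) {
    // Fail fast instead of silently falling back to a default ratio.
    throw new Error(`Invalid display density: ${density}`);
  }
  return density / 160;
}

// Physical pixels (what the screenshot contains) to logical pixels
// (what a size()-style method reports to callers).
function toLogicalSize(physical: Size2D, dpr: number): Size2D {
  return {
    width: Math.round(physical.width / dpr),
    height: Math.round(physical.height / dpr),
  };
}

// Logical coordinates supplied by callers back to physical pixels for touch input.
function toPhysicalPoint(logical: Point2D, dpr: number): Point2D {
  return {
    x: Math.round(logical.x * dpr),
    y: Math.round(logical.y * dpr),
  };
}

const dpr = densityToDpr(420);
const logical = toLogicalSize({ width: 1080, height: 2400 }, dpr);
const tap = toPhysicalPoint({ x: logical.width, y: logical.height }, dpr);
console.log(dpr, logical, tap);
```

If the size were reported in physical pixels while touch coordinates were still multiplied by the ratio, a tap at the reported bottom-right corner would land far outside the screen, which is the double scaling the fix avoids.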
diff --git a/apps/playground/src/App.tsx b/apps/playground/src/App.tsx
@@ -127,7 +127,10 @@ export default function App() {
                 <div className="playground-panel-header">
                   <div className="header-row">
                     <Logo />
-                    <NavActions showEnvConfig={false} />
+                    <NavActions
+                      showTooltipWhenEmpty={false}
+                      showModelName={false}
+                    />
                   </div>
                 </div>
 
diff --git a/packages/android/src/device.ts b/packages/android/src/device.ts
@@ -32,7 +32,6 @@ import type { ElementInfo } from '@midscene/shared/extractor';
 import {
   createImgBase64ByFormat,
   isValidPNGImageBuffer,
-  resizeAndConvertImgBuffer,
 } from '@midscene/shared/img';
 import { getDebug } from '@midscene/shared/logger';
 import { uuid } from '@midscene/shared/utils';
@@ -67,6 +66,7 @@ export class AndroidDevice implements AbstractInterface {
   private deviceId: string;
   private yadbPushed = false;
   private devicePixelRatio = 1;
+  private devicePixelRatioInitialized = false;
   private adb: ADB | null = null;
   private connectingAdb: Promise<ADB> | null = null;
   private destroyed = false;
@@ -583,6 +583,20 @@ ${Object.keys(size)
     throw new Error(`Failed to get screen size, output: ${stdout}`);
   }
 
+  private async initializeDevicePixelRatio(): Promise<void> {
+    if (this.devicePixelRatioInitialized) {
+      return;
+    }
+
+    // Get device display density using custom method
+    const densityNum = await this.getDisplayDensity();
+    // Standard density is 160, calculate the ratio
+    this.devicePixelRatio = Number(densityNum) / 160;
+    debugDevice(`Initialized device pixel ratio: ${this.devicePixelRatio}`);
+
+    this.devicePixelRatioInitialized = true;
+  }
+
   async getDisplayDensity(): Promise<number> {
     const adb = await this.getAdb();
 
@@ -679,6 +693,9 @@ ${Object.keys(size)
   }
 
   async size(): Promise<Size> {
+    // Ensure device pixel ratio is initialized first
+    await this.initializeDevicePixelRatio();
+
     // Use custom getScreenSize method instead of adb.getScreenSize()
     const screenSize = await this.getScreenSize();
     // screenSize is a string like "width x height"
@@ -696,16 +713,12 @@ ${Object.keys(size)
     const width = Number.parseInt(match[isLandscape ? 2 : 1], 10);
     const height = Number.parseInt(match[isLandscape ? 1 : 2], 10);
 
-    // Get device display density using custom method
-    const densityNum = await this.getDisplayDensity();
-    // Standard density is 160, calculate the ratio
-    this.devicePixelRatio = Number(densityNum) / 160;
+    // Use cached device pixel ratio instead of calling getDisplayDensity() every time
 
-    // calculate logical pixel size using reverseAdjustCoordinates function
-    const { x: logicalWidth, y: logicalHeight } = this.reverseAdjustCoordinates(
-      width,
-      height,
-    );
+    // Convert physical pixels to logical pixels for consistent coordinate system
+    // adjustCoordinates() will convert back to physical pixels when needed for touch operations
+    const logicalWidth = Math.round(width / this.devicePixelRatio);
+    const logicalHeight = Math.round(height / this.devicePixelRatio);
 
     return {
       width: logicalWidth,
@@ -722,17 +735,6 @@ ${Object.keys(size)
     };
   }
 
-  private reverseAdjustCoordinates(
-    x: number,
-    y: number,
-  ): { x: number; y: number } {
-    const ratio = this.devicePixelRatio;
-    return {
-      x: Math.round(x / ratio),
-      y: Math.round(y / ratio),
-    };
-  }
-
   /**
    * Calculate the end point for scroll operations based on start point, scroll delta, and screen boundaries.
    * This method ensures that scroll operations stay within screen bounds and maintain a minimum scroll distance
@@ -800,7 +802,6 @@ ${Object.keys(size)
 
   async screenshotBase64(): Promise<string> {
     debugDevice('screenshotBase64 begin');
-    const { width, height } = await this.size();
     const adb = await this.getAdb();
     let screenshotBuffer;
     const androidScreenshotPath = `/data/local/tmp/midscene_screenshot_${uuid()}.png`;
@@ -865,20 +866,11 @@ ${Object.keys(size)
       }
     }
 
-    debugDevice('Resizing screenshot image');
-    const { buffer, format } = await resizeAndConvertImgBuffer(
-      // both "adb.takeScreenshot" and "shell screencap" result are png format
+    debugDevice('Converting to base64');
+    const result = createImgBase64ByFormat(
       'png',
-      screenshotBuffer,
-      {
-        width,
-        height,
-      },
+      screenshotBuffer.toString('base64'),
     );
-    debugDevice('Image resize completed');
-
-    debugDevice('Converting to base64');
-    const result = createImgBase64ByFormat(format, buffer.toString('base64'));
     debugDevice('screenshotBase64 end');
     return result;
   }
diff --git a/packages/ios/src/device.ts b/packages/ios/src/device.ts
@@ -42,6 +42,7 @@ export type IOSDeviceOpt = {
 export class IOSDevice implements AbstractInterface {
   private deviceId: string;
   private devicePixelRatio = 1;
+  private devicePixelRatioInitialized = false;
   private destroyed = false;
   private description: string | undefined;
   private customActions?: DeviceAction<any>[];
@@ -186,9 +187,7 @@ export class IOSDevice implements AbstractInterface {
         }),
         call: async (param) => {
           const element = param.locate;
-          if (!element) {
-            throw new Error('IOSLongPress requires an element to be located');
-          }
+          assert(element, 'IOSLongPress requires an element to be located');
           const [x, y] = element.center;
           await this.longPress(x, y, param?.duration);
         },
@@ -227,11 +226,10 @@ export class IOSDevice implements AbstractInterface {
   }
 
   public async connect(): Promise<void> {
-    if (this.destroyed) {
-      throw new Error(
-        `IOSDevice ${this.deviceId} has been destroyed and cannot execute commands`,
-      );
-    }
+    assert(
+      !this.destroyed,
+      `IOSDevice ${this.deviceId} has been destroyed and cannot execute commands`,
+    );
 
     debugDevice(`Connecting to iOS device: ${this.deviceId}`);
 
@@ -261,7 +259,7 @@ Model: ${deviceInfo.model}`
           : ''
       }
 Type: WebDriverAgent
-ScreenSize: ${size.width}x${size.height} (DPR: ${this.devicePixelRatio})
+ScreenSize: ${size.width}x${size.height} (DPR: ${size.scale})
 `;
       debugDevice('iOS device connected successfully', this.description);
     } catch (e) {
@@ -307,41 +305,48 @@ ScreenSize: ${size.width}x${size.height} (DPR: ${this.devicePixelRatio})
     };
   }
 
+  private async initializeDevicePixelRatio(): Promise<void> {
+    if (this.devicePixelRatioInitialized) {
+      return;
+    }
+
+    // Get real device pixel ratio from WebDriverAgent /wda/screen endpoint
+    const apiScale = await this.wdaBackend.getScreenScale();
+
+    assert(
+      apiScale && apiScale > 0,
+      'Failed to get device pixel ratio from WebDriverAgent API',
+    );
+
+    debugDevice(`Got screen scale from WebDriverAgent API: ${apiScale}`);
+    this.devicePixelRatio = apiScale;
+    this.devicePixelRatioInitialized = true;
+  }
+
   async getScreenSize(): Promise<{
     width: number;
     height: number;
     scale: number;
   }> {
-    try {
-      const windowSize = await this.wdaBackend.getWindowSize();
-      // WDA returns logical points, for our coordinate system we use scale = 1
-      // This means we work directly with the logical coordinates that WDA provides
-      const scale = 1; // Use 1 to work with WDA's logical coordinate system directly
-
-      return {
-        width: windowSize.width,
-        height: windowSize.height,
-        scale,
-      };
-    } catch (e) {
-      debugDevice(`Failed to get screen size: ${e}`);
-      // Fallback to default iPhone size with scale 1
-      return {
-        width: 402,
-        height: 874,
-        scale: 1,
-      };
-    }
+    // Ensure device pixel ratio is initialized
+    await this.initializeDevicePixelRatio();
+
+    const windowSize = await this.wdaBackend.getWindowSize();
+
+    return {
+      width: windowSize.width,
+      height: windowSize.height,
+      scale: this.devicePixelRatio,
+    };
   }
 
   async size(): Promise<Size> {
     const screenSize = await this.getScreenSize();
-    this.devicePixelRatio = screenSize.scale;
 
     return {
       width: screenSize.width,
       height: screenSize.height,
-      dpr: this.devicePixelRatio,
+      dpr: screenSize.scale,
     };
   }
 
diff --git a/packages/ios/src/ios-webdriver-client.ts b/packages/ios/src/ios-webdriver-client.ts
@@ -316,6 +316,20 @@ export class IOSWebDriverClient extends WebDriverClient {
     debugIOS(`Double tapped at coordinates (${x}, ${y})`);
   }
 
+  async getScreenScale(): Promise<number | null> {
+    // Use the WDA-specific screen endpoint which we confirmed works
+    const screenResponse = await this.makeRequest('GET', '/wda/screen');
+    if (screenResponse?.value?.scale) {
+      debugIOS(
+        `Got screen scale from WDA screen endpoint: ${screenResponse.value.scale}`,
+      );
+      return screenResponse.value.scale;
+    }
+
+    debugIOS('No screen scale found in WDA screen response');
+    return null;
+  }
+
   async createSession(capabilities?: any): Promise<any> {
     // iOS-specific default capabilities
     const defaultCapabilities = {
diff --git a/packages/ios/tests/unit-test/device.test.ts b/packages/ios/tests/unit-test/device.test.ts
@@ -47,6 +47,9 @@ describe('IOSDevice', () => {
       model: 'iPhone 15',
     });
 
+    // Add getScreenScale mock for new DPR detection
+    mockWdaClient.getScreenScale = vi.fn().mockResolvedValue(2);
+
     MockedWdaClient.mockImplementation(() => mockWdaClient);
 
     // Setup mock WDA manager
@@ -183,7 +186,7 @@ describe('IOSDevice', () => {
       expect(size).toEqual({
         width: 375,
         height: 812,
-        dpr: 1,
+        dpr: 2,
       });
       expect(mockWdaClient.getWindowSize).toHaveBeenCalled();
     });
@@ -295,7 +298,7 @@ describe('IOSDevice', () => {
       expect(size).toEqual({
         width: 375,
         height: 812,
-        dpr: 1,
+        dpr: 2,
       });
     });
 
@@ -445,6 +448,7 @@ describe('IOSDevice', () => {
         createSession: vi.fn().mockResolvedValue({ sessionId: 'test-session' }),
         typeText: vi.fn().mockResolvedValue(undefined),
         getWindowSize: vi.fn().mockResolvedValue({ width: 375, height: 812 }),
+        getScreenScale: vi.fn().mockResolvedValue(2),
         swipe: vi.fn().mockResolvedValue(undefined),
         sessionInfo: { sessionId: 'test-session' }, // Ensure session info is available
       };
@@ -470,7 +474,7 @@ describe('IOSDevice', () => {
 
     it('should calculate DPR correctly', async () => {
       const size = await device.size();
-      expect(size.dpr).toBe(1); // Default DPR for mocked device
+      expect(size.dpr).toBe(2); // DPR from mocked getScreenScale
     });
 
     it('should handle different screen sizes', async () => {
@@ -479,7 +483,7 @@ describe('IOSDevice', () => {
         .mockResolvedValue({ width: 1920, height: 1080 });
 
       const size = await device.size();
-      expect(size.width).toBe(1920);
+      expect(size.width).toBe(1920); // iOS returns logical pixels directly from WDA
       expect(size.height).toBe(1080);
     });
 
diff --git a/scripts/dictionary.txt b/scripts/dictionary.txt
@@ -125,3 +125,8 @@ watchpack
 webm
 webp
 midscene
+mobilesafari
+dragfromtoforduration
+XCUI
+udid
+UDID