Skip to content

Comments

feat: many StartDeviceKit() optimizations#162

Merged
gmegidish merged 4 commits intomainfrom
feat-optimizing-start-devicekit
Feb 1, 2026
Merged

feat: many StartDeviceKit() optimizations#162
gmegidish merged 4 commits intomainfrom
feat-optimizing-start-devicekit

Conversation

@gmegidish
Copy link
Member

@gmegidish gmegidish commented Jan 31, 2026

Summary by CodeRabbit

  • New Features

    • Customizable attribute filtering when querying device UI/source.
  • Bug Fixes

    • More reliable screen-capture startup with smarter connection and port-forward reuse.
    • Improved detection of app foreground state during device initialization.
    • Better error handling and cleanup when initialization or connection steps fail.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 31, 2026

Walkthrough

Adds DeviceKit port-forwarder reuse and runtime detection into iOS screen-capture and DeviceKit startup flows (including foreground polling and forwarder cleanup), and adds a WDA source API to request only specified attributes with timing telemetry.

Changes

Cohort / File(s) Summary
DeviceKit startup & port forwarding
devices/ios.go
Introduces ensureDeviceKitPortForwarders to lazily create/manage DeviceKit HTTP and AVC forwarders; adds isDeviceKitRunning heuristic and waitForAppInForeground; updates StartScreenCapture and StartDeviceKit to reuse forwarders, replace fixed sleeps with foreground polling, adjust progress messages, and ensure forwarder cleanup on failures.
WDA source fetching
devices/wda/source.go
Adds GetSourceRawWithAttributes(attributes []string) to request WDA source filtered by attributes (via excluded_attributes), updates GetSourceRaw to record timing, and adds timing/attribute logging; GetSourceElements retained but references the new method path.
Imports / logging tweaks
devices/wda/source.go (imports)
Adds strings import and verbose timing logs related to source requests.

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main change: optimization of StartDeviceKit() function with multiple improvements including lazy port forwarder initialization, running state checks, and foreground app waits.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat-optimizing-start-devicekit

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
devices/ios.go (1)

1480-1502: ⚠️ Potential issue | 🔴 Critical

Critical: waitForAppInForeground is called before WDA is started.

The code flow has an ordering issue:

  1. Line 1472: LaunchApp(devicekitMainAppBundleId) — launches the DeviceKit app
  2. Line 1482: waitForAppInForeground(...) — polls d.wdaClient.GetActiveAppInfo() which requires WDA to be running
  3. Line 1491: StartAgent(...) — starts WDA if not running

If WDA is not already running, waitForAppInForeground will continuously fail on every poll and eventually timeout after 5 seconds, since GetActiveAppInfo() requires WDA to respond.

Either:

  1. Move StartAgent() before waitForAppInForeground()
  2. Or use a different mechanism to detect foreground state that doesn't rely on WDA
🐛 Proposed fix: reorder to start WDA first
 	// Launch the main DeviceKit app
 	utils.Verbose("Launching DeviceKit app: %s", devicekitMainAppBundleId)
 	startTime := time.Now()
 	err = d.LaunchApp(devicekitMainAppBundleId)
 	if err != nil {
 		// clean up port forwarders on failure
 		_ = d.portForwarderDeviceKit.Stop()
 		_ = d.portForwarderAvc.Stop()
 		return nil, fmt.Errorf("failed to launch DeviceKit app: %w", err)
 	}

+	// Start WebDriverAgent first so we can poll for foreground state
+	err = d.StartAgent(StartAgentConfig{
+		OnProgress: func(message string) {
+			utils.Verbose(message)
+		},
+	})
+	if err != nil {
+		// clean up port forwarders on failure
+		_ = d.portForwarderDeviceKit.Stop()
+		_ = d.portForwarderAvc.Stop()
+		return nil, fmt.Errorf("failed to start agent: %w", err)
+	}
+
 	// wait for the app to be in foreground
 	utils.Verbose("Waiting for DeviceKit app to be in foreground...")
 	err = d.waitForAppInForeground(devicekitMainAppBundleId, deviceKitAppLaunchTimeout)
 	if err != nil {
 		// clean up port forwarders on failure
 		_ = d.portForwarderDeviceKit.Stop()
 		_ = d.portForwarderAvc.Stop()
 		return nil, fmt.Errorf("failed to wait for DeviceKit app: %w", err)
 	}

-	// Start WebDriverAgent to be able to tap on the screen
-	err = d.StartAgent(StartAgentConfig{
-		OnProgress: func(message string) {
-			utils.Verbose(message)
-		},
-	})
-
-	if err != nil {
-		// clean up port forwarders on failure
-		_ = d.portForwarderDeviceKit.Stop()
-		_ = d.portForwarderAvc.Stop()
-		return nil, fmt.Errorf("failed to start agent: %w", err)
-	}
-
 	// find and tap the "Start Broadcast" button
🧹 Nitpick comments (1)
devices/ios.go (1)

1269-1335: Consider holding the lock through the Forward() call to prevent race conditions.

There's a potential race condition in the forwarder creation pattern:

d.mu.Lock()
d.portForwarderDeviceKit = ios.NewPortForwarder(d.ID())
d.mu.Unlock()

err = d.portForwarderDeviceKit.Forward(httpPort, deviceKitHTTPPort)  // Outside lock

Between assigning the forwarder and calling Forward(), another goroutine could check d.portForwarderDeviceKit.IsRunning() and get false, potentially leading to duplicate forwarder creation attempts.

Consider either:

  1. Holding the lock through the Forward() call (if it's non-blocking/fast)
  2. Using a separate "in-progress" flag

This may be acceptable in practice if concurrent calls are rare, but worth noting for robustness.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
devices/ios.go (1)

1481-1503: ⚠️ Potential issue | 🟠 Major

Start WDA before foreground polling to avoid guaranteed timeouts.

waitForAppInForeground relies on WDA; when WDA isn’t running yet, this loop will always time out and StartDeviceKit fails for a fresh session. Start the agent first (or use a non‑WDA signal).

🔧 Suggested reorder
-    // wait for the app to be in foreground
-    utils.Verbose("Waiting for DeviceKit app to be in foreground...")
-    err = d.waitForAppInForeground(devicekitMainAppBundleId, deviceKitAppLaunchTimeout)
-    if err != nil {
-        // clean up port forwarders on failure
-        _ = d.portForwarderDeviceKit.Stop()
-        _ = d.portForwarderAvc.Stop()
-        return nil, fmt.Errorf("failed to wait for DeviceKit app: %w", err)
-    }
-
-    // Start WebDriverAgent to be able to tap on the screen
-    err = d.StartAgent(StartAgentConfig{
-        OnProgress: func(message string) {
-            utils.Verbose(message)
-        },
-    })
-
-    if err != nil {
-        // clean up port forwarders on failure
-        _ = d.portForwarderDeviceKit.Stop()
-        _ = d.portForwarderAvc.Stop()
-        return nil, fmt.Errorf("failed to start agent: %w", err)
-    }
+    // Start WebDriverAgent to enable foreground checks and UI interactions
+    err = d.StartAgent(StartAgentConfig{
+        OnProgress: func(message string) {
+            utils.Verbose(message)
+        },
+    })
+    if err != nil {
+        // clean up port forwarders on failure
+        _ = d.portForwarderDeviceKit.Stop()
+        _ = d.portForwarderAvc.Stop()
+        return nil, fmt.Errorf("failed to start agent: %w", err)
+    }
+
+    // wait for the app to be in foreground
+    utils.Verbose("Waiting for DeviceKit app to be in foreground...")
+    err = d.waitForAppInForeground(devicekitMainAppBundleId, deviceKitAppLaunchTimeout)
+    if err != nil {
+        // clean up port forwarders on failure
+        _ = d.portForwarderDeviceKit.Stop()
+        _ = d.portForwarderAvc.Stop()
+        return nil, fmt.Errorf("failed to wait for DeviceKit app: %w", err)
+    }
🧹 Nitpick comments (1)
devices/ios.go (1)

1338-1395: Consider probing the HTTP health endpoint on port 12004 instead of reading from the H.264 stream.

The current implementation tests DeviceKit readiness by reading a byte from the H.264 stream (port 12005), which can return false negatives if DeviceKit is running but idle and not actively streaming. The HTTP server on port 12004 (already forwarded for tap/dumpUI commands) would be a more reliable readiness probe. A simple HTTP GET to a health endpoint—or just testing successful TCP connection to the HTTP port—would confirm DeviceKit is running without depending on active stream data. Verify with the devicekit-ios documentation whether the HTTP server exposes a stable health/status endpoint suitable for readiness checks.

@gmegidish gmegidish merged commit 6f6decc into main Feb 1, 2026
14 of 15 checks passed
@gmegidish gmegidish deleted the feat-optimizing-start-devicekit branch February 1, 2026 20:36
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@devices/wda/source.go`:
- Around line 115-161: GetSourceRawWithAttributes currently ignores invalid
attribute names; change it to validate each entry in the input attributes slice
against the allAttributes list and return an error listing any unknown
attributes instead of silently continuing. Replace the nested loop membership
test by building a set/map from allAttributes (e.g., allowedAttrs
map[string]struct{}) and checking requested attributes against that map, then
build excludedAttrs from allowedAttrs minus requested ones; keep existing
behavior for building excludedStr, calling getEndpointWithTimeout, and logging.
🧹 Nitpick comments (2)
devices/wda/source.go (2)

166-167: Remove commented-out alternative call or gate it with a flag.

Commented-out code tends to rot and adds noise. If this is meant as a toggle, consider a config/flag; otherwise, remove it.

🧹 Cleanup suggestion
-	// only fetch the attributes we actually use
-	// value, err := c.GetSourceRawWithAttributes([]string{"type", "name", "label", "value", "visible", "x", "y", "width", "height"})

96-113: Omit the empty excluded_attributes parameter for consistency.

The current code sends excluded_attributes= with an empty value, but WDA treats this parameter as optional. The GetSourceRawWithAttributes() method in this same file demonstrates the cleaner pattern by omitting the parameter entirely when no attributes need exclusion. Align GetSourceRaw() with this pattern.

Suggested change
-	endpoint := "source?format=json&excluded_attributes="
+	endpoint := "source?format=json"

Comment on lines +115 to 161
// GetSourceRawWithAttributes gets the raw page source with only the specified attributes included
func (c *WdaClient) GetSourceRawWithAttributes(attributes []string) (interface{}, error) {
startTime := time.Now()

// all possible attributes that can be excluded
allAttributes := []string{
"type", "value", "name", "label", "enabled", "visible", "accessible", "focused",
"x", "y", "width", "height", "index", "hittable", "bundleId", "processId",
"placeholderValue", "nativeFrame", "traits", "minValue", "maxValue", "customActions",
}

// build excluded list by removing requested attributes from all attributes
excludedAttrs := []string{}
for _, attr := range allAttributes {
include := false
for _, requestedAttr := range attributes {
if attr == requestedAttr {
include = true
break
}
}
if !include {
excludedAttrs = append(excludedAttrs, attr)
}
}

excludedStr := ""
if len(excludedAttrs) > 0 {
excludedStr = fmt.Sprintf("&excluded_attributes=%s", strings.Join(excludedAttrs, ","))
}

endpoint := fmt.Sprintf("source?format=json%s", excludedStr)

result, err := c.getEndpointWithTimeout(endpoint, 60*time.Second)
if err != nil {
return nil, fmt.Errorf("failed to get source: %w", err)
}

value, ok := result["value"]
if !ok {
return nil, fmt.Errorf("no 'value' field found in WDA response")
}

elapsed := time.Since(startTime)
utils.Verbose("GetSourceRawWithAttributes took %.2f seconds (attributes: %v)", elapsed.Seconds(), attributes)

return value, nil
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🌐 Web query:

WDA WebDriverAgent /source endpoint excluded_attributes specification

💡 Result:

WebDriverAgent itself exposes the standard WebDriver endpoint:

  • GET /session/:sessionId/source → returns the XML page source (there is no formal, W3C-defined excluded_attributes query parameter).

In the Appium + WDA ecosystem, “exclude attributes from source” is specified via the Appium XCUITest driver (which then instructs WDA accordingly), using camelCase:

1) mobile: source (recommended; per-call control)

Execute Script:

  • script: mobile: source
  • args:
    • format: xml (default), description, or json
    • excludedAttributes: comma-separated string of attribute names to omit (only applies to xml) [1]

Example args:

{"format":"xml","excludedAttributes":"visible,accessible"}

[1]

2) Persistent setting: pageSourceExcludedAttributes

Set the XCUITest driver setting:

  • pageSourceExcludedAttributes: comma-separated string (defaults to empty). Example: "visible,accessible" [2]

Notes:

  • Appium specifically calls out excluding expensive attributes like visible / accessible as a way to speed up page source retrieval [3].
  • If you are using Appium’s JSON source mode, the doc lists excludable attributes (for that mode) as: frame, enabled, visible, accessible, focused [2].

Sources: [1] [2] [3]


🌐 Web query:

WDA source attributes list accessible visible x y width height

💡 Result:

In WebDriverAgent (WDA) page source as exposed by Appium’s XCUITest driver, the supported element attributes are: name, label, type, visible, focused, accessible, enabled, selected, index, rect, value (and hittable, but it’s not included in the XML source for performance reasons). [1]

For position/size, WDA exposes them via rect, which contains x, y, width, height (derived from the element’s frame). [1]

So if you’re looking for: accessible, visible, x, y, width, height → use accessible, visible, and rect.{x,y,width,height}. [1]

Sources: [1]


🏁 Script executed:

# Search the codebase for WDA source-related code
find . -type f -name "*.go" | xargs grep -l "source" | head -20

Repository: mobile-next/mobilecli

Length of output: 249


🏁 Script executed:

# Look for any documentation or comments about WDA attributes
rg -i "wda.*attribute|excluded.*attribute" --type go

Repository: mobile-next/mobilecli

Length of output: 464


🏁 Script executed:

# Check the specific file and surrounding context
fd "source.go" --type f

Repository: mobile-next/mobilecli

Length of output: 86


🏁 Script executed:

# Read the source.go file to understand the current implementation
cat -n ./devices/wda/source.go

Repository: mobile-next/mobilecli

Length of output: 6890


🏁 Script executed:

# Check if there are tests for this method
fd -e "test.go" -e "_test.go" | xargs grep -l "GetSourceRawWithAttributes" 2>/dev/null

Repository: mobile-next/mobilecli

Length of output: 49


🏁 Script executed:

# Look for any other source-related methods to understand the pattern
rg -A 20 "func.*Source" devices/wda/source.go --type go

Repository: mobile-next/mobilecli

Length of output: 3309


🏁 Script executed:

# Check how GetSourceRawWithAttributes is being called
rg "GetSourceRawWithAttributes" --type go

Repository: mobile-next/mobilecli

Length of output: 584


🏁 Script executed:

# Check if there are any other references to excluded_attributes in the codebase
rg "excluded_attributes|excludedAttributes" --type go

Repository: mobile-next/mobilecli

Length of output: 252


🏁 Script executed:

# Look for any WDA client implementation details
rg "getEndpointWithTimeout" devices/wda/ -A 5 --type go | head -50

Repository: mobile-next/mobilecli

Length of output: 1780


🏁 Script executed:

# Check the types file to understand the API better
find . -name "*.go" -path "*/types/*" | head -5

Repository: mobile-next/mobilecli

Length of output: 82


🏁 Script executed:

# Search for any WDA configuration or documentation
fd -e "README" -e "*.md" | xargs grep -l -i "wda\|webdriver" 2>/dev/null

Repository: mobile-next/mobilecli

Length of output: 49


🏁 Script executed:

# Check the requests.go file to understand how the endpoint is called
cat -n ./devices/wda/requests.go | head -100

Repository: mobile-next/mobilecli

Length of output: 3394


🏁 Script executed:

# Search for any usage patterns or tests that might show actual WDA behavior
rg "excluded_attributes" --type go -B 3 -A 3

Repository: mobile-next/mobilecli

Length of output: 871


🏁 Script executed:

# Check if there's any configuration or constants related to WDA attributes
rg "allAttributes\|attributes.*=" --type go

Repository: mobile-next/mobilecli

Length of output: 47


🏁 Script executed:

# Search for any comments or documentation about the attribute names
rg -i "attribute|excluded" devices/wda/ --type go

Repository: mobile-next/mobilecli

Length of output: 1483


🏁 Script executed:

# Check if there are integration tests or examples showing how GetSourceRawWithAttributes is used
find . -name "*.go" -type f | xargs grep -B 5 -A 5 "GetSourceRawWithAttributes\|excludedAttributes" 2>/dev/null | head -60

Repository: mobile-next/mobilecli

Length of output: 1898


🏁 Script executed:

# Look at the commented line in GetSourceElements to see the intended usage
sed -n '164,170p' ./devices/wda/source.go

Repository: mobile-next/mobilecli

Length of output: 384


Add validation for requested attribute names against the supported WDA attributes list.

The method silently accepts and silently ignores unknown or typo'd attribute names, potentially excluding required fields and returning partial/unexpected data. Implement validation to return an error for any attribute not in the allAttributes list. Additionally, refactor the nested loop to use sets for better performance.

🧩 Suggested refactor
-	// build excluded list by removing requested attributes from all attributes
-	excludedAttrs := []string{}
-	for _, attr := range allAttributes {
-		include := false
-		for _, requestedAttr := range attributes {
-			if attr == requestedAttr {
-				include = true
-				break
-			}
-		}
-		if !include {
-			excludedAttrs = append(excludedAttrs, attr)
-		}
-	}
+	// build excluded list by removing requested attributes from all attributes
+	allAttrSet := map[string]struct{}{}
+	for _, attr := range allAttributes {
+		allAttrSet[attr] = struct{}{}
+	}
+
+	requested := map[string]struct{}{}
+	for _, attr := range attributes {
+		if _, ok := allAttrSet[attr]; !ok {
+			return nil, fmt.Errorf("unknown WDA source attribute: %s", attr)
+		}
+		requested[attr] = struct{}{}
+	}
+
+	excludedAttrs := make([]string, 0, len(allAttributes))
+	for _, attr := range allAttributes {
+		if _, ok := requested[attr]; !ok {
+			excludedAttrs = append(excludedAttrs, attr)
+		}
+	}
🤖 Prompt for AI Agents
In `@devices/wda/source.go` around lines 115 - 161, GetSourceRawWithAttributes
currently ignores invalid attribute names; change it to validate each entry in
the input attributes slice against the allAttributes list and return an error
listing any unknown attributes instead of silently continuing. Replace the
nested loop membership test by building a set/map from allAttributes (e.g.,
allowedAttrs map[string]struct{}) and checking requested attributes against that
map, then build excludedAttrs from allowedAttrs minus requested ones; keep
existing behavior for building excludedStr, calling getEndpointWithTimeout, and
logging.

@coderabbitai coderabbitai bot mentioned this pull request Feb 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant