
Sandbox Agent

Shane Neuville edited this page Dec 8, 2025 · 4 revisions

The Sandbox Agent is a specialized AI assistant for working with the .NET MAUI Sandbox app to test, validate, and experiment with MAUI features through automated deployment and testing.

How to Use This Agent


What It Does

The Sandbox Agent:

  • ✅ Sets up test scenarios in the Sandbox app
  • ✅ Deploys to iOS/Android simulators and emulators
  • ✅ Runs automated Appium tests to reproduce issues
  • ✅ Validates PR fixes work correctly
  • ✅ Reproduces reported issues
  • ✅ Iteratively fixes issues - once reproduction is automated, keeps working until the fix is validated
  • ✅ Converts Sandbox scenarios to UI tests when ready
  • ✅ Captures device logs and screenshots

When to Use

Use the Sandbox Agent when:

  • You want to manually verify a PR fix works on device/simulator
  • You need to reproduce an issue hands-on
  • You want to experiment with a MAUI feature
  • You need functional validation beyond code review
  • You want to iteratively fix an issue - reproduce → fix → test → repeat until solved
  • You're ready to convert a working Sandbox scenario into proper UI tests

Note: The Sandbox Agent focuses on functional testing, not code review. For reviewing code quality, use the PR Reviewer instead.

Example Prompts

test PR #12345 on Android
test this PR on iOS
validate PR #12345 on both Android and iOS
reproduce issue #12345 in Sandbox
try to reproduce issue #12345 on Android
test this PR on iPhone 15
test PR #12345 on iOS 18.5
verify that PR #12345 actually fixes issue #12000
set up a test in Sandbox for CollectionView with 1000 items
create a Sandbox test that demonstrates Grid layout with SafeArea
reproduce issue #12345 - the bug happens when you tap the button twice quickly
test PR #12345 and verify that:
1. Button click works
2. Label updates correctly
3. No crashes occur
reproduce issue #12345 with Appium automation, then work on fixing it until the test passes
The fix works! Now move this Sandbox scenario to proper UI tests.

What to Expect

When you invoke the Sandbox Agent, it will:

  1. Understand the issue - Reviews PR/issue details
  2. Create test scenario - Modifies Sandbox MainPage.xaml[.cs] with reproduction code
  3. Set up Appium test - Creates automated test script
  4. Build and deploy - Uses BuildAndRunSandbox.ps1 to deploy to device
  5. Run validation - Executes Appium test and captures results
  6. Provide report - Summarizes findings with logs and screenshots

The test output includes:

  • Test Summary: What was tested and results
  • Validation Results: Pass/fail with details
  • Device Logs: Relevant log excerpts
  • Screenshots: Visual confirmation (optional)
  • Verdict: Clear assessment of whether fix works

Test Workflow

The Sandbox Agent follows this workflow:

1. Modify Sandbox app → 2. Create Appium test → 3. Deploy to device → 4. Run test → 5. Report results

Files Modified

  • src/Controls/samples/Controls.Sample.Sandbox/MainPage.xaml - UI for test scenario
  • src/Controls/samples/Controls.Sample.Sandbox/MainPage.xaml.cs - Test logic
  • CustomAgentLogsTmp/Sandbox/RunWithAppiumTest.cs - Appium test script (auto-generated)

Logs Captured

  • CustomAgentLogsTmp/Sandbox/android-device.log or ios-device.log - Device logs
  • CustomAgentLogsTmp/Sandbox/build-run-output.log - Build and deployment logs
  • CustomAgentLogsTmp/Sandbox/appium.log - Appium test execution logs
  • CustomAgentLogsTmp/Sandbox/*.png - Screenshots (if test captures them)
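
To check quickly which of the artifacts listed above were actually produced by a run, a small helper along these lines may be useful. It is only a sketch: the function name `summarize_logs` is hypothetical, and it assumes you run it from the repository root so that the relative `CustomAgentLogsTmp/Sandbox` path resolves.

```python
from pathlib import Path

# File names taken from the list above; paths are relative to the repo root.
LOG_DIR = Path("CustomAgentLogsTmp/Sandbox")
EXPECTED_LOGS = ["build-run-output.log", "appium.log"]
DEVICE_LOGS = ["android-device.log", "ios-device.log"]


def summarize_logs(log_dir: Path = LOG_DIR) -> dict:
    """Report which of the documented artifacts exist in log_dir (hypothetical helper)."""
    present = {p.name for p in log_dir.iterdir()} if log_dir.is_dir() else set()
    return {
        # Logs that every run is expected to produce but that are absent.
        "missing": [name for name in EXPECTED_LOGS if name not in present],
        # Only one device log is expected per platform tested.
        "device_logs": [name for name in DEVICE_LOGS if name in present],
        # Screenshots are optional, so an empty list is not an error.
        "screenshots": sorted(name for name in present if name.endswith(".png")),
    }


if __name__ == "__main__":
    for key, value in summarize_logs().items():
        print(f"{key}: {value}")
```

An empty `missing` list suggests the build and Appium steps both ran; it says nothing about whether the test passed.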

Tips for Best Results

Link the PR or Issue

Instead of:

test the CollectionView fix

Try:

test PR #12345 which fixes CollectionView crash on item removal

Specify Platform When Relevant

test PR #12345 on Android

This is more efficient than testing every platform when the change is platform-specific.


Provide Reproduction Context

reproduce issue #12345 - the bug happens when you tap the button twice quickly

Helps the agent create the right test scenario.


Request Specific Validation

test PR #12345 and verify that:
1. Button click works
2. Label updates correctly
3. No crashes occur

Gives clear success criteria.


Common Use Cases

Iterative Issue Fixing (Recommended Workflow)

This is the most powerful way to use the Sandbox Agent - set up automated reproduction, then work with Copilot to fix the issue iteratively.

Step 1: Set up automated reproduction

reproduce issue #12345 in Sandbox with Appium automation

What happens:

  • Agent creates MainPage with reproduction scenario
  • Agent creates Appium test that demonstrates the bug
  • Agent runs test and confirms: "Bug reproduced - test fails as expected"

Step 2: Iteratively fix the issue

Now work on fixing issue #12345. Keep testing after each change until the Appium test passes.

What happens:

  • Copilot analyzes the bug and proposes a fix
  • Modifies MAUI framework code (not Sandbox)
  • Reruns BuildAndRunSandbox.ps1 to test
  • If test still fails → analyzes why → tries different approach
  • Repeats until Appium test passes
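
The loop in Step 2 can be pictured as the sketch below. It is not how the agent is implemented: `fix_until_green`, `candidate_fixes` and `run_test` are hypothetical names, with `run_test` standing in for "rebuild, redeploy and run the Appium test" and each candidate fix standing in for one attempted change to the framework code.

```python
from typing import Callable, Iterable, Optional


def fix_until_green(
    candidate_fixes: Iterable[Callable[[], None]],
    run_test: Callable[[], bool],
) -> Optional[int]:
    """Apply fixes one at a time until run_test() passes.

    Returns the 1-based iteration that made the test pass,
    or None if every candidate was tried without success.
    """
    # Step 1 precondition: the test must fail first, otherwise
    # a later pass proves nothing about the fix.
    if run_test():
        raise RuntimeError("test passes before any fix; the bug is not reproduced")

    for iteration, apply_fix in enumerate(candidate_fixes, start=1):
        apply_fix()          # one attempted change
        if run_test():       # rebuild, redeploy, rerun the test
            return iteration
    return None
```

The up-front check mirrors Step 1: a reproduction that already passes gives no signal, so the loop refuses to start.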

Step 3: Convert to UI tests

The fix works! Now move this Sandbox scenario to proper UI tests.

What happens:

  • Agent creates TestCases.HostApp/Issues/Issue12345.xaml[.cs]
  • Agent creates TestCases.Shared.Tests/Tests/Issues/Issue12345.cs
  • Copies working test logic from Sandbox
  • Cleans up Sandbox

Why this workflow works:

  • Fast feedback loop - Appium test validates each attempt
  • Objective validation - No guessing whether the fix works; the test proves it
  • Incremental progress - Each iteration gets closer to solution
  • Smooth transition - Working test becomes regression test

Example conversation:

You: reproduce issue #12345 - CollectionView crashes when removing last item

Agent: [Sets up reproduction, runs test]
"Bug reproduced successfully. Appium test demonstrates crash on item removal."

You: Now fix this issue. Keep testing until it works.

Agent: [Iteration 1]
"Added null check in CollectionViewHandler. Testing..."
"Test still fails - crash occurs before null check. Analyzing..."

Agent: [Iteration 2]
"Modified item removal sequence to update adapter first. Testing..."
"Test passes! No crash observed. Fix validated."

You: Great! Now move this to UI tests.

Agent: [Creates Issue12345.xaml and Issue12345.cs in proper locations]
"UI test created. Sandbox cleaned up. Ready for PR."

Validate a Bug Fix

verify PR #12345 fixes the crash reported in issue #12000

Expected flow:

  1. Agent creates reproduction scenario
  2. Tests WITH the fix
  3. Optionally tests WITHOUT the fix to confirm bug exists
  4. Reports whether fix resolves the issue

Quick Manual Testing

deploy PR #12345 to Android so I can test it manually

Expected flow:

  1. Agent sets up basic test scenario
  2. Deploys to device
  3. Leaves app running for manual exploration
  4. Provides instructions for manual validation

Pre-Submit Validation

test my changes on iOS before I submit a PR

Expected flow:

  1. Tests current branch changes
  2. Validates functionality
  3. Confirms no obvious issues
  4. Gives green light or flags concerns

Reproduce Community-Reported Bugs

reproduce issue #12345 to verify it still happens on main branch

Expected flow:

  1. Reads issue description
  2. Creates reproduction scenario
  3. Tests on main branch
  4. Confirms whether bug is reproducible

Platform Selection

The agent automatically selects platforms based on:

  1. PR title tags - [Android], [iOS], etc.
  2. Modified file paths - Platform-specific code paths
  3. Issue description - Mentioned platforms
  4. Code changes - Cross-platform vs. platform-specific

Default: Tests on Android only (faster) unless the PR affects iOS-specific code or cross-platform controls.

You can override by specifying:

test PR #12345 on iOS
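
As a rough illustration of the selection order described above, the sketch below checks an explicit request first, then title tags, then file paths, and finally falls back to Android. The function `select_platforms` is hypothetical and the path hints (`.ios.` or `/ios/` in a lower-cased path) are an assumed naming convention; the agent's real heuristics are not documented here. Issue-description parsing and cross-platform control detection are left out.

```python
import re

# Matches title tags such as "[Android]" or "[iOS]".
TITLE_TAG = re.compile(r"\[(android|ios)\]", re.IGNORECASE)
CANONICAL = {"android": "Android", "ios": "iOS"}


def select_platforms(pr_title, changed_paths, requested=None):
    """Pick platforms to test (hypothetical sketch of the documented order)."""
    # An explicit platform in the prompt overrides everything else.
    if requested:
        return sorted(requested)

    # 1. Platform tags in the PR title.
    tagged = {CANONICAL[tag.lower()] for tag in TITLE_TAG.findall(pr_title)}
    if tagged:
        return sorted(tagged)

    # 2. Platform hints in modified file paths (assumed convention).
    hinted = set()
    for path in changed_paths:
        lowered = path.lower()
        for key, name in CANONICAL.items():
            if f".{key}." in lowered or f"/{key}/" in lowered:
                hinted.add(name)

    # 3. Documented default: Android only.
    return sorted(hinted) or ["Android"]
```

For example, `select_platforms("[iOS] Fix crash", [])` returns `["iOS"]`, while a PR with no tags and no platform-specific paths falls back to `["Android"]`.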

Understanding Test Results

✅ Success

✅ FIX VALIDATED - Test scenario completes successfully, expected behavior observed

Meaning: PR fix works as expected, no issues found.


⚠️ Partial Success

⚠️ PARTIAL - Fix appears to work but noticed a minor animation glitch

Meaning: Fix mostly works but there are concerns worth noting.


❌ Issues Found

❌ ISSUES FOUND - App crashes when tapping button after navigation

Meaning: Test revealed problems with the fix.


🚫 Cannot Test

🚫 CANNOT TEST - Build failed due to missing dependency

Meaning: Unable to complete testing due to technical issues.
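
If you want to act on the verdict in a script, the four labels above are enough to key on. The sketch below assumes the verdict appears on a single line in the form `<emoji> <LABEL> - <detail>`, as in the examples; `parse_verdict` is a hypothetical helper, not part of the agent.

```python
# Verdict labels as shown in the examples above.
VERDICT_LABELS = ("FIX VALIDATED", "PARTIAL", "ISSUES FOUND", "CANNOT TEST")


def parse_verdict(report: str):
    """Return (label, detail) for the first verdict line, or None if absent."""
    for line in report.splitlines():
        for label in VERDICT_LABELS:
            # Match on the text label rather than the emoji, which may be
            # encoded with or without a variation selector.
            if label in line:
                detail = line.split(label, 1)[1].lstrip(" -")
                return label, detail
    return None
```

Matching is case-sensitive, so a heading such as "Partial Success" is not mistaken for a `PARTIAL` verdict.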


Troubleshooting

Build Failures

If the agent reports build failures:

the build failed - can you check what went wrong?

The agent will analyze build logs and suggest fixes.


Test Can't Find Elements

If Appium can't locate UI elements:

the test can't find "TestButton" - can you check the AutomationIds?

Agent will verify and fix AutomationId mismatches.
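
You can also check for mismatches yourself before asking. The sketch below scans XAML text for literal `AutomationId="..."` attributes and reports which expected IDs are missing. The helper name `missing_automation_ids` is hypothetical, and IDs assigned in code-behind or through bindings will not be detected.

```python
import re

# Literal attribute assignments such as AutomationId="TestButton".
AUTOMATION_ID = re.compile(r'AutomationId\s*=\s*"([^"]+)"')


def missing_automation_ids(xaml_text, expected_ids):
    """Return the expected IDs that are not declared in the XAML (sorted)."""
    declared = set(AUTOMATION_ID.findall(xaml_text))
    return sorted(set(expected_ids) - declared)


if __name__ == "__main__":
    sample = '<Button AutomationId="TestButton" Text="Tap" />'
    print(missing_automation_ids(sample, ["TestButton", "ResultLabel"]))
```

In practice you would pass the contents of `src/Controls/samples/Controls.Sample.Sandbox/MainPage.xaml` and the IDs your test looks up.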


App Crashes

If the app crashes during testing:

the app crashed - what does the log say?

Agent will analyze crash logs and identify the root cause.


Manual Validation

After automated testing, manual validation works as follows:

  1. Simulator stays running - The app remains deployed
  2. Navigate to Sandbox - Find the app on simulator
  3. Test manually - Interact with the test scenario
  4. Review logs - Check CustomAgentLogsTmp/Sandbox/ for captured logs

The Sandbox Agent leaves the environment ready for hands-on exploration.


Advanced Usage

Test Multiple Scenarios

test PR #12345 with these scenarios:
1. Tap button once
2. Tap button rapidly 10 times
3. Navigate away and back, then tap

Creates comprehensive test coverage.


Capture Specific Metrics

test PR #12345 and measure the Grid layout dimensions

Uses Appium to capture element properties.


Compare Branches

test this PR on Android, then test main branch to compare behavior

Shows before/after comparison.


Best Practices

  • Test platform-specific changes on that platform - Don't test iOS changes on Android
  • Start with one platform - Test Android first (faster), then iOS if needed
  • Read issue reproduction steps - Use them as your test scenario when available
  • Validate incrementally - Test small changes frequently rather than large batches
  • Keep Sandbox simple - Focus on the specific bug or feature, don't create complex scenarios

Cleanup

The Sandbox Agent leaves your repository in a ready state:

  • Sandbox app contains your test scenario
  • Logs are captured in CustomAgentLogsTmp/Sandbox/
  • Device remains booted with app deployed

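To clean up:

# Discard the test scenario changes made to the Sandbox app
git checkout -- src/Controls/samples/Controls.Sample.Sandbox/
# Delete the captured logs and screenshots
rm -rf CustomAgentLogsTmp/Sandbox/

Both commands discard data permanently, so only clean up when you're done with this test cycle and want to start fresh.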
