Commit f5e7a76

Add documentation for browser automation feature and related assets
1 parent 3d9507c commit f5e7a76

9 files changed: +167 -0 lines changed

docs/features/browser-use.md

Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
1+
# Browser Use
2+
3+
Roo Code provides browser automation that lets you interact with websites directly from VS Code. You can test web applications, automate browser tasks, and capture screenshots without leaving your development environment.
4+
5+
<video width="100%" controls>
6+
<source src="/img/browser-use/Roo-Code-Browser-Use.mp4" type="video/mp4"></source>
7+
Your browser does not support the video tag.
8+
</video>
9+
10+
## How Browser Use Works
11+
12+
By default, Roo Code uses a built-in browser that:
13+
- Launches automatically when you ask Roo to visit a website
14+
- Captures screenshots of web pages
15+
- Allows Roo to interact with web elements
16+
- Runs invisibly in the background
17+
18+
All of this happens directly within VS Code, with no setup required.
19+
20+
## Using Browser Use
21+
22+
A typical browser interaction follows this pattern:
23+
24+
1. Ask Roo to visit a website
25+
2. Roo launches the browser and shows you a screenshot
26+
3. Request additional actions (clicking, typing, scrolling)
27+
4. Roo closes the browser when finished
28+
29+
For example:
30+
31+
```
32+
Open the browser and view our site.
33+
```
34+
35+
```
36+
Can you check if my website at https://roocode.com is displaying correctly?
37+
```
38+
39+
```
40+
Browse http://localhost:3000, scroll down to the bottom of the page and check if the footer information is displaying correctly.
41+
```
42+
43+
<img src="/img/browser-use/browser-use-1.png" alt="Browser use example" width="300" />
44+
45+
## How Browser Actions Work
46+
47+
The `browser_action` tool controls a browser instance. After each action it returns a screenshot and the console logs, so you can see the result of every interaction.
48+
49+
Key characteristics:
50+
- Each browser session must start with `launch` and end with `close`
51+
- Only one browser action can be used per message
52+
- While the browser is active, no other tools can be used
53+
- You must wait for the response (screenshot and logs) before performing the next action
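
The session rules above can be sketched as a small validator. This is only an illustration in Python, not Roo Code's implementation: the function name `validate_session` and its messages are hypothetical, while the action names are the ones documented under "Available Browser Actions".

```python
# Hypothetical sketch: checks that a sequence of browser actions forms a valid session.
VALID_ACTIONS = {"launch", "click", "type", "scroll_down", "scroll_up", "close"}


def validate_session(actions: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means the session is well formed."""
    if not actions:
        return ["session is empty"]
    problems = []
    unknown = [a for a in actions if a not in VALID_ACTIONS]
    if unknown:
        problems.append(f"unknown actions: {unknown}")
    # Rule from the list above: every session starts with launch and ends with close.
    if actions[0] != "launch":
        problems.append("session must start with launch")
    if actions[-1] != "close":
        problems.append("session must end with close")
    return problems


print(validate_session(["launch", "click", "type", "close"]))  # []
print(validate_session(["click", "close"]))
```

Because only one action is sent per message, each entry in the list corresponds to one message in the conversation.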
54+
55+
### Available Browser Actions
56+
57+
| Action | Description | When to Use |
58+
|--------|-------------|------------|
59+
| `launch` | Opens a browser at a URL | Starting a new browser session |
60+
| `click` | Clicks at specific coordinates | Interacting with buttons, links, etc. |
61+
| `type` | Types text into the active element | Filling forms, search boxes |
62+
| `scroll_down` | Scrolls down by one page | Viewing content below the fold |
63+
| `scroll_up` | Scrolls up by one page | Returning to previous content |
64+
| `close` | Closes the browser | Ending a browser session |
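
To make the table concrete, here is a minimal Python sketch that pairs each action with the one argument its description implies: a URL for `launch`, coordinates for `click`, and text for `type`. The parameter names `url`, `coordinate`, and `text`, the helper `build_action`, and the `"450,300"` coordinate format are assumptions made for illustration, not a confirmed tool schema.

```python
# Hypothetical mapping from action name to the single argument it needs (None = no argument).
REQUIRED_PARAM = {
    "launch": "url",
    "click": "coordinate",
    "type": "text",
    "scroll_down": None,
    "scroll_up": None,
    "close": None,
}


def build_action(action: str, **params: str) -> dict:
    """Build a request dict for one action, rejecting missing or unexpected arguments."""
    if action not in REQUIRED_PARAM:
        raise ValueError(f"unknown action: {action}")
    required = REQUIRED_PARAM[action]
    expected = {required} if required else set()
    if set(params) != expected:
        raise ValueError(f"{action} expects {sorted(expected)}, got {sorted(params)}")
    return {"action": action, **params}


print(build_action("launch", url="http://localhost:3000"))
print(build_action("click", coordinate="450,300"))
print(build_action("close"))
```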
65+
66+
## Browser Use Configuration/Settings
67+
68+
:::info Default Browser Settings
69+
- **Enable browser tool**: Enabled
70+
- **Viewport size**: Small Desktop (900x600)
71+
- **Screenshot quality**: 75%
72+
- **Use remote browser connection**: Disabled
73+
:::
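
The defaults above can be pictured as a small settings object. The sketch below is a Python illustration only: the class `BrowserSettings` and its field names are hypothetical and do not reflect how Roo Code stores its configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrowserSettings:
    # Field names are hypothetical; the default values are the ones listed above.
    browser_tool_enabled: bool = True
    viewport_size: str = "900x600"  # Small Desktop
    screenshot_quality: int = 75  # percent
    remote_browser_enabled: bool = False

    def __post_init__(self) -> None:
        # The screenshot quality slider accepts values from 1 to 100.
        if not 1 <= self.screenshot_quality <= 100:
            raise ValueError("screenshot_quality must be between 1 and 100")


defaults = BrowserSettings()
print(defaults)
```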
74+
75+
### Accessing Settings
76+
77+
To change Browser / Computer Use settings in Roo:
78+
79+
1. Open Settings by clicking the gear icon <Codicon name="gear" /> → Browser / Computer Use
80+
81+
<img src="/img/browser-use/browser-use.png" alt="Browser settings menu" width="600" />
82+
83+
### Enable/Disable Browser Use
84+
85+
**Purpose**: Master toggle that enables Roo to interact with websites using a Puppeteer-controlled browser.
86+
87+
To change this setting:
88+
1. Check or uncheck the "Enable browser tool" checkbox within your Browser / Computer Use settings
89+
90+
<img src="/img/browser-use/browser-use-2.png" alt="Enable browser tool setting" width="300" />
91+
92+
### Viewport Size
93+
94+
**Purpose**: Determines the resolution of the browser session Roo Code uses.
95+
96+
**Tradeoff**: Higher values provide a larger viewport but increase token usage.
97+
98+
To change this setting:
99+
1. Click the dropdown menu under "Viewport size" within your Browser / Computer Use settings
100+
2. Select one of the available options:
101+
- Large Desktop (1280x800)
102+
- Small Desktop (900x600) - Default
103+
- Tablet (768x1024)
104+
- Mobile (360x640)
105+
3. Confirm that the dropdown shows the resolution you selected.
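
As a rough way to compare the presets, the Python sketch below computes each viewport's pixel count relative to the default. Pixel count is only a proxy chosen for illustration; how screenshot size translates into token usage depends on the model and is not specified here. The names `VIEWPORTS` and `relative_area` are hypothetical.

```python
# Viewport presets from the list above, as (width, height) in pixels.
VIEWPORTS = {
    "Large Desktop": (1280, 800),
    "Small Desktop": (900, 600),
    "Tablet": (768, 1024),
    "Mobile": (360, 640),
}


def relative_area(name: str, baseline: str = "Small Desktop") -> float:
    """Return the preset's pixel count divided by the baseline preset's pixel count."""
    width, height = VIEWPORTS[name]
    base_width, base_height = VIEWPORTS[baseline]
    return (width * height) / (base_width * base_height)


for preset in VIEWPORTS:
    print(f"{preset}: {relative_area(preset):.2f}x the default pixel count")
```

By this measure, Large Desktop screenshots contain roughly 1.9 times as many pixels as the default, and Mobile screenshots contain fewer than half.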
106+
107+
<img src="/img/browser-use/browser-use-3.png" alt="Viewport size setting" width="600" />
108+
109+
### Screenshot Quality
110+
111+
**Purpose**: Controls the WebP compression quality of browser screenshots.
112+
113+
**Tradeoff**: Higher values provide clearer screenshots but increase token usage.
114+
115+
To change this setting:
116+
1. Adjust the slider under "Screenshot quality" within your Browser / Computer Use settings
117+
2. Set a value between 1-100% (default is 75%)
118+
3. Choose a value based on how much visual detail you need:
119+
- 40-50%: Good for basic text-based websites
120+
- 60-70%: Balanced for most general browsing
121+
- 80%+: Use when fine visual details are critical
122+
123+
<img src="/img/browser-use/browser-use-4.png" alt="Screenshot quality setting" width="600" />
124+
125+
### Remote Browser Connection
126+
127+
**Purpose**: Connect Roo to an existing Chrome browser instead of using the built-in browser.
128+
129+
**Benefits**:
130+
- Works in containerized environments and remote development workflows
131+
- Maintains authenticated sessions between browser uses
132+
- Eliminates repetitive login steps
133+
- Allows use of custom browser profiles with specific extensions
134+
135+
**Requirements**: Chrome must be running with remote debugging enabled.
136+
137+
To enable this feature:
138+
1. Check the "Use remote browser connection" box in Browser / Computer Use settings
139+
2. Click "Test Connection" to verify
140+
141+
<img src="/img/browser-use/browser-use-5.png" alt="Remote browser connection setting" width="600" />
142+
143+
#### Common Use Cases
144+
145+
- **DevContainers**: Connect from containerized VS Code to host Chrome browser
146+
- **Remote Development**: Use local Chrome with remote VS Code server
147+
- **Custom Chrome Profiles**: Use profiles with specific extensions and settings
148+
149+
#### Connecting to a Visible Chrome Window
150+
151+
Connect to a visible Chrome window to observe Roo's interactions in real time:
152+
153+
**macOS**
154+
```bash
155+
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug --no-first-run
156+
```
157+
158+
**Windows**
159+
```bash
160+
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --user-data-dir=C:\chrome-debug --no-first-run
161+
```
162+
163+
**Linux**
164+
```bash
165+
google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug --no-first-run
166+
```
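
Once Chrome is running with one of the commands above, you can check that the debugging port is reachable before clicking "Test Connection". The Python sketch below queries Chrome's `/json/version` DevTools HTTP endpoint using only the standard library. It assumes port 9222 on localhost, as in the commands above; the helper names are hypothetical, and this is not necessarily how Roo Code itself performs the check.

```python
import json
import urllib.error
import urllib.request


def version_url(host: str = "localhost", port: int = 9222) -> str:
    """Build the URL of Chrome's DevTools version endpoint."""
    return f"http://{host}:{port}/json/version"


def parse_version(payload: str) -> dict:
    """Extract the browser name and WebSocket debugger URL from the JSON response."""
    data = json.loads(payload)
    return {
        "browser": data.get("Browser"),
        "websocket_url": data.get("webSocketDebuggerUrl"),
    }


def check_remote_debugging(host: str = "localhost", port: int = 9222):
    """Return parsed version info, or None if the endpoint cannot be reached."""
    try:
        with urllib.request.urlopen(version_url(host, port), timeout=3) as response:
            return parse_version(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None


if __name__ == "__main__":
    info = check_remote_debugging()
    print(info if info else "Chrome remote debugging is not reachable on port 9222")
```

If the function returns `None`, confirm that Chrome was started with `--remote-debugging-port=9222` and that nothing else is blocking the port.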

sidebars.ts

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ const sidebars: SidebarsConfig = {
24 24
items: [
25 25
'features/api-configuration-profiles',
26 26
'features/auto-approving-actions',
27+
'features/browser-use',
27 28
'features/checkpoints',
28 29
'features/code-actions',
29 30
'features/custom-instructions',
Binary files not shown (7 assets: 32.6 MB, 10.9 KB, 9.44 KB, 45.8 KB, 31.1 KB, 70.4 KB, 139 KB).
