Commit ff062c6

Allow selection of multiple browser viewport sizes and adjusting screenshot quality
1 parent 99ff804 commit ff062c6

15 files changed: +166 -70 lines changed

.changeset/five-gorillas-exist.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+"roo-cline": patch
+---
+
+Allow selection of multiple browser viewport sizes and adjusting screenshot quality

README.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ A fork of Cline, an autonomous coding agent, with some additional experimental f
 - Drag and drop images into chats
 - "Enhance prompt" button (OpenRouter models only for now)
 - Sound effects for feedback
-- Option to use a larger 1280x800 browser
+- Option to use browsers of different sizes and adjust screenshot quality
 - Quick prompt copying from history
 - OpenRouter compression support
 - Includes current time in the system prompt

jest.config.js

Lines changed: 3 additions & 0 deletions
@@ -29,6 +29,9 @@ module.exports = {
 transformIgnorePatterns: [
 'node_modules/(?!(@modelcontextprotocol|delay|p-wait-for|globby|serialize-error|strip-ansi|default-shell|os-name)/)'
 ],
+modulePathIgnorePatterns: [
+'.vscode-test'
+],
 setupFiles: [],
 globals: {
 'ts-jest': {

src/core/Cline.ts

Lines changed: 2 additions & 2 deletions
@@ -786,8 +786,8 @@ export class Cline {
 throw new Error("MCP hub not available")
 }

-const { browserLargeViewport, preferredLanguage } = await this.providerRef.deref()?.getState() ?? {}
-const systemPrompt = await SYSTEM_PROMPT(cwd, this.api.getModel().info.supportsComputerUse ?? false, mcpHub, this.diffStrategy, browserLargeViewport) + await addCustomInstructions(this.customInstructions ?? '', cwd, preferredLanguage)
+const { browserViewportSize, preferredLanguage } = await this.providerRef.deref()?.getState() ?? {}
+const systemPrompt = await SYSTEM_PROMPT(cwd, this.api.getModel().info.supportsComputerUse ?? false, mcpHub, this.diffStrategy, browserViewportSize) + await addCustomInstructions(this.customInstructions ?? '', cwd, preferredLanguage)

 // If the previous API request's total token usage is close to the context window, truncate the conversation history to free up space for the new request
 if (previousApiReqIndex >= 0) {

src/core/prompts/system.ts

Lines changed: 3 additions & 3 deletions
@@ -11,7 +11,7 @@ export const SYSTEM_PROMPT = async (
 supportsComputerUse: boolean,
 mcpHub: McpHub,
 diffStrategy?: DiffStrategy,
-browserLargeViewport?: boolean
+browserViewportSize?: string
 ) => `You are Cline, a highly skilled software engineer with extensive knowledge in many programming languages, frameworks, design patterns, and best practices.

 ====
@@ -114,7 +114,7 @@ Usage:
 Description: Request to interact with a Puppeteer-controlled browser. Every action, except \`close\`, will be responded to with a screenshot of the browser's current state, along with any new console logs. You may only perform one browser action per message, and wait for the user's response including a screenshot and logs to determine the next action.
 - The sequence of actions **must always start with** launching the browser at a URL, and **must always end with** closing the browser. If you need to visit a new URL that is not possible to navigate to from the current webpage, you must first close the browser, then launch again at the new URL.
 - While the browser is active, only the \`browser_action\` tool can be used. No other tools should be called during this time. You may proceed to use other tools only after closing the browser. For example if you run into an error and need to fix a file, you must close the browser, then use other tools to make the necessary changes, then re-launch the browser to verify the result.
-- The browser window has a resolution of **${browserLargeViewport ? "1280x800" : "900x600"}** pixels. When performing any click actions, ensure the coordinates are within this resolution range.
+- The browser window has a resolution of **${browserViewportSize || "900x600"}** pixels. When performing any click actions, ensure the coordinates are within this resolution range.
 - Before clicking on any elements such as icons, links, or buttons, you must consult the provided screenshot of the page to determine the coordinates of the element. The click should be targeted at the **center of the element**, not on its edges.
 Parameters:
 - action: (required) The action to perform. The available actions are:
@@ -132,7 +132,7 @@ Parameters:
 - Example: \`<action>close</action>\`
 - url: (optional) Use this for providing the URL for the \`launch\` action.
 * Example: <url>https://example.com</url>
-- coordinate: (optional) The X and Y coordinates for the \`click\` action. Coordinates should be within the **${browserLargeViewport ? "1280x800" : "900x600"}** resolution.
+- coordinate: (optional) The X and Y coordinates for the \`click\` action. Coordinates should be within the **${browserViewportSize || "900x600"}** resolution.
 * Example: <coordinate>450,300</coordinate>
 - text: (optional) Use this for providing the text for the \`type\` action.
 * Example: <text>Hello, world!</text>
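
The interpolation in this file falls back with `||`, whereas the handlers in src/core/webview/ClineProvider.ts use `??`. The two operators differ only for falsy values that are not null or undefined, such as an empty string. A minimal sketch of that difference, using hypothetical stand-in functions rather than the real `SYSTEM_PROMPT`:

```typescript
// Hypothetical stand-ins for the two fallback styles seen in this commit.
// Neither function exists in the codebase; they only isolate the operators.

// Style used in system.ts: any falsy value (undefined, "") falls back.
function resolutionWithOr(browserViewportSize?: string): string {
	return browserViewportSize || "900x600"
}

// Style used in ClineProvider.ts: only null or undefined falls back.
function resolutionWithNullish(browserViewportSize?: string): string {
	return browserViewportSize ?? "900x600"
}

console.log(resolutionWithOr(undefined)) // "900x600"
console.log(resolutionWithNullish(undefined)) // "900x600"
console.log(resolutionWithOr("")) // "900x600"
console.log(JSON.stringify(resolutionWithNullish(""))) // '""' (empty string is kept)
```

If an empty string could ever reach the stored setting, the `||` form is the more forgiving of the two for prompt text.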

src/core/webview/ClineProvider.ts

Lines changed: 19 additions & 9 deletions
@@ -71,7 +71,8 @@ type GlobalStateKey =
 | "soundVolume"
 | "diffEnabled"
 | "alwaysAllowMcp"
-| "browserLargeViewport"
+| "browserViewportSize"
+| "screenshotQuality"
 | "fuzzyMatchThreshold"
 | "preferredLanguage" // Language setting for Cline's communication
 | "writeDelayMs"
@@ -624,9 +625,9 @@ export class ClineProvider implements vscode.WebviewViewProvider {
 await this.updateGlobalState("diffEnabled", diffEnabled)
 await this.postStateToWebview()
 break
-case "browserLargeViewport":
-const browserLargeViewport = message.bool ?? false
-await this.updateGlobalState("browserLargeViewport", browserLargeViewport)
+case "browserViewportSize":
+const browserViewportSize = message.text ?? "900x600"
+await this.updateGlobalState("browserViewportSize", browserViewportSize)
 await this.postStateToWebview()
 break
 case "fuzzyMatchThreshold":
@@ -641,6 +642,10 @@ export class ClineProvider implements vscode.WebviewViewProvider {
 await this.updateGlobalState("writeDelayMs", message.value)
 await this.postStateToWebview()
 break
+case "screenshotQuality":
+await this.updateGlobalState("screenshotQuality", message.value)
+await this.postStateToWebview()
+break
 case "enhancePrompt":
 if (message.text) {
 try {
@@ -1015,7 +1020,8 @@ export class ClineProvider implements vscode.WebviewViewProvider {
 diffEnabled,
 taskHistory,
 soundVolume,
-browserLargeViewport,
+browserViewportSize,
+screenshotQuality,
 preferredLanguage,
 writeDelayMs,
 } = await this.getState()
@@ -1043,7 +1049,8 @@ export class ClineProvider implements vscode.WebviewViewProvider {
 shouldShowAnnouncement: lastShownAnnouncementId !== this.latestAnnouncementId,
 allowedCommands,
 soundVolume: soundVolume ?? 0.5,
-browserLargeViewport: browserLargeViewport ?? false,
+browserViewportSize: browserViewportSize ?? "900x600",
+screenshotQuality: screenshotQuality ?? 75,
 preferredLanguage: preferredLanguage ?? 'English',
 writeDelayMs: writeDelayMs ?? 1000,
 }
@@ -1140,10 +1147,11 @@ export class ClineProvider implements vscode.WebviewViewProvider {
 soundEnabled,
 diffEnabled,
 soundVolume,
-browserLargeViewport,
+browserViewportSize,
 fuzzyMatchThreshold,
 preferredLanguage,
 writeDelayMs,
+screenshotQuality,
 ] = await Promise.all([
 this.getGlobalState("apiProvider") as Promise<ApiProvider | undefined>,
 this.getGlobalState("apiModelId") as Promise<string | undefined>,
@@ -1183,10 +1191,11 @@ export class ClineProvider implements vscode.WebviewViewProvider {
 this.getGlobalState("soundEnabled") as Promise<boolean | undefined>,
 this.getGlobalState("diffEnabled") as Promise<boolean | undefined>,
 this.getGlobalState("soundVolume") as Promise<number | undefined>,
-this.getGlobalState("browserLargeViewport") as Promise<boolean | undefined>,
+this.getGlobalState("browserViewportSize") as Promise<string | undefined>,
 this.getGlobalState("fuzzyMatchThreshold") as Promise<number | undefined>,
 this.getGlobalState("preferredLanguage") as Promise<string | undefined>,
 this.getGlobalState("writeDelayMs") as Promise<number | undefined>,
+this.getGlobalState("screenshotQuality") as Promise<number | undefined>,
 ])

 let apiProvider: ApiProvider
@@ -1244,7 +1253,8 @@ export class ClineProvider implements vscode.WebviewViewProvider {
 soundEnabled: soundEnabled ?? false,
 diffEnabled: diffEnabled ?? true,
 soundVolume,
-browserLargeViewport: browserLargeViewport ?? false,
+browserViewportSize: browserViewportSize ?? "900x600",
+screenshotQuality: screenshotQuality ?? 75,
 fuzzyMatchThreshold: fuzzyMatchThreshold ?? 1.0,
 writeDelayMs: writeDelayMs ?? 1000,
 preferredLanguage: preferredLanguage ?? (() => {
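
The `screenshotQuality` case in this file stores `message.value` as received. If validation were wanted, one option is to coerce the value before saving it. The helper below is a hypothetical sketch, not code from this commit; it assumes the 0 to 100 scale that Puppeteer's `quality` screenshot option uses and the default of 75 seen in the diff.

```typescript
// Default taken from the diff (`screenshotQuality ?? 75`).
const DEFAULT_SCREENSHOT_QUALITY = 75

// Hypothetical helper: turn an untrusted message value into an integer
// between 0 and 100, falling back to the default for non-numeric input.
function normalizeScreenshotQuality(value: unknown): number {
	if (typeof value !== "number" || !Number.isFinite(value)) {
		return DEFAULT_SCREENSHOT_QUALITY
	}
	// Round first, then clamp into the accepted range.
	return Math.min(100, Math.max(0, Math.round(value)))
}

console.log(normalizeScreenshotQuality(150)) // 100
console.log(normalizeScreenshotQuality(undefined)) // 75
```

A call site could then pass `normalizeScreenshotQuality(message.value)` to `updateGlobalState`, so that the value later read by the screenshot code is always in range.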

src/core/webview/__tests__/ClineProvider.test.ts

Lines changed: 1 addition & 1 deletion
@@ -253,7 +253,7 @@ describe('ClineProvider', () => {
 soundEnabled: false,
 diffEnabled: false,
 writeDelayMs: 1000,
-browserLargeViewport: false,
+browserViewportSize: "900x600",
 fuzzyMatchThreshold: 1.0,
 }

src/services/browser/BrowserSession.ts

Lines changed: 12 additions & 8 deletions
@@ -58,9 +58,11 @@ export class BrowserSession {
 "--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36",
 ],
 executablePath: stats.executablePath,
-defaultViewport: await this.context.globalState.get("browserLargeViewport")
-? { width: 1280, height: 800 }
-: { width: 900, height: 600 },
+defaultViewport: (() => {
+const size = (this.context.globalState.get("browserViewportSize") as string | undefined) || "900x600"
+const [width, height] = size.split("x").map(Number)
+return { width, height }
+})(),
 // headless: false,
 })
 // (latest version of puppeteer does not add headless to user agent)
@@ -134,7 +136,7 @@ export class BrowserSession {
 let screenshotBase64 = await this.page.screenshot({
 ...options,
 type: "webp",
-quality: 100, // Set maximum quality to prevent compression artifacts
+quality: (await this.context.globalState.get("screenshotQuality") as number | undefined) ?? 75,
 })
 let screenshot = `data:image/webp;base64,${screenshotBase64}`

@@ -245,27 +247,29 @@ export class BrowserSession {
 }

 async scrollDown(): Promise<BrowserActionResult> {
-const isLargeViewport = await this.context.globalState.get("browserLargeViewport")
+const size = (await this.context.globalState.get("browserViewportSize") as string | undefined) || "900x600"
+const height = parseInt(size.split("x")[1])
 return this.doAction(async (page) => {
 await page.evaluate((scrollHeight) => {
 window.scrollBy({
 top: scrollHeight,
 behavior: "auto",
 })
-}, isLargeViewport ? 800 : 600)
+}, height)
 await delay(300)
 })
 }

 async scrollUp(): Promise<BrowserActionResult> {
-const isLargeViewport = await this.context.globalState.get("browserLargeViewport")
+const size = (await this.context.globalState.get("browserViewportSize") as string | undefined) || "900x600"
+const height = parseInt(size.split("x")[1])
 return this.doAction(async (page) => {
 await page.evaluate((scrollHeight) => {
 window.scrollBy({
 top: -scrollHeight,
 behavior: "auto",
 })
-}, isLargeViewport ? 800 : 600)
+}, height)
 await delay(300)
 })
 }
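
In this file the viewport string is split on `x` and converted with `Number` or `parseInt` in three places without validation, so a malformed stored value would yield `NaN` dimensions. A hedged sketch of a shared parser those call sites could use; `parseViewportSize`, `Viewport`, and `DEFAULT_VIEWPORT` are hypothetical names, not part of the commit:

```typescript
interface Viewport {
	width: number
	height: number
}

// Default matching the "900x600" fallback used throughout the diff.
const DEFAULT_VIEWPORT: Viewport = { width: 900, height: 600 }

// Hypothetical helper: parse a "WIDTHxHEIGHT" string and fall back to the
// default when the value is missing, malformed, or has a zero dimension.
function parseViewportSize(size: string | undefined): Viewport {
	const match = /^(\d+)x(\d+)$/.exec(size ?? "")
	if (!match) {
		return { ...DEFAULT_VIEWPORT }
	}
	const width = Number.parseInt(match[1], 10)
	const height = Number.parseInt(match[2], 10)
	if (width === 0 || height === 0) {
		return { ...DEFAULT_VIEWPORT }
	}
	return { width, height }
}

console.log(parseViewportSize("1280x800")) // { width: 1280, height: 800 }
console.log(parseViewportSize("large")) // { width: 900, height: 600 }
```

With such a helper, `defaultViewport` could take the whole result and the scroll handlers could read only `.height`, which keeps the fallback logic in one place.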

src/shared/ExtensionMessage.ts

Lines changed: 2 additions & 1 deletion
@@ -56,7 +56,8 @@ export interface ExtensionState {
 soundEnabled?: boolean
 soundVolume?: number
 diffEnabled?: boolean
-browserLargeViewport?: boolean
+browserViewportSize?: string
+screenshotQuality?: number
 fuzzyMatchThreshold?: number
 preferredLanguage: string
 writeDelayMs: number
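
Because the setting changes from a boolean to a string under a new key, a previously stored `browserLargeViewport` value is not consulted by the hunks shown on this page. A minimal sketch of how the old flag could be mapped to the new format; `migrateViewportSetting` is a hypothetical helper and does not appear in the commit:

```typescript
// Hypothetical migration helper: prefer the new string setting when present,
// otherwise derive it from the legacy boolean flag.
function migrateViewportSetting(
	storedSize: string | undefined,
	legacyLargeViewport: boolean | undefined,
): string {
	if (storedSize !== undefined) {
		return storedSize
	}
	// The old toggle selected 1280x800 when enabled and 900x600 otherwise.
	return legacyLargeViewport ? "1280x800" : "900x600"
}

console.log(migrateViewportSetting(undefined, true)) // "1280x800"
console.log(migrateViewportSetting("1024x768", true)) // "1024x768"
```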

src/shared/WebviewMessage.ts

Lines changed: 2 additions & 1 deletion
@@ -35,7 +35,8 @@ export interface WebviewMessage {
 | "soundEnabled"
 | "soundVolume"
 | "diffEnabled"
-| "browserLargeViewport"
+| "browserViewportSize"
+| "screenshotQuality"
 | "openMcpSettings"
 | "restartMcpServer"
 | "toggleToolAlwaysAllow"
