701 changes: 649 additions & 52 deletions src/core/prompts/__tests__/__snapshots__/system.test.ts.snap

Large diffs are not rendered by default.

53 changes: 52 additions & 1 deletion src/core/prompts/__tests__/system.test.ts
@@ -184,7 +184,7 @@ describe("SYSTEM_PROMPT", () => {
expect(prompt).toMatchSnapshot()
})

- it("should include browser actions when supportsComputerUse is true", async () => {
+ it("should include browser actions when supportsComputerUse is true and browserToolEnabled is true", async () => {
const prompt = await SYSTEM_PROMPT(
mockContext,
"/test/path",
@@ -200,8 +200,59 @@ describe("SYSTEM_PROMPT", () => {
undefined, // diffEnabled
experiments,
true, // enableMcpServerCreation
undefined, // rooIgnoreInstructions
true, // browserToolEnabled
)

expect(prompt).toContain("browser_action")
expect(prompt).toMatchSnapshot()
})

it("should not include browser actions when supportsComputerUse is true but browserToolEnabled is false", async () => {
const prompt = await SYSTEM_PROMPT(
mockContext,
"/test/path",
true, // supportsComputerUse
undefined, // mcpHub
undefined, // diffStrategy
"1280x800", // browserViewportSize
defaultModeSlug, // mode
undefined, // customModePrompts
undefined, // customModes,
undefined, // globalCustomInstructions
undefined, // preferredLanguage
undefined, // diffEnabled
experiments,
true, // enableMcpServerCreation
undefined, // rooIgnoreInstructions
false, // browserToolEnabled
)

expect(prompt).not.toContain("browser_action")
expect(prompt).toMatchSnapshot()
})

it("should not include browser actions when supportsComputerUse is false but browserToolEnabled is true", async () => {
const prompt = await SYSTEM_PROMPT(
mockContext,
"/test/path",
false, // supportsComputerUse
undefined, // mcpHub
undefined, // diffStrategy
"1280x800", // browserViewportSize
defaultModeSlug, // mode
undefined, // customModePrompts
undefined, // customModes,
undefined, // globalCustomInstructions
undefined, // preferredLanguage
undefined, // diffEnabled
experiments,
true, // enableMcpServerCreation
undefined, // rooIgnoreInstructions
true, // browserToolEnabled
)

expect(prompt).not.toContain("browser_action")
expect(prompt).toMatchSnapshot()
})

5 changes: 3 additions & 2 deletions src/core/prompts/sections/capabilities.ts
@@ -6,20 +6,21 @@ export function getCapabilitiesSection(
supportsComputerUse: boolean,
mcpHub?: McpHub,
diffStrategy?: DiffStrategy,
browserToolEnabled?: boolean,
): string {
return `====

CAPABILITIES

- You have access to tools that let you execute CLI commands on the user's computer, list files, view source code definitions, regex search${
- supportsComputerUse ? ", use the browser" : ""
+ supportsComputerUse && browserToolEnabled === true ? ", use the browser" : ""
}, read and write files, and ask follow-up questions. These tools help you effectively accomplish a wide range of tasks, such as writing code, making edits or improvements to existing files, understanding the current state of a project, performing system operations, and much more.
- When the user initially gives you a task, a recursive list of all filepaths in the current working directory ('${cwd}') will be included in environment_details. This provides an overview of the project's file structure, offering key insights into the project from directory/file names (how developers conceptualize and organize their code) and file extensions (the language used). This can also guide decision-making on which files to explore further. If you need to further explore directories such as outside the current working directory, you can use the list_files tool. If you pass 'true' for the recursive parameter, it will list files recursively. Otherwise, it will list files at the top level, which is better suited for generic directories where you don't necessarily need the nested structure, like the Desktop.
- You can use search_files to perform regex searches across files in a specified directory, outputting context-rich results that include surrounding lines. This is particularly useful for understanding code patterns, finding specific implementations, or identifying areas that need refactoring.
- You can use the list_code_definition_names tool to get an overview of source code definitions for all files at the top level of a specified directory. This can be particularly useful when you need to understand the broader context and relationships between certain parts of the code. You may need to call this tool multiple times to understand various parts of the codebase related to the task.
- For example, when asked to make edits or improvements you might analyze the file structure in the initial environment_details to get an overview of the project, then use list_code_definition_names to get further insight using source code definitions for files located in relevant directories, then read_file to examine the contents of relevant files, analyze the code and suggest improvements or make necessary edits, then use ${diffStrategy ? "the apply_diff or write_to_file" : "the write_to_file"} tool to apply the changes. If you refactored code that could affect other parts of the codebase, you could use search_files to ensure you update other files as needed.
- You can use the execute_command tool to run commands on the user's computer whenever you feel it can help accomplish the user's task. When you need to execute a CLI command, you must provide a clear explanation of what the command does. Prefer to execute complex CLI commands over creating executable scripts, since they are more flexible and easier to run. Interactive and long-running commands are allowed, since the commands are run in the user's VSCode terminal. The user may keep commands running in the background and you will be kept updated on their status along the way. Each command you execute is run in a new terminal instance.${
- supportsComputerUse
+ supportsComputerUse && browserToolEnabled === true
? "\n- You can use the browser_action tool to interact with websites (including html files and locally running development servers) through a Puppeteer-controlled browser when you feel it is necessary in accomplishing the user's task. This tool is particularly useful for web development tasks as it allows you to launch a browser, navigate to pages, interact with elements through clicks and keyboard input, and capture the results through screenshots and console logs. This tool may be useful at key stages of web development tasks-such as after implementing new features, making substantial changes, when troubleshooting issues, or to verify the result of your work. You can analyze the provided screenshots to ensure correct rendering or identify errors, and review console logs for runtime issues.\n - For example, if asked to add a component to a react website, you might create the necessary files, use execute_command to run the site locally, then use browser_action to launch the browser, navigate to the local server, and verify the component renders & functions correctly before closing the browser."
: ""
}${
5 changes: 3 additions & 2 deletions src/core/prompts/sections/rules.ts
@@ -59,6 +59,7 @@ export function getRulesSection(
supportsComputerUse: boolean,
diffStrategy?: DiffStrategy,
experiments?: Record<string, boolean> | undefined,
browserToolEnabled?: boolean,
): string {
return `====

@@ -80,7 +81,7 @@ ${getEditingInstructions(diffStrategy, experiments)}
- When executing commands, if you don't see the expected output, assume the terminal executed the command successfully and proceed with the task. The user's terminal may be unable to stream the output back properly. If you absolutely need to see the actual terminal output, use the ask_followup_question tool to request the user to copy and paste it back to you.
- The user may provide a file's contents directly in their message, in which case you shouldn't use the read_file tool to get the file contents again since you already have it.
- Your goal is to try to accomplish the user's task, NOT engage in a back and forth conversation.${
- supportsComputerUse
+ supportsComputerUse && browserToolEnabled === true
? '\n- The user may ask generic non-development tasks, such as "what\'s the latest news" or "look up the weather in San Diego", in which case you might use the browser_action tool to complete the task if it makes sense to do so, rather than trying to create a website or using curl to answer the question. However, if an available MCP server tool or resource can be used instead, you should prefer to use it over browser_action.'
: ""
}
@@ -91,7 +92,7 @@ ${getEditingInstructions(diffStrategy, experiments)}
- Before executing commands, check the "Actively Running Terminals" section in environment_details. If present, consider how these active processes might impact your task. For example, if a local development server is already running, you wouldn't need to start it again. If no active terminals are listed, proceed with command execution as normal.
- MCP operations should be used one at a time, similar to other tool usage. Wait for confirmation of success before proceeding with additional operations.
- It is critical you wait for the user's response after each tool use, in order to confirm the success of the tool use. For example, if asked to make a todo app, you would create a file, wait for the user's response it was created successfully, then create another file if needed, wait for the user's response it was created successfully, etc.${
- supportsComputerUse
+ supportsComputerUse && browserToolEnabled === true
? " Then if you want to test your work, you might use browser_action to launch the site, wait for the user's response confirming the site was launched along with a screenshot, then perhaps e.g., click a button to test functionality if needed, wait for the user's response confirming the button was clicked along with a screenshot of the new state, before finally closing the browser."
: ""
}`
8 changes: 6 additions & 2 deletions src/core/prompts/system.ts
@@ -42,6 +42,7 @@ async function generatePrompt(
experiments?: Record<string, boolean>,
enableMcpServerCreation?: boolean,
rooIgnoreInstructions?: string,
browserToolEnabled?: boolean,
): Promise<string> {
if (!context) {
throw new Error("Extension context is required for generating system prompt")
@@ -74,17 +75,18 @@ ${getToolDescriptionsForMode(
mcpHub,
customModeConfigs,
experiments,
browserToolEnabled,
)}

${getToolUseGuidelinesSection()}

${mcpServersSection}

- ${getCapabilitiesSection(cwd, supportsComputerUse, mcpHub, effectiveDiffStrategy)}
+ ${getCapabilitiesSection(cwd, supportsComputerUse, mcpHub, effectiveDiffStrategy, browserToolEnabled)}

${modesSection}

- ${getRulesSection(cwd, supportsComputerUse, effectiveDiffStrategy, experiments)}
+ ${getRulesSection(cwd, supportsComputerUse, effectiveDiffStrategy, experiments, browserToolEnabled)}

${getSystemInfoSection(cwd, mode, customModeConfigs)}

@@ -111,6 +113,7 @@ export const SYSTEM_PROMPT = async (
experiments?: Record<string, boolean>,
enableMcpServerCreation?: boolean,
rooIgnoreInstructions?: string,
browserToolEnabled?: boolean,
): Promise<string> => {
if (!context) {
throw new Error("Extension context is required for generating system prompt")
@@ -161,5 +164,6 @@ ${await addCustomInstructions(promptComponent?.customInstructions || currentMode
experiments,
enableMcpServerCreation,
rooIgnoreInstructions,
browserToolEnabled,
)
}
3 changes: 2 additions & 1 deletion src/core/prompts/tools/browser-action.ts
@@ -1,7 +1,8 @@
import { ToolArgs } from "./types"

export function getBrowserActionDescription(args: ToolArgs): string | undefined {
- if (!args.supportsComputerUse) {
+ // Only include browser actions if both supportsComputerUse is true and browserToolEnabled is true
+ if (!args.supportsComputerUse || args.browserToolEnabled !== true) {
return undefined
}
return `## browser_action
12 changes: 11 additions & 1 deletion src/core/prompts/tools/index.ts
@@ -49,6 +49,7 @@ export function getToolDescriptionsForMode(
mcpHub?: McpHub,
customModes?: ModeConfig[],
experiments?: Record<string, boolean>,
browserToolEnabled?: boolean,
): string {
const config = getModeConfig(mode, customModes)
const args: ToolArgs = {
@@ -57,6 +58,7 @@ export function getToolDescriptionsForMode(
diffStrategy,
browserViewportSize,
mcpHub,
browserToolEnabled,
}

const tools = new Set<string>()
@@ -67,8 +69,16 @@ export function getToolDescriptionsForMode(
const toolGroup = TOOL_GROUPS[groupName]
if (toolGroup) {
toolGroup.tools.forEach((tool) => {
+ // Check if the tool is allowed for the mode
  if (isToolAllowedForMode(tool as ToolName, mode, customModes ?? [], experiments ?? {})) {
- tools.add(tool)
+ // Special case for browser_action: only add if both supportsComputerUse AND browserToolEnabled are true
+ if (tool === "browser_action") {
+ if (args.supportsComputerUse && args.browserToolEnabled === true) {
+ tools.add(tool)
+ }
+ } else {
+ tools.add(tool)
+ }
}
})
}
1 change: 1 addition & 0 deletions src/core/prompts/tools/types.ts
@@ -8,4 +8,5 @@ export type ToolArgs = {
browserViewportSize?: string
mcpHub?: McpHub
toolOptions?: any
browserToolEnabled?: boolean
}
2 changes: 2 additions & 0 deletions src/core/webview/ClineProvider.ts
@@ -1820,6 +1820,7 @@ export class ClineProvider implements vscode.WebviewViewProvider {
fuzzyMatchThreshold,
experiments,
enableMcpServerCreation,
browserToolEnabled,
} = await this.getState()

// Create diffStrategy based on current model and settings
@@ -1851,6 +1852,7 @@ export class ClineProvider implements vscode.WebviewViewProvider {
experiments,
enableMcpServerCreation,
rooIgnoreInstructions,
browserToolEnabled,
)
return systemPrompt
}
4 changes: 4 additions & 0 deletions src/core/webview/__tests__/ClineProvider.test.ts
@@ -1176,6 +1176,7 @@ describe("ClineProvider", () => {
diffEnabled: true,
fuzzyMatchThreshold: 0.8,
experiments: experimentDefault,
browserToolEnabled: true,
} as any)

// Mock SYSTEM_PROMPT to verify diffStrategy and diffEnabled are passed
@@ -1206,6 +1207,7 @@ describe("ClineProvider", () => {
experimentDefault,
true,
undefined, // rooIgnoreInstructions
true, // browserToolEnabled
)

// Run the test again to verify it's consistent
@@ -1230,6 +1232,7 @@ describe("ClineProvider", () => {
fuzzyMatchThreshold: 0.8,
experiments: experimentDefault,
enableMcpServerCreation: true,
browserToolEnabled: true,
} as any)

// Mock SYSTEM_PROMPT to verify diffEnabled is passed as false
@@ -1260,6 +1263,7 @@ describe("ClineProvider", () => {
experimentDefault,
true,
undefined, // rooIgnoreInstructions
true, // browserToolEnabled
)
})

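The same gating condition appears inline in capabilities.ts, rules.ts, browser-action.ts, and tools/index.ts. A minimal sketch of that condition as a single predicate follows. The helper name `shouldIncludeBrowserAction` and the `BrowserGateArgs` type are hypothetical and not part of this change; only the `supportsComputerUse && browserToolEnabled === true` check is taken from the diff.

```typescript
// Hypothetical type: just the two fields the gate reads.
interface BrowserGateArgs {
	supportsComputerUse: boolean
	browserToolEnabled?: boolean
}

// Hypothetical helper (not in the PR): one place for the check that the diff
// repeats inline. The strict `=== true` comparison means an omitted
// (undefined) browserToolEnabled is treated the same as false.
function shouldIncludeBrowserAction(args: BrowserGateArgs): boolean {
	return args.supportsComputerUse && args.browserToolEnabled === true
}

// The combinations exercised by the new tests, plus the "setting omitted" case.
const cases: BrowserGateArgs[] = [
	{ supportsComputerUse: true, browserToolEnabled: true },
	{ supportsComputerUse: true, browserToolEnabled: false },
	{ supportsComputerUse: false, browserToolEnabled: true },
	{ supportsComputerUse: true },
]

for (const c of cases) {
	console.log(JSON.stringify(c), "=>", shouldIncludeBrowserAction(c))
}
```

Only the first case evaluates to true. The last case is worth noting: because the parameter is optional and compared with `=== true`, callers that do not pass `browserToolEnabled` will no longer get `browser_action` in the prompt.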