Skip to content

Commit 67d238a

Browse files
feat: add experimental flag to disable command execution in attempt_c… (RooCodeInc#4352)
* feat: add experimental flag to disable command execution in attempt_completion tool * fix: remove deprecation phase comments from attemptCompletionTool * feat: add translations for disable completion command experiment * fix: revert unintended package.json change * Rename attempt-completion.experiment.test.ts to attempt-completion.test.ts * fix: address PR feedback - restore autoCondenseContext in tests and remove type assertion --------- Co-authored-by: Daniel <[email protected]>
1 parent 540d4fb commit 67d238a

29 files changed

+571
-14
lines changed

packages/types/src/experiment.ts

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ import type { Keys, Equals, AssertEqual } from "./type-fu.js"
66
* ExperimentId
77
*/
88

9-
export const experimentIds = ["powerSteering", "concurrentFileReads"] as const
9+
export const experimentIds = ["powerSteering", "concurrentFileReads", "disableCompletionCommand"] as const
1010

1111
export const experimentIdsSchema = z.enum(experimentIds)
1212

@@ -19,6 +19,7 @@ export type ExperimentId = z.infer<typeof experimentIdsSchema>
1919
export const experimentsSchema = z.object({
2020
powerSteering: z.boolean(),
2121
concurrentFileReads: z.boolean(),
22+
disableCompletionCommand: z.boolean(),
2223
})
2324

2425
export type Experiments = z.infer<typeof experimentsSchema>

src/core/prompts/sections/objective.ts

Lines changed: 10 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
1+
import { EXPERIMENT_IDS, experiments } from "../../../shared/experiments"
12
import { CodeIndexManager } from "../../../services/code-index/manager"
23

3-
export function getObjectiveSection(codeIndexManager?: CodeIndexManager): string {
4+
export function getObjectiveSection(codeIndexManager?: CodeIndexManager, experimentsConfig?: Record<string, boolean>): string {
45
const isCodebaseSearchAvailable = codeIndexManager &&
56
codeIndexManager.isFeatureEnabled &&
67
codeIndexManager.isFeatureConfigured &&
@@ -9,6 +10,13 @@ export function getObjectiveSection(codeIndexManager?: CodeIndexManager): string
910
const codebaseSearchInstruction = isCodebaseSearchAvailable
1011
? "First, if the task involves understanding existing code or functionality, you MUST use the `codebase_search` tool to search for relevant code based on the task's intent BEFORE using any other search or file exploration tools. Then, "
1112
: "First, "
13+
14+
// Check if command execution is disabled via experiment
15+
const isCommandDisabled = experimentsConfig && experimentsConfig[EXPERIMENT_IDS.DISABLE_COMPLETION_COMMAND]
16+
17+
const commandInstruction = !isCommandDisabled
18+
? " You may also provide a CLI command to showcase the result of your task; this can be particularly useful for web development tasks, where you can run e.g. \`open index.html\` to show the website you've built."
19+
: ""
1220

1321
return `====
1422
@@ -19,6 +27,6 @@ You accomplish a given task iteratively, breaking it down into clear steps and w
1927
1. Analyze the user's task and set clear, achievable goals to accomplish it. Prioritize these goals in a logical order.
2028
2. Work through these goals sequentially, utilizing available tools one at a time as necessary. Each goal should correspond to a distinct step in your problem-solving process. You will be informed on the work completed and what's remaining as you go.
2129
3. Remember, you have extensive capabilities with access to a wide range of tools that can be used in powerful and clever ways as necessary to accomplish each goal. Before calling a tool, do some analysis within <thinking></thinking> tags. ${codebaseSearchInstruction}analyze the file structure provided in environment_details to gain context and insights for proceeding effectively. Next, think about which of the provided tools is the most relevant tool to accomplish the user's task. Go through each of the required parameters of the relevant tool and determine if the user has directly provided or given enough information to infer a value. When deciding if the parameter can be inferred, carefully consider all the context to see if it supports a specific value. If all of the required parameters are present or can be reasonably inferred, close the thinking tag and proceed with the tool use. BUT, if one of the values for a required parameter is missing, DO NOT invoke the tool (not even with fillers for the missing params) and instead, ask the user to provide the missing parameters using the ask_followup_question tool. DO NOT ask for more information on optional parameters if it is not provided.
22-
4. Once you've completed the user's task, you must use the attempt_completion tool to present the result of the task to the user. You may also provide a CLI command to showcase the result of your task; this can be particularly useful for web development tasks, where you can run e.g. \`open index.html\` to show the website you've built.
30+
4. Once you've completed the user's task, you must use the attempt_completion tool to present the result of the task to the user.${commandInstruction}
2331
5. The user may provide feedback, which you can use to make improvements and try again. But DO NOT continue in pointless back and forth conversations, i.e. don't end your responses with questions or offers for further assistance.`
2432
}

src/core/prompts/system.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -97,7 +97,7 @@ ${getRulesSection(cwd, supportsComputerUse, effectiveDiffStrategy, codeIndexMana
9797
9898
${getSystemInfoSection(cwd)}
9999
100-
${getObjectiveSection(codeIndexManager)}
100+
${getObjectiveSection(codeIndexManager, experiments)}
101101
102102
${await addCustomInstructions(baseInstructions, globalCustomInstructions || "", cwd, mode, { language: language ?? formatLanguage(vscode.env.language), rooIgnoreInstructions })}`
103103

Lines changed: 129 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,129 @@
1+
import { getAttemptCompletionDescription } from "../attempt-completion"
2+
import { EXPERIMENT_IDS } from "../../../../shared/experiments"
3+
4+
describe("getAttemptCompletionDescription - DISABLE_COMPLETION_COMMAND experiment", () => {
5+
describe("when experiment is disabled (default)", () => {
6+
it("should include command parameter in the description", () => {
7+
const args = {
8+
cwd: "/test/path",
9+
supportsComputerUse: false,
10+
experiments: {
11+
[EXPERIMENT_IDS.DISABLE_COMPLETION_COMMAND]: false,
12+
},
13+
}
14+
15+
const description = getAttemptCompletionDescription(args)
16+
17+
// Check that command parameter is included
18+
expect(description).toContain("- command: (optional)")
19+
expect(description).toContain("A CLI command to execute to show a live demo")
20+
expect(description).toContain("<command>Command to demonstrate result (optional)</command>")
21+
expect(description).toContain("<command>open index.html</command>")
22+
})
23+
24+
it("should include command parameter when experiments is undefined", () => {
25+
const args = {
26+
cwd: "/test/path",
27+
supportsComputerUse: false,
28+
}
29+
30+
const description = getAttemptCompletionDescription(args)
31+
32+
// Check that command parameter is included
33+
expect(description).toContain("- command: (optional)")
34+
expect(description).toContain("A CLI command to execute to show a live demo")
35+
expect(description).toContain("<command>Command to demonstrate result (optional)</command>")
36+
expect(description).toContain("<command>open index.html</command>")
37+
})
38+
39+
it("should include command parameter when no args provided", () => {
40+
const description = getAttemptCompletionDescription()
41+
42+
// Check that command parameter is included
43+
expect(description).toContain("- command: (optional)")
44+
expect(description).toContain("A CLI command to execute to show a live demo")
45+
expect(description).toContain("<command>Command to demonstrate result (optional)</command>")
46+
expect(description).toContain("<command>open index.html</command>")
47+
})
48+
})
49+
50+
describe("when experiment is enabled", () => {
51+
it("should NOT include command parameter in the description", () => {
52+
const args = {
53+
cwd: "/test/path",
54+
supportsComputerUse: false,
55+
experiments: {
56+
[EXPERIMENT_IDS.DISABLE_COMPLETION_COMMAND]: true,
57+
},
58+
}
59+
60+
const description = getAttemptCompletionDescription(args)
61+
62+
// Check that command parameter is NOT included
63+
expect(description).not.toContain("- command: (optional)")
64+
expect(description).not.toContain("A CLI command to execute to show a live demo")
65+
expect(description).not.toContain("<command>Command to demonstrate result (optional)</command>")
66+
expect(description).not.toContain("<command>open index.html</command>")
67+
68+
// But should still have the basic structure
69+
expect(description).toContain("## attempt_completion")
70+
expect(description).toContain("- result: (required)")
71+
expect(description).toContain("<attempt_completion>")
72+
expect(description).toContain("</attempt_completion>")
73+
})
74+
75+
it("should show example without command", () => {
76+
const args = {
77+
cwd: "/test/path",
78+
supportsComputerUse: false,
79+
experiments: {
80+
[EXPERIMENT_IDS.DISABLE_COMPLETION_COMMAND]: true,
81+
},
82+
}
83+
84+
const description = getAttemptCompletionDescription(args)
85+
86+
// Check example format
87+
expect(description).toContain("Example: Requesting to attempt completion with a result")
88+
expect(description).toContain("I've updated the CSS")
89+
expect(description).not.toContain("Example: Requesting to attempt completion with a result and command")
90+
})
91+
})
92+
93+
describe("description content", () => {
94+
it("should maintain core functionality description regardless of experiment", () => {
95+
const argsWithExperimentDisabled = {
96+
cwd: "/test/path",
97+
supportsComputerUse: false,
98+
experiments: {
99+
[EXPERIMENT_IDS.DISABLE_COMPLETION_COMMAND]: false,
100+
},
101+
}
102+
103+
const argsWithExperimentEnabled = {
104+
cwd: "/test/path",
105+
supportsComputerUse: false,
106+
experiments: {
107+
[EXPERIMENT_IDS.DISABLE_COMPLETION_COMMAND]: true,
108+
},
109+
}
110+
111+
const descriptionDisabled = getAttemptCompletionDescription(argsWithExperimentDisabled)
112+
const descriptionEnabled = getAttemptCompletionDescription(argsWithExperimentEnabled)
113+
114+
// Both should contain core functionality
115+
const coreText = "After each tool use, the user will respond with the result of that tool use"
116+
expect(descriptionDisabled).toContain(coreText)
117+
expect(descriptionEnabled).toContain(coreText)
118+
119+
// Both should contain the important note
120+
const importantNote = "IMPORTANT NOTE: This tool CANNOT be used until you've confirmed"
121+
expect(descriptionDisabled).toContain(importantNote)
122+
expect(descriptionEnabled).toContain(importantNote)
123+
124+
// Both should contain result parameter
125+
expect(descriptionDisabled).toContain("- result: (required)")
126+
expect(descriptionEnabled).toContain("- result: (required)")
127+
})
128+
})
129+
})
Lines changed: 31 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,23 +1,46 @@
1-
export function getAttemptCompletionDescription(): string {
2-
return `## attempt_completion
3-
Description: After each tool use, the user will respond with the result of that tool use, i.e. if it succeeded or failed, along with any reasons for failure. Once you've received the results of tool uses and can confirm that the task is complete, use this tool to present the result of your work to the user. Optionally you may provide a CLI command to showcase the result of your work. The user may respond with feedback if they are not satisfied with the result, which you can use to make improvements and try again.
1+
import { EXPERIMENT_IDS, experiments } from "../../../shared/experiments"
2+
import { ToolArgs } from "./types"
3+
4+
export function getAttemptCompletionDescription(args?: ToolArgs): string {
5+
// Check if command execution is disabled via experiment
6+
const isCommandDisabled = args?.experiments && experiments.isEnabled(
7+
args.experiments,
8+
EXPERIMENT_IDS.DISABLE_COMPLETION_COMMAND
9+
)
10+
11+
const baseDescription = `## attempt_completion
12+
Description: After each tool use, the user will respond with the result of that tool use, i.e. if it succeeded or failed, along with any reasons for failure. Once you've received the results of tool uses and can confirm that the task is complete, use this tool to present the result of your work to the user.${!isCommandDisabled ? ' Optionally you may provide a CLI command to showcase the result of your work.' : ''} The user may respond with feedback if they are not satisfied with the result, which you can use to make improvements and try again.
413
IMPORTANT NOTE: This tool CANNOT be used until you've confirmed from the user that any previous tool uses were successful. Failure to do so will result in code corruption and system failure. Before using this tool, you must ask yourself in <thinking></thinking> tags if you've confirmed from the user that any previous tool uses were successful. If not, then DO NOT use this tool.
514
Parameters:
6-
- result: (required) The result of the task. Formulate this result in a way that is final and does not require further input from the user. Don't end your result with questions or offers for further assistance.
7-
- command: (optional) A CLI command to execute to show a live demo of the result to the user. For example, use \`open index.html\` to display a created html website, or \`open localhost:3000\` to display a locally running development server. But DO NOT use commands like \`echo\` or \`cat\` that merely print text. This command should be valid for the current operating system. Ensure the command is properly formatted and does not contain any harmful instructions.
15+
- result: (required) The result of the task. Formulate this result in a way that is final and does not require further input from the user. Don't end your result with questions or offers for further assistance.`
16+
17+
const commandParameter = !isCommandDisabled ? `
18+
- command: (optional) A CLI command to execute to show a live demo of the result to the user. For example, use \`open index.html\` to display a created html website, or \`open localhost:3000\` to display a locally running development server. But DO NOT use commands like \`echo\` or \`cat\` that merely print text. This command should be valid for the current operating system. Ensure the command is properly formatted and does not contain any harmful instructions.` : ''
19+
20+
const usage = `
821
Usage:
922
<attempt_completion>
1023
<result>
1124
Your final result description here
12-
</result>
13-
<command>Command to demonstrate result (optional)</command>
14-
</attempt_completion>
25+
</result>${!isCommandDisabled ? '\n<command>Command to demonstrate result (optional)</command>' : ''}
26+
</attempt_completion>`
27+
28+
const example = !isCommandDisabled ? `
1529
1630
Example: Requesting to attempt completion with a result and command
1731
<attempt_completion>
1832
<result>
1933
I've updated the CSS
2034
</result>
2135
<command>open index.html</command>
36+
</attempt_completion>` : `
37+
38+
Example: Requesting to attempt completion with a result
39+
<attempt_completion>
40+
<result>
41+
I've updated the CSS
42+
</result>
2243
</attempt_completion>`
44+
45+
return baseDescription + commandParameter + usage + example
2346
}

src/core/prompts/tools/index.ts

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@ const toolDescriptionMap: Record<string, (args: ToolArgs) => string | undefined>
3535
list_code_definition_names: (args) => getListCodeDefinitionNamesDescription(args),
3636
browser_action: (args) => getBrowserActionDescription(args),
3737
ask_followup_question: () => getAskFollowupQuestionDescription(),
38-
attempt_completion: () => getAttemptCompletionDescription(),
38+
attempt_completion: (args) => getAttemptCompletionDescription(args),
3939
use_mcp_tool: (args) => getUseMcpToolDescription(args),
4040
access_mcp_resource: (args) => getAccessMcpResourceDescription(args),
4141
codebase_search: () => getCodebaseSearchDescription(),
@@ -69,6 +69,7 @@ export function getToolDescriptionsForMode(
6969
mcpHub,
7070
partialReadsEnabled,
7171
settings,
72+
experiments,
7273
}
7374

7475
const tools = new Set<string>()

src/core/prompts/tools/types.ts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,4 +10,5 @@ export type ToolArgs = {
1010
toolOptions?: any
1111
partialReadsEnabled?: boolean
1212
settings?: Record<string, any>
13+
experiments?: Record<string, boolean>
1314
}

0 commit comments

Comments
 (0)