Skip to content

Commit 4befe5b

Browse files
Enforce codebase_search as primary tool for code understanding tasks (#4340)
* feat(prompts): enforce codebase_search as primary code understanding tool - Add conditional codebase_search enforcement in tool use guidelines - Modify objective section to prioritize codebase_search when available - Update rules section with critical codebase_search-first rule - Pass CodeIndexManager to prompt sections for availability checks - Ensure graceful degradation when codebase_search is unavailable * chore(docs): remove codebase search enforcement documentation * fix: update snapshot and reorder capabilities section - Update system.test.ts snapshot to reflect architect mode without codebase_search enforcement - Reorder capabilities section to place search_files description after codebase_search - Ensures logical flow: codebase_search (semantic) → search_files (regex) → other tools * refactor: improve tool-use-guidelines numbering logic - Replace subsequentNumbers object with array-based approach - Use automatic incrementing with itemNumber++ for sequential numbering - Build guidelines as an array and join at the end - Fix potential numbering issues with conditional logic - Update tests and snapshots to match new format As suggested by daniel-lxs in PR #4340
1 parent ae118c6 commit 4befe5b

File tree

8 files changed

+231
-31
lines changed

8 files changed

+231
-31
lines changed

src/core/prompts/__tests__/__snapshots__/system.test.ts.snap

Lines changed: 13 additions & 13 deletions
Large diffs are not rendered by default.
Lines changed: 77 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,77 @@
1+
import { getObjectiveSection } from "../objective"
2+
import { CodeIndexManager } from "../../../../services/code-index/manager"
3+
4+
describe("getObjectiveSection", () => {
5+
// Mock CodeIndexManager with codebase search available
6+
const mockCodeIndexManagerEnabled = {
7+
isFeatureEnabled: true,
8+
isFeatureConfigured: true,
9+
isInitialized: true,
10+
} as CodeIndexManager
11+
12+
// Mock CodeIndexManager with codebase search unavailable
13+
const mockCodeIndexManagerDisabled = {
14+
isFeatureEnabled: false,
15+
isFeatureConfigured: false,
16+
isInitialized: false,
17+
} as CodeIndexManager
18+
19+
describe("when codebase_search is available", () => {
20+
it("should include codebase_search first enforcement in thinking process", () => {
21+
const objective = getObjectiveSection(mockCodeIndexManagerEnabled)
22+
23+
// Check that the objective includes the codebase_search enforcement
24+
expect(objective).toContain("if the task involves understanding existing code or functionality, you MUST use the `codebase_search` tool")
25+
expect(objective).toContain("BEFORE using any other search or file exploration tools")
26+
})
27+
})
28+
29+
describe("when codebase_search is not available", () => {
30+
it("should not include codebase_search enforcement", () => {
31+
const objective = getObjectiveSection(mockCodeIndexManagerDisabled)
32+
33+
// Check that the objective does not include the codebase_search enforcement
34+
expect(objective).not.toContain("you MUST use the `codebase_search` tool")
35+
expect(objective).not.toContain("BEFORE using any other search or file exploration tools")
36+
})
37+
})
38+
39+
it("should maintain proper structure regardless of codebase_search availability", () => {
40+
const objectiveEnabled = getObjectiveSection(mockCodeIndexManagerEnabled)
41+
const objectiveDisabled = getObjectiveSection(mockCodeIndexManagerDisabled)
42+
43+
// Check that all numbered items are present in both cases
44+
for (const objective of [objectiveEnabled, objectiveDisabled]) {
45+
expect(objective).toContain("1. Analyze the user's task")
46+
expect(objective).toContain("2. Work through these goals sequentially")
47+
expect(objective).toContain("3. Remember, you have extensive capabilities")
48+
expect(objective).toContain("4. Once you've completed the user's task")
49+
expect(objective).toContain("5. The user may provide feedback")
50+
}
51+
})
52+
53+
it("should include thinking tags guidance regardless of codebase_search availability", () => {
54+
const objectiveEnabled = getObjectiveSection(mockCodeIndexManagerEnabled)
55+
const objectiveDisabled = getObjectiveSection(mockCodeIndexManagerDisabled)
56+
57+
// Check that thinking tags guidance is included in both cases
58+
for (const objective of [objectiveEnabled, objectiveDisabled]) {
59+
expect(objective).toContain("<thinking></thinking> tags")
60+
expect(objective).toContain("analyze the file structure provided in environment_details")
61+
expect(objective).toContain("think about which of the provided tools is the most relevant")
62+
}
63+
})
64+
65+
it("should include parameter inference guidance regardless of codebase_search availability", () => {
66+
const objectiveEnabled = getObjectiveSection(mockCodeIndexManagerEnabled)
67+
const objectiveDisabled = getObjectiveSection(mockCodeIndexManagerDisabled)
68+
69+
// Check parameter inference guidance in both cases
70+
for (const objective of [objectiveEnabled, objectiveDisabled]) {
71+
expect(objective).toContain("Go through each of the required parameters")
72+
expect(objective).toContain("determine if the user has directly provided or given enough information to infer a value")
73+
expect(objective).toContain("DO NOT invoke the tool (not even with fillers for the missing params)")
74+
expect(objective).toContain("ask_followup_question tool")
75+
}
76+
})
77+
})
Lines changed: 78 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,78 @@
1+
import { getToolUseGuidelinesSection } from "../tool-use-guidelines"
2+
import { CodeIndexManager } from "../../../../services/code-index/manager"
3+
4+
describe("getToolUseGuidelinesSection", () => {
5+
// Mock CodeIndexManager with codebase search available
6+
const mockCodeIndexManagerEnabled = {
7+
isFeatureEnabled: true,
8+
isFeatureConfigured: true,
9+
isInitialized: true,
10+
} as CodeIndexManager
11+
12+
// Mock CodeIndexManager with codebase search unavailable
13+
const mockCodeIndexManagerDisabled = {
14+
isFeatureEnabled: false,
15+
isFeatureConfigured: false,
16+
isInitialized: false,
17+
} as CodeIndexManager
18+
19+
describe("when codebase_search is available", () => {
20+
it("should include codebase_search first enforcement", () => {
21+
const guidelines = getToolUseGuidelinesSection(mockCodeIndexManagerEnabled)
22+
23+
// Check that the guidelines include the codebase_search enforcement
24+
expect(guidelines).toContain("IMPORTANT: When starting a new task or when you need to understand existing code/functionality, you MUST use the `codebase_search` tool FIRST")
25+
expect(guidelines).toContain("before any other search tools")
26+
expect(guidelines).toContain("semantic search tool helps you find relevant code based on meaning rather than just keywords")
27+
})
28+
29+
it("should maintain proper numbering with codebase_search", () => {
30+
const guidelines = getToolUseGuidelinesSection(mockCodeIndexManagerEnabled)
31+
32+
// Check that all numbered items are present
33+
expect(guidelines).toContain("1. In <thinking> tags")
34+
expect(guidelines).toContain("2. **IMPORTANT:")
35+
expect(guidelines).toContain("3. Choose the most appropriate tool")
36+
expect(guidelines).toContain("4. If multiple actions are needed")
37+
expect(guidelines).toContain("5. Formulate your tool use")
38+
expect(guidelines).toContain("6. After each tool use")
39+
expect(guidelines).toContain("7. ALWAYS wait for user confirmation")
40+
})
41+
})
42+
43+
describe("when codebase_search is not available", () => {
44+
it("should not include codebase_search enforcement", () => {
45+
const guidelines = getToolUseGuidelinesSection(mockCodeIndexManagerDisabled)
46+
47+
// Check that the guidelines do not include the codebase_search enforcement
48+
expect(guidelines).not.toContain("IMPORTANT: When starting a new task or when you need to understand existing code/functionality, you MUST use the `codebase_search` tool FIRST")
49+
expect(guidelines).not.toContain("semantic search tool helps you find relevant code based on meaning")
50+
})
51+
52+
it("should maintain proper numbering without codebase_search", () => {
53+
const guidelines = getToolUseGuidelinesSection(mockCodeIndexManagerDisabled)
54+
55+
// Check that all numbered items are present with correct numbering
56+
expect(guidelines).toContain("1. In <thinking> tags")
57+
expect(guidelines).toContain("2. Choose the most appropriate tool")
58+
expect(guidelines).toContain("3. If multiple actions are needed")
59+
expect(guidelines).toContain("4. Formulate your tool use")
60+
expect(guidelines).toContain("5. After each tool use")
61+
expect(guidelines).toContain("6. ALWAYS wait for user confirmation")
62+
})
63+
})
64+
65+
it("should include iterative process guidelines regardless of codebase_search availability", () => {
66+
const guidelinesEnabled = getToolUseGuidelinesSection(mockCodeIndexManagerEnabled)
67+
const guidelinesDisabled = getToolUseGuidelinesSection(mockCodeIndexManagerDisabled)
68+
69+
// Check that the iterative process section is included in both cases
70+
for (const guidelines of [guidelinesEnabled, guidelinesDisabled]) {
71+
expect(guidelines).toContain("It is crucial to proceed step-by-step")
72+
expect(guidelines).toContain("1. Confirm the success of each step before proceeding")
73+
expect(guidelines).toContain("2. Address any issues or errors that arise immediately")
74+
expect(guidelines).toContain("3. Adapt your approach based on new information")
75+
expect(guidelines).toContain("4. Ensure that each action builds correctly")
76+
}
77+
})
78+
})

src/core/prompts/sections/capabilities.ts

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -16,8 +16,7 @@ CAPABILITIES
1616
- You have access to tools that let you execute CLI commands on the user's computer, list files, view source code definitions, regex search${
1717
supportsComputerUse ? ", use the browser" : ""
1818
}, read and write files, and ask follow-up questions. These tools help you effectively accomplish a wide range of tasks, such as writing code, making edits or improvements to existing files, understanding the current state of a project, performing system operations, and much more.
19-
- When the user initially gives you a task, a recursive list of all filepaths in the current workspace directory ('${cwd}') will be included in environment_details. This provides an overview of the project's file structure, offering key insights into the project from directory/file names (how developers conceptualize and organize their code) and file extensions (the language used). This can also guide decision-making on which files to explore further. If you need to further explore directories such as outside the current workspace directory, you can use the list_files tool. If you pass 'true' for the recursive parameter, it will list files recursively. Otherwise, it will list files at the top level, which is better suited for generic directories where you don't necessarily need the nested structure, like the Desktop.
20-
- You can use search_files to perform regex searches across files in a specified directory, outputting context-rich results that include surrounding lines. This is particularly useful for understanding code patterns, finding specific implementations, or identifying areas that need refactoring.${
19+
- When the user initially gives you a task, a recursive list of all filepaths in the current workspace directory ('${cwd}') will be included in environment_details. This provides an overview of the project's file structure, offering key insights into the project from directory/file names (how developers conceptualize and organize their code) and file extensions (the language used). This can also guide decision-making on which files to explore further. If you need to further explore directories such as outside the current workspace directory, you can use the list_files tool. If you pass 'true' for the recursive parameter, it will list files recursively. Otherwise, it will list files at the top level, which is better suited for generic directories where you don't necessarily need the nested structure, like the Desktop.${
2120
codeIndexManager &&
2221
codeIndexManager.isFeatureEnabled &&
2322
codeIndexManager.isFeatureConfigured &&
@@ -26,6 +25,7 @@ CAPABILITIES
2625
- You can use the \`codebase_search\` tool to perform semantic searches across your entire codebase. This tool is powerful for finding functionally relevant code, even if you don't know the exact keywords or file names. It's particularly useful for understanding how features are implemented across multiple files, discovering usages of a particular API, or finding code examples related to a concept. This capability relies on a pre-built index of your code.`
2726
: ""
2827
}
28+
- You can use search_files to perform regex searches across files in a specified directory, outputting context-rich results that include surrounding lines. This is particularly useful for understanding code patterns, finding specific implementations, or identifying areas that need refactoring.
2929
- You can use the list_code_definition_names tool to get an overview of source code definitions for all files at the top level of a specified directory. This can be particularly useful when you need to understand the broader context and relationships between certain parts of the code. You may need to call this tool multiple times to understand various parts of the codebase related to the task.
3030
- For example, when asked to make edits or improvements you might analyze the file structure in the initial environment_details to get an overview of the project, then use list_code_definition_names to get further insight using source code definitions for files located in relevant directories, then read_file to examine the contents of relevant files, analyze the code and suggest improvements or make necessary edits, then use ${diffStrategy ? "the apply_diff or write_to_file" : "the write_to_file"} tool to apply the changes. If you refactored code that could affect other parts of the codebase, you could use search_files to ensure you update other files as needed.
3131
- You can use the execute_command tool to run commands on the user's computer whenever you feel it can help accomplish the user's task. When you need to execute a CLI command, you must provide a clear explanation of what the command does. Prefer to execute complex CLI commands over creating executable scripts, since they are more flexible and easier to run. Interactive and long-running commands are allowed, since the commands are run in the user's VSCode terminal. The user may keep commands running in the background and you will be kept updated on their status along the way. Each command you execute is run in a new terminal instance.${
Lines changed: 13 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,15 @@
1-
export function getObjectiveSection(): string {
1+
import { CodeIndexManager } from "../../../services/code-index/manager"
2+
3+
export function getObjectiveSection(codeIndexManager?: CodeIndexManager): string {
4+
const isCodebaseSearchAvailable = codeIndexManager &&
5+
codeIndexManager.isFeatureEnabled &&
6+
codeIndexManager.isFeatureConfigured &&
7+
codeIndexManager.isInitialized
8+
9+
const codebaseSearchInstruction = isCodebaseSearchAvailable
10+
? "First, if the task involves understanding existing code or functionality, you MUST use the `codebase_search` tool to search for relevant code based on the task's intent BEFORE using any other search or file exploration tools. Then, "
11+
: "First, "
12+
213
return `====
314
415
OBJECTIVE
@@ -7,7 +18,7 @@ You accomplish a given task iteratively, breaking it down into clear steps and w
718
819
1. Analyze the user's task and set clear, achievable goals to accomplish it. Prioritize these goals in a logical order.
920
2. Work through these goals sequentially, utilizing available tools one at a time as necessary. Each goal should correspond to a distinct step in your problem-solving process. You will be informed on the work completed and what's remaining as you go.
10-
3. Remember, you have extensive capabilities with access to a wide range of tools that can be used in powerful and clever ways as necessary to accomplish each goal. Before calling a tool, do some analysis within <thinking></thinking> tags. First, analyze the file structure provided in environment_details to gain context and insights for proceeding effectively. Then, think about which of the provided tools is the most relevant tool to accomplish the user's task. Next, go through each of the required parameters of the relevant tool and determine if the user has directly provided or given enough information to infer a value. When deciding if the parameter can be inferred, carefully consider all the context to see if it supports a specific value. If all of the required parameters are present or can be reasonably inferred, close the thinking tag and proceed with the tool use. BUT, if one of the values for a required parameter is missing, DO NOT invoke the tool (not even with fillers for the missing params) and instead, ask the user to provide the missing parameters using the ask_followup_question tool. DO NOT ask for more information on optional parameters if it is not provided.
21+
3. Remember, you have extensive capabilities with access to a wide range of tools that can be used in powerful and clever ways as necessary to accomplish each goal. Before calling a tool, do some analysis within <thinking></thinking> tags. ${codebaseSearchInstruction}analyze the file structure provided in environment_details to gain context and insights for proceeding effectively. Next, think about which of the provided tools is the most relevant tool to accomplish the user's task. Go through each of the required parameters of the relevant tool and determine if the user has directly provided or given enough information to infer a value. When deciding if the parameter can be inferred, carefully consider all the context to see if it supports a specific value. If all of the required parameters are present or can be reasonably inferred, close the thinking tag and proceed with the tool use. BUT, if one of the values for a required parameter is missing, DO NOT invoke the tool (not even with fillers for the missing params) and instead, ask the user to provide the missing parameters using the ask_followup_question tool. DO NOT ask for more information on optional parameters if it is not provided.
1122
4. Once you've completed the user's task, you must use the attempt_completion tool to present the result of the task to the user. You may also provide a CLI command to showcase the result of your task; this can be particularly useful for web development tasks, where you can run e.g. \`open index.html\` to show the website you've built.
1223
5. The user may provide feedback, which you can use to make improvements and try again. But DO NOT continue in pointless back and forth conversations, i.e. don't end your responses with questions or offers for further assistance.`
1324
}

0 commit comments

Comments
 (0)