Commit 338adb7

APPENG-3801-B - Agent performance fixes - all agent stages (#134)
* bugfix in the testing env
* update tool descriptions for clarity
* refactor tool names to be class constants instead of disparate strings
* add initial unit tests
* rename tool names to be more consistent and distinct
* update unit tests with tool names and tool constants
* cleanup startup guide notebook
* rework intel source score section
* update agent execution stage prompts and make tool descriptions dynamic
* add tests for dynamic tool descriptions
* revamp the tool description list, as well as the checklist prompt for better clarity and quality
* revamp checklist prompt implementation, as well as add in dynamic tool descriptions to both checklist and agent prompts
* update tests for tool descriptions
* add more detailed agent examples with more useful MRKL-formatted steps
* update for summary prompt
* update justification prompt with more logic and explanations on how to pick the class
* update CVSS prompts and cleanup examples and guidance
* bugfix on intel source
* bug patch for vdb generation adding in constants during the vdb generation check
* bugfix by Tamar
* update register_function() and transitive_search() descriptions
* add function locator descriptions
* add names to configs
* add local output for local testing
* bugfix
* Update tool_names.py
1 parent 1ede849 commit 338adb7
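
The commit message mentions refactoring tool names into class constants and making tool descriptions dynamic. The sketch below illustrates one way that pattern could look. It is not the project's actual `tool_names.py`: the class name `ToolNames`, the `TOOL_DESCRIPTIONS` mapping, the description wording, and `build_tool_descriptions()` are all hypothetical. Only the tool name strings are taken from the renamed config keys in this commit.

```python
class ToolNames:
    """Hypothetical holder for tool-name constants (strings copied from the renamed config keys)."""

    CODE_SEMANTIC_SEARCH = "Code Semantic Search"
    DOCS_SEMANTIC_SEARCH = "Docs Semantic Search"
    CODE_KEYWORD_SEARCH = "Code Keyword Search"
    CVE_WEB_SEARCH = "CVE Web Search"
    CALL_CHAIN_ANALYZER = "Call Chain Analyzer"
    FUNCTION_CALLER_FINDER = "Function Caller Finder"
    FUNCTION_LOCATOR = "Function Locator"
    CONTAINER_ANALYSIS_DATA = "Container Analysis Data"


# Illustrative descriptions only; the real prompt text is not shown in this diff.
TOOL_DESCRIPTIONS = {
    ToolNames.CODE_SEMANTIC_SEARCH: "semantic search over the code index",
    ToolNames.DOCS_SEMANTIC_SEARCH: "semantic search over the documentation index",
    ToolNames.CODE_KEYWORD_SEARCH: "keyword (lexical) search over the code",
    ToolNames.CVE_WEB_SEARCH: "web search for CVE information",
}


def build_tool_descriptions(enabled_tools):
    """Render a prompt fragment that lists only the tools enabled for a stage."""
    lines = []
    for name in enabled_tools:
        if name not in TOOL_DESCRIPTIONS:
            # Fail loudly so a renamed or misspelled tool is caught early.
            raise KeyError(f"no description registered for tool {name!r}")
        lines.append(f"- {name}: {TOOL_DESCRIPTIONS[name]}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_tool_descriptions([ToolNames.CODE_SEMANTIC_SEARCH, ToolNames.CVE_WEB_SEARCH]))
```

Referring to `ToolNames.CVE_WEB_SEARCH` rather than a string literal means a future rename only has to change one place, and a stage that disables a tool no longer advertises it in its prompt.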

25 files changed: +1435 -4855 lines changed

kustomize/base/exploit-iq-config.yml

Lines changed: 19 additions & 19 deletions
@@ -54,45 +54,45 @@ functions:
   cve_checklist:
     _type: cve_checklist
     llm_name: checklist_llm
-  Transitive code search tool:
+  Call Chain Analyzer:
     _type: transitive_code_search
     enable_transitive_search: true
-  Calling Function Name Extractor:
+  Function Caller Finder:
     _type: calling_function_name_extractor
     enable_functions_usage_search: true
-  Package and Function Locator:
+  Function Locator:
     _type: package_and_function_locator
-  Container Image Code QA System:
+  Code Semantic Search:
     _type: local_vdb_retriever
     embedder_name: nim_embedder
     llm_name: code_vdb_retriever_llm
     vdb_type: code
     return_source_documents: false
-  Container Image Developer Guide QA System:
+  Docs Semantic Search:
     _type: local_vdb_retriever
     embedder_name: nim_embedder
     llm_name: doc_vdb_retriever_llm
     vdb_type: doc
     return_source_documents: false
-  Lexical Search Container Image Code QA System:
+  Code Keyword Search:
     _type: lexical_code_search
     top_k: 5
-  Internet Search:
+  CVE Web Search:
     _type: serp_wrapper
     max_retries: 5
-  Container Image Analysis Data:
+  Container Analysis Data:
     _type: container_image_analysis_data
   cve_agent_executor:
     _type: cve_agent_executor
     llm_name: cve_agent_executor_llm
     tool_names:
-      - Container Image Code QA System
-      - Container Image Developer Guide QA System
-      - Lexical Search Container Image Code QA System # Uncomment to enable lexical search
-      - Internet Search
-      - Transitive code search tool
-      - Calling Function Name Extractor
-      - Package and Function Locator
+      - Code Semantic Search
+      - Docs Semantic Search
+      - Code Keyword Search
+      - CVE Web Search
+      - Call Chain Analyzer
+      - Function Caller Finder
+      - Function Locator
     max_concurrency: null
     max_iterations: 10
     prompt_examples: false
@@ -106,10 +106,10 @@ functions:
     skip: false
     llm_name: generate_cvss_llm
     tool_names:
-      - Container Image Code QA System
-      - Container Image Developer Guide QA System
-      - Lexical Search Container Image Code QA System # Uncomment to enable lexical search
-      - Container Image Analysis Data
+      - Code Semantic Search
+      - Docs Semantic Search
+      - Code Keyword Search
+      - Container Analysis Data
     max_concurrency: null
     max_iterations: 10
     prompt_examples: true
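
Because the diff above renames the keys under `functions:` and every `tool_names` entry that refers to them, a stale reference is easy to leave behind. The following is a minimal sketch of a consistency check. It assumes the YAML has already been loaded into a plain dictionary and that each `tool_names` entry is meant to match a key under `functions`. The helper `find_undefined_tools()` is hypothetical and not part of the repository.

```python
def find_undefined_tools(config):
    """Return {function_name: [tool names that are not defined under 'functions']}."""
    functions = config.get("functions", {})
    defined = set(functions)
    missing = {}
    for fn_name, fn_cfg in functions.items():
        if not isinstance(fn_cfg, dict):
            continue
        # tool_names may be absent or null in the config; treat both as empty.
        unknown = [tool for tool in (fn_cfg.get("tool_names") or []) if tool not in defined]
        if unknown:
            missing[fn_name] = unknown
    return missing


if __name__ == "__main__":
    # Trimmed-down stand-in for the loaded config; "Internet Search" is the old, now-stale name.
    config = {
        "functions": {
            "Code Semantic Search": {"_type": "local_vdb_retriever"},
            "CVE Web Search": {"_type": "serp_wrapper"},
            "cve_agent_executor": {
                "_type": "cve_agent_executor",
                "tool_names": ["Code Semantic Search", "Internet Search"],
            },
        }
    }
    print(find_undefined_tools(config))  # {'cve_agent_executor': ['Internet Search']}
```

A check like this could run in a unit test over each config file so that a rename in one place cannot silently drop a tool from an agent stage.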

kustomize/config-http-openai-local.yml

Lines changed: 38 additions & 28 deletions
@@ -55,45 +55,45 @@ functions:
   cve_checklist:
     _type: cve_checklist
     llm_name: checklist_llm
-  Transitive code search tool:
+  Call Chain Analyzer:
     _type: transitive_code_search
     enable_transitive_search: true
-  Calling Function Name Extractor:
+  Function Caller Finder:
     _type: calling_function_name_extractor
     enable_functions_usage_search: true
-  Package and Function Locator:
+  Function Locator:
     _type: package_and_function_locator
-  Container Image Code QA System:
+  Code Semantic Search:
     _type: local_vdb_retriever
     embedder_name: nim_embedder
     llm_name: code_vdb_retriever_llm
     vdb_type: code
     return_source_documents: false
-  Container Image Developer Guide QA System:
+  Docs Semantic Search:
     _type: local_vdb_retriever
     embedder_name: nim_embedder
     llm_name: doc_vdb_retriever_llm
     vdb_type: doc
     return_source_documents: false
-  Lexical Search Container Image Code QA System:
+  Code Keyword Search:
     _type: lexical_code_search
     top_k: 5
-  Internet Search:
+  CVE Web Search:
     _type: serp_wrapper
     max_retries: 5
-  Container Image Analysis Data:
+  Container Analysis Data:
     _type: container_image_analysis_data
   cve_agent_executor:
     _type: cve_agent_executor
     llm_name: cve_agent_executor_llm
     tool_names:
-      - Container Image Code QA System
-      - Container Image Developer Guide QA System
-      - Lexical Search Container Image Code QA System # Uncomment to enable lexical search
-      - Internet Search
-      - Transitive code search tool
-      - Calling Function Name Extractor
-      - Package and Function Locator
+      - Code Semantic Search
+      - Docs Semantic Search
+      - Code Keyword Search
+      - CVE Web Search
+      - Call Chain Analyzer
+      - Function Caller Finder
+      - Function Locator
     max_concurrency: null
     max_iterations: 10
     prompt_examples: false
@@ -107,10 +107,10 @@ functions:
     skip: false
     llm_name: generate_cvss_llm
     tool_names:
-      - Container Image Code QA System
-      - Container Image Developer Guide QA System
-      - Lexical Search Container Image Code QA System # Uncomment to enable lexical search
-      - Container Image Analysis Data
+      - Code Semantic Search
+      - Docs Semantic Search
+      - Code Keyword Search
+      - Container Analysis Data
     max_concurrency: null
     max_iterations: 10
     prompt_examples: true
@@ -124,10 +124,20 @@ functions:
   cve_justify:
     _type: cve_justify
     llm_name: justify_llm
+  # cve_file_output:
+    # _type: cve_file_output
+    # file_path: .tmp/output.json
+    # markdown_dir: .tmp/vulnerability_markdown_reports
+    # overwrite: true
   cve_http_output:
     _type: cve_http_output
     url: http://localhost:8080
     endpoint: /reports
+  cve_file_output:
+    _type: cve_file_output
+    file_path: .tmp/output.json
+    markdown_dir: .tmp/vulnerability_markdown_reports
+    overwrite: true
   cve_calculate_intel_score:
     _type: cve_calculate_intel_score
     llm_name: intel_source_score_llm
@@ -138,55 +148,55 @@ functions:
 llms:
   checklist_llm:
     _type: openai
-    api_key: "EMPTY"
+    api_key: ${OPENAI_API_KEY:-EMPTY}
     base_url: ${NVIDIA_API_BASE:-https://integrate.api.nvidia.com/v1}
     model_name: ${CHECKLIST_MODEL_NAME:-meta/llama-3.1-70b-instruct}
     temperature: 0.0
     max_tokens: 2000
     top_p: 0.01
   code_vdb_retriever_llm:
     _type: openai
-    api_key: "EMPTY"
+    api_key: ${OPENAI_API_KEY:-EMPTY}
     base_url: ${NVIDIA_API_BASE:-https://integrate.api.nvidia.com/v1}
     model_name: ${CODE_VDB_RETRIEVER_MODEL_NAME:-meta/llama-3.1-70b-instruct}
     temperature: 0.0
     max_tokens: 2000
     top_p: 0.01
   doc_vdb_retriever_llm:
     _type: openai
-    api_key: "EMPTY"
+    api_key: ${OPENAI_API_KEY:-EMPTY}
     base_url: ${NVIDIA_API_BASE:-https://integrate.api.nvidia.com/v1}
     model_name: ${DOC_VDB_RETRIEVER_MODEL_NAME:-meta/llama-3.1-70b-instruct}
     temperature: 0.0
     max_tokens: 2000
     top_p: 0.01
   cve_agent_executor_llm:
     _type: openai
-    api_key: "EMPTY"
+    api_key: ${OPENAI_API_KEY:-EMPTY}
     base_url: ${NVIDIA_API_BASE:-https://integrate.api.nvidia.com/v1}
     model_name: ${CVE_AGENT_EXECUTOR_MODEL_NAME:-meta/llama-3.1-70b-instruct}
     temperature: 0.0
     max_tokens: 2000
     top_p: 0.01
   generate_cvss_llm:
     _type: openai
-    api_key: "EMPTY"
+    api_key: ${OPENAI_API_KEY:-EMPTY}
     base_url: ${NVIDIA_API_BASE:-https://integrate.api.nvidia.com/v1}
     model_name: ${GENERATE_CVSS_MODEL_NAME:-meta/llama-3.1-70b-instruct}
     temperature: 0.0
     max_tokens: 1024
     top_p: 0.01
   summarize_llm:
     _type: openai
-    api_key: "EMPTY"
+    api_key: ${OPENAI_API_KEY:-EMPTY}
     base_url: ${NVIDIA_API_BASE:-https://integrate.api.nvidia.com/v1}
     model_name: ${SUMMARIZE_MODEL_NAME:-meta/llama-3.1-70b-instruct}
     temperature: 0.0
     max_tokens: 1024
     top_p: 0.01
   justify_llm:
     _type: openai
-    api_key: "EMPTY"
+    api_key: ${OPENAI_API_KEY:-EMPTY}
     base_url: ${NVIDIA_API_BASE:-https://integrate.api.nvidia.com/v1}
     model_name: ${JUSTIFY_MODEL_NAME:-meta/llama-3.1-70b-instruct}
     temperature: 0.0
@@ -195,7 +205,7 @@ llms:
 
   intel_source_score_llm:
     _type: openai
-    api_key: "EMPTY"
+    api_key: ${OPENAI_API_KEY:-EMPTY}
     base_url: ${NVIDIA_API_BASE:-https://integrate.api.nvidia.com/v1}
     model_name: ${JUSTIFY_MODEL_NAME:-meta/llama-3.1-70b-instruct}
     temperature: 0.0
@@ -222,7 +232,7 @@ workflow:
   cve_generate_cvss_name: cve_generate_cvss
   cve_summarize_name: cve_summarize
   cve_justify_name: cve_justify
-  cve_output_config_name: cve_http_output
+  cve_output_config_name: cve_file_output
 
 eval:
   general:
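
The `api_key` change above replaces the hard-coded `"EMPTY"` with a `${OPENAI_API_KEY:-EMPTY}` placeholder, matching the style already used for `base_url` and `model_name`. The loader that resolves these placeholders is not part of this diff, so the sketch below only illustrates the shell-style `${VAR:-default}` rule it presumably follows: use the variable when it is set and non-empty, otherwise fall back to the default. The function `expand_env()` is hypothetical.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; the default may contain ':' and '/' (for example a URL).
_PLACEHOLDER = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}")


def expand_env(value, env=None):
    """Expand shell-style placeholders in a string using the given environment mapping."""
    env = os.environ if env is None else env

    def _substitute(match):
        current = env.get(match.group("name"))
        if current:  # set and non-empty, as with the shell's ':-' operator
            return current
        default = match.group("default")
        return default if default is not None else ""

    return _PLACEHOLDER.sub(_substitute, value)


if __name__ == "__main__":
    print(expand_env("${OPENAI_API_KEY:-EMPTY}", env={}))
    print(expand_env("${OPENAI_API_KEY:-EMPTY}", env={"OPENAI_API_KEY": "my-key"}))
```

Under this rule, an environment without `OPENAI_API_KEY` behaves exactly as before the change (the key resolves to `EMPTY`), while setting the variable overrides it without editing the file.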
