Commit 813824f

Improve web search to process queries individually
Updated the research engine to perform web searches for each sub-query separately, preventing result truncation and improving coverage. The README was updated to document this change and provide guidance on search query processing.
1 parent d097415 commit 813824f

File tree

2 files changed

+36
-11
lines changed


optillm/plugins/deep_research/README.md

Lines changed: 6 additions & 1 deletion

@@ -32,7 +32,7 @@ deep_research/
 The core implementation of the TTD-DR algorithm with the following key methods:

 - **`decompose_query()`** - Implements query planning phase
-- **`perform_web_search()`** - Orchestrates web search using Chrome automation
+- **`perform_web_search()`** - Orchestrates web search using individual queries to avoid truncation
 - **`extract_and_fetch_urls()`** - Extracts sources and fetches content
 - **`synthesize_with_memory()`** - Processes unbounded context with citations
 - **`evaluate_completeness()`** - Assesses research gaps
@@ -226,6 +226,11 @@ Potential improvements aligned with research directions:
 - URL parsing depends on search result format
 - Plugin includes fallback parsing methods

+5. **Search Query Processing**
+   - Plugin uses individual searches for each sub-query to prevent truncation
+   - If search results seem incomplete, check that decomposed queries are reasonable
+   - Each sub-query is processed separately to ensure complete coverage
+
 ### Debug Mode

 Enable debug output by checking the console logs during research execution. The plugin provides detailed logging of each research phase.

optillm/plugins/deep_research/research_engine.py

Lines changed: 30 additions & 10 deletions

@@ -98,17 +98,37 @@ def perform_web_search(self, queries: List[str]) -> str:
         """
         Perform web search for multiple queries using the web_search plugin
         """
-        combined_query = "Search for the following topics:\n" + "\n".join([f"- {q}" for q in queries])
+        all_results = []

-        try:
-            enhanced_query, _ = web_search_run("", combined_query, None, None, {
-                "num_results": self.max_sources,
-                "delay_seconds": 3,  # Increased delay to avoid rate limiting
-                "headless": False  # Allow CAPTCHA solving if needed
-            })
-            return enhanced_query
-        except Exception as e:
-            return f"Web search failed: {str(e)}"
+        # Perform individual searches for each query to avoid truncation issues
+        for i, query in enumerate(queries):
+            try:
+                # Format as a clean search query
+                search_query = f"search for {query.strip()}"
+
+                # Perform search with reduced results per query to stay within limits
+                results_per_query = max(1, self.max_sources // len(queries))
+
+                enhanced_query, _ = web_search_run("", search_query, None, None, {
+                    "num_results": results_per_query,
+                    "delay_seconds": 2 if i == 0 else 1,  # Shorter delay for subsequent queries
+                    "headless": False  # Allow CAPTCHA solving if needed
+                })
+
+                if enhanced_query and "Web Search Results" in enhanced_query:
+                    all_results.append(enhanced_query)
+
+            except Exception as e:
+                # Continue with other queries if one fails
+                all_results.append(f"Search failed for query '{query}': {str(e)}")
+                continue
+
+        if not all_results:
+            return "Web search failed: No results obtained from any query"
+
+        # Combine all search results
+        combined_results = "\n\n".join(all_results)
+        return combined_results

     def extract_and_fetch_urls(self, search_results: str) -> Tuple[str, List[Dict]]:
         """
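The per-query fan-out in the new `perform_web_search()` can be sketched independently of the `web_search` plugin. In the minimal sketch below, `fake_search` is a hypothetical stand-in for `web_search_run`; the result-filtering and delay logic are omitted. It shows the core idea: split the overall source budget evenly across sub-queries, search each one separately, and merge the results so no single over-long combined query gets truncated.

```python
from typing import Callable, List


def search_per_query(queries: List[str], max_sources: int,
                     search: Callable[[str, int], str]) -> str:
    """Run one search per sub-query and merge results."""
    all_results = []
    # Split the overall source budget evenly; always fetch at least one result.
    results_per_query = max(1, max_sources // len(queries))
    for query in queries:
        try:
            result = search(query.strip(), results_per_query)
            if result:
                all_results.append(result)
        except Exception as e:
            # A single failed sub-query should not abort the whole search.
            all_results.append(f"Search failed for query '{query}': {e}")
    if not all_results:
        return "Web search failed: No results obtained from any query"
    return "\n\n".join(all_results)


# Hypothetical stub backend; the real code calls web_search_run instead.
def fake_search(q: str, n: int) -> str:
    return f"{n} results for {q}"


print(search_per_query(["ttd-dr", "query planning"], 10, fake_search))
# → 5 results for ttd-dr
#
#   5 results for query planning
```

Note the design choice: failures are appended as text rather than raised, so one CAPTCHA or rate-limit error degrades coverage instead of losing all results.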
