Commit 427b8fe

Merge pull request #261 from codelion/fix-deepresearch-plugin
Refactor TTD-DR plugin for improved citation handling
2 parents 82b6c24 + 96e77f4 commit 427b8fe

2 files changed: 211 additions & 236 deletions

optillm/plugins/deep_research/README.md

Lines changed: 42 additions & 31 deletions
@@ -6,13 +6,16 @@ The Deep Research plugin implements the **Test-Time Diffusion Deep Researcher (T

 ## Algorithm Overview

-The TTD-DR algorithm treats research as a **diffusion process** with iterative refinement through denoising and retrieval. Unlike traditional search approaches that return raw results, TTD-DR performs:
+The TTD-DR algorithm treats research as a **diffusion process** with iterative refinement through denoising and retrieval. Unlike traditional search approaches that return raw results, this implementation performs:

-1. **Query Decomposition** - Breaks complex queries into focused sub-questions
-2. **Iterative Search** - Performs multiple rounds of web search based on identified gaps
-3. **Content Synthesis** - Uses advanced memory processing for unbounded context
-4. **Completeness Evaluation** - Automatically assesses research quality and identifies missing aspects
-5. **Report Generation** - Produces structured, academic-quality reports with proper citations
+1. **Preliminary Draft Generation** - Creates an initial "updatable skeleton" from LLM internal knowledge
+2. **Initial Query Decomposition** - Breaks complex queries into focused sub-questions
+3. **Gap Analysis** - Identifies areas in the draft needing external research
+4. **Iterative Denoising** - Performs multiple rounds of gap-targeted search and draft refinement
+5. **Quality-Guided Termination** - Automatically assesses draft quality to determine when research is complete
+6. **Report Finalization** - Produces structured, academic-quality reports with proper citations
+
+**Note:** This is a simplified implementation of the TTD-DR paper. Some advanced features like component-wise self-evolutionary optimization and memory-based synthesis are not yet implemented.

 ## Architecture
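
The six steps added to the Algorithm Overview in this hunk can be read as a draft-centred loop. The following is only a minimal sketch of that control flow, not the plugin's code: `ResearchState`, `run_denoising_loop`, every callable parameter, `max_iterations`, and `quality_threshold` are hypothetical names chosen for illustration, and the quality check is assumed to return a score between 0 and 1.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResearchState:
    """Hypothetical container for the evolving draft and its sources."""
    query: str
    draft: str = ""
    sources: List[str] = field(default_factory=list)
    iterations: int = 0


def run_denoising_loop(
    query: str,
    draft_fn: Callable[[str], str],                 # step 1: preliminary draft
    decompose_fn: Callable[[str], List[str]],       # step 2: sub-questions
    gap_fn: Callable[[str], List[str]],             # step 3: gap analysis
    search_fn: Callable[[List[str]], List[str]],    # retrieval for a list of queries
    denoise_fn: Callable[[str, List[str]], str],    # step 4: merge retrieved info into draft
    quality_fn: Callable[[str], float],             # step 5: assumed score in [0, 1]
    max_iterations: int = 5,
    quality_threshold: float = 0.8,
) -> ResearchState:
    # Start from a draft written without any retrieval.
    state = ResearchState(query=query, draft=draft_fn(query))

    # The initial search is driven by the decomposed sub-questions.
    state.sources.extend(search_fn(decompose_fn(query)))

    while state.iterations < max_iterations:
        gaps = gap_fn(state.draft)
        if not gaps:
            break  # nothing left to research
        retrieved = search_fn(gaps)
        state.sources.extend(retrieved)
        state.draft = denoise_fn(state.draft, retrieved)
        state.iterations += 1
        if quality_fn(state.draft) >= quality_threshold:
            break  # quality-guided termination
    return state
```

Passing the steps in as callables keeps the sketch independent of any particular LLM client or search backend; report finalization (step 6) would run on `state.draft` after the loop returns.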

@@ -34,10 +37,12 @@ The core implementation of the TTD-DR algorithm with the following key methods:
 - **`decompose_query()`** - Implements query planning phase
 - **`perform_web_search()`** - Orchestrates web search using individual queries to avoid truncation
 - **`extract_and_fetch_urls()`** - Extracts sources and fetches content
-- **`synthesize_with_memory()`** - Processes unbounded context with citations
-- **`evaluate_completeness()`** - Assesses research gaps
-- **`generate_structured_report()`** - Creates academic-quality reports
-- **`research()`** - Main research loop implementing TTD-DR
+- **`analyze_draft_gaps()`** - Analyzes current draft to identify gaps and areas needing research
+- **`perform_gap_targeted_search()`** - Performs targeted searches based on identified gaps
+- **`denoise_draft_with_retrieval()`** - Core denoising step integrating retrieved information with current draft
+- **`evaluate_draft_quality()`** - Evaluates quality improvement of current draft vs previous iteration
+- **`finalize_research_report()`** - Applies final polishing to the research report
+- **`research()`** - Main research loop implementing TTD-DR diffusion process

 #### 2. Plugin Interface (`deep_research_plugin.py`)

@@ -53,16 +58,20 @@ def run(system_prompt: str, initial_query: str, client, model: str, request_conf

 ```mermaid
 graph TD
-A[Initial Query] --> B[Query Decomposition]
-B --> C[Web Search]
-C --> D[Content Extraction]
-D --> E[Memory Synthesis]
-E --> F[Completeness Evaluation]
-F --> G{Complete?}
-G -->|No| H[Generate Focused Queries]
-H --> C
-G -->|Yes| I[Generate Structured Report]
-I --> J[Final Report with Citations]
+A[Initial Query] --> B[Generate Preliminary Draft]
+B --> C[Initial Query Decomposition]
+C --> D[Initial Web Search]
+D --> E[Register Initial Sources]
+E --> F[Analyze Draft Gaps]
+F --> G[Gap-Targeted Search]
+G --> H[Content Extraction]
+H --> I[Denoise Draft with Retrieved Info]
+I --> J[Evaluate Draft Quality]
+J --> K{Quality Threshold Met?}
+K -->|No| F
+K -->|Yes| L[Finalize Research Report]
+L --> M[Add References & Metadata]
+M --> N[Final Report with Citations]
 ```

 ### Citation System
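
The updated flow adds "Register Initial Sources" and "Add References & Metadata" nodes. The diff does not show how the plugin implements them, so the following is purely an illustrative sketch of one way source registration and reference rendering could work. `CitationRegistry`, `register`, and `render_references` are hypothetical names, and the reference format is an assumption.

```python
from typing import Dict


class CitationRegistry:
    """Hypothetical registry assigning a stable number to each unique URL."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}     # url -> citation number
        self._titles: Dict[int, str] = {}  # citation number -> title (may be empty)

    def register(self, url: str, title: str = "") -> int:
        # Re-registering a known URL returns its existing number,
        # so the same source is never cited under two numbers.
        if url not in self._ids:
            citation_id = len(self._ids) + 1
            self._ids[url] = citation_id
            self._titles[citation_id] = title
        return self._ids[url]

    def render_references(self) -> str:
        # Dicts preserve insertion order, so references come out in
        # the order the sources were first registered.
        lines = ["## References", ""]
        for url, citation_id in self._ids.items():
            title = self._titles[citation_id]
            if title:
                lines.append(f"[{citation_id}] {title}. {url}")
            else:
                lines.append(f"[{citation_id}] {url}")
        return "\n".join(lines)
```

Under this sketch, sources found during the initial search and during later gap-targeted searches would share one registry, so inline markers such as `[1]` stay stable across denoising iterations.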
@@ -105,7 +114,6 @@ The Deep Research plugin requires these OptiLLM plugins:

 - **`web_search`** - Chrome-based Google search automation
 - **`readurls`** - Content extraction from URLs
-- **`memory`** - Unbounded context processing and synthesis

 ## Usage Examples

@@ -205,22 +213,25 @@ The implementation follows the TTD-DR paper's quality criteria:
 | Feature | Simple Search | Deep Research (TTD-DR) |
 |---------|---------------|------------------------|
 | Query Processing | Single query | Multi-query decomposition |
-| Iteration | Single pass | Multiple refinement cycles |
-| Content Synthesis | Raw results | Comprehensive analysis |
-| Gap Detection | None | Automatic completeness evaluation |
+| Iteration | Single pass | Multiple denoising cycles |
+| Draft Evolution | None | Preliminary draft with iterative refinement |
+| Gap Detection | None | Automatic draft gap analysis |
+| Search Strategy | Broad search | Gap-targeted focused search |
 | Citations | Manual | Automatic with tracking |
 | Report Format | Unstructured | Academic report structure |
-| Context Handling | Limited | Unbounded via memory plugin |
+| Quality Evaluation | None | Quality-guided termination |

 ## Future Enhancements

-Potential improvements aligned with research directions:
+Potential improvements aligned with the TTD-DR paper and research directions:

-1. **Parallel Processing** - Concurrent search execution
-2. **Domain Specialization** - Field-specific research strategies
-3. **Multimedia Integration** - Image and video content analysis
-4. **Real-time Updates** - Live research monitoring and updates
-5. **Collaborative Research** - Multi-agent research coordination
+1. **Component-wise Self-Evolutionary Optimization** - Implement fitness-based evolution of search, synthesis, and integration components as described in the paper
+2. **Memory-based Synthesis** - Integrate memory plugin for unbounded context processing
+3. **Parallel Processing** - Concurrent search execution
+4. **Domain Specialization** - Field-specific research strategies
+5. **Multimedia Integration** - Image and video content analysis
+6. **Real-time Updates** - Live research monitoring and updates
+7. **Collaborative Research** - Multi-agent research coordination

 ## Troubleshooting
