Commit 96e77f4

Refactor TTD-DR plugin for improved citation handling
Updated the README to clarify the TTD-DR algorithm steps, highlight implemented and future features, and update the architecture diagram. In research_engine.py, removed the unused memory-based synthesis, added citation usage validation, improved citation requirements in denoising and finalization steps, and ensured only used citations are included in the final references. Enhanced metadata and documentation for future self-evolutionary optimization.
1 parent 82b6c24 commit 96e77f4
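
The commit message says that only citations actually used in the report end up in the final references. A minimal Python sketch of that idea follows. It assumes citations appear in the text as numeric `[n]` markers and that sources are kept in a dict keyed by that number; the names `used_citation_ids` and `build_references` are hypothetical and are not taken from `research_engine.py`.

```python
import re


def used_citation_ids(report: str) -> set[int]:
    """Collect the numbers of all [n] markers that occur in the report text."""
    return {int(number) for number in re.findall(r"\[(\d+)\]", report)}


def build_references(report: str, sources: dict[int, str]) -> str:
    """Build a reference list containing only sources the report cites.

    Assumption: `sources` maps a citation number to a display string (a URL here).
    Markers that have no registered source are skipped rather than invented.
    """
    used = used_citation_ids(report)
    lines = [f"[{n}] {sources[n]}" for n in sorted(used) if n in sources]
    return "\n".join(lines)


# Illustrative data only; the URLs are placeholders.
report = "Diffusion models refine drafts iteratively [1]. Retrieval fills gaps [3]."
sources = {
    1: "https://example.com/a",
    2: "https://example.com/b",  # registered but never cited, so it is dropped
    3: "https://example.com/c",
}
references = build_references(report, sources)
print(references)
```

Grouped markers such as `[1, 2]` are not handled by this pattern; a real implementation would need to decide how to treat them.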

2 files changed: +211 -236 lines changed

optillm/plugins/deep_research/README.md

Lines changed: 42 additions & 31 deletions
@@ -6,13 +6,16 @@ The Deep Research plugin implements the **Test-Time Diffusion Deep Researcher (T

 ## Algorithm Overview

-The TTD-DR algorithm treats research as a **diffusion process** with iterative refinement through denoising and retrieval. Unlike traditional search approaches that return raw results, TTD-DR performs:
+The TTD-DR algorithm treats research as a **diffusion process** with iterative refinement through denoising and retrieval. Unlike traditional search approaches that return raw results, this implementation performs:

-1. **Query Decomposition** - Breaks complex queries into focused sub-questions
-2. **Iterative Search** - Performs multiple rounds of web search based on identified gaps
-3. **Content Synthesis** - Uses advanced memory processing for unbounded context
-4. **Completeness Evaluation** - Automatically assesses research quality and identifies missing aspects
-5. **Report Generation** - Produces structured, academic-quality reports with proper citations
+1. **Preliminary Draft Generation** - Creates an initial "updatable skeleton" from LLM internal knowledge
+2. **Initial Query Decomposition** - Breaks complex queries into focused sub-questions
+3. **Gap Analysis** - Identifies areas in the draft needing external research
+4. **Iterative Denoising** - Performs multiple rounds of gap-targeted search and draft refinement
+5. **Quality-Guided Termination** - Automatically assesses draft quality to determine when research is complete
+6. **Report Finalization** - Produces structured, academic-quality reports with proper citations
+
+**Note:** This is a simplified implementation of the TTD-DR paper. Some advanced features like component-wise self-evolutionary optimization and memory-based synthesis are not yet implemented.

 ## Architecture
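
The steps added in this hunk amount to a refine-until-good-enough loop: find gaps in the draft, retrieve for those gaps, fold the results back in, and stop once quality is high enough. The Python sketch below illustrates that control flow only. Every name, signature, and default in it (`run_denoising_loop`, `quality_threshold`, `max_iterations`, and the toy helpers) is an assumption made for illustration, and the absolute score threshold is a simplification; the plugin's own methods may compare each draft against the previous iteration instead.

```python
import re
from typing import Callable, Dict, List


def run_denoising_loop(
    draft: str,
    analyze_gaps: Callable[[str], List[str]],
    search: Callable[[List[str]], Dict[str, str]],
    denoise: Callable[[str, Dict[str, str]], str],
    score: Callable[[str], float],
    quality_threshold: float = 0.8,  # assumed default, not from the plugin
    max_iterations: int = 5,         # assumed default, not from the plugin
) -> str:
    """Refine a draft until its score reaches the threshold or no gaps remain."""
    for _ in range(max_iterations):
        if score(draft) >= quality_threshold:
            break  # quality-guided termination
        gaps = analyze_gaps(draft)
        if not gaps:
            break  # nothing left to research
        findings = search(gaps)          # gap-targeted retrieval
        draft = denoise(draft, findings)  # integrate findings into the draft
    return draft


# Toy stand-ins so the loop can run without an LLM or web access.
PLACEHOLDER = re.compile(r"\[NEEDS RESEARCH: ([^\]]+)\]")


def toy_gaps(draft: str) -> List[str]:
    return PLACEHOLDER.findall(draft)


def toy_search(gaps: List[str]) -> Dict[str, str]:
    # Pretend each round only resolves the first gap.
    return {gaps[0]: f"(finding about {gaps[0]})"}


def toy_denoise(draft: str, findings: Dict[str, str]) -> str:
    for gap, text in findings.items():
        draft = draft.replace(f"[NEEDS RESEARCH: {gap}]", text)
    return draft


def toy_score(draft: str) -> float:
    # Fewer open placeholders means a higher score; 1.0 when none remain.
    return 1.0 / (1 + len(toy_gaps(draft)))


initial = "Intro. [NEEDS RESEARCH: benchmarks] [NEEDS RESEARCH: limitations]"
final = run_denoising_loop(initial, toy_gaps, toy_search, toy_denoise, toy_score)
print(final)
```

Passing the four stages in as callables keeps the loop testable in isolation; in the plugin they would presumably be backed by LLM calls and the `web_search` and `readurls` plugins.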

@@ -34,10 +37,12 @@ The core implementation of the TTD-DR algorithm with the following key methods:
 - **`decompose_query()`** - Implements query planning phase
 - **`perform_web_search()`** - Orchestrates web search using individual queries to avoid truncation
 - **`extract_and_fetch_urls()`** - Extracts sources and fetches content
-- **`synthesize_with_memory()`** - Processes unbounded context with citations
-- **`evaluate_completeness()`** - Assesses research gaps
-- **`generate_structured_report()`** - Creates academic-quality reports
-- **`research()`** - Main research loop implementing TTD-DR
+- **`analyze_draft_gaps()`** - Analyzes current draft to identify gaps and areas needing research
+- **`perform_gap_targeted_search()`** - Performs targeted searches based on identified gaps
+- **`denoise_draft_with_retrieval()`** - Core denoising step integrating retrieved information with current draft
+- **`evaluate_draft_quality()`** - Evaluates quality improvement of current draft vs previous iteration
+- **`finalize_research_report()`** - Applies final polishing to the research report
+- **`research()`** - Main research loop implementing TTD-DR diffusion process

 #### 2. Plugin Interface (`deep_research_plugin.py`)

@@ -53,16 +58,20 @@ def run(system_prompt: str, initial_query: str, client, model: str, request_conf

 ```mermaid
 graph TD
-A[Initial Query] --> B[Query Decomposition]
-B --> C[Web Search]
-C --> D[Content Extraction]
-D --> E[Memory Synthesis]
-E --> F[Completeness Evaluation]
-F --> G{Complete?}
-G -->|No| H[Generate Focused Queries]
-H --> C
-G -->|Yes| I[Generate Structured Report]
-I --> J[Final Report with Citations]
+A[Initial Query] --> B[Generate Preliminary Draft]
+B --> C[Initial Query Decomposition]
+C --> D[Initial Web Search]
+D --> E[Register Initial Sources]
+E --> F[Analyze Draft Gaps]
+F --> G[Gap-Targeted Search]
+G --> H[Content Extraction]
+H --> I[Denoise Draft with Retrieved Info]
+I --> J[Evaluate Draft Quality]
+J --> K{Quality Threshold Met?}
+K -->|No| F
+K -->|Yes| L[Finalize Research Report]
+L --> M[Add References & Metadata]
+M --> N[Final Report with Citations]
 ```

 ### Citation System
@@ -105,7 +114,6 @@ The Deep Research plugin requires these OptiLLM plugins:

 - **`web_search`** - Chrome-based Google search automation
 - **`readurls`** - Content extraction from URLs
-- **`memory`** - Unbounded context processing and synthesis

 ## Usage Examples

@@ -205,22 +213,25 @@ The implementation follows the TTD-DR paper's quality criteria:
 | Feature | Simple Search | Deep Research (TTD-DR) |
 |---------|---------------|------------------------|
 | Query Processing | Single query | Multi-query decomposition |
-| Iteration | Single pass | Multiple refinement cycles |
-| Content Synthesis | Raw results | Comprehensive analysis |
-| Gap Detection | None | Automatic completeness evaluation |
+| Iteration | Single pass | Multiple denoising cycles |
+| Draft Evolution | None | Preliminary draft with iterative refinement |
+| Gap Detection | None | Automatic draft gap analysis |
+| Search Strategy | Broad search | Gap-targeted focused search |
 | Citations | Manual | Automatic with tracking |
 | Report Format | Unstructured | Academic report structure |
-| Context Handling | Limited | Unbounded via memory plugin |
+| Quality Evaluation | None | Quality-guided termination |

 ## Future Enhancements

-Potential improvements aligned with research directions:
+Potential improvements aligned with the TTD-DR paper and research directions:

-1. **Parallel Processing** - Concurrent search execution
-2. **Domain Specialization** - Field-specific research strategies
-3. **Multimedia Integration** - Image and video content analysis
-4. **Real-time Updates** - Live research monitoring and updates
-5. **Collaborative Research** - Multi-agent research coordination
+1. **Component-wise Self-Evolutionary Optimization** - Implement fitness-based evolution of search, synthesis, and integration components as described in the paper
+2. **Memory-based Synthesis** - Integrate memory plugin for unbounded context processing
+3. **Parallel Processing** - Concurrent search execution
+4. **Domain Specialization** - Field-specific research strategies
+5. **Multimedia Integration** - Image and video content analysis
+6. **Real-time Updates** - Live research monitoring and updates
+7. **Collaborative Research** - Multi-agent research coordination

 ## Troubleshooting
