Commit 19eb71a

strickvl and claude authored
Refactor ResearchState into smaller artifacts (#229)
* Complete refactoring of ResearchState to artifact-based architecture

  This commit implements the design document for splitting the monolithic ResearchState into separate, named artifacts with custom visualizations. Major changes include:

  ## New Artifact Classes
  - **QueryContext**: Immutable context containing the research query and sub-questions
  - **SearchData**: All search results and cost tracking information
  - **SynthesisData**: Synthesized information from searches (including enhanced versions)
  - **AnalysisData**: Cross-viewpoint analysis and reflection metadata
  - **FinalReport**: The generated HTML report with metadata

  ## Custom Materializers
  Each artifact now has its own materializer with beautiful HTML visualizations:
  - Interactive charts using Chart.js for search costs
  - Collapsible sections for better organization
  - Consistent styling across all artifact views
  - Metadata tables with key statistics

  ## Pipeline Updates
  - All steps refactored to use the new artifact-based approach
  - Proper dependencies established between parallel steps
  - Fixed merge step to run after parallel sub-question processing
  - Updated metadata logging and tagging throughout

  ## Bug Fixes
  - Fixed log_metadata calls to include infer_artifact=True parameter
  - Fixed template variable names in final report generation
  - Corrected enhanced_info merging logic to preserve original synthesis data
  - Added proper step dependencies in parallel pipeline

  ## Test Updates
  - Updated all tests to use the new artifact-based interface
  - Tests now create individual artifacts instead of ResearchState
  - Maintained test coverage for all functionality

  This refactoring improves modularity, enables better artifact visualization in the ZenML dashboard, and makes the pipeline more maintainable and extensible.

  🤖 Generated with [Claude Code](https://claude.ai/code)

  Co-Authored-By: Claude <[email protected]>

* Delete legacy code

---------

Co-authored-by: Claude <[email protected]>
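To make the artifact split described in the commit message concrete, here is a minimal sketch of two of the named artifacts. It is an illustration only: the field names (`main_query`, `sub_questions`, `search_results`, `search_costs`) and the `merge` method are hypothetical and not taken from the commit, and standard-library dataclasses are used to keep the example self-contained, whereas the real classes may be built differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class QueryContext:
    """Immutable context: the research query and its sub-questions.

    frozen=True makes attribute assignment raise after construction.
    Field names here are assumptions for illustration.
    """

    main_query: str
    sub_questions: Tuple[str, ...] = ()


@dataclass
class SearchData:
    """Search results keyed by sub-question, plus per-provider cost tracking.

    Field names here are assumptions for illustration.
    """

    search_results: Dict[str, List[str]] = field(default_factory=dict)
    search_costs: Dict[str, float] = field(default_factory=dict)

    def merge(self, other: "SearchData") -> "SearchData":
        """Return a new SearchData combining both inputs without mutating either.

        A merge step that runs after parallel sub-question processing
        could fold the per-branch outputs together this way.
        """
        merged = SearchData(
            search_results={q: list(r) for q, r in self.search_results.items()},
            search_costs=dict(self.search_costs),
        )
        for question, results in other.search_results.items():
            merged.search_results.setdefault(question, []).extend(results)
        for provider, cost in other.search_costs.items():
            merged.search_costs[provider] = merged.search_costs.get(provider, 0.0) + cost
        return merged
```

Keeping the query context frozen while leaving the search data mergeable mirrors the commit's distinction between an immutable input artifact and artifacts that accumulate results across steps.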
1 parent 1a80539 commit 19eb71a

27 files changed, +3557 -3094 lines changed

deep_research/materializers/__init__.py

Lines changed: 11 additions & 6 deletions
@@ -2,20 +2,25 @@
 Materializers package for the ZenML Deep Research project.
 
 This package contains custom ZenML materializers that handle serialization and
-deserialization of complex data types used in the research pipeline, particularly
-the ResearchState object that tracks the state of the research process.
+deserialization of complex data types used in the research pipeline.
 """
 
+from .analysis_data_materializer import AnalysisDataMaterializer
 from .approval_decision_materializer import ApprovalDecisionMaterializer
+from .final_report_materializer import FinalReportMaterializer
 from .prompt_materializer import PromptMaterializer
-from .pydantic_materializer import ResearchStateMaterializer
-from .reflection_output_materializer import ReflectionOutputMaterializer
+from .query_context_materializer import QueryContextMaterializer
+from .search_data_materializer import SearchDataMaterializer
+from .synthesis_data_materializer import SynthesisDataMaterializer
 from .tracing_metadata_materializer import TracingMetadataMaterializer
 
 __all__ = [
     "ApprovalDecisionMaterializer",
     "PromptMaterializer",
-    "ReflectionOutputMaterializer",
-    "ResearchStateMaterializer",
     "TracingMetadataMaterializer",
+    "QueryContextMaterializer",
+    "SearchDataMaterializer",
+    "SynthesisDataMaterializer",
+    "AnalysisDataMaterializer",
+    "FinalReportMaterializer",
 ]
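
The materializers exported above are described in the commit message as producing HTML visualizations with metadata tables and collapsible sections. The following is a rough sketch of how such an HTML fragment might be assembled for search data. The function name `render_search_data_html` and its parameters are hypothetical, the Chart.js charts mentioned in the commit are left out, and the wiring into a ZenML materializer (for example through its `save_visualizations` method) is omitted so the example runs with the standard library alone.

```python
from html import escape
from typing import Dict, List


def render_search_data_html(
    results: Dict[str, List[str]], costs: Dict[str, float]
) -> str:
    """Build an HTML fragment with a cost table and one collapsible section per question.

    Hypothetical helper for illustration; not the project's actual renderer.
    """
    total_cost = sum(costs.values())

    # Metadata table: one row per provider, sorted by name for stable output.
    cost_rows = "".join(
        f"<tr><td>{escape(name)}</td><td>${cost:.4f}</td></tr>"
        for name, cost in sorted(costs.items())
    )

    # Collapsible sections: <details>/<summary> need no JavaScript.
    sections = "".join(
        "<details><summary>{} ({} results)</summary><ul>{}</ul></details>".format(
            escape(question),
            len(items),
            "".join(f"<li>{escape(item)}</li>" for item in items),
        )
        for question, items in results.items()
    )

    return (
        "<div class='search-data'>"
        f"<p>Total cost: ${total_cost:.4f}</p>"
        f"<table><tr><th>Provider</th><th>Cost</th></tr>{cost_rows}</table>"
        f"{sections}"
        "</div>"
    )
```

All user-supplied text passes through `html.escape`, so a sub-question containing angle brackets is displayed literally instead of being interpreted as markup.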
