Skip to content

Conversation

zhang2014
Copy link
Member

@zhang2014 zhang2014 commented Jun 28, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR introduces a major architectural refactoring of Databend's physical plan system, migrating from a monolithic enum-based design to a modern trait-based architecture. This change significantly improves
code maintainability, performance, and extensibility while laying the foundation for future query engine enhancements.

Key Changes

Core Architecture Migration

  • Before: Single large PhysicalPlan enum with 50+ variants
  • After: Trait-based system with IPhysicalPlan trait and individual plan types
  • Benefit: Better modularity, easier to extend, and improved compile-time optimization

Trait System Design

  #[typetag::serde]
  pub trait IPhysicalPlan: DynClone + Debug + Send + Sync + 'static {
      fn derive(&self, children: Vec<PhysicalPlan>) -> PhysicalPlan;
      fn build_pipeline2(&self, builder: &mut PipelineBuilder) -> Result<()>;
      fn formatter(&self) -> Result<Box<dyn PhysicalFormat + '_>>;
      // ... other methods
  }

Benefits

  • Easier to add new physical plan types
  • Trait-based design allows for better composition
  • Testability: Each plan can be unit tested independently
  • Code Quality: Better separation of concerns and reduced duplication
  • Consistency: Unified interfaces and formatting across all plans

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

This change is Reviewable

@github-actions github-actions bot added the pr-refactor this PR changes the code base without new features or bugfix label Jun 28, 2025
zhang2014 added 15 commits July 15, 2025 13:56
…refactor/physical_plan_trait

# Conflicts:
#	src/query/service/src/interpreters/interpreter_copy_into_table.rs
#	src/query/service/src/pipelines/builders/builder_copy_into_location.rs
#	src/query/service/src/pipelines/builders/builder_join.rs
#	src/query/service/src/pipelines/processors/transforms/transform_recursive_cte_source.rs
#	src/query/sql/src/executor/format.rs
#	src/query/sql/src/executor/physical_plans/physical_union_all.rs
…refactor/physical_plan_trait

# Conflicts:
#	src/query/service/src/interpreters/interpreter_table_analyze.rs
#	src/query/service/src/physical_plans/physical_limit.rs
#	src/query/service/src/physical_plans/physical_recluster.rs
#	src/query/service/src/physical_plans/physical_table_scan.rs
#	src/query/service/src/pipelines/builders/builder_aggregate.rs
#	src/query/service/src/pipelines/builders/builder_commit.rs
#	src/query/service/src/pipelines/builders/builder_compact.rs
#	src/query/service/src/pipelines/builders/builder_hilbert_partition.rs
#	src/query/service/src/pipelines/builders/builder_row_fetch.rs
#	src/query/service/src/pipelines/builders/builder_sort.rs
#	src/query/service/src/pipelines/builders/builder_union_all.rs
#	src/query/service/src/pipelines/builders/builder_window.rs
#	src/query/service/src/pipelines/builders/mod.rs
#	src/query/service/src/pipelines/pipeline_builder.rs
#	src/query/service/src/pipelines/processors/transforms/transform_recursive_cte_source.rs
#	src/query/service/src/schedulers/fragments/fragmenter.rs
#	src/query/sql/src/executor/format.rs
#	src/query/sql/src/executor/physical_plan.rs
#	src/query/sql/src/executor/physical_plan_visitor.rs
#	src/query/sql/src/executor/physical_plans/mod.rs
#	src/query/sql/src/executor/physical_plans/physical_aggregate_partial.rs
#	src/query/sql/src/executor/physical_plans/physical_sort.rs
#	src/query/sql/src/executor/physical_plans/physical_union_all.rs
#	src/query/sql/src/executor/physical_plans/physical_window_partition.rs
#	src/query/sql/src/planner/optimizer/optimizers/hyper_dp/dynamic_sample/dynamic_sample.rs
@zhang2014 zhang2014 marked this pull request as ready for review August 15, 2025 18:55
@zhang2014 zhang2014 added the ci-cloud Build docker image for cloud test label Aug 16, 2025
@zhang2014 zhang2014 requested a review from BohuTANG August 16, 2025 01:17
Copy link
Contributor

Docker Image for PR

  • tag: pr-18268-3488bd0-1755309078

note: this image tag is only available for internal use.

@zhang2014 zhang2014 changed the title refactor(query): use trait to refactor physical plan refactor(query): migrate physical plan from enum to trait-based architecture Aug 16, 2025
@zhang2014 zhang2014 added the ci-benchmark Benchmark: run all test label Aug 16, 2025
Copy link
Contributor

Docker Image for PR

  • tag: pr-18268-7d0a6b9-1755313503

note: this image tag is only available for internal use.

@BohuTANG BohuTANG added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Aug 16, 2025
Copy link
Contributor

Docker Image for PR

  • tag: pr-18268-0cf5486-1755337260

note: this image tag is only available for internal use.

@zhang2014 zhang2014 force-pushed the refactor/physical_plan_trait branch from a2e3e2d to eea6de3 Compare August 17, 2025 03:13
@zhang2014 zhang2014 merged commit c0f12f9 into databendlabs:main Aug 17, 2025
87 of 88 checks passed
BohuTANG pushed a commit that referenced this pull request Aug 23, 2025
…E processing

Adds #[recursive::recursive] annotations to key CTE binding functions to prevent
stack overflow when processing complex queries with many UNION operations.

This follows the pattern established in PR #18268 and addresses the
segmentation fault issue introduced in PR #18577.

Fixes:
- bind_query() - Main query binding entry point
- m_cte_to_temp_table() - CTE temporary table creation
- compute_cte_ref_count() - CTE reference counting
- bind_cte_definition() - CTE definition binding
- TableNameReplacer visitor methods - AST traversal for name replacement

The recursive library automatically grows stack size when needed,
preventing stack overflow on deeply nested or complex query structures.
BohuTANG added a commit that referenced this pull request Aug 23, 2025
…nnotations (#18588)

fix(query): add recursive annotations to prevent stack overflow in CTE processing

Adds #[recursive::recursive] annotations to key CTE binding functions to prevent
stack overflow when processing complex queries with many UNION operations.

This follows the pattern established in PR #18268 and addresses the
segmentation fault issue introduced in PR #18577.

Fixes:
- bind_query() - Main query binding entry point
- m_cte_to_temp_table() - CTE temporary table creation
- compute_cte_ref_count() - CTE reference counting
- bind_cte_definition() - CTE definition binding
- TableNameReplacer visitor methods - AST traversal for name replacement

The recursive library automatically grows stack size when needed,
preventing stack overflow on deeply nested or complex query structures.

Co-authored-by: Ubuntu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-benchmark Benchmark: run all test ci-cloud Build docker image for cloud test pr-refactor this PR changes the code base without new features or bugfix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants