@dantengsky
Member

@dantengsky dantengsky commented Jan 24, 2026

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

perf only

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

Add heuristic-based block-level shuffle for better load balancing when
tables have few segments relative to cluster size.

Changes:
- Add BlockMod shuffle kind for block-level distribution
- Add auto_block_shuffle_threshold setting (default=5, 0 to disable)
- When segment_count < nodes * threshold, use block-level shuffle
- Each executor filters blocks by block_idx % num_executors == executor_idx
- Add info logging for shuffle strategy selection
- Preserve partition kind during reshuffle to prevent data duplication
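
The selection rule and the per-executor block filter listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `ShuffleKind::SegmentLevel`, `choose_shuffle`, and `owns_block` are hypothetical names, and only `BlockMod` and the `auto_block_shuffle_threshold` semantics (default 5, 0 disables) come from the description.

```rust
/// Hypothetical shuffle strategies. Only `BlockMod` is named in this PR;
/// `SegmentLevel` is a placeholder for the existing segment-level shuffle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ShuffleKind {
    SegmentLevel,
    BlockMod,
}

/// Pick block-level shuffle when the table has few segments relative to the
/// cluster size: segment_count < num_nodes * threshold.
/// A threshold of 0 disables the heuristic.
fn choose_shuffle(segment_count: usize, num_nodes: usize, threshold: usize) -> ShuffleKind {
    if threshold == 0 {
        return ShuffleKind::SegmentLevel;
    }
    if segment_count < num_nodes.saturating_mul(threshold) {
        ShuffleKind::BlockMod
    } else {
        ShuffleKind::SegmentLevel
    }
}

/// An executor keeps a block only if block_idx % num_executors == executor_idx.
/// With zero executors no block is owned (this also avoids a modulo by zero).
fn owns_block(block_idx: usize, num_executors: usize, executor_idx: usize) -> bool {
    num_executors > 0 && block_idx % num_executors == executor_idx
}
```

With 3 nodes and the default threshold of 5, a table with 4 segments falls under the cutoff of 15 and would use block-level shuffle. Because the filter is a plain modulo, every block index is owned by exactly one executor as long as all executors agree on `num_executors`.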

Move block_slot computation from executor-side (prune_segments_with_pipeline)
to coordinator-side (redistribute_source_fragment). This ensures all executors
use the same cluster view that was determined when the plan was created,
preventing data duplication or loss if cluster membership changes.

Changes:
- Add block_slot field to DataSourcePlan
- Compute block_slot in redistribute_source_fragment for BlockMod shuffle
- Pass block_slot through plan instead of computing at execution time
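
As a rough sketch of why computing the slot on the coordinator helps: the coordinator fixes the executor list once, stamps each plan with its slot, and executors only read that value. The types and functions below (`BlockSlot`, `SourcePlan`, `assign_block_slots`, `keeps_block`) are hypothetical stand-ins, not the actual `DataSourcePlan` or `redistribute_source_fragment` code.

```rust
/// Hypothetical slot descriptor: this executor is `slot` out of `num_slots`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockSlot {
    slot: usize,
    num_slots: usize,
}

/// Hypothetical stand-in for the plan sent to one executor.
#[derive(Debug, Clone)]
struct SourcePlan {
    /// `None` means no block-level filtering is applied.
    block_slot: Option<BlockSlot>,
}

/// Coordinator side: derive every slot from one snapshot of the executor
/// list, so all plans agree on `num_slots`.
fn assign_block_slots(executor_ids: &[String]) -> Vec<(String, SourcePlan)> {
    let num_slots = executor_ids.len();
    executor_ids
        .iter()
        .enumerate()
        .map(|(slot, id)| {
            let plan = SourcePlan {
                block_slot: Some(BlockSlot { slot, num_slots }),
            };
            (id.clone(), plan)
        })
        .collect()
}

/// Executor side: filter using only what the plan carries; the executor
/// never re-reads cluster membership.
fn keeps_block(plan: &SourcePlan, block_idx: usize) -> bool {
    match plan.block_slot {
        Some(s) => s.num_slots > 0 && block_idx % s.num_slots == s.slot,
        None => true,
    }
}
```

If each executor instead derived its own index from live cluster membership, a node joining or leaving between planning and execution could give two executors different values of `num_slots`, so some blocks would be read twice or not at all.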
…castWarehouse

Block filtering is now controlled by plan.block_slot, not by partition kind.
After reshuffle, all executors just process partitions sequentially.

Also revert incorrect change in memory_table.rs (should use BroadcastCluster).
@github-actions github-actions bot added the pr-feature label (this PR introduces a new feature to the codebase) Jan 24, 2026
@dantengsky dantengsky added the ci-cloud label (Build docker image for cloud test) Jan 24, 2026
@github-actions
Contributor

Docker Image for PR

  • tag: pr-19325-d8ca593-1769243264

note: this image tag is only available for internal use.
