
backport: size-aware CRDB bulk export partitioning #3092

Draft
ostafen wants to merge 2 commits into main from backport/partitioned-export-partitions


@ostafen ostafen commented May 6, 2026

Balance partitions by total range bytes, as reported by SHOW RANGES ... WITH DETAILS, instead of by range count, so workers process roughly equal amounts of data when range sizes are uneven. The partitioner uses a simple greedy pass: walk the ranges in order, accumulate their sizes, and split at the next range boundary once the running total reaches a target of totalSize/K. Skewed inputs may yield fewer than K partitions, but each one stays close to its fair share.
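
For reviewers, the greedy pass described above could be sketched roughly as follows. This is not the code in `internal/datastore/crdb/partitioner.go`: the `rangeInfo` type, the `partitionBySize` name, the cap at K partitions, and the choice to return nil when every size is zero are all assumptions made for illustration.

```go
package main

// rangeInfo is a hypothetical stand-in for one row of
// SHOW RANGES ... WITH DETAILS; the real partitioner may model rows differently.
type rangeInfo struct {
	StartKey  string
	EndKey    string
	SizeBytes int64
}

// partitionBySize groups consecutive ranges into at most k partitions.
// It walks the ranges in order, accumulates their sizes, and closes the
// current partition at the range boundary where the running total first
// reaches totalSize/k.
func partitionBySize(ranges []rangeInfo, k int) [][]rangeInfo {
	if len(ranges) == 0 || k <= 0 {
		return nil
	}

	var total int64
	for _, r := range ranges {
		total += r.SizeBytes
	}
	if total == 0 {
		// Assumption: the caller handles unsized tables with a separate fallback.
		return nil
	}

	target := total / int64(k)

	var parts [][]rangeInfo
	var cur []rangeInfo
	var acc int64
	for _, r := range ranges {
		cur = append(cur, r)
		acc += r.SizeBytes
		// Assumption: never close more than k-1 partitions early, so that
		// integer rounding of the target cannot produce a (k+1)-th partition.
		if acc >= target && len(parts) < k-1 {
			parts = append(parts, cur)
			cur = nil
			acc = 0
		}
	}
	if len(cur) > 0 {
		parts = append(parts, cur)
	}
	return parts
}
```

With sizes 100, 1, 1, 1 and K = 4, the target is 25 bytes, so this sketch closes the first partition after the 100-byte range and places the remaining three ranges in a second one, giving two partitions rather than four.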

Falls back to range-count balancing, and logs the fallback, when CRDB reports zero range_size for every range, e.g. on tables too fresh to have been sized.
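
A minimal sketch of how that fallback might look, assuming the same hypothetical `rangeInfo` row type. The helper names (`allSizesZero`, `partitionByCount`, `fallbackIfUnsized`) and the log message are invented here and are not taken from the PR.

```go
package main

import "log"

// rangeInfo is a hypothetical stand-in for one row of
// SHOW RANGES ... WITH DETAILS.
type rangeInfo struct {
	StartKey  string
	EndKey    string
	SizeBytes int64
}

// allSizesZero reports whether no range carries a non-zero size.
func allSizesZero(ranges []rangeInfo) bool {
	for _, r := range ranges {
		if r.SizeBytes != 0 {
			return false
		}
	}
	return true
}

// partitionByCount splits ranges into at most k contiguous groups whose
// lengths differ by at most one.
func partitionByCount(ranges []rangeInfo, k int) [][]rangeInfo {
	if len(ranges) == 0 || k <= 0 {
		return nil
	}
	if k > len(ranges) {
		k = len(ranges)
	}
	base, extra := len(ranges)/k, len(ranges)%k
	parts := make([][]rangeInfo, 0, k)
	start := 0
	for i := 0; i < k; i++ {
		n := base
		if i < extra {
			n++ // spread the remainder over the first groups
		}
		parts = append(parts, ranges[start:start+n])
		start += n
	}
	return parts
}

// fallbackIfUnsized returns count-balanced partitions and true when every
// range reports zero size; otherwise it returns nil and false so the
// caller can use size-aware partitioning instead.
func fallbackIfUnsized(ranges []rangeInfo, k int) ([][]rangeInfo, bool) {
	if !allSizesZero(ranges) {
		return nil, false
	}
	log.Printf("all %d ranges report zero size; balancing by range count", len(ranges))
	return partitionByCount(ranges, k), true
}
```

Checking for "every size is zero" rather than "any size is zero" keeps the size-aware path active for tables where only some ranges are still unsized, which matches the condition stated in the description.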

@github-actions github-actions Bot added area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) labels May 6, 2026
codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 92.72727% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.41%. Comparing base (b371613) to head (7248b55).
⚠️ Report is 19 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/datastore/crdb/partitioner.go | 92.73% | 2 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3092      +/-   ##
==========================================
- Coverage   75.66%   75.41%   -0.24%     
==========================================
  Files         486      502      +16     
  Lines       59433    61312    +1879     
==========================================
+ Hits        44964    46235    +1271     
- Misses      11196    11728     +532     
- Partials     3273     3349      +76     


