fix(optimizer): Fix bug in partitioning utils#27179
Open
feilong-liu wants to merge 1 commit intoprestodb:masterfrom
Open
fix(optimizer): Fix bug in partitioning utils#27179feilong-liu wants to merge 1 commit intoprestodb:masterfrom
feilong-liu wants to merge 1 commit intoprestodb:masterfrom
Conversation
Contributor
Reviewer's GuideAdjusts PartitioningUtils.isPartitionedOn semantics for empty partitioning arguments to distinguish truly single-node/coordinator-only distributions from multi-node distributions, and adds unit tests to cover the new behavior and existing partitioning scenarios. Class diagram for updated partitioning types and isPartitionedOnclassDiagram
class PartitioningUtils {
+boolean isPartitionedOn(Partitioning partitioning, Collection columns, Set knownConstants)
}
class Partitioning {
+List arguments
+PartitioningHandle handle
+List getArguments()
+PartitioningHandle getHandle()
}
class PartitioningHandle {
+boolean isSingleNode()
+boolean isCoordinatorOnly()
}
class VariableReferenceExpression {
}
class RowExpression {
}
class Collection_VariableReferenceExpression {
}
class Set_VariableReferenceExpression {
}
PartitioningUtils --> Partitioning : uses
PartitioningUtils --> VariableReferenceExpression : uses
PartitioningUtils --> RowExpression : uses
Partitioning --> PartitioningHandle : has
Partitioning --> RowExpression : has arguments
PartitioningUtils --> PartitioningHandle : checks distribution
PartitioningUtils --> Collection_VariableReferenceExpression : columns (Collection<VariableReferenceExpression>)
PartitioningUtils --> Set_VariableReferenceExpression : knownConstants (Set<VariableReferenceExpression>)
Flow diagram for updated isPartitionedOn partitioning logicflowchart TD
A[Start isPartitionedOn] --> B[Read partitioning.arguments]
B --> C{arguments is empty?}
C -- Yes --> D[Check partitioning.handle.isSingleNode]
D --> E[Check partitioning.handle.isCoordinatorOnly]
E --> F{isSingleNode or isCoordinatorOnly?}
F -- Yes --> G[Return true]
F -- No --> H[Return false]
C -- No --> I[Iterate each RowExpression argument]
I --> J[Ignore arguments that are constants using knownConstants]
J --> K[Compare remaining arguments to columns]
K --> L{All required columns covered by arguments?}
L -- Yes --> G
L -- No --> H
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Contributor
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Consider adding a short comment in
isPartitionedOnabove the early return for empty arguments to document why only single-node and coordinator-only partitioning are treated as partitioned in that case, since this is subtle and changes previous behavior. - The tests in
TestPartitioningUtilsrepeatedly constructVariableReferenceExpressioninstances in the same way; you could extract a small helper method (e.g.,variable(String name)) to reduce duplication and improve readability.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider adding a short comment in `isPartitionedOn` above the early return for empty arguments to document why only single-node and coordinator-only partitioning are treated as partitioned in that case, since this is subtle and changes previous behavior.
- The tests in `TestPartitioningUtils` repeatedly construct `VariableReferenceExpression` instances in the same way; you could extract a small helper method (e.g., `variable(String name)`) to reduce duplication and improve readability.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
kaikalur
previously approved these changes
Feb 20, 2026
79ec4b1 to
47fcec0
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Current isPartitionedOn function will return true when partition columns is empty, it's under the assumption that empty partition columns means no partitioning hence all data in one single node. However this is not the case for arbitrary distribution, where partition columns is empty but data is not in one single node. In this PR, I explicitly check that the partition is single node when partition columns is empty.
Motivation and Context
Bug fix
Impact
Fix a correctness bug
Test Plan
Unit tests
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.
Summary by Sourcery
Fix partitioning utility behavior for empty partitioning arguments and strengthen test coverage for partitioning semantics.
Bug Fixes:
Tests: