enhance: [2.6] type-aware bidirectional in/== or expression rewriting #48545
zhengbuqian wants to merge 3 commits into milvus-io:2.6
Replace the fixed threshold (150) for merging == or into in[] with type-specific thresholds based on benchmark data:
- INT types: use in[] when N >= 10 (== or is faster below that because of its simpler execution path)
- FLOAT types: use in[] when N >= 15 (float hashing is more expensive)
- Other types (varchar, bool): use in[] when N >= 3

The rewriting is now bidirectional:
- == or → in[]: when shouldUseInExpr returns true (existing direction)
- in[] → == or: when shouldUseInExpr returns false (new direction, via visitTermExpr)
- not in → != and: when shouldUseInExpr returns false (new, via visitUnaryExpr)

Both directions use the same shouldUseInExpr function to ensure consistency.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
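The threshold rule described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: only the name shouldUseInExpr and the three threshold values come from the description, while the DataType enum, the constant names, and the function signature are assumptions made for the example.

```go
package main

import "fmt"

// DataType is a hypothetical stand-in for the schema's field type enum.
type DataType int

const (
	Bool DataType = iota
	Int64
	Float
	Double
	VarChar
)

// Threshold values quoted in the PR description; the constant names are assumed.
const (
	intInThreshold     = 10
	floatInThreshold   = 15
	defaultInThreshold = 3
)

// shouldUseInExpr reports whether n values compared against a field of type t
// should be expressed as in[] (true) or as a chain of == joined by or (false).
// The signature is an assumption; the PR only names the function.
func shouldUseInExpr(t DataType, n int) bool {
	switch t {
	case Int64:
		return n >= intInThreshold
	case Float, Double:
		return n >= floatInThreshold
	default: // varchar, bool, and anything else
		return n >= defaultInThreshold
	}
}

func main() {
	fmt.Println(shouldUseInExpr(Int64, 9))   // below the integer threshold
	fmt.Println(shouldUseInExpr(Int64, 10))  // at the integer threshold
	fmt.Println(shouldUseInExpr(Float, 14))  // below the float threshold
	fmt.Println(shouldUseInExpr(VarChar, 3)) // at the default threshold
}
```

Because both rewrite directions consult the same predicate, an expression that has just been merged into in[] cannot be split back by the other visitor, and vice versa.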
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: zhengbuqian
[INFO] PR Label Summary by Default
[WARNING] Milestone not set
Use /refresh-label to update related check and label manually.
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@             Coverage Diff             @@
##              2.6   #48545       +/-   ##
===========================================
+ Coverage   76.99%   82.48%    +5.49%
===========================================
  Files        1700      555     -1145
  Lines      262533    88038   -174495
===========================================
- Hits       202142    72621   -129521
+ Misses      53550    15365    -38185
+ Partials     6841       52     -6789
The expression rewriter splits IN lists that fall below the type-specific threshold into OR-ed equality comparisons (in[] is kept only for integer N >= 10, float N >= 15, default N >= 3). Tests that expected a TermExpr with fewer values than the threshold were failing because the rewriter converted them to OR expressions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
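To make the test failure concrete, here is a rough sketch of the split at the level of expression text. The rewriteTerm helper, its parameters, and the string rendering are all hypothetical; the real rewriter operates on plan nodes such as TermExpr, not on strings.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteTerm renders a membership test on field over values.
// It keeps in[] when the list has at least inThreshold values and otherwise
// expands it into equality comparisons joined by "or".
// Hypothetical helper for illustration only.
func rewriteTerm(field string, values []string, inThreshold int) string {
	// Assumption: an empty list is left as in[], since there is nothing to expand.
	if len(values) == 0 || len(values) >= inThreshold {
		return fmt.Sprintf("%s in [%s]", field, strings.Join(values, ", "))
	}
	parts := make([]string, len(values))
	for i, v := range values {
		parts[i] = fmt.Sprintf("(%s == %s)", field, v)
	}
	return strings.Join(parts, " or ")
}

func main() {
	// Three integer values with the integer threshold (10): expanded to == or.
	fmt.Println(rewriteTerm("id", []string{"1", "2", "3"}, 10))
	// Three varchar values with the default threshold (3): kept as in[].
	fmt.Println(rewriteTerm("name", []string{`"a"`, `"b"`, `"c"`}, 3))
}
```

A test that feeds a three-element integer list and asserts on a TermExpr would see the expanded OR form instead, which matches the failures described in the commit message.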
BOOL has only 2 unique values (true/false), which is always below the IN threshold (default >= 3), so a BOOL IN list always gets split into OR equals.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
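The BOOL case can be checked with a short sketch. It assumes the rewriter counts distinct values before applying the threshold, which the commit message implies but does not spell out; uniqueBools and boolUsesIn are hypothetical names.

```go
package main

import "fmt"

// defaultInThreshold mirrors the "default >= 3" threshold from the PR text.
const defaultInThreshold = 3

// uniqueBools returns the distinct values of the input in first-seen order.
// Hypothetical helper; the result can never hold more than two entries.
func uniqueBools(values []bool) []bool {
	var seen [2]bool
	out := make([]bool, 0, 2)
	for _, v := range values {
		idx := 0
		if v {
			idx = 1
		}
		if !seen[idx] {
			seen[idx] = true
			out = append(out, v)
		}
	}
	return out
}

// boolUsesIn reports whether a BOOL list would stay as in[] under the
// default threshold, assuming duplicates are removed first.
func boolUsesIn(values []bool) bool {
	return len(uniqueBools(values)) >= defaultInThreshold
}

func main() {
	vals := []bool{true, false, true, false, true}
	fmt.Println(len(uniqueBools(vals)), boolUsesIn(vals))
}
```

However long the input list is, at most two distinct values remain, so under this assumption the threshold of 3 is never reached.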
issue: #45525
pr: #48544