Commit 8d2f08f
feat: Add automatic null-aware anti join for NOT IN subqueries

This commit implements Phase 2 of null-aware anti join support: the optimizer
now detects SQL NOT IN subqueries and configures null-aware semantics for them
automatically.

As a result, DataFusion evaluates NOT IN subqueries with correct SQL
three-valued logic, with no manual configuration required from the user.

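
For readers unfamiliar with the three-valued logic mentioned above, the following is a minimal sketch of how `value NOT IN (list)` evaluates in SQL. It is not DataFusion code: the function name `not_in` and the use of `Option<i32>` to model nullable values (with `None` standing for NULL / UNKNOWN) are assumptions made for illustration only.

```rust
/// SQL three-valued result of `value NOT IN (candidates)`.
/// `None` models SQL NULL (as an input) or UNKNOWN (as a result).
/// Hypothetical helper for illustration; not part of DataFusion.
fn not_in(value: Option<i32>, candidates: &[Option<i32>]) -> Option<bool> {
    // An empty list makes NOT IN true for every value, even NULL.
    if candidates.is_empty() {
        return Some(true);
    }
    // A NULL probe value compared against a non-empty list is UNKNOWN.
    let v = value?;
    // A definite match makes NOT IN false.
    if candidates.contains(&Some(v)) {
        return Some(false);
    }
    // No definite match, but a NULL candidate might have matched: UNKNOWN.
    if candidates.contains(&None) {
        return None;
    }
    // No match and no NULL candidates: definitely not in the list.
    Some(true)
}
```

A `WHERE` clause keeps a row only when the predicate is true, so rows for which this sketch returns `None` are filtered out just like rows for which it returns `Some(false)`.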
- Added `null_aware: bool` field to the `Join` struct in the logical plan
- Updated `Join::try_new()` and related APIs to accept a `null_aware` parameter
- Added `LogicalPlanBuilder::join_detailed_with_options()` for explicit
  `null_aware` control
- Updated all `Join` construction sites across the codebase
- Modified the `DecorrelatePredicateSubquery` optimizer to automatically set
  `null_aware: true` for LeftAnti joins (NOT IN subqueries)
  - Uses the new `join_detailed_with_options()` API to pass the flag
  - Conservative approach: all LeftAnti joins use null-aware semantics
- Added checks in the `JoinSelection` physical optimizer to prevent swapping
  null-aware anti joins
  - Null-aware LeftAnti joins cannot be swapped to RightAnti because:
    - Validation only allows `null_aware=true` on LeftAnti joins
    - NULL-handling semantics are asymmetric between the two sides
  - Added checks in 5 locations, including try_collect_left,
    partitioned_hash_join, partition mode optimization, and
    hash_join_swap_subrule
- Added a new SQL logic test file with 13 test scenarios
  - Tests cover: NULL in subquery, NULL in outer table, empty subquery,
    complex expressions, multiple NOT IN conditions, correlated subqueries
  - Includes EXPLAIN tests to verify correct plan generation
- All existing optimizer and hash join tests continue to pass

Files modified:
- datafusion/expr/src/logical_plan/plan.rs
- datafusion/expr/src/logical_plan/builder.rs
- datafusion/expr/src/logical_plan/tree_node.rs
- datafusion/optimizer/src/decorrelate_predicate_subquery.rs
- datafusion/optimizer/src/eliminate_cross_join.rs
- datafusion/optimizer/src/eliminate_outer_join.rs
- datafusion/optimizer/src/extract_equijoin_predicate.rs
- datafusion/physical-optimizer/src/join_selection.rs
- datafusion/physical-optimizer/src/enforce_distribution.rs
- datafusion/core/src/physical_planner.rs
- datafusion/proto/src/physical_plan/mod.rs
- datafusion/sqllogictest/test_files/null_aware_anti_join.slt (new)
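
The swap restriction described in the `JoinSelection` notes above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual optimizer code: `JoinType`, `JoinInfo`, `is_valid`, and `should_swap` are hypothetical names, and the "swap when the left input is larger" heuristic is only a stand-in for whatever statistics the real optimizer consults.

```rust
/// Hypothetical, reduced join type enum (the real one has more variants).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JoinType {
    Inner,
    LeftAnti,
    RightAnti,
}

/// Hypothetical summary of a join considered by a join-selection pass.
struct JoinInfo {
    join_type: JoinType,
    null_aware: bool,
    left_rows: usize,
    right_rows: usize,
}

/// Mirrors the stated validation rule: `null_aware` is only allowed on LeftAnti.
fn is_valid(join: &JoinInfo) -> bool {
    !join.null_aware || join.join_type == JoinType::LeftAnti
}

/// Decide whether the two inputs may be swapped.
fn should_swap(join: &JoinInfo) -> bool {
    // Swapping LeftAnti would yield RightAnti, which cannot carry the
    // null-aware flag, so null-aware joins are never swapped.
    if join.null_aware {
        return false;
    }
    // Assumed heuristic: swap when the left input is the larger one.
    join.left_rows > join.right_rows
}
```

The design point is that the guard runs before any size-based heuristic, so a null-aware join keeps its input order even when swapping would otherwise look cheaper.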
Before (Phase 1 - manual):
```rust
HashJoinExec::try_new(..., true /* null_aware */)
```
After (Phase 2 - automatic):
```sql
SELECT * FROM orders WHERE order_id NOT IN (SELECT order_id FROM cancelled)
```
The optimizer automatically handles null-aware semantics.
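
To make the difference concrete for the `orders` / `cancelled` query above, here is a small sketch contrasting a plain anti join with a null-aware one over in-memory columns. It models only the SQL semantics, not DataFusion's `HashJoinExec`; the function names and the `Option<i32>` column representation are assumptions for illustration.

```rust
/// Plain left anti join on a single key column: keep left rows with no
/// equal right row. NULL never equals anything, so NULL left keys are kept
/// and NULL right keys are ignored.
fn plain_left_anti(left: &[Option<i32>], right: &[Option<i32>]) -> Vec<Option<i32>> {
    left.iter()
        .copied()
        .filter(|l| match l {
            Some(_) => !right.contains(l),
            None => true,
        })
        .collect()
}

/// Null-aware left anti join, following SQL `NOT IN` semantics.
fn null_aware_left_anti(left: &[Option<i32>], right: &[Option<i32>]) -> Vec<Option<i32>> {
    // Empty subquery: NOT IN is true for every left row, including NULL keys.
    if right.is_empty() {
        return left.to_vec();
    }
    // Any NULL on the right makes every non-matching comparison UNKNOWN,
    // so no left row can qualify.
    if right.contains(&None) {
        return Vec::new();
    }
    // Otherwise keep non-NULL left keys that have no match on the right.
    left.iter()
        .copied()
        .filter(|l| match l {
            Some(_) => !right.contains(l),
            None => false,
        })
        .collect()
}
```

With `order_id` values `[1, 2, NULL]` and cancelled ids `[2, NULL]`, the plain variant would return `[1, NULL]` while the null-aware variant returns no rows, which is what SQL requires for `NOT IN`.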

Test results:
- SQL logic tests: All passed
- Optimizer tests: 568 passed
- Hash join tests: 610 passed
- Physical optimizer tests: 16 passed
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

1 parent 8dd9456, commit 8d2f08f
12 files changed (+350, -6 lines):
- datafusion
  - core/src
  - expr/src/logical_plan
  - optimizer/src
  - physical-optimizer/src
  - proto/src/physical_plan
  - sqllogictest/test_files