refactor: split pushdown and residual predicates in scan plan #583
- separate source pushdown from post-merge residual filtering
- compute residuals for non-SST sources and gate early limit on residual emptiness
- add scan tests for full-pushdown and residual-required paths
StartsWith range pruning is not sound for non-SST scans. next_prefix_string can return None on Unicode boundary cases (when char+1 is not a valid scalar value), which degrades pruning to a lower-bound-only key range. In non-SST paths, StartsWith is treated as fully pushed down (residual = None), so keys that fall inside the range but do not match the prefix are never re-filtered.
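To make the boundary case concrete, here is a minimal sketch. `next_prefix_sketch`, `in_key_range`, and `passes_residual` are hypothetical stand-ins written for illustration; the real `next_prefix_string` in this PR may be implemented differently (for example, it could carry into an earlier character instead of giving up).

```rust
/// Hypothetical stand-in for `next_prefix_string`: bump the last char to get an
/// exclusive upper bound for all strings that start with `prefix`.
fn next_prefix_sketch(prefix: &str) -> Option<String> {
    let mut chars: Vec<char> = prefix.chars().collect();
    let last = chars.pop()?;
    // `char::from_u32` rejects surrogates (U+D800..=U+DFFF) and values above
    // U+10FFFF, so this returns None when char+1 is not a valid scalar value.
    let bumped = char::from_u32(last as u32 + 1)?;
    chars.push(bumped);
    Some(chars.into_iter().collect())
}

/// Key range derived from the prefix: [prefix, upper) when an upper bound
/// exists, otherwise the lower-bound-only range [prefix, +inf).
fn in_key_range(key: &str, prefix: &str) -> bool {
    match next_prefix_sketch(prefix) {
        Some(upper) => key >= prefix && key < upper.as_str(),
        None => key >= prefix,
    }
}

/// The check a residual filter would apply after the range scan.
fn passes_residual(key: &str, prefix: &str) -> bool {
    key.starts_with(prefix)
}
```

With the prefix `"a\u{D7FF}"`, the bumped code point would be U+D800, a surrogate, so no upper bound exists. The key `"b"` then falls inside the lower-bound-only range even though it does not start with the prefix, which is exactly the row that only a residual filter would reject.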
}

fn split_predicate_for_non_sst(predicate: &Expr, key_schema: &SchemaRef) -> NonSstPredicateSplit {
    if key_schema.fields().len() != 1 {
Composite-key + Expr::True incorrectly forces residual filtering and disables early merge limit. split_predicate_for_non_sst returns residual: Some(predicate.clone()) for multi-column keys before handling Expr::True, which propagates to needs_post_filter = true and turns off early merge limit.
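The ordering problem can be sketched with simplified types. Everything below (`Expr`, `Split`, `split_arity_first`, `split_true_first`, `early_limit_allowed`) is a hypothetical reduction for illustration and not the PR's actual code; it only shows why checking the key arity before `Expr::True` disables the early limit.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    True,
    Eq(String, i64),
}

#[derive(Debug, PartialEq)]
struct Split {
    pushdown: Option<Expr>,
    residual: Option<Expr>,
}

/// Single-column key path: assume (for this sketch) everything can be pushed down.
fn split_single_key(predicate: &Expr) -> Split {
    match predicate {
        Expr::True => Split { pushdown: None, residual: None },
        other => Split { pushdown: Some(other.clone()), residual: None },
    }
}

/// Ordering described in the review: the arity check runs first, so a
/// trivially true predicate on a composite key still becomes a residual.
fn split_arity_first(predicate: &Expr, key_columns: usize) -> Split {
    if key_columns != 1 {
        return Split { pushdown: None, residual: Some(predicate.clone()) };
    }
    split_single_key(predicate)
}

/// Possible fix: handle `Expr::True` before looking at the key schema.
fn split_true_first(predicate: &Expr, key_columns: usize) -> Split {
    if *predicate == Expr::True {
        return Split { pushdown: None, residual: None };
    }
    if key_columns != 1 {
        return Split { pushdown: None, residual: Some(predicate.clone()) };
    }
    split_single_key(predicate)
}

/// The early merge limit is only safe when nothing is filtered after the merge.
fn early_limit_allowed(split: &Split) -> bool {
    split.residual.is_none()
}
```

In this sketch, a two-column key with `Expr::True` keeps the early limit under `split_true_first` but loses it under `split_arity_first`, while a real predicate on a composite key still produces a residual in both.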
# Conflicts:
#	src/db/scan.rs
#	src/query/scan.rs
ethe left a comment
StartsWith can be silently dropped for non-string primary keys on non-SST scans.
In split_predicate_for_non_sst_inner, Expr::StartsWith is treated as full pushdown when the column matches and next_prefix_string(prefix) exists (src/db/scan.rs:1215), but this path does not verify the key type. However, key-range derivation only supports StartsWith for string keys (src/query/scan.rs:523). For non-string keys, no key range is produced, and because residual is None (src/db/scan.rs:1232, consumed at src/db/scan.rs:226), the predicate is not evaluated at all. This is a correctness issue.
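One way to keep the split consistent with key-range derivation is to require a string key before treating StartsWith as full pushdown. The sketch below uses hypothetical names (`KeyType`, `Placement`, `place_starts_with`) and a reduced set of inputs; it illustrates the guard the comment asks for and is not the code in src/db/scan.rs.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum KeyType {
    Utf8,
    Int64,
}

#[derive(Debug, PartialEq)]
enum Placement {
    /// The key range alone is trusted; no post-merge filter runs.
    Pushdown,
    /// The predicate is re-evaluated after the merge.
    Residual,
}

/// Decide where a StartsWith predicate is evaluated on a non-SST scan.
/// Full pushdown is only claimed when all three conditions hold:
/// the column is the key, the key is a string, and an upper bound exists.
fn place_starts_with(
    key_type: KeyType,
    column_is_key: bool,
    has_upper_bound: bool,
) -> Placement {
    if column_is_key && key_type == KeyType::Utf8 && has_upper_bound {
        Placement::Pushdown
    } else {
        Placement::Residual
    }
}
```

Falling back to `Residual` whenever any condition fails is the conservative choice: it may evaluate the predicate twice, but it never skips it.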
Good catch! Though the phrasing is not completely accurate: this path wasn’t silently dropping the predicate because residual evaluation was still preserving it end-to-end. This uncovered a real inconsistency, though: non-SST |
Summary
- [...] pushdown_predicate) separate from post-merge filtering (residual_predicate)
- PackageStream [...] pushdown_predicate; evaluate residual only [...]
Tests
- cargo test
- cargo clippy -- -D warnings
- src/db/tests/core/scan.rs