Address planner bugs encountered while integrating rewrite rules (#3435)
This contains a number of planner fixes that were first identified when
trying to integrate the planner rewrite rules (see: #3401). They are:
1. There were some values being leaked into the `Traversal` which would
end up polluting the memo structure with additional values that weren't
in the final graph. This adds a sanity check to validate the traversal
contains precisely the information in the tree. There were a bunch of
places where things got out of line, mainly due to something being added
to the memo structure during rule execution (e.g., a new child) but that
was never actually integrated into the final expression (e.g., because
the expression was a duplicate of another in the reference). This adds a
new "garbage collection" like mechanic, where new references are tracked
during rule execution, and then after we're done, any that are still not
in the memo are pruned.
2. The memoizer had two further issues. The first was that expressions
could be re-used even when they were in the wrong planning stage. That
would be bad, so the memoizer now looks only for references that match
the expected stage.
3. Additionally, we could end up memoizing a reference that had an
incorrect correlation set. Say, for example, that one expression was
added that dropped a correlation (maybe due to predicate
simplification). If we tried to re-use this reference in a place where
that correlation wasn't guaranteed to be defined, we'd have an illegal
graph. This change addresses that by only looking for pre-existing
references in the memoizer that have the right correlation set. This
also allows us to update `DecorrelateValues` so that it can pick up any
_expression_ that is not sideways correlated without having to worry
about those correlations in any quantifier it pushes down.
4. The `PredicateMultiMap` hit an issue if one tried to add the same
(semantically equivalent) predicate multiple times. This can happen if
the user had the same predicate multiple times (like `a > 4 AND a > 4`),
though it's actually harder to construct than one might think. The
rewrite rules hit it because there was a query like `SELECT * FROM
(SELECT * FROM T WHERE T.a > 4) X WHERE X.a > 4`, and after select merge
there would be two copies of the same predicate at the same level. The
builder for the `PredicateMultiMap` was always based on
`LinkedIdentityMap`, so it used pointer equality. This updates the built
`PredicateMultiMap` to be based on pointer equality as well, instead of
semantic equality (through immutable collections).
5. The `Reference` used to have a check to stop any expression that had
a different correlation set from the reference as a whole from being in
the memo. This is wrong, and it would result in way too many members of
the memo. That check has been removed
6. Expression partitions can now match for the argmin of a tuple of
properties. This allows us to pick the best child select, with
tiebreakers that favor simpler expressions.
7. There was a bug in the expression count property where it would
always report the number of selects on references rather than the number
of expressions of the requested type. This has been updated to correctly
return the count for the expression type desired.
8. The `InComparisonToExplodeRule` would allow us to do things like take
an expression like `FROM T WHERE (T.a, T.b) IN ?list` and then explode
the list and turn it into something like `FROM T, ?list x WHERE T.a =
x._0 AND T.b = x._1`. That actually required that the original
comparison had child values, and it wouldn't work if `(T.a, T.b)` could
be re-expressed as some kind of simpler thing. This has been modified to
work on more generic items of type `Record`
9. The exploration rules now only match against exploratory expressions
10. Select merge now checks its children to see if pulling them up
introduces duplicate quantifiers, and it uses the new `rebaseGraphs`
method to rename any that are affected if it would do so
11. Constant object values can now be ignored during simplification if
the value is not in the evaluation context. This was mainly for tests,
but it would be fine for actual code as well.
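
To illustrate the pointer-equality versus semantic-equality distinction from item 4, here is a minimal, self-contained sketch. It does not use the Record Layer classes: `SimplePredicate` and `PredicateKeyDemo` are hypothetical stand-ins, `java.util.IdentityHashMap` plays the role of an identity-keyed map, and `HashMap` plays the role of an equality-keyed collection. It only shows how two equal-but-distinct predicate objects behave as keys; the actual failure mode in the planner is not reproduced here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a predicate such as `a > 4`; not the planner's own predicate type.
final class SimplePredicate {
    final String field;
    final String operator;
    final int operand;

    SimplePredicate(String field, String operator, int operand) {
        this.field = field;
        this.operator = operator;
        this.operand = operand;
    }

    // Semantic equality: two predicates are equal if they compare the same field the same way.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SimplePredicate)) {
            return false;
        }
        SimplePredicate that = (SimplePredicate) other;
        return operand == that.operand
                && field.equals(that.field)
                && operator.equals(that.operator);
    }

    @Override
    public int hashCode() {
        return Objects.hash(field, operator, operand);
    }
}

final class PredicateKeyDemo {
    // Number of distinct keys when keys are compared with equals()/hashCode().
    static int keysUnderSemanticEquality(List<SimplePredicate> predicates) {
        return fill(new HashMap<>(), predicates);
    }

    // Number of distinct keys when keys are compared by reference (==).
    static int keysUnderPointerEquality(List<SimplePredicate> predicates) {
        return fill(new IdentityHashMap<>(), predicates);
    }

    private static int fill(Map<SimplePredicate, Integer> map, List<SimplePredicate> predicates) {
        for (int i = 0; i < predicates.size(); i++) {
            map.put(predicates.get(i), i);
        }
        return map.size();
    }

    public static void main(String[] args) {
        // Two separate objects for the same predicate, as after merging the nested selects.
        List<SimplePredicate> merged = Arrays.asList(
                new SimplePredicate("a", ">", 4),
                new SimplePredicate("a", ">", 4));
        System.out.println(keysUnderSemanticEquality(merged)); // prints 1
        System.out.println(keysUnderPointerEquality(merged));  // prints 2
    }
}
```

With identity keys, both copies of `a > 4` keep their own entry, which matches the behaviour the builder already had.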
As one can see, it's a lot of things. Most of them are actually fairly
minor code changes, though some of them are a bit conceptually tricky
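
The "argmin of a tuple of properties" from item 6 can be pictured as a lexicographic comparison: the first property decides, and later properties only break ties. The sketch below is an illustration under assumptions, not the planner's matcher API. `CandidateSelect`, `TupleArgMinDemo`, and the two properties (`selectCount`, `predicateCount`) are hypothetical names chosen for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical summary of a candidate child select and two of its properties.
final class CandidateSelect {
    final String name;
    final int selectCount;     // primary property: fewer is better
    final int predicateCount;  // tiebreaker: fewer is simpler

    CandidateSelect(String name, int selectCount, int predicateCount) {
        this.name = name;
        this.selectCount = selectCount;
        this.predicateCount = predicateCount;
    }
}

final class TupleArgMinDemo {
    // Lexicographic order over the tuple (selectCount, predicateCount).
    static final Comparator<CandidateSelect> SIMPLEST_FIRST =
            Comparator.comparingInt((CandidateSelect c) -> c.selectCount)
                    .thenComparingInt(c -> c.predicateCount);

    // Returns the candidate that minimizes the property tuple, or empty if there are none.
    static Optional<CandidateSelect> argMin(List<CandidateSelect> candidates) {
        return candidates.stream().min(SIMPLEST_FIRST);
    }

    public static void main(String[] args) {
        List<CandidateSelect> candidates = Arrays.asList(
                new CandidateSelect("A", 2, 1),
                new CandidateSelect("B", 1, 3),
                new CandidateSelect("C", 1, 2));
        // B and C tie on selectCount; C wins on the predicateCount tiebreaker.
        System.out.println(argMin(candidates).map(c -> c.name).orElse("none")); // prints C
    }
}
```

Chaining `thenComparingInt` keeps the tiebreaker order explicit, so adding a further property is a one-line change.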
---------
Co-authored-by: normen662 <[email protected]>
fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/CascadesPlanner.java
11 additions & 0 deletions
@@ -437,6 +437,11 @@ private void planPartial(@Nonnull final Supplier<Reference> referenceSupplier,
fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/CascadesRuleCall.java
60 additions & 19 deletions
@@ -97,6 +97,8 @@ public class CascadesRuleCall implements ExplorationCascadesRuleCall, Implementa
fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/PredicateMultiMap.java