Skip to content

Commit c323aa8

Browse files
Address planner bugs encountered while integrating rewrite rules (#3435)
This contains a number of planner fixes that were first identified when trying to integrate the planner rewrite rules (see: #3401). They are: 1. There were some values being leaked into the `Traversal` which would end up polluting the memo structure with additional values that weren't in the final graph. This adds a sanity check to validate the traversal contains precisely the information in the tree. There were a bunch of places where things got out of line, mainly due to something being added to the memo structure during rule execution (e.g., a new child) but that was never actually integrated into the final expression (e.g., because the expression was a duplicate of another in the reference). This adds a new "garbage collection" like mechanic, where new references are tracked during rule execution, and then after we're done, any that are still not in the memo are pruned. 2. Another memoizer issue: The first was that expressions could be re-used when they were in the wrong planning stage. This would be bad, and so the memoizer now looks for only references that match the expected stage. 3. Additionally, we could end up memoizing a reference that had an incorrect correlation set. Say, for example, that one expression was added that dropped a correlation (maybe due to predicate simplification). If we tried to re-use this reference in a place where that correlation wasn't guaranteed to be defined, which means we'd have an illegal graph. This addresses this by only looking for pre-existing references in the memoizer that have the right correlation set. This also allows us to update `DecorrelateValues` so that it can pick up any _expression_ that is not sideways correlated without having to worry about those correlations in any quantifier it pushes down 4. The `PredicateMultiMap` map hit an issue if one tried to add the same (semantically) equivalent predicate multiple times. This can happen if the user had the same predicate multiple times (like `a > 4 AND a > 4`) though it's actually harder to construct than one might think. The rewrite rules hit it because there was a query like `SELECT * FROM (SELECT * FROM T WHERE T.a > 4) X WHERE X.a > 4` and then select merge would be two copies of the same predicate at the same level. The builder for the `PredicateMultiMap` was always based on `LinkedIdenityMap`, so it used pointer equality. This updates the built `PredicateMultiMap` to be based on pointer equality as well instead of semantic equality (through immutable collections) 5. The `Reference` used to have a check to stop any expression that had a different correlation set from the reference as a whole from being in the memo. This is wrong, and it would result in way too many members of the memo. That check has been removed 6. Expression partitions can now match for the argmin of a tuple of properties. This allows us to pick the best child select with tiebreakers to favor expressions that are simpler (with tiebreakers) 7. There was a bug in the expression count property where it would always report the number of selects on references rather than the number of the expression requested. This has been updated to correctly return the expression type desired 8. The `InComparisonToExplodeRule` would allow us to do things like take an expression like `FROM T WHERE (T.a, T.b) IN ?list` and then explode the list and turn it into something like `FROM T, ?list x WHERE T.a = x._0 AND T.b = x._1`. That actually required that the original comparison had child values, and it wouldn't work if `(T.a, T.b)` could be re-expressed as some kind of simpler thing. This has been modified to work on more generic items of type `Record` 9. The exploration rules now only match against exploratory examples 10. Select merge now checks its children to see if pulling them up introduces duplicate quantifiers, and it uses the new `rebaseGraphs` method to rename any that are affected if it would do so 11. Constant object values now can be ignored during simplification is not in the evaluation context. This was mainly for tests, but it would be fine for actual code as well As one can see, it's a lot of things. Most of them are actually fairly minor code changes, though some of them are a bit conceptually tricky --------- Co-authored-by: normen662 <[email protected]>
1 parent 331b53d commit c323aa8

File tree

27 files changed

+1357
-195
lines changed

27 files changed

+1357
-195
lines changed

fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/EvaluationContext.java

Lines changed: 45 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,6 @@
2121
package com.apple.foundationdb.record;
2222

2323
import com.apple.foundationdb.annotation.API;
24-
import com.apple.foundationdb.record.logging.LogMessageKeys;
2524
import com.apple.foundationdb.record.query.plan.cascades.CorrelationIdentifier;
2625
import com.apple.foundationdb.record.query.plan.cascades.typing.TypeRepository;
2726

@@ -136,24 +135,63 @@ public Object getBinding(@Nonnull final Bindings.Internal type, @Nonnull final C
136135
return bindings.get(type.bindingName(alias.getId()));
137136
}
138137

138+
/**
139+
* Whether a value is bound to the provided parameter.
140+
*
141+
* @param name the name of the parameter to check
142+
* @return whether a value is bound to the given parameter
143+
* @see Bindings#containsBinding(String)
144+
*/
145+
public boolean containsBinding(@Nonnull final String name) {
146+
return bindings.containsBinding(name);
147+
}
148+
149+
/**
150+
* Whether the bindings contain a given special internal correlation.
151+
*
152+
* @param type the type of the parameter
153+
* @param alias the parameter's alias
154+
* @return whether the bindings contain that special correlation value
155+
*/
156+
public boolean containsBinding(@Nonnull final Bindings.Internal type, @Nonnull final CorrelationIdentifier alias) {
157+
return containsBinding(type.bindingName(alias.getId()));
158+
}
159+
160+
/**
161+
* Whether a value is bound to the given constant.
162+
*
163+
* @param alias the alias of the constant map
164+
* @param constantId the identity of the constant within the map
165+
* @return whether a value is bound to the given constant
166+
*/
167+
public boolean containsConstantBinding(@Nonnull final CorrelationIdentifier alias, @Nonnull final String constantId) {
168+
if (!containsBinding(Bindings.Internal.CONSTANT, alias)) {
169+
return false;
170+
}
171+
final var constantsMap = getConstantsMap(alias);
172+
return constantsMap.containsKey(constantId);
173+
}
174+
175+
139176
/**
140177
* Dereferences the constant.
141178
*
142179
* @param alias the correlation identifier
143180
* @param constantId the constant id
144181
* @return de-referenced constant
145182
*/
146-
@SuppressWarnings("unchecked")
147183
@Nullable
148184
public Object dereferenceConstant(@Nonnull final CorrelationIdentifier alias, @Nonnull final String constantId) {
149-
final var constantsMap = (Map<String, ?>)bindings.get(Bindings.Internal.CONSTANT.bindingName(alias.getId()));
150-
if (constantsMap == null) {
151-
throw new RecordCoreException("could not find constant in the evaluation context")
152-
.addLogInfo(LogMessageKeys.KEY, "'" + alias.getId() + "' - '" + constantId + "'");
153-
}
185+
final var constantsMap = getConstantsMap(alias);
154186
return constantsMap.get(constantId);
155187
}
156188

189+
@SuppressWarnings("unchecked")
190+
@Nonnull
191+
private Map<String, ?> getConstantsMap(@Nonnull final CorrelationIdentifier alias) {
192+
return (Map<String, ?>) getBinding(Bindings.Internal.CONSTANT, alias);
193+
}
194+
157195
@Nonnull
158196
public TypeRepository getTypeRepository() {
159197
return typeRepository;

fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/CascadesPlanner.java

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -437,6 +437,11 @@ private void planPartial(@Nonnull final Supplier<Reference> referenceSupplier,
437437
Debugger.withDebugger(debugger -> debugger.onEvent(nextTask.toTaskEvent(Location.BEGIN)));
438438
try {
439439
nextTask.execute();
440+
Debugger.sanityCheck(() ->
441+
// This is a fairly expensive check, but it makes sure that the traversal
442+
// is up to date with the query DAG. This is used during memoization, so
443+
// it's important that it has an accurate view of the state of the world.
444+
traversal.verifyIntegrity());
440445
} finally {
441446
Debugger.withDebugger(debugger -> debugger.onEvent(nextTask.toTaskEvent(Location.END)));
442447
}
@@ -1004,6 +1009,12 @@ public void execute() {
10041009
protected void executeRuleCall(@Nonnull CascadesRuleCall ruleCall) {
10051010
ruleCall.run();
10061011

1012+
//
1013+
// During rule construction, references can be added to the memoizer that don't end up in
1014+
// any final expression. Prune them here
1015+
//
1016+
ruleCall.pruneUnusedNewReferences();
1017+
10071018
//
10081019
// Handle produced artifacts (through yield...() calls)
10091020
//

fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/CascadesRuleCall.java

Lines changed: 60 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -97,6 +97,8 @@ public class CascadesRuleCall implements ExplorationCascadesRuleCall, Implementa
9797
@Nonnull
9898
private final LinkedIdentitySet<PartialMatch> newPartialMatches;
9999
@Nonnull
100+
private final LinkedIdentitySet<Reference> newReferences;
101+
@Nonnull
100102
private final Set<Reference> referencesWithPushedConstraints;
101103
@Nonnull
102104
private final EvaluationContext evaluationContext;
@@ -119,6 +121,7 @@ public CascadesRuleCall(@Nonnull final PlannerPhase plannerPhase,
119121
this.newExploratoryExpressions = new LinkedIdentitySet<>();
120122
this.newFinalExpressions = new LinkedIdentitySet<>();
121123
this.newPartialMatches = new LinkedIdentitySet<>();
124+
this.newReferences = new LinkedIdentitySet<>();
122125
this.referencesWithPushedConstraints = Sets.newLinkedHashSet();
123126
this.evaluationContext = evaluationContext;
124127
}
@@ -289,6 +292,24 @@ public EvaluationContext getEvaluationContext() {
289292
return evaluationContext;
290293
}
291294

295+
@Nonnull
296+
private Reference addNewReference(@Nonnull final Reference newRef) {
297+
for (RelationalExpression expression : newRef.getAllMemberExpressions()) {
298+
Debugger.withDebugger(debugger -> debugger.onEvent(InsertIntoMemoEvent.newExp(expression)));
299+
traversal.addExpression(newRef, expression);
300+
}
301+
newReferences.add(newRef);
302+
return newRef;
303+
}
304+
305+
public void pruneUnusedNewReferences() {
306+
// Not all of the references created during the course of a rule call end up in a yielded expression.
307+
// At the end of rule call execution, remove any of the newly created references that aren't referenced
308+
// by another reference in the graph
309+
traversal.pruneUnreferencedRefs(newReferences);
310+
newReferences.clear();
311+
}
312+
292313
@Nonnull
293314
@Override
294315
public Reference memoizeExpressions(@Nonnull final Collection<? extends RelationalExpression> exploratoryExpressions,
@@ -335,7 +356,9 @@ public Reference memoizeExploratoryExpressions(@Nonnull final Collection<? exten
335356
try {
336357
Preconditions.checkArgument(expressions.stream().noneMatch(expression -> expression instanceof RecordQueryPlan));
337358

338-
// Pick a candidate expression from the expressions collection. This will be used to do the
359+
final Set<CorrelationIdentifier> requiredCorrelations = correlatedTo(expressions);
360+
361+
// Pick a candidate expression from the expressions collection. This will be used to do the topological check
339362
final RelationalExpression expression = Iterables.getFirst(expressions, null);
340363
Verify.verify(expression != null, "should not get null from first element of non-empty expressions collection");
341364

@@ -350,6 +373,7 @@ public Reference memoizeExploratoryExpressions(@Nonnull final Collection<? exten
350373
final var expressionToReferenceMap = new LinkedIdentityMap<RelationalExpression, Reference>();
351374
referencePathsList.stream()
352375
.flatMap(Collection::stream)
376+
.filter(referencePath -> isEligibleForReuse(requiredCorrelations, referencePath))
353377
.forEach(referencePath -> {
354378
final var referencingExpression = referencePath.getExpression();
355379
if (expressionToReferenceMap.containsKey(referencingExpression)) {
@@ -365,6 +389,7 @@ public Reference memoizeExploratoryExpressions(@Nonnull final Collection<? exten
365389
final List<Set<RelationalExpression>> referencingExpressions = referencePathsList.stream()
366390
.map(referencePaths ->
367391
referencePaths.stream()
392+
.filter(referencePath -> isEligibleForReuse(requiredCorrelations, referencePath))
368393
.map(Traversal.ReferencePath::getExpression)
369394
.collect(LinkedIdentitySet.toLinkedIdentitySet()))
370395
.collect(ImmutableList.toImmutableList());
@@ -394,27 +419,35 @@ public Reference memoizeExploratoryExpressions(@Nonnull final Collection<? exten
394419
}
395420

396421
// If we didn't find one, create a new reference and add it to the memo
397-
final var newRef = Reference.ofExploratoryExpressions(plannerPhase.getTargetPlannerStage(), expressions);
398-
expressions.forEach(expr -> {
399-
Debugger.withDebugger(debugger -> debugger.onEvent(InsertIntoMemoEvent.newExp(expr)));
400-
traversal.addExpression(newRef, expr);
401-
});
402-
return newRef;
422+
return addNewReference(Reference.ofExploratoryExpressions(plannerPhase.getTargetPlannerStage(), expressions));
403423
} finally {
404424
Debugger.withDebugger(debugger -> debugger.onEvent(InsertIntoMemoEvent.end()));
405425
}
406426
}
407427

428+
// Only look for references that are: (1) in the right planner stage and (2) have appropriate correlations.
429+
// It's important that there aren't any extra correlations as if we choose to re-use a reference that
430+
// contains a correlation that the original set of expressions don't have, then we might try to use it
431+
// in a place that we are not allowed to
432+
private boolean isEligibleForReuse(@Nonnull Set<CorrelationIdentifier> requiredCorrelations,
433+
@Nonnull Traversal.ReferencePath path) {
434+
return path.getReference().getPlannerStage() == plannerPhase.getTargetPlannerStage() && path.getReference().getCorrelatedTo().equals(requiredCorrelations);
435+
}
436+
408437
@Nonnull
409438
private Reference memoizeLeafExpressions(@Nonnull final Collection<? extends RelationalExpression> expressions) {
410439
Debugger.withDebugger(debugger -> debugger.onEvent(InsertIntoMemoEvent.begin()));
411440
try {
412441
Preconditions.checkArgument(expressions.stream()
413442
.allMatch(expression -> !(expression instanceof RecordQueryPlan) && expression.getQuantifiers().isEmpty()));
414443

444+
final Set<CorrelationIdentifier> requiredCorrelations = correlatedTo(expressions);
415445
final var leafRefs = traversal.getLeafReferences();
416446

417447
for (final var leafRef : leafRefs) {
448+
if (leafRef.getPlannerStage() != plannerPhase.getTargetPlannerStage() || !requiredCorrelations.equals(leafRef.getCorrelatedTo())) {
449+
continue;
450+
}
418451
if (leafRef.containsAllInMemo(expressions, AliasMap.emptyMap(), false)) {
419452
for (RelationalExpression expression : expressions) {
420453
Debugger.withDebugger(debugger -> debugger.onEvent(InsertIntoMemoEvent.reusedExp(expression)));
@@ -423,17 +456,30 @@ private Reference memoizeLeafExpressions(@Nonnull final Collection<? extends Rel
423456
}
424457
}
425458

426-
final Reference newRef = Reference.ofExploratoryExpressions(plannerPhase.getTargetPlannerStage(), expressions);
427-
for (RelationalExpression expression : expressions) {
428-
Debugger.withDebugger(debugger -> debugger.onEvent(InsertIntoMemoEvent.newExp(expression)));
429-
traversal.addExpression(newRef, expression);
430-
}
431-
return newRef;
459+
return addNewReference(Reference.ofExploratoryExpressions(plannerPhase.getTargetPlannerStage(), expressions));
432460
} finally {
433461
Debugger.withDebugger(debugger -> debugger.onEvent(InsertIntoMemoEvent.end()));
434462
}
435463
}
436464

465+
@Nonnull
466+
private Set<CorrelationIdentifier> correlatedTo(@Nonnull Collection<? extends RelationalExpression> expressions) {
467+
if (expressions.isEmpty()) {
468+
return ImmutableSet.of();
469+
} else if (expressions.size() == 1) {
470+
return Iterables.getOnlyElement(expressions).getCorrelatedTo();
471+
} else {
472+
// Note: this makes a copy, whereas an approach based on using Sets.union wouldn't need to.
473+
// However, we expect there to be a lot of duplicate values amongst the different expressions
474+
// here, and we don't want to have to re-do the de-duplication checks like we would with
475+
// an approach that uses Sets.union. So we just make a copy here
476+
return expressions.stream()
477+
.map(Correlated::getCorrelatedTo)
478+
.flatMap(Collection::stream)
479+
.collect(ImmutableSet.toImmutableSet());
480+
}
481+
}
482+
437483
@Nonnull
438484
@Override
439485
public Reference memoizeFinalExpressionsFromOther(@Nonnull final Reference reference,
@@ -493,12 +539,7 @@ private Reference memoizeExpressionsExactly(@Nonnull final Collection<? extends
493539
try {
494540
final var exploratoryExpressionSet = new LinkedIdentitySet<>(exploratoryExpressions);
495541
final var finalExpressionSet = new LinkedIdentitySet<>(finalExpressions);
496-
final var newRef = referenceCreator.apply(exploratoryExpressionSet, finalExpressionSet);
497-
for (final var expression : allExpressions) {
498-
Debugger.withDebugger(debugger -> debugger.onEvent(InsertIntoMemoEvent.newExp(expression)));
499-
traversal.addExpression(newRef, expression);
500-
}
501-
return newRef;
542+
return addNewReference(referenceCreator.apply(exploratoryExpressionSet, finalExpressionSet));
502543
} finally {
503544
Debugger.withDebugger(debugger -> allExpressions.forEach(
504545
expression -> debugger.onEvent(InsertIntoMemoEvent.end())));

fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/PredicateMap.java

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,6 @@
2121
package com.apple.foundationdb.record.query.plan.cascades;
2222

2323
import com.apple.foundationdb.record.query.plan.cascades.predicates.QueryPredicate;
24-
import com.google.common.collect.ImmutableSet;
2524
import com.google.common.collect.Iterables;
2625
import com.google.common.collect.SetMultimap;
2726

@@ -38,7 +37,7 @@ private PredicateMap(@Nonnull final SetMultimap<QueryPredicate, PredicateMapping
3837
}
3938

4039
public Optional<PredicateMapping> getMappingOptional(@Nonnull final QueryPredicate queryPredicate) {
41-
final ImmutableSet<PredicateMapping> predicateEntries = getMap().get(queryPredicate);
40+
final Set<PredicateMapping> predicateEntries = getMap().get(queryPredicate);
4241
if (predicateEntries.size() != 1) {
4342
return Optional.empty();
4443
}

fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/PredicateMultiMap.java

Lines changed: 5 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,6 @@
2525
import com.apple.foundationdb.record.query.plan.cascades.values.Value;
2626
import com.apple.foundationdb.record.query.plan.cascades.values.translation.PullUp;
2727
import com.apple.foundationdb.record.query.plan.cascades.values.translation.TranslationMap;
28-
import com.google.common.collect.ImmutableSetMultimap;
2928
import com.google.common.collect.Multimaps;
3029
import com.google.common.collect.SetMultimap;
3130
import com.google.common.collect.Sets;
@@ -54,7 +53,7 @@ public class PredicateMultiMap {
5453
* Backing multimap.
5554
*/
5655
@Nonnull
57-
private final ImmutableSetMultimap<QueryPredicate, PredicateMapping> map;
56+
private final SetMultimap<QueryPredicate, PredicateMapping> map;
5857

5958
/**
6059
* Functional interface to reapply a predicate if necessary.
@@ -501,11 +500,13 @@ public PredicateMapping build() {
501500
}
502501

503502
protected PredicateMultiMap(@Nonnull final SetMultimap<QueryPredicate, PredicateMapping> map) {
504-
this.map = ImmutableSetMultimap.copyOf(map);
503+
SetMultimap<QueryPredicate, PredicateMapping> copy = Multimaps.newSetMultimap(new LinkedIdentityMap<>(), LinkedIdentitySet::new);
504+
map.entries().forEach(entry -> copy.put(entry.getKey(), entry.getValue()));
505+
this.map = Multimaps.unmodifiableSetMultimap(copy);
505506
}
506507

507508
@Nonnull
508-
protected ImmutableSetMultimap<QueryPredicate, PredicateMapping> getMap() {
509+
protected SetMultimap<QueryPredicate, PredicateMapping> getMap() {
509510
return map;
510511
}
511512

fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/Quantifier.java

Lines changed: 0 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -820,19 +820,6 @@ public Quantifier rebase(@Nonnull final AliasMap translationMap) {
820820
throw new UnsupportedOperationException("rebase not supported on quantifier");
821821
}
822822

823-
@Nonnull
824-
@SuppressWarnings("PMD.CompareObjectsWithEquals")
825-
public Quantifier translateGraph(@Nonnull final Memoizer memoizer,
826-
@Nonnull final TranslationMap translationMap,
827-
final boolean shouldSimplifyValues) {
828-
final Reference rangesOver = getRangesOver();
829-
final Reference translatedReference =
830-
getRangesOver().translateGraph(memoizer, translationMap, shouldSimplifyValues);
831-
return rangesOver == translatedReference
832-
? this
833-
: overNewReference(translatedReference);
834-
}
835-
836823
@Nonnull
837824
public static CorrelationIdentifier uniqueID() {
838825
return CorrelationIdentifier.uniqueID(Quantifier.class);

0 commit comments

Comments
 (0)