Relax restriction that was preventing us from caching the result of subqueries. #3205
#1470 was supposed to, among other things, add restrictions on when we generate `CachedResults` nodes to cache subquery results. However, the added check is overly broad, and as a result it became impossible to actually cache subquery results. Currently, there is not a single plan test that contains a `CachedResults` node in the output plan.

The `cacheSubqueryAliasesInJoins` function has a comment:

`// The left-most child of a join root is an exception that cannot be cached.`

No rationale is given for this. Looking at it, it seems like we used to generate `CachedResults` nodes before we finished resolving references in the query, and now we wait until after. So it's possible that this is the reason for the restriction, and the entire check is no longer necessary.

Either way, the implementation is more restrictive than the comment would suggest. Because the algorithm tracks state via function parameters, it doesn't propagate state from a child node to its parents/siblings, and the flag recording that it has encountered a subquery gets unset. As a result, no subqueries will actually be cached.

This PR fixes the tree walk so that it correctly remembers once it has encountered a subquery and allows subsequent subqueries to be cached. This is still broader than the comment would suggest, since it doesn't require that this subquery appear in the left-most child of the join.
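
To illustrate the state-propagation problem described above, here is a minimal sketch on a toy tree. It is not the actual `cacheSubqueryAliasesInJoins` code: the `Node` type, `walkByValue`, `walkReturning`, and `exampleJoin` are hypothetical names invented for this example, and the real analyzer rule tracks more state than a single flag. The sketch only shows why a flag passed as a function parameter is lost between siblings, and how returning it from the walk preserves it.

```go
package main

import "fmt"

// Node is a hypothetical, simplified plan node (not the real plan node type).
type Node struct {
	Kind     string // "join", "subquery", or "table"
	Name     string
	Children []*Node
}

// walkByValue mimics the problem: seen is a by-value parameter, so setting it
// while visiting one child is invisible to that child's siblings and parents.
func walkByValue(n *Node, seen bool, cached *[]string) {
	if n.Kind == "subquery" {
		if seen {
			*cached = append(*cached, n.Name)
		}
		seen = true // only affects this call and its descendants
	}
	for _, c := range n.Children {
		walkByValue(c, seen, cached)
	}
}

// walkReturning hands the updated flag back to the caller, so subqueries
// visited later in the walk observe that an earlier one was already seen.
func walkReturning(n *Node, seen bool, cached *[]string) bool {
	if n.Kind == "subquery" {
		if seen {
			*cached = append(*cached, n.Name)
		}
		seen = true
	}
	for _, c := range n.Children {
		seen = walkReturning(c, seen, cached)
	}
	return seen
}

// exampleJoin builds a join whose two children are both subqueries.
func exampleJoin() *Node {
	return &Node{Kind: "join", Children: []*Node{
		{Kind: "subquery", Name: "sq1"},
		{Kind: "subquery", Name: "sq2"},
	}}
}

func main() {
	var byValue, returning []string
	walkByValue(exampleJoin(), false, &byValue)
	walkReturning(exampleJoin(), false, &returning)
	fmt.Println("by-value walk cached:", byValue)
	fmt.Println("returning walk cached:", returning)
}
```

In this toy, the by-value walk caches nothing for the two sibling subqueries, while the returning walk caches the second one. As in the PR, the second walk does not check whether the first subquery was in the left-most child of the join.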

Among the changed plan tests, we see that `CachedResults` nodes are now emitted. Most of them are the children of `HashLookup`s, but there are some that are the children of `InnerJoin` and `SemiJoin` nodes where one of the things being joined is a subquery.