-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Fix different first/max result values taking up query cache space #11188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…mit as well as locking mode statements to a given SQL query. Add a FinalizedSelectExecutor that executes given, finalized SQL statements.
…l happen before and after the finalization phase. Move the part that generates "pre-finalization" SQL into a dedicated method. Use a side channel in SingleSelectSqlFinalizer to access the "finalization" logic and avoid duplication.
…hrough the finalizer, parser knows about output walkers yielding finalizers
…ParserResult of the old kind (with the executor already constructed)
tests/Doctrine/Tests/ORM/Tools/Pagination/WhereInWalkerTest.php
Outdated
Show resolved
Hide resolved
|
@greg0ire, @derrabus or @beberlei, could you please tell me what you think about this? I thought I could get along without BC breaks or deprecations for the time being. But, as it turns out, when users use the Yes, it's a non-BC behaviour, but is that relevant? Is using the |
|
Discussed in internal slack – the |
Co-authored-by: Grégoire Paris <[email protected]>
|
@beberlei what if people use tree walkers (not an output walker) that make modifications to the AST depending on whether there are first/max result limitations (need not even be the concrete values for those)? Then we’d not correctly separate those results in the query cache. We’re looking at the output walker only to make that decision. |
|
I need to double check, but doesn't |
|
If I understand this correctly, the migration path is to extend the new If that is correct, isn't this log deprecation message misleading ?
as is the changelog: |
… `App\PostsQuery\QueryResult::getQueryResult()` * fix `User Deprecated: Your output walker class App\Doctrine\InterpolateParametersSQLOutputWalker should implement Doctrine\ORM\Query\OutputWalker in order to provide a Doctrine\ORM\Query\Exec\SqlFinalizer. This also means the output walker should not use the query firstResult/maxResult values, which should be read from the query by the SqlFinalizer only.`: doctrine/orm#11188 @ `App\Doctrine\InterpolateParametersSQLOutputWalker` @ be
…ecutor` class This removes methods that have been deprecated in 2.20.x via doctrine#11188 and/or got deprecation notices added in doctrine#12196.
…ecutor` class This removes methods that have been deprecated in 2.20.x via doctrine#11188 and/or got deprecation notices added in doctrine#12196.
…ecutor` class This removes methods that have been deprecated in 2.20.x via doctrine#11188 and/or got deprecation notices added in doctrine#12196.
…ecutor` class This removes methods that have been deprecated in 2.20.x via doctrine#11188 and/or got deprecation notices added in doctrine#12196.
#11112 reported that passing different values to the
Query::setFirstResult()andQuery::setMaxResults()methods leads to the queries using different keys in the query cache that speeds up DQL -> SQL transformation. This is no surprise, since these values are explicitly included in the cache key computation:orm/lib/Doctrine/ORM/Query.php
Line 819 in f58984c
The two methods in question may also be used at a lower level when using the
Paginatorcomponent, which may forward its limit/offset values to these methods in one way or another.Since in web applications the limit/offset (or a "page" numer that effectively gives the offset) is often controlled through a request parameter, this may lead to different kinds of memory or storage exhaustion problems.
Reason for having plain values in SQL
The reason why these values are part of the cache key is that both values need to be put into the resulting SQL statement literally, not through parameters. This has to do with how DBAL needs to generate platform-specific SQL, and it cannot be fixed. The corresponding DBAL discussion was at doctrine/dbal#6246.
Approach taken by this PR
Applying the limit/offset to a given SQL query is something that DBAL does on a string-processing level. Luckily, it also happens very late during SQL statement preparation. Only extra SQL expressions added for locking modes need to be added after it.
This gives us the possibility to split DQL -> SQL transformation into two stages. The first one includes building the AST, tree processing and output walking. But, it stops short of adding the limit/offset values to the query or adding locking statements.
We can then put this intermediate, "unfinalized" result into the DQL->SQL query cache, and we can take it from there also for other queries that only differ in the first/max result values. Then, we run the "finalizer" phase to append offset/limit and locking SQL to the prepared query, based on the actual query at hand.
That way, the first/max values no longer need to be part of the cache key.
Concepts and code paths added
The new
\Doctrine\ORM\Query\OutputWalkerinterface expresses how theParserwould like to interact with the (custom) output walker. ItsgetFinalizer()method returns an instance of\Doctrine\ORM\Query\Exec\SqlFinalizer. This will eventually replace thegetExecutor()method that is deprecated in this PR and that will be removed in 4.0.The
SqlFinalizermust be a serializable class, since it will be kept in the query cache as part of theParserResult. Based on the currentQuery, theSqlFinalizerthen ultimately creates theAbstractSqlExecutor. At this time, it has the opportunity to add the SQL limit/offset and locking parts.The
Querypassed to the finalizer is the one that caused the cache lookup, not necessarily the same as the one that was used to compile the DQL. However, since the DQL is part of the cache key computation, both queries must be based on the same DQL, hints etc.The
Queryhas to decide whether to put offset/limit into the cache key computation or not. It does that by inspecting the output walker being hinted to, with the slight optimization of knowing the default value. This leaks details of theParserimplementation, but could be removed once we makeOutputWalkerthe only choice.BC/migration path
We cannot change the
SqlWalkerto return the SQL without the limit/offset parts for BC reasons.Also, our documentation is showing users how to build their custom output walkers by subclassing this class. So, we cannot make the
SqlWalkerimplementOutputWalker, since we could not tell whether the subclasses are aware of the necessary changes and have updated their SQL generation accordingly.The solution is to have a new subclass
SqlOutputWalker. By moving to and extending this new class, users indicate that they are aware of the necessary changes and have removed the limit/offset parts from their own SQL generation. Of course, users would be free to return their ownSqlFinalizerimplementations in case they need special offset/limit handling code.Slight BC break considered irrelevant
Using the
Parser::parse()method may now produce aParserResultthat no longer contains theAbstractSqlExecutor, but only theSqlFinalizer. This has been discussed internally with the conclusion that theParserand itsparse()method are not part of an official API, but "implicitly internal". They are part of the inner workings of theQueryclass, which is able to deal with this variation.