-
Notifications
You must be signed in to change notification settings - Fork 108
Description
The following query from WDbench opts is silly, but it illustrates a place where QLever could be faster. Because the OPTIONAL can only increase the size of the result, it is only necessary to create 10000000 results for the join in the main basic graph pattern. What QLever used to do was to relatively quickly run out of memory for the join, which wasn't great because only a few of these results are needed to produce a final result. With the recent changes QLever instead runs out of time, which is arguably worse. If the limit was pushed down through the sort and onto the join, QLever would perform much better on this query, and also other queries where a LIMIT can be used to limit the number of intermediate results.
Even if LIMIT isn't pushed down, the size of the intermediate result is known, so a good cost, in both time and space, to compute it can be determined. If either is significantly greater than the available resource, then the query could be aborted without actually performing the join. Care does have to be taken here, as evidenced by Virtuoso's laughable time estimates.
SELECT * WHERE {
?x1 <http://www.wikidata.org/prop/direct/P106> <http://www.wikidata.org/entity/Q188094> .
?x2 <http://www.wikidata.org/prop/direct/P27> <http://www.wikidata.org/entity/Q155> .
?x3 <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q5> .
OPTIONAL {
?x3 <http://www.wikidata.org/prop/direct/P21> ?x4 .
}
}
LIMIT 10000000