Conversation

@gavinking
Member

Latest rebase of #10139.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license
and can be relicensed under the terms of the LGPL v2.1 license in the future at the maintainers' discretion.
For more information on licensing, please check here.


@hibernate-github-bot

hibernate-github-bot bot commented Jun 13, 2025

Thanks for your pull request!

This pull request does not follow the contribution rules. Could you have a look?

❌ All commit messages should start with a JIRA issue key matching pattern HHH-\d+
    ↳ Offending commits: [607eab1]

This message was automatically generated.

…r SQM nodes

- finally enables efficient caching of criteria query plans
- also reconsider how alias generation is done - aliases should only
  be unique to a given query, NOT globally unique, since that results
  in interpretation cache misses
- ran into and fixed several other problems along the way
- note that the previous solution based on translating to HQL was not
  working at all, partly because the translation to HQL is not entirely
  correct - and in any case this approach is more efficient, since
  computing hash codes is in general cheaper than rendering strings
- there is still a remaining problem where NavigablePath elements
  are assigned globally unique aliases, resulting in cache misses
…oblem

- just deprecate the method, since HQL doesn't allow anonymous CTEs
- perhaps we need to do the same for the withRecursiveXxxx() methods
- I don't really see a better fix, since nested CTEs can be composed
  before being composed with the outer query (perhaps we could modify
  the aliases of all CTEs after the whole query is built)
Oracle disapproves of aliases of form "_1"
@gavinking gavinking force-pushed the criteria-plan-caching-third-rebase branch from 29fd7e3 to ace0d87 Compare June 13, 2025 14:38
Member

@beikov beikov left a comment

Overall, the approach looks interesting and I think it has potential. The uses of toHqlString() might lead to problems, because it is only used for certain predicates/expressions instead of the whole query, which can ultimately lead to wrong results. Please check the inline comments for further details.


@Override
public String generateAlias() {
	return "t_" + (++aliasCounter);

Member

Mutating the statement with such counters makes substitutability checks somewhat "impossible" AFAICT. If the order of operations is different, the generated aliases might be different even though the queries really are equivalent.
At first, I also thought that this might be a potential concurrency concern, but I guess it's fine since this all happens during CriteriaQuery construction.
Another "problem" of this approach is that people might add From nodes from other CriteriaQuery objects which coincidentally have the same generated alias, but do not represent the same node. Any use of toHqlString() for equality is then in danger of producing wrong results. Note that introducing From nodes from other queries appears to be a common programmer mistake, so this is a real problem.
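The order-sensitivity described here can be seen with a toy version of such a counter-based generator (the class name and shape are invented for illustration; the real code lives in Hibernate's SQM layer):

```java
// Toy illustration of order-dependent alias generation (hypothetical class,
// not Hibernate's actual implementation). The alias a node receives depends
// on how many nodes were created before it in the same query.
class AliasGenerator {
	private int aliasCounter = 0;

	String generateAlias() {
		return "t_" + (++aliasCounter);
	}
}
```

Two structurally equivalent queries that create their From nodes in a different order hand out t_1 and t_2 to different logical nodes, so any equality check that compares generated aliases reports the queries as different.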

Member Author

the generated aliases might be different even though the queries really are equivalent.

Well, yeah, sure, in future we could come up with a more efficient implementation which is less sensitive to equivalent permutations of the query syntax. But for now we're more interested in avoiding false hits on the cache than we are in worrying about false misses.

Note that introducing From nodes from other queries appears to be a common programmer mistake, so this is a real problem.

Well then if we don't allow that, we should detect it and throw a meaningful error. It's not the responsibility of this code to deal with that problem.

Member

Well, yeah, sure, in future we could come up with a more efficient implementation which is less sensitive to equivalent permutations of the query syntax. But for now we're more interested in avoiding false hits on the cache than we are in worrying about false misses.

I might be missing some context as I was gone for a while and am still catching up, so please be patient with me for a bit :)
Using an equals/hashCode based approach surely is an interesting solution if we can figure out the aliasing part, but you seem to imply that there is a bug currently where one CriteriaQuery hits the query plan cache of another that is in fact not equal. I checked the issues you linked but nothing in the descriptions/comments implied that a false cache hit is their root cause. Can you please clarify?

Well then if we don't allow that, we should detect it and throw a meaningful error. It's not the responsibility of this code to deal with that problem.

I'm trying to give you context so you understand my concerns better. Obviously generateAlias() is not the place to fix this.

Member Author

Can you please clarify?

Well there are two problems at present:

  1. There are multiple node types where toHqlString() has missing "bits", so different queries generate the same HQL string, resulting in an incorrect cache hit when hibernate.criteria.plan_cache_enabled is turned on.
  2. On the other hand, when hibernate.criteria.plan_cache_enabled is off, the query interpretation cache caches criteria queries based on the identity of the SQM tree, which is quite likely to result in almost no hits at all.

So whether hibernate.criteria.plan_cache_enabled is enabled or not, we don't get nice behavior.
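For reference, the setting under discussion is the one mentioned above; it can be enabled in the usual way (property name as given in this thread):

```properties
# Opt in to caching of criteria query interpretation plans
hibernate.criteria.plan_cache_enabled=true
```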

Member Author

And of course there's also the fact that evaluating a hashCode() or equals() is by nature more efficient than rendering a string, since it doesn't involve allocation, and can often be quickly short-circuited.

Member

There are multiple node types where toHqlString() has missing "bits"

We should fix these missing "bits" then, because the HQL representation is still useful to people and should be semantically correct, regardless what query cache key strategy we use.

the query interpretation cache caches criteria queries based on the identity of the SQM tree, which is quite likely to result in almost no hits at all.

I agree that this is debatable, but I think we discussed this in the past and decided to default plan caching to false, because the majority of the team thought criteria queries are too dynamic to benefit from caching. Though when that flag is false, the query cache key is null, i.e. no lookup is even done.

And of course there's also the fact that evaluating a hashCode() or equals() is by nature more efficient than rendering a string, since it doesn't involve allocation, and can often be quickly short-circuited.

I would hope so as well, but performance folks repeatedly surprised me in the past about my misconceptions, so I try to be defensive and prefer measuring before jumping to conclusions. The choice of the current implementation was obviously also biased by me trying to reuse existing code and wanting to avoid setting generated aliases on the From nodes.

return this == parameter ? 0 : 1;
}

// this is not really a parameter, it's really a literal value
Member

Your implementation surely is better than the existing one, but this thing actually really is a parameter, in the sense that for caching, the value doesn't matter. This is a point where using equals/hashCode for cache keys is kind of quirky. Naturally, the equality of this type probably should check for value equality, but for caching, the only thing that matters IMO is the fact that the "other" object to compare against is also a parameter.

Member Author

@gavinking gavinking Jun 16, 2025

Are you suggesting that a correct implementation of equals() is:

return object instanceof ValueBindJpaCriteriaParameter

Member Author

Are you suggesting that a correct implementation of equals() is:

return object instanceof ValueBindJpaCriteriaParameter

Well, no, that's no good. Lots and lots of test failures.

Member

I don't know what tests fail as I haven't tried this myself yet, though wanted to mention that in theory, return object instanceof ValueBindJpaCriteriaParameter should in fact be a good enough implementation for cache key purposes.

Member Author

@gavinking gavinking Jun 16, 2025

It might be that equals() is called on parameters in other contexts not related to caching.

Member

Right, that's why I wrote this:

This is a point where using equals/hashCode for cache keys is kind of quirky.

Essentially, this boils down to the "custom equals method with context parameter" suggestion I made in a different discussion on this PR.
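The "custom equals method with context parameter" idea could look roughly like this: a dedicated cache-key comparison that treats every value-bind parameter as compatible, while the standard equals() keeps identity semantics for other callers. This is only a sketch with invented names (ValueBindParameter, isCompatible), not the PR's actual API:

```java
// Sketch: separating cache-key equality from general-purpose equality.
// The class and method names are invented for illustration; they are not
// Hibernate's actual API.
class ValueBindParameter {
	final Object value;

	ValueBindParameter(Object value) {
		this.value = value;
	}

	// Ordinary equals(): identity, so unrelated code (e.g. parameter
	// registration) can still distinguish distinct parameter occurrences.
	@Override
	public boolean equals(Object other) {
		return this == other;
	}

	@Override
	public int hashCode() {
		// All value-bind parameters hash alike, so the bound value never
		// perturbs the cache key.
		return 0;
	}

	// Cache-key comparison: any two value-bind parameters are compatible,
	// because the bound value does not affect the compiled plan.
	boolean isCompatible(Object other) {
		return other instanceof ValueBindParameter;
	}
}
```

The design point is that the cache lookup calls isCompatible() while everything else keeps calling equals(), which avoids the "bloodbath" of changing equals() semantics for all existing callers.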

Member Author

Yeah well I tried that and it was a bloodbath.

We could probably make the change you suggest and then fix whatever other code is making use of equals(). It could use compareTo() instead, for example.

Not going to do that in this PR though, since this seems to be working well enough for now. We can open an issue.

@gavinking
Member Author

The uses of toHqlString() might lead to problems, because it is only used for certain predicates/expressions instead of the whole query, which can ultimately lead to wrong results. Please check the inline comments for further details.

Right, so, to clarify: the two places where toHqlString() is used are marked with a TODO indicating that those implementations are temporary, and that equals() should ideally be overridden on all subtypes.

I'm definitely not going to do that in this already-huge pull request, and I'm not going to let that minor issue stop me merging it. Calling toHqlString() is imperfect, but good enough for now. We can deal with that later.

@beikov
Member

beikov commented Jun 16, 2025

Let me try to explain my main concerns and thinking about this approach to caching CriteriaQuery plans.

  1. For a cache lookup, the SQM tree needs to be traversed at least twice. Once for determining the hash and then again for the equality check.
  2. On a hash collision, the equals check will be invoked multiple times and since that does a lot of pointer chasing, my gut tells me that this will be slow due to CPU cache misses.

On the other hand, the HQL string approach that we have right now traverses the SQM tree only once to generate the string. Hash code generation and equality checks don't do pointer chasing and benefit from vector instructions on CPUs. The only downside to this approach is IMO the additional allocation needed for the HQL string.

To me, the HQL string approach seems more favorable, but maybe I'm wrong and someone from the performance team can prove my intuition about this topic wrong. HTH

@gavinking
Member Author

gavinking commented Jun 16, 2025

For a cache lookup, the SQM tree needs to be traversed at least twice. Once for determining the hash and then again for the equality check. On a hash collision, the equals check will be invoked multiple times and since that does a lot of pointer chasing, my gut tells me that this will be slow due to CPU cache misses.

It's extremely unlikely that the whole SQM tree will be traversed on a hash collision. The overwhelmingly common case is that the comparison terminates after a couple of nodes. And hash collisions are also quite unlikely: there are 2^32 ≈ 4E9 possible int hash codes. There are way fewer criteria queries than that.

It's true that for a cache hit the whole tree gets traversed (just once), but that's clearly a huge win compared to actually recompiling the criteria query from scratch.

@gavinking
Member Author

gavinking commented Jun 16, 2025

To be clear: computing a hashcode is a far more efficient operation than incrementally rendering a string, which usually involves multiple allocations; hashes can be cached just as easily as strings; when cached they occupy much less memory; and when we're done with them, they don't need to be garbage collected.

@gavinking
Member Author

Let's consider the whole process here.

Suppose I have a newly-instantiated criteria query object, and I want to find if we have a cached plan for it.

Under the current implementation:

  1. We render its equivalent HQL string, which requires walking the whole tree and incrementally allocating a string.
  2. We compute the hash of this string.
  3. We do the lookup in the hashtable.
  4. If there's a hash collision (unlikely, assuming we set a sensible load factor), we check the strings for equality, which is extremely fast, probably returning after a few characters.
  5. Otherwise, in case we have a hit, we check the whole string. This is quite fast.

Note that the cost of this is completely dominated by step 1.

Under the new implementation:

  1. We compute the hashcode, which requires walking the whole tree, but involves no allocation.
  2. We do a lookup on the hash table.
  3. If there's a hash collision (unlikely, assuming we set a sensible load factor), we check the trees for equality, which is very fast, probably returning after checking the top couple of nodes.
  4. Otherwise, in case we have a hit, we walk the whole tree.

In this implementation, steps 1 and 4 dominate, and are naively about equally expensive. But both are much less expensive than step 1 of the current implementation.
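The two lookup paths above can be sketched side by side with a toy tree node (the Node type here is invented for illustration; real SQM nodes carry far more state):

```java
import java.util.List;

// Toy stand-in for an SQM node, invented for illustration.
class Node {
	final String label;
	final List<Node> children;

	Node(String label, List<Node> children) {
		this.label = label;
		this.children = children;
	}

	// Current approach: render a string key, walking the whole tree and
	// allocating as we go (step 1 of the current implementation).
	String toKeyString() {
		StringBuilder sb = new StringBuilder(label);
		for (Node child : children) {
			sb.append('(').append(child.toKeyString()).append(')');
		}
		return sb.toString();
	}

	// New approach: a structural hashCode walks the whole tree but
	// allocates nothing (step 1 of the new implementation).
	@Override
	public int hashCode() {
		int h = label.hashCode();
		for (Node child : children) {
			h = 31 * h + child.hashCode();
		}
		return h;
	}

	// Structural equals short-circuits at the first differing node, so a
	// hash collision between different trees is usually resolved after
	// comparing just the top couple of nodes; only a genuine hit pays
	// for the full tree walk.
	@Override
	public boolean equals(Object o) {
		if (this == o) return true;
		if (!(o instanceof Node)) return false;
		Node that = (Node) o;
		return label.equals(that.label) && children.equals(that.children);
	}
}
```

With hashCode/equals in place, the plan cache can be a plain Map keyed on the tree itself: a miss costs one allocation-free hash traversal, whereas the string approach pays for rendering the key up front on every lookup.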

@gavinking
Member Author

Note also that setting a generous load factor is completely realistic here, since there's only one such cache (or at most a handful) in the whole program. We're much more interested in not having all those HQL strings floating around occupying memory than we are in scrimping on the load factor for a singleton map.

@beikov
Member

beikov commented Jun 16, 2025

Like I wrote already, the HQL string approach being faster is just my gut feeling, because I assume pointer chasing is more expensive than CPU vectorized computations + allocation. I might just as well be wrong, though it also felt easier to just reuse the HQL string implementation since we already had code around.

If we want to know for sure, we probably will have to test the performance of both, though I think the implementations you added are useful regardless of the actual query cache key strategy we end up using.

@gavinking
Member Author

OK, so that’s a great approach and I 100% endorse taking on questions about performance with a whole lot of humility. But we can’t possibly do any meaningful performance comparisons while this is a pull request. Even if that’s possible in principle, we both know it’s not going to happen in practice. So here we’re just going to have to rely on some intuition. And I don’t feel like one can go very far wrong with the basic intuition that allocating heap memory is a relatively expensive operation in a language with automatic memory management. I don’t know how or why vectorization would change that. Can a garbage collector be efficiently vectorized?

@beikov
Member

beikov commented Jun 16, 2025

And I don’t feel like one can go very far wrong with the basic intuition that allocating heap memory is a relatively expensive operation in a language with automatic memory management. I don’t know how or why vectorization would change that. Can a garbage collector be efficiently vectorized?

That's the thing, if the allocation can happen within a TLAB (thread-local allocation buffer), it might be super fast, but in general, "it depends" ;)
So let's create an issue for measuring the performance and deciding based on hard numbers. Would be nice if the query cache key strategy could be easily configured to allow testing that though.

@gavinking
Member Author

That's the thing, if the allocation can happen within a TLAB, it might be super fast

As far as I understand, there's absolutely no way you could possibly guarantee that this will happen. So even if it does happen in some sort of benchmark where you're hitting the query interpretation cache over and over again in a loop to evaluate which implementation is faster, there's absolutely no way you can say for sure that it would happen in a real running program.

@gavinking
Member Author

Superseded by #10354 due to conflicts.

@gavinking gavinking closed this Jun 16, 2025