Collaborator

@ashu-mehra ashu-mehra commented Dec 9, 2025

This work aims to reduce the time taken to perform call resolution by caching the result of direct calls (static and opt-virtual) in the reloc info during compilation of a method.
Relocations for static and opt-virtual calls already have a method_index field, which stores the "real" method to be invoked by the method handle. It is currently used only during C2 compilations.
This patch reuses the method_index field for static and opt-virtual calls to store the target method. The runtime call (SharedRuntime::resolve_helper) that compiled code uses for call site resolution can then shortcut the resolution by reading the target method from the reloc info and patching the call site through CompiledDirectCall.
No special handling is needed for AOT code.
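
A minimal sketch of the idea described above, using toy types rather than HotSpot's: a call-site relocation carries an index into a per-caller table of pre-recorded targets, and the resolver consults it before falling back to a full lookup. Every name here (Method, CallReloc, CompiledMethod, slow_resolve, resolve_direct_call) is hypothetical, and the convention that index 0 means "nothing recorded" is an assumption of the sketch, not a statement about the patch.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-ins for illustration only; these are not HotSpot classes.
struct Method {
    std::string name;
};

// A call-site relocation with an optional index to a pre-recorded target.
// Assumption of this sketch: index 0 means "nothing recorded".
struct CallReloc {
    int method_index = 0;
};

// Per-caller table that method_index points into (1-based in this sketch).
struct CompiledMethod {
    std::vector<const Method*> metadata;

    const Method* method_at(int index) const {
        if (index <= 0 || static_cast<std::size_t>(index) > metadata.size()) {
            return nullptr;
        }
        return metadata[static_cast<std::size_t>(index) - 1];
    }
};

struct ResolveStats {
    int cache_hits = 0;
    int slow_lookups = 0;
};

using MethodTable = std::unordered_map<std::string, Method>;

// Stand-in for the full resolution path: a lookup by name in a global table.
const Method* slow_resolve(const MethodTable& table,
                           const std::string& callee_name,
                           ResolveStats& stats) {
    ++stats.slow_lookups;
    auto it = table.find(callee_name);
    return it == table.end() ? nullptr : &it->second;
}

// Use the target recorded at compile time when present; otherwise fall back.
const Method* resolve_direct_call(const CompiledMethod& caller,
                                  const CallReloc& reloc,
                                  const MethodTable& table,
                                  const std::string& callee_name,
                                  ResolveStats& stats) {
    if (const Method* cached = caller.method_at(reloc.method_index)) {
        ++stats.cache_hits;
        return cached;
    }
    return slow_resolve(table, callee_name, stats);
}
```

In this toy model the fast path is a bounds check and an array read, while the fallback pays for a hash lookup; the real saving in the patch depends on what the existing resolution path does, which this sketch does not attempt to model.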

On a 4-CPU system there is around a 3% improvement in spring-boot-getting-started. Numbers for JavacBench range from 0 to 3% improvement.

spring-boot-getting-started:

Run,Old CDS + AOT,New CDS + AOT
1,199,192
2,199,196
3,202,197
4,203,198
5,198,196
6,201,194
7,203,197
8,200,193
9,204,193
10,199,201
Geomean,200.79,195.68 (1.03x improvement)
Stdev,1.99,2.61
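
For readers who want to re-derive the summary rows, here is a small sketch that recomputes them from the ten runs listed above. The helper names are hypothetical. The Stdev row appears to be a population standard deviation (dividing by N rather than N - 1), since that is the variant that reproduces 1.99 and 2.61.

```cpp
#include <cmath>
#include <vector>

// Geometric mean computed via the mean of logarithms.
double geometric_mean(const std::vector<double>& xs) {
    double log_sum = 0.0;
    for (double x : xs) {
        log_sum += std::log(x);
    }
    return std::exp(log_sum / static_cast<double>(xs.size()));
}

// Population standard deviation (divides by N, not N - 1).
double population_stdev(const std::vector<double>& xs) {
    double mean = 0.0;
    for (double x : xs) {
        mean += x;
    }
    mean /= static_cast<double>(xs.size());

    double sq_sum = 0.0;
    for (double x : xs) {
        sq_sum += (x - mean) * (x - mean);
    }
    return std::sqrt(sq_sum / static_cast<double>(xs.size()));
}

// Run values copied from the table above.
const std::vector<double> kOldRuns = {199, 199, 202, 203, 198,
                                      201, 203, 200, 204, 199};
const std::vector<double> kNewRuns = {192, 196, 197, 198, 196,
                                      194, 197, 193, 193, 201};
```

Dividing geometric_mean(kOldRuns) by geometric_mean(kNewRuns) gives a ratio of roughly 1.026, which the table rounds to 1.03x.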

-Xlog:init shows the time spent in call resolution from compiled code.
For spring-boot-getting-started before this patch:

[0.357s][info][init] SharedRuntime:
[0.357s][info][init]   resolve_opt_virtual_call:      8260us /  2249 events
[0.357s][info][init]   resolve_virtual_call:          6899us /  1297 events
[0.357s][info][init]   resolve_static_call:           4646us /  1723 events
[0.357s][info][init]   handle_wrong_method:            680us /   145 events
[0.357s][info][init]   ic_miss:                       2109us /   488 events
[0.357s][info][init] Total:                      22596us
[0.357s][info][init]   perf_resolve_static_cache_hit_ctr:     0
[0.357s][info][init]   perf_resolve_opt_virtual_cache_hit_ctr:     0

For spring-boot-getting-started after this patch:

[0.348s][info][init] SharedRuntime:
[0.348s][info][init]   resolve_opt_virtual_call:      2774us /  2251 events
[0.348s][info][init]   resolve_virtual_call:          5577us /  1294 events
[0.348s][info][init]   resolve_static_call:           1901us /  1728 events
[0.348s][info][init]   handle_wrong_method:            719us /   146 events
[0.348s][info][init]   ic_miss:                       2109us /   474 events
[0.348s][info][init] Total:                      13082us
[0.348s][info][init]   perf_resolve_static_cache_hit_ctr:  1704
[0.348s][info][init]   perf_resolve_opt_virtual_cache_hit_ctr:  2202

For JavacBench before this patch:

[0.406s][info][init] SharedRuntime:
[0.406s][info][init]   resolve_opt_virtual_call:      7146us /  2354 events
[0.406s][info][init]   resolve_virtual_call:          7160us /  2207 events
[0.406s][info][init]   resolve_static_call:           2992us /  1264 events
[0.406s][info][init]   handle_wrong_method:            728us /   186 events
[0.406s][info][init]   ic_miss:                       2389us /   675 events
[0.406s][info][init] Total:                      20416us
[0.406s][info][init]   perf_resolve_static_cache_hit_ctr:     0
[0.406s][info][init]   perf_resolve_opt_virtual_cache_hit_ctr:     0

For JavacBench after this patch:

[0.399s][info][init] SharedRuntime:
[0.399s][info][init]   resolve_opt_virtual_call:      2321us /  2346 events
[0.399s][info][init]   resolve_virtual_call:          7452us /  2213 events
[0.399s][info][init]   resolve_static_call:           1264us /  1258 events
[0.399s][info][init]   handle_wrong_method:            747us /   177 events
[0.399s][info][init]   ic_miss:                       2395us /   665 events
[0.399s][info][init] Total:                      14180us
[0.399s][info][init]   perf_resolve_static_cache_hit_ctr:  1212
[0.399s][info][init]   perf_resolve_opt_virtual_cache_hit_ctr:  2236

Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed (1 review required, with at least 1 Committer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/leyden.git pull/106/head:pull/106
$ git checkout pull/106

Update a local copy of the PR:
$ git checkout pull/106
$ git pull https://git.openjdk.org/leyden.git pull/106/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 106

View PR using the GUI difftool:
$ git pr show -t 106

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/leyden/pull/106.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Dec 9, 2025

👋 Welcome back asmehra! A progress list of the required criteria for merging this PR into premain will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Dec 9, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 9, 2025
@mlbridge

mlbridge bot commented Dec 9, 2025

Webrevs

@ashu-mehra
Collaborator Author

To make it convenient to measure the performance impact, the change in SharedRuntime::resolve_helper is guarded by the UseNewCode2 flag. If these changes make sense, I will remove this flag before integrating.

@ashu-mehra
Collaborator Author

@vnkozlov @adinn @iwanowww fyi

Collaborator

@iwanowww iwanowww left a comment

Nice work!

PerfTickCounters* SharedRuntime::_perf_handle_wrong_method_total_time = nullptr;
PerfTickCounters* SharedRuntime::_perf_ic_miss_total_time = nullptr;

uint SharedRuntime::_perf_resolve_static_cache_hit_ctr = 0;
Collaborator

PerfCounters are usually more convenient to use than raw counters. For example, they can be sampled on-the-fly from a live process.

Collaborator Author

Okay. I used these counters only as a quick check of how much static call resolution can be optimized this way. If we go with this approach, I will try to replace them with PerfCounters, or remove them altogether if they are not needed.

AtomicAccess::inc(addr);

if (UseNewCode2) {
  bool is_mhi;
Collaborator

@iwanowww iwanowww Dec 10, 2025

I believe disabling inlining through MH linkers when generating archived code should simplify things. Then there should be no attached methods for MH linkers in archived code, and vice versa.

Collaborator Author

Are you suggesting disabling inlining through MH linkers for both AOT and JIT code, or only for AOT code? If we do it only for AOT code, it wouldn't help unless we decide to apply this optimization only to AOT code. As it stands, it benefits both JIT and AOT code.

Collaborator

I'd assume it is less important for JITed code. The problem is so acute for AOTed code because it's so cheap to retrieve and install it, so we have plenty of AOT code published in a short period during application startup.

@bridgekeeper

bridgekeeper bot commented Jan 8, 2026

@ashu-mehra This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

Labels

rfr Pull request is ready for review

2 participants