Skip to content

8365053: Refresh hotspot precompiled.hpp with headers based on current frequency #26681

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 19 commits into
base: master
Choose a base branch
from
Open
46 changes: 16 additions & 30 deletions src/hotspot/share/precompiled/precompiled.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,52 +26,38 @@
// --disable-precompiled-headers to configure.

// These header files are included in at least 130 C++ files, as of
// measurements made in November 2018. This list excludes files named
// *.inline.hpp, since including them decreased build performance.
// measurements made in August 2025.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's anything particularly special about the number 130.

Another thing to consider when a header file has a high include count is
whether it's being overincluded. We've had lots of those, and some folks
occasionally try to poke at that problem. Some of the removals here look like
they might be a result of such efforts.

Still another thing to consider is the cost of inclusion. Some files may just
be a lot more expensive to process and benefit more for being precompiled.
File size can be an indicator, but there are others. Unfortunately, I don't
know of a good way to measure this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to modify the comment here, too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I'll wait just a bit so the discussion on how to approach the problem stabilizes :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This number is based on my original experimentation. If I tried to lower the bar by lowering the number, the PCH list grew too much and it made for a worse performance. And contrary, if I raised the number, fewer files where included which made the PCH quicker to process but less helpful. That number was the optimum I found. It seems from https://bugs.openjdk.org/browse/JDK-8365053 that this is still around the optimum. However, rather than just mentioning the number in the comment, the rationale could be specified, like the optimum number of includes.


#include "classfile/classLoaderData.hpp"
#include "asm/assembler.hpp"
#include "asm/macroAssembler.hpp"
#include "classfile/javaClasses.hpp"
#include "classfile/systemDictionary.hpp"
#include "classfile/vmSymbols.hpp"
#include "gc/shared/collectedHeap.hpp"
#include "gc/shared/gcCause.hpp"
#include "interpreter/interpreter.hpp"
#include "logging/log.hpp"
#include "memory/allocation.hpp"
#include "logging/logStream.hpp"
#include "memory/allStatic.hpp"
#include "memory/allocation.inline.hpp"
#include "memory/iterator.hpp"
#include "memory/memRegion.hpp"
#include "memory/resourceArea.hpp"
#include "memory/universe.hpp"
#include "nmt/memTracker.hpp"
#include "oops/instanceKlass.hpp"
#include "oops/klass.hpp"
#include "oops/method.hpp"
#include "oops/objArrayKlass.hpp"
#include "oops/objArrayOop.hpp"
#include "oops/oop.hpp"
#include "oops/oopsHierarchy.hpp"
#include "oops/oop.inline.hpp"
#include "runtime/atomic.hpp"
#include "runtime/frame.inline.hpp"
#include "runtime/globals.hpp"
#include "runtime/handles.hpp"
#include "runtime/handles.inline.hpp"
#include "runtime/interfaceSupport.inline.hpp"
#include "runtime/javaThread.hpp"
#include "runtime/mutex.hpp"
#include "runtime/orderAccess.hpp"
#include "runtime/mutexLocker.hpp"
#include "runtime/os.hpp"
#include "runtime/timer.hpp"
#include "runtime/sharedRuntime.hpp"
#include "runtime/stubRoutines.hpp"
#include "utilities/align.hpp"
#include "utilities/bitMap.hpp"
#include "utilities/copy.hpp"
#include "utilities/debug.hpp"
#include "utilities/exceptions.hpp"
#include "utilities/globalDefinitions.hpp"
#include "utilities/growableArray.hpp"
#include "utilities/macros.hpp"
#include "utilities/ostream.hpp"
#include "utilities/ticks.hpp"

#ifdef TARGET_COMPILER_visCPP

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the reported testing was on Linux. These were included specifically because measurements
said their inclusion here was beneficial. And my recollection from previous discussions is that
Visual Studio may be the compiler where precompiled headers are most beneficial.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, Visual Studio is the place where PCH is most needed. I see Erik says he tested on Windows with no difference. While he concluded that this means no regression, I see it as a missed opportunity. Giving the Windows platform a bit of extra love can probably increase compilation speed where it is needed the most.

// For Visual Studio, including the *.inline.hpp files actually
// increased performance.
#include "memory/allocation.inline.hpp"
#include "oops/access.inline.hpp"
#include "oops/oop.inline.hpp"
#include "runtime/handles.inline.hpp"
#endif // TARGET_COMPILER_visCPP
#include "utilities/powerOfTwo.hpp"
134 changes: 134 additions & 0 deletions src/utils/PrecompiledHeaders/PrecompiledHeaders.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@

/*
* Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public final class PrecompiledHeaders {

private static final Pattern INCLUDE_PATTERN = Pattern.compile("^#\\s*include \"([^\"]+)\"$");
private static final String HOTSPOT_PATH = "src/hotspot";
private static final String PRECOMPILED_HPP = "src/hotspot/share/precompiled/precompiled.hpp";
private static final String INLINE_HPP_SUFFIX = ".inline.hpp";

private PrecompiledHeaders() {
throw new UnsupportedOperationException("Instances not allowed");
}

public static void main(String[] args) throws IOException {
if (args.length == 0 || args.length > 2) {
System.err.println("Usage: min_inclusion_count [jdk_root=.]");
System.exit(1);
}

int minInclusionCount = Integer.parseInt(args[0]);
Path jdkRoot = Path.of(args.length == 2 ? args[1] : ".").toAbsolutePath();
if (!Files.isDirectory(jdkRoot)) {
throw new IllegalArgumentException("jdk_root is not a directory: " + jdkRoot);
}
Path hotspotPath = jdkRoot.resolve(HOTSPOT_PATH);
if (!Files.isDirectory(hotspotPath)) {
throw new IllegalArgumentException("Invalid hotspot directory: " + hotspotPath);
}

// Count inclusion times for each header
Map<String, Integer> occurrences = new HashMap<>();
try (Stream<Path> paths = Files.walk(hotspotPath)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think walking the source tree is the wrong approach to gathering the data
about include counts. I think better is to do a build and then look at the
files /hotspot/variant-server/libjvm/objs/*.d. From that one can build
a completely accurate count of the transitive inclusions.

Copy link
Contributor Author

@fandreuz fandreuz Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed this hint, the latest version of the script checks the .d files.

Two observations:

  • The magic number is now 2460 (more includes are taken into account, which makes sense)
  • There is much less wiggle room, 2461 includes nothing, 2459 includes too much
  • Runtime does not seem to be affected negatively or positively

If anybody is interested, this file contains the inclusion count for each source file: inclusions_count.txt

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something seems wrong with these numbers.

  • There are 538 headers with include counts of 2460.
  • There are only 1121 .o.cmdline files.

So how do we get include counts that are greater (by more than a factor of 2)
than the number of .d files?

Note that precompiled.hpp has an include count of 2456. There's 1 .d file that
depends on everything - BUILD_LIBJVM.d. That probably ought to be excluded
from counting, although I guess it should just add one to every file.

Also, precompiled.hpp has an include count of 2456, so not the maximum count,
but close. Which is itself weird that it wouldn't be the most included file.
But it's hard to guess what might be going on with that until the surprising
high include count > number of .d files is understood.

That include count listing would also be more useful of sorted by count.

Copy link
Contributor Author

@fandreuz fandreuz Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that precompiled.hpp has an include count of 2456. There's 1 .d file that
depends on everything - BUILD_LIBJVM.d. That probably ought to be excluded
from counting, although I guess it should just add one to every file.

Apparently BUILD_LIBJVM.d mentions all files multiple times:

% grep "utilities/waitBarrier.hpp" build/clang/hotspot/variant-server/libjvm/objs/BUILD_LIBJVM.d | wc -l
1231

It's definitely reasonable to exclude it, I did it in the latest revision of the script.

Now, the numbers seem more reasonable:

% ls build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | wc -l
1232

% grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | \
            grep -v BUILD | sort | uniq | wc -l
1230

% grep "code/codeCache.hpp" inclusions_count.txt
code/codeCache.hpp=1230

Updated inclusions_count.txt

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this means that other .d files might include a header multiple times, too. Is it reasonable to count only the number of files containing a header instead of number of appearances?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this means that other .d files might include a header multiple times, too.

From the following two numbers, I deduce code/codeCache.hpp is (transitively) included in 1230 files, with each inclusion coming from a unique .d file. Am I missing something?

% grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | sort | wc -l
1230

% grep "code/codeCache.hpp" build/clang/hotspot/variant-server/libjvm/objs/*.d | grep -v BUILD | sort | uniq | wc -l
1230

I haven't checked every single file, but I'd expect each include to appear at most once in each .d file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, I think that it is fine, then, although a comment explaining in the code would be great.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that the multiple includes reported in the .d files are accurate. The .d file appears
to report the number of times we #include a file, however, because if the include-guard
constructs we use, the content is pulled in only once.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of times a header is (directly or indirectly) included by a given
source file doesn't affect the number of times that header appears in the .d
file associated with that source file. The compiler uniquifies that set
(though links might confuse things, but that's not an issue here).

The .d files consist of a set of targets, and for each target, the set of
dependencies.

For the .d files, there's one target, .o, with its
dependencies (once each).

For BUILD_LIBJVM.d, there's a target for each .o file. And for each of those
targets, it's dependencies (once each). Essentially BUILD_LIBJVM.d includes a
concatenation of all the .d files. And since there are duplicates
among the dependencies of different targets, there are many(!) duplicates in
this file.

paths.filter(Files::isRegularFile)
.filter(path -> {
String name = path.getFileName().toString();
return name.endsWith(".cpp") || name.endsWith(".hpp");
})
.flatMap(path -> {
try {
return Files.lines(path);
} catch (IOException exception) {
throw new UncheckedIOException(exception);
}
})
.map(INCLUDE_PATTERN::matcher)
.filter(Matcher::matches)
.map(matcher -> matcher.group(1))
.forEach(include -> occurrences.compute(include,
(k, old) -> Objects.requireNonNullElse(old, 0) + 1));
}

List<String> inlineIncludes = occurrences.keySet().stream()
.filter(s -> s.endsWith(INLINE_HPP_SUFFIX))
.toList();

// Each .inline.hpp pulls in the non-inline version too, so we can increase its counter
for (String inlineInclude : inlineIncludes) {
int inlineCount = occurrences.get(inlineInclude);
String noInlineInclude = inlineInclude.replace(INLINE_HPP_SUFFIX, ".hpp");
int noInlineCount = Objects.requireNonNullElse(
occurrences.get(noInlineInclude), 0);
occurrences.put(noInlineInclude, inlineCount + noInlineCount);
}

// Keep only the headers which are included at least 'minInclusionCount' times
Set<String> headers = occurrences.entrySet().stream()
.filter(entry -> entry.getValue() > minInclusionCount)
.map(Map.Entry::getKey)
.collect(Collectors.toSet());

// If both inline and non-inline headers are to be included, prefer the inline header
for (String inlineInclude : inlineIncludes) {
if (headers.contains(inlineInclude)) {
String noInlineInclude = inlineInclude.replace(INLINE_HPP_SUFFIX, ".hpp");
headers.remove(noInlineInclude);
}
}

Path precompiledHpp = jdkRoot.resolve(PRECOMPILED_HPP);
try (Stream<String> lines = Files.lines(precompiledHpp)) {
String precompiledHppHeader = lines
.takeWhile(Predicate.not(s -> INCLUDE_PATTERN.matcher(s).matches()))
.collect(Collectors.joining(System.lineSeparator()));
Files.write(precompiledHpp, precompiledHppHeader.getBytes());
}

String headerLines = headers.stream()
.sorted()
.map(header -> String.format("#include \"%s\"", header))
.collect(Collectors.joining(System.lineSeparator()));
Files.write(precompiledHpp,
(System.lineSeparator() + headerLines + System.lineSeparator()).getBytes(),
StandardOpenOption.APPEND);
}

}
23 changes: 23 additions & 0 deletions src/utils/PrecompiledHeaders/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# Hotspot precompiled headers

This directory contains a simple tool to refresh the current set of precompiled headers
in `src/hotspot`. The headers are selected according to how frequently they are included
in Hotspot source code.

## Usage

The script requires a parameter which determines the minimum inclusion count a header
must reach in order to be precompiled. Optionally, the root path to the JDK project can
be specified as the second parameter.

```bash
$ javac src/utils/PrecompiledHeaders/PrecompiledHeaders.java
$ java -cp src/utils/PrecompiledHeaders PrecompiledHeaders min_inclusion_count [jdk_root=.]
```

The script will write to `src/hotspot/share/precompiled/precompiled.hpp` the new set of
headers selected to be precompiled.

## Related tickets
- [JDK-8213339](https://bugs.openjdk.org/browse/JDK-8213339)
- [JDK-8365053](https://bugs.openjdk.org/browse/JDK-8365053)