@christopherbate christopherbate commented Oct 4, 2025

I profiled initial CMake configuration and generation (Ninja) steps in LLVM-Project with just LLVM, MLIR, and Clang enabled using the command shown below. Based on the profile, I then implemented a number of optimizations.

All the optimizations are hidden behind extra flags whose defaults preserve the existing behavior (except for the check_linker_flag optimization below and the optimization of GoogleBenchmark flags).

LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS
LLVM_ENABLE_LIT_CONVENIENCE_TARGETS
LLVM_TOOLCHAIN_CHECK_CACHE

Initial time of cmake command @ 679d2b2 on my workstation:

-- Configuring done (17.8s)
-- Generating done (6.9s)

After all below optimizations:

-- Configuring done (12.8s)
-- Generating done (4.7s)

With a "toolchain check cache" (explained below):

-- Configuring done (6.9s)
-- Generating done (4.3s)

There's definitely room for more optimizations; I think <10sec end-to-end for this command is doable.

Most changes have a small impact. It's the gradual creep of inefficiencies that have added up over time to make the system less efficient than it could be.

Command tested:

cmake -G Ninja -S llvm -B ${buildDir} \
		-DLLVM_ENABLE_PROJECTS="mlir;clang" \
		-DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \
		-DCMAKE_BUILD_TYPE=RelWithDebInfo \
		-DLLVM_ENABLE_ASSERTIONS=ON \
		-DLLVM_CCACHE_BUILD=ON \
		-DBUILD_SHARED_LIBS=ON \
		-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_USE_LINKER=lld \
		-DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
		-DMLIR_ENABLE_BINDINGS_PYTHON=ON \
		--fresh

To enable the new optimizations, set

-DLLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS=OFF
-DLLVM_ENABLE_LIT_CONVENIENCE_TARGETS=OFF
-DLLVM_TOOLCHAIN_CHECK_CACHE=$(pwd)/toolchain-check-cache.cmake
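
One way to avoid retyping these flags is an initial-cache script passed with cmake -C. A minimal sketch, assuming a hypothetical file name fast-config.cmake; only the three option names come from this PR:

```cmake
# fast-config.cmake (hypothetical file name)
# Usage: cmake -C fast-config.cmake -G Ninja -S llvm -B <build dir> ...

# Skip the unused-source-file check and the non-MSVC header globbing.
set(LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS OFF CACHE BOOL "")

# Skip generation of the per-subdirectory check-* targets.
set(LLVM_ENABLE_LIT_CONVENIENCE_TARGETS OFF CACHE BOOL "")

# Keep the toolchain check cache next to this script (an assumption;
# any writable path outside the build directory would do).
set(LLVM_TOOLCHAIN_CHECK_CACHE
    "${CMAKE_CURRENT_LIST_DIR}/toolchain-check-cache.cmake" CACHE PATH "")
```

Values set this way behave like -D arguments, so they can still be overridden on the command line.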

Optimizations:

Optimize check_linker_flag calls

In AddLLVM.cmake, there were a couple places where we call check_linker_flag every time llvm_add_library is called. Even in non-initial cmake configuration runs, this carries unreasonable overhead.

Change: Hoist include(CheckLinkerFlag) to the top of AddLLVM.cmake and move the check_linker_flag calls so that they are only made once.

Impact: <1 sec
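
The hoisting pattern, reduced to a sketch. The DEMO_ prefixed variable and demo_add_link_opts are hypothetical stand-ins rather than the code in AddLLVM.cmake; check_linker_flag comes from CMake's CheckLinkerFlag module (CMake 3.18 or newer):

```cmake
include(CheckLinkerFlag)

# Probe once, at include time. The result is stored in the cache.
check_linker_flag(CXX "-Wl,-Bsymbolic-functions"
                  DEMO_LINKER_SUPPORTS_B_SYMBOLIC_FUNCTIONS)

# Called once per target. It only reads the cached result, so no
# check_linker_flag call sits on the per-library path.
function(demo_add_link_opts target_name)
  if(DEMO_LINKER_SUPPORTS_B_SYMBOLIC_FUNCTIONS)
    target_link_options(${target_name} PRIVATE "-Wl,-Bsymbolic-functions")
  endif()
endfunction()
```

A repeated check_linker_flag call does not recompile once its result variable is cached, but each call still pays for the function invocation and its setup, which adds up over many library targets.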

Make add_lit_testsuites optional

The function add_lit_testsuites is used to recursively populate a set of convenience targets that run a filtered portion of a LIT test suite. So instead of running check-mlir you can run check-mlir-dialect. These targets are built recursively for each subdirectory (e.g. check-mlir-dialect-tensor, check-mlir-dialect-math, etc.).

This call has quite a bit of overhead, especially for the main LLVM LIT test suite.

Personally I use a combination of ninja -C build check-mlir-build-only and llvm-lit directly to run filtered portions of the MLIR LIT test suite, but I can imagine that others depend on these filtered targets.

Change: Introduce a new option LLVM_ENABLE_LIT_CONVENIENCE_TARGETS which defaults to ON. When set to OFF, the function add_lit_testsuites just becomes a no-op. It's possible that we could also just improve the performance of add_lit_testsuites directly, but I didn't pursue this.

Impact: ~1-2sec

Reduce file(GLOB) calls in LLVMProcessSources.cmake

The llvm_process_sources call is made whenever the llvm_add_library function is called. It makes several file(GLOB) calls, which can be expensive depending on the underlying filesystem/storage. The function globs for headers and TD files to add as sources to the target, but the comments suggest that this is only necessary for MSVC. In addition, it calls llvm_check_source_file_list to check that no source files in the directory are unused unless PARTIAL_SOURCES_INTENDED is set, which incurs another file(GLOB) call.

Changes: Guard the file(GLOB) calls for populating header sources behind if(MSVC). Only do the llvm_check_source_file_list check if a new option LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS is set to ON.

Impact: depends on system. On my local workstation, impact is minimal. On another remote server I use, impact is much larger.
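
The guard itself is a small pattern. A sketch with a hypothetical helper demo_collect_ide_headers (not the function in LLVMProcessSources.cmake), assuming headers and .td files are only wanted for Visual Studio project trees:

```cmake
# Hypothetical helper: return header/.td files in `dir`, but only when
# the generator benefits from listing them.
function(demo_collect_ide_headers out_var dir)
  set(_hdrs "")
  if(MSVC)
    # The directory scan is the expensive part on slow filesystems,
    # so it is skipped entirely for other toolchains.
    file(GLOB _hdrs "${dir}/*.h" "${dir}/*.td")
  endif()
  set(${out_var} "${_hdrs}" PARENT_SCOPE)
endfunction()
```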

Optimize initial symbol/flag checks made in config-ix.cmake and HandleLLVMOptions.cmake

The config-ix.cmake and HandleLLVMOptions.cmake files make a number of calls to compile C/C++ programs in order to verify the presence of certain symbols or whether certain compiler flags are supported.

These checks have the biggest impact on initial CMake configuration time.

I propose an "opt in" approach for amortizing these checks using a special generated CMake cache file as directed by the developer.

An option LLVM_TOOLCHAIN_CHECK_CACHE is introduced. It should be set to a path like -DLLVM_TOOLCHAIN_CHECK_CACHE=$PWD/.toolchain-check-cache.cmake.

Before entering the config-ix.cmake and HandleLLVMOptions.cmake files, if the LLVM_TOOLCHAIN_CHECK_CACHE option is set and the file exists, include that file to pre-populate cache variables. Otherwise, we save the current set of CMake cache variable names. After calling the config-ix|HandleLLVMOptions files, if the LLVM_TOOLCHAIN_CHECK_CACHE option is set but the file does not exist, check which new CMake cache variables were set by those scripts. Filter these variables by whether they are likely cache variables supporting symbol/flag checks (e.g. CXX_SUPPORTS_.*|HAVE_.* etc.) and write a file that sets all these cache variables to their current values.
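
For orientation, the generated file is a plain CMake script of cache set() calls, one per recorded variable. A sketch of what it might contain; the variable names, values, and types below are illustrative assumptions, not output captured from a real run:

```cmake
# toolchain-check-cache.cmake (illustrative contents only)
set(HAVE_UNISTD_H "1" CACHE INTERNAL "")
set(C_SUPPORTS_FPIC "1" CACHE INTERNAL "")
# A failed check is recorded too, as an empty value.
set(CXX_SUPPORTS_SOME_FLAG "" CACHE INTERNAL "")
```

Including such a file early works because CMake's check_* modules skip the try-compile when their result variable is already defined, even if it is defined as empty.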

This allows a developer to skip any subsequent checks, even in initial cmake configuration runs. Correctness depends on the developer knowing when the cache is invalid (e.g. they change toolchains or platforms) and on us not suddenly changing the meaning of CXX_SUPPORTS_SOME_FLAG to correspond to a different flag.

The cache file could be extended to store a key used to decide whether to regenerate the cache, but I didn't go there.
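
One possible shape for such a key, as a sketch only: hash the compiler path, ID, version, and target system, store the hash beside the cache file, and discard the cache when the hash changes. Every name prefixed _demo_ and the .key sidecar file are assumptions, not part of this PR:

```cmake
# Must run after project(), once the compiler variables are known.
string(SHA256 _demo_key
  "${CMAKE_CXX_COMPILER};${CMAKE_CXX_COMPILER_ID};${CMAKE_CXX_COMPILER_VERSION};${CMAKE_SYSTEM_NAME};${CMAKE_SYSTEM_PROCESSOR}")

set(_demo_cache_file "${LLVM_TOOLCHAIN_CHECK_CACHE}")
set(_demo_key_file "${_demo_cache_file}.key")  # hypothetical sidecar file

# The cache is trusted only if both files exist and the stored key matches.
set(_demo_cache_is_valid FALSE)
if(EXISTS "${_demo_cache_file}" AND EXISTS "${_demo_key_file}")
  file(READ "${_demo_key_file}" _demo_stored_key)
  if("${_demo_stored_key}" STREQUAL "${_demo_key}")
    set(_demo_cache_is_valid TRUE)
  endif()
endif()

if(_demo_cache_is_valid)
  include("${_demo_cache_file}")
else()
  # Stale or missing: remove the old cache so that it is regenerated,
  # and record the key for the toolchain in use now.
  file(REMOVE "${_demo_cache_file}")
  file(WRITE "${_demo_key_file}" "${_demo_key}")
endif()
```

The choice of inputs to the hash is the open design question; anything that can change a check result without changing these five values (for example a different sysroot or extra flags) would need to be added.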

Impact: Trivial overhead for cache generation, ~5sec reduction in initial config time.

Reduce overhead of embedded Google Benchmark configuration

Note: technically this could be lumped in with the above if we expanded the scope of the before/after snapshot that LLVM_TOOLCHAIN_CHECK_CACHE covers.

You can disable Google Benchmark entirely with LLVM_INCLUDE_BENCHMARKS=OFF, but most CI systems don't set that.

GoogleBenchmark is embedded under the third-party/benchmark directory. Its CMake script does a compilation check for each flag that it wants to populate (even for -Wall). In comparison, LLVM's HandleLLVMOptions.cmake script takes a faster approach by skipping as many compilation checks as possible if the cache variable LLVM_COMPILER_IS_GCC_COMPATIBLE is true.

Changes: Use LLVM_COMPILER_IS_GCC_COMPATIBLE to skip as many compilation checks as possible in GoogleBenchmark.

Impact: ~1-2sec
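
A sketch of the skip-the-probe idea, under the assumption that the flags passed in are accepted by every GCC-compatible compiler in use. demo_add_cxx_flag and the DEMO_ result variables are hypothetical, not GoogleBenchmark's add_cxx_compiler_flag. It is written as a macro so the change to CMAKE_CXX_FLAGS lands in the caller's scope, and it uses string(APPEND) because CMAKE_CXX_FLAGS is a space-separated string rather than a list:

```cmake
include(CheckCXXCompilerFlag)

macro(demo_add_cxx_flag flag)
  if(LLVM_COMPILER_IS_GCC_COMPATIBLE)
    # Trust the compiler family and skip the try-compile.
    string(APPEND CMAKE_CXX_FLAGS " ${flag}")
  else()
    # Fall back to a real probe; derive a valid variable name from the flag.
    string(MAKE_C_IDENTIFIER "DEMO_HAVE_CXX_FLAG_${flag}" _demo_result_var)
    check_cxx_compiler_flag("${flag}" ${_demo_result_var})
    if(${_demo_result_var})
      string(APPEND CMAKE_CXX_FLAGS " ${flag}")
    endif()
  endif()
endmacro()

demo_add_cxx_flag(-Wall)
demo_add_cxx_flag(-Wextra)
```

Flags that only one compiler family understands (the list in this file includes -Wshorten-64-to-32 and -wd654) may still need a real check rather than the shortcut.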

@llvmbot added the cmake (Build system in general and CMake in particular) and third-party:benchmark labels Oct 4, 2025

llvmbot commented Oct 4, 2025

@llvm/pr-subscribers-third-party-benchmark

Author: Christopher Bate (christopherbate)

Changes

I profiled initial CMake configuration and generation (Ninja) steps in LLVM-Project with just LLVM, MLIR, and Clang enabled using the command shown below. Based on the profile, I then implemented a number of optimizations.

All the optimizations are hidden behind extra flags whose defaults preserve the existing behavior (except for the check_linker_flag optimization below):

LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS
LLVM_ENABLE_LIT_CONVENIENCE_TARGETS
BENCHMARK_ENABLE_EXPENSIVE_CMAKE_CHECKS
LLVM_TOOLCHAIN_CHECK_CACHE

Initial time of cmake command @ 679d2b2 on my workstation:

-- Configuring done (17.8s)
-- Generating done (6.9s)

After all below optimizations:

-- Configuring done (13.4s)
-- Generating done (4.7s)

With a "toolchain check cache" (explained below):

-- Configuring done (8.2s)
-- Generating done (4.7s)

There's definitely room for more optimizations; I think <10sec end-to-end for this command is doable.

Most changes have a small impact. It's the gradual creep of inefficiencies that have added up over time to make the system less efficient than it could be.

Command tested:

cmake -G Ninja -S llvm -B ${buildDir} \
		-DLLVM_ENABLE_PROJECTS="mlir;clang" \
		-DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \
		-DCMAKE_BUILD_TYPE=RelWithDebInfo \
		-DLLVM_ENABLE_ASSERTIONS=ON \
		-DLLVM_CCACHE_BUILD=ON \
		-DBUILD_SHARED_LIBS=ON \
		-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_USE_LINKER=lld \
		-DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
		-DMLIR_ENABLE_BINDINGS_PYTHON=ON \
		--fresh

To enable optimizations, set

-DLLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS=OFF
-DLLVM_ENABLE_LIT_CONVENIENCE_TARGETS=OFF
-DBENCHMARK_ENABLE_EXPENSIVE_CMAKE_CHECKS=OFF
-DLLVM_TOOLCHAIN_CHECK_CACHE=$(pwd)/toolchain-check-cache.cmake

Optimizations:

Optimize check_linker_flag calls

In AddLLVM.cmake, there were a couple places where we call check_linker_flag every time llvm_add_library is called. Even in non-initial cmake configuration runs, this carries unreasonable overhead.

Change: Hoist include(CheckLinkerFlag) to the top of AddLLVM.cmake and move the check_linker_flag calls so that they are only made once.

Impact: <1 sec

Make add_lit_testsuites optional

The function add_lit_testsuites is used to recursively populate a set of convenience targets that run a filtered portion of a LIT test suite. So instead of running check-mlir you can run check-mlir-dialect. These targets are built recursively for each subdirectory (e.g. check-mlir-dialect-tensor, check-mlir-dialect-math, etc.).

This call has quite a bit of overhead, especially for the main LLVM LIT test suite.

Personally I use a combination of ninja -C build check-mlir-build-only and llvm-lit directly to run filtered portions of the MLIR LIT test suite, but I can imagine that others depend on these filtered targets.

Change: Introduce a new option LLVM_ENABLE_LIT_CONVENIENCE_TARGETS which defaults to ON. When set to OFF, the function add_lit_testsuites just becomes a no-op. It's possible that we could also just improve the performance of add_lit_testsuites directly, but I didn't pursue this.

Impact: ~1-2sec

Reduce file(GLOB) calls in LLVMProcessSources.cmake

The llvm_process_sources call is made whenever the llvm_add_library function is called. It makes several file(GLOB) calls, which can be expensive depending on the underlying filesystem/storage. The function globs for headers and TD files to add as sources to the target, but the comments suggest that this is only necessary for MSVC. In addition, it calls llvm_check_source_file_list to check that no source files in the directory are unused unless PARTIAL_SOURCES_INTENDED is set, which incurs another file(GLOB) call.

Changes: Guard the file(GLOB) calls for populating header sources behind if(MSVC). Only do the llvm_check_source_file_list check if a new option LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS is set to ON.

Impact: depends on system. On my local workstation, impact is minimal. On another remote server I use, impact is much larger.

Optimize initial symbol/flag checks made in config-ix.cmake and HandleLLVMOptions.cmake

The config-ix.cmake and HandleLLVMOptions.cmake files make a number of calls to compile C/C++ programs in order to verify the presence of certain symbols or whether certain compiler flags are supported.

These checks have the biggest impact on initial CMake configuration time.

I propose an "opt in" approach for amortizing these checks using a special generated CMake cache file as directed by the developer.

An option LLVM_TOOLCHAIN_CHECK_CACHE is introduced. It should be set to a path like -DLLVM_TOOLCHAIN_CHECK_CACHE=$PWD/.toolchain-check-cache.cmake.

Before entering the config-ix.cmake and HandleLLVMOptions.cmake files, if the LLVM_TOOLCHAIN_CHECK_CACHE option is set and the file exists, include that file to pre-populate cache variables. Otherwise, we save the current set of CMake cache variable names. After calling the config-ix|HandleLLVMOptions files, if the LLVM_TOOLCHAIN_CHECK_CACHE option is set but the file does not exist, check which new CMake cache variables were set by those scripts. Filter these variables by whether they are likely cache variables supporting symbol/flag checks (e.g. CXX_SUPPORTS_.*|HAVE_.* etc.) and write a file that sets all these cache variables to their current values.

This allows a developer to skip any subsequent checks, even in initial cmake configuration runs. Correctness depends on the developer knowing when the cache is invalid (e.g. they change toolchains or platforms) and on us not suddenly changing the meaning of CXX_SUPPORTS_SOME_FLAG to correspond to a different flag.

The cache file could be extended to store a key used to decide whether to regenerate the cache, but I didn't go there.

Impact: Trivial overhead for cache generation, ~5sec reduction in initial config time.

Reduce overhead of embedded Google Benchmark configuration

Note: technically this could be lumped in with the above if we expanded the scope of the before/after snapshot that LLVM_TOOLCHAIN_CHECK_CACHE covers.

GoogleBenchmark is embedded under the third-party/benchmark directory. Its CMake script does a compilation check for each flag that it wants to populate (even for -Wall). In comparison, LLVM's HandleLLVMOptions.cmake script takes a faster approach by skipping as many compilation checks as possible if the cache variable LLVM_COMPILER_IS_GCC_COMPATIBLE is true.

Changes: Use LLVM_COMPILER_IS_GCC_COMPATIBLE to skip as many compilation checks as possible in GoogleBenchmark.

Impact: ~1-2sec


Full diff: https://github.com/llvm/llvm-project/pull/161981.diff

5 Files Affected:

  • (modified) llvm/CMakeLists.txt (+21-1)
  • (modified) llvm/cmake/modules/AddLLVM.cmake (+30-21)
  • (added) llvm/cmake/modules/LLVMCacheSnapshot.cmake (+51)
  • (modified) llvm/cmake/modules/LLVMProcessSources.cmake (+16-8)
  • (modified) third-party/benchmark/CMakeLists.txt (+31-22)
diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index c450ee5a3d72e..d52bd84814aec 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -867,6 +867,11 @@ option(LLVM_INSTALL_GTEST
   "Install the llvm gtest library.  This should be on if you want to do
    stand-alone builds of the other projects and run their unit tests." OFF)
 
+option(LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS
+  "Enable use of expensive CMake checks for unused source files" ON)
+option(LLVM_ENABLE_LIT_CONVENIENCE_TARGETS
+  "Enable use of convenience targets for all subdirectories of a LIT test suite" ON)
+
 option(LLVM_BUILD_BENCHMARKS "Add LLVM benchmark targets to the list of default
 targets. If OFF, benchmarks still could be built using Benchmarks target." OFF)
 option(LLVM_INCLUDE_BENCHMARKS "Generate benchmark targets. If OFF, benchmarks can't be built." ON)
@@ -1027,9 +1032,18 @@ endif()
 find_package(Python3 ${LLVM_MINIMUM_PYTHON_VERSION} REQUIRED
     COMPONENTS Interpreter)
 
+set(LLVM_TOOLCHAIN_CHECK_CACHE "" CACHE PATH
+  "Path to where a generated *.cmake cache file will be saved.")
+
+include(LLVMCacheSnapshot)
 # All options referred to from HandleLLVMOptions have to be specified
 # BEFORE this include, otherwise options will not be correctly set on
-# first cmake run
+# first cmake run.
+if(LLVM_TOOLCHAIN_CHECK_CACHE AND EXISTS "${LLVM_TOOLCHAIN_CHECK_CACHE}")
+  include("${LLVM_TOOLCHAIN_CHECK_CACHE}")
+elseif(LLVM_TOOLCHAIN_CHECK_CACHE)
+  llvm_get_list_of_existing_cache_variables(cache_before)
+endif()
 include(config-ix)
 
 # By default, we target the host, but this can be overridden at CMake
@@ -1081,6 +1095,12 @@ endif()
 
 include(HandleLLVMOptions)
 
+if(LLVM_TOOLCHAIN_CHECK_CACHE AND NOT EXISTS "${LLVM_TOOLCHAIN_CHECK_CACHE}")
+  llvm_list_of_new_cache_variables_and_values(cache_before cache_new_pairs)
+  list(JOIN cache_new_pairs "\n" cache_new_pairs_joined)
+  file(WRITE "${LLVM_TOOLCHAIN_CHECK_CACHE}" "${cache_new_pairs_joined}")
+endif()
+
 ######
 
 # Configure all of the various header file fragments LLVM uses which depend on
diff --git a/llvm/cmake/modules/AddLLVM.cmake b/llvm/cmake/modules/AddLLVM.cmake
index 80e59a4df2433..f26dd7aa549c5 100644
--- a/llvm/cmake/modules/AddLLVM.cmake
+++ b/llvm/cmake/modules/AddLLVM.cmake
@@ -3,17 +3,18 @@ include(LLVMDistributionSupport)
 include(LLVMProcessSources)
 include(LLVM-Config)
 include(DetermineGCCCompatible)
+include(CheckLinkerFlag)
 
 # get_subproject_title(titlevar)
 #   Set ${outvar} to the title of the current LLVM subproject (Clang, MLIR ...)
-# 
+#
 # The title is set in the subproject's top-level using the variable
 # LLVM_SUBPROJECT_TITLE. If it does not exist, it is assumed it is LLVM itself.
 # The title is not semantically significant, but use to create folders in
 # CMake-generated IDE projects (Visual Studio/XCode).
 function(get_subproject_title outvar)
   if (LLVM_SUBPROJECT_TITLE)
-    set(${outvar} "${LLVM_SUBPROJECT_TITLE}" PARENT_SCOPE) 
+    set(${outvar} "${LLVM_SUBPROJECT_TITLE}" PARENT_SCOPE)
   else ()
     set(${outvar} "LLVM" PARENT_SCOPE)
   endif ()
@@ -269,7 +270,6 @@ if (NOT DEFINED LLVM_LINKER_DETECTED AND NOT WIN32)
   endif()
 
   if("${CMAKE_SYSTEM_NAME}" MATCHES "Darwin")
-    include(CheckLinkerFlag)
     # Linkers that support Darwin allow a setting to internalize all symbol exports,
     # aiding in reducing binary size and often is applicable for executables.
     check_linker_flag(C "-Wl,-no_exported_symbols" LLVM_LINKER_SUPPORTS_NO_EXPORTED_SYMBOLS)
@@ -289,8 +289,23 @@ if (NOT DEFINED LLVM_LINKER_DETECTED AND NOT WIN32)
   endif()
 endif()
 
+if (NOT uppercase_CMAKE_BUILD_TYPE STREQUAL "DEBUG")
+  if(NOT LLVM_NO_DEAD_STRIP)
+    if("${CMAKE_SYSTEM_NAME}" MATCHES "SunOS" AND LLVM_LINKER_IS_SOLARISLD)
+      # Support for ld -z discard-unused=sections was only added in
+      # Solaris 11.4.  GNU ld ignores it, but warns every time.
+      check_linker_flag(CXX "-Wl,-z,discard-unused=sections" LINKER_SUPPORTS_Z_DISCARD_UNUSED)
+    endif()
+  endif()
+endif()
+
+# Check for existence of symbolic functions flag. Not supported
+# by the older BFD linker (such as on some OpenBSD archs), the
+# MinGW driver for LLD, and the Solaris native linker.
+check_linker_flag(CXX "-Wl,-Bsymbolic-functions"
+                  LLVM_LINKER_SUPPORTS_B_SYMBOLIC_FUNCTIONS)
+
 function(add_link_opts target_name)
-  include(CheckLinkerFlag)
   get_llvm_distribution(${target_name} in_distribution in_distribution_var)
   if(NOT in_distribution)
     # Don't LTO optimize targets that aren't part of any distribution.
@@ -320,9 +335,6 @@ function(add_link_opts target_name)
         set_property(TARGET ${target_name} APPEND_STRING PROPERTY
                      LINK_FLAGS " -Wl,-dead_strip")
       elseif("${CMAKE_SYSTEM_NAME}" MATCHES "SunOS" AND LLVM_LINKER_IS_SOLARISLD)
-        # Support for ld -z discard-unused=sections was only added in
-        # Solaris 11.4.  GNU ld ignores it, but warns every time.
-        check_linker_flag(CXX "-Wl,-z,discard-unused=sections" LINKER_SUPPORTS_Z_DISCARD_UNUSED)
         if (LINKER_SUPPORTS_Z_DISCARD_UNUSED)
           set_property(TARGET ${target_name} APPEND_STRING PROPERTY
                        LINK_FLAGS " -Wl,-z,discard-unused=sections")
@@ -349,12 +361,6 @@ function(add_link_opts target_name)
     set_property(TARGET ${target_name} APPEND_STRING PROPERTY
                  LINK_FLAGS " -Wl,-brtl")
   endif()
-
-  # Check for existence of symbolic functions flag. Not supported
-  # by the older BFD linker (such as on some OpenBSD archs), the
-  # MinGW driver for LLD, and the Solaris native linker.
-  check_linker_flag(CXX "-Wl,-Bsymbolic-functions"
-                    LLVM_LINKER_SUPPORTS_B_SYMBOLIC_FUNCTIONS)
 endfunction(add_link_opts)
 
 # Set each output directory according to ${CMAKE_CONFIGURATION_TYPES}.
@@ -645,11 +651,11 @@ function(llvm_add_library name)
   endif()
   set_target_properties(${name} PROPERTIES FOLDER "${subproject_title}/Libraries")
 
-  ## If were compiling with clang-cl use /Zc:dllexportInlines- to exclude inline 
+  ## If were compiling with clang-cl use /Zc:dllexportInlines- to exclude inline
   ## class members from being dllexport'ed to reduce compile time.
   ## This will also keep us below the 64k exported symbol limit
   ## https://blog.llvm.org/2018/11/30-faster-windows-builds-with-clang-cl_14.html
-  if(LLVM_BUILD_LLVM_DYLIB AND NOT LLVM_DYLIB_EXPORT_INLINES AND 
+  if(LLVM_BUILD_LLVM_DYLIB AND NOT LLVM_DYLIB_EXPORT_INLINES AND
      MSVC AND CMAKE_CXX_COMPILER_ID MATCHES Clang)
     target_compile_options(${name} PUBLIC /Zc:dllexportInlines-)
     if(TARGET ${obj_name})
@@ -1500,8 +1506,8 @@ macro(llvm_add_tool project name)
                 RUNTIME DESTINATION ${${project}_TOOLS_INSTALL_DIR}
                 COMPONENT ${name})
         if (LLVM_ENABLE_PDB)
-          install(FILES $<TARGET_PDB_FILE:${name}> 
-                DESTINATION "${${project}_TOOLS_INSTALL_DIR}" COMPONENT ${name} 
+          install(FILES $<TARGET_PDB_FILE:${name}>
+                DESTINATION "${${project}_TOOLS_INSTALL_DIR}" COMPONENT ${name}
                 OPTIONAL)
         endif()
 
@@ -1535,8 +1541,8 @@ macro(add_llvm_example name)
   if( LLVM_BUILD_EXAMPLES )
     install(TARGETS ${name} RUNTIME DESTINATION "${LLVM_EXAMPLES_INSTALL_DIR}")
     if (LLVM_ENABLE_PDB)
-      install(FILES $<TARGET_PDB_FILE:${name}> 
-              DESTINATION "${LLVM_EXAMPLES_INSTALL_DIR}" COMPONENT ${name} 
+      install(FILES $<TARGET_PDB_FILE:${name}>
+              DESTINATION "${LLVM_EXAMPLES_INSTALL_DIR}" COMPONENT ${name}
               OPTIONAL)
     endif()
   endif()
@@ -1574,8 +1580,8 @@ macro(add_llvm_utility name)
               RUNTIME DESTINATION ${LLVM_UTILS_INSTALL_DIR}
               COMPONENT ${name})
       if (LLVM_ENABLE_PDB)
-        install(FILES $<TARGET_PDB_FILE:${name}> 
-                DESTINATION "${LLVM_UTILS_INSTALL_DIR}" COMPONENT ${name} 
+        install(FILES $<TARGET_PDB_FILE:${name}>
+                DESTINATION "${LLVM_UTILS_INSTALL_DIR}" COMPONENT ${name}
                 OPTIONAL)
       endif()
 
@@ -2192,6 +2198,9 @@ function(add_lit_testsuite target comment)
 endfunction()
 
 function(add_lit_testsuites project directory)
+  if(NOT LLVM_ENABLE_LIT_CONVENIENCE_TARGETS)
+    return()
+  endif()
   if (NOT LLVM_ENABLE_IDE)
     cmake_parse_arguments(ARG
       "EXCLUDE_FROM_CHECK_ALL"
diff --git a/llvm/cmake/modules/LLVMCacheSnapshot.cmake b/llvm/cmake/modules/LLVMCacheSnapshot.cmake
new file mode 100644
index 0000000000000..89d592a0a4165
--- /dev/null
+++ b/llvm/cmake/modules/LLVMCacheSnapshot.cmake
@@ -0,0 +1,51 @@
+# Example usage
+# llvm_get_cache_vars(before)
+# include(SomeModule)
+# llvm_diff_cache_vars("${before}" new_vars new_pairs)
+
+# message(STATUS "New cache variables: ${new_vars}")
+# message(STATUS "New cache vars and values:\n${new_pairs}")
+
+# get_list_of_existing_cache_variables(existing)
+function(llvm_get_list_of_existing_cache_variables out_var)
+  get_cmake_property(_all CACHE_VARIABLES)
+  if(NOT _all)
+    set(_all "")
+  endif()
+  set(${out_var} "${_all}" PARENT_SCOPE)
+endfunction()
+
+# list_of_new_cache_variables_and_values(existing new_vars_and_values)
+# - `existing` is the name of the var returned by the first helper
+# - `new_vars_and_values` will be a list like:  NAME=VALUE (TYPE=...);NAME2=VALUE2 (TYPE=...)
+function(llvm_list_of_new_cache_variables_and_values existing_list_var out_var)
+  # Existing (pre-include) snapshot
+  set(_before "${${existing_list_var}}")
+
+  # Current (post-include) snapshot
+  get_cmake_property(_after CACHE_VARIABLES)
+
+  # Compute new names
+  set(_new "${_after}")
+  if(_before)
+    list(REMOVE_ITEM _new ${_before})
+  endif()
+
+  # Pack "NAME=VALUE (TYPE=...)" for each new cache entry
+  set(_pairs "")
+  foreach(_k IN LISTS _new)
+    if(NOT "${_k}" MATCHES "^((C|CXX)_SUPPORTS|HAVE_|GLIBCXX_USE|SUPPORTS_FVISI)")
+      continue()
+    endif()
+    # Cache VALUE: dereference is fine here because cache entries read like normal vars
+    set(_val "${${_k}}")
+    # Cache TYPE (e.g., STRING, BOOL, PATH, FILEPATH, INTERNAL, UNINITIALIZED)
+    get_property(_type CACHE "${_k}" PROPERTY TYPE)
+    if(NOT _type)
+      set(_type "UNINITIALIZED")
+    endif()
+    list(APPEND _pairs "set(${_k} \"${_val}\" CACHE ${_type} \"\")")
+  endforeach()
+
+  set(${out_var} "${_pairs}" PARENT_SCOPE)
+endfunction()
diff --git a/llvm/cmake/modules/LLVMProcessSources.cmake b/llvm/cmake/modules/LLVMProcessSources.cmake
index 0670d60bf2afd..0bcce4c6c78ad 100644
--- a/llvm/cmake/modules/LLVMProcessSources.cmake
+++ b/llvm/cmake/modules/LLVMProcessSources.cmake
@@ -56,17 +56,25 @@ endfunction(find_all_header_files)
 function(llvm_process_sources OUT_VAR)
   cmake_parse_arguments(ARG "PARTIAL_SOURCES_INTENDED" "" "ADDITIONAL_HEADERS;ADDITIONAL_HEADER_DIRS" ${ARGN})
   set(sources ${ARG_UNPARSED_ARGUMENTS})
-  llvm_check_source_file_list(${sources})
 
-  # This adds .td and .h files to the Visual Studio solution:
-  add_td_sources(sources)
-  find_all_header_files(hdrs "${ARG_ADDITIONAL_HEADER_DIRS}")
-  if (hdrs)
-    set_source_files_properties(${hdrs} PROPERTIES HEADER_FILE_ONLY ON)
+  if(LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS)
+    llvm_check_source_file_list(${sources})
+  endif()
+
+  if(ARG_ADDITIONAL_HEADERS)
+    set_source_files_properties(${ARG_ADDITIONAL_HEADERS} PROPERTIES HEADER_FILE_ONLY ON)
+    list(APPEND sources ${ARG_ADDITIONAL_HEADERS})
   endif()
-  set_source_files_properties(${ARG_ADDITIONAL_HEADERS} PROPERTIES HEADER_FILE_ONLY ON)
-  list(APPEND sources ${ARG_ADDITIONAL_HEADERS} ${hdrs})
 
+  # This adds .td and .h files to the Visual Studio solution:
+  if(MSVC OR LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS)
+    add_td_sources(sources)
+    find_all_header_files(hdrs "${ARG_ADDITIONAL_HEADER_DIRS}")
+    if (hdrs)
+      set_source_files_properties(${hdrs} PROPERTIES HEADER_FILE_ONLY ON)
+    endif()
+    list(APPEND sources ${hdrs})
+  endif()
   set( ${OUT_VAR} ${sources} PARENT_SCOPE )
 endfunction(llvm_process_sources)
 
diff --git a/third-party/benchmark/CMakeLists.txt b/third-party/benchmark/CMakeLists.txt
index d9bcc6a4939be..1e0f4d22907fc 100644
--- a/third-party/benchmark/CMakeLists.txt
+++ b/third-party/benchmark/CMakeLists.txt
@@ -147,6 +147,15 @@ endif()
 set(CMAKE_CXX_STANDARD ${BENCHMARK_CXX_STANDARD})
 set(CMAKE_CXX_STANDARD_REQUIRED YES)
 set(CMAKE_CXX_EXTENSIONS OFF)
+option(BENCHMARK_ENABLE_EXPENSIVE_CMAKE_CHECKS "Enable use of expensive CMake checks for unused source files" ON)
+
+function(handle_flag flag)
+  if(LLVM_COMPILER_IS_GCC_COMPATIBLE AND NOT BENCHMARK_ENABLE_EXPENSIVE_CMAKE_CHECKS)
+    list(APPEND CMAKE_CXX_FLAGS ${flag})
+  else()
+    add_cxx_compiler_flag(${flag})
+  endif()
+endfunction()
 
 if (MSVC)
   # Turn compiler warnings up to 11
@@ -185,49 +194,49 @@ else()
   add_definitions(-D_LARGEFILE64_SOURCE)
   add_definitions(-D_LARGEFILE_SOURCE)
   # Turn compiler warnings up to 11
-  add_cxx_compiler_flag(-Wall)
-  add_cxx_compiler_flag(-Wextra)
-  add_cxx_compiler_flag(-Wshadow)
-  add_cxx_compiler_flag(-Wfloat-equal)
-  add_cxx_compiler_flag(-Wold-style-cast)
+  handle_flag(-Wall)
+  handle_flag(-Wextra)
+  handle_flag(-Wshadow)
+  handle_flag(-Wfloat-equal)
+  handle_flag(-Wold-style-cast)
   if(BENCHMARK_ENABLE_WERROR)
-      add_cxx_compiler_flag(-Werror)
+      handle_flag(-Werror)
   endif()
   if (NOT BENCHMARK_ENABLE_TESTING)
     # Disable warning when compiling tests as gtest does not use 'override'.
-    add_cxx_compiler_flag(-Wsuggest-override)
+    handle_flag(-Wsuggest-override)
   endif()
-  add_cxx_compiler_flag(-pedantic)
-  add_cxx_compiler_flag(-pedantic-errors)
-  add_cxx_compiler_flag(-Wshorten-64-to-32)
-  add_cxx_compiler_flag(-fstrict-aliasing)
+  handle_flag(-pedantic)
+  handle_flag(-pedantic-errors)
+  handle_flag(-Wshorten-64-to-32)
+  handle_flag(-fstrict-aliasing)
   # Disable warnings regarding deprecated parts of the library while building
   # and testing those parts of the library.
-  add_cxx_compiler_flag(-Wno-deprecated-declarations)
+  handle_flag(-Wno-deprecated-declarations)
   if (CMAKE_CXX_COMPILER_ID STREQUAL "Intel" OR CMAKE_CXX_COMPILER_ID STREQUAL "IntelLLVM")
     # Intel silently ignores '-Wno-deprecated-declarations',
     # warning no. 1786 must be explicitly disabled.
     # See #631 for rationale.
-    add_cxx_compiler_flag(-wd1786)
-    add_cxx_compiler_flag(-fno-finite-math-only)
+    handle_flag(-wd1786)
+    handle_flag(-fno-finite-math-only)
   endif()
   # Disable deprecation warnings for release builds (when -Werror is enabled).
   if(BENCHMARK_ENABLE_WERROR)
-      add_cxx_compiler_flag(-Wno-deprecated)
+      handle_flag(-Wno-deprecated)
   endif()
   if (NOT BENCHMARK_ENABLE_EXCEPTIONS)
-    add_cxx_compiler_flag(-fno-exceptions)
+    handle_flag(-fno-exceptions)
   endif()
 
   if (HAVE_CXX_FLAG_FSTRICT_ALIASING)
     if (NOT CMAKE_CXX_COMPILER_ID STREQUAL "Intel" AND NOT CMAKE_CXX_COMPILER_ID STREQUAL "IntelLLVM") #ICC17u2: Many false positives for Wstrict-aliasing
-      add_cxx_compiler_flag(-Wstrict-aliasing)
+      handle_flag(-Wstrict-aliasing)
     endif()
   endif()
   # ICC17u2: overloaded virtual function "benchmark::Fixture::SetUp" is only partially overridden
   # (because of deprecated overload)
-  add_cxx_compiler_flag(-wd654)
-  add_cxx_compiler_flag(-Wthread-safety)
+  handle_flag(-wd654)
+  handle_flag(-Wthread-safety)
   if (HAVE_CXX_FLAG_WTHREAD_SAFETY)
     cxx_feature_check(THREAD_SAFETY_ATTRIBUTES "-DINCLUDE_DIRECTORIES=${PROJECT_SOURCE_DIR}/include")
   endif()
@@ -246,8 +255,8 @@ else()
 
   # Link time optimisation
   if (BENCHMARK_ENABLE_LTO)
-    add_cxx_compiler_flag(-flto)
-    add_cxx_compiler_flag(-Wno-lto-type-mismatch)
+    handle_flag(-flto)
+    handle_flag(-Wno-lto-type-mismatch)
     if ("${CMAKE_CXX_COMPILER_ID}" STREQUAL "GNU")
       find_program(GCC_AR gcc-ar)
       if (GCC_AR)
@@ -278,7 +287,7 @@ else()
     BENCHMARK_SHARED_LINKER_FLAGS_COVERAGE)
   set(CMAKE_BUILD_TYPE "${CMAKE_BUILD_TYPE}" CACHE STRING
     "Choose the type of build, options are: None Debug Release RelWithDebInfo MinSizeRel Coverage.")
-  add_cxx_compiler_flag(--coverage COVERAGE)
+  handle_flag(--coverage COVERAGE)
 endif()
 
 if (BENCHMARK_USE_LIBCXX)

@christopherbate christopherbate requested review from fmayer, aganea and Copilot and removed request for fmayer October 4, 2025 22:54

@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements several CMake build system optimizations to reduce LLVM configuration and generation time from 17.8s/6.9s to 8.2s/4.7s with the toolchain check cache enabled. The optimizations include reducing expensive CMake checks, making LIT convenience targets optional, and caching toolchain verification results.

Key changes:

  • Introduces new optional flags to control expensive operations: LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS, LLVM_ENABLE_LIT_CONVENIENCE_TARGETS, BENCHMARK_ENABLE_EXPENSIVE_CMAKE_CHECKS, and LLVM_TOOLCHAIN_CHECK_CACHE
  • Optimizes linker flag checks by moving them out of per-library call paths
  • Adds toolchain check caching mechanism to avoid repeated compilation tests

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

File: Description

  • third-party/benchmark/CMakeLists.txt: Adds optimization to skip expensive compiler flag checks when using GCC-compatible compilers
  • llvm/cmake/modules/LLVMProcessSources.cmake: Guards expensive file globbing operations behind new configuration options
  • llvm/cmake/modules/LLVMCacheSnapshot.cmake: New module for capturing and managing CMake cache snapshots for toolchain checks
  • llvm/cmake/modules/AddLLVM.cmake: Moves expensive linker flag checks out of per-library functions and adds option to disable LIT convenience targets
  • llvm/CMakeLists.txt: Adds new configuration options and implements toolchain check caching logic


# This adds .td and .h files to the Visual Studio solution:
Copilot AI Oct 4, 2025

The condition logic has changed from always executing to conditional execution based on MSVC OR LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS. This could break builds on non-MSVC platforms where headers are needed but LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS is OFF. Consider documenting this behavioral change or adding a comment explaining when this optimization is safe.

Suggested change
- # This adds .td and .h files to the Visual Studio solution:
+ # This adds .td and .h files to the Visual Studio solution.
+ # NOTE: The following conditional logic only adds these files for MSVC or when
+ # LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS is enabled. On non-MSVC platforms, if
+ # LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS is OFF, these files will NOT be added.
+ # This optimization is intended to reduce solution clutter for non-MSVC builds,
+ # but may break IDE integration or developer workflows on platforms that expect
+ # these files to be present. If you encounter issues with missing headers or
+ # .td files in your IDE or build system, consider enabling
+ # LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS or revisiting this logic.

Copilot uses AI. Check for mistakes.

# Pack "NAME=VALUE (TYPE=...)" for each new cache entry
set(_pairs "")
foreach(_k IN LISTS _new)
if(NOT "${_k}" MATCHES "^((C|CXX)_SUPPORTS|HAVE_|GLIBCXX_USE|SUPPORTS_FVISI)")
Copilot AI Oct 4, 2025

The regex pattern SUPPORTS_FVISI appears to be incomplete or a typo. This likely should be SUPPORTS_FVISIBILITY or similar. The incomplete pattern may not match intended cache variables.

Suggested change
- if(NOT "${_k}" MATCHES "^((C|CXX)_SUPPORTS|HAVE_|GLIBCXX_USE|SUPPORTS_FVISI)")
+ if(NOT "${_k}" MATCHES "^((C|CXX)_SUPPORTS|HAVE_|GLIBCXX_USE|SUPPORTS_FVISIBILITY)")


I profiled initial CMake configuration and generation (Ninja)
steps in LLVM-Project with just LLVM, MLIR, and Clang enabled using the
command shown below. Based on the profile, I then implemented
a number of optimizations.

Initial time of `cmake` command @ 679d2b2:

-- Configuring done (17.8s)
-- Generating done (6.9s)

After all below optimizations:

-- Configuring done (12.8s)
-- Generating done (4.7s)

With a "toolchain check cache" (explained below):

-- Configuring done (6.9s)
-- Generating done (4.3s)

There's definitely room for more optimizations -- another 20% at least.

Most changes have a small impact. It's the gradual creep of inefficiencies
that have added up over time to make the system less efficient than it
could be.

Command tested:

```
cmake -G Ninja -S llvm -B ${buildDir} \
		-DLLVM_ENABLE_PROJECTS="mlir;clang" \
		-DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \
		-DCMAKE_BUILD_TYPE=RelWithDebInfo \
		-DLLVM_ENABLE_ASSERTIONS=ON \
		-DLLVM_CCACHE_BUILD=ON \
		-DBUILD_SHARED_LIBS=ON \
		-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_USE_LINKER=lld \
		-DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
		-DMLIR_ENABLE_BINDINGS_PYTHON=ON \
		-DLLVM_TOOLCHAIN_CHECK_CACHE=${PWD}/.toolchain-check-cache.cmake \
		--fresh
```

## Optimizations:

### Optimize `check_linker_flag` calls

In `AddLLVM.cmake`, there were a couple places where we call `check_linker_flag`
every time `llvm_add_library` is called. Even in non-initial cmake configuration
runs, this carries unreasonable overhead.

Change: Hoist `include(CheckLinkerFlag)` in `AddLLVM.cmake` and optimize placement of `check_linker_flag` calls
so that they are only made once.

Impact: <1 sec
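
As a rough illustration of the hoisting pattern (not the actual diff), the sketch below runs the linker probe once when the module is included and only reads the cached result inside the per-library function. The names `example_add_library` and `EXAMPLE_LINKER_SUPPORTS_GC_SECTIONS`, and the choice of `-Wl,--gc-sections` as the probed flag, are assumptions made for the example. It presumes a project with the CXX language enabled and CMake 3.18 or newer, which is when the `CheckLinkerFlag` module was added.

```cmake
# Sketch only: hypothetical names, not the code in AddLLVM.cmake.
include(CheckLinkerFlag)  # provides check_linker_flag (CMake >= 3.18)

# Probe once, when this module is included. The result lands in the cache.
check_linker_flag(CXX "-Wl,--gc-sections" EXAMPLE_LINKER_SUPPORTS_GC_SECTIONS)

function(example_add_library name)
  add_library(${name} ${ARGN})
  # Only consult the stored result here; no probe call per library.
  if(EXAMPLE_LINKER_SUPPORTS_GC_SECTIONS)
    target_link_options(${name} PRIVATE "-Wl,--gc-sections")
  endif()
endfunction()
```

Even though `check_linker_flag` skips the link test once its result variable is defined, calling it from inside a function that runs for every library still pays the call overhead each time, which is the cost the hoisting avoids.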

### Make `add_lit_testsuites` optional

The function `add_lit_testsuites` is used to
recursively populate a set of convenience targets that run a
filtered portion of a LIT test suite. So instead of running `check-mlir`
you can run `check-mlir-dialect`. These targets are built recursively
for each subdirectory (e.g. `check-mlir-dialect-tensor`, `check-mlir-dialect-math`, etc.).

This call has quite a bit of overhead, especially for the main LLVM LIT test suite.

Personally I use a combination of `ninja -C build check-mlir-build-only` and
`llvm-lit` directly to run filtered portions of the MLIR LIT test suite, but
I can imagine that others depend on these filtered targets.

Change: Introduce a new option `LLVM_ENABLE_LIT_CONVENIENCE_TARGETS`
which defaults to `ON`. When set to `OFF`, the function `add_lit_testsuites`
just becomes a no-op. It's possible that we could also just improve the performance
of `add_lit_testsuites` directly, but I didn't pursue this.

Impact: ~1-2sec
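
A minimal sketch of the early-return guard, assuming the option is checked at the top of the function. The function name `example_add_lit_testsuites`, the option help string, and the stand-in body that creates one echo-only target per subdirectory are illustrative assumptions, not the real `add_lit_testsuites` implementation.

```cmake
# Sketch only: the real add_lit_testsuites body is not reproduced here.
option(LLVM_ENABLE_LIT_CONVENIENCE_TARGETS
  "Generate per-directory check-* convenience targets (help text assumed)" ON)

function(example_add_lit_testsuites project directory)
  if(NOT LLVM_ENABLE_LIT_CONVENIENCE_TARGETS)
    return()  # no-op: skip the recursive directory walk entirely
  endif()
  # Stand-in for the expensive part: walk the tree, one target per subdirectory.
  file(GLOB_RECURSE entries LIST_DIRECTORIES true
    RELATIVE "${directory}" "${directory}/*")
  foreach(entry IN LISTS entries)
    if(IS_DIRECTORY "${directory}/${entry}")
      string(REPLACE "/" "-" suffix "${entry}")
      string(TOLOWER "${suffix}" suffix)
      add_custom_target(check-${project}-${suffix}
        COMMAND ${CMAKE_COMMAND} -E echo "would run lit on ${directory}/${entry}")
    endif()
  endforeach()
endfunction()
```

With the option `OFF`, neither the filesystem walk nor the per-directory target creation happens, which is where the configure and generate savings would come from.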

### Reduce `file(GLOB)` calls in `LLVMProcessSources.cmake`

The `llvm_process_sources` call is made whenever the `llvm_add_library`
function is called. It makes several `file(GLOB)` calls, which can
be expensive depending on the underlying filesystem/storage. The
function globs for headers and TD files to add as sources to the target,
but the comments suggest that this is only necessary for MSVC. In addition,
it calls `llvm_check_source_file_list` to check that no source files in
the directory are unused unless `PARTIAL_SOURCES_INTENDED` is set, which
incurs another `file(GLOB)` call.

Changes: Guard the `file(GLOB)` calls for populating header sources
behind `if(MSVC)`. Only do the `llvm_check_source_file_list` check
if a new option `LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS` is set to `ON`.

Impact: depends on system. On my local workstation, impact is minimal.
On another remote server I use, impact is much larger.
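
The guards described above might look roughly like the following. This follows the commit message wording (`if(MSVC)` for the header globs); the review comment on this PR suggests the actual condition in the diff may be `MSVC OR LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS`. The helper names `example_process_sources` and `example_check_source_file_list`, the option help string, and the `ON` default are assumptions for the sketch, and sources are assumed to be given as names relative to the current source directory.

```cmake
# Sketch only: hypothetical helpers, not the code in LLVMProcessSources.cmake.
option(LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS
  "Run slower consistency checks (help text assumed)" ON)

# Stand-in for the unused-source check: warn about unlisted .cpp files.
function(example_check_source_file_list)
  file(GLOB on_disk RELATIVE "${CMAKE_CURRENT_SOURCE_DIR}"
    "${CMAKE_CURRENT_SOURCE_DIR}/*.cpp")
  foreach(f IN LISTS on_disk)
    if(NOT f IN_LIST ARGN)
      message(WARNING "${f} is not listed in ${CMAKE_CURRENT_SOURCE_DIR}")
    endif()
  endforeach()
endfunction()

function(example_process_sources out_var)
  set(sources ${ARGN})
  if(LLVM_ENABLE_EXPENSIVE_CMAKE_CHECKS)
    example_check_source_file_list(${sources})  # costs one file(GLOB)
  endif()
  if(MSVC)
    # Headers and .td files are only added so they show up in the IDE.
    file(GLOB hdrs "${CMAKE_CURRENT_SOURCE_DIR}/*.h")
    file(GLOB tds "${CMAKE_CURRENT_SOURCE_DIR}/*.td")
    list(APPEND sources ${hdrs} ${tds})
  endif()
  set(${out_var} "${sources}" PARENT_SCOPE)
endfunction()
```

On a non-MSVC generator with the option `OFF`, this version performs no `file(GLOB)` calls at all per library.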

### Optimize initial symbol/flag checks made in `config-ix.cmake` and `HandleLLVMOptions.cmake`

The `config-ix.cmake` and `HandleLLVMOptions.cmake` files make a number of calls to
compile C/C++ programs in order to verify the presence of certain symbols or
whether certain compiler flags are supported.

These checks have the biggest impact on an initial `cmake` configuration time.

I propose an "opt in" approach for amortizing these checks using a special generated
CMake cache file as directed by the developer.

An option `LLVM_TOOLCHAIN_CHECK_CACHE` is introduced. It should be set to
a path like `-DLLVM_TOOLCHAIN_CHECK_CACHE=$PWD/.toolchain-check-cache.cmake`.

Before entering the `config-ix.cmake` and `HandleLLVMOptions.cmake` files,
if the `LLVM_TOOLCHAIN_CHECK_CACHE` option is set and the file it names exists, include
that file to pre-populate cache variables.
Otherwise, save the current set of CMake cache variable names.
After running the `config-ix.cmake` and `HandleLLVMOptions.cmake` files,
if the `LLVM_TOOLCHAIN_CHECK_CACHE` option is set but the file does not exist,
check which new CMake cache variables were set by those scripts. Filter these variables by
whether they look like cache variables
supporting symbol/flag checks (e.g. `CXX_SUPPORTS_.*|HAVE_.*` etc.)
and write a file that sets all of these cache variables to their current values.

This allows a developer to obviate any subsequent checks, even in initial `cmake`
configuration runs. The correctness depends on the developer knowing
when it is invalid (e.g. they change toolchains or platforms) and on us not suddenly
changing the meaning of `CXX_SUPPORTS_SOME_FLAG` to correspond to a different flag.

The cache file could be extended to store a key used to check whether to regenerate
the cache, but I didn't go there.

Impact: Trivial overhead for cache generation, ~5sec reduction in initial config time.
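
The snapshot-and-replay flow described in this section can be sketched as follows. This is an illustration under stated assumptions, not the contents of `LLVMCacheSnapshot.cmake`: the single `-Wall` probe stands in for everything `config-ix.cmake` and `HandleLLVMOptions.cmake` do, `CXX_SUPPORTS_EXAMPLE_WALL_FLAG` is a made-up variable name, and the filter regex is shortened. Values containing quotes or semicolons would need escaping that the sketch omits.

```cmake
# Sketch only: assumes LLVM_TOOLCHAIN_CHECK_CACHE holds a file path.
if(LLVM_TOOLCHAIN_CHECK_CACHE AND EXISTS "${LLVM_TOOLCHAIN_CHECK_CACHE}")
  # Replay: pre-populate results so later check_* calls find them defined.
  include("${LLVM_TOOLCHAIN_CHECK_CACHE}")
elseif(LLVM_TOOLCHAIN_CHECK_CACHE)
  # Snapshot the cache variable names that exist before the checks run.
  get_cmake_property(_before CACHE_VARIABLES)
endif()

# Stand-in for the real symbol/flag checks.
include(CheckCXXCompilerFlag)
check_cxx_compiler_flag("-Wall" CXX_SUPPORTS_EXAMPLE_WALL_FLAG)

if(LLVM_TOOLCHAIN_CHECK_CACHE AND NOT EXISTS "${LLVM_TOOLCHAIN_CHECK_CACHE}")
  # Diff: cache variables that appeared while the checks ran.
  get_cmake_property(_new CACHE_VARIABLES)
  if(_before)
    list(REMOVE_ITEM _new ${_before})
  endif()
  set(_out "")
  foreach(_k IN LISTS _new)
    # Keep only names that look like check results.
    if("${_k}" MATCHES "^((C|CXX)_SUPPORTS|HAVE_)")
      get_property(_type CACHE "${_k}" PROPERTY TYPE)
      string(APPEND _out "set(${_k} \"${${_k}}\" CACHE ${_type} \"\")\n")
    endif()
  endforeach()
  file(WRITE "${LLVM_TOOLCHAIN_CHECK_CACHE}" "${_out}")
endif()
```

Replay works because CMake's `check_*` macros skip the compile test when their result variable is already defined, so a fresh build directory that includes the generated file never launches the compiler for those checks.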

### Reduce overhead of embedded Google Benchmark configuration

Note: technically this could be lumped in with the above if we expanded the scope of the before/after
snapshot that the `LLVM_TOOLCHAIN_CHECK_CACHE` covers.

GoogleBenchmark is embedded under the `third-party/benchmark` directory.
Its CMake script does a compilation check for each flag that it wants to
populate (even for `-Wall`). In comparison, LLVM's HandleLLVMOptions.cmake script takes
a faster approach by skipping as many compilation checks as possible
if the cache variable `LLVM_COMPILER_IS_GCC_COMPATIBLE` is true.

Changes: Use `LLVM_COMPILER_IS_GCC_COMPATIBLE` to skip as many compilation
checks as possible in GoogleBenchmark.

Impact: ~1-2sec
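
One way such a shortcut could be written is sketched below. The function name `example_handle_flag` and the `EXAMPLE_HAVE_CXX_FLAG` prefix are hypothetical, and the body is not the `handle_flag` implementation from this PR. The sketch assumes that a GCC-compatible driver accepts every flag passed to it, which holds for common warnings such as `-Wall` but is not guaranteed for every flag.

```cmake
# Sketch only: hypothetical function, not the handle_flag from this PR.
include(CheckCXXCompilerFlag)

function(example_handle_flag flag)
  if(LLVM_COMPILER_IS_GCC_COMPATIBLE)
    # Assume GCC-compatible drivers accept the flag; skip the compile probe.
    add_compile_options("${flag}")
    return()
  endif()
  # Fallback: probe once, caching the result under a name derived from the flag.
  string(MAKE_C_IDENTIFIER "EXAMPLE_HAVE_CXX_FLAG${flag}" result_var)
  check_cxx_compiler_flag("${flag}" ${result_var})
  if(${result_var})
    add_compile_options("${flag}")
  endif()
endfunction()

example_handle_flag(-Wall)
```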
@nikic

nikic commented Oct 5, 2025

Please separate the individual changes here into separate PRs.

@christopherbate

> Please separate the individual changes here into separate PRs.

will do

@efriedma-quic

https://github.com/google/benchmark is a third-party project, and we prefer to minimize differences with upstream. Please submit changes there first.

Labels
cmake Build system in general and CMake in particular third-party:benchmark