Skip to content

Conversation

@efriedma-quic
Copy link
Collaborator

@efriedma-quic efriedma-quic commented Nov 14, 2025

Adds a flag COMPILER_RT_PROFILE_BAREMETAL, which disables the parts of the profile runtime which require a filesystem or malloc. This minimal library only requires string.h from the C library.

This is useful for profiling or code coverage of baremetal images, which don't have filesystem APIs, and might not have malloc configured (or have limited heap space).

Expected usage:

  • Add code to your project to call __llvm_profile_get_size_for_buffer() and __llvm_profile_write_buffer() to write the profile data to a buffer in memory, and then copy that data off the device using target-specific tools.
  • If you're using a linker script, set up your linker script to map the profiling and coverage input sections to corresponding output sections with the same name, and mark them KEEP. __llvm_covfun and __llvm_covmap are non-allocatable, __llvm_prf_names is read-only allocatable, and __llvm_prf_cnts and __llvm_prf_data are read-write allocatable.
  • The resulting data is in same format as the non-baremetal profiles.

There's some room for improvement here in the future for doing profiling and code coverage for baremetal. If we revised the profiling format, and introduced some additional host tooling, we could move some of the metadata into non-allocated sections, and construct the profraw file on the host. But this patch is sufficient for some use-cases.

@llvmbot llvmbot added compiler-rt PGO Profile Guided Optimizations labels Nov 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 14, 2025

@llvm/pr-subscribers-pgo

Author: Eli Friedman (efriedma-quic)

Changes

Adds a flag COMPILER_RT_PROFILE_BAREMETAL, which disables the parts of the profile runtime which require a filesystem or malloc. This minimal library only requires string.h from the C library.

This is useful for profiling or code coverage of baremetal images, which don't have filesystem APIs, and might not have malloc configured (or have limited heap space).

There's some room for improvement here in the future for doing profiling and code coverage for baremetal. If we revised the profiling format, and introduced some additional host tooling, we could move some of the metadata into non-allocated sections, and construct the profraw file on the host. But this patch is sufficient for some use-cases.


Full diff: https://github.com/llvm/llvm-project/pull/167998.diff

9 Files Affected:

  • (modified) compiler-rt/CMakeLists.txt (+2)
  • (modified) compiler-rt/lib/profile/CMakeLists.txt (+23-5)
  • (modified) compiler-rt/lib/profile/InstrProfiling.c (-2)
  • (modified) compiler-rt/lib/profile/InstrProfiling.h (+5)
  • (modified) compiler-rt/lib/profile/InstrProfilingMerge.c (+2-1)
  • (modified) compiler-rt/lib/profile/InstrProfilingMergeFile.c (-1)
  • (modified) compiler-rt/lib/profile/InstrProfilingPlatformLinux.c (+4-1)
  • (modified) compiler-rt/lib/profile/InstrProfilingPlatformOther.c (+2-1)
  • (modified) compiler-rt/lib/profile/InstrProfilingPort.h (+2-10)
diff --git a/compiler-rt/CMakeLists.txt b/compiler-rt/CMakeLists.txt
index a9e8899f8ae0c..71a11a08175c2 100644
--- a/compiler-rt/CMakeLists.txt
+++ b/compiler-rt/CMakeLists.txt
@@ -309,6 +309,8 @@ option(COMPILER_RT_USE_BUILTINS_LIBRARY
 
 option(COMPILER_RT_USE_ATOMIC_LIBRARY "Use compiler-rt atomic instead of libatomic" OFF)
 
+option(COMPILER_RT_PROFILE_BAREMETAL "Build minimal baremetal profile library" OFF)
+
 include(config-ix)
 
 #================================
diff --git a/compiler-rt/lib/profile/CMakeLists.txt b/compiler-rt/lib/profile/CMakeLists.txt
index a6402f80b890a..7795e7399f493 100644
--- a/compiler-rt/lib/profile/CMakeLists.txt
+++ b/compiler-rt/lib/profile/CMakeLists.txt
@@ -60,12 +60,9 @@ int main() {
 add_compiler_rt_component(profile)
 
 set(PROFILE_SOURCES
-  GCDAProfiling.c
   InstrProfiling.c
   InstrProfilingInternal.c
-  InstrProfilingValue.c
   InstrProfilingBuffer.c
-  InstrProfilingFile.c
   InstrProfilingMerge.c
   InstrProfilingMergeFile.c
   InstrProfilingNameVar.c
@@ -77,10 +74,25 @@ set(PROFILE_SOURCES
   InstrProfilingPlatformLinux.c
   InstrProfilingPlatformOther.c
   InstrProfilingPlatformWindows.c
-  InstrProfilingRuntime.cpp
-  InstrProfilingUtil.c
   )
 
+if (NOT COMPILER_RT_PROFILE_BAREMETAL)
+  # For baremetal, exclude the following:
+  # - Anything that contains filesystem operations (InstrProfilingFile.c,
+  #   InstrProfilingUtils.c)
+  # - Initialization, because it isn't necesary without the filesystem bits
+  #   on ELF targets (InstrProfilingRuntime.cpp).
+  # - Value profiling, because it requires malloc (InstrProfilingValue.c).
+  #   This could be optional if someone needs it.
+  # - GCDA profiling, which is unrelated (GCDAProfiling.c)
+  list(APPEND PROFILE_SOURCES GCDAProfiling.c
+                              InstrProfilingFile.c
+                              InstrProfilingRuntime.cpp
+                              InstrProfilingUtil.c
+                              InstrProfilingValue.c
+                              )
+endif()
+
 set(PROFILE_HEADERS
   InstrProfiling.h
   InstrProfilingInternal.h
@@ -135,6 +147,12 @@ if(COMPILER_RT_TARGET_HAS_UNAME)
      -DCOMPILER_RT_HAS_UNAME=1)
 endif()
 
+if(COMPILER_RT_PROFILE_BAREMETAL)
+ set(EXTRA_FLAGS
+     ${EXTRA_FLAGS}
+     -DCOMPILER_RT_PROFILE_BAREMETAL=1)
+endif()
+
 if(MSVC)
   # profile historically has only been supported with the static runtime
   # on windows
diff --git a/compiler-rt/lib/profile/InstrProfiling.c b/compiler-rt/lib/profile/InstrProfiling.c
index da04d8ebdec95..d59ec78ad3296 100644
--- a/compiler-rt/lib/profile/InstrProfiling.c
+++ b/compiler-rt/lib/profile/InstrProfiling.c
@@ -10,8 +10,6 @@
 // with freestanding compilation. See `darwin_add_builtin_libraries`.
 
 #include <limits.h>
-#include <stdio.h>
-#include <stdlib.h>
 #include <string.h>
 
 #include "InstrProfiling.h"
diff --git a/compiler-rt/lib/profile/InstrProfiling.h b/compiler-rt/lib/profile/InstrProfiling.h
index 8791d5aa5dd70..187ef55ef3784 100644
--- a/compiler-rt/lib/profile/InstrProfiling.h
+++ b/compiler-rt/lib/profile/InstrProfiling.h
@@ -10,7 +10,10 @@
 #define PROFILE_INSTRPROFILING_H_
 
 #include "InstrProfilingPort.h"
+#include <stddef.h>
+#ifndef COMPILER_RT_PROFILE_BAREMETAL
 #include <stdio.h>
+#endif
 
 // Make sure __LLVM_INSTR_PROFILE_GENERATE is always defined before
 // including instr_prof_interface.h so the interface functions are
@@ -200,7 +203,9 @@ int __llvm_profile_write_file(void);
  * copying the old profile file to new profile file and this function is usually
  * used when the proess doesn't have permission to open file.
  */
+#ifndef COMPILER_RT_PROFILE_BAREMETAL
 int __llvm_profile_set_file_object(FILE *File, int EnableMerge);
+#endif
 
 /*! \brief Register to write instrumentation data to file at exit. */
 int __llvm_profile_register_write_file_atexit(void);
diff --git a/compiler-rt/lib/profile/InstrProfilingMerge.c b/compiler-rt/lib/profile/InstrProfilingMerge.c
index 92721c4fd55ea..9b84b512ad9c7 100644
--- a/compiler-rt/lib/profile/InstrProfilingMerge.c
+++ b/compiler-rt/lib/profile/InstrProfilingMerge.c
@@ -11,7 +11,6 @@
 
 #include "InstrProfiling.h"
 #include "InstrProfilingInternal.h"
-#include "InstrProfilingUtil.h"
 
 #define INSTR_PROF_VALUE_PROF_DATA
 #include "profile/InstrProfData.inc"
@@ -131,9 +130,11 @@ COMPILER_RT_VISIBILITY
 int __llvm_profile_merge_from_buffer(const char *ProfileData,
                                      uint64_t ProfileSize) {
   if (__llvm_profile_get_version() & VARIANT_MASK_TEMPORAL_PROF) {
+#ifndef COMPILER_RT_PROFILE_BAREMETAL
     PROF_ERR("%s\n",
              "Temporal profiles do not support profile merging at runtime. "
              "Instead, merge raw profiles using the llvm-profdata tool.");
+#endif
     return 1;
   }
 
diff --git a/compiler-rt/lib/profile/InstrProfilingMergeFile.c b/compiler-rt/lib/profile/InstrProfilingMergeFile.c
index 8923ba21cc580..d8fe1f6677950 100644
--- a/compiler-rt/lib/profile/InstrProfilingMergeFile.c
+++ b/compiler-rt/lib/profile/InstrProfilingMergeFile.c
@@ -13,7 +13,6 @@
 
 #include "InstrProfiling.h"
 #include "InstrProfilingInternal.h"
-#include "InstrProfilingUtil.h"
 
 #define INSTR_PROF_VALUE_PROF_DATA
 #include "profile/InstrProfData.inc"
diff --git a/compiler-rt/lib/profile/InstrProfilingPlatformLinux.c b/compiler-rt/lib/profile/InstrProfilingPlatformLinux.c
index 558b7fc8cad62..650dbbcb0a665 100644
--- a/compiler-rt/lib/profile/InstrProfilingPlatformLinux.c
+++ b/compiler-rt/lib/profile/InstrProfilingPlatformLinux.c
@@ -8,14 +8,17 @@
 
 #if defined(__linux__) || defined(__FreeBSD__) || defined(__Fuchsia__) ||      \
     (defined(__sun__) && defined(__svr4__)) || defined(__NetBSD__) ||          \
-    defined(_AIX) || defined(__wasm__) || defined(__HAIKU__)
+    defined(_AIX) || defined(__wasm__) || defined(__HAIKU__) ||                \
+    defined(COMPILER_RT_PROFILE_BAREMETAL)
 
+#ifndef COMPILER_RT_PROFILE_BAREMETAL
 #if !defined(_AIX) && !defined(__wasm__)
 #include <elf.h>
 #include <link.h>
 #endif
 #include <stdlib.h>
 #include <string.h>
+#endif
 
 #include "InstrProfiling.h"
 #include "InstrProfilingInternal.h"
diff --git a/compiler-rt/lib/profile/InstrProfilingPlatformOther.c b/compiler-rt/lib/profile/InstrProfilingPlatformOther.c
index 19414ab78b3be..bedf934af29b6 100644
--- a/compiler-rt/lib/profile/InstrProfilingPlatformOther.c
+++ b/compiler-rt/lib/profile/InstrProfilingPlatformOther.c
@@ -9,7 +9,8 @@
 #if !defined(__APPLE__) && !defined(__linux__) && !defined(__FreeBSD__) &&     \
     !defined(__Fuchsia__) && !(defined(__sun__) && defined(__svr4__)) &&       \
     !defined(__NetBSD__) && !defined(_WIN32) && !defined(_AIX) &&              \
-    !defined(__wasm__) && !defined(__HAIKU__)
+    !defined(__wasm__) && !defined(__HAIKU__) &&                               \
+    !defined(COMPILER_RT_PROFILE_BAREMETAL)
 
 #include <stdlib.h>
 #include <stdio.h>
diff --git a/compiler-rt/lib/profile/InstrProfilingPort.h b/compiler-rt/lib/profile/InstrProfilingPort.h
index 66d697885eaee..151ed94007413 100644
--- a/compiler-rt/lib/profile/InstrProfilingPort.h
+++ b/compiler-rt/lib/profile/InstrProfilingPort.h
@@ -117,7 +117,9 @@ static inline size_t getpagesize(void) {
   return S.dwPageSize;
 }
 #else /* defined(_WIN32) */
+#ifndef COMPILER_RT_PROFILE_BAREMETAL
 #include <unistd.h>
+#endif
 #endif /* defined(_WIN32) */
 
 #define PROF_ERR(Format, ...)                                                  \
@@ -137,16 +139,6 @@ static inline size_t getpagesize(void) {
 #define O_BINARY 0
 #endif
 
-#if defined(__FreeBSD__)
-
-#include <inttypes.h>
-#include <sys/types.h>
-
-#else /* defined(__FreeBSD__) */
-
-#include <inttypes.h>
 #include <stdint.h>
 
-#endif /* defined(__FreeBSD__) && defined(__i386__) */
-
 #endif /* PROFILE_INSTRPROFILING_PORT_H_ */

@lenary
Copy link
Member

lenary commented Nov 14, 2025

It seems a little strange to me that baremetal profiling is using the "Linux" profiling platform (rather than "Other" or a new "Baremetal" platform). That could do with documenting a little bit better, I think.

I am not familiar enough with the profiling runtime to understand which bits you're disabling, but the individual changes seem well-reasoned?

@efriedma-quic
Copy link
Collaborator Author

efriedma-quic commented Nov 14, 2025

It seems a little strange to me that baremetal profiling is using the "Linux" profiling platform

"Linux" is badly named; it's really any target where the linker makes __start_ and __stop_ symbols point to the right place. Modern ELF and wasm linkers generally do this. We have a different magic incantation for Darwin linkers in InstrProfilingPlatformDarwin.c, and we have a terrible fallback which uses a runtime constructor in InstrProfilingPlatformOther.c. I can add add a comment at the top of InstrProfilingPlatformLinux.c to explain this.

Probably it also makes sense to rename the file, but I'm not sure what to name it.

@evodius96
Copy link
Contributor

There's some room for improvement here in the future for doing profiling and code coverage for baremetal. If we revised the profiling format, and introduced some additional host tooling, we could move some of the metadata into non-allocated sections, and construct the profraw file on the host. But this patch is sufficient for some use-cases.

Yes, this is ultimately what I would like to see. But this patch is a good place to begin. Thank you!

@lenary
Copy link
Member

lenary commented Nov 15, 2025

It seems a little strange to me that baremetal profiling is using the "Linux" profiling platform

"Linux" is badly named; it's really any target where the linker makes __start_ and __stop_ symbols point to the right place. Modern ELF and wasm linkers generally do this. We have a different magic incantation for Darwin linkers in InstrProfilingPlatformDarwin.c, and we have a terrible fallback which uses a runtime constructor in InstrProfilingPlatformOther.c. I can add add a comment at the top of InstrProfilingPlatformLinux.c to explain this.

Yes Please

Probably it also makes sense to rename the file, but I'm not sure what to name it.

Probably unnecessary if there's a comment, given it's shared between ELF and Wasm, which removes calling it PlatformELF

Comment on lines 12 to 14
defined(COMPILER_RT_PROFILE_BAREMETAL)

#ifndef COMPILER_RT_PROFILE_BAREMETAL
Copy link

@reactive-firewall reactive-firewall Nov 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 NIT - Confusing logic could use an explanation comment.

For the uninitiated this looks like a logical contradiction for the edge-case where:

  • given COMPILER_RT_PROFILE_BAREMETAL is defined
    • #ifndef COMPILER_RT_PROFILE_BAREMETAL should always result in a 'false' result
    • if reader misses first #endif intent becomes easy to misunderstand (e.g. cognitive complexity)
Suggested change
defined(COMPILER_RT_PROFILE_BAREMETAL)
#ifndef COMPILER_RT_PROFILE_BAREMETAL
defined(COMPILER_RT_PROFILE_BAREMETAL)
/* skip extensions if Bare-metal */
#ifndef COMPILER_RT_PROFILE_BAREMETAL

AND what about embedded systems based on musl libc (e.g. so called generic-none targets) if they don't define COMPILER_RT_PROFILE_BAREMETAL ? or do they now need to define COMPILER_RT_PROFILE_BAREMETAL (meaning many things)?

TL;DR - this also conflates linux and the standard C library (consider -D__STDC__=1 with musl libc does not need -D__linux__ (or the linux headers) to build)

Rational is that the ISO 9899 (C standards) defines stdlib.h and string.h as part of the C library, and could be more accurately can only be conditionally guarded by __STDC__ but that may lead to tighter-dependency of compiler-rt on libc when bootstrapping.

Technically though the correct guard should be down below the first #endif:

#ifdef __STDC__ /* check if K&R or ISO 9899:1990 (c89) for bare-metal */
    #include <stdlib.h> /* defined in ISO 9899:1990 "Standard headers" */

    #if __STDC_VERSION__ >= 199901L /* Check if ISO 9899:1999 (c99) or later */
        #include <string.h> /* defined in ISO 9899:1999 "Standard headers" */
    #endif
#else
    // Handle the case where the standard is not supported (optional)
#endif

Hopefully a comment could help clarify this a bit.

@petrhosek
Copy link
Member

petrhosek commented Nov 15, 2025

We've been collecting coverage from embedded projects and from our Zircon kernel (which is also a baremetal system) for some time now and in our experience, the existing runtime is unnecessarily complicated (even with changes in this PR). We ended up building our more minimal runtime for Zircon and another minimal runtime (internal) for our embedded projects. Arm also has their own minimal runtime implementation for baremetal.

I'm fine with this change but I think a better strategy would be to start off with a minimal, header-only runtime implemented in C++ for baremetal systems which could also serve as a building block for full-featured OSes, eventually replacing the existing runtime which is rapidly approaching the point of being unmaintainable due to poor abstractions and code littered with #ifdefs.

"Linux" is badly named; it's really any target where the linker makes __start_ and __stop_ symbols point to the right place. Modern ELF and wasm linkers generally do this. We have a different magic incantation for Darwin linkers in InstrProfilingPlatformDarwin.c, and we have a terrible fallback which uses a runtime constructor in InstrProfilingPlatformOther.c. I can add add a comment at the top of InstrProfilingPlatformLinux.c to explain this.

I have an older change https://reviews.llvm.org/D47212 that renamed InstrProfilingPlatformLinux.c to InstrProfilingPlatformELF.c and InstrProfilingPlatformDarwin.c to InstrProfilingPlatformMachO.c respectively. I could reupload it as a PR to GitHub if there was interest in going forward with it.

There's some room for improvement here in the future for doing profiling and code coverage for baremetal. If we revised the profiling format, and introduced some additional host tooling, we could move some of the metadata into non-allocated sections, and construct the profraw file on the host. But this patch is sufficient for some use-cases.

How is this different from -profile-correlate=binary?

@smithp35
Copy link
Collaborator

Aside from a rewrite along the lines of what Petr Hosek mentions, I think that this is a good step forward. From what I understand it would permit us to shrink our baremetal profiling implementation down to just our implementation of __llvm_profile_dump() which would guard against most, but not all profile format changes.

I agree that it would be useful to change the name of the file as I would instinctively look for InstrProfilingPlatformOther rather than InstrProfilingPlatform.

If I've understood the scope of the change. The user/toolchain would still have to arrange for a mechanism to construct a means to output the counters in a format that llvm-profdata understands. In our sample runtime we used semi-hosting to write to a file. While I don't think it would be possible to cover all possible methods of output in the library, it could be possible to write a function that covered some common cases (user may have to supply some low-level callbacks). Alternatively it would be useful to write some documentation, possibly as a comment on what a toolchain provider/user has to provide to make this work. Could be a follow on change.

@efriedma-quic
Copy link
Collaborator Author

There's some room for improvement here in the future for doing profiling and code coverage for baremetal. If we revised the profiling format, and introduced some additional host tooling, we could move some of the metadata into non-allocated sections, and construct the profraw file on the host. But this patch is sufficient for some use-cases.

How is this different from -profile-correlate=binary?

I wasn't aware of that. I'll need to experiment.

That said, even if you strip out those sections, you still need to allocate a buffer large enough to hold a copy of all the counters, and run a bunch of code to fill that buffer. Ideally you could just stop the CPU, use the debugger to copy some data into a file, and use that as input to llvm-prof.


I'm fine with this change but I think a better strategy would be to start off with a minimal, header-only runtime implemented in C++ for baremetal systems which could also serve as a building block for full-featured OSes, eventually replacing the existing runtime which is rapidly approaching the point of being unmaintainable due to poor abstractions and code littered with #ifdefs.

I can see why you'd want a callback-driven interface to write out the profile file, instead of allocating a memory buffer or directly using filesystem APIs. But I don't see how any further granularity helps, unless you just want to avoid the compiler-rt build system.


AND what about embedded systems based on musl libc (e.g. so called generic-none targets) if they don't define COMPILER_RT_PROFILE_BAREMETAL ? or do they now need to define COMPILER_RT_PROFILE_BAREMETAL (meaning many things)?

It maybe depends on what exactly you've done. As far as I can tell, we currently use mmap, so I'm not sure how a baremetal target could build, unless your hacked up musl accidentally exposes mmap. But if you have a setup that works, this patch shouldn't affect it.

It's possible that we should be checking for __ELF__ in InstrProfilingPlatformLinux.c instead of __linux__; other places check for whether the target is ELF, instead of checking specific operating systems (including the LLVM instrumentation pass, and other places in libprofile).

@efriedma-quic
Copy link
Collaborator Author

If I've understood the scope of the change. The user/toolchain would still have to arrange for a mechanism to construct a means to output the counters in a format that llvm-profdata understands.

This subset of the profile library includes __llvm_profile_write_buffer, which outputs counters in the format llvm-profdata expects. Obviously the user still needs to copy that data off the device somehow.

Adds a flag COMPILER_RT_PROFILE_BAREMETAL, which disables the parts of
the profile runtime which require a filesystem or malloc.  This minimal
library only requires string.h from the C library.

This is useful for profiling or code coverage of baremetal images, which
don't have filesystem APIs, and might not have malloc configured (or
have limited heap space).

There's some room for improvement here in the future for doing profiling
and code coverage for baremetal. If we revised the profiling format, and
introduced some additional host tooling, we could move some of the
metadata into non-allocated sections, and construct the profraw file on
the host.  But this patch is sufficient for some use-cases.
Also adjust ifdef to be a bit more clear.
@efriedma-quic
Copy link
Collaborator Author

Updated with some comments, and cleaned up one of the ifdefs a bit.

My thinking for the filenames is that we actually should split InstrProfilingPlatformLinux.c: we should have InstrProfilingPlatformELF.c for the ELF-only bits, and InstrProfilingPlatformStaticInit.c for the "generic" __start/__stop bits, and InstrProfilingPlatformWasm.c for the wasm-only bits (basically just a stub, but it doesn't fit anywhere else). Then rename InstrProfilingPlatformOther.c to InstrProfilingPlatformDynamicInit.c. But I'd prefer to leave that for a followup.

@github-actions
Copy link

🐧 Linux x64 Test Results

  • 5820 tests passed
  • 1319 tests skipped

@lenary
Copy link
Member

lenary commented Nov 18, 2025

Thanks for the additional comments.

I agree that splitting up these files better can wait for a follow-up commit.

I am happy, but don't feel familiar enough to approve this commit.

@smithp35
Copy link
Collaborator

Thanks for adding the comments. I agree with you on splitting the file, was tempted to mention that yesterday but it may have been a change too far for others to accept.

This subset of the profile library includes __llvm_profile_write_buffer, which outputs counters in the format llvm-profdata expects. Obviously the user still needs to copy that data off the device somehow.

Thanks for pointing that out, it is quite easy to miss what the intended usage is. I eWhat I was thinking for a comment for toolchain vendors for how to use the API to save some time figuring it out. May not need to be in the source code, could put it in the description.

Could it be as simple as:

extern uint64_t __llvm_profile_get_size_for_buffer(void);
extern int __llvm_profile_write_buffer(char *Buffer);

uint64_t size = __llvm_profile_get_size_for_buffer(void);
char* buffer = allocate_memory(size);
if (__llvm_profile_write_buffer(buffer) == -1) {
  // error
}
// Program specifc way to write out buffer

I'm with Lenary in that I'm supportive of the direction, but I don't have any prior commit experience. If we end up with no-one willing to approve/reject I can put my name to it though.

@efriedma-quic
Copy link
Collaborator Author

Updated description.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

compiler-rt PGO Profile Guided Optimizations

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants