Skip to content

Conversation

@ldrumm
Copy link
Contributor

@ldrumm ldrumm commented Jul 22, 2025

There's been a variation of the following in the code since 2005:

if (unoptimizable_true)
  return;
use_this_symbol_to_force_linking(); // unreachable but never removed

Way back in 00d5508 it was the win32 call GetCurrentProcess but switched to getenv("bar") fairly soon after in 63e504f. While that pulled in fewer dependencies and made the code portable, it's a bit of a weird construct. The environment variable used for the getenv call is "bar", which is particularly weird to see fly past when you run ltrace on a binary linked against LLVM.

In this patch I don't try to replace this construct wholesale - it's still required for architectural reasons I'm not able to tackle right now, but I did try and make it slightly less weird and opaque:

  • It gives the construct a name
  • The environment variable hints where this comes from and that its value is ignored

Combined, this should be a bit of improvement for the next person who wonders what LLVM is up to when they trace their process or see smatterings of getenv("bar") dotted around the source.

@github-actions
Copy link

github-actions bot commented Jul 22, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the comment to no longer talk about getenv? That's a detail of getNonFoldableAlwaysTrue now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// effectively a NO-OP. // This is so that globals in the translation units
// effectively a NO-OP. This is so that globals in the translation units

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@llvmbot
Copy link
Member

llvmbot commented Jul 22, 2025

@llvm/pr-subscribers-llvm-support

Author: Luke Drummond (ldrumm)

Changes

There's been a variation of the following in the code since 2005:

if (unoptimizable_true)
  return;
use_this_symbol_to_force_linking(); // unreachable but never removed

Way back in 00d5508 it was the win32 call GetCurrentProcess but switched to getenv("bar") fairly soon after in 63e504f. While that pulled in fewer dependencies and made the code portable, it's a bit of a weird construct. The environment variable used for the getenv call is "bar", which is particularly weird to see fly past when you run ltrace on a binary linked against LLVM.

In this patch I don't try to replace this construct wholesale - it's still required for architectural reasons I'm not able to tackle right now, but I did try and make it slightly less weird and opaque:

  • It gives the construct a name
  • The environment variable hints where this comes from and that its value is ignored

Combined, this should be a bit of improvement for the next person who wonders what LLVM is up to when they trace their process or see smatterings of getenv("bar") dotted around the source.


Full diff: https://github.com/llvm/llvm-project/pull/150020.diff

8 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/LinkAllAsmWriterComponents.h (+5-7)
  • (modified) llvm/include/llvm/CodeGen/LinkAllCodegenComponents.h (+5-7)
  • (modified) llvm/include/llvm/ExecutionEngine/MCJIT.h (+6-8)
  • (modified) llvm/include/llvm/LinkAllIR.h (+3-5)
  • (modified) llvm/include/llvm/LinkAllPasses.h (+7-8)
  • (added) llvm/include/llvm/Support/AlwaysTrue.h (+29)
  • (modified) llvm/tools/bugpoint/bugpoint.cpp (+2-1)
  • (modified) polly/include/polly/LinkAllPasses.h (+5-6)
diff --git a/llvm/include/llvm/CodeGen/LinkAllAsmWriterComponents.h b/llvm/include/llvm/CodeGen/LinkAllAsmWriterComponents.h
index c22f9d49f374b..c70413d72ebfd 100644
--- a/llvm/include/llvm/CodeGen/LinkAllAsmWriterComponents.h
+++ b/llvm/include/llvm/CodeGen/LinkAllAsmWriterComponents.h
@@ -15,19 +15,17 @@
 #define LLVM_CODEGEN_LINKALLASMWRITERCOMPONENTS_H
 
 #include "llvm/IR/BuiltinGCs.h"
-#include <cstdlib>
+#include "llvm/Support/AlwaysTrue.h"
 
 namespace {
   struct ForceAsmWriterLinking {
     ForceAsmWriterLinking() {
       // We must reference the plug-ins in such a way that compilers will not
       // delete it all as dead code, even with whole program optimization,
-      // yet is effectively a NO-OP. As the compiler isn't smart enough
-      // to know that getenv() never returns -1, this will do the job.
-      // This is so that globals in the translation units where these functions
-      // are defined are forced to be initialized, populating various
-      // registries.
-      if (std::getenv("bar") != (char*) -1)
+      // yet is effectively a NO-OP. This is so that globals in the translation
+      // units where these functions are defined are forced to be initialized,
+      // populating various registries.
+      if (llvm::getNonFoldableAlwaysTrue())
         return;
 
       llvm::linkOcamlGCPrinter();
diff --git a/llvm/include/llvm/CodeGen/LinkAllCodegenComponents.h b/llvm/include/llvm/CodeGen/LinkAllCodegenComponents.h
index 6f56682dce5fa..f0a01d2baf0e0 100644
--- a/llvm/include/llvm/CodeGen/LinkAllCodegenComponents.h
+++ b/llvm/include/llvm/CodeGen/LinkAllCodegenComponents.h
@@ -16,20 +16,18 @@
 
 #include "llvm/CodeGen/Passes.h"
 #include "llvm/CodeGen/SchedulerRegistry.h"
+#include "llvm/Support/AlwaysTrue.h"
 #include "llvm/Target/TargetMachine.h"
-#include <cstdlib>
 
 namespace {
   struct ForceCodegenLinking {
     ForceCodegenLinking() {
       // We must reference the passes in such a way that compilers will not
       // delete it all as dead code, even with whole program optimization,
-      // yet is effectively a NO-OP. As the compiler isn't smart enough
-      // to know that getenv() never returns -1, this will do the job.
-      // This is so that globals in the translation units where these functions
-      // are defined are forced to be initialized, populating various
-      // registries.
-      if (std::getenv("bar") != (char*) -1)
+      // yet is effectively a NO-OP. This is so that globals in the translation
+      // units where these functions are defined are forced to be initialized,
+      // populating various registries.
+      if (llvm::getNonFoldableAlwaysTrue())
         return;
 
       (void) llvm::createFastRegisterAllocator();
diff --git a/llvm/include/llvm/ExecutionEngine/MCJIT.h b/llvm/include/llvm/ExecutionEngine/MCJIT.h
index c836c06813fc6..1e035c0008fbd 100644
--- a/llvm/include/llvm/ExecutionEngine/MCJIT.h
+++ b/llvm/include/llvm/ExecutionEngine/MCJIT.h
@@ -15,8 +15,8 @@
 #define LLVM_EXECUTIONENGINE_MCJIT_H
 
 #include "llvm/ExecutionEngine/ExecutionEngine.h"
+#include "llvm/Support/AlwaysTrue.h"
 #include "llvm/Support/Compiler.h"
-#include <cstdlib>
 
 extern "C" LLVM_ABI void LLVMLinkInMCJIT();
 
@@ -24,13 +24,11 @@ namespace {
   struct ForceMCJITLinking {
     ForceMCJITLinking() {
       // We must reference MCJIT in such a way that compilers will not
-      // delete it all as dead code, even with whole program optimization,
-      // yet is effectively a NO-OP. As the compiler isn't smart enough
-      // to know that getenv() never returns -1, this will do the job.
-      // This is so that globals in the translation units where these functions
-      // are defined are forced to be initialized, populating various
-      // registries.
-      if (std::getenv("bar") != (char*) -1)
+      // delete it all as dead code, even with whole program optimization, yet
+      // is effectively a NO-OP. This is so that globals in the translation
+      // units where these functions are defined are forced to be initialized,
+      // populating various registries.
+      if (llvm::getNonFoldableAlwaysTrue())
         return;
 
       LLVMLinkInMCJIT();
diff --git a/llvm/include/llvm/LinkAllIR.h b/llvm/include/llvm/LinkAllIR.h
index ceed784d557de..894a8dd5bd926 100644
--- a/llvm/include/llvm/LinkAllIR.h
+++ b/llvm/include/llvm/LinkAllIR.h
@@ -21,6 +21,7 @@
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Verifier.h"
+#include "llvm/Support/AlwaysTrue.h"
 #include "llvm/Support/DynamicLibrary.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/Memory.h"
@@ -29,19 +30,16 @@
 #include "llvm/Support/Process.h"
 #include "llvm/Support/Program.h"
 #include "llvm/Support/Signals.h"
-#include <cstdlib>
 
 namespace {
   struct ForceVMCoreLinking {
     ForceVMCoreLinking() {
       // We must reference VMCore in such a way that compilers will not
-      // delete it all as dead code, even with whole program optimization,
-      // yet is effectively a NO-OP. As the compiler isn't smart enough
-      // to know that getenv() never returns -1, this will do the job.
+      // delete it all as dead code, even with whole program optimization.
       // This is so that globals in the translation units where these functions
       // are defined are forced to be initialized, populating various
       // registries.
-      if (std::getenv("bar") != (char*) -1)
+      if (llvm::getNonFoldableAlwaysTrue())
         return;
       llvm::LLVMContext Context;
       (void)new llvm::Module("", Context);
diff --git a/llvm/include/llvm/LinkAllPasses.h b/llvm/include/llvm/LinkAllPasses.h
index bae7f0da7022c..f82a43967e67a 100644
--- a/llvm/include/llvm/LinkAllPasses.h
+++ b/llvm/include/llvm/LinkAllPasses.h
@@ -34,6 +34,7 @@
 #include "llvm/CodeGen/Passes.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/IRPrintingPasses.h"
+#include "llvm/Support/AlwaysTrue.h"
 #include "llvm/Support/Valgrind.h"
 #include "llvm/Transforms/IPO.h"
 #include "llvm/Transforms/IPO/AlwaysInliner.h"
@@ -54,14 +55,12 @@ class Triple;
 namespace {
 struct ForcePassLinking {
   ForcePassLinking() {
-    // We must reference the passes in such a way that compilers will not
-    // delete it all as dead code, even with whole program optimization,
-    // yet is effectively a NO-OP. As the compiler isn't smart enough
-    // to know that getenv() never returns -1, this will do the job.
-    // This is so that globals in the translation units where these functions
-    // are defined are forced to be initialized, populating various
-    // registries.
-    if (std::getenv("bar") != (char *)-1)
+    // We must reference the passes in such a way that compilers will not delete
+    // it all as dead code, even with whole program optimization, yet is
+    // effectively a NO-OP. This is so that globals in the translation units
+    // where these functions are defined are forced to be initialized,
+    // populating various registries.
+    if (llvm::getNonFoldableAlwaysTrue())
       return;
 
     (void)llvm::createAtomicExpandLegacyPass();
diff --git a/llvm/include/llvm/Support/AlwaysTrue.h b/llvm/include/llvm/Support/AlwaysTrue.h
new file mode 100644
index 0000000000000..733596af3e17e
--- /dev/null
+++ b/llvm/include/llvm/Support/AlwaysTrue.h
@@ -0,0 +1,29 @@
+//===--- AlwaysTrue.h - Helper for oqaque truthy values        --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the getNonFoldableAlwaysTrue helper funtion
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_ALWAYS_TRUE_H
+#define LLVM_SUPPORT_ALWAYS_TRUE_H
+
+#include <cstdlib>
+
+namespace llvm {
+inline bool getNonFoldableAlwaysTrue() {
+  // Some parts of the codebase require a "constant true value" used as a
+  // predicate. These cases require that even with LTO and static linking,
+  // it's not possible for the compiler to fold the value. As compilers
+  // aren't smart enough to know that getenv() never returns -1, this will do
+  // the job.
+  return std::getenv("LLVM_IGNORED_ENV_VAR") != (char *)-1;
+}
+} // end namespace llvm
+
+#endif // LLVM_SUPPORT_ALWAYS_TRUE_H
diff --git a/llvm/tools/bugpoint/bugpoint.cpp b/llvm/tools/bugpoint/bugpoint.cpp
index e49efdfe7c8e0..1b4cae45f518b 100644
--- a/llvm/tools/bugpoint/bugpoint.cpp
+++ b/llvm/tools/bugpoint/bugpoint.cpp
@@ -22,6 +22,7 @@
 #include "llvm/LinkAllIR.h"
 #include "llvm/LinkAllPasses.h"
 #include "llvm/Passes/PassPlugin.h"
+#include "llvm/Support/AlwaysTrue.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/InitLLVM.h"
 #include "llvm/Support/PluginLoader.h"
@@ -111,7 +112,7 @@ int main(int argc, char **argv) {
   initializeInstCombine(Registry);
   initializeTarget(Registry);
 
-  if (std::getenv("bar") == (char*) -1) {
+  if (!llvm::getNonFoldableAlwaysTrue())
     InitializeAllTargets();
     InitializeAllTargetMCs();
     InitializeAllAsmPrinters();
diff --git a/polly/include/polly/LinkAllPasses.h b/polly/include/polly/LinkAllPasses.h
index a2f8f33299918..d2e10d1a7acc6 100644
--- a/polly/include/polly/LinkAllPasses.h
+++ b/polly/include/polly/LinkAllPasses.h
@@ -18,7 +18,7 @@
 #include "polly/Support/DumpFunctionPass.h"
 #include "polly/Support/DumpModulePass.h"
 #include "llvm/ADT/StringRef.h"
-#include <cstdlib>
+#include "llvm/Support/AlwaysTrue.h"
 
 namespace llvm {
 class Pass;
@@ -72,11 +72,10 @@ extern char &CodePreparationID;
 namespace {
 struct PollyForcePassLinking {
   PollyForcePassLinking() {
-    // We must reference the passes in such a way that compilers will not
-    // delete it all as dead code, even with whole program optimization,
-    // yet is effectively a NO-OP. As the compiler isn't smart enough
-    // to know that getenv() never returns -1, this will do the job.
-    if (std::getenv("bar") != (char *)-1)
+    // We must reference the passes in such a way that compilers will not delete
+    // it all as dead code, even with whole program optimization, yet is
+    // effectively a NO-OP.
+    if (llvm::getNonFoldableAlwaysTrue())
       return;
 
     polly::createCodePreparationPass();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding a dependency on getenv, did you consider using an empty side-effecting inline asm? THat would produce lower code size and reduce an unnecessary libc dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did, but decided to keep it as-is.

  • MSVC for x86-64 doesn't support inline assembly
  • getenv is already a dependency

@lattner lattner removed their assignment Jul 22, 2025
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This file defines the getNonFoldableAlwaysTrue helper funtion
// This file defines the getNonFoldableAlwaysTrue helper function.

But really, this comment seems redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted

There's been a variation of the following in the code since 2005:

    if (unoptimizable_true)
      return;
    use_this_symbol_to_force_linking(); // unreachable but never removed

Way back in 00d5508 it was the win32 call `GetCurrentProcess`
but switched to `getenv("bar")` fairly soon after in 63e504f. While
that pulled in fewer dependencies and made the code portable, it's a
bit of a weird construct. The environment variable used for the `getenv`
call is "bar", which is particularly weird to see fly past when you run
`ltrace` on a binary linked against LLVM.

In this patch I don't try to replace this construct wholesale - it's
still required for architectural reasons I'm not able to tackle right
now, but I did try and make it slightly less weird and opaque:

- It gives the construct a name
- The environment variable hints where this comes from and that its
  value is ignored

Combined, this should be a bit of improvement for the next person who
wonders what LLVM is up to when they trace their process or see
smatterings of `getenv("bar")` dotted around the source.
@ldrumm ldrumm merged commit f992ae4 into llvm:main Jul 23, 2025
1 of 2 checks passed
@ldrumm ldrumm deleted the getenvbar branch July 23, 2025 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants