@pcc
Contributor

@pcc pcc commented Jun 5, 2025

This option complements -funique-source-file-names and allows the user
to use a different unique identifier than the source file path.
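
Whichever identifier is used, it still has to be unique across every object file in the link. As a rough illustration only, the following standalone sketch flags collisions in a list of compile jobs. `CompileJob` and `findDuplicateIdentifiers` are hypothetical names invented for this example and are not part of Clang or LLVM.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one compiler invocation (not a Clang type).
struct CompileJob {
  std::string SourcePath; // path as passed on the command line
  std::string Identifier; // value of -funique-source-file-identifier=, or
                          // empty if the option was not given
};

// Returns every effective identifier that is used by more than one job.
// The effective identifier is the explicit one if present, otherwise the
// source path, mirroring the fallback described in the PR.
std::vector<std::string>
findDuplicateIdentifiers(const std::vector<CompileJob> &Jobs) {
  std::map<std::string, int> Counts;
  for (const CompileJob &Job : Jobs)
    ++Counts[Job.Identifier.empty() ? Job.SourcePath : Job.Identifier];

  std::vector<std::string> Duplicates;
  for (const auto &Entry : Counts)
    if (Entry.second > 1)
      Duplicates.push_back(Entry.first);
  return Duplicates;
}
```

Two jobs that both compile a file named `tmp.cpp` from different working directories would collide on the path, and giving each job a distinct explicit identifier removes the collision.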

Created using spr 1.3.6-beta.1
@llvmbot llvmbot added the clang, clang:driver, clang:frontend, clang:codegen, and llvm:transforms labels Jun 5, 2025
@pcc pcc requested a review from teresajohnson June 5, 2025 05:37
@llvmbot
Member

llvmbot commented Jun 5, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-llvm-transforms

Author: Peter Collingbourne (pcc)

Changes

This flag complements -funique-source-file-names and allows the user to
use a different unique identifier than the source file path.
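
The interaction between the two options in the driver can be modelled roughly as follows. This is a simplified standalone sketch of the logic in the `Clang.cpp` hunk of the diff, not the driver code itself: `DriverArgs` and `renderUniqueIdentifierArgs` are hypothetical names, and plain strings stand in for Clang's argument classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the relevant driver state (not a Clang type).
struct DriverArgs {
  bool UniqueSourceFileNames = false; // -f[no-]unique-source-file-names
  bool HasUniqueIdentifier = false;   // was -funique-source-file-identifier= given?
  std::string UniqueIdentifier;       // its value, if given
  std::string InputPath;              // source file path for this job
};

// Returns the arguments forwarded to cc1 for these two options.
std::vector<std::string> renderUniqueIdentifierArgs(const DriverArgs &Args) {
  std::vector<std::string> CmdArgs;
  // Nothing is forwarded unless -funique-source-file-names is in effect.
  if (!Args.UniqueSourceFileNames)
    return CmdArgs;
  // Prefer the explicit identifier; otherwise fall back to the source path.
  const std::string &Id =
      Args.HasUniqueIdentifier ? Args.UniqueIdentifier : Args.InputPath;
  CmdArgs.push_back("-funique-source-file-identifier=" + Id);
  return CmdArgs;
}
```

Under this model the frontend only ever sees the identifier option, which matches the removal of the `UniqueSourceFileNames` codegen option in the diff.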


Full diff: https://github.com/llvm/llvm-project/pull/142901.diff

10 Files Affected:

  • (modified) clang/docs/UsersManual.rst (+12-5)
  • (modified) clang/include/clang/Basic/CodeGenOptions.def (-2)
  • (modified) clang/include/clang/Basic/CodeGenOptions.h (+4)
  • (modified) clang/include/clang/Driver/Options.td (+9-7)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+7-2)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+8-2)
  • (modified) clang/test/CodeGen/unique-source-file-names.c (+3-2)
  • (modified) clang/test/Driver/unique-source-file-names.c (+9-3)
  • (modified) llvm/lib/Transforms/Utils/ModuleUtils.cpp (+6-4)
  • (modified) llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll (+2-1)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 8c72f95b94095..62844f7e6a2fa 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2300,12 +2300,14 @@ are listed below.
 .. option:: -f[no-]unique-source-file-names
 
    When enabled, allows the compiler to assume that each object file
-   passed to the linker has been compiled using a unique source file
-   path. This is useful for reducing link times when doing ThinLTO
-   in combination with whole-program devirtualization or CFI.
+   passed to the linker has a unique identifier. The identifier for
+   an object file is either the source file path or the value of the
+   argument `-funique-source-file-identifier` if specified. This is
+   useful for reducing link times when doing ThinLTO in combination with
+   whole-program devirtualization or CFI.
 
-   The full source path passed to the compiler must be unique. This
-   means that, for example, the following is a usage error:
+   The full source path or identifier passed to the compiler must be
+   unique. This means that, for example, the following is a usage error:
 
    .. code-block:: console
 
@@ -2327,6 +2329,11 @@ are listed below.
    A misuse of this flag may result in a duplicate symbol error at
    link time.
 
+.. option:: -funique-source-file-identifier=IDENTIFIER
+
+   Used with `-funique-source-file-names` to specify a source file
+   identifier.
+
 .. option:: -fforce-emit-vtables
 
    In order to improve devirtualization, forces emitting of vtables even in
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index aad4e107cbeb3..fa9474d63ae42 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -278,8 +278,6 @@ CODEGENOPT(SanitizeCfiICallNormalizeIntegers, 1, 0) ///< Normalize integer types
                                                     ///< CFI icall function signatures
 CODEGENOPT(SanitizeCfiCanonicalJumpTables, 1, 0) ///< Make jump table symbols canonical
                                                  ///< instead of creating a local jump table.
-CODEGENOPT(UniqueSourceFileNames, 1, 0) ///< Allow the compiler to assume that TUs
-                                        ///< have unique source file names at link time
 CODEGENOPT(SanitizeKcfiArity, 1, 0) ///< Embed arity in KCFI patchable function prefix
 CODEGENOPT(SanitizeCoverageType, 2, 0) ///< Type of sanitizer coverage
                                        ///< instrumentation.
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index 278803f7bb960..f6a6a7fcfa6d7 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -338,6 +338,10 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// -fsymbol-partition (see https://lld.llvm.org/Partitions.html).
   std::string SymbolPartition;
 
+  /// If non-empty, allow the compiler to assume that the given source file
+  /// identifier is unique at link time.
+  std::string UniqueSourceFileIdentifier;
+  
   enum RemarkKind {
     RK_Missing,            // Remark argument not present on the command line.
     RK_Enabled,            // Remark enabled via '-Rgroup'.
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 5ca31c253ed8f..f04e214066ccb 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4204,13 +4204,15 @@ def ftrigraphs : Flag<["-"], "ftrigraphs">, Group<f_Group>,
 def fno_trigraphs : Flag<["-"], "fno-trigraphs">, Group<f_Group>,
   HelpText<"Do not process trigraph sequences">,
   Visibility<[ClangOption, CC1Option]>;
-defm unique_source_file_names: BoolOption<"f", "unique-source-file-names",
-  CodeGenOpts<"UniqueSourceFileNames">, DefaultFalse,
-  PosFlag<SetTrue, [], [CC1Option], "Allow">,
-  NegFlag<SetFalse, [], [], "Do not allow">,
-  BothFlags<[], [ClangOption], " the compiler to assume that each translation unit has a unique "
-                               "source file name at link time">>,
-  Group<f_clang_Group>;
+def funique_source_file_names: Flag<["-"], "funique-source-file-names">, Group<f_Group>,
+  HelpText<"Allow the compiler to assume that each translation unit has a unique "                       
+           "source file identifier (see funique-source-file-identifier) at link time">;
+def fno_unique_source_file_names: Flag<["-"], "fno-unique-source-file-names">;
+def unique_source_file_identifier_EQ: Joined<["-"], "funique-source-file-identifier=">, Group<f_Group>,
+  Visibility<[ClangOption, CC1Option]>,
+  HelpText<"Specify the source file identifier for -funique-source-file-names; "
+           "uses the source file path if not specified">,
+  MarshallingInfoString<CodeGenOpts<"UniqueSourceFileIdentifier">>;
 def funsigned_bitfields : Flag<["-"], "funsigned-bitfields">, Group<f_Group>;
 def funsigned_char : Flag<["-"], "funsigned-char">, Group<f_Group>;
 def fno_unsigned_char : Flag<["-"], "fno-unsigned-char">;
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 468fc6e0e5c56..4885965b35abb 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1146,8 +1146,13 @@ void CodeGenModule::Release() {
                               1);
   }
 
-  if (CodeGenOpts.UniqueSourceFileNames) {
-    getModule().addModuleFlag(llvm::Module::Max, "Unique Source File Names", 1);
+  if (!CodeGenOpts.UniqueSourceFileIdentifier.empty()) {
+    getModule().addModuleFlag(
+        llvm::Module::Append, "Unique Source File Identifier",
+        llvm::MDTuple::get(
+            TheModule.getContext(),
+            llvm::MDString::get(TheModule.getContext(),
+                                CodeGenOpts.UniqueSourceFileIdentifier)));
   }
 
   if (LangOpts.Sanitize.has(SanitizerKind::KCFI)) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 13842b8cc2870..504d79461d534 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -7740,8 +7740,14 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   Args.addOptInFlag(CmdArgs, options::OPT_fexperimental_late_parse_attributes,
                     options::OPT_fno_experimental_late_parse_attributes);
 
-  Args.addOptInFlag(CmdArgs, options::OPT_funique_source_file_names,
-                    options::OPT_fno_unique_source_file_names);
+  if (Args.hasFlag(options::OPT_funique_source_file_names,
+                    options::OPT_fno_unique_source_file_names, false)) {
+    if (Arg *A = Args.getLastArg(options::OPT_unique_source_file_identifier_EQ))
+      A->render(Args, CmdArgs);
+    else
+      CmdArgs.push_back(Args.MakeArgString(
+          Twine("-funique-source-file-identifier=") + Input.getBaseInput()));
+  }
 
   // Setup statistics file output.
   SmallString<128> StatsFile = getStatsFileName(Args, Output, Input, D);
diff --git a/clang/test/CodeGen/unique-source-file-names.c b/clang/test/CodeGen/unique-source-file-names.c
index 1d5a4a5e8e4c5..df8e3025870ae 100644
--- a/clang/test/CodeGen/unique-source-file-names.c
+++ b/clang/test/CodeGen/unique-source-file-names.c
@@ -1,2 +1,3 @@
-// RUN: %clang_cc1 -funique-source-file-names -triple x86_64-linux-gnu -emit-llvm %s -o - | FileCheck %s
-// CHECK:  !{i32 7, !"Unique Source File Names", i32 1}
+// RUN: %clang_cc1 -funique-source-file-identifier=foo -triple x86_64-linux-gnu -emit-llvm %s -o - | FileCheck %s
+// CHECK:  !{i32 5, !"Unique Source File Identifier", ![[MD:[0-9]*]]}
+// CHECK: ![[MD]] = !{!"foo"}
diff --git a/clang/test/Driver/unique-source-file-names.c b/clang/test/Driver/unique-source-file-names.c
index 8322f0e37b0c7..0dc71345d745c 100644
--- a/clang/test/Driver/unique-source-file-names.c
+++ b/clang/test/Driver/unique-source-file-names.c
@@ -1,5 +1,11 @@
 // RUN: %clang -funique-source-file-names -### %s 2> %t
-// RUN: FileCheck < %t %s
+// RUN: FileCheck --check-prefix=SRC < %t %s
 
-// CHECK: "-cc1"
-// CHECK: "-funique-source-file-names"
+// SRC: "-cc1"
+// SRC: "-funique-source-file-identifier={{.*}}unique-source-file-names.c"
+
+// RUN: %clang -funique-source-file-names -funique-source-file-identifier=foo -### %s 2> %t
+// RUN: FileCheck --check-prefix=ID < %t %s
+
+// ID: "-cc1"
+// ID: "-funique-source-file-identifier=foo"
diff --git a/llvm/lib/Transforms/Utils/ModuleUtils.cpp b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
index 10efdd61d4553..596849ecab742 100644
--- a/llvm/lib/Transforms/Utils/ModuleUtils.cpp
+++ b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
@@ -18,6 +18,7 @@
 #include "llvm/IR/IRBuilder.h"
 #include "llvm/IR/MDBuilder.h"
 #include "llvm/IR/Module.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/MD5.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Support/xxhash.h"
@@ -346,10 +347,11 @@ void llvm::filterDeadComdatFunctions(
 std::string llvm::getUniqueModuleId(Module *M) {
   MD5 Md5;
 
-  auto *UniqueSourceFileNames = mdconst::extract_or_null<ConstantInt>(
-      M->getModuleFlag("Unique Source File Names"));
-  if (UniqueSourceFileNames && UniqueSourceFileNames->getZExtValue()) {
-    Md5.update(M->getSourceFileName());
+  auto *UniqueSourceFileIdentifier = dyn_cast_or_null<MDNode>(
+      M->getModuleFlag("Unique Source File Identifier"));
+  if (UniqueSourceFileIdentifier) {
+    Md5.update(
+        cast<MDString>(UniqueSourceFileIdentifier->getOperand(0))->getString());
   } else {
     bool ExportsSymbols = false;
     for (auto &GV : M->global_values()) {
diff --git a/llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll b/llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll
index 0f3fd566f9b1c..13dcefcb70cb5 100644
--- a/llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll
+++ b/llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll
@@ -19,4 +19,5 @@ define internal void @f() {
 !0 = !{i32 0, !"typeid"}
 
 !llvm.module.flags = !{!1}
-!1 = !{i32 1, !"Unique Source File Names", i32 1}
+!1 = !{i32 5, !"Unique Source File Identifier", !2}
+!2 = !{!"unique-source-file-names.c"}
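
For readers skimming the `ModuleUtils.cpp` hunk, the effect on the module ID can be sketched in isolation. This is a standalone approximation under several assumptions: `ModuleModel`, `fnv1a`, and `uniqueModuleIdSketch` are hypothetical names, FNV-1a stands in for the MD5 hash used by the real code, and the fallback branch (only partially visible in the diff) is assumed to hash the names of exported symbols and to yield an empty ID when there are none.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified model of a module (not an LLVM type).
struct ModuleModel {
  std::string UniqueSourceFileIdentifier;   // empty: module flag is absent
  std::vector<std::string> ExportedSymbols; // names of exported definitions
};

// FNV-1a, used here purely as a stand-in for MD5.
std::uint64_t fnv1a(const std::string &Data,
                    std::uint64_t Hash = 14695981039346656037ULL) {
  for (unsigned char C : Data) {
    Hash ^= C;
    Hash *= 1099511628211ULL;
  }
  return Hash;
}

std::string uniqueModuleIdSketch(const ModuleModel &M) {
  std::uint64_t Hash = 14695981039346656037ULL;
  if (!M.UniqueSourceFileIdentifier.empty()) {
    // With the module flag present, only the identifier feeds the hash.
    Hash = fnv1a(M.UniqueSourceFileIdentifier, Hash);
  } else {
    // Assumed fallback: hash exported symbol names, or give up if none exist.
    if (M.ExportedSymbols.empty())
      return "";
    for (const std::string &Name : M.ExportedSymbols) {
      Hash = fnv1a(Name, Hash);
      Hash = fnv1a(std::string(1, '\0'), Hash); // separator between names
    }
  }
  return std::to_string(Hash); // output format chosen arbitrarily here
}
```

The point of the sketch is that, once an identifier is present, the ID no longer depends on the module's contents, which is why the identifier must be unique across the link.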

@github-actions

github-actions bot commented Jun 5, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions c,cpp,h -- clang/include/clang/Basic/CodeGenOptions.h clang/lib/CodeGen/CodeGenModule.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/test/CodeGen/unique-source-file-names.c clang/test/Driver/unique-source-file-names.c llvm/lib/Transforms/Utils/ModuleUtils.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/Utils/ModuleUtils.cpp b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
index 596849eca..05470b5cd 100644
--- a/llvm/lib/Transforms/Utils/ModuleUtils.cpp
+++ b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
@@ -11,8 +11,8 @@
 //===----------------------------------------------------------------------===//
 
 #include "llvm/Transforms/Utils/ModuleUtils.h"
-#include "llvm/Analysis/VectorUtils.h"
 #include "llvm/ADT/SmallString.h"
+#include "llvm/Analysis/VectorUtils.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/IRBuilder.h"

@MaskRay
Member

MaskRay commented Jun 5, 2025

This should be called an "option". Within LLVM, "flag" refers to an option that takes no value.
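
The distinction maps onto the two `Options.td` kinds used in the diff: `Flag` for `-funique-source-file-names` and `Joined` for `-funique-source-file-identifier=`. A toy classifier, written only to illustrate the terminology, might look like this. `ArgKind`, `ParsedArg`, and `classifyArg` are hypothetical names and do not correspond to LLVM's option-parsing API.

```cpp
#include <cassert>
#include <string>

enum class ArgKind { Flag, JoinedOption, Unknown };

struct ParsedArg {
  ArgKind Kind;
  std::string Name;  // spelling of the option
  std::string Value; // empty for flags
};

// Classifies one command-line argument against the spellings from this PR.
ParsedArg classifyArg(const std::string &Arg) {
  const std::string Flag = "-funique-source-file-names";
  const std::string NoFlag = "-fno-unique-source-file-names";
  const std::string Joined = "-funique-source-file-identifier=";

  // A flag matches its spelling exactly and carries no value.
  if (Arg == Flag || Arg == NoFlag)
    return {ArgKind::Flag, Arg, ""};
  // A joined option matches by prefix; the remainder is its value.
  if (Arg.compare(0, Joined.size(), Joined) == 0)
    return {ArgKind::JoinedOption, Joined, Arg.substr(Joined.size())};
  return {ArgKind::Unknown, Arg, ""};
}
```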

  Group<f_clang_Group>;
def funique_source_file_names: Flag<["-"], "funique-source-file-names">, Group<f_Group>,
  HelpText<"Allow the compiler to assume that each translation unit has a unique "
           "source file identifier (see funique-source-file-identifier) at link time">;
Contributor

nit: missing "-" in front of option name.

Contributor Author

Done

@pcc pcc changed the title from "Add -funique-source-file-identifier flag." to "Add -funique-source-file-identifier option." Jun 5, 2025
@pcc pcc merged commit d1b0b4b into main Jun 5, 2025
5 of 8 checks passed
@pcc pcc deleted the users/pcc/spr/add-funique-source-file-identifier-flag branch June 5, 2025 17:52
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jun 5, 2025
This option complements -funique-source-file-names and allows the user
to use a different unique identifier than the source file path.

Reviewers: teresajohnson

Reviewed By: teresajohnson

Pull Request: llvm/llvm-project#142901