Skip to content

Conversation

@omern1
Copy link
Member

@omern1 omern1 commented Oct 29, 2025

This patch makes Clang produce the crash reproducer shell script for IR inputs as well.

This patch makes Clang produce the crash reproducer shell script for IR
inputs as well.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Oct 29, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 29, 2025

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Nabeel Omer (omern1)

Changes

This patch makes Clang produce the crash reproducer shell script for IR inputs as well.


Full diff: https://github.com/llvm/llvm-project/pull/165572.diff

2 Files Affected:

  • (modified) clang/lib/Driver/Driver.cpp (+71-33)
  • (added) clang/test/Driver/crash-ir-repro.cpp (+15)
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 40ea513e85427..7f28f7fa0c997 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -70,6 +70,7 @@
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallSet.h"
+#include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSet.h"
@@ -103,6 +104,7 @@
 #include <memory>
 #include <optional>
 #include <set>
+#include <string>
 #include <utility>
 #if LLVM_ON_UNIX
 #include <unistd.h> // getpid
@@ -2041,12 +2043,17 @@ void Driver::generateCompilationDiagnostics(
   InputList Inputs;
   BuildInputs(C.getDefaultToolChain(), C.getArgs(), Inputs);
 
+  ArgStringList IRInputs;
   for (InputList::iterator it = Inputs.begin(), ie = Inputs.end(); it != ie;) {
     bool IgnoreInput = false;
 
-    // Ignore input from stdin or any inputs that cannot be preprocessed.
-    // Check type first as not all linker inputs have a value.
-    if (types::getPreprocessedType(it->first) == types::TY_INVALID) {
+    // Save IR inputs separately, ignore input from stdin or any other inputs
+    // that cannot be preprocessed. Check type first as not all linker inputs
+    // have a value.
+    if (types::isLLVMIR(it->first)) {
+      IRInputs.push_back(it->second->getValue());
+      IgnoreInput = true;
+    } else if (types::getPreprocessedType(it->first) == types::TY_INVALID) {
       IgnoreInput = true;
     } else if (!strcmp(it->second->getValue(), "-")) {
       Diag(clang::diag::note_drv_command_failed_diag_msg)
@@ -2063,7 +2070,7 @@ void Driver::generateCompilationDiagnostics(
     }
   }
 
-  if (Inputs.empty()) {
+  if (Inputs.empty() && IRInputs.empty()) {
     Diag(clang::diag::note_drv_command_failed_diag_msg)
         << "Error generating preprocessed source(s) - "
            "no preprocessable inputs.";
@@ -2086,46 +2093,77 @@ void Driver::generateCompilationDiagnostics(
     return;
   }
 
-  // Construct the list of abstract actions to perform for this compilation. On
-  // Darwin OSes this uses the driver-driver and builds universal actions.
-  const ToolChain &TC = C.getDefaultToolChain();
-  if (TC.getTriple().isOSBinFormatMachO())
-    BuildUniversalActions(C, TC, Inputs);
-  else
-    BuildActions(C, C.getArgs(), Inputs, C.getActions());
+  // If we only have IR inputs there's no need for preprocessing.
+  if (!Inputs.empty()) {
+    // Construct the list of abstract actions to perform for this compilation.
+    // On
+    // Darwin OSes this uses the driver-driver and builds universal actions.
+    const ToolChain &TC = C.getDefaultToolChain();
+    if (TC.getTriple().isOSBinFormatMachO())
+      BuildUniversalActions(C, TC, Inputs);
+    else
+      BuildActions(C, C.getArgs(), Inputs, C.getActions());
 
-  BuildJobs(C);
+    BuildJobs(C);
 
-  // If there were errors building the compilation, quit now.
-  if (Trap.hasErrorOccurred()) {
-    Diag(clang::diag::note_drv_command_failed_diag_msg)
-        << "Error generating preprocessed source(s).";
-    return;
-  }
+    // If there were errors building the compilation, quit now.
+    if (Trap.hasErrorOccurred()) {
+      Diag(clang::diag::note_drv_command_failed_diag_msg)
+          << "Error generating preprocessed source(s).";
+      return;
+    }
+    // Generate preprocessed output.
+    SmallVector<std::pair<int, const Command *>, 4> FailingCommands;
+    C.ExecuteJobs(C.getJobs(), FailingCommands);
 
-  // Generate preprocessed output.
-  SmallVector<std::pair<int, const Command *>, 4> FailingCommands;
-  C.ExecuteJobs(C.getJobs(), FailingCommands);
+    // If any of the preprocessing commands failed, clean up and exit.
+    if (!FailingCommands.empty()) {
+      Diag(clang::diag::note_drv_command_failed_diag_msg)
+          << "Error generating preprocessed source(s).";
+      return;
+    }
 
-  // If any of the preprocessing commands failed, clean up and exit.
-  if (!FailingCommands.empty()) {
-    Diag(clang::diag::note_drv_command_failed_diag_msg)
-        << "Error generating preprocessed source(s).";
-    return;
+    const ArgStringList &TempFiles = C.getTempFiles();
+    if (TempFiles.empty()) {
+      Diag(clang::diag::note_drv_command_failed_diag_msg)
+          << "Error generating preprocessed source(s).";
+      return;
+    }
   }
 
-  const ArgStringList &TempFiles = C.getTempFiles();
-  if (TempFiles.empty()) {
-    Diag(clang::diag::note_drv_command_failed_diag_msg)
-        << "Error generating preprocessed source(s).";
-    return;
+  // Copying filenames due to ownership.
+  const ArgStringList &Files = C.getTempFiles();
+  SmallVector<std::string> TempFiles(Files.begin(), Files.end());
+
+  // We'd like to copy the IR input file into our own temp file
+  // because the build system might try to clean-up after itself.
+  for (auto const *Input : IRInputs) {
+    int FD;
+    llvm::SmallVector<char, 64> Path;
+
+    StringRef extension = llvm::sys::path::extension(Input);
+    if (!extension.empty())
+      extension = extension.drop_front();
+
+    std::error_code EC = llvm::sys::fs::createTemporaryFile(
+        llvm::sys::path::stem(Input), extension, FD, Path);
+
+    if (EC) {
+      Diag(clang::diag::note_drv_command_failed_diag_msg)
+          << "Error generating run script: " << "Failed copying IR input files"
+          << " " << EC.message();
+      return;
+    }
+    llvm::sys::fs::copy_file(Input, FD);
+
+    TempFiles.push_back(std::string(Path.begin(), Path.end()));
   }
 
   Diag(clang::diag::note_drv_command_failed_diag_msg) << BugReporMsg;
 
   SmallString<128> VFS;
   SmallString<128> ReproCrashFilename;
-  for (const char *TempFile : TempFiles) {
+  for (std::string &TempFile : TempFiles) {
     Diag(clang::diag::note_drv_command_failed_diag_msg) << TempFile;
     if (Report)
       Report->TemporaryFiles.push_back(TempFile);
@@ -2142,7 +2180,7 @@ void Driver::generateCompilationDiagnostics(
   }
 
   for (const char *TempFile : SavedTemps)
-    C.addTempFile(TempFile);
+    TempFiles.push_back(TempFile);
 
   // Assume associated files are based off of the first temporary file.
   CrashReportInfo CrashInfo(TempFiles[0], VFS);
diff --git a/clang/test/Driver/crash-ir-repro.cpp b/clang/test/Driver/crash-ir-repro.cpp
new file mode 100644
index 0000000000000..e0e47814a0e78
--- /dev/null
+++ b/clang/test/Driver/crash-ir-repro.cpp
@@ -0,0 +1,15 @@
+// RUN: %clang -S -emit-llvm -o %t.ll %s
+// RUN: not %clang -DCRASH %s %t.ll 2>&1 | FileCheck %s
+
+// CHECK: Preprocessed source(s) and associated run script(s) are located at:
+// CHECK-NEXT: clang: note: diagnostic msg: {{.*}}.cpp
+// CHECK-NEXT: clang: note: diagnostic msg: {{.*}}.ll
+// CHECK-NEXT: clang: note: diagnostic msg: {{.*}}.sh
+
+#ifdef CRASH
+#pragma clang __debug parser_crash
+#endif
+
+int main() {
+  return 0;
+}

@AaronBallman
Copy link
Collaborator

I think this makes sense and the changes look correct to me, but I've added a CodeGen person just to verify they think this is useful for them.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-LTO IR inputs are in sort of a weird place. We have handling for them, but we don't really have any rules for exactly what kind of IR inputs we're actually expecting: the format isn't really stable, and we sort of just shove them through whatever pipeline we usually create. If you do have some workflow, it would be nice to hear about.

We don't want to do this for LTO; the linker is better positioned to generate crash diagnostics in that case.

@@ -0,0 +1,15 @@
// RUN: %clang -S -emit-llvm -o %t.ll %s
// RUN: not %clang -DCRASH %s %t.ll 2>&1 | FileCheck %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this regression test will break in some situations; it assumes that invoking "clang" can actually link a program, which isn't the case for less common build configs. Can we trigger this without trying to link a program?

@omern1
Copy link
Member Author

omern1 commented Nov 10, 2025

Thanks @AaronBallman.

Thanks for the information @efriedma-quic, are you happy for this to go in as it stands?

@efriedma-quic
Copy link
Collaborator

We don't want to do this for LTO; the linker is better positioned to generate crash diagnostics in that case.

I'd like some response to this, at least.

@omern1
Copy link
Member Author

omern1 commented Nov 17, 2025

We don't want to do this for LTO; the linker is better positioned to generate crash diagnostics in that case.

I'd like some response to this, at least.

Ah I misunderstood your comment.

As things stand, if a link job crashes we never reach the point where we start constructing the list of inputs:

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, there's separate handling for crashes during the link step. In that case, one minor comment; otherwise LGTM.

<< " " << EC.message();
return;
}
llvm::sys::fs::copy_file(Input, FD);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to check the error code of copy_file?

@github-actions
Copy link

🐧 Linux x64 Test Results

  • 111271 tests passed
  • 4419 tests skipped

@omern1 omern1 merged commit 83d27f6 into llvm:main Nov 18, 2025
10 checks passed
@omern1 omern1 deleted the omern1/ir-repro branch November 18, 2025 15:56
googlewalt added a commit to googlewalt/llvm-project that referenced this pull request Nov 19, 2025
googlewalt added a commit that referenced this pull request Nov 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants