Skip to content

Conversation

@mgcarrasco
Copy link
Contributor

@mgcarrasco mgcarrasco commented Sep 2, 2025

This PR introduces flexibility in configuring diagnostic error handling, previously hardcoded. The introduced changes are as follows:

  1. Configurable Exit Behavior: The behavior when encountering the first unhandled diagnostic error is now customizable with -halt-on-first-diag-error=[exit,abort,none]. The default remains as exit, which preserves existing behavior.
  2. Diagnostic Handler Toggling in llc: llc's diagnostic error handler, which prints all errors, can now be toggled (-no-diag-handler). By default, it continues to load as before, ensuring no change to current behavior.

These changes aim to add flexibility while ensuring that default settings do not alter the current behavior of diagnostic error handling.

Motivation:

Providing the option to generate a stack trace on the first diagnostic error is key for fuzzing, as stack traces can help deduplicate test cases. Capturing the stack trace right when the diagnostic error occurs allows a more precise identification.

@llvmbot
Copy link
Member

llvmbot commented Sep 4, 2025

@llvm/pr-subscribers-llvm-ir

Author: Manuel Carrasco (mgcarrasco)

Changes

This PR introduces flexibility in configuring diagnostic error handling, previously hardcoded. The introduced changes are as follows:

  1. Configurable Exit Behavior: The behavior when encountering the first unhandled diagnostic error is now customizable with -halt-on-first-diag-error=[exit,abort,none]. The default remains as exit, which preserves existing behavior.
  2. Diagnostic Handler Toggling in llc: llc's diagnostic error handler, which prints all errors, can now be toggled (-no-diag-handler). By default, it continues to load as before, ensuring no change to current behavior.

These changes aim to add flexibility while ensuring that default settings do not alter the current behavior of diagnostic error handling.

Motivation:

Providing the option to generate a stack trace on the first diagnostic error is key for fuzzing, as stack traces can help deduplicate test cases. Capturing the stack trace right when the diagnostic error occurs allows a more precise identification.


Full diff: https://github.com/llvm/llvm-project/pull/156439.diff

3 Files Affected:

  • (modified) llvm/lib/IR/LLVMContext.cpp (+41-2)
  • (added) llvm/test/tools/llc/no-diagnostic-handler.ll (+25)
  • (modified) llvm/tools/llc/llc.cpp (+9-2)
diff --git a/llvm/lib/IR/LLVMContext.cpp b/llvm/lib/IR/LLVMContext.cpp
index 57532cd491dd6..94257a101db51 100644
--- a/llvm/lib/IR/LLVMContext.cpp
+++ b/llvm/lib/IR/LLVMContext.cpp
@@ -31,6 +31,30 @@
 
 using namespace llvm;
 
+namespace opts {
+
+enum class HaltOnFirstDiagErrorAction {
+  Exit,
+  Abort,
+  None,
+};
+
+static cl::opt<HaltOnFirstDiagErrorAction> HaltOnFirstDiagErrorOpt(
+    "halt-on-first-diag-error",
+    cl::desc(
+        "Halt action to take on the first unhandled diagnostic error reported"),
+    cl::values(
+        clEnumValN(
+            HaltOnFirstDiagErrorAction::Exit, "exit",
+            "Exit with error code 1 on first diagnostic with error severity"),
+        clEnumValN(HaltOnFirstDiagErrorAction::Abort, "abort",
+                   "Abort with a stacktrace immediately on first diagnostic "
+                   "with error severity"),
+        clEnumValN(HaltOnFirstDiagErrorAction::None, "none",
+                   "Do not halt on first diagnostic with error severity")),
+    cl::init(HaltOnFirstDiagErrorAction::Exit), cl::Hidden);
+} // namespace opts
+
 static StringRef knownBundleName(unsigned BundleTagID) {
   switch (BundleTagID) {
   case LLVMContext::OB_deopt:
@@ -242,6 +266,19 @@ LLVMContext::getDiagnosticMessagePrefix(DiagnosticSeverity Severity) {
   llvm_unreachable("Unknown DiagnosticSeverity");
 }
 
+static void handleHaltOnFirstDiagError() {
+  switch (opts::HaltOnFirstDiagErrorOpt) {
+  case opts::HaltOnFirstDiagErrorAction::Exit:
+    std::exit(1);
+    break;
+  case opts::HaltOnFirstDiagErrorAction::Abort:
+    std::abort();
+    break;
+  default:
+    break;
+  }
+}
+
 void LLVMContext::diagnose(const DiagnosticInfo &DI) {
   if (auto *OptDiagBase = dyn_cast<DiagnosticInfoOptimizationBase>(&DI))
     if (LLVMRemarkStreamer *RS = getLLVMRemarkStreamer())
@@ -264,8 +301,10 @@ void LLVMContext::diagnose(const DiagnosticInfo &DI) {
   errs() << getDiagnosticMessagePrefix(DI.getSeverity()) << ": ";
   DI.print(DP);
   errs() << "\n";
-  if (DI.getSeverity() == DS_Error)
-    exit(1);
+
+  if (DI.getSeverity() == DS_Error) {
+    handleHaltOnFirstDiagError();
+  }
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/llvm/test/tools/llc/no-diagnostic-handler.ll b/llvm/test/tools/llc/no-diagnostic-handler.ll
new file mode 100644
index 0000000000000..72f4b3cd2b4a2
--- /dev/null
+++ b/llvm/test/tools/llc/no-diagnostic-handler.ll
@@ -0,0 +1,25 @@
+; COM: Test that the default behavior persists (the llc-specific handler prints all errors).
+; RUN: not llc -mtriple=amdgcn -verify-machineinstrs=0 -global-isel=false < %s 2>&1 | FileCheck -check-prefix=ALL-ERRORS %s
+; COM: Do not halt on the first error when the llc-specific handler is not loaded.
+; RUN: not llc -mtriple=amdgcn -verify-machineinstrs=0 -global-isel=false -no-diag-handler -halt-on-first-diag-error=none < %s 2>&1 | FileCheck -check-prefix=ALL-ERRORS %s
+
+; COM: Now halt on the first error by disabling the llc-specific handler and test the different halt actions
+; RUN: not llc -mtriple=amdgcn -verify-machineinstrs=0 -global-isel=false -no-diag-handler -halt-on-first-diag-error=exit < %s 2>&1 | FileCheck -check-prefix=FIRST-ERROR %s
+; COM: Same error message as in -halt-on-first-diag-error=exit but with a crash.
+; RUN: not --crash llc -mtriple=amdgcn -verify-machineinstrs=0 -global-isel=false -no-diag-handler -halt-on-first-diag-error=abort < %s 2>&1 | FileCheck -check-prefix=FIRST-ERROR %s
+
+; ALL-ERRORS: error: <unknown>:0:0: in function illegal_vgpr_to_sgpr_copy_i32 void (): illegal VGPR to SGPR copy
+; FIRST-ERROR: error: <unknown>:0:0: in function illegal_vgpr_to_sgpr_copy_i32 void (): illegal VGPR to SGPR copy
+define amdgpu_kernel void @illegal_vgpr_to_sgpr_copy_i32() #0 {
+  %vgpr = call i32 asm sideeffect "; def $0", "=${v1}"()
+  call void asm sideeffect "; use $0", "${s9}"(i32 %vgpr)
+  ret void
+}
+
+; ALL-ERRORS: error: <unknown>:0:0: in function illegal_vgpr_to_sgpr_copy_v2i32 void (): illegal VGPR to SGPR copy
+; FIRST-ERROR-NOT: error: <unknown>:0:0: in function illegal_vgpr_to_sgpr_copy_v2i32 void (): illegal VGPR to SGPR copy
+define amdgpu_kernel void @illegal_vgpr_to_sgpr_copy_v2i32() #0 {
+  %vgpr = call <2 x i32> asm sideeffect "; def $0", "=${v[0:1]}"()
+  call void asm sideeffect "; use $0", "${s[10:11]}"(<2 x i32> %vgpr)
+  ret void
+}
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index b3d7185e7f144..06be15500028a 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -213,6 +213,11 @@ static cl::opt<std::string> PassPipeline(
 static cl::alias PassPipeline2("p", cl::aliasopt(PassPipeline),
                                cl::desc("Alias for -passes"));
 
+static cl::opt<bool>
+    NoDiagHandler("no-diag-handler",
+                  cl::desc("Do not load the llc-specific diagnostic handler."),
+                  cl::init(false), cl::Hidden);
+
 namespace {
 
 std::vector<std::string> &getRunPassNames() {
@@ -384,8 +389,10 @@ int main(int argc, char **argv) {
   LLVMContext Context;
   Context.setDiscardValueNames(DiscardValueNames);
 
-  // Set a diagnostic handler that doesn't exit on the first error
-  Context.setDiagnosticHandler(std::make_unique<LLCDiagnosticHandler>());
+  if (!NoDiagHandler) {
+    // Set a diagnostic handler that doesn't exit on the first error
+    Context.setDiagnosticHandler(std::make_unique<LLCDiagnosticHandler>());
+  }
 
   Expected<std::unique_ptr<ToolOutputFile>> RemarksFileOrErr =
       setupLLVMOptimizationRemarks(Context, RemarksFilename, RemarksPasses,

@jmmartinez jmmartinez requested a review from rengolin September 8, 2025 09:17
@jmmartinez jmmartinez requested a review from arsenm September 15, 2025 09:46
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like that today we have a difference in error behavior between llc and opt. This PR almost helps with that, but it adds yet another mechanism parallel to the existing diagnostic callback. Can we just make it not an option, and always do what llc does?

In particular, I think the behavior of exiting is always wrong. llc has the correct and more useful behavior for writing error tests.

@mgcarrasco
Copy link
Contributor Author

@arsenm Thanks for the feedback! My main goal is to optionally generate a stack trace for diagnostics errors. This is key for fuzzing in order to deduplicate test cases based on the similarity of their stack traces. This PR gives us such option.

I'm trying to understand/find concrete cases where opt may behave like llc as it was mentioned. I couldn't find cases of diagnose() or emitError() calls in IR-related files (llvm/lib/IR) that are called by opt. Would you have any example at hand?

@mgcarrasco mgcarrasco requested a review from arsenm September 29, 2025 08:28
@CatherineMoore
Copy link
Contributor

@CatherineMoore for awareness

if (!NoDiagHandler) {
// Set a diagnostic handler that doesn't exit on the first error
Context.setDiagnosticHandler(std::make_unique<LLCDiagnosticHandler>());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried setting a DiagHandlerCallback ?

Your current proposal adds 2 options: one for everyone linking with LLVMContext, and another llc specific.

You could just keep one that is llc-specific (like, abort-on-error or something like that).
The idea is to keep the LLCDiagnosticHandler, but call:

if(AbortOnError) {
  Context.setDiagnosticHandlerCallBack(
    [](const DiagnosticInfo*, void*) { abort(1); },
    nullptr);
}

When an error is diagnosed, it would go through LLCDiagnosticHandler::handleDiagnostics which calls DiagnosticHandler::handleDiagnostics, which calls DiagnosticHandler::DiagHandlerCallback if it is setted, which in our case will call abort(1).

That would reduce the surface of the change to achieve the behavior you want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this, it's weird that LLCDiagnosticHandler::handleDiagnostics calls DiagnosticHandler::handleDiagnostics but ignores its result (if the diagnostic was handled by the callback), which is always true if the callback is set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants