[RFC] Emit dwarf data for signature-changed or new functions #157349

yonghong-song · 2025-09-07T16:46:48Z

Add a new pass, EmitChangedFuncDebugInfo, which adds DWARF for
additional functions: functions whose signatures were changed by
optimization, and newly created functions.

The previous approach in [1] tried to add debuginfo inside the
optimization passes that cause signature changes. Based on the
discussion in [1], a dedicated pass that adds the debuginfo is
preferred, so that later DWARF generation can include the new
debuginfo.

The following is an example:
Source:

  $ cat test.c
  struct t { int a; };
  char *tar(struct t *a, struct t *d);
  __attribute__((noinline)) static char * foo(struct t *a, struct t *d, int b)
  {
    return tar(a, d);
  }
  char *bar(struct t *a, struct t *d)
  {
    return foo(a, d, 1);
  }

Compile and dump the DWARF with:

  clang -O2 -c -g test.c
  llvm-dwarfdump test.o

The relevant DWARF output:

0x0000005c:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000000010)
                DW_AT_high_pc   (0x0000000000000015)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_linkage_name      ("foo")
                DW_AT_name      ("foo")
                DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c")
                DW_AT_decl_line (3)
                DW_AT_type      (0x000000bb "char *")
                DW_AT_artificial        (true)
                DW_AT_external  (true)

0x0000006c:     DW_TAG_formal_parameter
                  DW_AT_location        (DW_OP_reg5 RDI)
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x00000075:     DW_TAG_formal_parameter
                  DW_AT_location        (DW_OP_reg4 RSI)
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x0000007e:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin (0x0000009a "foo")
                  DW_AT_low_pc  (0x0000000000000010)
                  DW_AT_high_pc (0x0000000000000015)
                  DW_AT_call_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_call_line       (0)

0x0000008a:       DW_TAG_formal_parameter
                    DW_AT_location      (DW_OP_reg5 RDI)
                    DW_AT_abstract_origin       (0x000000a2 "a")

0x00000091:       DW_TAG_formal_parameter
                    DW_AT_location      (DW_OP_reg4 RSI)
                    DW_AT_abstract_origin       (0x000000aa "d")

0x00000098:       NULL

0x00000099:     NULL

0x0000009a:   DW_TAG_subprogram
                DW_AT_name      ("foo")
                DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c")
                DW_AT_decl_line (3)
                DW_AT_prototyped        (true)
                DW_AT_type      (0x000000bb "char *")
                DW_AT_inline    (DW_INL_inlined)

0x000000a2:     DW_TAG_formal_parameter
                  DW_AT_name    ("a")
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x000000aa:     DW_TAG_formal_parameter
                  DW_AT_name    ("d")
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x000000b2:     DW_TAG_formal_parameter
                  DW_AT_name    ("b")
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000d8 "int")

0x000000ba:     NULL
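
In the dump above, the emitted foo (0x0000005c) has only two formal parameters, while the abstract origin (0x0000009a) still lists a, d and b. A minimal sketch of what the emitted function effectively looks like at source level, assuming the unused argument b was dropped by dead argument elimination. The name foo_after_dae and the stub body of tar are hypothetical and exist only to make the snippet self-contained; the helpers are left non-static for the same reason.

```cpp
#include <cassert>

struct t { int a; };

// Hypothetical stub: in test.c, tar is only declared, not defined.
static char marker;
char *tar(struct t *a, struct t *d) {
  (void)a;
  (void)d;
  return &marker;
}

// Source-level signature: three parameters, but 'b' is never read.
char *foo(struct t *a, struct t *d, int b) {
  (void)b;
  return tar(a, d);
}

// What the emitted code effectively implements once the unused
// argument is removed: two parameters, same behavior.
char *foo_after_dae(struct t *a, struct t *d) {
  return tar(a, d);
}
```

Calling foo(&x, &y, 1) and foo_after_dae(&x, &y) on the same two objects returns the same pointer, which is why the third parameter no longer appears under the concrete DW_TAG_subprogram.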

There are some restrictions in the current implementation:

- Only the C language is supported.
- The BPF target is excluded, as one of the main goals of this pull
  request is to generate proper vmlinux BTF for architectures like
  x86_64, arm64, etc.
- The function must not be an intrinsic, a declaration only, a function
  whose return value is larger than the arch register size, or a
  function with variable arguments.
- A flag to turn off this feature is still missing, and so is some debug
  info (e.g. an argument cannot be easily retrieved from dbg_value etc.).
- Currently, some functions (e.g. foo.llvm.) do not change
  signatures but the current implementation still marks them
  as artificial. We might be able to avoid DW_TAG_inlined_subroutine
  for these routines.
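
The function-level restrictions above can be read as a single filter. The following is only a sketch under stated assumptions: FuncSummary, its fields, the "bpf" arch string and isCandidate are hypothetical names for illustration, not LLVM types or the checks the patch actually performs.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of a function; not an LLVM data structure.
struct FuncSummary {
  std::string SourceLanguage;  // e.g. "C"
  std::string TargetArch;      // e.g. "x86_64", "arm64", "bpf"
  bool IsIntrinsic;
  bool IsDeclarationOnly;
  bool IsVarArg;
  unsigned ReturnSizeInBits;
  unsigned RegisterSizeInBits;
};

// Returns true when none of the listed restrictions applies.
bool isCandidate(const FuncSummary &F) {
  if (F.SourceLanguage != "C")
    return false;  // only C is supported
  if (F.TargetArch == "bpf")
    return false;  // the BPF target is excluded
  if (F.IsIntrinsic || F.IsDeclarationOnly || F.IsVarArg)
    return false;
  // The return value must fit in one register.
  return F.ReturnSizeInBits <= F.RegisterSizeInBits;
}
```

For example, a C function on x86_64 returning a 64-bit pointer passes the filter, while the same function with variable arguments or on the BPF target does not.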

I have tested this patch set by building the latest bpf-next Linux kernel.
For the no-LTO case:

  65341 original number of functions
  1085  new functions with this patch

For the ThinLTO case:

  65595 original number of functions
  2492  new functions with this patch

[1] #127855

llvmbot · 2025-09-07T16:47:19Z

@llvm/pr-subscribers-pgo
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-debuginfo

Author: None (yonghong-song)

Changes

Add a new pass, EmitChangedFuncDebugInfo, which adds DWARF for
additional functions: functions whose signatures were changed by
optimization, and newly created functions.

The previous approach in [1] tried to add debuginfo inside the
optimization passes that cause signature changes. Based on the
discussion in [1], a dedicated pass that adds the debuginfo is
preferred, so that later DWARF generation can include the new
debuginfo.

The ultimate goal is to add new information to dwarf like below:

  DW_TAG_compile_unit
    ...
    // New functions with suffix
    DW_TAG_inlined_subroutine
      DW_AT_name      ("foo.1")
      DW_AT_type      (0x0000000000000091 "int")
      DW_AT_artificial (true)
      DW_AT_specification (original DW_TAG_subprogram)

      DW_TAG_formal_parameter
        DW_AT_name    ("b")
        DW_AT_type    (0x0000000000000091 "int")

      DW_TAG_formal_parameter
        DW_AT_name    ("c")
        DW_AT_type    (0x0000000000000095 "long")

    ...
    // Functions with changed signatures
    DW_TAG_inlined_subroutine
      DW_AT_name      ("bar")
      DW_AT_type      (0x0000000000000091 "int")
      DW_AT_artificial (true)
      DW_AT_specification (original DW_TAG_subprogram)

      DW_TAG_formal_parameter
        DW_AT_name    ("c")
        DW_AT_type    (0x0000000000000095 "unsigned int")

    ...
    // Functions whose changed signatures could not be obtained yet.
    // The presence of DW_CC_nocall indicates such cases.
    DW_TAG_inlined_subroutine
      DW_AT_name      ("bar" or "bar.1")
      DW_AT_calling_convention        (DW_CC_nocall)
      DW_AT_artificial (true)
      DW_AT_specification (original DW_TAG_subprogram)

The parent tag of the above DW_TAG_inlined_subroutine is
DW_TAG_compile_unit. This is a new feature for DWARF,
so it won't cause issues with existing DWARF-related tools.
Three patterns are introduced, as shown above:
. New functions with a suffix, e.g., 'foo.1' or 'foo.llvm.<hash>'.
. Functions with a changed signature due to ArgumentPromotion
  or DeadArgumentElimination.
. Functions for which the current implementation cannot get a proper
  signature. In this case, DW_CC_nocall is set to indicate that the
  signature is lost. More details below.

A special CompileUnit with file name "<artificial>" is created
to hold special DISubprograms for the above three kinds of functions.
During actual DWARF generation, these special DISubprograms
are turned into the DW_TAG_inlined_subroutine tags shown above.

Below are some notes on cases that are not handled yet and
on possible alternatives:
(1) Currently, three kinds of signature changes are not handled.
. Due to ArgumentPromotion, we may have
foo(..., struct foo *p, ...) => foo(..., int p.0.val, int p.4.val, ...)
. A struct argument which expands to two actual arguments,
foo(..., struct foo v, ...) => foo(..., v.coerce0, v.coerce1, ...)
. A struct argument changed to a struct pointer,
foo(..., struct foo v, ...) => foo(..., struct foo *p, ...)
I think by utilizing dbg_value/dbg_declare and instructions, we
might be able to resolve the above and get the proper signature.
But any suggestions are welcome.
(2) Currently, I am using a special CompileUnit "<artificial>" to hold
newly created DISubprograms. But there is an alternative.
For example, "llvm.dbg.cu" metadata is used to hold all CompileUnits.
We could introduce "llvm.dbg.sp.extra" to hold all new
DISubprograms instead of a new CompileUnit.
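
To make the first unhandled case in (1) concrete, here is a source-level sketch of the ArgumentPromotion rewrite, where one pointer parameter becomes two scalar parameters. It is an illustration only: struct pair, sum_byref and sum_promoted are hypothetical names, and the offsets in the comments assume a 4-byte int.

```cpp
#include <cassert>

// Two int fields: x at byte offset 0, y at byte offset 4 (4-byte int).
struct pair {
  int x;
  int y;
};

// Before: the callee takes a pointer and loads both fields itself.
int sum_byref(const struct pair *p) {
  return p->x + p->y;
}

// After (conceptually): the loads move into the caller and the callee
// takes the loaded values. The parameter names mirror the p.0.val and
// p.4.val names in the IR, i.e. the values at byte offsets 0 and 4.
int sum_promoted(int p_0_val, int p_4_val) {
  return p_0_val + p_4_val;
}
```

The source-level debug info still describes a single parameter p of pointer type, while the emitted function has two integer parameters, which is why a one-to-one mapping from the old argument list does not work here.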

I have tested this patch set by building the latest bpf-next Linux kernel.
For the no-LTO case:

  65288 original number of functions
  910   new functions with this patch (including DW_CC_nocall case)
  7     new functions without signatures (with DW_CC_nocall)

For the ThinLTO case:

  65541 original number of functions
  2324  new functions with this patch (including DW_CC_nocall case)
  14    new functions without signatures (with DW_CC_nocall)

The following are some examples of the DWARF generated with ThinLTO:

  ...
  0x0001707f:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("msr_build_context")
                  DW_AT_type      (0x00004163 "int")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000440b "msr_build_context")

  0x0001708b:     DW_TAG_formal_parameter
                    DW_AT_name    ("msr_id")
                    DW_AT_type    (0x0000e55c "const u32 *")

  0x00017093:     NULL
  ...
  0x004225e5:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("__die_body.llvm.14794269134614576759")
                  DW_AT_type      (0x00418a14 "int")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x00422348 "__die_body")

  0x004225f1:     DW_TAG_formal_parameter
                    DW_AT_name    ("")
                    DW_AT_type    (0x004181f3 "const char *")

  0x004225f9:     DW_TAG_formal_parameter
                    DW_AT_name    ("")
                    DW_AT_type    (0x00419118 "pt_regs *")

  0x00422601:     DW_TAG_formal_parameter
                    DW_AT_name    ("")
                    DW_AT_type    (0x0041af2f "long")

  0x00422609:     NULL
  ...
  0x013f5dac:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("devkmsg_emit")
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x013ef75b "devkmsg_emit")
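
A consumer of these entries has to distinguish the DW_CC_nocall case from the entries that carry a usable signature. The following is a rough sketch of that decision; Param, Entry and describe are hypothetical types and helpers for illustration and do not correspond to any existing tool's API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory view of one DW_TAG_inlined_subroutine entry.
struct Param {
  std::string Name;  // may be empty, as in the __die_body example
  std::string Type;
};

struct Entry {
  std::string Name;
  std::string ReturnType;
  bool NoCall;  // DW_AT_calling_convention is DW_CC_nocall
  std::vector<Param> Params;
};

// Renders a prototype, or reports that the signature is unavailable.
std::string describe(const Entry &E) {
  if (E.NoCall)
    return E.Name + ": signature lost";
  std::string Out = E.ReturnType + " " + E.Name + "(";
  for (std::size_t I = 0; I < E.Params.size(); ++I) {
    if (I != 0)
      Out += ", ";
    Out += E.Params[I].Type;
    if (!E.Params[I].Name.empty())
      Out += " " + E.Params[I].Name;
  }
  return Out + ")";
}
```

Applied to the examples above, the msr_build_context entry renders as a prototype with one parameter, while the devkmsg_emit entry is reported as having no usable signature.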

[1] #127855

Patch is 25.17 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/157349.diff

14 Files Affected:

(added) llvm/include/llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h (+33)
(modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+72)
(modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h (+2)
(modified) llvm/lib/Passes/PassBuilder.cpp (+1)
(modified) llvm/lib/Passes/PassBuilderPipelines.cpp (+7-3)
(modified) llvm/lib/Passes/PassRegistry.def (+1)
(modified) llvm/lib/Transforms/IPO/ArgumentPromotion.cpp (+9)
(modified) llvm/lib/Transforms/Utils/CMakeLists.txt (+1)
(added) llvm/lib/Transforms/Utils/EmitChangedFuncDebugInfo.cpp (+337)
(modified) llvm/test/Other/new-pm-defaults.ll (+2)
(modified) llvm/test/Other/new-pm-thinlto-postlink-defaults.ll (+1)
(modified) llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll (+1)
(modified) llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll (+1)
(modified) llvm/test/Transforms/ArgumentPromotion/dbg.ll (+5-1)

diff --git a/llvm/include/llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h b/llvm/include/llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h
new file mode 100644
index 0000000000000..8d569cd95d7f7
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h
@@ -0,0 +1,33 @@
+//===- EmitChangedFuncDebugInfo.h - Emit Additional Debug Info -*- C++ --*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+/// \file
+/// Emit debug info for changed or new funcs.
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_UTILS_EMITCHANGEDFUNCDEBUGINFO_H
+#define LLVM_TRANSFORMS_UTILS_EMITCHANGEDFUNCDEBUGINFO_H
+
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+class Module;
+
+// Pass that emits late dwarf.
+class EmitChangedFuncDebugInfoPass
+    : public PassInfoMixin<EmitChangedFuncDebugInfoPass> {
+public:
+  EmitChangedFuncDebugInfoPass() = default;
+
+  PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
+};
+
+} // end namespace llvm
+
+#endif // LLVM_TRANSFORMS_UTILS_EMITCHANGEDFUNCDEBUGINFO_H
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index c27f100775625..3245d486feb77 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -1266,11 +1266,83 @@ void DwarfDebug::finishSubprogramDefinitions() {
   }
 }
 
+void DwarfDebug::addChangedSubprograms() {
+  // Generate additional dwarf for functions with signature changed.
+  NamedMDNode *NMD = MMI->getModule()->getNamedMetadata("llvm.dbg.cu");
+  DICompileUnit *ExtraCU = nullptr;
+  for (MDNode *N : NMD->operands()) {
+    auto *CU = cast<DICompileUnit>(N);
+    if (CU->getFile()->getFilename() == "<artificial>") {
+      ExtraCU = CU;
+      break;
+    }
+  }
+  if (!ExtraCU)
+    return;
+
+  llvm::DebugInfoFinder DIF;
+  DIF.processModule(*MMI->getModule());
+  for (auto *ExtraSP : DIF.subprograms()) {
+    if (ExtraSP->getUnit() != ExtraCU)
+      continue;
+
+    DISubprogram *SP = cast<DISubprogram>(ExtraSP->getScope());
+    DwarfCompileUnit &Cu = getOrCreateDwarfCompileUnit(SP->getUnit());
+    DIE *ScopeDIE =
+        DIE::get(DIEValueAllocator, dwarf::DW_TAG_inlined_subroutine);
+    Cu.getUnitDie().addChild(ScopeDIE);
+
+    Cu.addString(*ScopeDIE, dwarf::DW_AT_name, ExtraSP->getName());
+
+    DITypeRefArray Args = ExtraSP->getType()->getTypeArray();
+
+    if (Args[0])
+        Cu.addType(*ScopeDIE, Args[0]);
+
+    if (ExtraSP->getType()->getCC() == llvm::dwarf::DW_CC_nocall) {
+      Cu.addUInt(*ScopeDIE, dwarf::DW_AT_calling_convention,
+                 dwarf::DW_FORM_data1, llvm::dwarf::DW_CC_nocall);
+    }
+
+    Cu.addFlag(*ScopeDIE, dwarf::DW_AT_artificial);
+
+    // dereference the DIE* for DIEEntry
+    DIE *OriginDIE = Cu.getOrCreateSubprogramDIE(SP);
+    Cu.addDIEEntry(*ScopeDIE, dwarf::DW_AT_specification, DIEEntry(*OriginDIE));
+
+    SmallVector<const DILocalVariable *> ArgVars(Args.size());
+    for (const DINode *DN : ExtraSP->getRetainedNodes()) {
+      if (const auto *DV = dyn_cast<DILocalVariable>(DN)) {
+        uint32_t Arg = DV->getArg();
+        if (Arg)
+          ArgVars[Arg - 1] = DV;
+      }
+    }
+
+    for (unsigned i = 1, N = Args.size(); i < N; ++i) {
+      const DIType *Ty = Args[i];
+      if (!Ty) {
+        assert(i == N-1 && "Unspecified parameter must be the last argument");
+        Cu.createAndAddDIE(dwarf::DW_TAG_unspecified_parameters, *ScopeDIE);
+      } else {
+        DIE &Arg =
+            Cu.createAndAddDIE(dwarf::DW_TAG_formal_parameter, *ScopeDIE);
+        const DILocalVariable *DV = ArgVars[i - 1];
+        if (DV)
+          Cu.addString(Arg, dwarf::DW_AT_name, DV->getName());
+        Cu.addType(Arg, Ty);
+      }
+    }
+  }
+}
+
 void DwarfDebug::finalizeModuleInfo() {
   const TargetLoweringObjectFile &TLOF = Asm->getObjFileLowering();
 
   finishSubprogramDefinitions();
 
+  addChangedSubprograms();
+
   finishEntityDefinitions();
 
   bool HasEmittedSplitCU = false;
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
index 89813dcf0fdab..417ffb19633c3 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
@@ -565,6 +565,8 @@ class DwarfDebug : public DebugHandlerBase {
 
   void finishSubprogramDefinitions();
 
+  void addChangedSubprograms();
+
   /// Finish off debug information after all functions have been
   /// processed.
   void finalizeModuleInfo();
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 587f0ece0859b..fa937a9a317be 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -344,6 +344,7 @@
 #include "llvm/Transforms/Utils/CanonicalizeAliases.h"
 #include "llvm/Transforms/Utils/CanonicalizeFreezeInLoops.h"
 #include "llvm/Transforms/Utils/CountVisits.h"
+#include "llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h"
 #include "llvm/Transforms/Utils/DXILUpgrade.h"
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/Transforms/Utils/DeclareRuntimeLibcalls.h"
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 98821bb1408a7..123041ea8cad8 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -133,6 +133,7 @@
 #include "llvm/Transforms/Utils/AssumeBundleBuilder.h"
 #include "llvm/Transforms/Utils/CanonicalizeAliases.h"
 #include "llvm/Transforms/Utils/CountVisits.h"
+#include "llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h"
 #include "llvm/Transforms/Utils/EntryExitInstrumenter.h"
 #include "llvm/Transforms/Utils/ExtraPassManager.h"
 #include "llvm/Transforms/Utils/InjectTLIMappings.h"
@@ -1625,9 +1626,12 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
   if (PTO.CallGraphProfile && !LTOPreLink)
     MPM.addPass(CGProfilePass(isLTOPostLink(LTOPhase)));
 
-  // RelLookupTableConverterPass runs later in LTO post-link pipeline.
-  if (!LTOPreLink)
+  // RelLookupTableConverterPass and EmitChangedFuncDebugInfoPass run later in
+  // LTO post-link pipeline.
+  if (!LTOPreLink) {
     MPM.addPass(RelLookupTableConverterPass());
+    MPM.addPass(EmitChangedFuncDebugInfoPass());
+  }
 
   return MPM;
 }
@@ -2355,4 +2359,4 @@ AAManager PassBuilder::buildDefaultAAPipeline() {
 bool PassBuilder::isInstrumentedPGOUse() const {
   return (PGOOpt && PGOOpt->Action == PGOOptions::IRUse) ||
          !UseCtxProfile.empty();
-}
\ No newline at end of file
+}
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index 299aaa801439b..78ee4ca6f96a1 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -73,6 +73,7 @@ MODULE_PASS("debugify", NewPMDebugifyPass())
 MODULE_PASS("declare-runtime-libcalls", DeclareRuntimeLibcallsPass())
 MODULE_PASS("dfsan", DataFlowSanitizerPass())
 MODULE_PASS("dot-callgraph", CallGraphDOTPrinterPass())
+MODULE_PASS("dwarf-emit-late", EmitChangedFuncDebugInfoPass())
 MODULE_PASS("dxil-upgrade", DXILUpgradePass())
 MODULE_PASS("elim-avail-extern", EliminateAvailableExternallyPass())
 MODULE_PASS("extract-blocks", BlockExtractorPass({}, false))
diff --git a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
index 262c902d40d2d..609e4f8e4d23a 100644
--- a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
+++ b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
@@ -50,6 +50,7 @@
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/CFG.h"
 #include "llvm/IR/Constants.h"
+#include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Dominators.h"
@@ -432,6 +433,14 @@ doPromotion(Function *F, FunctionAnalysisManager &FAM,
     PromoteMemToReg(Allocas, DT, &AC);
   }
 
+  // DW_CC_nocall to DISubroutineType to inform debugger that it may not be safe
+  // to call this function.
+  DISubprogram *SP = NF->getSubprogram();
+  if (SP) {
+    auto Temp = SP->getType()->cloneWithCC(llvm::dwarf::DW_CC_nocall);
+    SP->replaceType(MDNode::replaceWithPermanent(std::move(Temp)));
+  }
+
   return NF;
 }
 
diff --git a/llvm/lib/Transforms/Utils/CMakeLists.txt b/llvm/lib/Transforms/Utils/CMakeLists.txt
index e411d68570096..0b36693ce7975 100644
--- a/llvm/lib/Transforms/Utils/CMakeLists.txt
+++ b/llvm/lib/Transforms/Utils/CMakeLists.txt
@@ -22,6 +22,7 @@ add_llvm_component_library(LLVMTransformUtils
   Debugify.cpp
   DeclareRuntimeLibcalls.cpp
   DemoteRegToStack.cpp
+  EmitChangedFuncDebugInfo.cpp
   DXILUpgrade.cpp
   EntryExitInstrumenter.cpp
   EscapeEnumerator.cpp
diff --git a/llvm/lib/Transforms/Utils/EmitChangedFuncDebugInfo.cpp b/llvm/lib/Transforms/Utils/EmitChangedFuncDebugInfo.cpp
new file mode 100644
index 0000000000000..82acae3f0efeb
--- /dev/null
+++ b/llvm/lib/Transforms/Utils/EmitChangedFuncDebugInfo.cpp
@@ -0,0 +1,337 @@
+//==- EmitChangedFuncDebugInfoPass - Emit Additional Debug Info -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements emitting debug info for functions with changed
+// signatures or new functions.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h"
+#include "llvm/IR/DIBuilder.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Module.h"
+
+using namespace llvm;
+
+static bool getArg(BasicBlock &FirstBB, unsigned Idx, DIBuilder &DIB,
+                   DIFile *NewFile, Function *F, DISubprogram *OldSP,
+                   SmallVector<Metadata *, 5> &TypeList,
+                   SmallVector<Metadata *, 5> &ArgList) {
+  for (Instruction &I : FirstBB) {
+    for (const DbgRecord &DR : I.getDbgRecordRange()) {
+      auto *DVR = dyn_cast<DbgVariableRecord>(&DR);
+      if (!DVR)
+        continue;
+      // All of DbgVariableRecord::LocationType::{Value,Assign,Declare}
+      // are covered.
+      Metadata *Loc = DVR->getRawLocation();
+      auto *ValueMDN = dyn_cast<ValueAsMetadata>(Loc);
+      if (!ValueMDN)
+        continue;
+
+      // A poison value may correspond to an unused argument.
+      if (isa<PoisonValue>(ValueMDN->getValue())) {
+        Type *Ty = ValueMDN->getType();
+        auto *Var = cast<DILocalVariable>(DVR->getRawVariable());
+        if (!Var || Var->getArg() != (Idx + 1))
+          continue;
+
+        // Check for cases like below due to ArgumentPromotion
+        //   define internal ... i32 @add42_byref(i32 %p.0.val) ... {
+        //     #dbg_value(ptr poison, !17, !DIExpression(), !18)
+        //     ...
+        //   }
+        // TODO: one pointer expands to more than one argument is not
+        // supported yet. For example,
+        //   define internal ... i32 @add42_byref(i32 %p.0.val, i32 %p.4.val)
+        //   ...
+        if (Ty->isPointerTy() && F->getArg(Idx)->getType()->isIntegerTy()) {
+          // For such cases, a new argument is created.
+          auto *IntTy = cast<IntegerType>(F->getArg(Idx)->getType());
+          unsigned IntBitWidth = IntTy->getBitWidth();
+
+          DIType *IntDIType =
+              DIB.createBasicType("int" + std::to_string(IntBitWidth),
+                                  IntBitWidth, dwarf::DW_ATE_signed);
+          Var = DIB.createParameterVariable(OldSP, F->getArg(Idx)->getName(),
+                                            Idx + 1, NewFile, OldSP->getLine(),
+                                            IntDIType);
+        }
+
+        TypeList.push_back(Var->getType());
+        ArgList.push_back(Var);
+        return true;
+      }
+
+      // Handle the following pattern:
+      //   ... @vgacon_do_font_op(..., i32 noundef, i1 noundef zeroext %ch512)
+      //   ... {
+      //     ...
+      //       #dbg_value(i32 %set, !8568, !DIExpression(), !8589)
+      //     %storedv = zext i1 %ch512 to i8
+      //       #dbg_value(i8 %storedv, !8569, !DIExpression(), !8589)
+      //     ...
+      //   }
+      if (ValueMDN->getValue() != F->getArg(Idx)) {
+        Instruction *PrevI = I.getPrevNode();
+        if (!PrevI)
+          continue;
+        if (ValueMDN->getValue() != PrevI)
+          continue;
+        auto *ZExt = dyn_cast<ZExtInst>(PrevI);
+        if (!ZExt)
+          continue;
+        if (ZExt->getOperand(0) != F->getArg(Idx))
+          continue;
+      }
+
+      auto *Var = cast<DILocalVariable>(DVR->getRawVariable());
+
+      // Even if we get dbg_*(...) for arguments, we still need to ensure
+      // compatible types between IR func argument types and debugInfo argument
+      // types.
+      Type *Ty = ValueMDN->getType();
+      DIType *DITy = Var->getType();
+      while (auto *DTy = dyn_cast<DIDerivedType>(DITy)) {
+        if (DTy->getTag() == dwarf::DW_TAG_pointer_type) {
+          DITy = DTy;
+          break;
+        }
+        DITy = DTy->getBaseType();
+      }
+
+      if (Ty->isIntegerTy()) {
+        if (auto *DTy = dyn_cast<DICompositeType>(DITy)) {
+          if (!Ty->isIntegerTy(DTy->getSizeInBits())) {
+            // TODO: A struct param breaks into two actual arguments like
+            //    static int count(struct user_arg_ptr argv, int max)
+            // and the actual func signature:
+            //    i32 @count(i8 range(i8 0, 2) %argv.coerce0, ptr %argv.coerce1)
+            //    {
+            //      #dbg_value(i8 %argv.coerce0, !14759,
+            //      !DIExpression(DW_OP_LLVM_fragment, 0, 8), !14768)
+            //      #dbg_value(ptr %argv.coerce1, !14759,
+            //      !DIExpression(DW_OP_LLVM_fragment, 64, 64), !14768)
+            //      ...
+            //    }
+            return false;
+          }
+        }
+      } else if (Ty->isPointerTy()) {
+        // TODO: A struct turned into a pointer to struct.
+        //   @rhashtable_lookup_fast(ptr noundef %key,
+        //      ptr noundef readonly byval(%struct.rhashtable_params)
+        //        align 8 captures(none) %params) {
+        //      ...
+        //      %MyAlloca = alloca [160 x i8], align 32
+        //      %0 = ptrtoint ptr %MyAlloca to i64
+        //      %1 = add i64 %0, 32
+        //      %2 = inttoptr i64 %1 to ptr
+        //      ...
+        //      call void @llvm.memcpy.p0.p0.i64(ptr align 8 %2, ptr align 8
+        //      %params, i64 40, i1 false)
+        //        #dbg_value(ptr @offdevs, !15308, !DIExpression(), !15312)
+        //        #dbg_value(ptr %key, !15309, !DIExpression(), !15312)
+        //        #dbg_declare(ptr %MyAlloca, !15310,
+        //        !DIExpression(DW_OP_plus_uconst, 32), !15313)
+        //      tail call void @__rcu_read_lock() #14, !dbg !15314
+        //   }
+        if (dyn_cast<DICompositeType>(DITy))
+          return false;
+
+        auto *DTy = dyn_cast<DIDerivedType>(DITy);
+        if (!DTy)
+          continue;
+        if (DTy->getTag() != dwarf::DW_TAG_pointer_type)
+          continue;
+      }
+
+      TypeList.push_back(Var->getType());
+      if (Var->getArg() != (Idx + 1) ||
+          Var->getName() != F->getArg(Idx)->getName()) {
+        Var = DIB.createParameterVariable(OldSP, F->getArg(Idx)->getName(),
+                                          Idx + 1, OldSP->getUnit()->getFile(),
+                                          OldSP->getLine(), Var->getType());
+      }
+      ArgList.push_back(Var);
+      return true;
+    }
+  }
+
+  return false;
+}
+
+static bool getTypeArgList(DIBuilder &DIB, DIFile *NewFile, Function *F,
+                           FunctionType *FTy, DISubprogram *OldSP,
+                           SmallVector<Metadata *, 5> &TypeList,
+                           SmallVector<Metadata *, 5> &ArgList) {
+  Type *RetTy = FTy->getReturnType();
+  if (RetTy->isVoidTy()) {
+    // Void return type may be due to optimization.
+    TypeList.push_back(nullptr);
+  } else {
+    // Optimization does not change return type from one
+    // non-void type to another non-void type.
+    DITypeRefArray TyArray = OldSP->getType()->getTypeArray();
+    TypeList.push_back(TyArray[0]);
+  }
+
+  unsigned NumArgs = FTy->getNumParams();
+  BasicBlock &FirstBB = F->getEntryBlock();
+  for (unsigned i = 0; i < NumArgs; ++i) {
+    if (!getArg(FirstBB, i, DIB, NewFile, F, OldSP, TypeList, ArgList))
+      return false;
+  }
+
+  return true;
+}
+
+static void generateDebugInfo(Module &M, Function *F) {
+  // For this CU, we want to generate the following three dwarf units:
+  // DW_TAG_compile_unit
+  //   ...
+  //   // New functions with suffix
+  //   DW_TAG_inlined_subroutine
+  //     DW_AT_name      ("foo.1")
+  //     DW_AT_type      (0x0000000000000091 "int")
+  //     DW_AT_artificial (true)
+  //     DW_AT_specification (original DW_TAG_subprogram)
+  //
+  //     DW_TAG_formal_parameter
+  //       DW_AT_name    ("b")
+  //       DW_AT_type    (0x0000000000000091 "int")
+  //
+  //     DW_TAG_formal_parameter
+  //       DW_AT_name    ("c")
+  //       DW_AT_type    (0x0000000000000095 "long")
+  //   ...
+  //   // Functions with changed signatures
+  //   DW_TAG_inlined_subroutine
+  //     DW_AT_name      ("bar")
+  //     DW_AT_type      (0x0000000000000091 "int")
+  //     DW_AT_artificial (true)
+  //     DW_AT_specification (original DW_TAG_subprogram)
+  //
+  //     DW_TAG_formal_parameter
+  //       DW_AT_name    ("c")
+  //       DW_AT_type    (0x0000000000000095 "unsigned int")
+  //   ...
+  //   // Functions whose changed signatures could not be obtained yet.
+  //   // The presence of DW_CC_nocall indicates such cases.
+  //   DW_TAG_inlined_subroutine
+  //     DW_AT_name      ("bar" or "bar.1")
+  //     DW_AT_calling_convention        (DW_CC_nocall)
+  //     DW_AT_artificial (true)
+  //     DW_AT_specification (original DW_TAG_subprogram)
+  //   ...
+
+  // A new CompileUnit is created with file name "<artificial>"
+  // to host newly-created DISubprogram's.
+  DICompileUnit *NewCU = nullptr;
+  NamedMDNode *CUs = M.getNamedMetadata("llvm.dbg.cu");
+  // Check whether the expected CU already there or not.
+  for (MDNode *Node : CUs->operands()) {
+    auto *CU = cast<DICompileUnit>(Node);
+    if (CU->getFile()->getFilename() == "<artificial>") {
+      NewCU = CU;
+      break;
+    }
+  }
+
+  DISubprogram *OldSP = F->getSubprogram();
+  DIBuilder DIB(M, /*AllowUnresolved=*/false, NewCU);
+  DIFile *NewFile;
+
+  if (NewCU) {
+    NewFile = NewCU->getFile();
+  } else {
+    DICompileUnit *OldCU = OldSP->getUnit();
+    DIFile *OldFile = OldCU->getFile();
+    NewFile = DIB.createFile("<artificial>", OldFile->getDirectory());
+    NewCU = DIB.createCompileUnit(
+        OldCU->getSourceLanguage(), NewFile, OldCU->getProducer(),
+        OldCU->isOptimized(), OldCU->getFlags(), OldCU->getRuntimeVersion());
+  }
+
+  SmallVector<Metadata *, 5> TypeList;
+  SmallVector<Metadata *, 5> ArgList;
+
+  FunctionType *FTy = F->getFunctionType();
+  bool Success = getTypeArgList(DIB, NewFile, F, FTy, OldSP, TypeList, ArgList);
+  if (!Success) {
+    TypeList.clear();
+    TypeList.push_back(nullptr);
+    ArgList.clear();
+  }
+
+  DITypeRefArray DITypeArray = DIB.getOrCreateTypeArray(TypeList);
+  auto *SubroutineType = DIB.createSubroutineType(DITypeArray);
+  DINodeArray ArgArray = DIB.getOrCreateArray(ArgList);
+
+  Function *DummyF =
+      Function::Create(FTy, GlobalValue::AvailableExternallyLinkage,
+                       F->getName() + ".newsig", &M);
+
+  DISubprogram *NewSP =
+      DIB.createFunction(OldSP,                   // Scope
+                         F->getName(),            // Name
+                         OldSP->getLinkageName(), // Linkage name
+                         NewFile,                 // File
+                         OldSP->getLine(),        // Line
+                         SubroutineType,          // DISubroutineType
+                         OldSP->getScopeLine(),   // ScopeLine
+      ...
[truncated]

github-actions · 2025-09-07T16:50:47Z

✅ With the latest revision this PR passed the C/C++ code formatter.

yonghong-song · 2025-09-07T16:51:28Z

cc @jemarch

arsenm · 2025-09-08T03:02:52Z

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp

@@ -1266,11 +1266,83 @@ void DwarfDebug::finishSubprogramDefinitions() {
  }
 }

+void DwarfDebug::addChangedSubprograms() {
+  // Generate additional dwarf for functions with signature changed.
+  NamedMDNode *NMD = MMI->getModule()->getNamedMetadata("llvm.dbg.cu");


Module::debug_compile_units

Thanks! Will do as you suggested.

In llvm pull request [1], the dwarf is changed to accommodate functions
whose signatures are different from source level although they have the
same name. Other non-source functions are also included in dwarf.

The following is an example:

The source:
====
  $ cat test.c
  struct t { int a; };
  char *tar(struct t *a, struct t *d);
  __attribute__((noinline)) static char * foo(struct t *a, struct t *d, int b)
  {
    return tar(a, d);
  }
  char *bar(struct t *a, struct t *d)
  {
    return foo(a, d, 1);
  }
====

Part of generated dwarf:
====
0x0000005c:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000000010)
                DW_AT_high_pc   (0x0000000000000015)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_linkage_name      ("foo")
                DW_AT_name      ("foo")
                DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c")
                DW_AT_decl_line (3)
                DW_AT_type      (0x000000bb "char *")
                DW_AT_artificial        (true)
                DW_AT_external  (true)

0x0000006c:     DW_TAG_formal_parameter
                  DW_AT_location        (DW_OP_reg5 RDI)
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x00000075:     DW_TAG_formal_parameter
                  DW_AT_location        (DW_OP_reg4 RSI)
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x0000007e:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin (0x0000009a "foo")
                  DW_AT_low_pc  (0x0000000000000010)
                  DW_AT_high_pc (0x0000000000000015)
                  DW_AT_call_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_call_line       (0)

0x0000008a:       DW_TAG_formal_parameter
                    DW_AT_location      (DW_OP_reg5 RDI)
                    DW_AT_abstract_origin       (0x000000a2 "a")

0x00000091:       DW_TAG_formal_parameter
                    DW_AT_location      (DW_OP_reg4 RSI)
                    DW_AT_abstract_origin       (0x000000aa "d")

0x00000098:       NULL

0x00000099:     NULL

0x0000009a:   DW_TAG_subprogram
                DW_AT_name      ("foo")
                DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c")
                DW_AT_decl_line (3)
                DW_AT_prototyped        (true)
                DW_AT_type      (0x000000bb "char *")
                DW_AT_inline    (DW_INL_inlined)

0x000000a2:     DW_TAG_formal_parameter
                  DW_AT_name    ("a")
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x000000aa:     DW_TAG_formal_parameter
                  DW_AT_name    ("d")
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x000000b2:     DW_TAG_formal_parameter
                  DW_AT_name    ("b")
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000d8 "int")

0x000000ba:     NULL
====

In the above, there are two subprograms with the same name 'foo'.
Currently btf encoder will consider both functions as ELF functions.
Since the two subprograms have different signatures, the function will
be ignored.

But actually, one of the 'foo' functions is marked as DW_INL_inlined,
which means we should not treat it as an ELF function. The patch fixes
this issue by filtering out subprograms for which function__inlined()
is true.

This will fix the issue for [1]. But it should work fine without [1] too.

  [1] llvm/llvm-project#157349

Signed-off-by: Yonghong Song <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/dwarves/[email protected]/
Signed-off-by: Alan Maguire <[email protected]>
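The filtering described in this commit message can be sketched roughly as follows. This is only a simplified model, not pahole code: the `Subprogram` struct, the `elf_function_names` helper, and the use of a parameter count as a stand-in for the full signature are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified view of one DW_TAG_subprogram entry.
struct Subprogram {
  std::string name;
  int num_params;  // stand-in for the full signature
  bool inlined;    // true when DW_AT_inline is DW_INL_inlined
};

// Returns the names still usable as ELF functions. Inlined (abstract)
// subprograms are skipped first; a name whose remaining entries disagree
// on the parameter count is then dropped as inconsistent.
std::vector<std::string> elf_function_names(const std::vector<Subprogram> &sps) {
  std::map<std::string, int> first_seen;   // name -> first parameter count
  std::map<std::string, bool> consistent;  // name -> all entries agree
  for (const Subprogram &sp : sps) {
    if (sp.inlined)
      continue;  // the filter this commit describes
    auto it = first_seen.find(sp.name);
    if (it == first_seen.end()) {
      first_seen[sp.name] = sp.num_params;
      consistent[sp.name] = true;
    } else if (it->second != sp.num_params) {
      consistent[sp.name] = false;
    }
  }
  std::vector<std::string> out;
  for (const auto &kv : consistent)
    if (kv.second)
      out.push_back(kv.first);
  return out;
}
```

With the two 'foo' entries from the dump above (two parameters in the concrete subprogram, three in the inlined one), skipping the inlined entry leaves a single consistent 'foo', so it is no longer ignored.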

ArgumentPromotion pass may change function signatures. If this happens and debuginfo is enabled, let us add DW_CC_nocall to debuginfo so it is clear that the function signature has changed. DeadArgumentElimination ([1]) has similar implementation. Also fix an ArgumentPromotion test due to adding DW_CC_nocall to debuginfo. [1] llvm@340b0ca

During development of emitting dwarf data for signature-changed or new functions, I found two test failures llvm/test/Transforms/SampleProfile/ctxsplit.ll llvm/test/Transforms/SampleProfile/flattened.ll due to incorrect DISubroutineType(s). This patch fixed the issue with proper types.

Add a new pass EmitChangedFuncDebugInfo which will add dwarf for
additional functions including functions with signature change
and new functions.

The previous approach in [1] tries to add debuginfo for those
optimization passes which cause signature changes. Based on
discussion in [1], it is preferred to have a specific pass to
add debuginfo and later on dwarf generation can include those
new debuginfo.

The following is an example:
Source:

  $ cat test.c
  struct t { int a; };
  char *tar(struct t *a, struct t *d);
  __attribute__((noinline)) static char * foo(struct t *a, struct t *d, int b)
  {
    return tar(a, d);
  }
  char *bar(struct t *a, struct t *d)
  {
    return foo(a, d, 1);
  }

Compiled and dump dwarf with:

  clang -O2 -c -g test.c
  llvm-dwarfdump test.o

0x0000005c:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000000010)
                DW_AT_high_pc   (0x0000000000000015)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_linkage_name      ("foo")
                DW_AT_name      ("foo")
                DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c")
                DW_AT_decl_line (3)
                DW_AT_type      (0x000000bb "char *")
                DW_AT_artificial        (true)
                DW_AT_external  (true)

0x0000006c:     DW_TAG_formal_parameter
                  DW_AT_location        (DW_OP_reg5 RDI)
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x00000075:     DW_TAG_formal_parameter
                  DW_AT_location        (DW_OP_reg4 RSI)
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x0000007e:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin (0x0000009a "foo")
                  DW_AT_low_pc  (0x0000000000000010)
                  DW_AT_high_pc (0x0000000000000015)
                  DW_AT_call_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_call_line       (0)

0x0000008a:       DW_TAG_formal_parameter
                    DW_AT_location      (DW_OP_reg5 RDI)
                    DW_AT_abstract_origin       (0x000000a2 "a")

0x00000091:       DW_TAG_formal_parameter
                    DW_AT_location      (DW_OP_reg4 RSI)
                    DW_AT_abstract_origin       (0x000000aa "d")

0x00000098:       NULL

0x00000099:     NULL

0x0000009a:   DW_TAG_subprogram
                DW_AT_name      ("foo")
                DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c")
                DW_AT_decl_line (3)
                DW_AT_prototyped        (true)
                DW_AT_type      (0x000000bb "char *")
                DW_AT_inline    (DW_INL_inlined)

0x000000a2:     DW_TAG_formal_parameter
                  DW_AT_name    ("a")
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x000000aa:     DW_TAG_formal_parameter
                  DW_AT_name    ("d")
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000c4 "t *")

0x000000b2:     DW_TAG_formal_parameter
                  DW_AT_name    ("b")
                  DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000d8 "int")

0x000000ba:     NULL

There are some restrictions in the current implementation:
  - Only C language is supported
  - BPF target is excluded as one of main goals for this pull request
    is to generate proper vmlinux BTF for arch's like x86_64/arm64 etc.
  - Function must not be an intrinsic, be declaration only, have a
    return value larger than the arch register size, or take variable
    arguments.

I have tested this patch set by building latest bpf-next linux kernel.

For no-lto case:
  65341 original number of functions
  1085 new functions with this patch
For thin-lto case:
  65595 original number of functions
  2492 new functions with this patch

  [1] llvm#127855
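The restrictions listed above can be read as one eligibility predicate. The sketch below is a hypothetical model only: `FuncSummary`, `eligible`, and the `register_bits` parameter are names invented for illustration and do not appear in the patch, which works on LLVM IR functions rather than on a struct like this.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of the properties the restrictions mention.
struct FuncSummary {
  std::string language;  // e.g. "C"
  std::string target;    // e.g. "x86_64", "arm64", "bpf"
  bool is_intrinsic;
  bool is_declaration_only;
  bool is_vararg;
  unsigned return_bits;  // size of the return value in bits
};

// True when extra debug info may be emitted for the function under the
// listed restrictions. `register_bits` is the assumed width of an
// architecture register, e.g. 64 on x86_64.
bool eligible(const FuncSummary &f, unsigned register_bits) {
  if (f.language != "C")
    return false;  // only C is supported
  if (f.target == "bpf")
    return false;  // BPF target is excluded
  if (f.is_intrinsic || f.is_declaration_only || f.is_vararg)
    return false;
  return f.return_bits <= register_bits;
}
```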

yonghong-song · 2025-10-12T18:55:55Z

@OCHyams I just uploaded another version which fixes some issues mentioned by @eddyz87. The potential debugger issue has not been investigated yet, and I need your guidance on how to test it.

As discussed earlier in this thread, the new implementation is very similar to code like foo.clone (int a, int b) { foo (1, { a, b }); } where foo.clone represents the true signature and the original function is inlined with a proper debuginfo representation. Of course, the name 'foo.clone' is just an illustration; the actual name comes from the Function itself.

There are probably already some examples of debuggers tracking inlined subroutines; it would be great if you could point me to one.
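To make the analogy concrete, here is a minimal sketch of that shape in plain C++. The names `inc_source` and `inc_clone` are hypothetical, and the constant passed for the dropped argument is only a placeholder; the compiler does not literally generate such a wrapper, the debug info merely describes the emitted function as if it had.

```cpp
#include <cassert>

// Source-level function: two parameters, the second one is never used,
// so an optimization such as dead-argument elimination can drop it.
static inline int inc_source(int x, int /*y*/) {
  return x + 1;
}

// Hypothetical stand-in for the emitted function with the real (changed)
// signature. In the debug info it plays the role of "foo.clone": a
// subprogram that contains the source-level function as an inlined
// subroutine.
int inc_clone(int x) {
  // The dropped argument has no value at run time; 0 is a placeholder.
  return inc_source(x, /*y=*/0);
}
```

Calling `inc_clone(41)` runs the body of `inc_source` with only `x` live, which is the situation the debug info is trying to describe.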

OCHyams · 2025-10-15T10:36:39Z

Thanks for the additional info and link to the RFC thread. Sorry that this has been a bit of a protracted process; I appreciate my slow replies are not helping with that!

I think it's important to discuss the impact on debuggers a bit more, so I've commented over on the RFC.

yonghong-song · 2025-10-16T19:43:27Z

@OCHyams I tried a simple C program without and with the signature change. It seems to work. It would be great if you could give some comments.

Non lto example:

The source code:

$ cat ex2.c
#include <stdio.h>

__attribute__((noinline))
static int inc(int x, int y) {
    volatile int t = x + 1;
    printf("x=%d\n", t);
    return t;
}

static int do_work(int n) {
    int a = inc(n, n);
    printf("a=%d\n", a);
    return a;
}

int main() {
    return do_work(41);
}

The compiler command line:

clang -O2 -g ex2.c -o ex2

The lldb session:

$ lldb ./ex2
(lldb) target create "./ex2"    
Current executable set to '/home/yhs/tests/inline_lldb/c-nolto/ex-sig-change/ex2' (x86_64).
(lldb) break set -n main
Breakpoint 1: where = ex2`main + 1 [inlined] do_work + 1 at ex2.c:11:13, address = 0x0000000000001141
(lldb) run                                                          
Process 3832448 launched: '/home/yhs/tests/inline_lldb/c-nolto/ex-sig-change/ex2' (x86_64)
Process 3832448 stopped                                             
* thread #1, name = 'ex2', stop reason = breakpoint 1.1                                                                                  
    frame #0: 0x0000555555555141 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
   8    }                                                                                                                                
   9                                                                
   10   static int do_work(int n) {                                                                                                      
-> 11       int a = inc(n, n);                                                                                                           
   12       printf("a=%d\n", a);                                    
   13       return a;
   14   }              
(lldb) bt                                                           
* thread #1, name = 'ex2', stop reason = breakpoint 1.1                                                                                  
  * frame #0: 0x0000555555555141 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
    frame #1: 0x0000555555555140 ex2`main at ex2.c:17:12
    frame #2: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #3: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
(lldb) s             
Process 3832448 stopped
* thread #1, name = 'ex2', stop reason = step in
    frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18 [inlined]                                                  
   2   
   3    __attribute__((noinline))
   4    static int inc(int x, int y) {
-> 5        volatile int t = x + 1;
   6        printf("x=%d\n", t);
   7        return t;
   8    }
(lldb) bt
* thread #1, name = 'ex2', stop reason = step in
  * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18 [inlined]
    frame #1: 0x0000555555555161 ex2`inc at ex2.c:0
    frame #2: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
    frame #3: 0x0000555555555140 ex2`main at ex2.c:17:12
    frame #4: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #5: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #6: 0x0000555555555075 ex2`_start + 37
(lldb) n
Process 3832448 stopped
* thread #1, name = 'ex2', stop reason = step over
    frame #0: 0x0000555555555169 ex2`inc(x=41, y=<unavailable>) at ex2.c:6:22 [inlined]
   3    __attribute__((noinline))
   4    static int inc(int x, int y) {
   5        volatile int t = x + 1;
-> 6        printf("x=%d\n", t);
   7        return t;
   8    }
   9   
(lldb) bt
* thread #1, name = 'ex2', stop reason = step over
  * frame #0: 0x0000555555555169 ex2`inc(x=41, y=<unavailable>) at ex2.c:6:22 [inlined]
    frame #1: 0x0000555555555161 ex2`inc at ex2.c:0
    frame #2: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
    frame #3: 0x0000555555555140 ex2`main at ex2.c:17:12
    frame #4: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #5: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #6: 0x0000555555555075 ex2`_start + 37
(lldb) n
x=42
Process 3832448 stopped
* thread #1, name = 'ex2', stop reason = step over
    frame #0: 0x000055555555517b ex2`inc(x=41, y=<unavailable>) at ex2.c:7:12 [inlined]
   4    static int inc(int x, int y) {
   5        volatile int t = x + 1;
   6        printf("x=%d\n", t);
-> 7        return t;
   8    }
   9   
   10   static int do_work(int n) {
(lldb) bt
* thread #1, name = 'ex2', stop reason = step over
  * frame #0: 0x000055555555517b ex2`inc(x=41, y=<unavailable>) at ex2.c:7:12 [inlined]
    frame #1: 0x0000555555555161 ex2`inc at ex2.c:0
    frame #2: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
    frame #3: 0x0000555555555140 ex2`main at ex2.c:17:12
    frame #4: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #5: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #6: 0x0000555555555075 ex2`_start + 37
(lldb) n
Process 3832448 stopped
* thread #1, name = 'ex2', stop reason = step over
    frame #0: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
   8    }
   9   
   10   static int do_work(int n) {
-> 11       int a = inc(n, n);
   12       printf("a=%d\n", a);
   13       return a;
   14   }
(lldb) bt
* thread #1, name = 'ex2', stop reason = step over
  * frame #0: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
    frame #1: 0x0000555555555140 ex2`main at ex2.c:17:12
    frame #2: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #3: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #4: 0x0000555555555075 ex2`_start + 37

The key frame is

  * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18 [inlined]
    frame #1: 0x0000555555555161 ex2`inc at ex2.c:0
    frame #2: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
    frame #3: 0x0000555555555140 ex2`main at ex2.c:17:12
    frame #4: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #5: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #6: 0x0000555555555075 ex2`_start + 37

Without this pull request, at the beginning of inc(), the frame looks like below:

  * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18
    frame #1: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
    frame #2: 0x0000555555555140 ex2`main at ex2.c:17:12
    frame #3: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #4: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #5: 0x0000555555555075 ex2`_start + 37

So the difference is

  * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18 [inlined]
    frame #1: 0x0000555555555161 ex2`inc at ex2.c:0

vs.

  * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18

I think it does not impact the debugger result itself, although the frame dump is slightly
different.

Thin-LTO example:

The source code:

$ cat main.c
#include <stdio.h>

int api(int);

int main(void) {
    int v = api(5);
    printf("v=%d\n", v);
    return v == 13 ? 0 : 1;
}
$ cat lib.c
#include <stdio.h>
#include <stdint.h>

__attribute__((noinline))
static int api_internal(int n, int y) {
    volatile int t = (n * 2) + 3;
    printf("t = %d\n", t);
    return t;
}

int api(int n) {
  return api_internal(n, n);
}

The compilation command line

clang -O2 -g -flto=thin -Wl,-plugin-opt=jobs=1 -fuse-ld=lld main.c lib.c -o app_lto

The lldb session:

$ lldb ./app_lto
(lldb) target create "./app_lto"
Current executable set to '/home/yhs/tests/inline_lldb/c-lto/ex/app_lto' (x86_64).
(lldb) break set -n main
Breakpoint 1: where = app_lto`main + 1 [inlined] api + 1 at lib.c:12:10, address = 0x0000000000001741
(lldb) run
Process 3947027 launched: '/home/yhs/tests/inline_lldb/c-lto/ex/app_lto' (x86_64)
Process 3947027 stopped
* thread #1, name = 'app_lto', stop reason = breakpoint 1.1
    frame #0: 0x0000555555555741 app_lto`api(n=5) at lib.c:12:10 [inlined]
   9    }
   10  
   11   int api(int n) {
-> 12     return api_internal(n, n);
   13   }
(lldb) bt
* thread #1, name = 'app_lto', stop reason = breakpoint 1.1
  * frame #0: 0x0000555555555741 app_lto`api(n=5) at lib.c:12:10 [inlined]
    frame #1: 0x0000555555555740 app_lto`main at main.c:6:13
    frame #2: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #3: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #4: 0x0000555555555675 app_lto`_start + 37
(lldb) s
Process 3947027 stopped
* thread #1, name = 'app_lto', stop reason = step in
    frame #0: 0x0000555555555781 app_lto`api_internal(n=5, y=<unavailable>) at lib.c:6:30 [inlined]
   3   
   4    __attribute__((noinline))
   5    static int api_internal(int n, int y) {
-> 6        volatile int t = (n * 2) + 3;
   7        printf("t = %d\n", t);
   8        return t;
   9    }
(lldb) bt
* thread #1, name = 'app_lto', stop reason = step in
  * frame #0: 0x0000555555555781 app_lto`api_internal(n=5, y=<unavailable>) at lib.c:6:30 [inlined]
    frame #1: 0x0000555555555781 app_lto`api_internal.llvm.17070606651919954320(n=5) at lib.c:0
    frame #2: 0x000055555555574b app_lto`api(n=5) at lib.c:12:10 [inlined]
    frame #3: 0x0000555555555740 app_lto`main at main.c:6:13
    frame #4: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #5: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #6: 0x0000555555555675 app_lto`_start + 37
(lldb) n
Process 3947027 stopped
* thread #1, name = 'app_lto', stop reason = step over
    frame #0: 0x000055555555578c app_lto`api_internal(n=5, y=<unavailable>) at lib.c:7:24 [inlined]
   4    __attribute__((noinline))
   5    static int api_internal(int n, int y) {
   6        volatile int t = (n * 2) + 3;
-> 7        printf("t = %d\n", t);
   8        return t;
   9    }
   10  
(lldb) bt
* thread #1, name = 'app_lto', stop reason = step over
  * frame #0: 0x000055555555578c app_lto`api_internal(n=5, y=<unavailable>) at lib.c:7:24 [inlined]
    frame #1: 0x0000555555555781 app_lto`api_internal.llvm.17070606651919954320(n=5) at lib.c:0
    frame #2: 0x000055555555574b app_lto`api(n=5) at lib.c:12:10 [inlined]
    frame #3: 0x0000555555555740 app_lto`main at main.c:6:13
    frame #4: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #5: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #6: 0x0000555555555675 app_lto`_start + 37
(lldb) n
t = 13
Process 3947027 stopped
* thread #1, name = 'app_lto', stop reason = step over
    frame #0: 0x000055555555579e app_lto`api_internal(n=<unavailable>, y=<unavailable>) at lib.c:8:12 [inlined]
   5    static int api_internal(int n, int y) {
   6        volatile int t = (n * 2) + 3;
   7        printf("t = %d\n", t);
-> 8        return t;
   9    }
   10  
   11   int api(int n) {
(lldb) bt
* thread #1, name = 'app_lto', stop reason = step over
  * frame #0: 0x000055555555579e app_lto`api_internal(n=<unavailable>, y=<unavailable>) at lib.c:8:12 [inlined]
    frame #1: 0x0000555555555781 app_lto`api_internal.llvm.17070606651919954320(n=<unavailable>) at lib.c:0
    frame #2: 0x000055555555574b app_lto`api(n=5) at lib.c:12:10 [inlined]
    frame #3: 0x0000555555555740 app_lto`main at main.c:6:13
    frame #4: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #5: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #6: 0x0000555555555675 app_lto`_start + 37
(lldb) n
Process 3947027 stopped
* thread #1, name = 'app_lto', stop reason = step over
    frame #0: 0x000055555555574b app_lto`api(n=5) at lib.c:12:10 [inlined]
   9    }
   10  
   11   int api(int n) {
-> 12     return api_internal(n, n);
   13   }

Without this pull request, at the beginning of api_internal(), the frame looks like below:

  * frame #0: 0x0000555555555781 app_lto`api_internal(n=5, y=<unavailable>) at lib.c:6:30
    frame #1: 0x000055555555574b app_lto`api(n=5) at lib.c:12:10 [inlined]
    frame #2: 0x0000555555555740 app_lto`main at main.c:6:13
    frame #3: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #4: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #5: 0x0000555555555675 app_lto`_start + 37

So the difference is

  * frame #0: 0x0000555555555781 app_lto`api_internal(n=5, y=<unavailable>) at lib.c:6:30 [inlined]
    frame #1: 0x0000555555555781 app_lto`api_internal.llvm.17070606651919954320(n=5) at lib.c:0

vs.

  * frame #0: 0x0000555555555781 app_lto`api_internal(n=5, y=<unavailable>) at lib.c:6:30

I think it does not impact the debugger result itself, although the frame dump is slightly
different.

yonghong-song · 2025-10-17T15:52:51Z

@dzhidzhoev There has been some discussion here and on Discourse (https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609/9).

The following are the concerns from @OCHyams

=====

Thanks for giving it a try and sharing your LLDB session. In your example you’ve said:

So the difference is

  * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18 [inlined]
    frame #1: 0x0000555555555161 ex2`inc at ex2.c:0

vs.

  * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18

I think It does not impact debugger result itself although frame dump is slightly
different.

That’s the kind of difference I was concerned about. You’ve said that it doesn’t impact the debugger, but IMO that’s a significant impact as the backtrace is incorrect and the top frame is described as “inlined”. Both could cause much confusion to someone debugging their code.

======

What do you think? The backtrace is different from the original one.
Any suggestion on the current inlinedAt approach? Or change back to my previous suggestion to have a simple implementation with a special CompileUnit and then do more in DwarfDebug (yonghong-song@be0cdea)?

yonghong-song · 2025-10-20T03:58:49Z

@dzhidzhoev @OCHyams I tried another approach. Basically, I will add a 'declaration' for the func whose signature changed. The 'declaration' will have the source level signature and the signature-changed func will have real signature.

See my github branch:
main...yonghong-song:llvm-project:dwarf-signature-change/dwarf-signature-change-extra-unit-v6-1

esp. this commit:
a8ff54b

This approach still shows a difference in lldb. For example, for my previous example without lto:

So comparing this backtrace:
    * frame #0: 0x0000555555555161 ex2`inc(x=41) at ex2.c:5:18
      frame #1: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
      frame #2: 0x0000555555555140 ex2`main at ex2.c:17:12
      frame #3: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
      frame #4: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
      frame #5: 0x0000555555555075 ex2`_start + 37
vs. the same backtrace without this pull request:
    * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18
      frame #1: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
      frame #2: 0x0000555555555140 ex2`main at ex2.c:17:12
      frame #3: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
      frame #4: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
      frame #5: 0x0000555555555075 ex2`_start + 37

You can see that func inc() has two parameters without this pull request and one parameter with this pull request.

The following is the difference with lto enabled.

So comparing this backtrace:
  * frame #0: 0x0000555555555781 app_lto`api_internal.llvm.36994223926077692(n=5) at lib.c:6:30
    frame #1: 0x000055555555574b app_lto`api(n=5) at lib.c:12:10 [inlined]
    frame #2: 0x0000555555555740 app_lto`main at main.c:6:13
    frame #3: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #4: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #5: 0x0000555555555675 app_lto`_start + 37
vs. the same backtrace without this pull request:
  * frame #0: 0x0000555555555781 app_lto`api_internal(n=5, y=<unavailable>) at lib.c:6:30
    frame #1: 0x000055555555574b app_lto`api(n=5) at lib.c:12:10 [inlined]
    frame #2: 0x0000555555555740 app_lto`main at main.c:6:13
    frame #3: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
    frame #4: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #5: 0x0000555555555675 app_lto`_start + 37

You can see with this pull request, we have

  * frame #0: 0x0000555555555781 app_lto`api_internal.llvm.36994223926077692(n=5) at lib.c:6:30

The func name changed to the real (symbol) func name, and there is one argument instead of the two at source level.

Although the simple non-lto and lto examples above work fine, when I tried to build the linux kernel (more complicated code), I got a crash like below:

 #9 0x00007fdf91429873 abort (/lib64/libc.so.6+0x29873)
#10 0x00007fdf9142979b _nl_load_domain.cold (/lib64/libc.so.6+0x2979b)
#11 0x00007fdf914388c6 (/lib64/libc.so.6+0x388c6)
#12 0x00000000096d9731 llvm::DwarfFile::addScopeVariable(llvm::LexicalScope*, llvm::DbgVariable*) /home/yhs/work/yhs/llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfFile.cpp:112:3
#13 0x00000000096388d3 llvm::DwarfDebug::createConcreteEntity(llvm::DwarfCompileUnit&, llvm::LexicalScope&, llvm::DINode const*, llvm::DILocation const*, llvm::MCSymbol const*) /home/yhs/work/yhs/llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:1897:3
#14 0x000000000963919c llvm::DwarfDebug::collectEntityInfo(llvm::DwarfCompileUnit&, llvm::DISubprogram const*, llvm::DenseSet<std::pair<llvm::DINode const*, llvm::DILocation const*>, llvm::DenseMapInfo<std::pair<llvm::DINode const*, llvm::DILocation const*>, void>>&) /home/yhs/work/yhs/llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2024:5
#15 0x000000000963bc31 llvm::DwarfDebug::endFunctionImpl(llvm::MachineFunction const*) /home/yhs/work/yhs/llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2757:3
#16 0x00000000096150ff llvm::DebugHandlerBase::endFunction(llvm::MachineFunction const*) /home/yhs/work/yhs/llvm-project/llvm/lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp:0:5
#17 0x00000000095e0a21 llvm::AsmPrinter::emitFunctionBody() /home/yhs/work/yhs/llvm-project/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:2269:22

I have not figured out what the problem is, and I would like to hear expert opinions on my suggestions first.

dzhidzhoev · 2025-10-20T11:37:01Z

What do you think? The backtrace is different from the original one.
Any suggestion on the current inlinedAt approach? Or change back to my previous suggestion to have a simple implementation with a special CompileUnit and then do more in DwarfDebug (yonghong-song@be0cdea)?

I don't mind giving up "inlinedAt approach". It was just an idea, not a detailed plan.

dzhidzhoev · 2025-10-20T11:39:52Z

Tbh, I'm not sure why upstream insists on this being a separate pass. From ad-hoc patterns in e.g. getArg(), it appears that the original idea of handling signature changes in each individual optimization is a simpler and more reliable way to handle this.

Still, it would probably be nice to have a separate pass (or something in DwarfDebug) to find and somehow mark DISubprograms that do not correspond to IR function signatures anymore. But updating debug info at the place where Functions themselves are modified sounds more reasonable, indeed. A pass that modifies Function signature should know how to indicate this change in debug info metadata better.
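As a rough illustration of what such a "find and mark" step would compare, the sketch below pairs the parameter types recorded in the debug info with the parameter types of the IR function and reports the functions where they differ. Everything here is hypothetical (`FuncRecord`, `find_stale_signatures`, types modelled as plain strings); a real pass would walk DISubprogram and Function objects instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record pairing what the debug info says about a function
// with what the IR function actually takes. Types are plain strings.
struct FuncRecord {
  std::string name;
  std::vector<std::string> debug_params;  // from the subprogram's type
  std::vector<std::string> ir_params;     // from the IR function signature
};

// Returns the names of functions whose debug-info parameter list no longer
// matches the IR parameter list, i.e. candidates for being marked.
std::vector<std::string> find_stale_signatures(const std::vector<FuncRecord> &funcs) {
  std::vector<std::string> stale;
  for (const FuncRecord &f : funcs)
    if (f.debug_params != f.ir_params)
      stale.push_back(f.name);
  return stale;
}
```

Comparing whole type lists rather than only counts matters because a pass such as ArgumentPromotion can change a parameter's type without changing the number of parameters.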

dzhidzhoev · 2025-10-20T11:49:25Z

@dzhidzhoev @OCHyams I tried another approach. Basically, I will add a 'declaration' for the func whose signature changed. The 'declaration' will have the source level signature and the signature-changed func will have real signature.

See my github branch: main...yonghong-song:llvm-project:dwarf-signature-change/dwarf-signature-change-extra-unit-v6-1

esp. this commit: a8ff54b

This approach still has the difference with lldb. For example, for my previous example without lto:

So comparing this backtrace:
    * frame #0: 0x0000555555555161 ex2`inc(x=41) at ex2.c:5:18
      frame #1: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
      frame #2: 0x0000555555555140 ex2`main at ex2.c:17:12
      frame #3: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
      frame #4: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
      frame #5: 0x0000555555555075 ex2`_start + 37
vs. the same backtrace without this pull request:
    * frame #0: 0x0000555555555161 ex2`inc(x=41, y=<unavailable>) at ex2.c:5:18
      frame #1: 0x0000555555555146 ex2`do_work(n=41) at ex2.c:11:13 [inlined]
      frame #2: 0x0000555555555140 ex2`main at ex2.c:17:12
      frame #3: 0x00007ffff7c2a610 libc.so.6`__libc_start_call_main + 128
      frame #4: 0x00007ffff7c2a6c0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
      frame #5: 0x0000555555555075 ex2`_start + 37

Let's consider the case if we have two functions in the source code:

int foo(int x, int y) { ... }
int foo(int x) { ... }

And let's assume that the second argument of foo(int, int) is optimized out (so that the second argument is removed from the definition DISubprogram's arguments, by EmitChangedFuncDebugInfo).

If we try to execute print foo(5) in the debugger, will the debugger correctly resolve the overloaded function call?

OCHyams · 2025-10-20T12:32:35Z

FWIW I replied over in the discourse thread. I think it's best to discuss these high-level design questions over there. (but if everyone disagrees of course I'll happily discuss here instead)

yonghong-song · 2025-10-20T16:46:54Z

FWIW I replied over in the discourse thread. I think it's best to discuss these high-level design questions over there. (but if everyone disagrees of course I'll happily discuss here instead)

For information for everybody else, the discourse link: https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609/11

yonghong-song · 2025-10-20T16:52:53Z

Let's consider the case if we have two functions in the source code:
int foo(int x, int y) { ... }
int foo(int x) { ... }
And let's assume that the second argument of foo(int, int) is optimized out (so that the second argument is removed from the definition DISubprogram's arguments, by EmitChangedFuncDebugInfo).

If we try to execute print foo(5) in the debugger, will the debugger correctly resolve the overloaded function call?

I tried the following example:

$ cat main.cc
int test(int x, int y);

int main(void) {
  return test(5, 10);
}

$ cat same_func_name.cc
#include <stdio.h>

__attribute__((noinline)) static int foo(int x, int y) {
  volatile int t = x + 1;
  printf("x + 1 = %d\n", t);
  return t;
}
__attribute__((noinline)) static int foo(int x) {
  volatile int t = x * x + 1;
  printf("x * x + 1 = %d\n", t);
  return t;
}
int test(int x, int y) {
  return foo(x, y) + foo(x);
}
$ cat run.sh
clang++ -O2 -g same_func_name.cc main.cc -o same_func_name -mllvm -print-after-all
$

The related dwarf:

0x0000009e:   DW_TAG_subprogram                                                                                                       
                DW_AT_name      ("foo")                                                                                               
                DW_AT_decl_file ("/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name.cc")                                    
                DW_AT_decl_line (3)                                                                                                   
                DW_AT_type      (0x000000b1 "int")                                                                                    
                DW_AT_declaration       (true)                                                                                        
                DW_AT_external  (true)                                                                                                
                                                                                                                                      
0x000000a6:     DW_TAG_formal_parameter                                                                                               
                  DW_AT_type    (0x000000b1 "int")                                                                                    
                                                                                                                                      
0x000000ab:     DW_TAG_formal_parameter                                                                                               
                  DW_AT_type    (0x000000b1 "int")                                                                                    
                                                                                                                                      
0x000000b0:     NULL                                                                                                                  
                                                                                                                                      
0x000000b1:   DW_TAG_base_type                                                                                                        
                DW_AT_name      ("int")                                                                                               
                DW_AT_encoding  (DW_ATE_signed)                                                                                       
                DW_AT_byte_size (0x04)

0x000000b5:   DW_TAG_subprogram                                                                                                       
                DW_AT_low_pc    (0x0000000000001160)                                                                                  
                DW_AT_high_pc   (0x000000000000117f)                                                                                  
                DW_AT_frame_base        (DW_OP_reg7 RSP)                                                                              
                DW_AT_call_all_calls    (true)                                                                                        
                DW_AT_linkage_name      ("_ZL3fooii")                                                                                 
                DW_AT_specification     (0x0000009e "foo")                                                                            
                                                                                                                                      
0x000000c2:     DW_TAG_formal_parameter                                                                                               
                  DW_AT_location        (indexed (0x2) loclist = 0x0000003d: 
                     [0x0000000000001160, 0x0000000000001163): DW_OP_reg5 RDI
                     [0x0000000000001163, 0x000000000000117f): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
                  DW_AT_name    ("x")
                  DW_AT_decl_file       ("/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name.cc")
                  DW_AT_decl_line       (3)
                  DW_AT_type    (0x000000b1 "int")

0x000000cb:     DW_TAG_variable
                  DW_AT_location        (DW_OP_fbreg +4)
                  DW_AT_name    ("t")
                  DW_AT_decl_file       ("/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name.cc")
                  DW_AT_decl_line       (4)
                  DW_AT_type    (0x00000122 "volatile int")

0x000000d6:     DW_TAG_call_site
                  DW_AT_call_origin     (0x00000108 "printf")
                  DW_AT_call_return_pc  (0x0000000000001179)

0x000000dc:     NULL

You can see that the function foo (linkage name _ZL3fooii) now has only one argument.

The lldb session:

$ cat run.sh
/clang++ -O2 -g same_func_name.cc main.cc -o same_func_name
$ lldb ./same_func_name
(lldb) target create "./same_func_name"
Current executable set to '/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name' (x86_64).
(lldb) break set -n main
Breakpoint 1: where = same_func_name`main at main.cc:4:10, address = 0x00000000000011b0
(lldb) run
Process 2302230 launched: '/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name' (x86_64)
Process 2302230 stopped
* thread #1, name = 'same_func_name', stop reason = breakpoint 1.1
    frame #0: 0x00005555555551b0 same_func_name`main at main.cc:4:10
   1    int test(int x, int y);
   2   
   3    int main(void) {
-> 4      return test(5, 10);
   5    }
   6   
(lldb) print foo(2)
x * x + 1 = 5
(int) 5
(lldb) print foo(2, 3)
             ˄
             ╰─ error: no matching function for call to 'foo'
note: Ran expression as 'C++14'.
note: note: candidate function not viable: requires single argument 'x', but 2 arguments were provided
(lldb)

So yes, foo(2, 3) does not work, probably because of the signature change.

Without the EmitChangedFuncDebugInfo pass, foo(2, 3) runs properly in lldb.

$ lldb ./same_func_name
(lldb) target create "./same_func_name"
Current executable set to '/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name' (x86_64).
(lldb) break set -n main
Breakpoint 1: where = same_func_name`main at main.cc:4:10, address = 0x00000000000011b0
(lldb) run
Process 2325392 launched: '/home/yhs/tests/inline_lldb/c-same-func-name/same_func_name' (x86_64)
Process 2325392 stopped
* thread #1, name = 'same_func_name', stop reason = breakpoint 1.1
    frame #0: 0x00005555555551b0 same_func_name`main at main.cc:4:10
   1    int test(int x, int y);
   2   
   3    int main(void) {
-> 4      return test(5, 10);
   5    }
   6   
(lldb) print foo(2)
x * x + 1 = 5
(int) 5
(lldb) print foo(2, 3)
x + 1 = 3
(int) 3
(lldb)

So in this case, lldb probably needs to follow DW_AT_specification (0x0000009e "foo") to find the original signature in order to make 'print foo(2, 3)' work.
This is probably necessary for all signature-changed functions if the debugger is to handle something like 'print foo(2, 3)'.
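
A minimal sketch of the lookup suggested above, assuming a debugger models each function as a record with a parameter count and an optional DW_AT_specification link. The struct subprogram_die type and the match_arity and resolve_call helpers are hypothetical and are not lldb APIs; they only illustrate why following the specification could let foo(2, 3) resolve again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a DW_TAG_subprogram DIE. */
struct subprogram_die {
    const char *name;
    int num_formals;                            /* DW_TAG_formal_parameter children */
    const struct subprogram_die *specification; /* DW_AT_specification target, or NULL */
};

/* Parameter count used for matching. When follow_specification is nonzero
 * and the DIE has a specification, the declaration's source-level count is
 * used instead of the definition's (possibly reduced) count. */
static int match_arity(const struct subprogram_die *die, int follow_specification)
{
    assert(die != NULL);
    if (follow_specification && die->specification != NULL)
        return die->specification->num_formals;
    return die->num_formals;
}

/* Return the first candidate whose arity equals argc, or NULL if none does. */
static const struct subprogram_die *
resolve_call(const struct subprogram_die *const *candidates, size_t count,
             int argc, int follow_specification)
{
    for (size_t i = 0; i < count; ++i)
        if (match_arity(candidates[i], follow_specification) == argc)
            return candidates[i];
    return NULL;
}
```

With a definition that has one formal parameter and a declaration that has two, as in the dump above, a two-argument call finds no candidate unless the specification is followed.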

dafaust · 2025-10-22T19:15:27Z

Late to the discussion so it's quite possible I missed something, but have you looked at what GCC generates for cases like this? Why not use a similar format?

To use the example snippet from the OP, compiled with gcc -c -O2 -g, we get the DWARF below for "foo".
First we have the original DW_TAG_subprogram reflecting the source, followed by a second DW_TAG_subprogram at the same level for the concrete instance referring via AT_abstract_origin to the original. The formals of the second reflect optimizations: note how "b" has no location information and instead is const_value: 1. The call site info for the call to tar is also now a child of the concrete subprogram.

 <1><e3>: Abbrev Number: 16 (DW_TAG_subprogram)
    <e4>   DW_AT_name        : foo
    <e8>   DW_AT_decl_file   : 1
    <e9>   DW_AT_decl_line   : 3
    <ea>   DW_AT_decl_column : 41
    <eb>   DW_AT_prototyped  : 1
    <eb>   DW_AT_type        : <0x6c>
    <ef>   DW_AT_inline      : 1	(inlined)
    <f0>   DW_AT_sibling     : <0x10d>
 <2><f4>: Abbrev Number: 2 (DW_TAG_formal_parameter)
    <f5>   DW_AT_name        : a
    <f7>   DW_AT_decl_file   : 1
    <f7>   DW_AT_decl_line   : 3
    <f7>   DW_AT_decl_column : 55
    <f8>   DW_AT_type        : <0x78>
 <2><fc>: Abbrev Number: 2 (DW_TAG_formal_parameter)
    <fd>   DW_AT_name        : d
    <ff>   DW_AT_decl_file   : 1
    <ff>   DW_AT_decl_line   : 3
    <ff>   DW_AT_decl_column : 68
    <100>   DW_AT_type        : <0x78>
 <2><104>: Abbrev Number: 2 (DW_TAG_formal_parameter)
    <105>   DW_AT_name        : b
    <107>   DW_AT_decl_file   : 1
    <107>   DW_AT_decl_line   : 3
    <107>   DW_AT_decl_column : 75
    <108>   DW_AT_type        : <0x4a>
 <2><10c>: Abbrev Number: 0
 <1><10d>: Abbrev Number: 17 (DW_TAG_subprogram)
    <10e>   DW_AT_abstract_origin: <0xe3>
    <112>   DW_AT_low_pc      : 0
    <11a>   DW_AT_high_pc     : 0x5
    <122>   DW_AT_frame_base  : 1 byte block: 9c 	(DW_OP_call_frame_cfa)
    <124>   DW_AT_call_all_calls: 1
 <2><124>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <125>   DW_AT_abstract_origin: <0xf4>
    <129>   DW_AT_location    : 0x34 (location list)
    <12d>   DW_AT_GNU_locviews: 0x30
 <2><131>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <132>   DW_AT_abstract_origin: <0xfc>
    <136>   DW_AT_location    : 0x46 (location list)
    <13a>   DW_AT_GNU_locviews: 0x42
 <2><13e>: Abbrev Number: 18 (DW_TAG_formal_parameter)
    <13f>   DW_AT_abstract_origin: <0x104>
    <143>   DW_AT_const_value : 1
 <2><144>: Abbrev Number: 6 (DW_TAG_call_site)
    <145>   DW_AT_call_return_pc: 0x5
    <14d>   DW_AT_call_tail_call: 1
    <14d>   DW_AT_call_origin : <0x51>
 <3><151>: Abbrev Number: 1 (DW_TAG_call_site_parameter)
    <152>   DW_AT_location    : 1 byte block: 55 	(DW_OP_reg5 (rdi))
    <154>   DW_AT_call_value  : 3 byte block: a3 1 55 	(DW_OP_entry_value: (DW_OP_reg5 (rdi)))
 <3><158>: Abbrev Number: 1 (DW_TAG_call_site_parameter)
    <159>   DW_AT_location    : 1 byte block: 54 	(DW_OP_reg4 (rsi))
    <15b>   DW_AT_call_value  : 3 byte block: a3 1 54 	(DW_OP_entry_value: (DW_OP_reg4 (rsi)))
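
One way a consumer might read the concrete instance above, as a rough sketch: treat formals that still carry DW_AT_location as actually passed, and formals that only carry DW_AT_const_value as folded away. The struct concrete_formal type and count_passed_formals helper are hypothetical, and the "location means passed" rule is an assumption for illustration, not something GCC or the DWARF spec guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one DW_TAG_formal_parameter
 * in a concrete out-of-line instance. */
struct concrete_formal {
    const char *name;     /* name taken from the abstract origin */
    int has_location;     /* DW_AT_location present */
    int has_const_value;  /* DW_AT_const_value present */
};

/* Count the formals that still have a run-time location. Under the stated
 * assumption, these are the parameters the optimized function receives. */
static size_t count_passed_formals(const struct concrete_formal *formals, size_t n)
{
    assert(formals != NULL || n == 0);
    size_t passed = 0;
    for (size_t i = 0; i < n; ++i)
        if (formals[i].has_location)
            ++passed;
    return passed;
}
```

For the dump above, "a" and "d" have locations and "b" only has a constant value, so this sketch would report two passed parameters.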

yonghong-song · 2025-10-22T20:16:09Z

Welcome aboard, @dafaust. Thanks for the comments. A single missing function parameter is easy to represent. However, there are many other cases, e.g.:

// Original func ==> Optimized func:
int foo(struct big a, int b) ==> void foo(struct big *a_ptr, int b);
// struct bytes_16 { long t1; long t2; }
int foo(struct bytes_16, int b) ==> int foo(long bytes_16_t1, int b);
int foo(struct bytes_16, int b) ==> int foo(long bytes_16_t1, long bytes_16_t2, int b);
// struct big { int a; int b; }, inside foo(), v->b is used.
int foo(struct big *v) ==> int b = v->b; int foo(int b)
// sometimes maybe totally unrelated arguments
int foo(struct big *v) ==> int foo(struct another *v)
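
To make the `struct big *v` case (where only v->b is used) concrete, here is a minimal hand-written sketch. The names foo_source, foo_promoted and call_foo are hypothetical, and the function body is made up for illustration; in a real build the compiler keeps the name foo and performs the rewrite internally.

```c
#include <assert.h>
#include <stddef.h>

struct big { int a; int b; };

/* Source-level form: takes a pointer, but only v->b is ever read. */
static int foo_source(const struct big *v)
{
    assert(v != NULL);
    return v->b * 2;
}

/* Hand-written equivalent of the optimized form: the callee now takes
 * the field value directly, so its real signature is int(int). */
static int foo_promoted(int b)
{
    return b * 2;
}

/* After the transformation, the load of v->b moves into the caller. */
static int call_foo(const struct big *v)
{
    assert(v != NULL);
    return foo_promoted(v->b);
}
```

Both forms compute the same result, but a tracer that assumes the first argument register holds a `struct big *` would misread the optimized callee, where that register holds an int.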

#127855 is the original discussion. This pull request tried a different approach, but it causes issues for the lldb debugger, i.e., it introduces visible changes in debugger behavior. So most likely I will go back to the #127855 approach.

This (#127855 (comment)) is what we agreed on with @jemarch.

Later we arrived at a more elaborate format, as in #127855 (comment).

dafaust · 2025-10-22T21:20:23Z

Welcome aboard, @dafaust. Thanks for the comments. A single missing function parameter is easy to represent. However, there are many other cases, e.g.:

// Original func ==> Optimized func:
int foo(struct big a, int b) ==> void foo(struct big *a_ptr, int b);
// struct bytes_16 { long t1; long t2; }
int foo(struct bytes_16, int b) ==> int foo(long bytes_16_t1, int b);
int foo(struct bytes_16, int b) ==> int foo(long bytes_16_t1, long bytes_16_t2, int b);
// struct big { int a; int b; }, inside foo(), v->b is used.
int foo(struct big *v) ==> int b = v->b; int foo(int b)
// sometimes maybe totally unrelated arguments
int foo(struct big *v) ==> int foo(struct another *v)

I see. So the IPA-SRA transformations are the bigger problem, and it looks like GCC doesn't currently reflect them all in DWARF either.

I do wonder whether the GCC format above for constprop, which afaiu is based on the idea of "concrete out-of-line instance" discussed in DWARF5 spec 3.3.8.3, could adapt to represent all of these. For example, if a struct parameter is split by IPA-SRA to only pass the relevant field, it seems like the spec already accommodates concrete instances passing not only fewer, but also different parameters:

"3. A concrete instance tree may contain entries which do not correspond to
entries in the abstract instance tree to describe new entities that are specific to
a particular inlined expansion." (DWARF 5 spec 3.3.8.2)

so it might be possible to represent a struct param split by SRA by adding additional formal parameters to the concrete instance corresponding to the optimized version, and removing the AT_location of the relevant struct (or pointer) formal in the concrete instance to indicate it is not passed...

Anyway, I don't intend to derail. But maybe an alternate option to explore more if it's needed.

#127855 is the original discussion. This pull request tried a different approach, but it causes issues for the lldb debugger, i.e., it introduces visible changes in debugger behavior. So most likely I will go back to the #127855 approach.

Ahh, ok.

This (#127855 (comment)) is what we agreed on with @jemarch.

Later we arrived at a more elaborate format, as in #127855 (comment).

Thanks for the explanation and links!

yonghong-song · 2025-10-23T03:54:05Z

Thanks @dafaust. I do not fully understand what DWARF 5 (or 6) recommends for changed signatures. An example would help the most.

For the following:

"3. A concrete instance tree may contain entries which do not correspond to
entries in the abstract instance tree to describe new entities that are specific to
a particular inlined expansion." (DWARF 5 spec 3.3.8.2)

so it might be possible to represent a struct param split by SRA by adding additional formal parameters to the concrete instance corresponding to the optimized version, and removing the AT_location of the relevant struct (or pointer) formal in the concrete instance to indicate it is not passed...

If the intent is to add the changed signatures to the DISubprogram (with locations) directly, we have to design this very carefully. I tried this approach (directly replacing the old signature with the new one in the DISubprogram), and it caused various issues for lldb (see some of the examples above).

The llvm community recommendation is that the change should not affect any existing lldb functionality or common tools that deal with dwarf.

@callee

Add a new pass EmitChangedFuncDebugInfo which will add dwarf for additional functions whose signatures are changed during compiler transformations. The original intention is for bpf-based linux kernel tracing. The function signature is available in vmlinux BTF generated from pahole/dwarf. Such signature is generated from dwarf at the source level. But this is not ideal since some function may have signatures changed. If user still used the source level signature, users may not get correct results and may need some efforts to workaround the issue. So we want to encode the true signature (not different from the source one) in dwarf. With such additional information, dwarf users can get these signature changed functions. For example, pahole is able to process these signature changed functions and encode them into vmlinux BTF properly. History of multiple attempts ============================ Previously I have attempted a few tries ([1], [2] and [3]). Initially I tried to modify debuginfo in passes like ArgPromotion and DeadArgElim, but later on it is suggested to have a central place to handle new signatures ([1]). Later, I have another version of patch similar to this one, but the recommendation is to modify debuginfo to encode new signature within the same function, either through inlinedAt or new signature overwriting the old one. This seems working but it has some side effect on lldb, some lldb output (e.g. back trace) will be different from the previous one. The recommendation is to avoid any behavior change for lldb ([2] and [3]). So now, I came back to the solution discussed at the end of [1]. Basically a special dwarf entry will be generated to encode the new signature. The new signature will have a reference to the old source-level signature. So the tool can inspect dwarf to retrieve the related info. 
Examples and dwarf output ========================= In below, a few examples will show how changed signatures represented in dwarf: Example 1 --------- Source: $ cat test.c struct t { int a; }; char *tar(struct t *a, struct t *d); __attribute__((noinline)) static char * foo(struct t *a, int b, struct t *d) { return tar(a, d); } char *bar(struct t *a, struct t *d) { return foo(a, 1, d); } Compiled and dump dwarf with: $ clang -O2 -c -g test.c $ llvm-dwarfdump test.o 0x0000000c: DW_TAG_compile_unit ... 0x0000005c: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("foo") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_prototyped (true) DW_AT_calling_convention (DW_CC_nocall) DW_AT_type (0x000000b1 "char *") 0x0000006c: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_name ("a") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_type (0x000000ba "t *") 0x00000076: DW_TAG_formal_parameter DW_AT_name ("b") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_type (0x000000ce "int") 0x0000007e: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_name ("d") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_type (0x000000ba "t *") 0x00000088: DW_TAG_call_site ... 0x0000009d: NULL ... 0x000000d2: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x000000b1 "char *") DW_AT_artificial (true) DW_AT_specification (0x0000005c "foo") 0x000000dc: DW_TAG_formal_parameter DW_AT_name ("a") DW_AT_type (0x000000ba "t *") 0x000000e2: DW_TAG_formal_parameter DW_AT_name ("d") DW_AT_type (0x000000ba "t *") 0x000000e8: NULL In the above, the DISubprogram 'foo' has the original signature but since parameter 'b' does not have DW_AT_location, it is clear that parameter will not be used. 
The actual function signature is represented in DW_TAG_inlined_subroutine. For the above case, it looks like DW_TAG_inlined_subroutine is not necessary. Let us try a few other examples below. Example 2 --------- Source: $ cat test.c struct t { long a; long b;}; __attribute__((noinline)) static long foo(struct t arg) { return arg.b * 5; } long bar(struct t arg) { return foo(arg); } Compiled and dump dwarf with: $ clang -O2 -c -g test.c $ llvm-dwarfdump test.o ... 0x0000004e: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("foo") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test.c") DW_AT_decl_line (2) DW_AT_prototyped (true) DW_AT_calling_convention (DW_CC_nocall) DW_AT_type (0x0000006d "long") 0x0000005e: DW_TAG_formal_parameter DW_AT_location (DW_OP_piece 0x8, DW_OP_reg5 RDI, DW_OP_piece 0x8) DW_AT_name ("arg") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test.c") DW_AT_decl_line (2) DW_AT_type (0x00000099 "t") 0x0000006c: NULL ... 0x00000088: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x0000006d "long") DW_AT_artificial (true) DW_AT_specification (0x0000004e "foo") 0x00000092: DW_TAG_formal_parameter DW_AT_name ("arg") DW_AT_type (0x0000006d "long") 0x00000098: NULL In the above case for function foo(), the original argument is 'struct t', but the final actual argument is a 'long' type. DW_TAG_inlined_subroutine can clearly represent the signature type instead of doing DW_AT_location thing. There is a problem in the above then, it is not clear what formal parameter 'arg' corresponds to the original parameter. If necessary, the compiler could change 'arg' to e.g. 'arg_offset_8' to indicate it is 8 byte offset from the original struct. 
Example 3 --------- Source: $ cat test2.c struct t { long a; long b; long c;}; __attribute__((noinline)) long foo(struct t arg) { return arg.a * arg.c; } long bar(struct t arg) { return foo(arg); } Compiled and dump dwarf with: $ clang -O2 -c -g test2.c $ llvm-dwarfdump test2.o ... 0x0000003e: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("bar") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c") DW_AT_decl_line (5) DW_AT_prototyped (true) DW_AT_type (0x0000005f "long") DW_AT_external (true) 0x0000004d: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +8) DW_AT_name ("arg") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c") DW_AT_decl_line (5) DW_AT_type (0x00000079 "t") 0x00000058: DW_TAG_call_site DW_AT_call_origin (0x00000023 "foo") DW_AT_call_tail_call (true) DW_AT_call_pc (0x0000000000000010) 0x0000005e: NULL ... 0x00000063: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x0000005f "long") DW_AT_artificial (true) DW_AT_specification (0x00000023 "foo") 0x0000006d: DW_TAG_formal_parameter DW_AT_name ("arg") DW_AT_type (0x00000074 "t *") 0x00000073: NULL In the above example, from DW_TAG_subprogram, it is not clear what kind of type the parameter should be. But DW_TAG_inlined_subroutine can clearly show what the type should be. Again, the name can be changed e.g. 'arg_ptr' if desired. Example 4 --------- Source: $ cat test.c __attribute__((noinline)) static int callee(const int *p) { return *p + 42; } int caller(void) { int x = 100; return callee(&x); } Compiled and dump dwarf with: $ clang -O3 -c -g test2.c $ llvm-dwarfdump test2.o ... 
0x0000004a: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000014) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("callee") DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c") DW_AT_decl_line (1) DW_AT_prototyped (true) DW_AT_calling_convention (DW_CC_nocall) DW_AT_type (0x00000063 "int") 0x0000005a: DW_TAG_formal_parameter DW_AT_name ("p") DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c") DW_AT_decl_line (1) DW_AT_type (0x00000078 "const int *") 0x00000062: NULL ... 0x00000067: DW_TAG_inlined_subroutine DW_AT_name ("callee") DW_AT_type (0x00000063 "int") DW_AT_artificial (true) DW_AT_specification (0x0000004a "callee") 0x00000071: DW_TAG_formal_parameter DW_AT_name ("__0") DW_AT_type (0x00000063 "int") 0x00000077: NULL In the above, the function static int callee(const int *p) { return *p + 42; } is transformed to static int callee(int p) { return p + 42; } But the new signature is not reflected in DW_TAG_subprogram. The DW_TAG_inlined_subroutine can precisely capture the signature. Note that the parameter name is "__0" and "0" means the first argument. The reason is due to the following IR: define internal ... i32 @callee(i32 %0) unnamed_addr llvm#1 !dbg !23 { #dbg_value(ptr poison, !29, !DIExpression(), !30) %2 = add nsw i32 %0, 42, !dbg !31 ret i32 %2, !dbg !32 } ... !29 = !DILocalVariable(name: "p", arg: 1, scope: !23, file: !1, line: 1, type: !26) The reason is due to 'ptr poison' as 'ptr poison' mean the debug value should not be used any more. This is also the reason that the above DW_TAG_subprogram does not have location information. DW_TAG_inlined_subroutine can provide correct signature though. If we compile like below: clang -O3 -c -g test.c -fno-discard-value-names The function argument name will be preserved ... i32 @callee(i32 %p.0.val) ... 
nd in such cases, the DW_TAG_inlined_subroutine looks like below: 0x00000067: DW_TAG_inlined_subroutine DW_AT_name ("callee") DW_AT_type (0x00000063 "int") DW_AT_artificial (true) DW_AT_specification (0x0000004a "callee") 0x00000071: DW_TAG_formal_parameter DW_AT_name ("p__0__val") DW_AT_type (0x00000063 "int") 0x00000077: NULL Note that the original argument name replaces '.' with "__" so argument name has proper C standard. Based a run on linux kernel, the names like "__<arg_index>" roughly 2% of total signature changed functions, so we probably okay for now. Non-LTO vs. LTO --------------- For thin-lto mode, we often see kernel symbols like p9_req_cache.llvm.13472271643223911678 If this symbol has identical source level signature with p9_req_cache, then a special DW_TAG_inlined_subroutine will not be generated. But if a symbol with "<foo>.llvm.<hash>" has different signatures than the source level "<foo>", then a special DW_TAG_inlined_subroutine will be generated like below: 0x10f0793f: DW_TAG_inlined_subroutine DW_AT_name ("flow_offload_fill_route") DW_AT_linkage_name ("flow_offload_fill_route.llvm.14555965973926298225") DW_AT_artificial (true) DW_AT_specification (0x10ee9e54 "flow_offload_fill_route") 0x10f07949: DW_TAG_formal_parameter DW_AT_name ("flow") DW_AT_type (0x10ee837a "flow_offload *") 0x10f07951: DW_TAG_formal_parameter DW_AT_name ("route") DW_AT_type (0x10eea4ef "nf_flow_route *") 0x10f07959: DW_TAG_formal_parameter DW_AT_name ("dir") DW_AT_type (0x10ecef15 "int") 0x10f07961: NULL In the above, function "flow_offload_fill_route" has return type "int" at source level, but optimization eventually made the return type as "void". Note that it is possible one source symbol may have multiple linkage name's due to potentially (more than one) cloning in llvm. In such cases, multiple DW_TAG_inlined_subroutine instances might be possible. 
Some restrictions
=================

There are some restrictions in the current implementation:
 - Only C language is supported
 - BPF target is excluded, as one of the main goals of this pull request is to generate proper vmlinux BTF for archs like x86_64/arm64 etc.
 - The function must not be an intrinsic or declaration only, must not have a return value larger than the arch register size, and must not take variable arguments.
 - For arguments, only int/float/ptr types are supported.

Some statistics with linux kernel
=================================

I have tested this patch set by building the latest bpf-next linux kernel.

For no-lto case:
  65341 original number of functions
  1054 signature changed functions with this patch

For thin-lto case:
  65595 original number of functions
  1323 signature changed functions with this patch

Next step
=========

With this llvm change, we will be able to do some work in pahole and libbpf. For pahole, currently we will see the warning:

  die__process_unit: DW_TAG_inlined_subroutine (0x1d) @ <0xf2db986> not handled in a c11 CU!

Basically, these DW_TAG_inlined_subroutine entries are not inside a DISubprogram.

  [1] llvm#127855
  [2] llvm#157349
  [3] https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609


dafaust · 2025-10-28T20:54:18Z

Having refreshed my mind on all the various discussions on this topic, the format in #127855 and now in #165310 is good and we should proceed with it.

Thanks @dafaust. I cannot fully understand what the dwarf 5 (or 6) recommendation for changed signatures is. An example would be best.

To be clear - I don't think there is any such recommendation from dwarf spec. For the inter-procedural constant propagation transformations GCC currently generates some dwarf reflecting the constprop specialized versions of a changed function (e.g. "foo" -> "foo.constprop") like the example in my first comment above. I was just thinking about whether something similar to that could be adapted to reflect other transformations also.

But that is all extraneous :) Sorry for the noise.
We should rather move forward with #165310

@callee

Add a new pass EmitChangedFuncDebugInfo which will add dwarf for additional functions whose signatures are changed during compiler transformations.

The original intention is bpf-based linux kernel tracing. The function signature is available in vmlinux BTF, which pahole generates from dwarf. That signature reflects the source level. This is not ideal since some functions may have their signatures changed. If users still rely on the source-level signature, they may not get correct results and may need some effort to work around the issue. So we want to encode the true signature (which may differ from the source-level one) in dwarf.

With such additional information, dwarf users can identify these signature-changed functions. For example, pahole can process them and encode them into vmlinux BTF properly.

History of multiple attempts
============================

I have made a few previous attempts ([1], [2] and [3]).

Initially I tried to modify debuginfo in passes like ArgPromotion and DeadArgElim, but it was later suggested to have a central place to handle new signatures ([1]).

Later, I had another version of the patch similar to this one, but the recommendation was to modify debuginfo to encode the new signature within the same function, either through inlinedAt or by having the new signature overwrite the old one. This seemed to work, but it had a side effect on lldb: some lldb output (e.g. back traces) differed from before. The recommendation is to avoid any behavior change for lldb ([2] and [3]).

So now I have come back to the solution discussed at the end of [1]. Basically, a special dwarf entry is generated to encode the new signature. The new signature has a reference to the old source-level signature, so a tool can inspect dwarf to retrieve the related info.
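As a rough illustration of how a tool might follow that reference, the sketch below models DIEs with a hypothetical `struct die` and looks for the artificial DW_TAG_inlined_subroutine whose DW_AT_specification points at a given DW_TAG_subprogram. None of the names here come from libdw, pahole or LLVM; a real consumer would read these attributes through its own DWARF library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a DIE; only the fields needed here. */
enum die_tag { DIE_SUBPROGRAM, DIE_INLINED_SUBROUTINE };

struct die {
    uint64_t offset;         /* DIE offset in .debug_info */
    enum die_tag tag;
    bool artificial;         /* DW_AT_artificial present and true */
    uint64_t specification;  /* DW_AT_specification target, 0 if absent */
    const char *name;        /* DW_AT_name */
};

/* Return the first artificial DW_TAG_inlined_subroutine that refers back to
 * the subprogram at `subprogram_offset`, or NULL if the subprogram has no
 * changed-signature entry. */
static const struct die *find_changed_signature(const struct die *dies,
                                                size_t count,
                                                uint64_t subprogram_offset)
{
    for (size_t i = 0; i < count; i++) {
        const struct die *d = &dies[i];
        if (d->tag == DIE_INLINED_SUBROUTINE && d->artificial &&
            d->specification == subprogram_offset)
            return d;
    }
    return NULL;
}
```

For the layout of Example 1 below (subprogram 'foo' at 0x5c, artificial entry at 0xd2 with DW_AT_specification 0x5c), a lookup for 0x5c would return the entry at 0xd2. Since cloning can produce several such entries for one source function, a real tool would likely collect all matches rather than stop at the first.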
Examples and dwarf output
=========================

Below, a few examples show how changed signatures are represented in dwarf:

Example 1
---------

Source:

  $ cat test.c
  struct t { int a; };
  char *tar(struct t *a, struct t *d);
  __attribute__((noinline)) static char * foo(struct t *a, int b, struct t *d)
  {
    return tar(a, d);
  }
  char *bar(struct t *a, struct t *d)
  {
    return foo(a, 1, d);
  }

Compiled and dump dwarf with:

  $ clang -O2 -c -g test.c
  $ llvm-dwarfdump test.o

0x0000000c: DW_TAG_compile_unit
              ...
0x0000005c:   DW_TAG_subprogram
                DW_AT_low_pc  (0x0000000000000010)
                DW_AT_high_pc  (0x0000000000000015)
                DW_AT_frame_base  (DW_OP_reg7 RSP)
                DW_AT_call_all_calls  (true)
                DW_AT_name  ("foo")
                DW_AT_decl_file  ("/home/yhs/tests/sig-change/deadarg/test.c")
                DW_AT_decl_line  (3)
                DW_AT_prototyped  (true)
                DW_AT_calling_convention  (DW_CC_nocall)
                DW_AT_type  (0x000000b1 "char *")

0x0000006c:     DW_TAG_formal_parameter
                  DW_AT_location  (DW_OP_reg5 RDI)
                  DW_AT_name  ("a")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line  (3)
                  DW_AT_type  (0x000000ba "t *")

0x00000076:     DW_TAG_formal_parameter
                  DW_AT_name  ("b")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line  (3)
                  DW_AT_type  (0x000000ce "int")

0x0000007e:     DW_TAG_formal_parameter
                  DW_AT_location  (DW_OP_reg4 RSI)
                  DW_AT_name  ("d")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line  (3)
                  DW_AT_type  (0x000000ba "t *")

0x00000088:     DW_TAG_call_site
                  ...

0x0000009d:     NULL
...
0x000000d2:   DW_TAG_inlined_subroutine
                DW_AT_name  ("foo")
                DW_AT_type  (0x000000b1 "char *")
                DW_AT_artificial  (true)
                DW_AT_specification  (0x0000005c "foo")

0x000000dc:     DW_TAG_formal_parameter
                  DW_AT_name  ("a")
                  DW_AT_type  (0x000000ba "t *")

0x000000e2:     DW_TAG_formal_parameter
                  DW_AT_name  ("d")
                  DW_AT_type  (0x000000ba "t *")

0x000000e8:     NULL

In the above, the DISubprogram 'foo' has the original signature, but since parameter 'b' does not have DW_AT_location, it is clear that the parameter will not be used.
The actual function signature is represented in DW_TAG_inlined_subroutine. For the above case, it looks like DW_TAG_inlined_subroutine is not necessary. Let us try a few other examples below.

Example 2
---------

Source:

  $ cat test.c
  struct t { long a; long b;};
  __attribute__((noinline)) static long foo(struct t arg) {
    return arg.b * 5;
  }
  long bar(struct t arg) {
    return foo(arg);
  }

Compiled and dump dwarf with:

  $ clang -O2 -c -g test.c
  $ llvm-dwarfdump test.o

...
0x0000004e:   DW_TAG_subprogram
                DW_AT_low_pc  (0x0000000000000010)
                DW_AT_high_pc  (0x0000000000000015)
                DW_AT_frame_base  (DW_OP_reg7 RSP)
                DW_AT_call_all_calls  (true)
                DW_AT_name  ("foo")
                DW_AT_decl_file  ("/home/yhs/tests/sig-change/struct/test.c")
                DW_AT_decl_line  (2)
                DW_AT_prototyped  (true)
                DW_AT_calling_convention  (DW_CC_nocall)
                DW_AT_type  (0x0000006d "long")

0x0000005e:     DW_TAG_formal_parameter
                  DW_AT_location  (DW_OP_piece 0x8, DW_OP_reg5 RDI, DW_OP_piece 0x8)
                  DW_AT_name  ("arg")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/struct/test.c")
                  DW_AT_decl_line  (2)
                  DW_AT_type  (0x00000099 "t")

0x0000006c:     NULL
...
0x00000088:   DW_TAG_inlined_subroutine
                DW_AT_name  ("foo")
                DW_AT_type  (0x0000006d "long")
                DW_AT_artificial  (true)
                DW_AT_specification  (0x0000004e "foo")

0x00000092:     DW_TAG_formal_parameter
                  DW_AT_name  ("arg")
                  DW_AT_type  (0x0000006d "long")

0x00000098:     NULL

In the above case for function foo(), the original argument is 'struct t', but the final actual argument is of type 'long'. DW_TAG_inlined_subroutine states the signature type directly, so a tool does not have to infer it from DW_AT_location. One problem remains: it is not clear which part of the original parameter the formal parameter 'arg' corresponds to. If necessary, the compiler could rename 'arg' to e.g. 'arg_offset_8' to indicate that it is at an 8-byte offset in the original struct.
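To make the 'arg_offset_8' suggestion concrete, here is a small C sketch. The naming helper `offset_param_name` is hypothetical (the patch does not implement this renaming); it only shows that the suffix would be the byte offset of the surviving field, which for 'b' in Example 2 is `offsetof(struct t, b)`, i.e. `sizeof(long)`, or 8 on x86_64.

```c
#include <stddef.h>
#include <stdio.h>

/* Same layout as in Example 2. */
struct t { long a; long b; };

/* Hypothetical naming scheme: "<param>_offset_<byte offset>".
 * Returns 0 on success, -1 if `out` is too small. */
static int offset_param_name(const char *param, size_t offset,
                             char *out, size_t out_size)
{
    int len = snprintf(out, out_size, "%s_offset_%zu", param, offset);
    return (len < 0 || (size_t)len >= out_size) ? -1 : 0;
}
```

Calling `offset_param_name("arg", offsetof(struct t, b), buf, sizeof buf)` on an LP64 target would yield "arg_offset_8", which lets a reader of the dwarf map the 'long' parameter back to field 'b' of the original struct.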
Example 3
---------

Source:

  $ cat test2.c
  struct t { long a; long b; long c;};
  __attribute__((noinline)) long foo(struct t arg) {
    return arg.a * arg.c;
  }
  long bar(struct t arg) {
    return foo(arg);
  }

Compiled and dump dwarf with:

  $ clang -O2 -c -g test2.c
  $ llvm-dwarfdump test2.o

...
0x0000003e:   DW_TAG_subprogram
                DW_AT_low_pc  (0x0000000000000010)
                DW_AT_high_pc  (0x0000000000000015)
                DW_AT_frame_base  (DW_OP_reg7 RSP)
                DW_AT_call_all_calls  (true)
                DW_AT_name  ("bar")
                DW_AT_decl_file  ("/home/yhs/tests/sig-change/struct/test2.c")
                DW_AT_decl_line  (5)
                DW_AT_prototyped  (true)
                DW_AT_type  (0x0000005f "long")
                DW_AT_external  (true)

0x0000004d:     DW_TAG_formal_parameter
                  DW_AT_location  (DW_OP_fbreg +8)
                  DW_AT_name  ("arg")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/struct/test2.c")
                  DW_AT_decl_line  (5)
                  DW_AT_type  (0x00000079 "t")

0x00000058:     DW_TAG_call_site
                  DW_AT_call_origin  (0x00000023 "foo")
                  DW_AT_call_tail_call  (true)
                  DW_AT_call_pc  (0x0000000000000010)

0x0000005e:     NULL
...
0x00000063:   DW_TAG_inlined_subroutine
                DW_AT_name  ("foo")
                DW_AT_type  (0x0000005f "long")
                DW_AT_artificial  (true)
                DW_AT_specification  (0x00000023 "foo")

0x0000006d:     DW_TAG_formal_parameter
                  DW_AT_name  ("arg")
                  DW_AT_type  (0x00000074 "t *")

0x00000073:     NULL

In the above example, it is not clear from DW_TAG_subprogram what type the parameter actually has. DW_TAG_inlined_subroutine shows the type clearly. Again, the name could be changed, e.g. to 'arg_ptr', if desired.

Example 4
---------

Source:

  $ cat test.c
  __attribute__((noinline)) static int callee(const int *p) { return *p + 42; }
  int caller(void) {
    int x = 100;
    return callee(&x);
  }

Compiled and dump dwarf with:

  $ clang -O3 -c -g test.c
  $ llvm-dwarfdump test.o

...
0x0000004a:   DW_TAG_subprogram
                DW_AT_low_pc  (0x0000000000000010)
                DW_AT_high_pc  (0x0000000000000014)
                DW_AT_frame_base  (DW_OP_reg7 RSP)
                DW_AT_call_all_calls  (true)
                DW_AT_name  ("callee")
                DW_AT_decl_file  ("/home/yhs/tests/sig-change/prom/test.c")
                DW_AT_decl_line  (1)
                DW_AT_prototyped  (true)
                DW_AT_calling_convention  (DW_CC_nocall)
                DW_AT_type  (0x00000063 "int")

0x0000005a:     DW_TAG_formal_parameter
                  DW_AT_name  ("p")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/prom/test.c")
                  DW_AT_decl_line  (1)
                  DW_AT_type  (0x00000078 "const int *")

0x00000062:     NULL
...
0x00000067:   DW_TAG_inlined_subroutine
                DW_AT_name  ("callee")
                DW_AT_type  (0x00000063 "int")
                DW_AT_artificial  (true)
                DW_AT_specification  (0x0000004a "callee")

0x00000071:     DW_TAG_formal_parameter
                  DW_AT_name  ("__0")
                  DW_AT_type  (0x00000063 "int")

0x00000077:     NULL

In the above, the function

  static int callee(const int *p) { return *p + 42; }

is transformed to

  static int callee(int p) { return p + 42; }

But the new signature is not reflected in DW_TAG_subprogram. The DW_TAG_inlined_subroutine can precisely capture the signature. Note that the parameter name is "__0", where "0" means the first argument. The name comes from the following IR:

  define internal ... i32 @callee(i32 %0) unnamed_addr #1 !dbg !23 {
      #dbg_value(ptr poison, !29, !DIExpression(), !30)
    %2 = add nsw i32 %0, 42, !dbg !31
    ret i32 %2, !dbg !32
  }
  ...
  !29 = !DILocalVariable(name: "p", arg: 1, scope: !23, file: !1, line: 1, type: !26)

The #dbg_value operand is 'ptr poison', which means the debug value should not be used any more. This is also the reason that the above DW_TAG_subprogram does not have location information for 'p'. DW_TAG_inlined_subroutine can still provide the correct signature.

If we compile like below:

  clang -O3 -c -g test.c -fno-discard-value-names

the function argument name will be preserved

  ... i32 @callee(i32 %p.0.val) ...
and in such cases, the DW_TAG_inlined_subroutine looks like below:

0x00000067:   DW_TAG_inlined_subroutine
                DW_AT_name  ("callee")
                DW_AT_type  (0x00000063 "int")
                DW_AT_artificial  (true)
                DW_AT_specification  (0x0000004a "callee")

0x00000071:     DW_TAG_formal_parameter
                  DW_AT_name  ("p__0__val")
                  DW_AT_type  (0x00000063 "int")

0x00000077:     NULL

Note that '.' in the original argument name is replaced with "__" so that the argument name is a valid C identifier. Based on a run on the linux kernel, names like "__<arg_index>" account for roughly 2% of all signature-changed functions, so this is probably okay for now.

Non-LTO vs. LTO
---------------

For thin-lto mode, we often see kernel symbols like

  p9_req_cache.llvm.13472271643223911678

Even if this symbol has the same source-level signature as p9_req_cache, a special DW_TAG_inlined_subroutine will be generated with the name 'p9_req_cache.llvm.13472271643223911678'. With this, a tool (e.g., pahole) may generate a BTF entry for this name, which could be used for fentry/fexit tracing.

If a symbol "<foo>.llvm.<hash>" has a different signature than the source-level "<foo>", a special DW_TAG_inlined_subroutine will be generated like below:

0x10f0793f:   DW_TAG_inlined_subroutine
                DW_AT_name  ("flow_offload_fill_route")
                DW_AT_linkage_name  ("flow_offload_fill_route.llvm.14555965973926298225")
                DW_AT_artificial  (true)
                DW_AT_specification  (0x10ee9e54 "flow_offload_fill_route")

0x10f07949:     DW_TAG_formal_parameter
                  DW_AT_name  ("flow")
                  DW_AT_type  (0x10ee837a "flow_offload *")

0x10f07951:     DW_TAG_formal_parameter
                  DW_AT_name  ("route")
                  DW_AT_type  (0x10eea4ef "nf_flow_route *")

0x10f07959:     DW_TAG_formal_parameter
                  DW_AT_name  ("dir")
                  DW_AT_type  (0x10ecef15 "int")

0x10f07961:     NULL

In the above, function "flow_offload_fill_route" has return type "int" at the source level, but optimization eventually changed the return type to "void" (the DW_TAG_inlined_subroutine has no DW_AT_type). Tools like pahole may choose to generate two entries for vmlinux BTF, one for DW_AT_name and one for DW_AT_linkage_name.
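A tool that wants to relate the two names could recover the source-level name from the thin-lto linkage name by dropping the ".llvm.<hash>" suffix. The C sketch below assumes the hash is a run of decimal digits, as in the symbols shown above; the helper name `source_name_len` is hypothetical and is not part of llvm or pahole.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the length of the source-level name inside
 * `linkage_name`, i.e. the prefix before a trailing ".llvm.<digits>" suffix.
 * If no such suffix is present, the whole string is the name. */
static size_t source_name_len(const char *linkage_name)
{
    static const char marker[] = ".llvm.";
    const char *suffix = strstr(linkage_name, marker);
    if (suffix == NULL)
        return strlen(linkage_name);

    const char *digits = suffix + sizeof(marker) - 1;
    if (*digits == '\0')
        return strlen(linkage_name);
    for (const char *p = digits; *p != '\0'; p++) {
        /* Assumption: the hash consists of decimal digits only. */
        if (*p < '0' || *p > '9')
            return strlen(linkage_name);
    }
    return (size_t)(suffix - linkage_name);
}
```

For "flow_offload_fill_route.llvm.14555965973926298225" this returns the length of "flow_offload_fill_route", which is the DW_AT_name in the dump above; for a plain name such as "p9_req_cache" it returns the full length.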
Note that one source symbol may have multiple linkage names due to (possibly repeated) cloning in llvm. In such cases, multiple DW_TAG_inlined_subroutine instances are possible.

Some restrictions
=================

There are some restrictions in the current implementation:
 - Only C language is supported
 - BPF target is excluded, as one of the main goals of this pull request is to generate proper vmlinux BTF for archs like x86_64/arm64 etc.
 - The function must not be an intrinsic or declaration only, must not have a return value larger than the arch register size, and must not take variable arguments.
 - For arguments, only int/ptr types are supported.
 - Some union type arguments (e.g., 8B < union_size <= 16B) may have a DIType issue, so some functions may be skipped.

Some statistics with linux kernel
=================================

I have tested this patch set by building the latest bpf-next linux kernel.

For no-lto case:
  65341 original number of functions
  1054 signature changed functions with this patch

For thin-lto case:
  65595 original number of functions
  3150 signature changed functions with this patch

Next step
=========

With this llvm change, we will be able to do some work in pahole and libbpf. For pahole, currently we will see the warning:

  die__process_unit: DW_TAG_inlined_subroutine (0x1d) @ <0xf2db986> not handled in a c11 CU!

Basically, these DW_TAG_inlined_subroutine entries are not inside a DISubprogram.

  [1] llvm#127855
  [2] llvm#157349
  [3] https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609

@callee

Add a new pass EmitChangedFuncDebugInfo which will add dwarf for additional functions whose signatures are changed during compiler transformations.

The original intention is bpf-based linux kernel tracing. The function signature is available in vmlinux BTF, which pahole generates from dwarf. That signature reflects the source level. This is not ideal since some functions may have their signatures changed. If users still rely on the source-level signature, they may not get correct results and may need some effort to work around the issue. So we want to encode the true signature (which may differ from the source-level one) in dwarf.

With such additional information, dwarf users can identify these signature-changed functions. For example, pahole can process them and encode them into vmlinux BTF properly.

History of multiple attempts
============================

I have made a few previous attempts ([1], [2] and [3]).

Initially I tried to modify debuginfo in passes like ArgPromotion and DeadArgElim, but it was later suggested to have a central place to handle new signatures ([1]).

Later, I had another version of the patch similar to this one, but the recommendation was to modify debuginfo to encode the new signature within the same function, either through inlinedAt or by having the new signature overwrite the old one. This seemed to work, but it had a side effect on lldb: some lldb output (e.g. back traces) differed from before. The recommendation is to avoid any behavior change for lldb ([2] and [3]).

So now I have come back to the solution discussed at the end of [1]. Basically, a special dwarf entry is generated to encode the new signature. The new signature has a reference to the old source-level signature, so a tool can inspect dwarf to retrieve the related info.
Examples and dwarf output
=========================

Below, a few examples show how changed signatures are represented in dwarf:

Example 1
---------

Source:

  $ cat test.c
  struct t { int a; };
  char *tar(struct t *a, struct t *d);
  __attribute__((noinline)) static char * foo(struct t *a, int b, struct t *d)
  {
    return tar(a, d);
  }
  char *bar(struct t *a, struct t *d)
  {
    return foo(a, 1, d);
  }

Compiled and dump dwarf with:

  $ clang -O2 -c -g test.c -mllvm -enable-changed-func-dbinfo
  $ llvm-dwarfdump test.o

0x0000000c: DW_TAG_compile_unit
              ...
0x0000005c:   DW_TAG_subprogram
                DW_AT_low_pc  (0x0000000000000010)
                DW_AT_high_pc  (0x0000000000000015)
                DW_AT_frame_base  (DW_OP_reg7 RSP)
                DW_AT_call_all_calls  (true)
                DW_AT_name  ("foo")
                DW_AT_decl_file  ("/home/yhs/tests/sig-change/deadarg/test.c")
                DW_AT_decl_line  (3)
                DW_AT_prototyped  (true)
                DW_AT_calling_convention  (DW_CC_nocall)
                DW_AT_type  (0x000000b1 "char *")

0x0000006c:     DW_TAG_formal_parameter
                  DW_AT_location  (DW_OP_reg5 RDI)
                  DW_AT_name  ("a")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line  (3)
                  DW_AT_type  (0x000000ba "t *")

0x00000076:     DW_TAG_formal_parameter
                  DW_AT_name  ("b")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line  (3)
                  DW_AT_type  (0x000000ce "int")

0x0000007e:     DW_TAG_formal_parameter
                  DW_AT_location  (DW_OP_reg4 RSI)
                  DW_AT_name  ("d")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line  (3)
                  DW_AT_type  (0x000000ba "t *")

0x00000088:     DW_TAG_call_site
                  ...

0x0000009d:     NULL
...
0x000000d2:   DW_TAG_inlined_subroutine
                DW_AT_name  ("foo")
                DW_AT_type  (0x000000b1 "char *")
                DW_AT_artificial  (true)
                DW_AT_specification  (0x0000005c "foo")

0x000000dc:     DW_TAG_formal_parameter
                  DW_AT_name  ("a")
                  DW_AT_type  (0x000000ba "t *")

0x000000e2:     DW_TAG_formal_parameter
                  DW_AT_name  ("d")
                  DW_AT_type  (0x000000ba "t *")

0x000000e8:     NULL

In the above, the DISubprogram 'foo' has the original signature, but since parameter 'b' does not have DW_AT_location, it is clear that the parameter will not be used. The actual function signature is represented in DW_TAG_inlined_subroutine. For the above case, it looks like DW_TAG_inlined_subroutine is not necessary. Let us try a few other examples below.

Example 2
---------

Source:

  $ cat test.c
  struct t { long a; long b;};
  __attribute__((noinline)) static long foo(struct t arg) {
    return arg.b * 5;
  }
  long bar(struct t arg) {
    return foo(arg);
  }

Compiled and dump dwarf with:

  $ clang -O2 -c -g test.c -mllvm -enable-changed-func-dbinfo
  $ llvm-dwarfdump test.o

...
0x0000004e:   DW_TAG_subprogram
                DW_AT_low_pc  (0x0000000000000010)
                DW_AT_high_pc  (0x0000000000000015)
                DW_AT_frame_base  (DW_OP_reg7 RSP)
                DW_AT_call_all_calls  (true)
                DW_AT_name  ("foo")
                DW_AT_decl_file  ("/home/yhs/tests/sig-change/struct/test.c")
                DW_AT_decl_line  (2)
                DW_AT_prototyped  (true)
                DW_AT_calling_convention  (DW_CC_nocall)
                DW_AT_type  (0x0000006d "long")

0x0000005e:     DW_TAG_formal_parameter
                  DW_AT_location  (DW_OP_piece 0x8, DW_OP_reg5 RDI, DW_OP_piece 0x8)
                  DW_AT_name  ("arg")
                  DW_AT_decl_file  ("/home/yhs/tests/sig-change/struct/test.c")
                  DW_AT_decl_line  (2)
                  DW_AT_type  (0x00000099 "t")

0x0000006c:     NULL
...
0x00000088:   DW_TAG_inlined_subroutine
                DW_AT_name  ("foo")
                DW_AT_type  (0x0000006d "long")
                DW_AT_artificial  (true)
                DW_AT_specification  (0x0000004e "foo")

0x00000092:     DW_TAG_formal_parameter
                  DW_AT_name  ("arg")
                  DW_AT_type  (0x0000006d "long")

0x00000098:     NULL

In the above case for function foo(), the original argument is 'struct t', but the final actual argument is of type 'long'.
DW_TAG_inlined_subroutine can clearly represent the signature type instead of doing DW_AT_location thing. There is a problem in the above then, it is not clear what formal parameter 'arg' corresponds to the original parameter. If necessary, the compiler could change 'arg' to e.g. 'arg_offset_8' to indicate it is 8 byte offset from the original struct. Example 3 --------- Source: $ cat test2.c struct t { long a; long b; long c;}; __attribute__((noinline)) long foo(struct t arg) { return arg.a * arg.c; } long bar(struct t arg) { return foo(arg); } Compiled and dump dwarf with: $ clang -O2 -c -g test2.c -mllvm -enable-changed-func-dbinfo $ llvm-dwarfdump test2.o ... 0x0000003e: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("bar") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c") DW_AT_decl_line (5) DW_AT_prototyped (true) DW_AT_type (0x0000005f "long") DW_AT_external (true) 0x0000004d: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +8) DW_AT_name ("arg") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c") DW_AT_decl_line (5) DW_AT_type (0x00000079 "t") 0x00000058: DW_TAG_call_site DW_AT_call_origin (0x00000023 "foo") DW_AT_call_tail_call (true) DW_AT_call_pc (0x0000000000000010) 0x0000005e: NULL ... 0x00000063: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x0000005f "long") DW_AT_artificial (true) DW_AT_specification (0x00000023 "foo") 0x0000006d: DW_TAG_formal_parameter DW_AT_name ("arg") DW_AT_type (0x00000074 "t *") 0x00000073: NULL In the above example, from DW_TAG_subprogram, it is not clear what kind of type the parameter should be. But DW_TAG_inlined_subroutine can clearly show what the type should be. Again, the name can be changed e.g. 'arg_ptr' if desired. 
Example 4
---------

Source:

  $ cat test.c
  __attribute__((noinline)) static int callee(const int *p) { return *p + 42; }
  int caller(void) {
    int x = 100;
    return callee(&x);
  }

Compile and dump dwarf with:

  $ clang -O3 -c -g test.c -mllvm -enable-changed-func-dbinfo
  $ llvm-dwarfdump test.o
  ...
  0x0000004a:   DW_TAG_subprogram
                  DW_AT_low_pc (0x0000000000000010)
                  DW_AT_high_pc (0x0000000000000014)
                  DW_AT_frame_base (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls (true)
                  DW_AT_name ("callee")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c")
                  DW_AT_decl_line (1)
                  DW_AT_prototyped (true)
                  DW_AT_calling_convention (DW_CC_nocall)
                  DW_AT_type (0x00000063 "int")

  0x0000005a:     DW_TAG_formal_parameter
                    DW_AT_name ("p")
                    DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c")
                    DW_AT_decl_line (1)
                    DW_AT_type (0x00000078 "const int *")

  0x00000062:     NULL
  ...
  0x00000067:   DW_TAG_inlined_subroutine
                  DW_AT_name ("callee")
                  DW_AT_type (0x00000063 "int")
                  DW_AT_artificial (true)
                  DW_AT_specification (0x0000004a "callee")

  0x00000071:     DW_TAG_formal_parameter
                    DW_AT_name ("__0")
                    DW_AT_type (0x00000063 "int")

  0x00000077:     NULL

In the above, the function

  static int callee(const int *p) { return *p + 42; }

is transformed to

  static int callee(int p) { return p + 42; }

But the new signature is not reflected in DW_TAG_subprogram.
The DW_TAG_inlined_subroutine can precisely capture the signature.
Note that the parameter name is "__0", where "0" means the first argument.
The name comes from the following IR:

  define internal ... i32 @callee(i32 %0) unnamed_addr #1 !dbg !23 {
      #dbg_value(ptr poison, !29, !DIExpression(), !30)
    %2 = add nsw i32 %0, 42, !dbg !31
    ret i32 %2, !dbg !32
  }
  ...
  !29 = !DILocalVariable(name: "p", arg: 1, scope: !23, file: !1, line: 1, type: !26)

The IR argument is unnamed ('%0'), and the #dbg_value for 'p' is 'ptr poison',
which means the debug value should not be used any more. This is also the
reason that the above DW_TAG_subprogram does not have location information.
DW_TAG_inlined_subroutine can still provide the correct signature.
If we compile like below:

  clang -O3 -c -g test.c -fno-discard-value-names -mllvm -enable-changed-func-dbinfo

the function argument name will be preserved

  ... i32 @callee(i32 %p.0.val) ...

and in such cases, the DW_TAG_inlined_subroutine looks like below:

  0x00000067:   DW_TAG_inlined_subroutine
                  DW_AT_name ("callee")
                  DW_AT_type (0x00000063 "int")
                  DW_AT_artificial (true)
                  DW_AT_specification (0x0000004a "callee")

  0x00000071:     DW_TAG_formal_parameter
                    DW_AT_name ("p__0__val")
                    DW_AT_type (0x00000063 "int")

  0x00000077:     NULL

Note that each '.' in the original argument name is replaced with "__" so
that the argument name is a valid C identifier.

Non-LTO vs. LTO
---------------

For thin-lto mode, we often see kernel symbols like

  p9_req_cache.llvm.13472271643223911678

Even if this symbol has the same source level signature as p9_req_cache,
a special DW_TAG_inlined_subroutine will be generated with name
'p9_req_cache.llvm.13472271643223911678'. With this, some tool (e.g., pahole)
may generate a BTF entry for this name which could be used for fentry/fexit
tracing.

But if a symbol "<foo>.llvm.<hash>" has a signature different from the source
level "<foo>", then a special DW_TAG_inlined_subroutine will be generated like
below:

  0x10f0793f:   DW_TAG_inlined_subroutine
                  DW_AT_name ("flow_offload_fill_route")
                  DW_AT_linkage_name ("flow_offload_fill_route.llvm.14555965973926298225")
                  DW_AT_artificial (true)
                  DW_AT_specification (0x10ee9e54 "flow_offload_fill_route")

  0x10f07949:     DW_TAG_formal_parameter
                    DW_AT_name ("flow")
                    DW_AT_type (0x10ee837a "flow_offload *")

  0x10f07951:     DW_TAG_formal_parameter
                    DW_AT_name ("route")
                    DW_AT_type (0x10eea4ef "nf_flow_route *")

  0x10f07959:     DW_TAG_formal_parameter
                    DW_AT_name ("dir")
                    DW_AT_type (0x10ecef15 "int")

  0x10f07961:     NULL

In the above, function "flow_offload_fill_route" has return type "int" at
source level, but optimization eventually changed the return type to "void".
Tools like pahole may choose to generate two entries for vmlinux BTF, one
for DW_AT_name and one for DW_AT_linkage_name.

Note that one source symbol may have multiple linkage names, since llvm may
clone a function more than once. In such cases, multiple
DW_TAG_inlined_subroutine instances might be possible.

Some restrictions
=================

There are some restrictions in the current implementation:
  - Only C language is supported.
  - BPF target is excluded, as one of the main goals of this pull request is
    to generate proper vmlinux BTF for arches like x86_64/arm64 etc.
  - A function is skipped if it is an intrinsic, is a declaration only, has
    a return value larger than the arch register size, or takes variable
    arguments.
  - For arguments, only int/ptr types are supported.
  - Some union type arguments (e.g., 8B < union_size <= 16B) may have a
    DIType issue, so some functions may be skipped.

Some statistics with linux kernel
=================================

I have tested this patch set by building the latest bpf-next linux kernel.

For no-lto case:
  65341 original number of functions
  1054 signature changed functions with this patch

For thin-lto case:
  65595 original number of functions
  3150 signature changed functions with this patch

Next step
=========

With this llvm change, we will be able to do some work in pahole and libbpf.
For pahole, currently we will see the warning:

  die__process_unit: DW_TAG_inlined_subroutine (0x1d) @ <0xf2db986> not handled in a c11 CU!

Basically, these DW_TAG_inlined_subroutine entries are not inside the
DISubprogram.

  [1] llvm#127855
  [2] llvm#157349
  [3] https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609


Add a new pass EmitChangedFuncDebugInfo which adds dwarf for functions
whose signatures are changed during compiler transformations.

The original motivation is bpf-based linux kernel tracing. Function
signatures are available in vmlinux BTF, which pahole generates from dwarf.
Those signatures reflect the source level, which is not ideal since some
functions may have their signatures changed. If users rely on the source
level signature, they may not get correct results and may need extra effort
to work around the issue. So we want to encode the true signature (different
from the source one) in dwarf.

With such additional information, dwarf users can find these signature
changed functions. For example, pahole can then process these signature
changed functions and encode them into vmlinux BTF properly.

History of multiple attempts
============================

I have made a few attempts previously ([1], [2] and [3]). Initially I tried
to modify debuginfo in passes like ArgPromotion and DeadArgElim, but it was
later suggested to have a central place to handle new signatures ([1]).
Later, I had another version of the patch similar to this one, but the
recommendation was to modify debuginfo to encode the new signature within
the same function, either through inlinedAt or by having the new signature
overwrite the old one. This seemed to work, but it has a side effect on
lldb: some lldb output (e.g. back trace) differs from before. The
recommendation is to avoid any behavior change for lldb ([2] and [3]).

So now I have come back to the solution discussed at the end of [1].
Basically, a special dwarf entry is generated to encode the new signature.
The new signature holds a reference to the old source-level signature, so a
tool can inspect dwarf to retrieve the related info.
Examples and dwarf output ========================= In below, a few examples will show how changed signatures represented in dwarf: Example 1 --------- Source: $ cat test.c struct t { int a; }; char *tar(struct t *a, struct t *d); __attribute__((noinline)) static char * foo(struct t *a, int b, struct t *d) { return tar(a, d); } char *bar(struct t *a, struct t *d) { return foo(a, 1, d); } Compiled and dump dwarf with: $ clang -O2 -c -g test.c -mllvm -enable-changed-func-dbinfo $ llvm-dwarfdump test.o 0x0000000c: DW_TAG_compile_unit ... 0x0000005c: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("foo") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_prototyped (true) DW_AT_calling_convention (DW_CC_nocall) DW_AT_type (0x000000b1 "char *") 0x0000006c: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_name ("a") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_type (0x000000ba "t *") 0x00000076: DW_TAG_formal_parameter DW_AT_name ("b") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_type (0x000000ce "int") 0x0000007e: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_name ("d") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_type (0x000000ba "t *") 0x00000088: DW_TAG_call_site ... 0x0000009d: NULL ... 
0x000000d2: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x000000b1 "char *") DW_AT_artificial (true) DW_AT_specification (0x0000005c "foo") 0x000000dc: DW_TAG_formal_parameter DW_AT_name ("a") DW_AT_type (0x000000ba "t *") 0x000000e2: DW_TAG_formal_parameter DW_AT_name ("d") DW_AT_type (0x000000ba "t *") 0x000000e8: NULL In the above, the DISubprogram 'foo' has the original signature but since parameter 'b' does not have DW_AT_location, it is clear that parameter will not be used. The actual function signature is represented in DW_TAG_inlined_subroutine. For the above case, it looks like DW_TAG_inlined_subroutine is not necessary. Let us try a few other examples below. Example 2 --------- Source: $ cat test.c struct t { long a; long b;}; __attribute__((noinline)) static long foo(struct t arg) { return arg.b * 5; } long bar(struct t arg) { return foo(arg); } Compiled and dump dwarf with: $ clang -O2 -c -g test.c -mllvm -enable-changed-func-dbinfo $ llvm-dwarfdump test.o ... 0x0000004e: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("foo") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test.c") DW_AT_decl_line (2) DW_AT_prototyped (true) DW_AT_calling_convention (DW_CC_nocall) DW_AT_type (0x0000006d "long") 0x0000005e: DW_TAG_formal_parameter DW_AT_location (DW_OP_piece 0x8, DW_OP_reg5 RDI, DW_OP_piece 0x8) DW_AT_name ("arg") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test.c") DW_AT_decl_line (2) DW_AT_type (0x00000099 "t") 0x0000006c: NULL ... 0x00000088: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x0000006d "long") DW_AT_artificial (true) DW_AT_specification (0x0000004e "foo") 0x00000092: DW_TAG_formal_parameter DW_AT_name ("arg__coerce1") DW_AT_type (0x0000006d "long") 0x00000098: NULL In the above case for function foo(), the original argument is 'struct t', but the final actual argument is a 'long' type. 
DW_TAG_inlined_subroutine can clearly represent the signature type instead of doing DW_AT_location thing. Note that the name 'arg__coerce1' presents the second long type value of the struct 't'. The llvm may put 'arg.coerce1' as the func argument name, we use 'arg__coerce1' so the argument name can be represented in C code. Example 3 --------- Source: $ cat test2.c struct t { long a; long b; long c;}; __attribute__((noinline)) static long foo(struct t arg, int a) { return arg.a * arg.c; } long bar(struct t arg) { return foo(arg, 1); } Compiled and dump dwarf with: $ clang -O2 -c -g test2.c -mllvm -enable-changed-func-dbinfo $ llvm-dwarfdump test2.o ... 0x0000003e: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("bar") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c") DW_AT_decl_line (5) DW_AT_prototyped (true) DW_AT_type (0x0000005f "long") DW_AT_external (true) 0x0000004d: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +8) DW_AT_name ("arg") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c") DW_AT_decl_line (5) DW_AT_type (0x00000079 "t") 0x00000058: DW_TAG_call_site DW_AT_call_origin (0x00000023 "foo") DW_AT_call_tail_call (true) DW_AT_call_pc (0x0000000000000010) 0x0000005e: NULL ... 0x00000063: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x0000005f "long") DW_AT_artificial (true) DW_AT_specification (0x00000023 "foo") 0x0000006d: DW_TAG_formal_parameter DW_AT_name ("arg") DW_AT_type (0x00000074 "t") 0x00000073: NULL In the above example, from DW_TAG_subprogram, it is not clear what kind of type the parameter should be. But DW_TAG_inlined_subroutine can clearly show what the type should be. 
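As a rough illustration of the renaming mentioned for Example 2
('arg.coerce1' becoming 'arg__coerce1'), the C sketch below replaces each '.'
in an LLVM value name with "__". The helper name sanitize_arg_name and its
buffer-based interface are assumptions made for this illustration; this is
not the code used by the pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the pass's actual code): copy an LLVM value
 * name into dst, replacing each '.' with "__" so that the result can be
 * used as a C identifier.
 * Returns 0 on success, -1 if dst is too small to hold the result.
 */
static int sanitize_arg_name(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; src++) {
        if (*src == '.') {
            /* Need room for two underscores plus the trailing NUL. */
            if (out + 2 >= dst_size)
                return -1;
            dst[out++] = '_';
            dst[out++] = '_';
        } else {
            /* Need room for one character plus the trailing NUL. */
            if (out + 1 >= dst_size)
                return -1;
            dst[out++] = *src;
        }
    }

    if (out >= dst_size)
        return -1;
    dst[out] = '\0';
    return 0;
}
```

With this sketch, the name 'arg.coerce1' maps to 'arg__coerce1', which matches
the DW_AT_name shown in the Example 2 dump.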
Example 4 --------- Source: $ cat test.c __attribute__((noinline)) static int callee(const int *p) { return *p + 42; } int caller(void) { int x = 100; return callee(&x); } Compiled and dump dwarf with: $ clang -O3 -c -g test.c -mllvm -enable-changed-func-dbinfo $ llvm-dwarfdump test.o ... 0x0000004a: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000014) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("callee") DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c") DW_AT_decl_line (1) DW_AT_prototyped (true) DW_AT_calling_convention (DW_CC_nocall) DW_AT_type (0x00000063 "int") 0x0000005a: DW_TAG_formal_parameter DW_AT_name ("p") DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c") DW_AT_decl_line (1) DW_AT_type (0x00000078 "const int *") 0x00000062: NULL ... 0x00000067: DW_TAG_inlined_subroutine DW_AT_name ("callee") DW_AT_type (0x00000063 "int") DW_AT_artificial (true) DW_AT_specification (0x0000004a "callee") 0x00000071: DW_TAG_formal_parameter DW_AT_name ("__0") DW_AT_type (0x00000063 "int") 0x00000077: NULL In the above, the function static int callee(const int *p) { return *p + 42; } is transformed to static int callee(int p) { return p + 42; } But the new signature is not reflected in DW_TAG_subprogram. The DW_TAG_inlined_subroutine can precisely capture the signature. Note that the parameter name is "__0" and "0" means the first argument. The reason is due to the following IR: define internal ... i32 @callee(i32 %0) unnamed_addr llvm#1 !dbg !23 { #dbg_value(ptr poison, !29, !DIExpression(), !30) %2 = add nsw i32 %0, 42, !dbg !31 ret i32 %2, !dbg !32 } ... !29 = !DILocalVariable(name: "p", arg: 1, scope: !23, file: !1, line: 1, type: !26) The reason is due to 'ptr poison' as 'ptr poison' mean the debug value should not be used any more. This is also the reason that the above DW_TAG_subprogram does not have location information. DW_TAG_inlined_subroutine can provide correct signature though. 
With additional options like

  clang -O3 -c -g test.c -mllvm -enable-changed-func-dbinfo -fsave-optimization-record \
    -foptimization-record-passes=emit-changed-func-debuginfo

a file test.opt.yaml is generated with the following remark:

  $ cat test.opt.yaml
  --- !Passed
  Pass:            emit-changed-func-debuginfo
  Name:            FindNoDIVariable
  DebugLoc:        { File: test.c, Line: 1, Column: 0 }
  Function:        callee
  Args:
    - String:          'create a new int type '
    - ArgName:         ''
    - String:          '('
    - ArgIndex:        '0'
    - String:          ')'
  ...

If we compile like below:

  clang -O3 -c -g test.c -fno-discard-value-names -mllvm -enable-changed-func-dbinfo

the function argument name will be preserved

  ... i32 @callee(i32 %p.0.val) ...

and in such cases, the DW_TAG_inlined_subroutine looks like below:

  0x00000067:   DW_TAG_inlined_subroutine
                  DW_AT_name ("callee")
                  DW_AT_type (0x00000063 "int")
                  DW_AT_artificial (true)
                  DW_AT_specification (0x0000004a "callee")

  0x00000071:     DW_TAG_formal_parameter
                    DW_AT_name ("p__0__val")
                    DW_AT_type (0x00000063 "int")

  0x00000077:     NULL

Note that each '.' in the original argument name is replaced with "__" so
that the argument name is a valid C identifier.

Non-LTO vs. LTO
---------------

For thin-lto mode, we often see kernel symbols like

  p9_req_cache.llvm.13472271643223911678

Even if this symbol has the same source level signature as p9_req_cache,
a special DW_TAG_inlined_subroutine will be generated with name
'p9_req_cache.llvm.13472271643223911678'. With this, some tool (e.g., pahole)
may generate a BTF entry for this name which could be used for bpf
fentry/fexit tracing.
But if a symbol "<foo>.llvm.<hash>" has a signature different from the source
level "<foo>", then a special DW_TAG_inlined_subroutine will be generated like
below:

  0x10f0793f:   DW_TAG_inlined_subroutine
                  DW_AT_name ("flow_offload_fill_route")
                  DW_AT_linkage_name ("flow_offload_fill_route.llvm.14555965973926298225")
                  DW_AT_artificial (true)
                  DW_AT_specification (0x10ee9e54 "flow_offload_fill_route")

  0x10f07949:     DW_TAG_formal_parameter
                    DW_AT_name ("flow")
                    DW_AT_type (0x10ee837a "flow_offload *")

  0x10f07951:     DW_TAG_formal_parameter
                    DW_AT_name ("route")
                    DW_AT_type (0x10eea4ef "nf_flow_route *")

  0x10f07959:     DW_TAG_formal_parameter
                    DW_AT_name ("dir")
                    DW_AT_type (0x10ecef15 "flow_offload_tuple_dir")

  0x10f07961:     NULL

In the above, function "flow_offload_fill_route" has return type "int" at
source level, but optimization eventually changed the return type to "void".

Tools like pahole may choose to generate two entries for vmlinux BTF, one
for DW_AT_name and one for DW_AT_linkage_name.

Function specialization
-----------------------

LLVM has a pass FunctionSpecializer (FunctionSpecialization.cpp) which is
called by the SCCP pass (Interprocedural Sparse Conditional Constant
Propagation). The FunctionSpecializer may clone functions, and the SCCP pass
is available in both non-LTO and LTO pipelines. For any function, there can
be up to 3 clones by default, and all these clones will have signatures
different from the source signature.

This is rare but it did happen. For example, for linux kernel thin lto mode,
I found the following in the kernel symbol table:

  ffffffff812036d0 t print_cpu.specialized.1

In this particular case, after cloning, the original function 'print_cpu'
is not used so it is removed. Here, print_cpu() is a static function.
Basically, the compiler creates a specialized 'print_cpu.specialized.1'
function and the original function 'print_cpu' also exists.
The dwarf for the above two functions:

  0x01484bea:   DW_TAG_subprogram
                  DW_AT_low_pc (0xffffffff812036d0)
                  DW_AT_high_pc (0xffffffff8120400c)
                  DW_AT_frame_base (DW_OP_reg6 RBP)
                  DW_AT_call_all_calls (true)
                  DW_AT_name ("print_cpu")
                  DW_AT_decl_file ("/home/yhs/work/bpf-next/kernel/sched/debug.c")
                  DW_AT_decl_line (922)
                  DW_AT_prototyped (true)
                  DW_AT_calling_convention (DW_CC_nocall)

  0x01484bfa:     DW_TAG_formal_parameter
                    DW_AT_const_value (0)
                    DW_AT_name ("m")
                    DW_AT_decl_file ("/home/yhs/work/bpf-next/kernel/sched/debug.c")
                    DW_AT_decl_line (922)
                    DW_AT_type (0x0146fd21 "seq_file *")

  0x01484c06:     DW_TAG_formal_parameter
                    DW_AT_location (indexed (0x7ee) loclist = 0x0011ce6d:
                       [0xffffffff812036d5, 0xffffffff81203730): DW_OP_reg5 RDI
                       [0xffffffff81203730, 0xffffffff812039fa): DW_OP_reg3 RBX
                       [0xffffffff812039fa, 0xffffffff81203a89): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                       [0xffffffff81203a89, 0xffffffff81203a8d): DW_OP_reg3 RBX
                       [0xffffffff81203a8d, 0xffffffff81203d58): DW_OP_breg7 RSP+12
                       [0xffffffff81203d7a, 0xffffffff81203ddd): DW_OP_breg7 RSP+12
                       [0xffffffff81203dfa, 0xffffffff81203f7b): DW_OP_breg7 RSP+12
                       [0xffffffff81203f7b, 0xffffffff81203f80): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                       [0xffffffff81203f80, 0xffffffff8120400c): DW_OP_reg3 RBX)
                    DW_AT_name ("cpu")
                    DW_AT_decl_file ("/home/yhs/work/bpf-next/kernel/sched/debug.c")
                    DW_AT_decl_line (922)
                    DW_AT_type (0x01462560 "int")
  ......
  0x014981fc:   DW_TAG_inlined_subroutine
                  DW_AT_name ("print_cpu.specialized.1")
                  DW_AT_artificial (true)
                  DW_AT_specification (0x01484bea "print_cpu")

  0x01498204:     DW_TAG_formal_parameter
                    DW_AT_name ("cpu")
                    DW_AT_type (0x01462560 "int")

  0x0149820c:     NULL

The specialized function "print_cpu.specialized.1" has a signature different
from the original "print_cpu", and its name is encoded directly in
DW_AT_name.
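Since the changed-signature entries carry names such as
'print_cpu.specialized.1' or '<foo>.llvm.<hash>', a dwarf consumer may want to
recover the source-level name from them. A minimal sketch in C, assuming the
source-level name is everything before the first '.' (reasonable for C, where
identifiers cannot contain '.'). The helpers base_name_len and
has_compiler_suffix are hypothetical and are not part of pahole or llvm.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: length of the source-level base name, i.e. the
 * number of characters before the first '.' in sym. If sym has no '.',
 * the whole string is the base name.
 */
static size_t base_name_len(const char *sym)
{
    const char *dot;

    assert(sym != NULL);
    dot = strchr(sym, '.');
    return dot ? (size_t)(dot - sym) : strlen(sym);
}

/*
 * Hypothetical helper: nonzero if sym carries a compiler-generated
 * suffix such as ".llvm.<hash>" or ".specialized.<n>".
 */
static int has_compiler_suffix(const char *sym)
{
    return sym[base_name_len(sym)] == '.';
}
```

A tool could compare the first base_name_len() characters of the DW_AT_name
against the name of the DIE referenced by DW_AT_specification to pair a
changed-signature entry with its source-level function.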
Some restrictions
=================

There are some restrictions in the current implementation:
  - Only C language is supported.
  - BPF target is excluded, as one of the main goals of this pull request is
    to generate proper vmlinux BTF for arches like x86_64/arm64 etc.
  - A function is skipped if it is an intrinsic, is a declaration only, has
    a return value larger than the arch register size, or takes variable
    arguments.
  - For arguments, only int/ptr types are supported.
  - For some union type arguments (e.g., 8B < union_size <= 16B), it may be
    unclear which member to pick, so the related functions may be skipped.

Remarks
=======

A few remarks are available for debugging purposes, including:
  - cannot handle union arguments (greater than 8B but less than or equal
    to 16B).
  - cannot find the corresponding DILocalVariable for the argument.
  - certain cases of dbg fragment handling.

Some statistics with linux kernel
=================================

I have tested this patch set by building the latest bpf-next linux kernel.

For no-lto case:
  66051 original number of functions
  894 signature changed or new with-dot functions with this patch

For thin-lto case:
  66227 original number of functions
  2993 signature changed or new with-dot functions with this patch

Next step
=========

With this llvm change, we will be able to do some work in pahole.
For pahole, currently we will see the warning:

  die__process_unit: DW_TAG_inlined_subroutine (0x1d) @ <0xf2db986> not handled in a c11 CU!

Basically, these DW_TAG_inlined_subroutine entries are not inside the
DISubprogram.

  [1] llvm#127855
  [2] llvm#157349
  [3] https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609


Add a new pass EmitChangedFuncDebugInfo which adds dwarf for functions
whose signatures are changed during compiler transformations.

The original motivation is bpf-based linux kernel tracing. Function
signatures are available in vmlinux BTF, which pahole generates from dwarf.
Those signatures reflect the source level, which is not ideal since some
functions may have their signatures changed. If users rely on the source
level signature, they may not get correct results and may need extra effort
to work around the issue. So we want to encode the true signature (different
from the source one) in dwarf.

With such additional information, dwarf users can find these signature
changed functions. For example, pahole can then process these signature
changed functions and encode them into vmlinux BTF properly.

History of multiple attempts
============================

I have made a few attempts previously ([1], [2] and [3]). Initially I tried
to modify debuginfo in passes like ArgPromotion and DeadArgElim, but it was
later suggested to have a central place to handle new signatures ([1]).
Later, I had another version of the patch similar to this one, but the
recommendation was to modify debuginfo to encode the new signature within
the same function, either through inlinedAt or by having the new signature
overwrite the old one. This seemed to work, but it has a side effect on
lldb: some lldb output (e.g. back trace) differs from before. The
recommendation is to avoid any behavior change for lldb ([2] and [3]).

So now I have come back to the solution discussed at the end of [1].
Basically, a special dwarf entry is generated to encode the new signature.
The new signature holds a reference to the old source-level signature, so a
tool can inspect dwarf to retrieve the related info.
Examples and dwarf output ========================= In below, a few examples will show how changed signatures represented in dwarf: Example 1 --------- Source: $ cat test.c struct t { int a; }; char *tar(struct t *a, struct t *d); __attribute__((noinline)) static char * foo(struct t *a, int b, struct t *d) { return tar(a, d); } char *bar(struct t *a, struct t *d) { return foo(a, 1, d); } Compiled and dump dwarf with: $ clang -O2 -c -g test.c -mllvm -enable-changed-func-dbinfo $ llvm-dwarfdump test.o 0x0000000c: DW_TAG_compile_unit ... 0x0000005c: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("foo") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_prototyped (true) DW_AT_calling_convention (DW_CC_nocall) DW_AT_type (0x000000b1 "char *") 0x0000006c: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg5 RDI) DW_AT_name ("a") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_type (0x000000ba "t *") 0x00000076: DW_TAG_formal_parameter DW_AT_name ("b") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_type (0x000000ce "int") 0x0000007e: DW_TAG_formal_parameter DW_AT_location (DW_OP_reg4 RSI) DW_AT_name ("d") DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c") DW_AT_decl_line (3) DW_AT_type (0x000000ba "t *") 0x00000088: DW_TAG_call_site ... 0x0000009d: NULL ... 
0x000000d2: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x000000b1 "char *") DW_AT_artificial (true) DW_AT_specification (0x0000005c "foo") 0x000000dc: DW_TAG_formal_parameter DW_AT_name ("a") DW_AT_type (0x000000ba "t *") 0x000000e2: DW_TAG_formal_parameter DW_AT_name ("d") DW_AT_type (0x000000ba "t *") 0x000000e8: NULL In the above, the DISubprogram 'foo' has the original signature but since parameter 'b' does not have DW_AT_location, it is clear that parameter will not be used. The actual function signature is represented in DW_TAG_inlined_subroutine. For the above case, it looks like DW_TAG_inlined_subroutine is not necessary. Let us try a few other examples below. Example 2 --------- Source: $ cat test.c struct t { long a; long b;}; __attribute__((noinline)) static long foo(struct t arg) { return arg.b * 5; } long bar(struct t arg) { return foo(arg); } Compiled and dump dwarf with: $ clang -O2 -c -g test.c -mllvm -enable-changed-func-dbinfo $ llvm-dwarfdump test.o ... 0x0000004e: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("foo") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test.c") DW_AT_decl_line (2) DW_AT_prototyped (true) DW_AT_calling_convention (DW_CC_nocall) DW_AT_type (0x0000006d "long") 0x0000005e: DW_TAG_formal_parameter DW_AT_location (DW_OP_piece 0x8, DW_OP_reg5 RDI, DW_OP_piece 0x8) DW_AT_name ("arg") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test.c") DW_AT_decl_line (2) DW_AT_type (0x00000099 "t") 0x0000006c: NULL ... 0x00000088: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x0000006d "long") DW_AT_artificial (true) DW_AT_specification (0x0000004e "foo") 0x00000092: DW_TAG_formal_parameter DW_AT_name ("b") DW_AT_type (0x0000006d "long") 0x00000098: NULL In the above case for function foo(), the original argument is 'struct t', but the final actual argument is a 'long' type. 
DW_TAG_inlined_subroutine can clearly represent the signature type instead of doing DW_AT_location thing. Note that the name 'b' presents the second long type value of the struct 't'. Example 3 --------- Source: $ cat test2.c struct t { long a; long b; long c;}; __attribute__((noinline)) static long foo(struct t arg, int a) { return arg.a * arg.c; } long bar(struct t arg) { return foo(arg, 1); } Compiled and dump dwarf with: $ clang -O2 -c -g test2.c -mllvm -enable-changed-func-dbinfo $ llvm-dwarfdump test2.o ... 0x0000003e: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000015) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("bar") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c") DW_AT_decl_line (5) DW_AT_prototyped (true) DW_AT_type (0x0000005f "long") DW_AT_external (true) 0x0000004d: DW_TAG_formal_parameter DW_AT_location (DW_OP_fbreg +8) DW_AT_name ("arg") DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c") DW_AT_decl_line (5) DW_AT_type (0x00000079 "t") 0x00000058: DW_TAG_call_site DW_AT_call_origin (0x00000023 "foo") DW_AT_call_tail_call (true) DW_AT_call_pc (0x0000000000000010) 0x0000005e: NULL ... 0x00000063: DW_TAG_inlined_subroutine DW_AT_name ("foo") DW_AT_type (0x0000005f "long") DW_AT_artificial (true) DW_AT_specification (0x00000023 "foo") 0x0000006d: DW_TAG_formal_parameter DW_AT_name ("arg") DW_AT_type (0x00000074 "t") 0x00000073: NULL In the above example, from DW_TAG_subprogram, it is not clear what kind of type the parameter should be. But DW_TAG_inlined_subroutine can clearly show what the type should be. Example 4 --------- Source: $ cat test.c __attribute__((noinline)) static int callee(const int *p) { return *p + 42; } int caller(void) { int x = 100; return callee(&x); } Compiled and dump dwarf with: $ clang -O3 -c -g test.c -mllvm -enable-changed-func-dbinfo $ llvm-dwarfdump test.o ... 
0x0000004a: DW_TAG_subprogram DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000014) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_call_all_calls (true) DW_AT_name ("callee") DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c") DW_AT_decl_line (1) DW_AT_prototyped (true) DW_AT_calling_convention (DW_CC_nocall) DW_AT_type (0x00000063 "int") 0x0000005a: DW_TAG_formal_parameter DW_AT_name ("p") DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c") DW_AT_decl_line (1) DW_AT_type (0x00000078 "const int *") 0x00000062: NULL ... 0x00000067: DW_TAG_inlined_subroutine DW_AT_name ("callee") DW_AT_type (0x00000063 "int") DW_AT_artificial (true) DW_AT_specification (0x0000004a "callee") 0x00000071: DW_TAG_formal_parameter DW_AT_name ("__0") DW_AT_type (0x00000063 "int") 0x00000077: NULL In the above, the function static int callee(const int *p) { return *p + 42; } is transformed to static int callee(int p) { return p + 42; } But the new signature is not reflected in DW_TAG_subprogram. The DW_TAG_inlined_subroutine can precisely capture the signature. Note that the parameter name is "__0" and "0" means the first argument. The reason is due to the following IR: define internal ... i32 @callee(i32 %0) unnamed_addr llvm#1 !dbg !23 { #dbg_value(ptr poison, !29, !DIExpression(), !30) %2 = add nsw i32 %0, 42, !dbg !31 ret i32 %2, !dbg !32 } ... !29 = !DILocalVariable(name: "p", arg: 1, scope: !23, file: !1, line: 1, type: !26) The reason is due to 'ptr poison' as 'ptr poison' mean the debug value should not be used any more. This is also the reason that the above DW_TAG_subprogram does not have location information. DW_TAG_inlined_subroutine can provide correct signature though. 
With additional options like

  clang -O3 -c -g test.c -mllvm -enable-changed-func-dbinfo -fsave-optimization-record \
    -foptimization-record-passes=emit-changed-func-debuginfo

a file test.opt.yaml is generated with the following remark:

  $ cat test.opt.yaml
  --- !Passed
  Pass:            emit-changed-func-debuginfo
  Name:            FindNoDIVariable
  DebugLoc:        { File: test.c, Line: 1, Column: 0 }
  Function:        callee
  Args:
    - String:          'create a new int type '
    - ArgName:         ''
    - String:          '('
    - ArgIndex:        '0'
    - String:          ')'
  ...

If we compile like below:

  clang -O3 -c -g test.c -fno-discard-value-names -mllvm -enable-changed-func-dbinfo

the function argument name will be preserved

  ... i32 @callee(i32 %p.0.val) ...

and in such cases, the DW_TAG_inlined_subroutine looks like below:

0x00000067:   DW_TAG_inlined_subroutine
                DW_AT_name      ("callee")
                DW_AT_type      (0x00000063 "int")
                DW_AT_artificial        (true)
                DW_AT_specification     (0x0000004a "callee")

0x00000071:     DW_TAG_formal_parameter
                  DW_AT_name    ("p__0__val")
                  DW_AT_type    (0x00000063 "int")

0x00000077:     NULL

Note that each '.' in the original argument name is replaced with
"__" so that the argument name is a valid C identifier.

Non-LTO vs. LTO
---------------

For thin-lto mode, we often see kernel symbols like

  p9_req_cache.llvm.13472271643223911678

Even if this symbol has the same source-level signature as
p9_req_cache, a special DW_TAG_inlined_subroutine will be generated
with name 'p9_req_cache.llvm.13472271643223911678'. With this, a
tool (e.g., pahole) may generate a BTF entry for this name, which
could be used for bpf fentry/fexit tracing.
But if a symbol "<foo>.llvm.<hash>" has a different signature than
the source-level "<foo>", then a special DW_TAG_inlined_subroutine
will be generated like below:

0x10f0793f:   DW_TAG_inlined_subroutine
                DW_AT_name      ("flow_offload_fill_route.llvm.14555965973926298225")
                DW_AT_artificial        (true)
                DW_AT_specification     (0x10ee9e54 "flow_offload_fill_route")

0x10f07949:     DW_TAG_formal_parameter
                  DW_AT_name    ("flow")
                  DW_AT_type    (0x10ee837a "flow_offload *")

0x10f07951:     DW_TAG_formal_parameter
                  DW_AT_name    ("route")
                  DW_AT_type    (0x10eea4ef "nf_flow_route *")

0x10f07959:     DW_TAG_formal_parameter
                  DW_AT_name    ("dir")
                  DW_AT_type    (0x10ecef15 "flow_offload_tuple_dir")

0x10f07961:     NULL

In the above, function "flow_offload_fill_route" has return type
"int" at source level, but optimization eventually changed the
return type to "void".

Function specialization
-----------------------

LLVM has a pass FunctionSpecializer (FunctionSpecialization.cpp)
which is called by the SCCP pass (Interprocedural Sparse Conditional
Constant Propagation). The FunctionSpecializer may clone functions,
and the SCCP pass is available in both non-LTO and LTO pipelines.
For any function, by default up to 3 clones can be created, and all
these clones will have signatures different from the source
signature. This is rare but it did happen. For example, for linux
kernel thin-lto mode, I found the following in the kernel symbol
table:

  ffffffff812036d0 t print_cpu.specialized.1

In this particular case, after cloning, the original function
'print_cpu' is not used so it is removed. Here, print_cpu() is a
static function. Basically, the compiler creates a specialized
'print_cpu.specialized.1' function and the original function
'print_cpu' also exists.
The dwarf for the above two functions:

0x01484bea:   DW_TAG_subprogram
                DW_AT_low_pc    (0xffffffff812036d0)
                DW_AT_high_pc   (0xffffffff8120400c)
                DW_AT_frame_base        (DW_OP_reg6 RBP)
                DW_AT_call_all_calls    (true)
                DW_AT_name      ("print_cpu")
                DW_AT_decl_file ("/home/yhs/work/bpf-next/kernel/sched/debug.c")
                DW_AT_decl_line (922)
                DW_AT_prototyped        (true)
                DW_AT_calling_convention        (DW_CC_nocall)

0x01484bfa:     DW_TAG_formal_parameter
                  DW_AT_const_value     (0)
                  DW_AT_name    ("m")
                  DW_AT_decl_file       ("/home/yhs/work/bpf-next/kernel/sched/debug.c")
                  DW_AT_decl_line       (922)
                  DW_AT_type    (0x0146fd21 "seq_file *")

0x01484c06:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x7ee) loclist = 0x0011ce6d:
                     [0xffffffff812036d5, 0xffffffff81203730): DW_OP_reg5 RDI
                     [0xffffffff81203730, 0xffffffff812039fa): DW_OP_reg3 RBX
                     [0xffffffff812039fa, 0xffffffff81203a89): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                     [0xffffffff81203a89, 0xffffffff81203a8d): DW_OP_reg3 RBX
                     [0xffffffff81203a8d, 0xffffffff81203d58): DW_OP_breg7 RSP+12
                     [0xffffffff81203d7a, 0xffffffff81203ddd): DW_OP_breg7 RSP+12
                     [0xffffffff81203dfa, 0xffffffff81203f7b): DW_OP_breg7 RSP+12
                     [0xffffffff81203f7b, 0xffffffff81203f80): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
                     [0xffffffff81203f80, 0xffffffff8120400c): DW_OP_reg3 RBX)
                  DW_AT_name    ("cpu")
                  DW_AT_decl_file       ("/home/yhs/work/bpf-next/kernel/sched/debug.c")
                  DW_AT_decl_line       (922)
                  DW_AT_type    (0x01462560 "int")
......
0x014981fc:   DW_TAG_inlined_subroutine
                DW_AT_name      ("print_cpu.specialized.1")
                DW_AT_artificial        (true)
                DW_AT_specification     (0x01484bea "print_cpu")

0x01498204:     DW_TAG_formal_parameter
                  DW_AT_name    ("cpu")
                  DW_AT_type    (0x01462560 "int")

0x0149820c:     NULL

The specialized function "print_cpu.specialized.1" has a signature
different from that of the original "print_cpu", and its name is
encoded directly in DW_AT_name.
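The naming conventions shown in the examples above can be sketched
from the point of view of a tool that consumes the new dwarf entries.
This is only an illustration, not code from this patch or from
pahole: the helper names split_symbol and dwarf_param_name are
hypothetical, the regular expression covers only the two suffix
shapes quoted in this description (".llvm.<hash>" and
".specialized.<n>"), and the parameter naming rule is inferred from
the "__0" and "p__0__val" examples.

```python
import re

# Assumption: only the suffix shapes quoted in this description are
# recognized; other compiler-generated suffixes are left untouched.
_SUFFIX = re.compile(r"^(?P<base>.+?)\.(?P<kind>llvm|specialized)\.(?P<tag>\d+)$")


def split_symbol(name):
    """Return (base, kind, tag); kind and tag are None for a plain name."""
    m = _SUFFIX.match(name)
    if m is None:
        return name, None, None
    return m.group("base"), m.group("kind"), m.group("tag")


def dwarf_param_name(ir_name, index):
    """Guess the DW_AT_name of a parameter from its IR argument name.

    Inferred from the examples only: an unnamed IR argument becomes
    "__<index>", and '.' in a named argument becomes "__".
    """
    if not ir_name:
        return "__" + str(index)
    return ir_name.replace(".", "__")
```

With these helpers, split_symbol("print_cpu.specialized.1") yields
("print_cpu", "specialized", "1"), and dwarf_param_name("p.0.val", 0)
yields "p__0__val". The base name is what DW_AT_specification points
at; the full dotted name is what appears in DW_AT_name.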
Some restrictions
=================

There are some restrictions in the current implementation:
  - Only the C language is supported.
  - The BPF target is excluded, since one of the main goals of this
    pull request is to generate proper vmlinux BTF for architectures
    such as x86_64 and arm64.
  - The function must not be an intrinsic or declaration-only, must
    not have a return value larger than the arch register size, and
    must not take variable arguments.
  - For arguments, only int/ptr types are supported.
  - For some union type arguments (e.g., 8B < union_size <= 16B) it
    may be unclear which member to pick, so the related functions
    may be skipped.

Remarks
=======

A few remarks are available for debugging purposes, including:
  - cannot handle union arguments (greater than 8B but less than or
    equal to 16B).
  - cannot find corresponding DILocalVariable for the argument.
  - certain cases of dbg fragment handling.

Some statistics with linux kernel
=================================

I have tested this patch set by building the latest bpf-next linux
kernel.

For no-lto case:
  66051 original number of functions
  894   signature changed or new with-dot functions with this patch
For thin-lto case:
  66227 original number of functions
  2990  signature changed or new with-dot functions with this patch

Next step
=========

With this llvm change, we will be able to do some work in pahole.
For pahole, currently we will see the warning:

  die__process_unit: DW_TAG_inlined_subroutine (0x1d) @ <0xf2db986> not handled in a c11 CU!

Basically these DW_TAG_inlined_subroutine entries are not inside the
DISubprogram.

  [1] llvm#127855
  [2] llvm#157349
  [3] https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609

llvmbot added llvm:codegen debuginfo llvm:transforms labels Sep 7, 2025

yonghong-song requested review from arsenm, clayborg, dwblaikie and pogo59 September 7, 2025 16:48

yonghong-song force-pushed the signature-change branch 3 times, most recently from a864cf6 to aea80d2 Compare September 7, 2025 18:35

yonghong-song requested review from 4ast and eddyz87 September 7, 2025 18:38

yonghong-song force-pushed the signature-change branch 14 times, most recently from c9f140c to c995ada Compare September 8, 2025 01:41

arsenm reviewed Sep 8, 2025

Yonghong Song added 3 commits October 11, 2025 22:19

yonghong-song force-pushed the signature-change branch from e8844f5 to 821d62a Compare October 12, 2025 17:59

yonghong-song mentioned this pull request Oct 27, 2025

[RFC][LLVM] Emit dwarf data for changed-signature and new functions #165310

Open

yonghong-song commented Sep 7, 2025 • edited

llvmbot commented Sep 7, 2025 • edited

github-actions bot commented Sep 7, 2025 • edited

yonghong-song commented Sep 7, 2025

arsenm Sep 8, 2025

yonghong-song Sep 16, 2025

yonghong-song commented Oct 12, 2025

OCHyams commented Oct 15, 2025

yonghong-song commented Oct 16, 2025 • edited

Non lto example:

Thin-LTO example:

yonghong-song commented Oct 17, 2025

yonghong-song commented Oct 20, 2025

dzhidzhoev commented Oct 20, 2025

dzhidzhoev commented Oct 20, 2025 • edited

dzhidzhoev commented Oct 20, 2025 • edited

OCHyams commented Oct 20, 2025

yonghong-song commented Oct 20, 2025

yonghong-song commented Oct 20, 2025 • edited

dafaust commented Oct 22, 2025

yonghong-song commented Oct 22, 2025 • edited

dafaust commented Oct 22, 2025

yonghong-song commented Oct 23, 2025

dafaust commented Oct 28, 2025
