Skip to content

Conversation

@jryans
Copy link
Member

@jryans jryans commented Nov 3, 2025

DISubprograms are attached to call sites to support various debug info features, including entry values and tail calls. Clang 9.0 (0f65168) was the first version to include this kind of call site DISubprogram attachment.

This earlier work appears to visit only some call site variants, however. The call site attachment was added to a higher-level EmitCall path in Clang's code gen that is only used by some call variants. In particular, some C++ member calls use a different code gen path, which did not include this call site attachment step, and thus the debug info it triggers (e.g. call site entries) was not emitted for such calls.

This moves DISubprogram attachment to a lower-level call emission path that is used by all call variants.

Fixes #161962

`DISubprogram`s are attached to call sites to support various debug info
features, including entry values and tail calls. Clang 9.0
(0f65168) was the first version to
include this kind of call site `DISubprogram` attachment.

This earlier work appears to visit only some call site variants,
however. The call site attachment was added to a higher-level `EmitCall`
path in Clang's code gen that is only used by some call variants. In
particular, some C++ member calls use a different code gen path, which
did not include this call site attachment step, and thus the debug info
it triggers (e.g. call site entries) was not emitted for such calls.

This moves `DISubprogram` attachment to a lower-level call emission path
that is used by all call variants.

Fixes llvm#161962
@jryans jryans added clang:codegen IR generation bugs: mangling, exceptions, etc. debuginfo labels Nov 3, 2025
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Nov 3, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 3, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-clang-codegen

Author: J. Ryan Stinnett (jryans)

Changes

DISubprograms are attached to call sites to support various debug info features, including entry values and tail calls. Clang 9.0 (0f65168) was the first version to include this kind of call site DISubprogram attachment.

This earlier work appears to visit only some call site variants, however. The call site attachment was added to a higher-level EmitCall path in Clang's code gen that is only used by some call variants. In particular, some C++ member calls use a different code gen path, which did not include this call site attachment step, and thus the debug info it triggers (e.g. call site entries) was not emitted for such calls.

This moves DISubprogram attachment to a lower-level call emission path that is used by all call variants.

Fixes #161962


Full diff: https://github.com/llvm/llvm-project/pull/166202.diff

5 Files Affected:

  • (modified) clang/lib/CodeGen/CGCall.cpp (+15)
  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+6-2)
  • (modified) clang/lib/CodeGen/CGDebugInfo.h (+1-1)
  • (modified) clang/lib/CodeGen/CGExpr.cpp (-9)
  • (added) clang/test/DebugInfo/CXX/decl-member-call.cpp (+25)
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 465f3f4e670c2..2e4ffd42b915c 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -6277,6 +6277,21 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
     pushDestroy(QualType::DK_nontrivial_c_struct, Ret.getAggregateAddress(),
                 RetTy);
 
+  if (CalleeDecl) {
+    // Generate function declaration DISuprogram in order to be used
+    // in debug info about call sites.
+    if (CGDebugInfo *DI = getDebugInfo()) {
+      CodeGenFunction CalleeCGF(CGM);
+      const GlobalDecl &CalleeGlobalDecl =
+          Callee.getAbstractInfo().getCalleeDecl();
+      CalleeCGF.CurGD = CalleeGlobalDecl;
+      FunctionArgList Args;
+      QualType ResTy = CalleeCGF.BuildFunctionArgList(CalleeGlobalDecl, Args);
+      DI->EmitFuncDeclForCallSite(
+          CI, DI->getFunctionType(CalleeDecl, ResTy, Args), CalleeGlobalDecl);
+    }
+  }
+
   return Ret;
 }
 
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 07a2cfb21bef2..67675dae71c34 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -4952,7 +4952,7 @@ void CGDebugInfo::EmitFunctionDecl(GlobalDecl GD, SourceLocation Loc,
 
 void CGDebugInfo::EmitFuncDeclForCallSite(llvm::CallBase *CallOrInvoke,
                                           QualType CalleeType,
-                                          const FunctionDecl *CalleeDecl) {
+                                          GlobalDecl CalleeGlobalDecl) {
   if (!CallOrInvoke)
     return;
   auto *Func = dyn_cast<llvm::Function>(CallOrInvoke->getCalledOperand());
@@ -4961,6 +4961,9 @@ void CGDebugInfo::EmitFuncDeclForCallSite(llvm::CallBase *CallOrInvoke,
   if (Func->getSubprogram())
     return;
 
+  const FunctionDecl *CalleeDecl =
+      cast<FunctionDecl>(CalleeGlobalDecl.getDecl());
+
   // Do not emit a declaration subprogram for a function with nodebug
   // attribute, or if call site info isn't required.
   if (CalleeDecl->hasAttr<NoDebugAttr>() ||
@@ -4971,7 +4974,8 @@ void CGDebugInfo::EmitFuncDeclForCallSite(llvm::CallBase *CallOrInvoke,
   // create the one describing the function in order to have complete
   // call site debug info.
   if (!CalleeDecl->isStatic() && !CalleeDecl->isInlined())
-    EmitFunctionDecl(CalleeDecl, CalleeDecl->getLocation(), CalleeType, Func);
+    EmitFunctionDecl(CalleeGlobalDecl, CalleeDecl->getLocation(), CalleeType,
+                     Func);
 }
 
 void CGDebugInfo::EmitInlineFunctionStart(CGBuilderTy &Builder, GlobalDecl GD) {
diff --git a/clang/lib/CodeGen/CGDebugInfo.h b/clang/lib/CodeGen/CGDebugInfo.h
index 78c3eb9c5792e..1f695941f113a 100644
--- a/clang/lib/CodeGen/CGDebugInfo.h
+++ b/clang/lib/CodeGen/CGDebugInfo.h
@@ -511,7 +511,7 @@ class CGDebugInfo {
   /// This is needed for call site debug info.
   void EmitFuncDeclForCallSite(llvm::CallBase *CallOrInvoke,
                                QualType CalleeType,
-                               const FunctionDecl *CalleeDecl);
+                               GlobalDecl CalleeGlobalDecl);
 
   /// Constructs the debug code for exiting a function.
   void EmitFunctionEnd(CGBuilderTy &Builder, llvm::Function *Fn);
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 01f2161f27555..a837f00732748 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -6632,15 +6632,6 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType,
                          E == MustTailCall, E->getExprLoc());
 
   if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
-    // Generate function declaration DISuprogram in order to be used
-    // in debug info about call sites.
-    if (CGDebugInfo *DI = getDebugInfo()) {
-      FunctionArgList Args;
-      QualType ResTy = BuildFunctionArgList(CalleeDecl, Args);
-      DI->EmitFuncDeclForCallSite(LocalCallOrInvoke,
-                                  DI->getFunctionType(CalleeDecl, ResTy, Args),
-                                  CalleeDecl);
-    }
     if (CalleeDecl->hasAttr<RestrictAttr>() ||
         CalleeDecl->hasAttr<AllocSizeAttr>()) {
       // Function has 'malloc' (aka. 'restrict') or 'alloc_size' attribute.
diff --git a/clang/test/DebugInfo/CXX/decl-member-call.cpp b/clang/test/DebugInfo/CXX/decl-member-call.cpp
new file mode 100644
index 0000000000000..95758a2985c0c
--- /dev/null
+++ b/clang/test/DebugInfo/CXX/decl-member-call.cpp
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -O1 -triple x86_64-unknown_unknown -emit-llvm \
+// RUN:   -debug-info-kind=standalone -dwarf-version=5 %s -o - | FileCheck %s
+
+// Ensure both nonmember and member calls to declared function
+// have attached `DISubprogram`s.
+
+int nonmember(int n);
+
+struct S {
+  int x;
+  int member(int n);
+};
+
+int main(int argc, char** argv) {
+  struct S s = {};
+  int a = s.member(argc);
+  int b = nonmember(argc);
+  return a + b;
+}
+
+// CHECK: declare !dbg ![[SP1:[0-9]+]] noundef i32 @_ZN1S6memberEi(
+// CHECK: declare !dbg ![[SP2:[0-9]+]] noundef i32 @_Z9nonmemberi(
+
+// CHECK: ![[SP1]] = !DISubprogram(name: "member", linkageName: "_ZN1S6memberEi"
+// CHECK: ![[SP2]] = !DISubprogram(name: "nonmember", linkageName: "_Z9nonmemberi"

@jryans
Copy link
Member Author

jryans commented Nov 3, 2025

This change does increase debug info size a bit. Here's a diff of object file section sizes when compiling a recent commit of Clang using -O2 -g on Linux:

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +1.4% +9.97Mi  [ = ]       0    .debug_info
  +2.3% +1.01Mi  [ = ]       0    .debug_addr
  +0.7%  +836Ki  [ = ]       0    .debug_str_offsets
  +0.0%  +129Ki  [ = ]       0    .debug_str
  +0.0% +22.1Ki  [ = ]       0    .debug_loclists
  +0.0% +7.96Ki  [ = ]       0    .debug_rnglists
   +21% +1.44Ki  [ = ]       0    [Unmapped]
  -0.0%     -59  [ = ]       0    .debug_line_str
  -0.0%    -540  [ = ]       0    .debug_line
  -0.0% -1.44Ki  -0.0% -1.44Ki    .rodata
  -0.5% -60.6Ki  [ = ]       0    .debug_abbrev
  +0.7% +11.9Mi  -0.0% -1.44Ki    TOTAL

Since this additional call site info was intended to be there all along (AIUI) and is needed to properly retrieve variable values that use DW_OP_entry_value, I believe we should accept the larger binary size.

@djtodoro
Copy link
Collaborator

djtodoro commented Nov 4, 2025

This change does increase debug info size a bit. Here's a diff of object file section sizes when compiling a recent commit of Clang using -O2 -g on Linux:

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +1.4% +9.97Mi  [ = ]       0    .debug_info
  +2.3% +1.01Mi  [ = ]       0    .debug_addr
  +0.7%  +836Ki  [ = ]       0    .debug_str_offsets
  +0.0%  +129Ki  [ = ]       0    .debug_str
  +0.0% +22.1Ki  [ = ]       0    .debug_loclists
  +0.0% +7.96Ki  [ = ]       0    .debug_rnglists
   +21% +1.44Ki  [ = ]       0    [Unmapped]
  -0.0%     -59  [ = ]       0    .debug_line_str
  -0.0%    -540  [ = ]       0    .debug_line
  -0.0% -1.44Ki  -0.0% -1.44Ki    .rodata
  -0.5% -60.6Ki  [ = ]       0    .debug_abbrev
  +0.7% +11.9Mi  -0.0% -1.44Ki    TOTAL

Since this additional call site info was intended to be there all along (AIUI) and is needed to properly retrieve variable values that use DW_OP_entry_value, I believe we should accept the larger binary size.

This is okay. Thanks.

However, I would mention that AFAIK, a call_site entry can be useful iff:

  1. is marked as a tail call (debuggers use it to show the artificial frames)
  2. has call-site-paramters (debuggers use it along with entry_values)

Otherwise, we can avoid the call_sites in the final DWARF, and it should reduce the final binary size.

@jryans
Copy link
Member Author

jryans commented Nov 4, 2025

I would mention that AFAIK, a call_site entry can be useful iff:

  1. is marked as a tail call (debuggers use it to show the artificial frames)
  2. has call-site-paramters (debuggers use it along with entry_values)

Otherwise, we can avoid the call_sites in the final DWARF, and it should reduce the final binary size.

Thanks for summarising those major use cases for the call site info.

From a quick skim, I believe many of the added call site entries have call site parameters (and thus add clear value assuming any callee variables use DW_OP_entry_value). There are also some added non-tail call site entries without call site parameters, and those indeed may not be used by debuggers today (though perhaps some misc. tools use it to build a call graph).

If we wish to remove call site entries that don't seem to add value, I think it would be best to discuss that separately from this PR.

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, thanks for this, I think it looks right. Just a couple questions (one inline) -

Does -g[no-]omit-unreferenced-methods still work as expected?

Copy link
Member Author

@jryans jryans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! 😄

Does -g[no-]omit-unreferenced-methods still work as expected?

Yes, that appears to be covered by at least the clang/test/DebugInfo/CXX/incomplete-types.cpp test. Switching the omit flag on does the expected thing of limiting emitted function types.

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this SGTM

@github-actions
Copy link

github-actions bot commented Nov 7, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a new round of compile time results after lifting up the call site emission conditions check.

We're now doing less work on average for many O0 debug runs since we're checking whether to emit earlier than the previous version did. The O3 debug run continues to show a slight increase above baseline, but that's expected since there's a bit more call site emission work happening in optimised debug runs.

Please take a look and let me know if this round looks good to you as well.

Excellent, LGTM! Thanks for digging into that.

@jryans jryans merged commit 342e28f into llvm:main Nov 10, 2025
10 checks passed
ckoparkar added a commit to ckoparkar/llvm-project that referenced this pull request Nov 10, 2025
* main: (1028 commits)
  [clang][DebugInfo] Attach `DISubprogram` to additional call variants (llvm#166202)
  [C2y] Claim nonconformance to WG14 N3348 (llvm#166966)
  [X86] 2012-01-10-UndefExceptionEdge.ll - regenerate test checks (llvm#167307)
  Remove unused standard headers: <string>, <optional>, <numeric>, <tuple> (llvm#167232)
  [DebugInfo] Add Verifier check for incorrectly-scoped retainedNodes (llvm#166855)
  [VPlan] Don't apply predication discount to non-originally-predicated blocks (llvm#160449)
  [libc++] Avoid overloaded `operator,` for (`T`, `Iter`) cases (llvm#161049)
  [tools][llc] Make save-stats.ll test target independent (llvm#167238)
  [AArch64] Fallback to PRFUM for PRFM with negative or unaligned offset (llvm#166756)
  [X86] ldexp-avx512.ll - add v8f16/v16f16/v32f16 test coverage for llvm#165694 (llvm#167294)
  [DropAssumes] Drop dereferenceable assumptions after vectorization. (llvm#166947)
  [VPlan] Simplify branch-cond with getVectorTripCount (llvm#155604)
  Remove unused <algorithm> inclusion (llvm#166942)
  [AArch64] Combine subtract with borrow to SBC. (llvm#165271)
  [AArch64][SVE] Avoid redundant extend of unsigned i8/i16 extracts. (llvm#165863)
  [SPIRV] Fix failing assertion in SPIRVAsmPrinter (llvm#166909)
  [libc++] Merge insert/emplace(const_iterator, Args...) implementations (llvm#166470)
  [libc++] Replace __libcpp_is_final with a variable template (llvm#167137)
  [gn build] Port 152bda7
  [libc++] Replace the last uses of __tuple_types with __type_list (llvm#167214)
  ...
CarlosAlbertoEnciso added a commit to CarlosAlbertoEnciso/llvm-project that referenced this pull request Nov 14, 2025
As the extra call site information is available by default, update
other targets that support call site debug information.

Move the 'CallSiteInfo' set to the block introduced by:
  llvm#166202
@asmok-g
Copy link
Contributor

asmok-g commented Nov 19, 2025

Heads-up: This is causing ~1.5% increase in bin size of clang when built internally in Google and ~1.8% increase in dwp file. It does the same for a couple other internal google targets. I'm posting a heads-up for now and we will discuss internally how much of an issue this is before we come back to the thread. Thanks

@djtodoro
Copy link
Collaborator

I guess a patch that handles this #166202 (comment) can improve things here a lot.

@jryans
Copy link
Member Author

jryans commented Nov 19, 2025

This is causing ~1.5% increase in bin size of clang when built internally in Google and ~1.8% increase in dwp file.

Thanks for the update. A bit more of an increase than my own comparison above, but perhaps further optimisations, LTO modes, etc. are revealing more opportunities to insert this kind of debug info.

Since this is effectively fixing a debug correctness issue, we would want to retain the core fix here. If the size change is deemed to be too large, please open a new issue so we can use that to coordinate ways to reduce space, such as @djtodoro's suggestions above.

@OCHyams
Copy link
Contributor

OCHyams commented Nov 20, 2025

However, I would mention that AFAIK, a call_site entry can be useful iff:

  1. is marked as a tail call (debuggers use it to show the artificial frames)
  2. has call-site-paramters (debuggers use it along with entry_values)
    Otherwise, we can avoid the call_sites in the final DWARF, and it should reduce the final binary size.

I don't think this is true for our (Sony) debugger, which wants to see call site info even in cases without call site parameters (it's not only using it for entry_value computation).

@asmok-g said:

Heads-up: This is causing ~1.5% increase in bin size of clang when built internally in Google and ~1.8% increase in dwp file. It does the same for a couple other internal google targets. I'm posting a heads-up for now and we will discuss internally how much of an issue this is before we come back to the thread. Thanks

@jryans said:

[...] this is effectively fixing a debug correctness issue [...]

As @jryans points out this is more of a bug fix as iIIUC t's adding coverage that should have been there from the start.

Does google use call site information (does it care about entry_values, or do any extra stuff with the info)? If not, I wonder if it would be useful to add a flag to disable call site info for your builds (if such a flag doesn't already exist)?

@alexfh
Copy link
Contributor

alexfh commented Nov 20, 2025

Does google use call site information (does it care about entry_values, or do any extra stuff with the info)?

@dwblaikie or @labath should know the answer.

I wonder if it would be useful to add a flag to disable call site info for your builds (if such a flag doesn't already exist)?

I think regardless of the debug information completeness improvement, the 1.5%+ increase may be problematic in some cases. We probably need a flag to mitigate this increase without disabling debug information altogether. If there's already a flag that controls the information being added by this change (or a reasonable superset of it), we may be able to suggest it as a workaround for teams who have to squeeze their binaries into hard size limits.

@jryans
Copy link
Member Author

jryans commented Nov 20, 2025

I have attempted to carry over the points made so far into #168851. Let's try to focus further discussion there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category debuginfo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Call site entries for declared C++ member functions not reported in debug info

6 participants