@SergejSalnikov SergejSalnikov commented Oct 13, 2025

Use source location for macro arguments when generating debug info.

Demo.

fixes #160667
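
To make the description concrete, here is a minimal sketch of the kind of code the change is about. The macro and function names are hypothetical and do not come from the PR. The comments only label which location is which; the exact lines a debugger reports depend on the compiler version.

```cpp
#include <cassert>

// Hypothetical macro whose arguments are spelled on their own lines.
#define RUN_BOTH(first, second) \
  do {                          \
    first;                      \
    second;                     \
  } while (0)

static int counter = 0;
static void stepOne() { counter += 1; }
static void stepTwo() { counter += 10; }

int runSteps() {
  counter = 0;
  RUN_BOTH(        // macro usage (expansion) location
      stepOne(),   // spelling location of the first argument
      stepTwo());  // spelling location of the second argument
  return counter;
}
```

If every instruction is attributed to the expansion location, both calls share the `RUN_BOTH(` line in the line table. Going by the description above, the PR aims to attribute each call to the line where its argument is written.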


@SergejSalnikov SergejSalnikov marked this pull request as ready for review October 13, 2025 12:17
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. debuginfo labels Oct 13, 2025
@llvmbot
Member

llvmbot commented Oct 13, 2025

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: SKill (SergejSalnikov)

Changes

Use source location for macro arguments when generating debug info.

fixes #160667


Full diff: https://github.com/llvm/llvm-project/pull/163190.diff

1 Files Affected:

  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+23-6)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 9fe9a13610296..b5216806c3e83 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -110,6 +110,20 @@ static bool IsArtificial(VarDecl const *VD) {
                               cast<Decl>(VD->getDeclContext())->isImplicit());
 }
 
+/// Given a SourceLocation object, return the spelling location referenced by
+/// the ID.
+///
+/// The key difference from SourceManager::getPresumedLoc that a presumed
+/// location of macro arguments is based on the spelling location.
+static PresumedLoc getPresumedLoc(SourceManager &SM, SourceLocation Loc) {
+  // If the location is a macro argument expansion, get the spelling location
+  // instead.
+  if (SM.isMacroArgExpansion(Loc, nullptr)) {
+    Loc = SM.getSpellingLoc(Loc);
+  }
+  return SM.getPresumedLoc(Loc);
+}
+
 CGDebugInfo::CGDebugInfo(CodeGenModule &CGM)
     : CGM(CGM), DebugKind(CGM.getCodeGenOpts().getDebugInfo()),
       DebugTypeExtRefs(CGM.getCodeGenOpts().DebugTypeExtRefs),
@@ -318,7 +332,11 @@ void CGDebugInfo::setLocation(SourceLocation Loc) {
   if (Loc.isInvalid())
     return;
 
-  CurLoc = CGM.getContext().getSourceManager().getExpansionLoc(Loc);
+  SourceManager &SM = CGM.getContext().getSourceManager();
+  if (SM.isMacroArgExpansion(Loc)) {
+    Loc = SM.getSpellingLoc(Loc);
+  }
+  CurLoc = SM.getExpansionLoc(Loc);
 
   // If we've changed files in the middle of a lexical scope go ahead
   // and create a new lexical scope with file node if it's different
@@ -326,7 +344,6 @@ void CGDebugInfo::setLocation(SourceLocation Loc) {
   if (LexicalBlockStack.empty())
     return;
 
-  SourceManager &SM = CGM.getContext().getSourceManager();
   auto *Scope = cast<llvm::DIScope>(LexicalBlockStack.back());
   PresumedLoc PCLoc = SM.getPresumedLoc(CurLoc);
   if (PCLoc.isInvalid() || Scope->getFile() == getOrCreateFile(CurLoc))
@@ -545,7 +562,7 @@ llvm::DIFile *CGDebugInfo::getOrCreateFile(SourceLocation Loc) {
     FileName = TheCU->getFile()->getFilename();
     CSInfo = TheCU->getFile()->getChecksum();
   } else {
-    PresumedLoc PLoc = SM.getPresumedLoc(Loc);
+    PresumedLoc PLoc = getPresumedLoc(SM, Loc);
     FileName = PLoc.getFilename();
 
     if (FileName.empty()) {
@@ -627,7 +644,7 @@ unsigned CGDebugInfo::getLineNumber(SourceLocation Loc) {
   if (Loc.isInvalid())
     return 0;
   SourceManager &SM = CGM.getContext().getSourceManager();
-  return SM.getPresumedLoc(Loc).getLine();
+  return getPresumedLoc(SM, Loc).getLine();
 }
 
 unsigned CGDebugInfo::getColumnNumber(SourceLocation Loc, bool Force) {
@@ -639,7 +656,7 @@ unsigned CGDebugInfo::getColumnNumber(SourceLocation Loc, bool Force) {
   if (Loc.isInvalid() && CurLoc.isInvalid())
     return 0;
   SourceManager &SM = CGM.getContext().getSourceManager();
-  PresumedLoc PLoc = SM.getPresumedLoc(Loc.isValid() ? Loc : CurLoc);
+  PresumedLoc PLoc = getPresumedLoc(SM, Loc.isValid() ? Loc : CurLoc);
   return PLoc.isValid() ? PLoc.getColumn() : 0;
 }
 
@@ -6183,7 +6200,7 @@ void CGDebugInfo::EmitGlobalAlias(const llvm::GlobalValue *GV,
 void CGDebugInfo::AddStringLiteralDebugInfo(llvm::GlobalVariable *GV,
                                             const StringLiteral *S) {
   SourceLocation Loc = S->getStrTokenLoc(0);
-  PresumedLoc PLoc = CGM.getContext().getSourceManager().getPresumedLoc(Loc);
+  PresumedLoc PLoc = getPresumedLoc(CGM.getContext().getSourceManager(), Loc);
   if (!PLoc.isValid())
     return;
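
The logic of the static `getPresumedLoc` helper in the diff above can be sketched with a toy model. Everything below (`Loc`, `Entry`, `ToySourceManager`, `debugLine`) is a hypothetical stand-in, not Clang's API. It only mirrors the shape of the decision: macro-argument expansions resolve to where the argument was written, everything else resolves to where the macro was used.

```cpp
#include <cassert>
#include <vector>

// Toy handle into ToySourceManager's table (hypothetical, not clang::SourceLocation).
struct Loc { int id; };

struct Entry {
  bool isFile;      // true: a plain position in a file
  int line;         // meaningful only when isFile is true
  bool isMacroArg;  // meaningful only when isFile is false
  Loc spelling;     // where the token was written
  Loc expansion;    // where the macro was invoked
};

class ToySourceManager {
  std::vector<Entry> entries;

public:
  Loc addFileLoc(int line) {
    entries.push_back(Entry{true, line, false, Loc{-1}, Loc{-1}});
    return Loc{static_cast<int>(entries.size()) - 1};
  }
  Loc addMacroLoc(Loc spelling, Loc expansion, bool isMacroArg) {
    entries.push_back(Entry{false, 0, isMacroArg, spelling, expansion});
    return Loc{static_cast<int>(entries.size()) - 1};
  }
  bool isMacroArgExpansion(Loc l) const {
    const Entry &e = entries[l.id];
    return !e.isFile && e.isMacroArg;
  }
  // Follow the spelling chain until a file location is reached.
  Loc getSpellingLoc(Loc l) const {
    while (!entries[l.id].isFile) l = entries[l.id].spelling;
    return l;
  }
  // Follow the expansion chain until a file location is reached.
  Loc getExpansionLoc(Loc l) const {
    while (!entries[l.id].isFile) l = entries[l.id].expansion;
    return l;
  }
  int getLine(Loc fileLoc) const { return entries[fileLoc.id].line; }
};

// Mirrors the helper's decision: arguments use their spelling location,
// all other locations fall back to the expansion location.
int debugLine(const ToySourceManager &sm, Loc loc) {
  if (sm.isMacroArgExpansion(loc)) loc = sm.getSpellingLoc(loc);
  return sm.getLine(sm.getExpansionLoc(loc));
}
```

In this model a token from a macro body still maps to the line of the macro usage, while a token passed as an argument maps to the line where it was typed.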
 

github-actions bot commented Oct 13, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions cpp,h,c -- clang/test/DebugInfo/Generic/macro-info.c clang/include/clang/Basic/SourceManager.h clang/lib/Basic/SourceManager.cpp clang/lib/CodeGen/CGDebugInfo.cpp clang/lib/CodeGen/CGDebugInfo.h --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/clang/include/clang/Basic/SourceManager.h b/clang/include/clang/Basic/SourceManager.h
index 5e8ca172a..48ddfd9d7 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -1261,7 +1261,8 @@ public:
   SourceLocation getRefinedSpellingLoc(SourceLocation Loc) const {
     // Handle the non-mapped case inline, defer to out of line code to handle
     // expansions.
-    if (Loc.isFileID()) return Loc;
+    if (Loc.isFileID())
+      return Loc;
     return getRefinedSpellingLocSlowCase(Loc);
   }
 
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 835a9538f..913fb7da3 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -5469,8 +5469,7 @@ void CGDebugInfo::EmitDeclareOfBlockDeclRefVariable(
       Ty = CreateSelfType(VD->getType(), Ty);
 
   // Get location information.
-  const unsigned Line =
-      getLineNumber(Loc.isValid() ? Loc : CurLoc);
+  const unsigned Line = getLineNumber(Loc.isValid() ? Loc : CurLoc);
   unsigned Column = getColumnNumber(Loc);
 
   const llvm::DataLayout &target = CGM.getDataLayout();
@@ -5579,7 +5578,8 @@ void CGDebugInfo::EmitDeclareOfBlockLiteralArgVariable(const CGBlockInfo &block,
   const BlockDecl *blockDecl = block.getBlockDecl();
 
   // Collect some general information about the block's location.
-  SourceLocation loc = getRefinedSpellingLocation(blockDecl->getCaretLocation());
+  SourceLocation loc =
+      getRefinedSpellingLocation(blockDecl->getCaretLocation());
   llvm::DIFile *tunit = getOrCreateFile(loc);
   unsigned line = getLineNumber(loc);
   unsigned column = getColumnNumber(loc);
@@ -6111,8 +6111,8 @@ void CGDebugInfo::EmitGlobalVariable(const ValueDecl *VD, const APValue &Init) {
     }
 
   GV.reset(DBuilder.createGlobalVariableExpression(
-      DContext, Name, StringRef(), Unit, getLineNumber(Loc), Ty,
-      true, true, InitExpr, getOrCreateStaticDataMemberDeclarationOrNull(VarD),
+      DContext, Name, StringRef(), Unit, getLineNumber(Loc), Ty, true, true,
+      InitExpr, getOrCreateStaticDataMemberDeclarationOrNull(VarD),
       TemplateParameters, Align));
 }
 
@@ -6131,8 +6131,8 @@ void CGDebugInfo::EmitExternalVariable(llvm::GlobalVariable *Var,
   llvm::DIScope *DContext = getDeclContextDescriptor(D);
   llvm::DIGlobalVariableExpression *GVE =
       DBuilder.createGlobalVariableExpression(
-          DContext, Name, StringRef(), Unit, getLineNumber(Loc),
-          Ty, false, false, nullptr, nullptr, nullptr, Align);
+          DContext, Name, StringRef(), Unit, getLineNumber(Loc), Ty, false,
+          false, nullptr, nullptr, nullptr, Align);
   Var->addDebugInfo(GVE);
 }
 
@@ -6228,8 +6228,8 @@ void CGDebugInfo::AddStringLiteralDebugInfo(llvm::GlobalVariable *GV,
   llvm::DIFile *File = getOrCreateFile(Loc);
   llvm::DIGlobalVariableExpression *Debug =
       DBuilder.createGlobalVariableExpression(
-          nullptr, StringRef(), StringRef(), File,
-          getLineNumber(Loc), getOrCreateType(S->getType(), File), true);
+          nullptr, StringRef(), StringRef(), File, getLineNumber(Loc),
+          getOrCreateType(S->getType(), File), true);
   GV->addDebugInfo(Debug);
 }
 

@SergejSalnikov SergejSalnikov force-pushed the main branch 2 times, most recently from de48195 to f0734d4 Compare October 13, 2025 16:30
Collaborator

@dwblaikie dwblaikie left a comment

Some test coverage will be needed - take a look in clang/test/DebugInfo/... and see about adding a test that demonstrates the change in locations provided by this patch?

@SergejSalnikov SergejSalnikov force-pushed the main branch 2 times, most recently from 8e0373b to 6ef1441 Compare October 13, 2025 16:49
return ParentLoc;
}
return SM.getSpellingLoc(Loc);
}
Author

Wouldn't mind hearing from @AaronBallman whether this is the right/efficient way to make this determination.

Sorry, I'm new to GitHub: I force-pushed the changelist, which made the previous comment obsolete.

This implementation is garbage. I'll update the code tomorrow with a better version.

Author

See SourceManager.getRefinedSpellingLocSlowCase.

@Michael137 Michael137 self-requested a review October 14, 2025 10:00
The method preserves source locations for macro arguments.
@llvmbot llvmbot added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Oct 14, 2025
@SergejSalnikov
Author

I found that the same location expansion logic was sometimes executed multiple times for the same location, so I've optimized the code to reduce the number of location adjustments. I believe the new code should be even faster than the old one.

Also I've added the test.

@Michael137
Member

Ran CI on this PR. I expect some tests to fail. Fixing those up would help show what the new expected behaviour is for a wider range of macros. Also, providing a couple of examples (before/after) in the PR description would be useful. Particularly for debug locations within macro bodies. Don't think you added a test for that?

@Michael137
Member

CC @AaronBallman for the SourceManager changes

@SergejSalnikov
Author

SergejSalnikov commented Oct 15, 2025

@Michael137, I've run the tests and they all pass. There were some code formatting issues, which I've fixed.
The newly added test can be found here.
The demo can be found here.

Collaborator

@dwblaikie dwblaikie left a comment

Looks about right to me.

Are there any uses of "unrefined" locations in CGDebugInfo with this patch? If there are, what's the distinction/how was it chosen which would be refined, and which would not?
The slow/fast path through getRefinedSpellingLoc probably isn't worth it - probably make the whole thing out of line?

Some questions for other reviewers, etc:

  1. getRefinedSpellingLoc - I'm not sure "refined" carries enough information (is it a reference to some other existing use of the term?) - but I don't have any great suggestions for a name. Perhaps "immediate" or "local" spelling location?
  2. it's probably not practical to test all the modified code paths - any thoughts on what the right testing tradeoff is here? That various codepaths /inside/ getRefinedSpellingLoc are tested, from perhaps a variety of call sites/ways that manifests in the resulting IR metadata without being exhaustive, seems OK to me? (so perhaps in the test case at least an instruction location and a type location could be tested?)

Comment on lines +18 to +19
// CHECK: DIGlobalVariable(name: "global3",{{.*}} line: [[@LINE+6]]
// CHECK: DIGlobalVariable(name: "global2",{{.*}} line: [[@LINE+2]]
Collaborator

Could these comments (including those elsewhere in the test) be moved closer (within the macro argument) to the relevant line - ideally immediately prior (so it's @LINE+1, or even in a trailing comment on the same line using @LINE)?

Author

@SergejSalnikov SergejSalnikov Oct 15, 2025

The order in which the variables are defined matters. As you can see, global3 is emitted before global2 even though it appears on a later line.

I think that placing the CHECKs right before the macro is good for this test.
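
For comparison, here is a sketch of the layout the reviewer suggests, with each check placed immediately before the line it refers to. This is a hypothetical test, not the one added in the PR, and it has not been verified against the patch. The macro and variable names are invented. `CHECK-DAG` is used on the assumption that it sidesteps the emission-order concern raised above, since it does not require matches to appear in source order.

```cpp
// RUN: %clang_cc1 -debug-info-kind=limited -emit-llvm %s -o - | FileCheck %s
#include <cassert>

// Hypothetical macro that pastes two declarations given as arguments.
#define DEFINE_BOTH(a, b) a; b

DEFINE_BOTH(
    // CHECK-DAG: DIGlobalVariable(name: "macroArgFirst",{{.*}} line: [[@LINE+1]]
    int macroArgFirst = 1,
    // CHECK-DAG: DIGlobalVariable(name: "macroArgSecond",{{.*}} line: [[@LINE+1]]
    int macroArgSecond = 2);
```

The comments inside the macro invocation are removed before the arguments are collected, so they do not change what the macro expands to. The trade-off is that `CHECK-DAG` no longer verifies the order in which the variables are emitted, which the existing layout does.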

@SergejSalnikov
Author

SergejSalnikov commented Oct 15, 2025

Are there any uses of "unrefined" locations in CGDebugInfo with this patch? If there are, what's the distinction/how was it chosen which would be refined, and which would not? The slow/fast path through getRefinedSpellingLoc probably isn't worth it - probably make the whole thing out of line?

I'm converting all locations that come in through public entry points into refined locations. The only locations left unconverted are those in private methods, where the caller is guaranteed to have already passed a refined location.

Keep in mind that the majority of tokens are not related to macros, so having a fast path seems reasonable.
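
The two points in this reply, refining once at the public boundary and keeping an inline fast path for non-macro locations, can be sketched as follows. All types and names here (`RawLoc`, `RefinedLoc`, `ToyDebugInfo`) are hypothetical. The use of a separate `RefinedLoc` type to encode the invariant is an illustration only; nothing in this thread indicates that the patch uses distinct types.

```cpp
#include <cassert>

// Hypothetical raw location as it arrives at a public entry point.
struct RawLoc {
  bool inMacro;       // false for the common case: a plain file location
  bool isMacroArg;    // meaningful only when inMacro is true
  int spellingLine;   // where the token was written
  int expansionLine;  // where the macro was invoked
};

// Hypothetical wrapper: holding one means refinement has already happened.
struct RefinedLoc { int line; };

class ToyDebugInfo {
public:
  int slowPathCalls = 0;  // counts how often the macro-handling path runs

  // Public entry point: refine exactly once, then pass the result down.
  int emitVariable(RawLoc loc) {
    RefinedLoc refined = refine(loc);
    return lineFor(refined);
  }

private:
  // Fast path: plain file locations are returned without further work.
  RefinedLoc refine(RawLoc loc) {
    if (!loc.inMacro) return RefinedLoc{loc.spellingLine};
    return refineSlowCase(loc);
  }
  // Slow path: only locations that involve a macro get here.
  RefinedLoc refineSlowCase(RawLoc loc) {
    ++slowPathCalls;
    return RefinedLoc{loc.isMacroArg ? loc.spellingLine : loc.expansionLine};
  }
  // Private helper: trusts its caller and never refines again.
  int lineFor(RefinedLoc refined) const { return refined.line; }
};
```

Because private helpers accept only `RefinedLoc` in this sketch, a location cannot be adjusted twice by accident, which is one way to read the "reduce the number of location adjustments" goal mentioned earlier in the thread.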


Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category debuginfo

Development

Successfully merging this pull request may close these issues.

Use locations inside macros (rather than the macro usage location) for instruction source locations in debug info

4 participants