-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[DebugInfo] Preserve line and column number when merging debug info. #129960
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 4 commits
cb23baa
df04e25
e43ffd1
ffea531
713ae85
4ba8427
7129758
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,6 +21,7 @@ | |
| #include "llvm/IR/IntrinsicInst.h" | ||
| #include "llvm/IR/Type.h" | ||
| #include "llvm/IR/Value.h" | ||
| #include "llvm/Support/CommandLine.h" | ||
|
|
||
| #include <numeric> | ||
| #include <optional> | ||
|
|
@@ -34,6 +35,12 @@ cl::opt<bool> EnableFSDiscriminator( | |
| cl::desc("Enable adding flow sensitive discriminators")); | ||
| } // namespace llvm | ||
|
|
||
| // When true, preserves line and column number by picking one of the merged | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd like to have a more descriptive name for this option, since we are discarding information, we're not preserving half of the merged source location. I came up with some suggestions:
I'm really just trying various synonyms for "pick", "select", "choose".
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, "pick-merged-source-locations" sounds good to me. |
||
| // location info in a deterministic manner to assist sample based PGO. | ||
| static cl::opt<bool> PreserveMergedDebugInfo( | ||
| "preserve-merged-debug-info", cl::init(false), cl::Hidden, | ||
| cl::desc("Preserve line and column number when merging locations.")); | ||
|
|
||
| uint32_t DIType::getAlignInBits() const { | ||
| return (getTag() == dwarf::DW_TAG_LLVM_ptrauth_type ? 0 : SubclassData32); | ||
| } | ||
|
|
@@ -125,6 +132,14 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) { | |
| if (LocA == LocB) | ||
| return LocA; | ||
|
|
||
| if (PreserveMergedDebugInfo) { | ||
snehasish marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| auto A = std::make_tuple(LocA->getLine(), LocA->getColumn(), | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should filenames be compared here? DILocation has However, given the use case, would it be reasonable to always pick Regardless, I think adding directories and files would follow the principle of least surprise.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, I've added
I think the pick policy should be independent of the order of the parameters passed to this function. So
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: Does
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, the only place it would be useful is if |
||
| LocA->getDiscriminator()); | ||
| auto B = std::make_tuple(LocB->getLine(), LocB->getColumn(), | ||
| LocB->getDiscriminator()); | ||
| return A < B ? LocA : LocB; | ||
| } | ||
|
|
||
| LLVMContext &C = LocA->getContext(); | ||
|
|
||
| using LocVec = SmallVector<const DILocation *>; | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
| @@ -0,0 +1,77 @@ | ||||
| ;; This test verifies that we assign a deterministic location for merged | ||||
| ;; instructions when -preserve-merged-debug-info is enabled. We use the | ||||
| ;; simplifycfg pass to test this behaviour since it was a common source of | ||||
| ;; merged instructions, however we intend this to apply to all users of the | ||||
| ;; getMergedLocation API. | ||||
|
|
||||
| ;; Run simplifycfg and check that only 1 call to bar remains and it's debug | ||||
| ;; location has a valid line number (lexicographically smallest). | ||||
| ; RUN: opt %s -passes=simplifycfg -hoist-common-insts -preserve-merged-debug-info -S | FileCheck %s --check-prefix=ENABLED | ||||
| ; ENABLED: call i32 @bar{{.*!dbg !}}[[TAG:[0-9]+]] | ||||
| ; ENABLED-NOT: call i32 @bar | ||||
| ; ENABLED: ![[TAG]] = !DILocation(line: 9, column: 16, scope: !9) | ||||
|
|
||||
| ;; Run simplifycfg without the pass to ensure that we don't spuriously start | ||||
| ;; passing the test if simplifycfg behaviour changes. | ||||
| ; RUN: opt %s -passes=simplifycfg -hoist-common-insts -preserve-merged-debug-info=false -S | FileCheck %s --check-prefix=DISABLED | ||||
| ; DISABLED: call i32 @bar{{.*!dbg !}}[[TAG:[0-9]+]] | ||||
| ; DISABLED-NOT: call i32 @bar | ||||
| ; DISABLED: ![[TAG]] = !DILocation(line: 0, scope: !9) | ||||
|
|
||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A file-level comment recording the intention of this test will assist future developers who cause this test to fail.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You know, we have unit tests for this API with decent fixtures:
Maybe this is more trouble than it's worth, but you could make the cl::opt non-static, declare it in the unittest, modify it directly (
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd prefer to leave this as is since the current test also covers the SimplifyCfg transformations usage of |
||||
| ; ModuleID = '../llvm/test/DebugInfo/Inputs/debug-info-merge-call.c' | ||||
| source_filename = "../llvm/test/DebugInfo/Inputs/debug-info-merge-call.c" | ||||
| target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" | ||||
| target triple = "x86_64-unknown-linux-gnu" | ||||
|
|
||||
| ; Function Attrs: nounwind uwtable | ||||
| define dso_local i32 @test(i32 %n) !dbg !9 { | ||||
| entry: | ||||
| %call = call i32 @foo(i32 %n), !dbg !12 | ||||
| %cmp1 = icmp sgt i32 %n, 100, !dbg !13 | ||||
| br i1 %cmp1, label %if.then, label %if.else, !dbg !13 | ||||
|
|
||||
| if.then: ; preds = %entry | ||||
| %call2 = call i32 @bar(i32 %n), !dbg !14 | ||||
| %add = add nsw i32 %call2, %call, !dbg !15 | ||||
| br label %if.end, !dbg !16 | ||||
|
|
||||
| if.else: ; preds = %entry | ||||
| %call4 = call i32 @bar(i32 %n), !dbg !17 | ||||
| br label %if.end | ||||
|
|
||||
| if.end: ; preds = %if.else, %if.then | ||||
| %r.0 = phi i32 [ %add, %if.then ], [ %call4, %if.else ], !dbg !18 | ||||
| ret i32 %r.0, !dbg !19 | ||||
| } | ||||
|
|
||||
| declare !dbg !20 i32 @foo(i32) | ||||
|
|
||||
| declare !dbg !21 i32 @bar(i32) | ||||
|
|
||||
| !llvm.dbg.cu = !{!0} | ||||
| !llvm.module.flags = !{!2, !3, !4, !5, !6, !7} | ||||
| !llvm.ident = !{!8} | ||||
|
|
||||
| !0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 21.0.0git ([email protected]:snehasish/llvm-project.git 6ce41db6b0275d060d6e60f88b96a1657024345c)", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, splitDebugInlining: false, nameTableKind: None) | ||||
| !1 = !DIFile(filename: "../llvm/test/DebugInfo/Inputs/debug-info-merge-call.c", directory: "/usr/local/google/home/snehasishk/working/llvm-project/build-assert", checksumkind: CSK_MD5, checksum: "ac1be6c40dad11691922d600f9d55c55") | ||||
| !2 = !{i32 7, !"Dwarf Version", i32 5} | ||||
| !3 = !{i32 2, !"Debug Info Version", i32 3} | ||||
| !4 = !{i32 1, !"wchar_size", i32 4} | ||||
| !5 = !{i32 8, !"PIC Level", i32 2} | ||||
| !6 = !{i32 7, !"PIE Level", i32 2} | ||||
| !7 = !{i32 7, !"uwtable", i32 2} | ||||
| !8 = !{!"clang version 21.0.0git ([email protected]:snehasish/llvm-project.git 6ce41db6b0275d060d6e60f88b96a1657024345c)"} | ||||
| !9 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 5, type: !10, scopeLine: 5, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0) | ||||
| !10 = !DISubroutineType(types: !11) | ||||
| !11 = !{} | ||||
| !12 = !DILocation(line: 7, column: 13, scope: !9) | ||||
| !13 = !DILocation(line: 8, column: 8, scope: !9) | ||||
| !14 = !DILocation(line: 9, column: 16, scope: !9) | ||||
| !15 = !DILocation(line: 9, column: 14, scope: !9) | ||||
| !16 = !DILocation(line: 10, column: 3, scope: !9) | ||||
| !17 = !DILocation(line: 11, column: 10, scope: !9) | ||||
| !18 = !DILocation(line: 0, scope: !9) | ||||
| !19 = !DILocation(line: 13, column: 3, scope: !9) | ||||
| !20 = !DISubprogram(name: "foo", scope: !1, file: !1, line: 2, type: !10, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized) | ||||
| !21 = !DISubprogram(name: "bar", scope: !1, file: !1, line: 1, type: !10, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized) | ||||
|
|
||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are branch instruction locations also critical? I've had more conversations with other sample PGO folks, and my mental model is that PGO constructs branch weights from LBR (Last Branch Record) counters. So, the key to getting accurate branch weights is not having deterministic, distinct source locations on every instruction in the basic block, it's tracking deterministic, distinct source locations on every branch instruction. I think it turns out that, in practice, branch instructions are control instructions, so they tend to either folded away, duplicated in order (inlining, unswitching), or left alone. It doesn't really make sense to speculate a branch instruction. The folding we do just needs to be extremely careful about tracking slocs, just like it has to care about updating branch weights.
If that's accurate, I think it would be helpful to document that maintaining deterministic and distinct source locations are what's important for sample PGO. I think this would go a long way to helping motivate why we can safely retain more source locations on out-of-order instructions without negatively impacting sample PGO results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, though the design of sample attribution mitigates this somewhat by inferring block weight from the max of all instructions in the basic block. Calls are more important since they carry additional metadata apart from their own execution count as noted earlier.
Updated text, ptal.