[AMDGPU] Fixed llvm-debuginfo-analyzer for AMDGPU. #145125
base: main
…te DWARF correctly in AMDGPU
4dfaf21
to
c6bacae
Thank you for the patch!
I have some small nits, and one larger concern around changing AMDPAL relocation behavior. If I understand the intent of the change, I think that part should be removed, and support for the REL relocations should instead be added in LogicalView or whatever library it depends on.
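For readers unfamiliar with the REL relocations mentioned in this comment: ELF defines two relocation record formats. A RELA record carries an explicit addend, while a REL record has no addend field and the addend is taken from the bytes being relocated. The sketch below only illustrates that general ELF concept for an absolute 32-bit relocation; it does not show what the patch or LogicalView does. `Reloc` and `applyAbs32` are hypothetical names, not LLVM APIs, and host byte order is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified relocation record (not an LLVM or ELF header type).
struct Reloc {
  std::size_t Offset;   // Byte offset of the 32-bit field to patch.
  std::uint32_t Symbol; // Already-resolved symbol value (S).
  std::int32_t Addend;  // Explicit addend (A); only meaningful when HasAddend.
  bool HasAddend;       // true models RELA, false models REL.
};

// Applies an absolute 32-bit relocation, Value = S + A, in host byte order.
inline void applyAbs32(std::vector<std::uint8_t> &Section, const Reloc &R) {
  assert(R.Offset + sizeof(std::uint32_t) <= Section.size() && "out of range");

  // REL: the addend is whatever is currently stored at the patched location.
  std::uint32_t InPlace = 0;
  std::memcpy(&InPlace, Section.data() + R.Offset, sizeof(InPlace));

  const std::uint32_t A =
      R.HasAddend ? static_cast<std::uint32_t>(R.Addend) : InPlace;
  const std::uint32_t Value = R.Symbol + A;
  std::memcpy(Section.data() + R.Offset, &Value, sizeof(Value));
}
```

A consumer that assumes RELA and ignores the in-place bytes would compute a different value for REL input, which is why the reading library has to know which format it is given.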
llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
@llvm/pr-subscribers-debuginfo
Author: Adam Yang (adam-yang)
Changes
Constructing Target triple with
To run a full test, also fixed a failure with the
Patch is 24.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/145125.diff
7 Files Affected:
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
index 1847fa8323480..2cf4a8ec6a37f 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
@@ -159,7 +159,8 @@ class LVBinaryReader : public LVReader {
LVAddress WasmCodeSectionOffset = 0;
// Loads all info for the architecture of the provided object file.
- Error loadGenericTargetInfo(StringRef TheTriple, StringRef TheFeatures);
+ Error loadGenericTargetInfo(StringRef TheTriple, StringRef TheFeatures,
+ StringRef TheCPU);
virtual void mapRangeAddress(const object::ObjectFile &Obj) {}
virtual void mapRangeAddress(const object::ObjectFile &Obj,
diff --git a/llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp b/llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
index 80b4185b7c600..0df9137a3bd37 100644
--- a/llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
+++ b/llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp
@@ -275,7 +275,8 @@ void LVBinaryReader::mapVirtualAddress(const object::COFFObjectFile &COFFObj) {
}
Error LVBinaryReader::loadGenericTargetInfo(StringRef TheTriple,
- StringRef TheFeatures) {
+ StringRef TheFeatures,
+ StringRef TheCPU) {
std::string TargetLookupError;
const Target *TheTarget =
TargetRegistry::lookupTarget(TheTriple, TargetLookupError);
@@ -298,9 +299,8 @@ Error LVBinaryReader::loadGenericTargetInfo(StringRef TheTriple,
MAI.reset(AsmInfo);
// Target subtargets.
- StringRef CPU;
MCSubtargetInfo *SubtargetInfo(
- TheTarget->createMCSubtargetInfo(TheTriple, CPU, TheFeatures));
+ TheTarget->createMCSubtargetInfo(TheTriple, TheCPU, TheFeatures));
if (!SubtargetInfo)
return createStringError(errc::invalid_argument,
"no subtarget info for target " + TheTriple);
diff --git a/llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp b/llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp
index e5895516b5e77..2ff70816b4bf1 100644
--- a/llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp
+++ b/llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp
@@ -1190,7 +1190,12 @@ Error LVCodeViewReader::loadTargetInfo(const ObjectFile &Obj) {
FeaturesValue = SubtargetFeatures();
}
FeaturesValue = *Features;
- return loadGenericTargetInfo(TT.str(), FeaturesValue.getString());
+
+ StringRef CPU;
+ if (auto OptCPU = Obj.tryGetCPUName())
+ CPU = *OptCPU;
+
+ return loadGenericTargetInfo(TT.str(), FeaturesValue.getString(), CPU);
}
Error LVCodeViewReader::loadTargetInfo(const PDBFile &Pdb) {
@@ -1200,8 +1205,9 @@ Error LVCodeViewReader::loadTargetInfo(const PDBFile &Pdb) {
TT.setOS(Triple::Win32);
StringRef TheFeature = "";
+ StringRef TheCPU = "";
- return loadGenericTargetInfo(TT.str(), TheFeature);
+ return loadGenericTargetInfo(TT.str(), TheFeature, TheCPU);
}
std::string LVCodeViewReader::getRegisterName(LVSmall Opcode,
diff --git a/llvm/lib/DebugInfo/LogicalView/Readers/LVDWARFReader.cpp b/llvm/lib/DebugInfo/LogicalView/Readers/LVDWARFReader.cpp
index 696e2bc948a2e..62134dfdadf46 100644
--- a/llvm/lib/DebugInfo/LogicalView/Readers/LVDWARFReader.cpp
+++ b/llvm/lib/DebugInfo/LogicalView/Readers/LVDWARFReader.cpp
@@ -956,10 +956,7 @@ LVElement *LVDWARFReader::getElementForOffset(LVOffset Offset,
Error LVDWARFReader::loadTargetInfo(const ObjectFile &Obj) {
// Detect the architecture from the object file. We usually don't need OS
// info to lookup a target and create register info.
- Triple TT;
- TT.setArch(Triple::ArchType(Obj.getArch()));
- TT.setVendor(Triple::UnknownVendor);
- TT.setOS(Triple::UnknownOS);
+ Triple TT = Obj.makeTriple();
// Features to be passed to target/subtarget
Expected<SubtargetFeatures> Features = Obj.getFeatures();
@@ -969,7 +966,12 @@ Error LVDWARFReader::loadTargetInfo(const ObjectFile &Obj) {
FeaturesValue = SubtargetFeatures();
}
FeaturesValue = *Features;
- return loadGenericTargetInfo(TT.str(), FeaturesValue.getString());
+
+ StringRef CPU;
+ if (auto OptCPU = Obj.tryGetCPUName())
+ CPU = *OptCPU;
+
+ return loadGenericTargetInfo(TT.str(), FeaturesValue.getString(), CPU);
}
void LVDWARFReader::mapRangeAddress(const ObjectFile &Obj) {
diff --git a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
index 205a45a045a42..f807c567efa2f 100644
--- a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
+++ b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
@@ -130,6 +130,11 @@ void SIPreAllocateWWMRegs::rewriteRegs(MachineFunction &MF) {
if (VirtReg.isPhysical())
continue;
+ if (!VirtReg.isValid()) {
+ assert(MI.isDebugInstr() && "non-debug use of noreg");
+ continue;
+ }
+
if (!VRM->hasPhys(VirtReg))
continue;
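The guard added in this hunk separates three cases for an operand's register: physical (skipped), invalid, i.e. `$noreg` (skipped, and expected only on debug instructions), and virtual (eligible for rewriting once a physical assignment exists). A rough standalone model of that classification might look like the following. `RegKind`, `Operand`, `Action`, and `classify` are hypothetical stand-ins and not LLVM's `Register`, `MachineOperand`, or `VirtRegMap`.

```cpp
#include <cassert>

// Hypothetical register categories; LLVM encodes these inside Register.
enum class RegKind { None, Physical, Virtual };
enum class Action { Skip, Rewrite };

// Hypothetical operand: its register kind and whether the (modeled)
// virtual register map already holds a physical assignment for it.
struct Operand {
  RegKind Kind;
  bool HasPhysAssignment;
};

// Mirrors the order of checks in the hunk above.
inline Action classify(const Operand &Op, bool IsDebugInstr) {
  (void)IsDebugInstr; // Only read by the assert below.

  if (Op.Kind == RegKind::Physical)
    return Action::Skip;

  // New guard: $noreg must not reach the virtual-register lookup.
  if (Op.Kind == RegKind::None) {
    assert(IsDebugInstr && "non-debug use of noreg");
    return Action::Skip;
  }

  if (!Op.HasPhysAssignment)
    return Action::Skip;

  return Action::Rewrite;
}
```

Placing the `None` check before the assignment lookup is the point of the change: the lookup is only ever asked about genuine virtual registers.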
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-llvm-debuginfo-analyzer.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-llvm-debuginfo-analyzer.ll
new file mode 100644
index 0000000000000..2cff21c66172d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-llvm-debuginfo-analyzer.ll
@@ -0,0 +1,101 @@
+; RUN: llc %s -o %t.o -mcpu=gfx1030 -filetype=obj -O0
+; RUN: llvm-debuginfo-analyzer %t.o --print=all --attribute=all | FileCheck %s
+
+; This test compiles this module with AMDGPU backend under -O0,
+; and makes sure llvm-debuginfo-analyzer works for it.
+
+; Simple checks to make sure llvm-debuginfo-analyzer didn't fail early.
+; CHECK: Logical View:
+; CHECK: {CompileUnit}
+; CHECK-DAG: {Parameter} 'dtid' -> [0x{{[a-f0-9]+}}]'uint3'
+; CHECK-DAG: {Variable} 'my_var2' -> [0x{{[a-f0-9]+}}]'float'
+; CHECK-DAG: {Line} {{.+}}basic_var.hlsl
+; CHECK: {Code} 's_endpgm'
+
+source_filename = "module"
+target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-p10:32:32-p11:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9-p32:32:32-v8:8-v16:16-v32:32-v48:32-v64:32-v80:32-v96:32-v112:32-v128:32-v144:32-v160:32-v176:32-v192:32-v208:32-v224:32-v240:32-v256:32-i1:32-i8:8-i16:16-i32:32-i64:32-f16:16-f32:32-f64:32"
+target triple = "amdgcn-amd-amdpal"
+
+%dx.types.ResRet.f32 = type { float, float, float, float, i32 }
+
+define dllexport amdgpu_cs void @_amdgpu_cs_main(i32 inreg noundef %globalTable, i32 inreg noundef %userdata4, <3 x i32> inreg noundef %WorkgroupId, i32 inreg noundef %MultiDispatchInfo, <3 x i32> noundef %LocalInvocationId) #0 !dbg !14 {
+ %LocalInvocationId.i0 = extractelement <3 x i32> %LocalInvocationId, i64 0, !dbg !28
+ %WorkgroupId.i0 = extractelement <3 x i32> %WorkgroupId, i64 0, !dbg !28
+ %1 = call i64 @llvm.amdgcn.s.getpc(), !dbg !28
+ %2 = shl i32 %WorkgroupId.i0, 6, !dbg !28
+ %3 = add i32 %LocalInvocationId.i0, %2, !dbg !28
+ #dbg_value(i32 %3, !29, !DIExpression(DW_OP_LLVM_fragment, 0, 32), !28)
+ %4 = and i64 %1, -4294967296, !dbg !30
+ %5 = zext i32 %userdata4 to i64, !dbg !30
+ %6 = or disjoint i64 %4, %5, !dbg !30
+ %7 = inttoptr i64 %6 to ptr addrspace(4), !dbg !30
+ call void @llvm.assume(i1 true) [ "align"(ptr addrspace(4) %7, i32 4), "dereferenceable"(ptr addrspace(4) %7, i32 -1) ], !dbg !30
+ %8 = load <4 x i32>, ptr addrspace(4) %7, align 4, !dbg !30, !invariant.load !2
+ %9 = call float @llvm.amdgcn.struct.buffer.load.format.f32(<4 x i32> %8, i32 %3, i32 0, i32 0, i32 0), !dbg !30
+ #dbg_value(%dx.types.ResRet.f32 poison, !31, !DIExpression(), !32)
+ %10 = fmul reassoc arcp contract afn float %9, 2.000000e+00, !dbg !33
+ #dbg_value(float %10, !34, !DIExpression(), !35)
+ call void @llvm.assume(i1 true) [ "align"(ptr addrspace(4) %7, i32 4), "dereferenceable"(ptr addrspace(4) %7, i32 -1) ], !dbg !36
+ %11 = getelementptr i8, ptr addrspace(4) %7, i64 32, !dbg !36
+ %.upto01 = insertelement <4 x float> poison, float %10, i64 0, !dbg !36
+ %12 = shufflevector <4 x float> %.upto01, <4 x float> poison, <4 x i32> zeroinitializer, !dbg !36
+ %13 = load <4 x i32>, ptr addrspace(4) %11, align 4, !dbg !36, !invariant.load !2
+ call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %12, <4 x i32> %13, i32 %3, i32 0, i32 0, i32 0), !dbg !36
+ ret void, !dbg !37
+}
+
+declare noundef i64 @llvm.amdgcn.s.getpc() #1
+
+declare void @llvm.assume(i1 noundef) #2
+
+declare void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float>, <4 x i32>, i32, i32, i32, i32 immarg) #3
+
+declare float @llvm.amdgcn.struct.buffer.load.format.f32(<4 x i32>, i32, i32, i32, i32 immarg) #4
+
+attributes #0 = { memory(readwrite) "amdgpu-flat-work-group-size"="64,64" "amdgpu-memory-bound"="false" "amdgpu-num-sgpr"="4294967295" "amdgpu-num-vgpr"="4294967295" "amdgpu-prealloc-sgpr-spill-vgprs" "amdgpu-unroll-threshold"="1200" "amdgpu-wave-limiter"="false" "amdgpu-work-group-info-arg-no"="3" "denormal-fp-math"="ieee" "denormal-fp-math-f32"="preserve-sign" "target-features"=",+wavefrontsize64,+cumode,+enable-flat-scratch" }
+attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
+attributes #2 = { nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: write) }
+attributes #3 = { nocallback nofree nosync nounwind willreturn memory(write) }
+attributes #4 = { nocallback nofree nosync nounwind willreturn memory(read) }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!12, !13}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "dxcoob 1.7.2308.16 (52da17e29)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, globals: !3)
+!1 = !DIFile(filename: "tests\\basic_var.hlsl", directory: "")
+!2 = !{}
+!3 = !{!4, !10}
+!4 = distinct !DIGlobalVariableExpression(var: !5, expr: !DIExpression())
+!5 = !DIGlobalVariable(name: "u0", linkageName: "\01?u0@@3V?$RWBuffer@M@@A", scope: !0, file: !1, line: 2, type: !6, isLocal: false, isDefinition: true)
+!6 = !DICompositeType(tag: DW_TAG_class_type, name: "RWBuffer<float>", file: !1, line: 2, size: 32, align: 32, elements: !2, templateParams: !7)
+!7 = !{!8}
+!8 = !DITemplateTypeParameter(name: "element", type: !9)
+!9 = !DIBasicType(name: "float", size: 32, align: 32, encoding: DW_ATE_float)
+!10 = distinct !DIGlobalVariableExpression(var: !11, expr: !DIExpression())
+!11 = !DIGlobalVariable(name: "u1", linkageName: "\01?u1@@3V?$RWBuffer@M@@A", scope: !0, file: !1, line: 3, type: !6, isLocal: false, isDefinition: true)
+!12 = !{i32 2, !"Dwarf Version", i32 5}
+!13 = !{i32 2, !"Debug Info Version", i32 3}
+!14 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 7, type: !15, scopeLine: 7, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0)
+!15 = !DISubroutineType(types: !16)
+!16 = !{null, !17}
+!17 = !DIDerivedType(tag: DW_TAG_typedef, name: "uint3", file: !1, baseType: !18)
+!18 = !DICompositeType(tag: DW_TAG_class_type, name: "vector<unsigned int, 3>", file: !1, size: 96, align: 32, elements: !19, templateParams: !24)
+!19 = !{!20, !22, !23}
+!20 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !18, file: !1, baseType: !21, size: 32, align: 32, flags: DIFlagPublic)
+!21 = !DIBasicType(name: "unsigned int", size: 32, align: 32, encoding: DW_ATE_unsigned)
+!22 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !18, file: !1, baseType: !21, size: 32, align: 32, offset: 32, flags: DIFlagPublic)
+!23 = !DIDerivedType(tag: DW_TAG_member, name: "z", scope: !18, file: !1, baseType: !21, size: 32, align: 32, offset: 64, flags: DIFlagPublic)
+!24 = !{!25, !26}
+!25 = !DITemplateTypeParameter(name: "element", type: !21)
+!26 = !DITemplateValueParameter(name: "element_count", type: !27, value: i32 3)
+!27 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
+!28 = !DILocation(line: 7, column: 17, scope: !14)
+!29 = !DILocalVariable(name: "dtid", arg: 1, scope: !14, file: !1, line: 7, type: !17)
+!30 = !DILocation(line: 11, column: 18, scope: !14)
+!31 = !DILocalVariable(name: "my_var", scope: !14, file: !1, line: 11, type: !9)
+!32 = !DILocation(line: 11, column: 9, scope: !14)
+!33 = !DILocation(line: 14, column: 26, scope: !14)
+!34 = !DILocalVariable(name: "my_var2", scope: !14, file: !1, line: 14, type: !9)
+!35 = !DILocation(line: 14, column: 9, scope: !14)
+!36 = !DILocation(line: 17, column: 14, scope: !14)
+!37 = !DILocation(line: 19, column: 1, scope: !14)
diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwwmregs-dbg-noreg.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwwmregs-dbg-noreg.mir
new file mode 100644
index 0000000000000..4b5fea863289b
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwwmregs-dbg-noreg.mir
@@ -0,0 +1,210 @@
+# RUN: llc %s -o - -mcpu=gfx1030 -O0 -run-pass=si-pre-allocate-wwm-regs | FileCheck %s
+
+# Simple regression test to make sure DBG_VALUE $noreg does not assert in the pass
+
+# CHECK: S_ENDPGM
+
+--- |
+ source_filename = "module"
+ target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128:128:48-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
+ target triple = "amdgcn-amd-amdpal"
+
+ %dx.types.ResRet.f32 = type { float, float, float, float, i32 }
+
+ define dllexport amdgpu_cs void @_amdgpu_cs_main(i32 inreg noundef %globalTable, i32 inreg noundef %userdata4, <3 x i32> inreg noundef %WorkgroupId, i32 inreg noundef %MultiDispatchInfo, <3 x i32> noundef %LocalInvocationId) #0 !dbg !14 {
+ %LocalInvocationId.i0 = extractelement <3 x i32> %LocalInvocationId, i64 0, !dbg !28
+ %WorkgroupId.i0 = extractelement <3 x i32> %WorkgroupId, i64 0, !dbg !28
+ %1 = call i64 @llvm.amdgcn.s.getpc(), !dbg !28
+ %2 = shl i32 %WorkgroupId.i0, 6, !dbg !28
+ %3 = add i32 %LocalInvocationId.i0, %2, !dbg !28
+ #dbg_value(i32 %3, !29, !DIExpression(DW_OP_LLVM_fragment, 0, 32), !28)
+ %4 = and i64 %1, -4294967296, !dbg !30
+ %5 = zext i32 %userdata4 to i64, !dbg !30
+ %6 = or disjoint i64 %4, %5, !dbg !30
+ %7 = inttoptr i64 %6 to ptr addrspace(4), !dbg !30, !amdgpu.uniform !2
+ %8 = load <4 x i32>, ptr addrspace(4) %7, align 4, !dbg !30, !invariant.load !2
+ %9 = call float @llvm.amdgcn.struct.buffer.load.format.f32(<4 x i32> %8, i32 %3, i32 0, i32 0, i32 0), !dbg !30
+ #dbg_value(%dx.types.ResRet.f32 poison, !31, !DIExpression(), !32)
+ %10 = fmul reassoc arcp contract afn float %9, 2.000000e+00, !dbg !33
+ #dbg_value(float %10, !34, !DIExpression(), !35)
+ %11 = getelementptr i8, ptr addrspace(4) %7, i64 32, !dbg !36, !amdgpu.uniform !2
+ %.upto01 = insertelement <4 x float> poison, float %10, i64 0, !dbg !36
+ %12 = shufflevector <4 x float> %.upto01, <4 x float> poison, <4 x i32> zeroinitializer, !dbg !36
+ %13 = load <4 x i32>, ptr addrspace(4) %11, align 4, !dbg !36, !invariant.load !2
+ call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %12, <4 x i32> %13, i32 %3, i32 0, i32 0, i32 0), !dbg !36
+ ret void, !dbg !37
+ }
+
+ declare noundef i64 @llvm.amdgcn.s.getpc() #1
+ declare void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float>, <4 x i32>, i32, i32, i32, i32 immarg) #3
+ declare float @llvm.amdgcn.struct.buffer.load.format.f32(<4 x i32>, i32, i32, i32, i32 immarg) #4
+
+ attributes #0 = { memory(readwrite) "amdgpu-flat-work-group-size"="64,64" "amdgpu-memory-bound"="false" "amdgpu-num-sgpr"="4294967295" "amdgpu-num-vgpr"="4294967295" "amdgpu-prealloc-sgpr-spill-vgprs" "amdgpu-unroll-threshold"="1200" "amdgpu-wave-limiter"="false" "amdgpu-work-group-info-arg-no"="3" "denormal-fp-math"="ieee" "denormal-fp-math-f32"="preserve-sign" "target-cpu"="gfx1030" "target-features"=",+wavefrontsize64,+cumode,+enable-flat-scratch" }
+ attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) "target-cpu"="gfx1030" }
+ attributes #2 = { nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: write) "target-cpu"="gfx1030" }
+ attributes #3 = { nocallback nofree nosync nounwind willreturn memory(write) "target-cpu"="gfx1030" }
+ attributes #4 = { nocallback nofree nosync nounwind willreturn memory(read) "target-cpu"="gfx1030" }
+
+ !llvm.dbg.cu = !{!0}
+ !llvm.module.flags = !{!12, !13}
+
+ !0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "dxcoob 1.7.2308.16 (52da17e29)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, globals: !3)
+ !1 = !DIFile(filename: "tests\\basic_var.hlsl", directory: "")
+ !2 = !{}
+ !3 = !{!4, !10}
+ !4 = distinct !DIGlobalVariableExpression(var: !5, expr: !DIExpression())
+ !5 = !DIGlobalVariable(name: "u0", linkageName: "\01?u0@@3V?$RWBuffer@M@@A", scope: !0, file: !1, line: 2, type: !6, isLocal: false, isDefinition: true)
+ !6 = !DICompositeType(tag: DW_TAG_class_type, name: "RWBuffer<float>", file: !1, line: 2, size: 32, align: 32, elements: !2, templateParams: !7)
+ !7 = !{!8}
+ !8 = !DITemplateTypeParameter(name: "element", type: !9)
+ !9 = !DIBasicType(name: "float", size: 32, align: 32, encoding: DW_ATE_float)
+ !10 = distinct !DIGlobalVariableExpression(var: !11, expr: !DIExpression())
+ !11 = !DIGlobalVariable(name: "u1", linkageName: "\01?u1@@3V?$RWBuffer@M@@A", scope: !0, file: !1, line: 3, type: !6, isLocal: false, isDefinition: true)
+ !12 = !{i32 2, !"Dwarf Version", i32 5}
+ !13 = !{i32 2, !"Debug Info Version", i32 3}
+ !14 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 7, type: !15, scopeLine: 7, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0)
+ !15 = !DISubroutineType(types: !16)
+ !16 = !{null, !17}
+ !17 = !DIDerivedType(tag: DW_TAG_typedef, name: "uint3", file: !1, baseType: !18)
+ !18 = !DICompositeType(tag: DW_TAG_class_type, name: "vector<unsigned int, 3>", file: !1, size: 96, align: 32, elements: !19, templateParams: !24)
+ !19 = !{!20, !22, !23}
+ !20 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !18, file: !1, baseType: !21, size: 32, align: 32, flags: DIFlagPublic)
+ !21 = !DIBasicType(name: "unsigned int", size: 32, align: 32, encoding: DW_ATE_unsigned)
+ !22 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !18, file: !1, baseType: !21, size: 32, align: 32, offset: 32, flags: DIFlagPublic)
+ !23 = !DIDerivedType(tag: DW_TAG_member, name: "z", scope: !18, file: !1, baseType: !21, size: 32, align: 32, offset: 64, flags: DIFlagPublic)
+ !24 = !{!25, !26}
+ !25 = !DITemplateTypeParameter(name: "element", type: !21)
+ !26 = !DITemplateValueParameter(name: "element_count", type: !27, value: i32 3)
+ !27 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
+ !28 = !DILocation(line: 7, column: 17, scope: !14)
+ !29 = !DILocalVariable(name: "dtid", arg: 1, scope: !14, file: !1, line: 7, type: !17)
+ !30 = !DILocation(line: 11, column: 18, scope: !14)
+ !31 = !DILocalVariable(name: "my_var", scope: !14, file: !1, line: 11, type: !9)
+ !32 = !DILocation(line: 11, column: 9, scope: !14)
+ !33 = !DILocation(line: 14, column: 26, scope: !14)
+ !34 = !DILocalVariable(name: "my_var2", scope: !14, file: !1, line: 14, type: !9)
+ !35 = !DILocation(line: 14, column: 9, scope: !14)
+ !36 = !DILocation(line: 17, column: 14, scope: !14)
+ !37 = !DILocation(line: 19, column: 1, scope: !14)
+...
+---
+name: _amdgpu_cs_main
+alignment: 1
+exposesReturnsTwice: false
+legalized: false
+regBankSelected: false
+selected: false
+failedISel: false
+tracksRegLiveness: true
+hasWinCFI: false
+noPhis: true
+isSSA: false
+noVRegs: false
+hasFakeUses: false
+callsEHReturn: false
+callsUnwindInit: false
+hasEHContTarget: false
+hasEHScopes: false
+hasEHFunclets: false
+isOutlined: false
+debugInstrRef: false
+failsVerification: false
+tracksDebugUserValues: false
+fixedStack: []
+stack: []
+entry_values: []
+callSites: []
+debugValueSubstitutions: []
+constants: []
+machineFunctionInfo:
+ explicitKernArgSize: 0
+ maxKernArgAlign: 4
+ ldsSize: 0
+ gdsSize: ...
[truncated]
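The LogicalView reader hunks above share one pattern: ask the object file for a CPU name, fall back to an empty string when none is recorded, and forward the result together with the triple and features to subtarget creation. A minimal standalone sketch of that pattern follows. `FakeObject`, `SubtargetRequest`, and `makeSubtargetRequest` are hypothetical stand-ins rather than LLVM APIs; only the optional-with-fallback shape mirrors the patch.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for an object file that may record a CPU name.
struct FakeObject {
  std::string Triple;
  std::optional<std::string> CPUName;

  // Returns the recorded CPU name, or nothing if the file has none.
  std::optional<std::string_view> tryGetCPUName() const {
    if (CPUName)
      return std::string_view(*CPUName);
    return std::nullopt;
  }
};

// Hypothetical bundle of the three strings forwarded to subtarget creation.
struct SubtargetRequest {
  std::string Triple;
  std::string CPU;
  std::string Features;
};

inline SubtargetRequest makeSubtargetRequest(const FakeObject &Obj,
                                             std::string_view Features) {
  assert(!Obj.Triple.empty() && "expected a target triple");

  std::string_view CPU; // Stays empty when the object records no CPU.
  if (auto OptCPU = Obj.tryGetCPUName())
    CPU = *OptCPU;

  return {Obj.Triple, std::string(CPU), std::string(Features)};
}
```

With this shape, an object built with `-mcpu=gfx1030` would yield a request whose CPU is `gfx1030`, while a file with no recorded CPU keeps the previous behavior of passing an empty CPU string.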
+
+# Simple regression test to make sure DBG_VALUE $noreg does not assert in the pass
+
+# CHECK: S_ENDPGM
+
+--- |
+ source_filename = "module"
+ target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128:128:48-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
+ target triple = "amdgcn-amd-amdpal"
+
+ %dx.types.ResRet.f32 = type { float, float, float, float, i32 }
+
+ define dllexport amdgpu_cs void @_amdgpu_cs_main(i32 inreg noundef %globalTable, i32 inreg noundef %userdata4, <3 x i32> inreg noundef %WorkgroupId, i32 inreg noundef %MultiDispatchInfo, <3 x i32> noundef %LocalInvocationId) #0 !dbg !14 {
+ %LocalInvocationId.i0 = extractelement <3 x i32> %LocalInvocationId, i64 0, !dbg !28
+ %WorkgroupId.i0 = extractelement <3 x i32> %WorkgroupId, i64 0, !dbg !28
+ %1 = call i64 @llvm.amdgcn.s.getpc(), !dbg !28
+ %2 = shl i32 %WorkgroupId.i0, 6, !dbg !28
+ %3 = add i32 %LocalInvocationId.i0, %2, !dbg !28
+ #dbg_value(i32 %3, !29, !DIExpression(DW_OP_LLVM_fragment, 0, 32), !28)
+ %4 = and i64 %1, -4294967296, !dbg !30
+ %5 = zext i32 %userdata4 to i64, !dbg !30
+ %6 = or disjoint i64 %4, %5, !dbg !30
+ %7 = inttoptr i64 %6 to ptr addrspace(4), !dbg !30, !amdgpu.uniform !2
+ %8 = load <4 x i32>, ptr addrspace(4) %7, align 4, !dbg !30, !invariant.load !2
+ %9 = call float @llvm.amdgcn.struct.buffer.load.format.f32(<4 x i32> %8, i32 %3, i32 0, i32 0, i32 0), !dbg !30
+ #dbg_value(%dx.types.ResRet.f32 poison, !31, !DIExpression(), !32)
+ %10 = fmul reassoc arcp contract afn float %9, 2.000000e+00, !dbg !33
+ #dbg_value(float %10, !34, !DIExpression(), !35)
+ %11 = getelementptr i8, ptr addrspace(4) %7, i64 32, !dbg !36, !amdgpu.uniform !2
+ %.upto01 = insertelement <4 x float> poison, float %10, i64 0, !dbg !36
+ %12 = shufflevector <4 x float> %.upto01, <4 x float> poison, <4 x i32> zeroinitializer, !dbg !36
+ %13 = load <4 x i32>, ptr addrspace(4) %11, align 4, !dbg !36, !invariant.load !2
+ call void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float> %12, <4 x i32> %13, i32 %3, i32 0, i32 0, i32 0), !dbg !36
+ ret void, !dbg !37
+ }
+
+ declare noundef i64 @llvm.amdgcn.s.getpc() #1
+ declare void @llvm.amdgcn.struct.buffer.store.format.v4f32(<4 x float>, <4 x i32>, i32, i32, i32, i32 immarg) #3
+ declare float @llvm.amdgcn.struct.buffer.load.format.f32(<4 x i32>, i32, i32, i32, i32 immarg) #4
+
+ attributes #0 = { memory(readwrite) "amdgpu-flat-work-group-size"="64,64" "amdgpu-memory-bound"="false" "amdgpu-num-sgpr"="4294967295" "amdgpu-num-vgpr"="4294967295" "amdgpu-prealloc-sgpr-spill-vgprs" "amdgpu-unroll-threshold"="1200" "amdgpu-wave-limiter"="false" "amdgpu-work-group-info-arg-no"="3" "denormal-fp-math"="ieee" "denormal-fp-math-f32"="preserve-sign" "target-cpu"="gfx1030" "target-features"=",+wavefrontsize64,+cumode,+enable-flat-scratch" }
+ attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) "target-cpu"="gfx1030" }
+ attributes #2 = { nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: write) "target-cpu"="gfx1030" }
+ attributes #3 = { nocallback nofree nosync nounwind willreturn memory(write) "target-cpu"="gfx1030" }
+ attributes #4 = { nocallback nofree nosync nounwind willreturn memory(read) "target-cpu"="gfx1030" }
+
+ !llvm.dbg.cu = !{!0}
+ !llvm.module.flags = !{!12, !13}
+
+ !0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "dxcoob 1.7.2308.16 (52da17e29)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, globals: !3)
+ !1 = !DIFile(filename: "tests\\basic_var.hlsl", directory: "")
+ !2 = !{}
+ !3 = !{!4, !10}
+ !4 = distinct !DIGlobalVariableExpression(var: !5, expr: !DIExpression())
+ !5 = !DIGlobalVariable(name: "u0", linkageName: "\01?u0@@3V?$RWBuffer@M@@A", scope: !0, file: !1, line: 2, type: !6, isLocal: false, isDefinition: true)
+ !6 = !DICompositeType(tag: DW_TAG_class_type, name: "RWBuffer<float>", file: !1, line: 2, size: 32, align: 32, elements: !2, templateParams: !7)
+ !7 = !{!8}
+ !8 = !DITemplateTypeParameter(name: "element", type: !9)
+ !9 = !DIBasicType(name: "float", size: 32, align: 32, encoding: DW_ATE_float)
+ !10 = distinct !DIGlobalVariableExpression(var: !11, expr: !DIExpression())
+ !11 = !DIGlobalVariable(name: "u1", linkageName: "\01?u1@@3V?$RWBuffer@M@@A", scope: !0, file: !1, line: 3, type: !6, isLocal: false, isDefinition: true)
+ !12 = !{i32 2, !"Dwarf Version", i32 5}
+ !13 = !{i32 2, !"Debug Info Version", i32 3}
+ !14 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 7, type: !15, scopeLine: 7, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0)
+ !15 = !DISubroutineType(types: !16)
+ !16 = !{null, !17}
+ !17 = !DIDerivedType(tag: DW_TAG_typedef, name: "uint3", file: !1, baseType: !18)
+ !18 = !DICompositeType(tag: DW_TAG_class_type, name: "vector<unsigned int, 3>", file: !1, size: 96, align: 32, elements: !19, templateParams: !24)
+ !19 = !{!20, !22, !23}
+ !20 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !18, file: !1, baseType: !21, size: 32, align: 32, flags: DIFlagPublic)
+ !21 = !DIBasicType(name: "unsigned int", size: 32, align: 32, encoding: DW_ATE_unsigned)
+ !22 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !18, file: !1, baseType: !21, size: 32, align: 32, offset: 32, flags: DIFlagPublic)
+ !23 = !DIDerivedType(tag: DW_TAG_member, name: "z", scope: !18, file: !1, baseType: !21, size: 32, align: 32, offset: 64, flags: DIFlagPublic)
+ !24 = !{!25, !26}
+ !25 = !DITemplateTypeParameter(name: "element", type: !21)
+ !26 = !DITemplateValueParameter(name: "element_count", type: !27, value: i32 3)
+ !27 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
+ !28 = !DILocation(line: 7, column: 17, scope: !14)
+ !29 = !DILocalVariable(name: "dtid", arg: 1, scope: !14, file: !1, line: 7, type: !17)
+ !30 = !DILocation(line: 11, column: 18, scope: !14)
+ !31 = !DILocalVariable(name: "my_var", scope: !14, file: !1, line: 11, type: !9)
+ !32 = !DILocation(line: 11, column: 9, scope: !14)
+ !33 = !DILocation(line: 14, column: 26, scope: !14)
+ !34 = !DILocalVariable(name: "my_var2", scope: !14, file: !1, line: 14, type: !9)
+ !35 = !DILocation(line: 14, column: 9, scope: !14)
+ !36 = !DILocation(line: 17, column: 14, scope: !14)
+ !37 = !DILocation(line: 19, column: 1, scope: !14)
+...
+---
+name: _amdgpu_cs_main
+alignment: 1
+exposesReturnsTwice: false
+legalized: false
+regBankSelected: false
+selected: false
+failedISel: false
+tracksRegLiveness: true
+hasWinCFI: false
+noPhis: true
+isSSA: false
+noVRegs: false
+hasFakeUses: false
+callsEHReturn: false
+callsUnwindInit: false
+hasEHContTarget: false
+hasEHScopes: false
+hasEHFunclets: false
+isOutlined: false
+debugInstrRef: false
+failsVerification: false
+tracksDebugUserValues: false
+fixedStack: []
+stack: []
+entry_values: []
+callSites: []
+debugValueSubstitutions: []
+constants: []
+machineFunctionInfo:
+ explicitKernArgSize: 0
+ maxKernArgAlign: 4
+ ldsSize: 0
+ gdsSize: ...
[truncated]
Thanks for the review @slinder1! This change was the precursor for #149429, which is intended to be a replacement for this change. Did you have a chance to take a look at that? I'm open to getting this change in first and reducing #149429 to a smaller PR if that makes things easier.
@@ -130,6 +130,11 @@ void SIPreAllocateWWMRegs::rewriteRegs(MachineFunction &MF) {
if (VirtReg.isPhysical())
  continue;

if (!VirtReg.isValid()) {
  assert(MI.isDebugInstr() && "non-debug use of noreg");
This assert probably isn't true in general, but would belong in the machine verifier
Are there AMDGPU instructions that take noreg? I could see other targets failing verifier if the assert is run on all targets.
A generic instruction easily could. I think we use this as a placeholder in a few places. In any case you should just drop the assert
Also I thought noreg counted as a physical register, so won't the continue above catch it?
FirstPhysicalRegister is defined as 1, so NoReg is excluded. In any case, the assert is removed.
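The point in the exchange above (register number 0 is neither physical nor valid, so the isPhysical() check alone does not skip it) can be sketched with a small stand-alone model. This is a simplified stand-in written for illustration, not the real llvm::Register class: the struct Reg, the helper needsRewrite, and the exact constant values other than "no register is 0, physical registers start at 1" are assumptions.

```cpp
#include <cstdint>

// Hypothetical, simplified model of a register id (NOT llvm::Register).
// Assumed layout: 0 means "no register", physical registers start at 1,
// and the top bit marks a virtual register.
struct Reg {
  static constexpr uint32_t NoRegister = 0u;
  static constexpr uint32_t FirstPhysicalReg = 1u;
  static constexpr uint32_t FirstStackSlot = 1u << 30;
  static constexpr uint32_t VirtualRegFlag = 1u << 31;

  uint32_t Id;

  constexpr bool isValid() const { return Id != NoRegister; }
  constexpr bool isPhysical() const {
    return FirstPhysicalReg <= Id && Id < FirstStackSlot;
  }
  constexpr bool isVirtual() const { return (Id & VirtualRegFlag) != 0; }
};

// Hypothetical helper with the same shape as the two guards in the hunk:
// skip physical registers, then skip the invalid "no register" operand.
constexpr bool needsRewrite(Reg R) {
  return !R.isPhysical() && R.isValid();
}

// Register 0 is not physical, so only the isValid() guard filters it out.
static_assert(!Reg{0u}.isPhysical(), "noreg is not a physical register");
static_assert(!Reg{0u}.isValid(), "noreg is not a valid register");
static_assert(!needsRewrite(Reg{0u}), "noreg operands are skipped");
static_assert(!needsRewrite(Reg{1u}), "physical registers are skipped");
static_assert(needsRewrite(Reg{(1u << 31) | 5u}), "virtual registers are rewritten");
```

All checks here run at compile time, so compiling the file is enough to confirm the behaviour of the model. In this model, dropping the isValid() guard would let a DBG_VALUE $noreg operand reach the rewrite step, which matches the failure the regression test is meant to cover.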
llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwwmregs-dbg-noreg.mir
attributes #0 = { memory(readwrite) "amdgpu-flat-work-group-size"="64,64" "amdgpu-memory-bound"="false" "amdgpu-num-sgpr"="4294967295" "amdgpu-num-vgpr"="4294967295" "amdgpu-prealloc-sgpr-spill-vgprs" "amdgpu-unroll-threshold"="1200" "amdgpu-wave-limiter"="false" "amdgpu-work-group-info-arg-no"="3" "denormal-fp-math"="ieee" "denormal-fp-math-f32"="preserve-sign" "target-cpu"="gfx1030" "target-features"=",+wavefrontsize64,+cumode,+enable-flat-scratch" }
attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) "target-cpu"="gfx1030" }
attributes #2 = { nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: write) "target-cpu"="gfx1030" }
attributes #3 = { nocallback nofree nosync nounwind willreturn memory(write) "target-cpu"="gfx1030" }
attributes #4 = { nocallback nofree nosync nounwind willreturn memory(read) "target-cpu"="gfx1030" }
Remove unnecessary attributes
Construct the Target triple with ObjectFile::makeTriple instead of just with Arch and leaving the rest unknown. Also create the subtarget with the CPU. AMDGPU needs the full triple and CPU to disassemble correctly.

To run a full test, also fixed a failure with the $noreg operand in DBG_VALUE, and a failure when writing relocations with AMDPAL.