@Nerixyz Nerixyz commented Sep 8, 2025

After parsing blocks in a function, the blocks should be marked as parsed for them to be dumped (see Function::Dump). As explained in #114906 (comment), this happens (accidentally?) in the DIA plugin when parsing variables, because it calls function.GetBlock(can_create=true) which marks blocks as parsed. In the native plugin, this was never called, so blocks and variables were never included in the lldb-test symbols output.
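
To make the "parsed" flag concrete, here is a minimal toy model of the behaviour described above. It is a sketch, not lldb code: `ToyBlock`, `SetParsed`, `DumpBlock` and `DumpFunction` are hypothetical names, and the meaning of the two boolean arguments (flag value, then "also apply to children") is an assumption based on the call in the diff.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a block in a function's block tree.
// Only the "parsed" flag and the child list matter for this sketch.
struct ToyBlock {
  std::string name;
  bool parsed = false;
  std::vector<ToyBlock> children;

  // Assumed semantics of the two-argument call in the diff:
  // set the flag on this block and, if requested, on all descendants.
  void SetParsed(bool b, bool set_children) {
    parsed = b;
    if (set_children)
      for (ToyBlock &child : children)
        child.SetParsed(b, true);
  }
};

// Prints one block per line, indented by two spaces per nesting level.
static void DumpBlock(const ToyBlock &block, int depth,
                      std::ostringstream &out) {
  assert(depth >= 0);
  out << std::string(depth * 2, ' ') << "Block " << block.name << "\n";
  for (const ToyBlock &child : block.children)
    DumpBlock(child, depth + 1, out);
}

// Toy dump: the function header is always printed, but the block tree is
// only walked when the top-level block is flagged as parsed.
std::string DumpFunction(const std::string &func_name, const ToyBlock &top) {
  std::ostringstream out;
  out << "Function " << func_name << "\n";
  if (top.parsed)
    DumpBlock(top, 1, out);
  return out.str();
}
```

In this model, a function whose blocks were built but never flagged dumps only its header line, which matches the symptom described for the native plugin.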

The variables.test for the DIA plugin tests this. One difference between the plugins is how they specify the location of local variables. This causes the output of the native plugin to be two lines per variable, whereas the DIA plugin has one line:

(native):
000002C4B7593020:       Variable{0x1c800001}, name = "var_arg1", type = {0000000000000744} 0x000002C4B6CA7900 (int), scope = parameter, location = 0x00000000:
        [0x000000014000102c, 0x000000014000103e): DW_OP_breg7 RSP+8
(DIA):
000002778C827EE0:       Variable{0x0000001b}, name = "var_arg1", type = {0000000000000005} 0x000002778C1FBAB0 (int), scope = parameter, decl = VariablesTest.cpp:32, location = DW_OP_breg7 RSP+8

In the test, I filtered out lines that start with spaces followed by [0x, so we can still use CHECK-NEXT for both plugins.
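
For illustration, the filter can be expressed in C++ as well. This is only a sketch of what the `sed '/^ \+\[0x/d'` step in the RUN lines does; `StripRangeLines` is a hypothetical helper and is not part of the PR.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Drops every line that starts with one or more spaces followed by "[0x".
// These are the address-range continuation lines that the native plugin
// prints below a variable. All other lines are kept unchanged.
std::string StripRangeLines(const std::string &text) {
  static const std::regex range_line(R"(^ +\[0x)");
  std::istringstream in(text);
  std::ostringstream out;
  std::string line;
  while (std::getline(in, line)) {
    if (!std::regex_search(line, range_line))
      out << line << '\n'; // every kept line is re-terminated with '\n'
  }
  return out.str();
}
```

After this step each variable occupies one line in the output of either plugin, so a CHECK-NEXT for the following variable still matches.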


Another difference between the plugins is that DIA marks the this pointer as artificial (equivalent to DWARF), whereas the native plugin does not, so the test no longer checks for artificial. DIA does this if a variable's object kind is ObjectPtr (source). As far as I know, there isn't anything in the debug info that says "this variable is the this pointer" other than the name/type of a variable and the type of the function.

@Nerixyz Nerixyz requested a review from ZequanWu September 8, 2025 15:36
@llvmbot llvmbot added the lldb label Sep 8, 2025

llvmbot commented Sep 8, 2025

@llvm/pr-subscribers-lldb

Author: nerix (Nerixyz)

Full diff: https://github.com/llvm/llvm-project/pull/157493.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp (+2)
  • (modified) lldb/test/Shell/SymbolFile/PDB/variables.test (+22-12)
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
index 112eb06e462fc..81b2818fa07bd 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
@@ -1624,6 +1624,8 @@ size_t SymbolFileNativePDB::ParseBlocksRecursive(Function &func) {
   for (uint64_t uid : remove_uids) {
     m_inline_sites.erase(uid);
   }
+
+  func.GetBlock(false).SetBlockInfoHasBeenParsed(true, true);
   return count;
 }
 
diff --git a/lldb/test/Shell/SymbolFile/PDB/variables.test b/lldb/test/Shell/SymbolFile/PDB/variables.test
index 9ee10f75c7e38..970d714c29c3b 100644
--- a/lldb/test/Shell/SymbolFile/PDB/variables.test
+++ b/lldb/test/Shell/SymbolFile/PDB/variables.test
@@ -2,15 +2,27 @@ REQUIRES: system-windows, msvc
 RUN: mkdir -p %t.dir
 RUN: %build --compiler=clang-cl --mode=compile --arch=64 --nodefaultlib --output=%t.dir/VariablesTest.cpp.obj %S/Inputs/VariablesTest.cpp
 RUN: %build --compiler=msvc --mode=link --arch=64 --nodefaultlib --output=%t.dir/VariablesTest.cpp.exe %t.dir/VariablesTest.cpp.obj
-RUN: lldb-test symbols %t.dir/VariablesTest.cpp.exe > %t.dir/VariablesTest.out
-RUN: FileCheck --check-prefix=GLOBALS --input-file=%t.dir/VariablesTest.out %s
-RUN: FileCheck --check-prefix=FUNC-F --input-file=%t.dir/VariablesTest.out %s
-RUN: FileCheck --check-prefix=FUNC-MAIN --input-file=%t.dir/VariablesTest.out %s
-RUN: FileCheck --check-prefix=FUNC-CONSTRUCTOR --input-file=%t.dir/VariablesTest.out %s
-RUN: FileCheck --check-prefix=FUNC-MEMBER --input-file=%t.dir/VariablesTest.out %s
+# Note: The native plugin creates a location list for variables that's only valid for the function.
+#       The DIA plugin creates a location expression that's always valid. This causes DIA to output
+#       one line per variable where the native plugin would output two (the second would contain the
+#       location information). This removes the second line from the output of the native plugin.
+#       It's done in both cases, because LLDB might not be compiled with the DIA SDK in which case
+#       the native plugin is always used.
+RUN: env LLDB_USE_NATIVE_PDB_READER=0 lldb-test symbols %t.dir/VariablesTest.cpp.exe | sed '/^ \+\[0x/d' > %t.dir/VariablesTest.DIA.out
+RUN: env LLDB_USE_NATIVE_PDB_READER=1 lldb-test symbols %t.dir/VariablesTest.cpp.exe | sed '/^ \+\[0x/d' > %t.dir/VariablesTest.Native.out
+RUN: FileCheck --check-prefix=GLOBALS --input-file=%t.dir/VariablesTest.DIA.out %s
+RUN: FileCheck --check-prefix=GLOBALS --input-file=%t.dir/VariablesTest.Native.out %s
+RUN: FileCheck --check-prefix=FUNC-F --input-file=%t.dir/VariablesTest.DIA.out %s
+RUN: FileCheck --check-prefix=FUNC-F --input-file=%t.dir/VariablesTest.Native.out %s
+RUN: FileCheck --check-prefix=FUNC-MAIN --input-file=%t.dir/VariablesTest.DIA.out %s
+RUN: FileCheck --check-prefix=FUNC-MAIN --input-file=%t.dir/VariablesTest.Native.out %s
+RUN: FileCheck --check-prefix=FUNC-CONSTRUCTOR --input-file=%t.dir/VariablesTest.DIA.out %s
+RUN: FileCheck --check-prefix=FUNC-CONSTRUCTOR --input-file=%t.dir/VariablesTest.Native.out %s
+RUN: FileCheck --check-prefix=FUNC-MEMBER --input-file=%t.dir/VariablesTest.DIA.out %s
+RUN: FileCheck --check-prefix=FUNC-MEMBER --input-file=%t.dir/VariablesTest.Native.out %s
 
 GLOBALS: Module [[MOD:.*]]
-GLOBALS: SymbolFile pdb ([[MOD]])
+GLOBALS: SymbolFile {{(native-)?}}pdb ([[MOD]])
 GLOBALS:     CompileUnit{{.*}}, language = "c++", file = '{{.*}}\VariablesTest.cpp'
 GLOBALS-DAG:   Variable{{.*}}, name = "g_IntVar"
 GLOBALS-SAME:  scope = global, location = {{.*}}, external
@@ -30,7 +42,7 @@ GLOBALS-DAG:   Variable{{.*}}, name = "g_Const"
 GLOBALS-SAME:  scope = ??? (2)
 GLOBALS:     Function
 
-FUNC-F:      Function{{.*}}, mangled = ?f@@YAHHH@Z
+FUNC-F:      Function{{.*}}, {{mangled = \?f@@YAHHH@Z|demangled = f}}
 FUNC-F-NEXT:   Block
 FUNC-F-NEXT:     Variable{{.*}}, name = "var_arg1"
 FUNC-F-SAME:                     scope = parameter
@@ -39,7 +51,7 @@ FUNC-F-SAME:                     scope = parameter
 FUNC-F-NEXT:     Variable{{.*}}, name = "same_name_var"
 FUNC-F-SAME:                     scope = local
 
-FUNC-MAIN:      Function{{.*}}, mangled = main
+FUNC-MAIN:      Function{{.*}}, {{(de)?}}mangled = main
 FUNC-MAIN-NEXT:   Block
 FUNC-MAIN-NEXT:     Variable{{.*}}, name = "same_name_var"
 FUNC-MAIN-SAME:                     scope = local
@@ -52,11 +64,10 @@ FUNC-MAIN-SAME:                     scope = local
 FUNC-MAIN-NEXT:     Variable{{.*}}, name = "a"
 FUNC-MAIN-SAME:                     scope = local
 
-FUNC-CONSTRUCTOR:      Function{{.*}}, {{(de)?}}mangled = {{.*}}{{(Class::)?}}Class{{.*}}
+FUNC-CONSTRUCTOR:      Function{{.*}}, {{(de)?}}mangled = {{.*}}Class::Class{{.*}}
 FUNC-CONSTRUCTOR-NEXT:   Block
 FUNC-CONSTRUCTOR-NEXT:     Variable{{.*}}, name = "this"
 FUNC-CONSTRUCTOR-SAME:                     scope = parameter
-FUNC-CONSTRUCTOR-SAME:                     artificial
 FUNC-CONSTRUCTOR-NEXT:     Variable{{.*}}, name = "a"
 FUNC-CONSTRUCTOR-SAME:                     scope = parameter
 
@@ -64,4 +75,3 @@ FUNC-MEMBER:      Function{{.*}}, {{(de)?}}mangled = {{.*}}{{(Class::)?}}Func{{.
 FUNC-MEMBER-NEXT:   Block
 FUNC-MEMBER-NEXT:     Variable{{.*}}, name = "this"
 FUNC-MEMBER-SAME:                     scope = parameter
-FUNC-MEMBER-SAME:                     artificial

@Michael137

> As far as I know, there isn't anything in the debug info that says "this variable is the this pointer" other than the name/type of a variable and the type of the function.

How does the DIA PDB plugin do it then? Why can't the native plugin use PDB_DataKind::ObjectPtr?


Nerixyz commented Sep 8, 2025

> Why can't the native plugin use PDB_DataKind::ObjectPtr?

It's something DIA determines - in IDiaSymbol::get_dataKind.

The symbol information we have is the following:

    1132 | S_GPROC32 [size = 52] `Class::Func`
           parent = 0, end = 1248, addr = 0001:0208, code size = 7
           type = `0x1010 (void Class::())`, debug start = 0, debug end = 0, flags = noinline | opt debuginfo
    1184 | S_FRAMEPROC [size = 32]
           size = 8, padding size = 0, offset to padding = 0
           bytes of callee saved registers = 0, exception handler addr = 0000:0000
           local fp reg = RSP, param fp reg = RSP
           flags = safe buffers
    1216 | S_LOCAL [size = 16] `this`
           type=0x1013 (Class*), flags = param
    1232 | S_DEFRANGE_FRAMEPOINTER_REL [size = 16]
           offset = 0, range = [0001:0213,+2)
           gaps = []
    1248 | S_END [size = 4]

This is Class::Func() which takes no parameters (except this).
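
To spell out what "name/type of a variable and the type of the function" would amount to, here is a purely illustrative heuristic. It is not what this PR or either plugin implements; `LocalInfo` and `LooksLikeThisPointer` are hypothetical, and the fields are a simplified view of the S_LOCAL record and the enclosing function's type shown above.

```cpp
#include <string>

// Hypothetical, simplified view of an S_LOCAL record.
struct LocalInfo {
  std::string name;     // e.g. "this"
  bool is_param;        // "flags = param" in the record dump
  bool type_is_pointer; // e.g. the type is Class*
};

// Heuristic only: there is no dedicated "object pointer" marker in the
// records shown above, so the best available guess is a pointer-typed
// parameter named "this" inside a member function.
bool LooksLikeThisPointer(const LocalInfo &local, bool func_is_member) {
  return func_is_member && local.is_param && local.type_is_pointer &&
         local.name == "this";
}
```

A name-based check like this is a guess rather than a fact from the debug info, which is one reason to leave the artificial flag unset instead.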

@Nerixyz Nerixyz merged commit 406d6bd into llvm:main Sep 9, 2025
11 checks passed
@Nerixyz Nerixyz deleted the fix/lldb-npdb-func-blocks branch November 7, 2025 14:16