@dwblaikie

This adds just enough debug info for i32/int parameters and return
values, with a path forward for adding DWARF type metadata for other
types.

As it happens, return type information is carried separately from
parameter information:

  • Return type information is carried in the type of the DISubprogram
    (as a DISubroutineType - which does carry parameter type information
    as well, but that's unused when the DWARF is emitted by LLVM)
  • Parameter information is carried by DILocalVariables with a non-zero
    arg value (the 1-based position of the parameter in the function's
    parameter list)
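To make the split above concrete, here is a minimal sketch using toy stand-ins rather than LLVM's real classes. Every name in it (`ToyDebugType`, `ToySubroutineType`, `ToyLocalVariable`, `ToySubprogram`, `ReturnTypeName`, `ParameterTypeNames`) is hypothetical and only models the relationship described: the return type is read from the subroutine type, and parameter types are read from the arg-numbered variables.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-ins for LLVM's debug-info nodes, not the real classes.
struct ToyDebugType {
  std::string name;
};

// Models a DISubroutineType: element 0 is the return type (nullptr for void),
// followed by the parameter types.
struct ToySubroutineType {
  std::vector<const ToyDebugType*> elements;
};

// Models a DILocalVariable: arg == 0 is an ordinary local, arg >= 1 is the
// 1-based parameter position. `type` is assumed non-null in this model.
struct ToyLocalVariable {
  std::string name;
  unsigned arg;
  const ToyDebugType* type;
};

struct ToySubprogram {
  std::string name;
  ToySubroutineType type;
  std::vector<ToyLocalVariable> variables;
};

// The return type is read from the subroutine type only.
auto ReturnTypeName(const ToySubprogram& sp) -> std::string {
  if (sp.type.elements.empty() || sp.type.elements.front() == nullptr) {
    return "void";
  }
  return sp.type.elements.front()->name;
}

// Parameter types are read from the arg-numbered variables only; the
// parameter entries in the subroutine type are ignored here.
auto ParameterTypeNames(const ToySubprogram& sp) -> std::vector<std::string> {
  std::vector<const ToyLocalVariable*> params;
  for (const ToyLocalVariable& var : sp.variables) {
    if (var.arg != 0) {
      params.push_back(&var);
    }
  }
  std::sort(params.begin(), params.end(),
            [](const ToyLocalVariable* a, const ToyLocalVariable* b) {
              return a->arg < b->arg;
            });
  std::vector<std::string> names;
  for (const ToyLocalVariable* var : params) {
    names.push_back(var->type->name);
  }
  return names;
}
```

In this model a function with parameter types listed in its subroutine type but no arg-numbered variables reports no parameters, which mirrors why the parameter variables need to exist (and stay reachable) for the DWARF to describe them.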

Parameters don't have locations yet (future work), so nothing would
otherwise keep their DILocalVariables live/reachable when DWARF is
emitted. For cases like this (in clang, it happens in optimized builds,
where every reference to a parameter variable may be optimized away) the
variables can be "retained" in a list on the DISubprogram. This is done
by passing the AlwaysPreserve parameter to createParameterVariable, which
adds them to a list that gets attached to the DISubprogram when it's
finalized later.
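The retention mechanism can be sketched with a toy builder. This is an illustrative model only: `ToyDebugBuilder`, `ToyVariable`, `ToySubprogram`, and `EmittedParameterNames` are hypothetical names, and the model assumes no variable has a location, so only retained variables are reachable.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parameter's DILocalVariable.
struct ToyVariable {
  std::string name;
  unsigned arg;
};

// Hypothetical stand-in for a DISubprogram with its retained list.
struct ToySubprogram {
  std::string name;
  std::vector<ToyVariable> retained_nodes;
};

class ToyDebugBuilder {
 public:
  // Creates a parameter variable. With always_preserve set, the variable is
  // queued so it can be attached to the subprogram at finalization.
  auto CreateParameterVariable(std::string name, unsigned arg,
                               bool always_preserve) -> ToyVariable {
    ToyVariable var{std::move(name), arg};
    if (always_preserve) {
      preserved_.push_back(var);
    }
    return var;
  }

  // Attaches the queued variables to the subprogram.
  auto Finalize(ToySubprogram& sp) -> void {
    sp.retained_nodes = std::move(preserved_);
    preserved_.clear();
  }

 private:
  std::vector<ToyVariable> preserved_;
};

// With no locations referencing any variable, only the retained variables
// are reachable from the subprogram in this model.
auto EmittedParameterNames(const ToySubprogram& sp)
    -> std::vector<std::string> {
  std::vector<std::string> names;
  for (const ToyVariable& var : sp.retained_nodes) {
    names.push_back(var.name);
  }
  return names;
}
```

A variable created without `always_preserve` is simply dropped here, which is the situation the retained list is meant to avoid.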

For now, any unsupported type is emitted as void* as a placeholder
(except a void return, which is emitted as void).
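As a rough illustration of that placeholder rule, the sketch below maps a toy type enum to the debug type names seen in the DWARF output. `ToyType`, `DebugTypeName`, and `ReturnDebugTypeName` are hypothetical and do not correspond to functions in the PR.

```cpp
#include <optional>
#include <string>

// Hypothetical type kinds: one supported type and one unsupported type.
enum class ToyType { I32, Class };

// Supported types get a real debug type; anything else falls back to the
// "void *" placeholder.
auto DebugTypeName(ToyType type) -> std::string {
  switch (type) {
    case ToyType::I32:
      return "int";
    case ToyType::Class:
      break;
  }
  return "void *";
}

// A function with no return type has no debug return type at all (void),
// rather than the placeholder.
auto ReturnDebugTypeName(std::optional<ToyType> type)
    -> std::optional<std::string> {
  if (!type.has_value()) {
    return std::nullopt;
  }
  return DebugTypeName(*type);
}
```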

Given this example:

import Core library "io";
class MyClass {
}
fn Unsupported(v: MyClass) {
}
fn Ret() -> i32 {
  return 42;
}
fn Arg(x: i32) {
  Core.Print(x);
}
fn Run() {
}

this is the resulting DWARF:

DW_TAG_compile_unit
  DW_AT_name    ("test.carbon")
  DW_TAG_subprogram
    DW_AT_name  ("Unsupported")
    DW_TAG_formal_parameter
      DW_AT_type        (0x00000066 "void *")
  DW_TAG_subprogram
    DW_AT_name  ("Ret")
    DW_AT_type  (0x00000062 "int")
  DW_TAG_subprogram
    DW_AT_name  ("Arg")
    DW_TAG_formal_parameter
      DW_AT_type        (0x00000062 "int")
  DW_TAG_subprogram
    DW_AT_name  ("Run")
  DW_TAG_base_type
    DW_AT_name  ("int")
  DW_TAG_pointer_type

And the debugger:

(gdb) p Ret()
$1 = 42
(gdb) p Arg(4)
4
$2 = void

I'm not sure whether this logic should be merged with the logic for
making the llvm::Function type (which the DISubroutineType building code
was inspired by/copied from). Since they're done at different
times/places, I don't think there's an easy way to do it in one pass, but
maybe the code could be shared (even if it's run twice) in some generic
SemIR::Function type walker.

@dwblaikie requested a review from a team as a code owner (November 20, 2025 21:32)
@dwblaikie requested review from danakj and removed request for a team (November 20, 2025 21:32)

// Returns both the lowered llvm IR type and the lowered llvm IR debug info
// type for the given type_id.
auto GetTypes(SemIR::TypeId type_id) const

I think this should be named something like this?

Suggested change
- auto GetTypes(SemIR::TypeId type_id) const
+ auto GetTypeAndDebug(SemIR::TypeId type_id) const

Or maybe

Suggested change
- auto GetTypes(SemIR::TypeId type_id) const
+ auto GetTypeAndDIType(SemIR::TypeId type_id) const

Or

Suggested change
- auto GetTypes(SemIR::TypeId type_id) const
+ auto GetTypeAndDebugType(SemIR::TypeId type_id) const

Or

Suggested change
- auto GetTypes(SemIR::TypeId type_id) const
+ auto GetTypeAndDebugInfoType(SemIR::TypeId type_id) const

Comment on lines +69 to 71
CARBON_CHECK(types_.Get(type_id).first, "Missing type {0}: {1}", type_id,
sem_ir().types().GetAsInst(type_id));
return types_.Get(type_id);

Can we avoid calling Get() twice here?
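One way to read this suggestion is to store the lookup result in a local, check it, and return it. The sketch below uses a toy store rather than the real `types_` container: `ToyTypeStore`, `ToyTypePair`, and the lookup counter are hypothetical, and a plain `assert` stands in for `CARBON_CHECK`.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins for the lowered IR type and its debug info type.
struct ToyIrType {};
struct ToyDebugType {};
using ToyTypePair = std::pair<const ToyIrType*, const ToyDebugType*>;

// Toy store that counts lookups so the single-lookup property is visible.
class ToyTypeStore {
 public:
  auto Set(int type_id, ToyTypePair types) -> void { types_[type_id] = types; }

  auto Get(int type_id) const -> ToyTypePair {
    ++lookup_count_;
    auto it = types_.find(type_id);
    if (it == types_.end()) {
      return ToyTypePair{nullptr, nullptr};
    }
    return it->second;
  }

  auto lookup_count() const -> int { return lookup_count_; }

 private:
  std::unordered_map<int, ToyTypePair> types_;
  mutable int lookup_count_ = 0;
};

// Looks the pair up once, checks it, and returns the same value.
auto GetTypes(const ToyTypeStore& store, int type_id) -> ToyTypePair {
  ToyTypePair types = store.Get(type_id);
  assert(types.first != nullptr && "Missing type");
  return types;
}
```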

auto GetTypeType() -> llvm::StructType* { return context().GetTypeType(); }

auto context() -> Context& { return *context_; }
auto context() const -> Context& { return *context_; }

Generally our Context types are always used mutably, so this seems surprising to me. And it's not const-correct either. Can we remove the need for this?

Comment on lines +203 to +207
auto BuildDISubroutineType(
const SemIR::Function&, SemIR::SpecificId specific_id,
llvm::SmallVectorImpl<llvm::DIType*>& debug_parameter_types) const
-> llvm::DISubroutineType*;
auto BuildDIType(SemIR::TypeId type_id) const -> llvm::DIType*;

Add comments for these?

Comment on lines 209 to 210
// Builds the type for the given instruction, which should then be cached by
// the caller.

Update this? Do they cache the debug info?

Comment on lines +799 to +800
// TODO: if int_repr.kind == SemIR::InitRepr::ByCopy - be sure the return
// type is tagged with indirect calling convention

Suggested change
- // TODO: if int_repr.kind == SemIR::InitRepr::ByCopy - be sure the return
- // type is tagged with indirect calling convention
+ // TODO: If int_repr.kind == SemIR::InitRepr::ByCopy - be sure the return
+ // type is tagged with indirect calling convention.

// TODO: expose the `Call` parameter patterns in `Function`, and use them here
llvm::SmallVector<llvm::Metadata*, 16> element_types;
element_types.push_back(return_info.type_id.has_value()
? get_debug_type(return_info.type_id)

What if get_debug_type() returns nullptr for a None input?

Comment on lines +786 to +787
CARBON_CHECK(type_id.has_value());
if (!type_id.has_value()) {

These lines seem to contradict each other

param_pattern_info->inst_id));
switch (auto value_rep = SemIR::ValueRepr::ForType(sem_ir(), param_type_id);
value_rep.kind) {
case SemIR::ValueRepr::Unknown:

Should this get a different FATAL message?


auto FileContext::BuildDISubroutineType(
const SemIR::Function& function, SemIR::SpecificId specific_id,
llvm::SmallVectorImpl<llvm::DIType*>& debug_parameter_types) const

Consider returning a struct instead of using an out-parameter?
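A minimal sketch of what the struct-returning shape could look like, using toy types in place of `llvm::DISubroutineType` and `llvm::DIType`. `ToySubroutineTypeInfo` and `BuildSubroutineTypeInfo` are hypothetical names, and the function takes already-resolved types to stay self-contained.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a debug info type.
struct ToyDebugType {
  std::string name;
};

// Bundles both results so callers don't need to pass an out-parameter.
struct ToySubroutineTypeInfo {
  // Return type first (nullptr for void), then the parameter types.
  std::vector<const ToyDebugType*> signature;
  // Parameter types alone, for creating the parameter variables.
  std::vector<const ToyDebugType*> parameter_types;
};

auto BuildSubroutineTypeInfo(
    const ToyDebugType* return_type,
    const std::vector<const ToyDebugType*>& parameter_types)
    -> ToySubroutineTypeInfo {
  ToySubroutineTypeInfo info;
  info.signature.push_back(return_type);
  info.signature.insert(info.signature.end(), parameter_types.begin(),
                        parameter_types.end());
  info.parameter_types = parameter_types;
  return info;
}
```

Callers can then read both fields from the returned value, so the function's outputs are all visible in its return type.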
