Skip to content
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
66 changes: 49 additions & 17 deletions llvm/docs/SourceLevelDebugging.rst
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,40 @@ debugging information influences optimization passes then it will be reported
as a failure. See :doc:`TestingGuide` for more information on LLVM test
infrastructure and how to run various tests.

.. _variables_and_variable_fragments:

Variables and Variable Fragments
================================

Source language variables (or just "variables") are represented by `local
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add "and constants"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just imo, but I think we could skip mentioning constants, on the basis that the sentence as written isn't exclusive (it doesn't specify that only variables are represented by these classes), and I don't believe(?) that we use fragments for constants since we should always have a single whole definition for them, so the following text would only concern variables. No complaints from me about adding a mention of constants here, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the terminology gets a bit fuzzy, because "constants" can mean different things in different source languages anyway, right?

I didn't audit every other mention of "variable" in the document, but I imagine this same ambiguity comes up elsewhere, so it would be ideal to get a consistent definition to point back at. DWARF seems to call these "data objects" and says:

Chapter 4
Data Object and Object List Entries
This section presents the debugging information entries that describe individual
data objects: variables, parameters and constants, and lists of those objects that
may be grouped in a single declaration, such as a common block.

Maybe we can just adopt "data object" here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interest of not bike-shedding this patch forever, I'll avoid commenting too much more (it already looks "good enough" to me!), but I think using "data object" outside of the DWARF context may just be adding confusion. Personally I feel like "Variable" is a good term, since the information here explicitly applies to just "variables" and not really to constants, but ultimately I put my vote for either "variables" or "variables and constants".

variable <LangRef.html#dilocalvariable>`_ and `global variable
<LangRef.html#diglobalvariable>`_ metadata nodes.

When variables are not allocated to contiguous memory or to a single LLVM value
the metadata must record enough information to describe each "piece" or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor concern here, we say that:

When variables are not allocated

And then one of two options:

to contiguous memory
or
to a single LLVM value

This second part feels off to me: "allocated to a single LLVM value"

One suggestion would be to avoid mentioning allocation, and instead combine the two possibilities in terms of "describing":

"When variables cannot be described by a single use of an LLVM value, ...".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw I really like the wording of everything else!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the wording could be improved - I think that your proposed wording is better, but might be ambiguous with respect to variadic debug values, which aren't necessarily fragments but may be "described" by many LLVM values. Perhaps "When variables are split up across multiple LLVM values...", and it may also be worthwhile to be direct and give an example, i.e. "...such as for structs that have been split into separate LLVM values for each member...". No particularly strong feelings on this part from my end though!

Copy link
Contributor

@felipepiovezan felipepiovezan Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we say "cannot be described by a single expression" then? IIRC when we have multiple fragments, each fragment is described by its own expression (and its own intrinsic)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, the wording I proposed is imprecise/ambiguous; I think I also bled over concepts from the "use fragments for everything" world I was originally thinking in terms of.

I like the framing in terms of the number of expressions for many cases, but we still have cases where we use a single fragment+expression to describe that only one piece of a variable is defined, right? Maybe rather than starting with the "reason" for fragments I can start with just exactly what they are, then in a note go into the various cases they are used and give some examples?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rather than starting with the "reason" for fragments

I think you make a good point here. We don't need to justify the reason for fragments here (arguably it is self evident); we can just state what fragments are:

"A variable may be described in terms of fragments, where a fragment is a contiguous span of bits. This is achieved by ... [rest of the second paragraph]".

If we want to provide motivation, we can do so later through an example or just state in high level terms one code transformation that may cause fragments to appear.

"fragment" of the source variable. Each fragment is a contiguous span of bits
of the variable it is a part of.

This is achieved by encoding fragment information at the end of the
``DIExpression`` with the ``DW_OP_LLVM_fragment`` operation, whose operands are
the bit offset of the fragment relative to the start of the variable, and the
fragment size in bits.

.. note::

The ``DW_OP_LLVM_fragment`` operation acts only to encode the fragment
information, and does not have an effect on the semantics of the expression.

A debug intrinsic which refers to a ``DIExpression`` ending with a fragment
operation provides information about the fragment of the variable it refers to,
rather than the whole variable.

An equivalence relation over the set of all variables and variable fragments
called "spatially overlapping" is defined in order to describe when intrinsics
terminate the effects of other intrinsics. A variable spatially overlaps with
itself and all fragments of itself. Fragments additionally spatially overlap
with other fragments sharing one common bit of the same variable.

.. _format:

Debugging information format
Expand Down Expand Up @@ -180,7 +214,9 @@ track source local variables through optimization and code generation.

.. code-block:: llvm

void @llvm.dbg.declare(metadata, metadata, metadata)
void @llvm.dbg.declare(metadata <type> <value>,
metadata <!DILocalVariable>,
metadata <!DIExpression>)

This intrinsic provides information about a local element (e.g., variable).
The first argument is metadata holding the address of variable, typically a
Expand Down Expand Up @@ -221,7 +257,9 @@ agree on the memory location.

.. code-block:: llvm

void @llvm.dbg.value(metadata, metadata, metadata)
void @llvm.dbg.value(metadata <<type> <value>|!DIArgList>,
metadata <!DILocalVariable>,
metadata <!DIExpression>)

This intrinsic provides information when a user source variable is set to a new
value. The first argument is the new value (wrapped as metadata). The second
Expand All @@ -243,12 +281,12 @@ the complex expression derives the direct value.

.. code-block:: llvm

void @llvm.dbg.assign(Value *Value,
DIExpression *ValueExpression,
DILocalVariable *Variable,
DIAssignID *ID,
Value *Address,
DIExpression *AddressExpression)
void @llvm.dbg.assign(metadata <<type> <value>|!DIArgList>,
metadata <!DIExpression>,
metadata <!DILocalVariable>,
metadata <!DIAssignID>,
metadata <type> <value>,
metadata <!DIExpression>)

This intrinsic marks the position in IR where a source assignment occurred. It
encodes the value of the variable. It references the store, if any, that
Expand All @@ -259,13 +297,6 @@ argument is a ``DIAssignID`` used to reference a store. The fifth is the
destination of the store (wrapped as metadata), and the sixth is a `complex
expression <LangRef.html#diexpression>`_ that modifies it.

The formal LLVM-IR signature is:

.. code-block:: llvm

void @llvm.dbg.assign(metadata, metadata, metadata, metadata, metadata, metadata)


See :doc:`AssignmentTracking` for more info.

Object lifetimes and scoping
Expand Down Expand Up @@ -417,8 +448,9 @@ values through compilation, when objects are promoted to SSA values an
``llvm.dbg.value`` intrinsic is created for each assignment, recording the
variable's new location. Compared with the ``llvm.dbg.declare`` intrinsic:

* A dbg.value terminates the effect of any preceding dbg.values for (any
overlapping fragments of) the specified variable.
* A dbg.value terminates the effect of any preceding dbg.values for any
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we be consistent and use llvm.dbg.value here?

spatially overlapping variables or fragments of the specified variable or
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is worded sounds as if in this example:

%1 = ...
call llvm.dbg.value(metadata %1, DILocalVariable("x"), DIExpression())
%2 = ...
call llvm.dbg.value(metadata %1, DILocalVariable("y"), DIExpression())

The second intrinsic ends the range of "x", but that's not the case. It only terminates it if it refers to the same variable or a fragment of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
spatially overlapping variables or fragments of the specified variable or
spatially overlapping llvm.dbg.values of the specified variable or

Perhaps this - I think given the definition of spatial overlaps, it should be clear that it applies to both fragment and non-fragment variable definitions.

Copy link
Contributor Author

@slinder1 slinder1 Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it might be good to add a note here, and to make the term "spatially overlapping" be a link back to the definition, but the intention was to define "spatial overlap" such that this could just talk in terms of the universe of all variables and fragments and by definition still be restricted to the way LLVM currently works. In other words, a dbg.value for "x" cannot spatially overlap with a dbg.value for "y" by definition, so the wording here does not need to repeat that restriction.

Edit: at least it doesn't need to repeat it "normatively"; if a note helps the reader understand I'm happy to include one!

fragment.
* The dbg.value's position in the IR defines where in the instruction stream
the variable's value changes.
* Operands can be constants, indicating the variable is assigned a
Expand Down