@@ -37,13 +37,13 @@ includes contributions to open source projects such as LLVM [:ref:`LLVM
 
 The LLVM compiler has upstream support for commercially available AMD GPU
 hardware (AMDGPU) [:ref:`AMDGPU-LLVM <amdgpu-dwarf-AMDGPU-LLVM>`]. The open
-source ROCgdb [:ref:`AMD-ROCgdb <amdgpu-dwarf-AMD-ROCgdb>`] GDB based debugger
+source ROCgdb [:ref:`AMD-ROCgdb <amdgpu-dwarf-AMD-ROCgdb>`] GDB-based debugger
 also has support for AMDGPU which is being upstreamed. Support for AMDGPU is
 also being added by third parties to the GCC [:ref:`GCC <amdgpu-dwarf-GCC>`]
 compiler and the Perforce TotalView HPC Debugger [:ref:`Perforce-TotalView
 <amdgpu-dwarf-Perforce-TotalView>`].
 
-To support debugging heterogeneous programs several features that are not
+To support debugging heterogeneous programs, several features that are not
 provided by current DWARF Version 5 [:ref:`DWARF <amdgpu-dwarf-DWARF>`] have
 been identified. The :ref:`amdgpu-dwarf-extensions` section gives an overview of
 the extensions devised to address the missing features. The extensions seek to
@@ -107,7 +107,7 @@ for each in terms of heterogeneous debugging.
 DWARF Version 5 does not allow location descriptions to be entries on the DWARF
 expression stack. They can only be the final result of the evaluation of a DWARF
 expression. However, by allowing a location description to be a first-class
-entry on the DWARF expression stack it becomes possible to compose expressions
+entry on the DWARF expression stack, it becomes possible to compose expressions
 containing both values and location descriptions naturally. It allows objects to
 be located in any kind of memory address space, in registers, be implicit
 values, be undefined, or a composite of any of these.
@@ -123,20 +123,20 @@ non-default address spaces and generalizing the power of composite location
 descriptions to any kind of location description.
 
 For those familiar with the definition of location descriptions in DWARF Version
-5, the definitions in these extensions are presented differently, but does in
+5, the definitions in these extensions are presented differently, but do in
 fact define the same concept with the same fundamental semantics. However, it
 does so in a way that allows the concept to extend to support address spaces,
 bit addressing, the ability for composite location descriptions to be composed
 of any kind of location description, and the ability to support objects located
 at multiple places. Collectively these changes expand the set of architectures
-that can be supported and improves support for optimized code.
+that can be supported and improve support for optimized code.
 
 Several approaches were considered, and the one presented, together with the
 extensions it enables, appears to be the simplest and cleanest one that offers
 the greatest improvement of DWARF's ability to support debugging optimized GPU
 and non-GPU code. Examining the GDB debugger and LLVM compiler, it appears only
 to require modest changes as they both already have to support general use of
-location descriptions. It is anticipated that will also be the case for other
+location descriptions. It is anticipated that this will also be the case for other
 debuggers and compilers.
 
 GDB has been modified to evaluate DWARF Version 5 expressions with location
@@ -156,7 +156,7 @@ DWARF Expression Stack* [:ref:`AMDGPU-DWARF-LOC
 2.2 Generalize CFI to Allow Any Location Description Kind
 ---------------------------------------------------------
 
-CFI describes restoring callee saved registers that are spilled. Currently CFI
+CFI describes restoring callee saved registers that are spilled. Currently, CFI
 only allows a location description that is a register, memory address, or
 implicit location description. AMDGPU optimized code may spill scalar registers
 into portions of vector registers. This requires extending CFI to allow any
@@ -223,7 +223,7 @@ infinite precision offsets to allow it to correctly track a series of positive
 and negative offsets that may transiently overflow or underflow, but end up in
 range. This is simple for the arithmetic operations as they are defined in terms
 of two's complement arithmetic on a base type of a fixed size. Therefore, the
-offset operation define that integer overflow is ill-formed. This is in contrast
+offset operation defines that integer overflow is ill-formed. This is in contrast
 to the ``DW_OP_plus``, ``DW_OP_plus_uconst``, and ``DW_OP_minus`` arithmetic
 operations which define that it causes wrap-around.
 
@@ -359,7 +359,7 @@ address space at a fixed address.
 
 The ``DW_OP_LLVM_form_aspace_address`` (see
 :ref:`amdgpu-dwarf-memory-location-description-operations`) operation is defined
-to create a memory location description from an address and address space. If
+to create a memory location description from an address and address space. It
 can be used to specify the location of a variable that is allocated in a
 specific address space. This allows the size of addresses in an address space to
 be larger than the generic type. It also allows a consumer great implementation
@@ -372,7 +372,7 @@ In contrast, if the ``DW_OP_LLVM_form_aspace_address`` operation had been
 defined to produce a value, and an implicit conversion to a memory location
 description was defined, then it would be limited to the size of the generic
 type (which matches the size of the default address space). An implementation
-would likely have to use *reserved ranges* of value to represent different
+would likely have to use *reserved ranges* of values to represent different
 address spaces. Such a value would likely not match any address value in the
 actual hardware. That would require the consumer to have special treatment for
 such values.
@@ -528,7 +528,7 @@ active. To describe the conceptual location of non-active lanes requires an
 attribute that has an expression that computes the source location PC for each
 lane.
 
-For efficiency, the expression calculates the source location the wavefront as a
+For efficiency, the expression calculates the source location of the wavefront as a
 whole. This can be done using the ``DW_OP_LLVM_select_bit_piece`` (see
 :ref:`amdgpu-dwarf-operation-to-create-vector-composite-location-descriptions`)
 operation.
@@ -564,7 +564,7 @@ information entry to indicate that there is additional target architecture
 specific information in the debugging information entries of that compilation
 unit. This allows a consumer to know what extensions are present in the debugger
 information entries as is possible with the augmentation string of other
-sections. See .
+sections.
 
 The format that should be used for an augmentation string is also recommended.
 This allows a consumer to parse the string when it contains information from
@@ -581,24 +581,24 @@ See :ref:`amdgpu-dwarf-full-and-partial-compilation-unit-entries`,
 
 AMDGPU supports programming languages that include online compilation where the
 source text may be created at runtime. For example, the OpenCL and HIP language
-runtimes support online compilation. To support is, a way to embed the source
+runtimes support online compilation. To support this, a way to embed the source
 text in the debug information is provided.
 
 See :ref:`amdgpu-dwarf-line-number-information`.
 
 2.17 Allow MD5 Checksums to be Optionally Present
 -------------------------------------------------
 
-In DWARF Version 5 the file timestamp and file size can be optional, but if the
-MD5 checksum is present it must be valid for all files. This is a problem if
+In DWARF Version 5, the file timestamp and file size can be optional, but if the
+MD5 checksum is present, it must be valid for all files. This is a problem if
 using link time optimization to combine compilation units where some have MD5
-checksums and some do not. Therefore, sSupport to allow MD5 checksums to be
-optionally present in the line table is added.
+checksums, and others do not. Therefore, the line table is extended to allow MD5
+checksums to be optional.
 
 See :ref:`amdgpu-dwarf-line-number-information`.
 
-2.18 Add the HIP Programing Language
-------------------------------------
+2.18 Add the HIP Programming Language
+-------------------------------------
 
 The HIP programming language [:ref:`HIP <amdgpu-dwarf-HIP>`], which is supported
 by the AMDGPU, is added.
@@ -617,7 +617,7 @@ hardware to allow a single instruction to execute multiple iterations using
 vector registers.
 
 Note that although this is similar to SIMT execution, the way a client debugger
-uses the information is fundamentally different. In SIMT execution the debugger
+uses the information is fundamentally different. In SIMT execution, the debugger
 needs to present the concurrent execution as distinct source language threads
 that the user can list and switch focus between. With iteration concurrency
 optimizations, such as software pipelining and vectorized SIMD, the debugger
@@ -648,7 +648,7 @@ language loop iterations are executing concurrently. See
 It is common in SIMD vectorization for the compiler to generate code that
 promotes portions of an array into vector registers. For example, if the
 hardware has vector registers with 8 elements, and 8 wide SIMD instructions, the
-compiler may vectorize a loop so that is executes 8 iterations concurrently for
+compiler may vectorize a loop so that it executes 8 iterations concurrently for
 each vectorized loop iteration.
 
 On the first iteration of the generated vectorized loop, iterations 0 to 7 of
@@ -691,7 +691,7 @@ Inside the loop body, the machine code loads ``src[i]`` and ``dst[i]`` into
 registers, adds them, and stores the result back into ``dst[i]``.
 
 Considering the location of ``dst`` and ``src`` in the loop body, the elements
-``dst[i]`` and ``src[i]`` would be located in registers, all other elements are
+``dst[i]`` and ``src[i]`` would be located in registers; all other elements are
 located in memory. Let register ``R0`` contain the base address of ``dst``,
 register ``R1`` contain ``i``, and register ``R2`` contain the registerized
 ``dst[i]`` element. We can describe the location of ``dst`` as a memory location
@@ -722,7 +722,7 @@ with a register location overlaid at a runtime offset involving ``i``:
 ----------------------------------------------
 
 AMDGPU supports languages, such as OpenCL, that define source language memory
-spaces. Support is added to define language specific memory spaces so they can
+spaces. Support is added to define language-specific memory spaces so they can
 be used in a consistent way by consumers. See :ref:`amdgpu-dwarf-memory-spaces`.
 
 A new attribute ``DW_AT_LLVM_memory_space`` is added to support using memory
@@ -738,9 +738,9 @@ accommodates only 32 unique operations. In practice, the lack of a central
 registry and a desire for backwards compatibility means vendor extensions are
 never retired, even when standard versions are accepted into DWARF proper. This
 has produced a situation where the effective encoding space available for new
-vendor extensions is miniscule today.
+vendor extensions is minuscule today.
 
-To expand this encoding space a new DWARF operation ``DW_OP_LLVM_user`` is
+To expand this encoding space, a new DWARF operation ``DW_OP_LLVM_user`` is
 added which acts as a "prefix" for vendor extensions. It is followed by a
 ULEB128 encoded vendor extension opcode, which is then followed by the operands
 of the corresponding vendor extension operation.
@@ -776,7 +776,7 @@ A. Changes Relative to DWARF Version 5
  .. note::
 
  Notes are included to describe how the changes are to be applied to the
- DWARF Version 5 standard. They also describe rational and issues that may
+ DWARF Version 5 standard. They also describe rationale and issues that may
  need further consideration.
 
 A.2 General Description
@@ -898,7 +898,7 @@ elements that can be specified are:
 
 *A current lane*
 
- The 0 based SIMT lane identifier to be used in evaluating a user presented
+ The 0-based SIMT lane identifier to be used in evaluating a user presented
  expression. This applies to source languages that are implemented for a target
  architecture using a SIMT execution model. These implementations map source
  language threads of execution to lanes of the target architecture threads.
@@ -917,7 +917,7 @@ elements that can be specified are:
 
 *A current iteration*
 
- The 0 based source language iteration instance to be used in evaluating a user
+ The 0-based source language iteration instance to be used in evaluating a user
  presented expression. This applies to target architectures that support
  optimizations that result in executing multiple source language loop iterations
  concurrently.
@@ -1845,7 +1845,7 @@ There are these special value operations currently defined:
  interpreted as a value of T. If a conversion is wanted it can be done
  explicitly using a ``DW_OP_convert`` operation.
 
- GDB has a per register hook that allows a target specific conversion on a
+ GDB has a per register hook that allows a target-specific conversion on a
  register by register basis. It defaults to truncation of bigger registers.
  Removing use of the target hook does not cause any test failures in common
  architectures. If the compiler for a target architecture did want some
@@ -1855,7 +1855,7 @@ There are these special value operations currently defined:
  If T is a larger type than the register size, then the default GDB
  register hook reads bytes from the next register (or reads out of bounds
  for the last register!). Removing use of the target hook does not cause
- any test failures in common architectures (except an illegal hand written
+ any test failures in common architectures (except an illegal hand-written
  assembly test). If a target architecture requires this behavior, these
  extensions allow a composite location description to be used to combine
  multiple registers.
@@ -2283,7 +2283,7 @@ bit offset equal to V scaled by 8 (the byte size).
  The implicit conversion could also be defined as target architecture specific.
  For example, GDB checks if V is an integral type. If it is not it gives an
  error. Otherwise, GDB zero-extends V to 64 bits. If the GDB target defines a
- hook function, then it is called. The target specific hook function can modify
+ hook function, then it is called. The target-specific hook function can modify
  the 64-bit value, possibly sign extending based on the original value type.
  Finally, GDB treats the 64-bit value V as a memory location address.
 