@@ -9,14 +9,14 @@ Introduction
99============
1010
1111Coroutines in C++ were introduced in C++20, and the user experience for
12- debugging them can still be challenging. This document guides you how to most
12+ debugging them can still be challenging. This document guides you on how to most
1313efficiently debug coroutines and how to navigate existing shortcomings in
1414debuggers and compilers.
1515
1616Coroutines are generally used either as generators or for asynchronous
1717programming. In this document, we will discuss both use cases. Even if you are
1818using coroutines for asynchronous programming, you should still read the
19- generators section, as it will introduce foundational debugging techniques also
19+ generators section, as it introduces foundational debugging techniques also
2020applicable to the debugging of asynchronous programs.
2121
2222Both compilers (clang, gcc, ...) and debuggers (lldb, gdb, ...) are
@@ -34,15 +34,15 @@ scripting. This guide comes with a basic GDB script for coroutine debugging.
3434This guide will first showcase the more polished, bleeding-edge experience, but
3535will also show you how to debug coroutines with older toolchains. In general,
3636the older your toolchain, the deeper you will have to dive into the
37- implementation details of coroutines (such as their ABI). The further down in
38- this document you go , the more low-level, technical the content will become. If
37+ implementation details of coroutines (such as their ABI). The further down you go in
38+ this document, the more low-level, technical the content will become. If
3939you are on an up-to-date toolchain, you will hopefully be able to stop reading
4040earlier.
4141
4242Debugging generators
4343====================
4444
45- One of the two major use cases for coroutines in C++ are generators, i.e.,
45+ One of the two major use cases for coroutines in C++ is generators, i.e.,
4646functions which can produce values via ``co_yield ``. Values are produced
4747lazily, on-demand. For this purpose, every time a new value is requested, the
4848coroutine gets resumed. As soon as it reaches a ``co_yield `` and thereby
@@ -141,7 +141,7 @@ a regular function.
141141
142142Note the two additional variables ``__promise `` and ``__coro_frame ``. Those
143143show the internal state of the coroutine. They are not relevant for our
144- generator example, but will be relevant for asynchronous programming described
144+ generator example but will be relevant for asynchronous programming described
145145in the next section.
146146
147147Stepping out of a coroutine
@@ -174,7 +174,7 @@ Inspecting a suspended coroutine
174174--------------------------------
175175
176176The ``print10Elements `` function receives an opaque ``generator `` type. Let's
177- assume we are suspended at the ``++gen; `` line, and want to inspect the
177+ assume we are suspended at the ``++gen; `` line and want to inspect the
178178generator and its internal state.
179179
180180To do so, we can simply look into the ``gen.hdl `` variable. LLDB comes with a
@@ -188,7 +188,7 @@ We can see two function pointers ``resume`` and ``destroy``. These pointers
188188point to the resume / destroy functions. By inspecting those function pointers,
189189we can see that our ``generator `` is actually backed by our ``fibonacci ``
190190coroutine. When using VS Code + lldb-dap, you can Cmd+Click on the function
191- address (``0x555... `` in the screenshot) to directly jump to the function
191+ address (``0x555... `` in the screenshot) to jump directly to the function
192192definition backing your coroutine handle.
193193
194194Next, we see the ``promise ``. In our case, this reveals the current value of
@@ -247,12 +247,12 @@ the line number of the current suspension point in the promise:
247247 };
248248
249249This stores the return address of ``await_suspend `` within the promise.
250- Thereby, we can read it back from the promise of a suspended coroutine, and map
250+ Thereby, we can read it back from the promise of a suspended coroutine and map
251251it to an exact source code location. For a complete example, see the ``task ``
252252type used below for asynchronous programming.
253253
254254Alternatively, we can modify the C++ code to store the line number in the
255- promise type. We can use a ``std::source_location `` to get the line number of
255+ promise type. We can use ``std::source_location `` to get the line number of
256256the await and store it inside the ``promise_type ``. In the debugger, we can
257257then read the line number from the promise of the suspended coroutine.
258258
@@ -270,7 +270,7 @@ then read the line number from the promise of the suspended coroutine.
270270 };
271271
272272The downside of both approaches is that they come at the price of additional
273- runtime cost. In particular the second approach increases binary size, since it
273+ runtime cost. In particular, the second approach increases binary size, since it
274274requires additional ``std::source_location `` objects, and those source
275275locations are not stripped by split-dwarf. Whether the first approach is worth
276276the additional runtime cost is a trade-off you need to make yourself.
@@ -285,7 +285,7 @@ provide custom debugging support, so in addition to this guide, you might want
285285to check out their documentation.
286286
287287When using coroutines for asynchronous programming, your library usually
288- provides you some ``task `` type. This type usually looks similar to this:
288+ provides you with some ``task `` type. This type usually looks similar to this:
289289
290290.. code-block :: c++
291291
@@ -479,7 +479,7 @@ One such solution is to store the list of in-flight coroutines in a collection:
479479 };
480480
481481With this in place, it is possible to inspect ``inflight_coroutines `` from the
482- debugger, and rely on LLDB's ``std::coroutine_handle `` pretty-printer to
482+ debugger and rely on LLDB's ``std::coroutine_handle `` pretty-printer to
483483inspect the coroutines.
484484
485485This technique will track *all* coroutines, including the ones which are currently
@@ -498,8 +498,8 @@ LLDB before 21.0 did not yet show the ``__coro_frame`` inside
498498``coroutine_handle ``. To inspect the coroutine frame, you had to use the
499499approach described in the :ref: `devirtualization ` section.
500500
501- LLDB before 18.0 was hiding the ``__promise `` and ``__coro_frame ``
502- variable by default. The variables are still present, but they need to be
501+ LLDB before 18.0 hid the ``__promise `` and ``__coro_frame ``
502+ variables by default. The variables are still present, but they need to be
503503explicitly added to the "watch" pane in VS Code or requested via
504504``print __promise `` and ``print __coro_frame `` from the debugger console.
505505
@@ -511,9 +511,9 @@ section.
511511Toolchain Implementation Details
512512================================
513513
514- This section covers the ABI, as well as additional compiler-specific behavior.
514+ This section covers the ABI as well as additional compiler-specific behavior.
515515The ABI is followed by all compilers, on all major systems, including Windows,
516- Linux and macOS. Different compilers emit different debug information, though.
516+ Linux, and macOS. Different compilers emit different debug information, though.
517517
518518Ramp, resume and destroy functions
519519----------------------------------
@@ -595,7 +595,7 @@ functions as their first two members. As such, we can read the function
595595pointers from the coroutine frame and then obtain the function's name from its
596596address.
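
The layout described above can be sketched in plain C++. This is a minimal illustration, not the compiler's actual frame type: ``frame_header``, ``fake_frame``, ``example_promise``, ``read_header`` and ``promise_from_frame`` are hypothetical names, and the sketch assumes that the promise needs no extra alignment padding, so it directly follows the two function pointers (16 bytes on a typical 64-bit target).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Signature used for the resume / destroy functions in this sketch.
using coro_fn = void (*)(void *);

// Hypothetical mirror of the first two members of a coroutine frame.
struct frame_header {
  coro_fn resume;
  coro_fn destroy;
};

// Hypothetical promise type, standing in for a real `promise_type`.
struct example_promise {
  int current_value;
};

// Stand-in for a compiler-generated frame: header first, then the promise.
// ASSUMPTION: no padding is needed between the header and the promise.
struct fake_frame {
  frame_header header;
  example_promise promise;
};

// Copies the two function pointers out of a raw frame address,
// similar to what a debugger script does when it reads the frame.
frame_header read_header(const void *frame_address) {
  frame_header header;
  std::memcpy(&header, frame_address, sizeof(header));
  return header;
}

// Computes the promise address from a raw frame address by skipping
// the two function pointers.
template <typename Promise>
Promise *promise_from_frame(void *frame_address) {
  return reinterpret_cast<Promise *>(static_cast<char *>(frame_address) +
                                     sizeof(frame_header));
}

// Placeholder functions so that the header holds recognizable addresses.
void fake_resume(void *) {}
void fake_destroy(void *) {}
```

Given a ``fake_frame`` initialized with ``fake_resume`` and ``fake_destroy``, ``read_header`` returns exactly those two pointers. A debugger would then map such addresses back to function names via the symbol table.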
597597
598- The promise is guaranteed to be at a 16 byte offset from the coroutine frame.
598+ The promise is guaranteed to be at a 16-byte offset from the coroutine frame.
599599If we have a coroutine handle at address 0x416eb0, we can hence reinterpret-cast
600600the promise as follows:
601601
@@ -607,8 +607,8 @@ Implementation in clang / LLVM
607607------------------------------
608608
609609The C++ Coroutines feature in the Clang compiler is implemented in two parts of
610- the compiler. Semantic analysis is performed in Clang, and Coroutine
611- construction and optimization takes place in the LLVM middle-end.
610+ the compiler. Semantic analysis is performed in Clang, and coroutine
611+ construction and optimization take place in the LLVM middle-end.
612612
613613For each coroutine function, the frontend generates a single corresponding
614614LLVM-IR function. This function uses special ``llvm.coro.suspend `` intrinsics
@@ -622,7 +622,7 @@ points into the coroutine frame. Most of the heavy lifting to preserve debugging
622622information is done in this pass. This pass needs to rewrite all variable
623623locations to point into the coroutine frame.
624624
625- Afterwards, a couple of additional optimizations are applied, before code
625+ Afterwards, a couple of additional optimizations are applied before code
626626gets emitted, but none of them are really interesting regarding debugging
627627information.
628628
@@ -636,8 +636,8 @@ However, this is not possible for coroutine frames because the frames are
636636constructed in the LLVM middle-end.
637637
638638To mitigate this problem, the LLVM middle end attempts to generate some debug
639- information, which is unfortunately incomplete, since much of the language
640- specific information is missing in the middle end.
639+ information, which is unfortunately incomplete, since much of the
640+ language-specific information is missing in the middle end.
641641
642642.. _devirtualization :
643643
@@ -655,7 +655,7 @@ There are two possible approaches to do so:
655655 We can look up their types and thereby get the types of promise
656656 and coroutine frame.
657657
658- In gdb, one can use the following approach to devirtualize coroutine type,
658+ In gdb, one can use the following approach to devirtualize a coroutine type,
659659assuming we have a ``std::coroutine_handle`` at address 0x418eb0:
660660
661661::
@@ -679,18 +679,18 @@ LLDB comes with devirtualization support out of the box, as part of the
679679pretty-printer for ``std::coroutine_handle ``. Internally, this pretty-printer
680680uses the second approach. We look up the types in the destroy function and not
681681the resume function because the resume function pointer will be set to a
682- nullptr as soon as a coroutine reaches its final suspension point. If we used
682+ ``nullptr`` as soon as a coroutine reaches its final suspension point. If we used
683683the resume function, devirtualization would hence fail for all coroutines that
684684have reached their final suspension point.
685685
686686Interpreting the coroutine frame in optimized builds
687687----------------------------------------------------
688688
689689The ``__coro_frame `` variable usually refers to the coroutine frame of an
690- *in-flight * coroutine. This means, the coroutine is currently executing.
690+ *in-flight * coroutine. This means the coroutine is currently executing.
691691However, the compiler only guarantees the coroutine frame to be in a consistent
692692state while the coroutine is suspended. As such, the variables inside the
693- ``__coro_frame `` variable might be outdated, in particular when optimizations
693+ ``__coro_frame `` variable might be outdated, particularly when optimizations
694694are enabled.
695695
696696Furthermore, when optimizations are enabled, the compiler will lay out the
@@ -731,7 +731,7 @@ despite ``a`` being frequently incremented.
731731
732732While this might be surprising, this is a result of the optimizer recognizing
733733that it can eliminate most of the load/store operations.
734- The above code gets optimized to the equivalent of:
734+ The above code is optimized to the equivalent of:
735735
736736.. code-block :: c++
737737
@@ -1180,5 +1180,5 @@ The authors of the Folly libraries wrote a blog post series on how they debug co
11801180* `Async stack traces in folly: Improving debugging in the developer lifecycle <https://developers.facebook.com/blog/post/2021/10/21/async-stack-traces-folly-improving-debugging-developer-lifecycle/ >`_
11811181
11821182Besides some topics also covered here (stack traces from the debugger), Folly's blog post series also covers
1183- more additional topics, such as capturing async stack traces in performance profiles via eBPF filters
1183+ additional topics, such as capturing async stack traces in performance profiles via eBPF filters
11841184and printing async stack traces on crashes.