BoostToStdCoroutineSwitchPlan.md
Lines changed: 56 additions & 53 deletions
Original file line number
Diff line number
Diff line change
@@ -34,8 +34,6 @@ This document describes the plan for migrating rippled's coroutine implementatio
34
34
35
35
Coroutines in rippled are used to handle long-running RPC requests — such as pathfinding — without blocking server threads. When a request needs to wait for an external event, the coroutine **suspends** (freeing the thread for other work) and **resumes** later when the event completes.
36
36
37
-
**In simple terms:** Think of a restaurant kitchen. The current system (Boost.Coroutine) is like giving each order its own dedicated chef who stands idle while waiting for the oven — expensive in staff. The new system (C++20 coroutines) is like giving each order a ticket: the chef puts the ticket aside when waiting for the oven, picks up another order, and returns to the first ticket when the oven beeps. Same kitchen, far fewer idle chefs.
|**Context switch cost**|~19 cycles / 9 ns with fcontext; ~1,130 cycles / 547 ns with ucontext (ASAN/TSAN builds) |~20-50 CPU cycles (function call) |
151
+
|**Allocation**| Stack allocated at creation | Heap allocation (compiler may elide) |
152
+
|**Cache behavior**| Poor (large stack rarely fully used) | Good (small frame, hot data close) |
153
+
|**Compiler optimization**| Opaque to compiler | Inlinable, optimizable |
156
154
157
155
### 4.4 Feature Parity Analysis
158
156
@@ -233,9 +231,17 @@ However, these types are small, well-understood, and have extensive reference im
233
231
234
232
**Claim**: If coroutine A `co_await`s coroutine B, and B completes synchronously, B's `final_suspend` resumes A on the same stack, potentially building up unbounded stack depth.
235
233
236
-
**Analysis**: This is addressed by **symmetric transfer** via `FinalAwaiter::await_suspend()` returning a `coroutine_handle<>` instead of `void`. The compiler transforms this into a tail-call, preventing stack growth. This is the standard solution used by all major coroutine libraries and is implemented in our `FinalAwaiter` design (Section 7.1).
234
+
**Why this is a real problem without symmetric transfer**: When `await_suspend()` returns `void`, the coroutine unconditionally suspends and returns from `.resume()`. If the awaited coroutine completes synchronously and calls `.resume()` on the awaiter, each such call adds a stack frame. In a loop that repeatedly `co_await`s short-lived coroutines (e.g., a generator producing millions of values), the stack grows with each iteration until it overflows — typically after ~1M iterations.
235
+
236
+
**How symmetric transfer solves it**: When `await_suspend()` returns a `coroutine_handle<>` instead of `void`, the compiler pops the current coroutine's activation frame from the stack _before_ jumping to the returned handle (the heap-allocated coroutine frame itself is untouched). This is effectively a tail call: `resume()` becomes a `jmp` instead of a `call`, so each chained resumption consumes **zero additional stack space**.
237
+
238
+
The symmetric transfer proposal for C++20 (P0913R0) specifies: _"Implementations shall not impose any limits on how many coroutines can be resumed in this fashion."_ This effectively requires compilers to implement tail-call-like behavior, since any finite stack would otherwise impose a limit.
237
239
238
-
**Verdict**: Solved by symmetric transfer (already in our design).
240
+
**Returning `std::noop_coroutine()`** from `await_suspend()` signals "suspend and return to caller" without resuming another coroutine, serving the role that `void` return used to play.
241
+
242
+
**Applicability to rippled**: rippled does not chain coroutines (coroutine A awaiting coroutine B). The `co_await` points in rippled await `JobQueueAwaiter` (reschedules on the thread pool) and `yieldAndPost()` (suspend + re-post), both of which always suspend asynchronously. However, symmetric transfer is still implemented in our `FinalAwaiter` ([Section 7.1](#71-new-type-design)) as a best practice — it costs nothing and prevents stack overflow if the usage pattern ever changes.
243
+
244
+
**Verdict**: Real concern for coroutine chains, but does not affect rippled's current usage. Solved by symmetric transfer in our design regardless.
239
245
240
246
#### **Concern 5: Dangling Reference Risk**
241
247
@@ -271,7 +277,7 @@ However, these types are small, well-understood, and have extensive reference im
271
277
272
278
**Verdict**: Separate system. Out of scope for this migration.
273
279
274
-
**Consequence — `Boost::context` dependency is retained**: Because `boost::asio::spawn` depends on `Boost.Context` for its stackful fiber implementation, the `Boost::context` library **cannot be removed** as part of this migration. The CMake cleanup (Phase 4) replaces `Boost::coroutine` with `Boost::context` — it does not eliminate the Boost fiber dependency entirely.
280
+
**Consequence — `Boost::context` dependency is retained**: Because `boost::asio::spawn` depends on `Boost.Context` for its stackful fiber implementation, the `Boost::context` library **cannot be removed** as part of this migration. The CMake cleanup ([Phase 4](#phase-4-cleanup)) replaces `Boost::coroutine` with `Boost::context` — it does not eliminate the Boost fiber dependency entirely.
275
281
276
282
Additionally, when running under ASAN or TSAN, `Boost.Context` must be built with the `ucontext` backend (not the default `fcontext`) so that it emits `__sanitizer_start_switch_fiber` / `__sanitizer_finish_switch_fiber` annotations during fiber context switches. Without these annotations, the sanitizers cannot track memory ownership across fiber stack switches and will report false positives (stack-use-after-scope under ASAN, data races under TSAN) for the `boost::asio::spawn` call sites listed above. This requires:
P1B["Create JobQueueAwaiter<br/>(schedules resume on JobQueue)"]
586
590
P1C["Add postCoroTask() to JobQueue<br/>(parallel to postCoro)"]
587
591
P1D["Unit tests for new primitives"]
588
592
P1A --> P1B --> P1C --> P1D
589
593
end
590
594
591
-
subgraph "Phase 2: Entry Point Migration"
595
+
subgraph PH2 ["Phase 2: Entry Point Migration"]
592
596
P2A["Migrate ServerHandler::onRequest()"]
593
597
P2B["Migrate ServerHandler::onWSMessage()"]
594
598
P2C["Migrate GRPCServer::CallData::process()"]
@@ -598,32 +602,31 @@ graph TD
598
602
P2C --> P2D
599
603
end
600
604
601
-
subgraph "Phase 3: Handler Migration"
605
+
subgraph PH3 ["Phase 3: Handler Migration"]
602
606
P3A["Migrate RipplePathFind handler"]
603
607
P3B["Verify all other handlers<br/>(no active yield usage)"]
604
608
end
605
609
606
-
subgraph "Phase 4: Cleanup"
610
+
subgraph PH4 ["Phase 4: Cleanup"]
607
611
P4A["Remove old Coro class"]
608
612
P4B["Remove Boost.Coroutine from CMake"]
609
613
P4C["Remove deprecation warning suppression"]
610
614
P4D["Final benchmarks & validation"]
615
+
P4A --> P4B --> P4C --> P4D
611
616
end
612
617
613
-
P1D --> P2A
614
-
P2D --> P3A
615
-
P3B --> P4A
616
-
P3A --> P4A
617
-
P4A --> P4B --> P4C --> P4D
618
+
PH1 --> PH2
619
+
PH2 --> PH3
620
+
PH3 --> PH4
618
621
```
619
622
620
623
**Reading the diagram:**
621
624
622
-
- **Phase 1** builds the new coroutine primitives (`CoroTask`, `JobQueueAwaiter`, `postCoroTask()`) alongside the existing Boost code. No production code changes.
623
-
- **Phase 2** migrates the three entry points (HTTP, WebSocket, gRPC) to use `postCoroTask()` and updates `RPC::Context`.
624
-
- **Phase 3** migrates the `RipplePathFind` handler and verifies no other handlers use `yield()`.
625
-
- **Phase 4** removes the old `Coro` class, `Coro.ipp`, `Boost::coroutine` from CMake, and runs final benchmarks.
626
-
- Each phase depends on the previous one completing. The old code is not deleted until Phase 4, so rollback is safe through Phases 1–3.
625
+
- **[Phase 1](#phase-1-new-coroutine-primitives)** builds the new coroutine primitives (`CoroTask`, `JobQueueAwaiter`, `postCoroTask()`) alongside the existing Boost code. No production code changes.
626
+
- **[Phase 2](#phase-2-entry-point-migration)** migrates the three entry points (HTTP, WebSocket, gRPC) to use `postCoroTask()` and updates `RPC::Context`.
627
+
- **[Phase 3](#phase-3-handler-migration)** migrates the `RipplePathFind` handler and verifies no other handlers use `yield()`.
628
+
- **[Phase 4](#phase-4-cleanup)** removes the old `Coro` class, `Coro.ipp`, `Boost::coroutine` from CMake, and runs final benchmarks.
629
+
- Each phase depends on the previous one completing. The old code is not deleted until [Phase 4](#phase-4-cleanup), so rollback is safe through Phases 1–3.
|**Performance regression** in context switching | Low | High | Benchmark before/after; C++20 should be faster |
1069
+
|**Coroutine frame lifetime bugs** (use-after-destroy) | Medium | High | ASAN testing, RAII wrapper for handle, code review |
1070
+
|**Data races on resume**| Medium | High | TSAN testing, careful await_suspend() implementation |
1071
+
|**LocalValue corruption** across threads | Low | High | Dedicated test with 4+ concurrent coroutines |
1072
+
|**Shutdown race conditions**| Medium | Medium | Replicate existing mutex/cv pattern in new design |
1073
+
|**Missed coroutine consumer** during migration | Low | Medium | Exhaustive grep audit ([Section 5.4](#54-all-coroutine-touchpoints) is complete) |
1074
+
|**Compiler bugs** in coroutine codegen | Low | Medium | Test on all three compilers (GCC, Clang, MSVC) |
1075
+
|**Exception loss** across suspension points | Medium | Medium | Test exception propagation in every phase |
1076
+
|**Third-party code depending on Boost.Coroutine**| Very Low | Low | Grep confirms only internal usage |
1077
+
|**Dangling references in coroutine frames**| Medium | High | ASAN testing, avoid reference params in coroutine functions, use shared_ptr |
1078
+
|**Colored function infection spreading**| Low | Medium | Only 4 call sites need co_await; no nested handlers suspend |
1079
+
|**Symmetric transfer not available**| Very Low | High | All target compilers (GCC 12+, Clang 16+) support symmetric transfer |
1080
+
|**Future handler adding deep yield**| Low | Medium | Code review + CI: static analysis flags any yield from nested depth |
1078
1081
1079
1082
### 9.2 Rollback Strategy
1080
1083
@@ -1101,7 +1104,7 @@ graph TD
1101
1104
P4 --> PREVENT
1102
1105
```
1103
1106
1104
-
**Key principle**: Old `Coro` class and `postCoro()` remain in the codebase through Phases 1-3. They are only removed in Phase 4, after all migration is validated. Each phase is independently revertible via `git revert`.
1107
+
**Key principle**: Old `Coro` class and `postCoro()` remain in the codebase through Phases 1-3. They are only removed in [Phase 4](#phase-4-cleanup), after all migration is validated. Each phase is independently revertible via `git revert`.
1105
1108
1106
1109
### 9.3 Specific Risk: Stackful → Stackless Limitation
1107
1110
@@ -1496,7 +1499,7 @@ ASAN annotations (`__sanitizer_start_switch_fiber` / `__sanitizer_finish_switch_
1496
1499
The migration only removes `Boost::coroutine`. rippled's production server code (`BaseHTTPPeer.h`) and test infrastructure (`yield_to.h`) still use `boost::asio::spawn`, which depends on `Boost.Context` for stackful fiber execution. Migrating those call sites to `boost::asio::co_spawn` / `boost::asio::awaitable` is a separate initiative. See [Concern 6](#concern-6-yield_toh--boostasiospawn) for details.
1497
1500
1498
1501
**Can C++20 stackless coroutines yield from deeply nested function calls?**
1499
-
No — `co_await` can only appear in the immediate coroutine function body. However, an exhaustive audit confirmed that all `yield()` calls in rippled are at the top level of their lambda or handler function. No deep nesting exists. See Section 4.6, Concern 1.
1502
+
No — `co_await` can only appear in the immediate coroutine function body. However, an exhaustive audit confirmed that all `yield()` calls in rippled are at the top level of their lambda or handler function. No deep nesting exists. See [Section 4.6, Concern 1](#concern-1-cannot-suspend-from-nested-call-stacks).
1500
1503
1501
1504
**Why was `RipplePathFind` not migrated to use `co_await` as the plan originally proposed?**
1502
1505
During implementation, it was simpler and more robust to replace the coroutine-based yield/post pattern with a `std::condition_variable` synchronous wait. Since `RipplePathFind` is the only handler that suspends, and it already runs on a JobQueue worker thread, blocking that thread for up to 30 seconds is acceptable. This eliminates coroutine complexity from the handler entirely.
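The synchronous-wait pattern described in this answer might look roughly like the sketch below. It is not the actual `RipplePathFind` code: `CompletionSignal` and `waitForBackgroundWork` are hypothetical names, and the timeout is a parameter rather than the 30 seconds mentioned above.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: one thread signals completion, another blocks on it.
class CompletionSignal {
public:
    void notifyDone() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
    }

    // Blocks until notifyDone() has been called or the timeout expires.
    // Returns true if the work completed, false on timeout.
    bool waitFor(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate guards against spurious wakeups and missed notifications.
        return cv_.wait_for(lock, timeout, [this] { return done_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool done_ = false;
};

// Simulates a handler that starts background work and blocks its own thread
// until that work finishes or the timeout elapses.
bool waitForBackgroundWork(std::chrono::milliseconds workTime,
                           std::chrono::milliseconds timeout) {
    CompletionSignal signal;
    std::thread worker([&signal, workTime] {
        std::this_thread::sleep_for(workTime);  // stand-in for the real work
        signal.notifyDone();
    });
    bool completed = signal.waitFor(timeout);
    worker.join();  // `signal` must outlive the worker
    return completed;
}
```

The trade-off is the one stated above: the calling thread stays blocked for the duration of the wait, which is acceptable only because a single handler uses this pattern and it already runs on a worker thread.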