
Commit 186779c

Joel Fernandes authored and neeraju committed
rcu: Document separation of rcu_state and rnp's gp_seq
The details of this are subtle and were discussed recently. Add a
quick-quiz about this and refer to it from the code, for more clarity.

Reviewed-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Joel Fernandes <[email protected]>
Signed-off-by: Neeraj Upadhyay (AMD) <[email protected]>
1 parent 30a7806 commit 186779c

2 files changed: 37 additions, 0 deletions

Documentation/RCU/Design/Data-Structures/Data-Structures.rst

Lines changed: 33 additions & 0 deletions
@@ -286,6 +286,39 @@ in order to detect the beginnings and ends of grace periods in a
 distributed fashion. The values flow from ``rcu_state`` to ``rcu_node``
 (down the tree from the root to the leaves) to ``rcu_data``.

++-----------------------------------------------------------------------+
+| **Quick Quiz**:                                                       |
++-----------------------------------------------------------------------+
+| Given that the root rcu_node structure has a gp_seq field,            |
+| why does RCU maintain a separate gp_seq in the rcu_state structure?   |
+| Why not just use the root rcu_node's gp_seq as the official record    |
+| and update it directly when starting a new grace period?              |
++-----------------------------------------------------------------------+
+| **Answer**:                                                           |
++-----------------------------------------------------------------------+
+| On single-node RCU trees (where the root node is also a leaf),        |
+| updating the root node's gp_seq immediately would create unnecessary  |
+| lock contention. Here's why:                                          |
+|                                                                       |
+| If we did rcu_seq_start() directly on the root node's gp_seq:         |
+|                                                                       |
+| 1. All CPUs would immediately see that their node's gp_seq differs    |
+|    from their rdp's gp_seq in rcu_pending(), and would all then       |
+|    invoke the RCU core.                                               |
+| 2. The RCU core calls note_gp_changes(), which tries to acquire       |
+|    the node lock.                                                     |
+| 3. But rnp->qsmask isn't initialized yet (that happens later in       |
+|    rcu_gp_init()).                                                    |
+| 4. So each CPU would acquire the lock, find that it can't determine   |
+|    whether it needs to report a quiescent state (no qsmask), update   |
+|    rdp->gp_seq, and release the lock.                                 |
+| 5. Result: lots of lock acquisitions with no grace period progress.   |
+|                                                                       |
+| By having a separate rcu_state.gp_seq, we can increment the official  |
+| grace period counter without immediately affecting what CPUs see in   |
+| their nodes. The hierarchical propagation in rcu_gp_init() then       |
+| updates the root node's gp_seq and qsmask together under the same     |
+| lock acquisition, avoiding this useless contention.                   |
++-----------------------------------------------------------------------+
+
 Miscellaneous
 '''''''''''''
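To make the quiz's ordering argument concrete, here is a minimal, self-contained C sketch. It is an illustration only, not kernel code: struct rstate, struct rnp, seq_start(), and gp_init() are simplified stand-ins for rcu_state, rcu_node, rcu_seq_start(), and rcu_gp_init(), and a pthread mutex stands in for the node's raw spinlock.

#include <pthread.h>

struct rnp {                            /* stand-in for struct rcu_node */
        pthread_mutex_t lock;
        unsigned long gp_seq;           /* lags rstate.gp_seq until propagated */
        unsigned long qsmask;           /* CPUs that still owe a QS report */
        unsigned long qsmaskinit;       /* template for qsmask at GP start */
};

struct rstate {                         /* stand-in for struct rcu_state */
        unsigned long gp_seq;           /* the "official" grace-period counter */
};

/* Advance the official counter, in the spirit of rcu_seq_start(): the
 * low bits leave "idle" and enter "grace period in progress". */
static void seq_start(unsigned long *sp)
{
        *sp += 1;
}

/* The point of the quick quiz: the official counter moves first, and each
 * node's gp_seq and qsmask are then published *together*, under a single
 * lock acquisition per node, so a CPU that observes the new rnp->gp_seq
 * also observes a valid qsmask. */
static void gp_init(struct rstate *rs, struct rnp *nodes, int n)
{
        seq_start(&rs->gp_seq);         /* rnp->gp_seq now lags rs->gp_seq */

        for (int i = 0; i < n; i++) {   /* breadth-first over the real tree */
                pthread_mutex_lock(&nodes[i].lock);
                nodes[i].qsmask = nodes[i].qsmaskinit;
                nodes[i].gp_seq = rs->gp_seq;
                pthread_mutex_unlock(&nodes[i].lock);
        }
}

Had seq_start() been applied to the root node's gp_seq directly, CPUs could observe the new counter before qsmask is set, which is exactly the useless-contention scenario the answer describes.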

kernel/rcu/tree.c

Lines changed: 4 additions & 0 deletions
@@ -1845,6 +1845,10 @@ static noinline_for_stack bool rcu_gp_init(void)
  * use-after-free errors. For a detailed explanation of this race, see
  * Documentation/RCU/Design/Requirements/Requirements.rst in the
  * "Hotplug CPU" section.
+ *
+ * Also note that the root rnp's gp_seq is kept separate from, and lags,
+ * the rcu_state's gp_seq, for a reason. See the Quick-Quiz on
+ * Single-node systems for more details (in Data-Structures.rst).
  */
  rcu_seq_start(&rcu_state.gp_seq);
  /* Ensure that rcu_seq_done_exact() guardband doesn't give false positives. */
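As a companion to the new comment, the following standalone C model shows the gp_seq counter convention the "lags" wording relies on. It is a sketch under stated assumptions, not kernel code: SEQ_CTR_SHIFT, seq_ctr(), and seq_state() locally mirror the kernel's RCU_SEQ_CTR_SHIFT, rcu_seq_ctr(), and rcu_seq_state() helpers so that the example compiles on its own.

#include <stdio.h>

#define SEQ_CTR_SHIFT  2                        /* low bits hold the GP phase */
#define SEQ_STATE_MASK ((1UL << SEQ_CTR_SHIFT) - 1)

static unsigned long seq_ctr(unsigned long s)   { return s >> SEQ_CTR_SHIFT; }
static unsigned long seq_state(unsigned long s) { return s & SEQ_STATE_MASK; }

int main(void)
{
        unsigned long state_gp_seq = 1UL << SEQ_CTR_SHIFT;  /* one GP done, idle */
        unsigned long root_gp_seq  = state_gp_seq;          /* root rnp in sync */

        state_gp_seq += 1;      /* like rcu_seq_start(): phase becomes nonzero */

        /* The root rnp still holds the old value: it lags the official
         * record until rcu_gp_init() copies it forward under the node lock. */
        printf("state: ctr=%lu phase=%lu | root: ctr=%lu phase=%lu\n",
               seq_ctr(state_gp_seq), seq_state(state_gp_seq),
               seq_ctr(root_gp_seq), seq_state(root_gp_seq));

        root_gp_seq = state_gp_seq;     /* propagation, as in rcu_gp_init() */
        return 0;
}

Printing before the copy shows the lag the comment refers to: the official counter's phase bits have advanced while the root node's have not.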
