Commit 68cadad

Merge tag 'rcu.2023.08.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:

 - Documentation updates

 - Miscellaneous fixes, perhaps most notably simplifying
   SRCU_NOTIFIER_INIT() as suggested

 - RCU Tasks updates, most notably treating Tasks RCU callbacks as lazy
   while still treating synchronous grace periods as urgent. Also fixes
   one bug that restores the ability to apply debug-objects to RCU Tasks
   and another that fixes a race condition that could result in
   false-positive failures of the boot-time self-test code

 - RCU-scalability performance-test updates, most notably adding the
   ability to measure the RCU Tasks grace-period kthread's CPU
   consumption. This proved quite useful for the RCU Tasks work

 - Reference-acquisition/release performance-test updates, including a
   fix for an uninitialized wait_queue_head_t

 - Miscellaneous torture-test updates

 - Torture-test scripting updates, including removal of the
   no-longer-functional formal-verification scripts, test builds of
   individual RCU Tasks flavors, better diagnostics for loss of
   connectivity for distributed rcutorture tests, disabling of reboot
   loops in qemu/KVM-based rcutorture testing, and passing of init
   parameters to rcutorture's init program

* tag 'rcu.2023.08.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (64 commits)
  rcu: Use WRITE_ONCE() for assignments to ->next for rculist_nulls
  rcu: Make the rcu_nocb_poll boot parameter usable via boot config
  rcu: Mark __rcu_irq_enter_check_tick() ->rcu_urgent_qs load
  srcu,notifier: Remove #ifdefs in favor of SRCU Tiny srcu_usage
  rcutorture: Stop right-shifting torture_random() return values
  torture: Stop right-shifting torture_random() return values
  torture: Move stutter_wait() timeouts to hrtimers
  torture: Move torture_shuffle() timeouts to hrtimers
  torture: Move torture_onoff() timeouts to hrtimers
  torture: Make torture_hrtimeout_*() use TASK_IDLE
  torture: Add lock_torture writer_fifo module parameter
  torture: Add a kthread-creation callback to _torture_create_kthread()
  rcu-tasks: Fix boot-time RCU tasks debug-only deadlock
  rcu-tasks: Permit use of debug-objects with RCU Tasks flavors
  checkpatch: Complain about unexpected uses of RCU Tasks Trace
  torture: Cause mkinitrd.sh to indicate failure on compile errors
  torture: Make init program dump command-line arguments
  torture: Switch qemu from -nographic to -display none
  torture: Add init-program support for loongarch
  torture: Avoid torture-test reboot loops
  ...
2 parents: 727dbda + fe24a0b


78 files changed (+636, -1771 lines)

Documentation/RCU/lockdep-splat.rst

Lines changed: 1 addition & 1 deletion

@@ -10,7 +10,7 @@ misuses of the RCU API, most notably using one of the rcu_dereference()
 family to access an RCU-protected pointer without the proper protection.
 When such misuse is detected, an lockdep-RCU splat is emitted.
 
-The usual cause of a lockdep-RCU slat is someone accessing an
+The usual cause of a lockdep-RCU splat is someone accessing an
 RCU-protected data structure without either (1) being in the right kind of
 RCU read-side critical section or (2) holding the right update-side lock.
 This problem can therefore be serious: it might result in random memory

Documentation/RCU/rculist_nulls.rst

Lines changed: 27 additions & 11 deletions

@@ -18,19 +18,30 @@ to solve following problem.
 
 Without 'nulls', a typical RCU linked list managing objects which are
 allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can use the following
-algorithms:
+algorithms. Following examples assume 'obj' is a pointer to such
+objects, which is having below type.
+
+::
+
+  struct object {
+    struct hlist_node obj_node;
+    atomic_t refcnt;
+    unsigned int key;
+  };
 
 1) Lookup algorithm
 -------------------
 
 ::
 
   begin:
-  rcu_read_lock()
+  rcu_read_lock();
   obj = lockless_lookup(key);
   if (obj) {
-    if (!try_get_ref(obj)) // might fail for free objects
+    if (!try_get_ref(obj)) { // might fail for free objects
+      rcu_read_unlock();
       goto begin;
+    }
     /*
      * Because a writer could delete object, and a writer could
      * reuse these object before the RCU grace period, we
@@ -54,7 +65,7 @@ but a version with an additional memory barrier (smp_rmb())
     struct hlist_node *node, *next;
     for (pos = rcu_dereference((head)->first);
          pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
-         ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
+         ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
          pos = rcu_dereference(next))
       if (obj->key == key)
         return obj;
@@ -66,10 +77,10 @@ And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb()::
     struct hlist_node *node;
     for (pos = rcu_dereference((head)->first);
          pos && ({ prefetch(pos->next); 1; }) &&
-         ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
+         ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
          pos = rcu_dereference(pos->next))
-      if (obj->key == key)
-        return obj;
+    if (obj->key == key)
+      return obj;
   return NULL;
 
 Quoting Corey Minyard::
@@ -86,7 +97,7 @@ Quoting Corey Minyard::
 2) Insertion algorithm
 ----------------------
 
-We need to make sure a reader cannot read the new 'obj->obj_next' value
+We need to make sure a reader cannot read the new 'obj->obj_node.next' value
 and previous value of 'obj->key'. Otherwise, an item could be deleted
 from a chain, and inserted into another chain. If new chain was empty
 before the move, 'next' pointer is NULL, and lockless reader can not
@@ -129,8 +140,7 @@ very very fast (before the end of RCU grace period)
 Avoiding extra smp_rmb()
 ========================
 
-With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup()
-and extra _release() in insert function.
+With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup().
 
 For example, if we choose to store the slot number as the 'nulls'
 end-of-list marker for each slot of the hash table, we can detect
@@ -142,6 +152,9 @@ the beginning. If the object was moved to the same chain,
 then the reader doesn't care: It might occasionally
 scan the list again without harm.
 
+Note that using hlist_nulls means the type of 'obj_node' field of
+'struct object' becomes 'struct hlist_nulls_node'.
+
 
 1) lookup algorithm
 -------------------
@@ -151,7 +164,7 @@ scan the list again without harm.
   head = &table[slot];
   begin:
   rcu_read_lock();
-  hlist_nulls_for_each_entry_rcu(obj, node, head, member) {
+  hlist_nulls_for_each_entry_rcu(obj, node, head, obj_node) {
     if (obj->key == key) {
       if (!try_get_ref(obj)) { // might fail for free objects
         rcu_read_unlock();
@@ -182,6 +195,9 @@ scan the list again without harm.
 2) Insert algorithm
 -------------------
 
+Same to the above one, but uses hlist_nulls_add_head_rcu() instead of
+hlist_add_head_rcu().
+
 ::
 
   /*

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 55 additions & 1 deletion

@@ -2938,6 +2938,10 @@
 	locktorture.torture_type= [KNL]
 			Specify the locking implementation to test.
 
+	locktorture.writer_fifo= [KNL]
+			Run the write-side locktorture kthreads at
+			sched_set_fifo() real-time priority.
+
 	locktorture.verbose= [KNL]
 			Enable additional printk() statements.
 
@@ -4949,6 +4953,15 @@
 			test until boot completes in order to avoid
 			interference.
 
+	rcuscale.kfree_by_call_rcu= [KNL]
+			In kernels built with CONFIG_RCU_LAZY=y, test
+			call_rcu() instead of kfree_rcu().
+
+	rcuscale.kfree_mult= [KNL]
+			Instead of allocating an object of size kfree_obj,
+			allocate one of kfree_mult * sizeof(kfree_obj).
+			Defaults to 1.
+
 	rcuscale.kfree_rcu_test= [KNL]
 			Set to measure performance of kfree_rcu() flooding.
 
@@ -4974,6 +4987,12 @@
 			Number of loops doing rcuscale.kfree_alloc_num number
 			of allocations and frees.
 
+	rcuscale.minruntime= [KNL]
+			Set the minimum test run time in seconds. This
+			does not affect the data-collection interval,
+			but instead allows better measurement of things
+			like CPU consumption.
+
 	rcuscale.nreaders= [KNL]
 			Set number of RCU readers. The value -1 selects
 			N, where N is the number of CPUs. A value
@@ -4988,7 +5007,7 @@
 			the same as for rcuscale.nreaders.
 			N, where N is the number of CPUs
 
-	rcuscale.perf_type= [KNL]
+	rcuscale.scale_type= [KNL]
 			Specify the RCU implementation to test.
 
 	rcuscale.shutdown= [KNL]
@@ -5004,6 +5023,11 @@
 			in microseconds. The default of zero says
 			no holdoff.
 
+	rcuscale.writer_holdoff_jiffies= [KNL]
+			Additional write-side holdoff between grace
+			periods, but in jiffies. The default of zero
+			says no holdoff.
+
 	rcutorture.fqs_duration= [KNL]
 			Set duration of force_quiescent_state bursts
 			in microseconds.
@@ -5285,6 +5309,13 @@
 			number avoids disturbing real-time workloads,
 			but lengthens grace periods.
 
+	rcupdate.rcu_task_lazy_lim= [KNL]
+			Number of callbacks on a given CPU that will
+			cancel laziness on that CPU. Use -1 to disable
+			cancellation of laziness, but be advised that
+			doing so increases the danger of OOM due to
+			callback flooding.
+
 	rcupdate.rcu_task_stall_info= [KNL]
 			Set initial timeout in jiffies for RCU task stall
 			informational messages, which give some indication
@@ -5314,6 +5345,29 @@
 			A change in value does not take effect until
 			the beginning of the next grace period.
 
+	rcupdate.rcu_tasks_lazy_ms= [KNL]
+			Set timeout in milliseconds RCU Tasks asynchronous
+			callback batching for call_rcu_tasks().
+			A negative value will take the default. A value
+			of zero will disable batching. Batching is
+			always disabled for synchronize_rcu_tasks().
+
+	rcupdate.rcu_tasks_rude_lazy_ms= [KNL]
+			Set timeout in milliseconds RCU Tasks
+			Rude asynchronous callback batching for
+			call_rcu_tasks_rude(). A negative value
+			will take the default. A value of zero will
+			disable batching. Batching is always disabled
+			for synchronize_rcu_tasks_rude().
+
+	rcupdate.rcu_tasks_trace_lazy_ms= [KNL]
+			Set timeout in milliseconds RCU Tasks
+			Trace asynchronous callback batching for
+			call_rcu_tasks_trace(). A negative value
+			will take the default. A value of zero will
+			disable batching. Batching is always disabled
+			for synchronize_rcu_tasks_trace().
+
 	rcupdate.rcu_self_test= [KNL]
 			Run the RCU early boot self tests
include/linux/notifier.h

Lines changed: 0 additions & 11 deletions

@@ -73,9 +73,7 @@ struct raw_notifier_head {
 
 struct srcu_notifier_head {
 	struct mutex mutex;
-#ifdef CONFIG_TREE_SRCU
 	struct srcu_usage srcuu;
-#endif
 	struct srcu_struct srcu;
 	struct notifier_block __rcu *head;
 };
@@ -106,22 +104,13 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
 #define RAW_NOTIFIER_INIT(name)	{				\
 		.head = NULL }
 
-#ifdef CONFIG_TREE_SRCU
 #define SRCU_NOTIFIER_INIT(name, pcpu)				\
 	{							\
 		.mutex = __MUTEX_INITIALIZER(name.mutex),	\
 		.head = NULL,					\
 		.srcuu = __SRCU_USAGE_INIT(name.srcuu),		\
 		.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
 	}
-#else
-#define SRCU_NOTIFIER_INIT(name, pcpu)				\
-	{							\
-		.mutex = __MUTEX_INITIALIZER(name.mutex),	\
-		.head = NULL,					\
-		.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
-	}
-#endif
 
 #define ATOMIC_NOTIFIER_HEAD(name)				\
 	struct atomic_notifier_head name =			\

include/linux/rculist_nulls.h

Lines changed: 2 additions & 2 deletions

@@ -101,7 +101,7 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n,
 {
 	struct hlist_nulls_node *first = h->first;
 
-	n->next = first;
+	WRITE_ONCE(n->next, first);
 	WRITE_ONCE(n->pprev, &h->first);
 	rcu_assign_pointer(hlist_nulls_first_rcu(h), n);
 	if (!is_a_nulls(first))
@@ -137,7 +137,7 @@ static inline void hlist_nulls_add_tail_rcu(struct hlist_nulls_node *n,
 		last = i;
 
 	if (last) {
-		n->next = last->next;
+		WRITE_ONCE(n->next, last->next);
 		n->pprev = &last->next;
 		rcu_assign_pointer(hlist_nulls_next_rcu(last), n);
 	} else {

include/linux/rcupdate_trace.h

Lines changed: 1 addition & 0 deletions

@@ -87,6 +87,7 @@ static inline void rcu_read_unlock_trace(void)
 void call_rcu_tasks_trace(struct rcu_head *rhp, rcu_callback_t func);
 void synchronize_rcu_tasks_trace(void);
 void rcu_barrier_tasks_trace(void);
+struct task_struct *get_rcu_tasks_trace_gp_kthread(void);
 #else
 /*
  * The BPF JIT forms these addresses even when it doesn't call these

include/linux/rcupdate_wait.h

Lines changed: 5 additions & 0 deletions

@@ -42,6 +42,11 @@ do {									\
  * call_srcu() function, with this wrapper supplying the pointer to the
  * corresponding srcu_struct.
  *
+ * Note that call_rcu_hurry() should be used instead of call_rcu()
+ * because in kernels built with CONFIG_RCU_LAZY=y the delay between the
+ * invocation of call_rcu() and that of the corresponding RCU callback
+ * can be multiple seconds.
+ *
  * The first argument tells Tiny RCU's _wait_rcu_gp() not to
  * bother waiting for RCU. The reason for this is because anywhere
  * synchronize_rcu_mult() can be called is automatically already a full

include/linux/srcutiny.h

Lines changed: 4 additions & 0 deletions

@@ -48,6 +48,10 @@ void srcu_drive_gp(struct work_struct *wp);
 #define DEFINE_STATIC_SRCU(name) \
 	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
 
+// Dummy structure for srcu_notifier_head.
+struct srcu_usage { };
+#define __SRCU_USAGE_INIT(name) { }
+
 void synchronize_srcu(struct srcu_struct *ssp);
 
 /*

include/linux/torture.h

Lines changed: 5 additions & 2 deletions

@@ -108,12 +108,15 @@ bool torture_must_stop(void);
 bool torture_must_stop_irq(void);
 void torture_kthread_stopping(char *title);
 int _torture_create_kthread(int (*fn)(void *arg), void *arg, char *s, char *m,
-			 char *f, struct task_struct **tp);
+			 char *f, struct task_struct **tp, void (*cbf)(struct task_struct *tp));
 void _torture_stop_kthread(char *m, struct task_struct **tp);
 
 #define torture_create_kthread(n, arg, tp) \
 	_torture_create_kthread(n, (arg), #n, "Creating " #n " task", \
-				"Failed to create " #n, &(tp))
+				"Failed to create " #n, &(tp), NULL)
+#define torture_create_kthread_cb(n, arg, tp, cbf) \
+	_torture_create_kthread(n, (arg), #n, "Creating " #n " task", \
+				"Failed to create " #n, &(tp), cbf)
 #define torture_stop_kthread(n, tp) \
 	_torture_stop_kthread("Stopping " #n " task", &(tp))

kernel/locking/locktorture.c

Lines changed: 7 additions & 5 deletions

@@ -45,6 +45,7 @@ torture_param(int, stutter, 5, "Number of jiffies to run/halt test, 0=disable");
 torture_param(int, rt_boost, 2,
 		   "Do periodic rt-boost. 0=Disable, 1=Only for rt_mutex, 2=For all lock types.");
 torture_param(int, rt_boost_factor, 50, "A factor determining how often rt-boost happens.");
+torture_param(int, writer_fifo, 0, "Run writers at sched_set_fifo() priority");
 torture_param(int, verbose, 1, "Enable verbose debugging printk()s");
 torture_param(int, nested_locks, 0, "Number of nested locks (max = 8)");
 /* Going much higher trips "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!" errors */
@@ -809,7 +810,8 @@ static int lock_torture_writer(void *arg)
 	bool skip_main_lock;
 
 	VERBOSE_TOROUT_STRING("lock_torture_writer task started");
-	set_user_nice(current, MAX_NICE);
+	if (!rt_task(current))
+		set_user_nice(current, MAX_NICE);
 
 	do {
 		if ((torture_random(&rand) & 0xfffff) == 0)
@@ -1015,8 +1017,7 @@ static void lock_torture_cleanup(void)
 
 	if (writer_tasks) {
 		for (i = 0; i < cxt.nrealwriters_stress; i++)
-			torture_stop_kthread(lock_torture_writer,
-					     writer_tasks[i]);
+			torture_stop_kthread(lock_torture_writer, writer_tasks[i]);
 		kfree(writer_tasks);
 		writer_tasks = NULL;
 	}
@@ -1244,8 +1245,9 @@ static int __init lock_torture_init(void)
 		goto create_reader;
 
 	/* Create writer. */
-	firsterr = torture_create_kthread(lock_torture_writer, &cxt.lwsa[i],
-					  writer_tasks[i]);
+	firsterr = torture_create_kthread_cb(lock_torture_writer, &cxt.lwsa[i],
+					     writer_tasks[i],
+					     writer_fifo ? sched_set_fifo : NULL);
 	if (torture_init_error(firsterr))
 		goto unwind;