
Commit b36c830

Author: Ingo Molnar
Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull v5.10 RCU changes from Paul E. McKenney:

- Debugging for smp_call_function().

- Strict grace periods for KASAN. The point of this series is to find
  RCU-usage bugs, so the corresponding new RCU_STRICT_GRACE_PERIOD
  Kconfig option depends on both DEBUG_KERNEL and RCU_EXPERT, and is
  further disabled by default. Finally, the help text includes a goodly
  list of scary caveats.

- New smp_call_function() torture test.

- Torture-test updates.

- Documentation updates.

- Miscellaneous fixes.

Signed-off-by: Ingo Molnar <[email protected]>
2 parents: 583090b + 6fe208f


57 files changed: +1582 −421 lines

Documentation/RCU/Design/Data-Structures/Data-Structures.rst

Lines changed: 1 addition & 1 deletion
@@ -963,7 +963,7 @@ exit and perhaps also vice versa. Therefore, whenever the
 ``->dynticks_nesting`` field is incremented up from zero, the
 ``->dynticks_nmi_nesting`` field is set to a large positive number, and
 whenever the ``->dynticks_nesting`` field is decremented down to zero,
-the the ``->dynticks_nmi_nesting`` field is set to zero. Assuming that
+the ``->dynticks_nmi_nesting`` field is set to zero. Assuming that
 the number of misnested interrupts is not sufficient to overflow the
 counter, this approach corrects the ``->dynticks_nmi_nesting`` field
 every time the corresponding CPU enters the idle loop from process

Documentation/RCU/Design/Requirements/Requirements.rst

Lines changed: 2 additions & 2 deletions
@@ -2162,7 +2162,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
    this sort of thing.
 #. If a CPU is in a portion of the kernel that is absolutely positively
    no-joking guaranteed to never execute any RCU read-side critical
-   sections, and RCU believes this CPU to to be idle, no problem. This
+   sections, and RCU believes this CPU to be idle, no problem. This
    sort of thing is used by some architectures for light-weight
    exception handlers, which can then avoid the overhead of
    ``rcu_irq_enter()`` and ``rcu_irq_exit()`` at exception entry and
@@ -2431,7 +2431,7 @@ However, there are legitimate preemptible-RCU implementations that do
 not have this property, given that any point in the code outside of an
 RCU read-side critical section can be a quiescent state. Therefore,
 *RCU-sched* was created, which follows “classic” RCU in that an
-RCU-sched grace period waits for for pre-existing interrupt and NMI
+RCU-sched grace period waits for pre-existing interrupt and NMI
 handlers. In kernels built with ``CONFIG_PREEMPT=n``, the RCU and
 RCU-sched APIs have identical implementations, while kernels built with
 ``CONFIG_PREEMPT=y`` provide a separate implementation for each.

Documentation/RCU/whatisRCU.rst

Lines changed: 1 addition & 1 deletion
@@ -360,7 +360,7 @@ order to amortize their overhead over many uses of the corresponding APIs.

 There are at least three flavors of RCU usage in the Linux kernel. The diagram
 above shows the most common one. On the updater side, the rcu_assign_pointer(),
-sychronize_rcu() and call_rcu() primitives used are the same for all three
+synchronize_rcu() and call_rcu() primitives used are the same for all three
 flavors. However for protection (on the reader side), the primitives used vary
 depending on the flavor:

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 135 additions & 18 deletions
@@ -3070,6 +3070,10 @@
 	and gids from such clients. This is intended to ease
 	migration from NFSv2/v3.

+	nmi_backtrace.backtrace_idle [KNL]
+			Dump stacks even of idle CPUs in response to an
+			NMI stack-backtrace request.
+
 	nmi_debug=	[KNL,SH] Specify one or more actions to take
 			when a NMI is triggered.
 			Format: [state][,regs][,debounce][,die]
@@ -4149,46 +4153,55 @@
 			This wake_up() will be accompanied by a
 			WARN_ONCE() splat and an ftrace_dump().

+	rcutree.rcu_unlock_delay= [KNL]
+			In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels,
+			this specifies an rcu_read_unlock()-time delay
+			in microseconds. This defaults to zero.
+			Larger delays increase the probability of
+			catching RCU pointer leaks, that is, buggy use
+			of RCU-protected pointers after the relevant
+			rcu_read_unlock() has completed.
+
 	rcutree.sysrq_rcu= [KNL]
 			Commandeer a sysrq key to dump out Tree RCU's
 			rcu_node tree with an eye towards determining
 			why a new grace period has not yet started.

-	rcuperf.gp_async= [KNL]
+	rcuscale.gp_async= [KNL]
 			Measure performance of asynchronous
 			grace-period primitives such as call_rcu().

-	rcuperf.gp_async_max= [KNL]
+	rcuscale.gp_async_max= [KNL]
 			Specify the maximum number of outstanding
 			callbacks per writer thread. When a writer
 			thread exceeds this limit, it invokes the
 			corresponding flavor of rcu_barrier() to allow
 			previously posted callbacks to drain.

-	rcuperf.gp_exp= [KNL]
+	rcuscale.gp_exp= [KNL]
 			Measure performance of expedited synchronous
 			grace-period primitives.

-	rcuperf.holdoff= [KNL]
+	rcuscale.holdoff= [KNL]
 			Set test-start holdoff period. The purpose of
 			this parameter is to delay the start of the
 			test until boot completes in order to avoid
 			interference.

-	rcuperf.kfree_rcu_test= [KNL]
+	rcuscale.kfree_rcu_test= [KNL]
 			Set to measure performance of kfree_rcu() flooding.

-	rcuperf.kfree_nthreads= [KNL]
+	rcuscale.kfree_nthreads= [KNL]
 			The number of threads running loops of kfree_rcu().

-	rcuperf.kfree_alloc_num= [KNL]
+	rcuscale.kfree_alloc_num= [KNL]
 			Number of allocations and frees done in an iteration.

-	rcuperf.kfree_loops= [KNL]
-			Number of loops doing rcuperf.kfree_alloc_num number
+	rcuscale.kfree_loops= [KNL]
+			Number of loops doing rcuscale.kfree_alloc_num number
 			of allocations and frees.

-	rcuperf.nreaders= [KNL]
+	rcuscale.nreaders= [KNL]
 			Set number of RCU readers. The value -1 selects
 			N, where N is the number of CPUs. A value
 			"n" less than -1 selects N-n+1, where N is again
@@ -4197,23 +4210,23 @@
 			A value of "n" less than or equal to -N selects
 			a single reader.

-	rcuperf.nwriters= [KNL]
+	rcuscale.nwriters= [KNL]
 			Set number of RCU writers. The values operate
-			the same as for rcuperf.nreaders.
+			the same as for rcuscale.nreaders.
 			N, where N is the number of CPUs

-	rcuperf.perf_type= [KNL]
+	rcuscale.perf_type= [KNL]
 			Specify the RCU implementation to test.

-	rcuperf.shutdown= [KNL]
+	rcuscale.shutdown= [KNL]
 			Shut the system down after performance tests
 			complete. This is useful for hands-off automated
 			testing.

-	rcuperf.verbose= [KNL]
+	rcuscale.verbose= [KNL]
 			Enable additional printk() statements.

-	rcuperf.writer_holdoff= [KNL]
+	rcuscale.writer_holdoff= [KNL]
 			Write-side holdoff between grace periods,
 			in microseconds. The default of zero says
 			no holdoff.
@@ -4266,6 +4279,18 @@
 			are zero, rcutorture acts as if is interpreted
 			they are all non-zero.

+	rcutorture.irqreader= [KNL]
+			Run RCU readers from irq handlers, or, more
+			accurately, from a timer handler. Not all RCU
+			flavors take kindly to this sort of thing.
+
+	rcutorture.leakpointer= [KNL]
+			Leak an RCU-protected pointer out of the reader.
+			This can of course result in splats, and is
+			intended to test the ability of things like
+			CONFIG_RCU_STRICT_GRACE_PERIOD=y to detect
+			such leaks.
+
 	rcutorture.n_barrier_cbs= [KNL]
 			Set callbacks/threads for rcu_barrier() testing.

@@ -4487,8 +4512,8 @@
 	refscale.shutdown= [KNL]
 			Shut down the system at the end of the performance
 			test. This defaults to 1 (shut it down) when
-			rcuperf is built into the kernel and to 0 (leave
-			it running) when rcuperf is built as a module.
+			refscale is built into the kernel and to 0 (leave
+			it running) when refscale is built as a module.

 	refscale.verbose= [KNL]
 			Enable additional printk() statements.
@@ -4634,6 +4659,98 @@
 			Format: integer between 0 and 10
 			Default is 0.

+	scftorture.holdoff= [KNL]
+			Number of seconds to hold off before starting
+			test. Defaults to zero for module insertion and
+			to 10 seconds for built-in smp_call_function()
+			tests.
+
+	scftorture.longwait= [KNL]
+			Request ridiculously long waits randomly selected
+			up to the chosen limit in seconds. Zero (the
+			default) disables this feature. Please note
+			that requesting even small non-zero numbers of
+			seconds can result in RCU CPU stall warnings,
+			softlockup complaints, and so on.
+
+	scftorture.nthreads= [KNL]
+			Number of kthreads to spawn to invoke the
+			smp_call_function() family of functions.
+			The default of -1 specifies a number of kthreads
+			equal to the number of CPUs.
+
+	scftorture.onoff_holdoff= [KNL]
+			Number seconds to wait after the start of the
+			test before initiating CPU-hotplug operations.
+
+	scftorture.onoff_interval= [KNL]
+			Number seconds to wait between successive
+			CPU-hotplug operations. Specifying zero (which
+			is the default) disables CPU-hotplug operations.
+
+	scftorture.shutdown_secs= [KNL]
+			The number of seconds following the start of the
+			test after which to shut down the system. The
+			default of zero avoids shutting down the system.
+			Non-zero values are useful for automated tests.
+
+	scftorture.stat_interval= [KNL]
+			The number of seconds between outputting the
+			current test statistics to the console. A value
+			of zero disables statistics output.
+
+	scftorture.stutter_cpus= [KNL]
+			The number of jiffies to wait between each change
+			to the set of CPUs under test.
+
+	scftorture.use_cpus_read_lock= [KNL]
+			Use use_cpus_read_lock() instead of the default
+			preempt_disable() to disable CPU hotplug
+			while invoking one of the smp_call_function*()
+			functions.
+
+	scftorture.verbose= [KNL]
+			Enable additional printk() statements.
+
+	scftorture.weight_single= [KNL]
+			The probability weighting to use for the
+			smp_call_function_single() function with a zero
+			"wait" parameter. A value of -1 selects the
+			default if all other weights are -1. However,
+			if at least one weight has some other value, a
+			value of -1 will instead select a weight of zero.
+
+	scftorture.weight_single_wait= [KNL]
+			The probability weighting to use for the
+			smp_call_function_single() function with a
+			non-zero "wait" parameter. See weight_single.
+
+	scftorture.weight_many= [KNL]
+			The probability weighting to use for the
+			smp_call_function_many() function with a zero
+			"wait" parameter. See weight_single.
+			Note well that setting a high probability for
+			this weighting can place serious IPI load
+			on the system.
+
+	scftorture.weight_many_wait= [KNL]
+			The probability weighting to use for the
+			smp_call_function_many() function with a
+			non-zero "wait" parameter. See weight_single
+			and weight_many.
+
+	scftorture.weight_all= [KNL]
+			The probability weighting to use for the
+			smp_call_function_all() function with a zero
+			"wait" parameter. See weight_single and
+			weight_many.
+
+	scftorture.weight_all_wait= [KNL]
+			The probability weighting to use for the
+			smp_call_function_all() function with a
+			non-zero "wait" parameter. See weight_single
+			and weight_many.
+
 	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
 			xtime_lock contention on larger systems, and/or RCU lock
 			contention on all systems with CONFIG_MAXSMP set.
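
As an illustration only (these values are not part of this commit), the new debugging knobs above could be combined on the kernel command line of a CONFIG_RCU_STRICT_GRACE_PERIOD=y test kernel, delaying rcu_read_unlock() to widen the window for catching pointer leaks while also enabling idle-CPU backtraces:

```
nmi_backtrace.backtrace_idle rcutree.rcu_unlock_delay=100 rcutorture.leakpointer=1
```

Note that rcutorture.leakpointer deliberately provokes splats, so a command line like this is suitable only for dedicated test machines.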

MAINTAINERS

Lines changed: 2 additions & 1 deletion
@@ -17547,8 +17547,9 @@ S:	Supported
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
 F:	Documentation/RCU/torture.rst
 F:	kernel/locking/locktorture.c
-F:	kernel/rcu/rcuperf.c
+F:	kernel/rcu/rcuscale.c
 F:	kernel/rcu/rcutorture.c
+F:	kernel/rcu/refscale.c
 F:	kernel/torture.c

 TOSHIBA ACPI EXTRAS DRIVER

arch/x86/kvm/mmu/page_track.c

Lines changed: 4 additions & 2 deletions
@@ -229,7 +229,8 @@ void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 		return;

 	idx = srcu_read_lock(&head->track_srcu);
-	hlist_for_each_entry_rcu(n, &head->track_notifier_list, node)
+	hlist_for_each_entry_srcu(n, &head->track_notifier_list, node,
+				srcu_read_lock_held(&head->track_srcu))
 		if (n->track_write)
 			n->track_write(vcpu, gpa, new, bytes, n);
 	srcu_read_unlock(&head->track_srcu, idx);
@@ -254,7 +255,8 @@ void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot)
 		return;

 	idx = srcu_read_lock(&head->track_srcu);
-	hlist_for_each_entry_rcu(n, &head->track_notifier_list, node)
+	hlist_for_each_entry_srcu(n, &head->track_notifier_list, node,
+				srcu_read_lock_held(&head->track_srcu))
 		if (n->track_flush_slot)
 			n->track_flush_slot(kvm, slot, n);
 	srcu_read_unlock(&head->track_srcu, idx);

include/linux/rculist.h

Lines changed: 48 additions & 0 deletions
@@ -63,9 +63,17 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
 	RCU_LOCKDEP_WARN(!(cond) && !rcu_read_lock_any_held(),		\
 			 "RCU-list traversed in non-reader section!");	\
 })
+
+#define __list_check_srcu(cond)					\
+({								\
+	RCU_LOCKDEP_WARN(!(cond),				\
+		"RCU-list traversed without holding the required lock!");\
+})
 #else
 #define __list_check_rcu(dummy, cond, extra...)				\
 ({ check_arg_count_one(extra); })
+
+#define __list_check_srcu(cond) ({ })
 #endif

 /*
@@ -385,6 +393,25 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
 	     &pos->member != (head); \
 	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))

+/**
+ * list_for_each_entry_srcu - iterate over rcu list of given type
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the list_head within the struct.
+ * @cond:	lockdep expression for the lock required to traverse the list.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as list_add_rcu()
+ * as long as the traversal is guarded by srcu_read_lock().
+ * The lockdep expression srcu_read_lock_held() can be passed as the
+ * cond argument from read side.
+ */
+#define list_for_each_entry_srcu(pos, head, member, cond)		\
+	for (__list_check_srcu(cond),					\
+	     pos = list_entry_rcu((head)->next, typeof(*pos), member);	\
+	     &pos->member != (head);					\
+	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
+
 /**
  * list_entry_lockless - get the struct for this entry
  * @ptr:	the &struct list_head pointer.
@@ -683,6 +710,27 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
 	     pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
 			&(pos)->member)), typeof(*(pos)), member))

+/**
+ * hlist_for_each_entry_srcu - iterate over rcu list of given type
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the hlist_node within the struct.
+ * @cond:	lockdep expression for the lock required to traverse the list.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as hlist_add_head_rcu()
+ * as long as the traversal is guarded by srcu_read_lock().
+ * The lockdep expression srcu_read_lock_held() can be passed as the
+ * cond argument from read side.
+ */
+#define hlist_for_each_entry_srcu(pos, head, member, cond)		\
+	for (__list_check_srcu(cond),					\
+	     pos = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),\
+			typeof(*(pos)), member);			\
+	     pos;							\
+	     pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
+			&(pos)->member)), typeof(*(pos)), member))
+
 /**
  * hlist_for_each_entry_rcu_notrace - iterate over rcu list of given type (for tracing)
  * @pos:	the type * to use as a loop cursor.
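
The kvm/page_track.c hunks above show the intended use of the new primitive; the same pattern in isolation looks like the following kernel-code sketch (not from this commit; the names my_node, my_head, my_srcu, and my_notify_all are illustrative):

```
/* Sketch: an SRCU-protected notifier list traversed with the new
 * hlist_for_each_entry_srcu() primitive. The traversal must sit
 * between srcu_read_lock() and srcu_read_unlock() on the same
 * srcu_struct that the cond expression names.
 */
struct my_node {
	struct hlist_node node;
	void (*cb)(void);
};

static HLIST_HEAD(my_head);
DEFINE_STATIC_SRCU(my_srcu);

static void my_notify_all(void)
{
	struct my_node *n;
	int idx;

	idx = srcu_read_lock(&my_srcu);
	/* The cond argument lets lockdep check that the matching
	 * srcu_read_lock() is really held during the traversal. */
	hlist_for_each_entry_srcu(n, &my_head, node,
				  srcu_read_lock_held(&my_srcu))
		n->cb();
	srcu_read_unlock(&my_srcu, idx);
}
```

Unlike hlist_for_each_entry_rcu(), which assumes an rcu_read_lock() reader, the _srcu variant takes the lockdep condition explicitly, so traversals guarded only by SRCU no longer trigger false "RCU-list traversed in non-reader section!" warnings.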
