Commit 8cdf2d1

Merge tag 'rcu.release.v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Frederic Weisbecker:
 "SRCU:

   - Introduction of the new SRCU-lite flavour with a new pair of
     srcu_read_[un]lock_lite() APIs. In practice the read side using
     this flavour becomes lighter by removing a full memory barrier on
     LOCK and a full memory barrier on UNLOCK. This comes at the
     expense of a higher latency write side with two (in the best case
     of a snapshot of unused read-sides) or more RCU grace periods on
     the update side which now assumes by itself the whole full
     ordering guarantee against the LOCK/UNLOCK counters on both
     indexes, along with the accesses performed inside.

     Uretprobes is a known potential user.

     Note this doesn't replace the default normal flavour of SRCU which
     still behaves the same as usual.

   - Add testing of SRCU-lite through rcutorture and rcuscale

   - Various cleanups on the way.

  Fixes:

   - Allow short-circuiting RCU-TASKS-RUDE grace periods on
     architectures that have sane noinstr boundaries forbidding tracing
     on low-level idle and kernel entry code. RCU-TASKS is enough on
     such configurations because it involves an RCU grace period that
     waits for all idle tasks to either schedule out voluntarily or
     enter into RCU unwatched noinstr code.

   - Allow and test start_poll_synchronize_rcu() with IRQs disabled.

   - Mention rcuog kthreads in relevant documentation and Kconfig help

   - Various fixes and consolidations

  rcutorture:

   - Add --no-affinity on tools to leave the affinity setting of guests
     up to the user.

   - Add guest_os_delay parameter to rcuscale for better warm-up
     control.

   - Fix and improve some rcuscale error handling.

   - Various cleanups and fixes

  stall:

   - Remove dead code

   - Stop dumping tasks if a stalled grace period eventually ended
     midway as that only produces confusing output.

   - Optimize detection of stalling CPUs and avoid useless node locking
     otherwise.

  NOCB:

   - Fix rcu_barrier() hang due to a race against callbacks
     deoffloading. This is not yet used, except by rcutorture, and
     waits for its promised cpusets interface.

   - Remove leftover function declaration"

* tag 'rcu.release.v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (42 commits)
  rcuscale: Remove redundant WARN_ON_ONCE() splat
  rcuscale: Do a proper cleanup if kfree_scale_init() fails
  srcu: Unconditionally record srcu_read_lock_lite() in ->srcu_reader_flavor
  srcu: Check for srcu_read_lock_lite() across all CPUs
  srcu: Remove smp_mb() from srcu_read_unlock_lite()
  rcutorture: Avoid printing cpu=-1 for no-fault RCU boost failure
  rcuscale: Add guest_os_delay module parameter
  refscale: Correct affinity check
  torture: Add --no-affinity parameter to kvm.sh
  rcu/nocb: Fix missed RCU barrier on deoffloading
  rcu/kvfree: Fix data-race in __mod_timer / kvfree_call_rcu
  rcu/srcutiny: don't return before reenabling preemption
  rcu-tasks: Remove open-coded one-byte cmpxchg() emulation
  doc: Remove kernel-parameters.txt entry for rcutorture.read_exit
  rcutorture: Test start-poll primitives with interrupts disabled
  rcu: Permit start_poll_synchronize_rcu*() with interrupts disabled
  rcu: Allow short-circuiting of synchronize_rcu_tasks_rude()
  doc: Add rcuog kthreads to kernel-per-CPU-kthreads.rst
  rcu: Add rcuog kthreads to RCU_NOCB_CPU help text
  rcu: Use the BITS_PER_LONG macro
  ...
2 parents 2b5d5f2 + d8dfba2 commit 8cdf2d1

27 files changed: +468 -227 lines changed

Documentation/RCU/stallwarn.rst

Lines changed: 1 addition & 1 deletion
@@ -249,7 +249,7 @@ ticks this GP)" indicates that this CPU has not taken any scheduling-clock
 interrupts during the current stalled grace period.
 
 The "idle=" portion of the message prints the dyntick-idle state.
-The hex number before the first "/" is the low-order 12 bits of the
+The hex number before the first "/" is the low-order 16 bits of the
 dynticks counter, which will have an even-numbered value if the CPU
 is in dyntick-idle mode and an odd-numbered value otherwise. The hex
 number between the two "/"s is the value of the nesting, which will be
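
The decoding rule described in this hunk can be sketched in a few lines of userspace C. This is only an illustration of the documented rule (keep the low-order 16 bits, then test parity); the helper names `idle_field_low_bits()` and `idle_field_is_dyntick_idle()` are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: keep only the low-order 16 bits of a dynticks
 * counter value, which is what the updated text says is printed before
 * the first "/" of the "idle=" field.
 */
static unsigned int idle_field_low_bits(unsigned long dynticks)
{
	return (unsigned int)(dynticks & 0xffffUL);
}

/*
 * Hypothetical helper: per the documentation, an even value means the
 * CPU is in dyntick-idle mode and an odd value means it is not.
 */
static bool idle_field_is_dyntick_idle(unsigned int low_bits)
{
	return (low_bits & 0x1U) == 0;
}
```

For instance, a counter value of 0x12345678 would be reported as 5678, which is even and therefore reads as dyntick-idle under this rule.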

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 8 additions & 5 deletions
@@ -5415,11 +5415,6 @@
 Set time (jiffies) between CPU-hotplug operations,
 or zero to disable CPU-hotplug testing.
 
-rcutorture.read_exit= [KNL]
-Set the number of read-then-exit kthreads used
-to test the interaction of RCU updaters and
-task-exit processing.
-
 rcutorture.read_exit_burst= [KNL]
 The number of times in a given read-then-exit
 episode that a set of read-then-exit kthreads
@@ -5429,6 +5424,14 @@
 The delay, in seconds, between successive
 read-then-exit testing episodes.
 
+rcutorture.reader_flavor= [KNL]
+A bit mask indicating which readers to use.
+If there is more than one bit set, the readers
+are entered from low-order bit up, and are
+exited in the opposite order. For SRCU, the
+0x1 bit is normal readers, 0x2 NMI-safe readers,
+and 0x4 light-weight readers.
+
 rcutorture.shuffle_interval= [KNL]
 Set task-shuffle interval (s). Shuffling tasks
 allows some CPUs to go into dyntick-idle mode
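
As a rough illustration of the ordering rule in the new rcutorture.reader_flavor text (enter from the low-order bit up, exit in the opposite order), here is a small userspace C sketch. The `FLAVOR_*` names and both helpers are hypothetical stand-ins chosen for this example; only the bit values 0x1, 0x2, and 0x4 come from the documentation above, and this is not how rcutorture itself is written.

```c
#include <assert.h>
#include <stddef.h>

/* Bit values taken from the parameter description; names are hypothetical. */
#define FLAVOR_NORMAL 0x1 /* normal readers */
#define FLAVOR_NMI    0x2 /* NMI-safe readers */
#define FLAVOR_LITE   0x4 /* light-weight readers */

/*
 * Fill order[] with the flavors selected by mask, lowest bit first,
 * which is the documented entry order. Returns the number of entries.
 */
static size_t reader_entry_order(int mask, int order[3])
{
	size_t n = 0;

	for (int bit = FLAVOR_NORMAL; bit <= FLAVOR_LITE; bit <<= 1)
		if (mask & bit)
			order[n++] = bit;
	return n;
}

/* Readers are exited in the opposite order, so simply reverse the list. */
static void reader_exit_order(const int entry[], size_t n, int exit_order[])
{
	assert(n <= 3);
	for (size_t i = 0; i < n; i++)
		exit_order[i] = entry[n - 1 - i];
}
```

With a mask of 0x5, this sketch enters the normal reader and then the light-weight reader, and exits them in the reverse order.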

Documentation/admin-guide/kernel-per-CPU-kthreads.rst

Lines changed: 1 addition & 1 deletion
@@ -315,7 +315,7 @@ To reduce its OS jitter, do at least one of the following:
 to do.
 
 Name:
-rcuop/%d and rcuos/%d
+rcuop/%d, rcuos/%d, and rcuog/%d
 
 Purpose:
 Offload RCU callbacks from the corresponding CPU.

include/linux/rcutiny.h

Lines changed: 0 additions & 1 deletion
@@ -165,7 +165,6 @@ static inline bool rcu_inkernel_boot_has_ended(void) { return true; }
 static inline bool rcu_is_watching(void) { return true; }
 static inline void rcu_momentary_eqs(void) { }
 static inline void kfree_rcu_scheduler_running(void) { }
-static inline bool rcu_gp_might_be_stalled(void) { return false; }
 
 /* Avoid RCU read-side critical sections leaking across. */
 static inline void rcu_all_qs(void) { barrier(); }

include/linux/rcutree.h

Lines changed: 0 additions & 1 deletion
@@ -40,7 +40,6 @@ void kvfree_rcu_barrier(void);
 void rcu_barrier(void);
 void rcu_momentary_eqs(void);
 void kfree_rcu_scheduler_running(void);
-bool rcu_gp_might_be_stalled(void);
 
 struct rcu_gp_oldstate {
 	unsigned long rgos_norm;

include/linux/srcu.h

Lines changed: 69 additions & 23 deletions
@@ -56,6 +56,13 @@ void call_srcu(struct srcu_struct *ssp, struct rcu_head *head,
 void cleanup_srcu_struct(struct srcu_struct *ssp);
 int __srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp);
 void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
+#ifdef CONFIG_TINY_SRCU
+#define __srcu_read_lock_lite __srcu_read_lock
+#define __srcu_read_unlock_lite __srcu_read_unlock
+#else // #ifdef CONFIG_TINY_SRCU
+int __srcu_read_lock_lite(struct srcu_struct *ssp) __acquires(ssp);
+void __srcu_read_unlock_lite(struct srcu_struct *ssp, int idx) __releases(ssp);
+#endif // #else // #ifdef CONFIG_TINY_SRCU
 void synchronize_srcu(struct srcu_struct *ssp);
 
 #define SRCU_GET_STATE_COMPLETED 0x1
@@ -176,17 +183,6 @@ static inline int srcu_read_lock_held(const struct srcu_struct *ssp)
 
 #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
-#define SRCU_NMI_UNKNOWN	0x0
-#define SRCU_NMI_UNSAFE		0x1
-#define SRCU_NMI_SAFE		0x2
-
-#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_TREE_SRCU)
-void srcu_check_nmi_safety(struct srcu_struct *ssp, bool nmi_safe);
-#else
-static inline void srcu_check_nmi_safety(struct srcu_struct *ssp,
-					 bool nmi_safe) { }
-#endif
-
 
 /**
  * srcu_dereference_check - fetch SRCU-protected pointer for later dereferencing
@@ -236,33 +232,67 @@ static inline void srcu_check_nmi_safety(struct srcu_struct *ssp,
  * a mutex that is held elsewhere while calling synchronize_srcu() or
  * synchronize_srcu_expedited().
  *
- * Note that srcu_read_lock() and the matching srcu_read_unlock() must
- * occur in the same context, for example, it is illegal to invoke
- * srcu_read_unlock() in an irq handler if the matching srcu_read_lock()
- * was invoked in process context.
+ * The return value from srcu_read_lock() must be passed unaltered
+ * to the matching srcu_read_unlock(). Note that srcu_read_lock() and
+ * the matching srcu_read_unlock() must occur in the same context, for
+ * example, it is illegal to invoke srcu_read_unlock() in an irq handler
+ * if the matching srcu_read_lock() was invoked in process context. Or,
+ * for that matter to invoke srcu_read_unlock() from one task and the
+ * matching srcu_read_lock() from another.
  */
 static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
 {
 	int retval;
 
-	srcu_check_nmi_safety(ssp, false);
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_NORMAL);
 	retval = __srcu_read_lock(ssp);
 	srcu_lock_acquire(&ssp->dep_map);
 	return retval;
 }
 
+/**
+ * srcu_read_lock_lite - register a new reader for an SRCU-protected structure.
+ * @ssp: srcu_struct in which to register the new reader.
+ *
+ * Enter an SRCU read-side critical section, but for a light-weight
+ * smp_mb()-free reader. See srcu_read_lock() for more information.
+ *
+ * If srcu_read_lock_lite() is ever used on an srcu_struct structure,
+ * then none of the other flavors may be used, whether before, during,
+ * or after. Note that grace-period auto-expediting is disabled for _lite
+ * srcu_struct structures because auto-expedited grace periods invoke
+ * synchronize_rcu_expedited(), IPIs and all.
+ *
+ * Note that srcu_read_lock_lite() can be invoked only from those contexts
+ * where RCU is watching, that is, from contexts where it would be legal
+ * to invoke rcu_read_lock(). Otherwise, lockdep will complain.
+ */
+static inline int srcu_read_lock_lite(struct srcu_struct *ssp) __acquires(ssp)
+{
+	int retval;
+
+	srcu_check_read_flavor_lite(ssp);
+	retval = __srcu_read_lock_lite(ssp);
+	rcu_try_lock_acquire(&ssp->dep_map);
+	return retval;
+}
+
 /**
  * srcu_read_lock_nmisafe - register a new reader for an SRCU-protected structure.
  * @ssp: srcu_struct in which to register the new reader.
  *
  * Enter an SRCU read-side critical section, but in an NMI-safe manner.
  * See srcu_read_lock() for more information.
+ *
+ * If srcu_read_lock_nmisafe() is ever used on an srcu_struct structure,
+ * then none of the other flavors may be used, whether before, during,
+ * or after.
  */
 static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp)
 {
 	int retval;
 
-	srcu_check_nmi_safety(ssp, true);
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_NMI);
 	retval = __srcu_read_lock_nmisafe(ssp);
 	rcu_try_lock_acquire(&ssp->dep_map);
 	return retval;
@@ -274,7 +304,7 @@ srcu_read_lock_notrace(struct srcu_struct *ssp) __acquires(ssp)
 {
 	int retval;
 
-	srcu_check_nmi_safety(ssp, false);
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_NORMAL);
 	retval = __srcu_read_lock(ssp);
 	return retval;
 }
@@ -303,7 +333,7 @@ srcu_read_lock_notrace(struct srcu_struct *ssp) __acquires(ssp)
 static inline int srcu_down_read(struct srcu_struct *ssp) __acquires(ssp)
 {
 	WARN_ON_ONCE(in_nmi());
-	srcu_check_nmi_safety(ssp, false);
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_NORMAL);
 	return __srcu_read_lock(ssp);
 }

@@ -318,11 +348,27 @@ static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
 	__releases(ssp)
 {
 	WARN_ON_ONCE(idx & ~0x1);
-	srcu_check_nmi_safety(ssp, false);
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_NORMAL);
 	srcu_lock_release(&ssp->dep_map);
 	__srcu_read_unlock(ssp, idx);
 }
 
+/**
+ * srcu_read_unlock_lite - unregister a old reader from an SRCU-protected structure.
+ * @ssp: srcu_struct in which to unregister the old reader.
+ * @idx: return value from corresponding srcu_read_lock().
+ *
+ * Exit a light-weight SRCU read-side critical section.
+ */
+static inline void srcu_read_unlock_lite(struct srcu_struct *ssp, int idx)
+	__releases(ssp)
+{
+	WARN_ON_ONCE(idx & ~0x1);
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_LITE);
+	srcu_lock_release(&ssp->dep_map);
+	__srcu_read_unlock_lite(ssp, idx);
+}
+
 /**
  * srcu_read_unlock_nmisafe - unregister a old reader from an SRCU-protected structure.
  * @ssp: srcu_struct in which to unregister the old reader.
@@ -334,7 +380,7 @@ static inline void srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
 	__releases(ssp)
 {
 	WARN_ON_ONCE(idx & ~0x1);
-	srcu_check_nmi_safety(ssp, true);
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_NMI);
 	rcu_lock_release(&ssp->dep_map);
 	__srcu_read_unlock_nmisafe(ssp, idx);
 }
@@ -343,7 +389,7 @@ static inline void srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
 static inline notrace void
 srcu_read_unlock_notrace(struct srcu_struct *ssp, int idx) __releases(ssp)
 {
-	srcu_check_nmi_safety(ssp, false);
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_NORMAL);
 	__srcu_read_unlock(ssp, idx);
 }

@@ -360,7 +406,7 @@ static inline void srcu_up_read(struct srcu_struct *ssp, int idx)
 {
 	WARN_ON_ONCE(idx & ~0x1);
 	WARN_ON_ONCE(in_nmi());
-	srcu_check_nmi_safety(ssp, false);
+	srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_NORMAL);
 	__srcu_read_unlock(ssp, idx);
 }

include/linux/srcutiny.h

Lines changed: 3 additions & 0 deletions
@@ -81,6 +81,9 @@ static inline void srcu_barrier(struct srcu_struct *ssp)
 	synchronize_srcu(ssp);
 }
 
+#define srcu_check_read_flavor(ssp, read_flavor) do { } while (0)
+#define srcu_check_read_flavor_lite(ssp) do { } while (0)
+
 /* Defined here to avoid size increase for non-torture kernels. */
 static inline void srcu_torture_stats_print(struct srcu_struct *ssp,
 					    char *tt, char *tf)

include/linux/srcutree.h

Lines changed: 66 additions & 1 deletion
@@ -25,7 +25,7 @@ struct srcu_data {
 	/* Read-side state. */
 	atomic_long_t srcu_lock_count[2];	/* Locks per CPU. */
 	atomic_long_t srcu_unlock_count[2];	/* Unlocks per CPU. */
-	int srcu_nmi_safety;			/* NMI-safe srcu_struct structure? */
+	int srcu_reader_flavor;			/* Reader flavor for srcu_struct structure? */
 
 	/* Update-side state. */
 	spinlock_t __private lock ____cacheline_internodealigned_in_smp;
@@ -43,6 +43,11 @@ struct srcu_data {
 	struct srcu_struct *ssp;
 };
 
+/* Values for ->srcu_reader_flavor. */
+#define SRCU_READ_FLAVOR_NORMAL	0x1	// srcu_read_lock().
+#define SRCU_READ_FLAVOR_NMI	0x2	// srcu_read_lock_nmisafe().
+#define SRCU_READ_FLAVOR_LITE	0x4	// srcu_read_lock_lite().
+
 /*
  * Node in SRCU combining tree, similar in function to rcu_data.
  */
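
The kernel-doc added in include/linux/srcu.h says that once one reader flavor is used on an srcu_struct, none of the other flavors may be used, and a comment in this header mentions a fully ordered cmpxchg() in the flavor check. The out-of-line `__srcu_check_read_flavor()` is not part of this excerpt, so the following is only a userspace toy model of that "first user records its flavor, later users must match" idea, using C11 atomics. `struct toy_srcu` and `toy_check_read_flavor()` are hypothetical names, and the single shared field is a simplification of the per-CPU `->srcu_reader_flavor` field shown in the diff.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Flavor bit values as defined in the hunk above. */
#define SRCU_READ_FLAVOR_NORMAL 0x1
#define SRCU_READ_FLAVOR_NMI    0x2
#define SRCU_READ_FLAVOR_LITE   0x4

/* Toy stand-in for the reader-flavor record; 0 means "no reader seen yet". */
struct toy_srcu {
	atomic_int reader_flavor;
};

/*
 * Record read_flavor if no flavor has been recorded yet, otherwise
 * compare against the recorded one. Returns true when the use is
 * consistent, false when a different flavor was seen earlier (where a
 * kernel implementation would be expected to warn instead).
 */
static bool toy_check_read_flavor(struct toy_srcu *sp, int read_flavor)
{
	int old = 0;

	/* On failure, the current recorded value is stored into old. */
	if (atomic_compare_exchange_strong(&sp->reader_flavor, &old, read_flavor))
		return true;
	return old == read_flavor;
}
```

In this model, a structure first entered with the LITE flavor accepts further LITE readers and rejects a later NORMAL reader.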
@@ -204,4 +209,64 @@ void synchronize_srcu_expedited(struct srcu_struct *ssp);
 void srcu_barrier(struct srcu_struct *ssp);
 void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf);
 
+/*
+ * Counts the new reader in the appropriate per-CPU element of the
+ * srcu_struct. Returns an index that must be passed to the matching
+ * srcu_read_unlock_lite().
+ *
+ * Note that this_cpu_inc() is an RCU read-side critical section either
+ * because it disables interrupts, because it is a single instruction,
+ * or because it is a read-modify-write atomic operation, depending on
+ * the whims of the architecture.
+ */
+static inline int __srcu_read_lock_lite(struct srcu_struct *ssp)
+{
+	int idx;
+
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_lite().");
+	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
+	this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter); /* Y */
+	barrier(); /* Avoid leaking the critical section. */
+	return idx;
+}
+
+/*
+ * Removes the count for the old reader from the appropriate
+ * per-CPU element of the srcu_struct. Note that this may well be a
+ * different CPU than that which was incremented by the corresponding
+ * srcu_read_lock_lite(), but it must be within the same task.
+ *
+ * Note that this_cpu_inc() is an RCU read-side critical section either
+ * because it disables interrupts, because it is a single instruction,
+ * or because it is a read-modify-write atomic operation, depending on
+ * the whims of the architecture.
+ */
+static inline void __srcu_read_unlock_lite(struct srcu_struct *ssp, int idx)
+{
+	barrier(); /* Avoid leaking the critical section. */
+	this_cpu_inc(ssp->sda->srcu_unlock_count[idx].counter); /* Z */
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_unlock_lite().");
+}
+
+void __srcu_check_read_flavor(struct srcu_struct *ssp, int read_flavor);
+
+// Record _lite() usage even for CONFIG_PROVE_RCU=n kernels.
+static inline void srcu_check_read_flavor_lite(struct srcu_struct *ssp)
+{
+	struct srcu_data *sdp = raw_cpu_ptr(ssp->sda);
+
+	if (likely(READ_ONCE(sdp->srcu_reader_flavor) & SRCU_READ_FLAVOR_LITE))
+		return;
+
+	// Note that the cmpxchg() in srcu_check_read_flavor() is fully ordered.
+	__srcu_check_read_flavor(ssp, SRCU_READ_FLAVOR_LITE);
+}
+
+// Record non-_lite() usage only for CONFIG_PROVE_RCU=y kernels.
+static inline void srcu_check_read_flavor(struct srcu_struct *ssp, int read_flavor)
+{
+	if (IS_ENABLED(CONFIG_PROVE_RCU))
+		__srcu_check_read_flavor(ssp, read_flavor);
+}
+
 #endif
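
The counter scheme used by `__srcu_read_lock_lite()` and `__srcu_read_unlock_lite()` can be illustrated with a single-threaded userspace toy model. Everything below is a sketch under loud assumptions: the `toy_*` names are hypothetical, the "CPU" is passed explicitly instead of using per-CPU data, there are no memory barriers or grace periods, and the rule that an index is quiescent when its summed lock and unlock counts match is an assumption about the update side, which does not appear in this excerpt.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 2 /* hypothetical CPU count for the model */

/* Per-"CPU" counters, mirroring srcu_lock_count[2]/srcu_unlock_count[2]. */
struct toy_srcu_data {
	long lock_count[2];
	long unlock_count[2];
};

struct toy_srcu {
	int srcu_idx; /* low-order bit selects the current counter pair */
	struct toy_srcu_data sda[TOY_NR_CPUS];
};

/* Count a new reader on the given CPU and return the index it used. */
static int toy_read_lock(struct toy_srcu *sp, int cpu)
{
	int idx = sp->srcu_idx & 0x1;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	sp->sda[cpu].lock_count[idx]++;
	return idx;
}

/*
 * Count the reader's exit. As the comment in the diff notes, this may
 * happen on a different CPU than the matching lock, so only the index
 * has to match.
 */
static void toy_read_unlock(struct toy_srcu *sp, int cpu, int idx)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	assert((idx & ~0x1) == 0);
	sp->sda[cpu].unlock_count[idx]++;
}

/* Assumed update-side check: no readers remain on idx when the sums match. */
static bool toy_readers_done(const struct toy_srcu *sp, int idx)
{
	long locks = 0, unlocks = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		locks += sp->sda[cpu].lock_count[idx];
		unlocks += sp->sda[cpu].unlock_count[idx];
	}
	return locks == unlocks;
}
```

The model shows why the index returned by the lock side must be passed unaltered to the unlock side: a reader that started under index 0 keeps index 0 "busy" until its unlock is counted, even if the current index has since flipped to 1 and the unlock runs on another CPU.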

kernel/rcu/Kconfig

Lines changed: 18 additions & 10 deletions
@@ -249,16 +249,24 @@ config RCU_NOCB_CPU
 workloads will incur significant increases in context-switch
 rates.
 
-This option offloads callback invocation from the set of CPUs
-specified at boot time by the rcu_nocbs parameter. For each
-such CPU, a kthread ("rcuox/N") will be created to invoke
-callbacks, where the "N" is the CPU being offloaded, and where
-the "x" is "p" for RCU-preempt (PREEMPTION kernels) and "s" for
-RCU-sched (!PREEMPTION kernels). Nothing prevents this kthread
-from running on the specified CPUs, but (1) the kthreads may be
-preempted between each callback, and (2) affinity or cgroups can
-be used to force the kthreads to run on whatever set of CPUs is
-desired.
+This option offloads callback invocation from the set of
+CPUs specified at boot time by the rcu_nocbs parameter.
+For each such CPU, a kthread ("rcuox/N") will be created to
+invoke callbacks, where the "N" is the CPU being offloaded,
+and where the "x" is "p" for RCU-preempt (PREEMPTION kernels)
+and "s" for RCU-sched (!PREEMPTION kernels). This option
+also creates another kthread for each sqrt(nr_cpu_ids) CPUs
+("rcuog/N", where N is the first CPU in that group to come
+online), which handles grace periods for its group. Nothing
+prevents these kthreads from running on the specified CPUs,
+but (1) the kthreads may be preempted between each callback,
+and (2) affinity or cgroups can be used to force the kthreads
+to run on whatever set of CPUs is desired.
+
+The sqrt(nr_cpu_ids) grouping may be overridden using the
+rcutree.rcu_nocb_gp_stride kernel boot parameter. This can
+be especially helpful for smaller numbers of CPUs, where
+sqrt(nr_cpu_ids) can be a bit of a blunt instrument.
 
 Say Y here if you need reduced OS jitter, despite added overhead.
 Say N here if you are unsure.
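
To get a feel for the sqrt(nr_cpu_ids) grouping described in the new help text, the arithmetic can be sketched in userspace C. This follows only the wording of the help text ("another kthread for each sqrt(nr_cpu_ids) CPUs"); the kernel's exact stride computation is not shown in this commit excerpt and may differ. `toy_int_sqrt()` and `toy_nr_gp_kthreads()` are hypothetical helpers, and treating a stride of 0 as "no override" is a convention invented for this sketch, not the semantics of rcutree.rcu_nocb_gp_stride.

```c
#include <assert.h>

/* Integer square root by simple search; fine for CPU-count-sized inputs. */
static unsigned int toy_int_sqrt(unsigned int n)
{
	unsigned int r = 0;

	/* Widen before multiplying so the comparison cannot overflow. */
	while ((unsigned long long)(r + 1) * (r + 1) <= n)
		r++;
	return r;
}

/*
 * Estimate how many grace-period ("rcuog") kthreads the help text
 * implies: one per group of `stride` CPUs, where the stride defaults
 * to the integer square root of nr_cpu_ids. A stride of 0 means "use
 * the default" in this sketch only.
 */
static unsigned int toy_nr_gp_kthreads(unsigned int nr_cpu_ids,
				       unsigned int stride)
{
	assert(nr_cpu_ids > 0);
	if (stride == 0)
		stride = toy_int_sqrt(nr_cpu_ids);
	return (nr_cpu_ids + stride - 1) / stride; /* round up */
}
```

Under these assumptions, 16 CPUs yield 4 groups of 4, while 6 CPUs yield 3 groups of 2, which hints at why the help text calls the default a blunt instrument for small CPU counts and points at the stride override.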

kernel/rcu/rcu_segcblist.h

Lines changed: 0 additions & 1 deletion
@@ -120,7 +120,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
 void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v);
 void rcu_segcblist_init(struct rcu_segcblist *rsclp);
 void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
-void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload);
 bool rcu_segcblist_ready_cbs(struct rcu_segcblist *rsclp);
 bool rcu_segcblist_pend_cbs(struct rcu_segcblist *rsclp);
 struct rcu_head *rcu_segcblist_first_cb(struct rcu_segcblist *rsclp);
