
Commit 85f0d8e

guowangy authored and htejun committed
workqueue: Reduce expensive locks for unbound workqueue
For unbound workqueues, pwqs usually map to just a few pools. Most of the time, pwqs will be linked sequentially to the wq->pwqs list by CPU index. Usually, consecutive CPUs have the same workqueue attributes (e.g. they belong to the same NUMA node). This makes pwqs with the same pool cluster together in the pwq list.

Only do lock/unlock if the pool has changed in flush_workqueue_prep_pwqs(). This reduces the number of expensive lock operations.

The performance data shows this change boosts FIO by 65x in some cases when multiple concurrent threads write to xfs mount points with fsync.

FIO Benchmark Details
- FIO version: v3.35
- FIO Options: ioengine=libaio,iodepth=64,norandommap=1,rw=write,size=128M,bs=4k,fsync=1
- FIO Job Configs: 64 jobs in total writing to 4 mount points (ramdisks formatted as xfs file system).
- Kernel Codebase: v6.12-rc5
- Test Platform: Xeon 8380 (2 sockets)

Reviewed-by: Tim Chen <[email protected]>
Signed-off-by: Wangyang Guo <[email protected]>
Reviewed-by: Lai Jiangshan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
1 parent 5814346 commit 85f0d8e

File tree

1 file changed: +18 -4 lines changed


kernel/workqueue.c

Lines changed: 18 additions & 4 deletions
@@ -3833,16 +3833,28 @@ static bool flush_workqueue_prep_pwqs(struct workqueue_struct *wq,
 {
 	bool wait = false;
 	struct pool_workqueue *pwq;
+	struct worker_pool *current_pool = NULL;
 
 	if (flush_color >= 0) {
 		WARN_ON_ONCE(atomic_read(&wq->nr_pwqs_to_flush));
 		atomic_set(&wq->nr_pwqs_to_flush, 1);
 	}
 
+	/*
+	 * For unbound workqueue, pwqs will map to only a few pools.
+	 * Most of the time, pwqs within the same pool will be linked
+	 * sequentially to wq->pwqs by cpu index. So in the majority
+	 * of pwq iters, the pool is the same, only doing lock/unlock
+	 * if the pool has changed. This can largely reduce expensive
+	 * lock operations.
+	 */
 	for_each_pwq(pwq, wq) {
-		struct worker_pool *pool = pwq->pool;
-
-		raw_spin_lock_irq(&pool->lock);
+		if (current_pool != pwq->pool) {
+			if (likely(current_pool))
+				raw_spin_unlock_irq(&current_pool->lock);
+			current_pool = pwq->pool;
+			raw_spin_lock_irq(&current_pool->lock);
+		}
 
 		if (flush_color >= 0) {
 			WARN_ON_ONCE(pwq->flush_color != -1);
@@ -3859,9 +3871,11 @@ static bool flush_workqueue_prep_pwqs(struct workqueue_struct *wq,
 			pwq->work_color = work_color;
 		}
 
-		raw_spin_unlock_irq(&pool->lock);
 	}
 
+	if (current_pool)
+		raw_spin_unlock_irq(&current_pool->lock);
+
 	if (flush_color >= 0 && atomic_dec_and_test(&wq->nr_pwqs_to_flush))
 		complete(&wq->first_flusher->done);
 
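For readers outside the kernel tree, the hunk above boils down to a simple pattern: when the items being iterated are clustered by the lock domain they belong to, hold the current domain's lock across the whole cluster and switch locks only at a cluster boundary. Below is a minimal user-space sketch of that pattern, assuming hypothetical names (struct pool, struct item, process_items) and pthread mutexes in place of the kernel's raw spinlocks; it is an illustration under those assumptions, not kernel code.

#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a worker_pool: a lock plus per-pool state. */
struct pool {
	pthread_mutex_t lock;
	int flushed_items;
};

/* Hypothetical stand-in for a pool_workqueue entry pointing at its pool. */
struct item {
	struct pool *pool;
	int work_color;
};

/*
 * Walk the items, which are assumed to be clustered by pool (as pwqs are
 * clustered in wq->pwqs by CPU index), and lock/unlock only when the pool
 * changes instead of once per item.
 */
static void process_items(struct item *items, size_t n, int color)
{
	struct pool *current_pool = NULL;

	for (size_t i = 0; i < n; i++) {
		if (items[i].pool != current_pool) {
			if (current_pool)
				pthread_mutex_unlock(&current_pool->lock);
			current_pool = items[i].pool;
			pthread_mutex_lock(&current_pool->lock);
		}

		/* Work done while holding the pool lock. */
		items[i].work_color = color;
		current_pool->flushed_items++;
	}

	/* Drop the lock still held for the final cluster, if any. */
	if (current_pool)
		pthread_mutex_unlock(&current_pool->lock);
}

int main(void)
{
	struct pool a = { PTHREAD_MUTEX_INITIALIZER, 0 };
	struct pool b = { PTHREAD_MUTEX_INITIALIZER, 0 };
	/* Items clustered by pool: a, a, a, b, b -> only two lock switches. */
	struct item items[] = {
		{ &a, -1 }, { &a, -1 }, { &a, -1 }, { &b, -1 }, { &b, -1 },
	};

	process_items(items, sizeof(items) / sizeof(items[0]), 1);
	printf("pool a: %d items, pool b: %d items\n",
	       a.flushed_items, b.flushed_items);
	return 0;
}

With per-item locking the loop above would take and release a mutex five times; with the cluster-aware version it does so twice, which is the same reduction the commit applies to pool->lock acquisitions in flush_workqueue_prep_pwqs().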