
Commit 6bda857

Muchun Song authored and axboe committed
block: fix ordering between checking QUEUE_FLAG_QUIESCED request adding
Supposing the following scenario.

CPU0                                            CPU1

blk_mq_insert_request()        1) store
                                                blk_mq_unquiesce_queue()
                                                blk_queue_flag_clear()               3) store
                                                blk_mq_run_hw_queues()
                                                  blk_mq_run_hw_queue()
                                                    if (!blk_mq_hctx_has_pending())  4) load
                                                      return
blk_mq_run_hw_queue()
  if (blk_queue_quiesced())    2) load
    return
blk_mq_sched_dispatch_requests()

A full memory barrier should be inserted between 1) and 2), as well as
between 3) and 4), to make sure that either CPU0 sees that
QUEUE_FLAG_QUIESCED is cleared, or CPU1 sees the dispatch list or the set
bit in the software queue's bitmap. Otherwise, neither CPU reruns the
hardware queue, causing starvation.

So there are two possible fixes: 1) add a pair of memory barriers, or
2) use hctx->queue->queue_lock to synchronize access to
QUEUE_FLAG_QUIESCED. Here, we chose 2), since memory barriers are not
easy to maintain.

Fixes: f4560ff ("blk-mq: use QUEUE_FLAG_QUIESCED to quiesce queue")
Cc: [email protected]
Cc: Muchun Song <[email protected]>
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
1 parent 2003ee8 commit 6bda857

File tree

1 file changed (+34, -13 lines)


block/blk-mq.c

Lines changed: 34 additions & 13 deletions
@@ -2227,6 +2227,24 @@ void blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, unsigned long msecs)
 }
 EXPORT_SYMBOL(blk_mq_delay_run_hw_queue);
 
+static inline bool blk_mq_hw_queue_need_run(struct blk_mq_hw_ctx *hctx)
+{
+	bool need_run;
+
+	/*
+	 * When queue is quiesced, we may be switching io scheduler, or
+	 * updating nr_hw_queues, or other things, and we can't run queue
+	 * any more, even blk_mq_hctx_has_pending() can't be called safely.
+	 *
+	 * And queue will be rerun in blk_mq_unquiesce_queue() if it is
+	 * quiesced.
+	 */
+	__blk_mq_run_dispatch_ops(hctx->queue, false,
+		need_run = !blk_queue_quiesced(hctx->queue) &&
+		blk_mq_hctx_has_pending(hctx));
+	return need_run;
+}
+
 /**
  * blk_mq_run_hw_queue - Start to run a hardware queue.
  * @hctx: Pointer to the hardware queue to run.
@@ -2247,20 +2265,23 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 
 	might_sleep_if(!async && hctx->flags & BLK_MQ_F_BLOCKING);
 
-	/*
-	 * When queue is quiesced, we may be switching io scheduler, or
-	 * updating nr_hw_queues, or other things, and we can't run queue
-	 * any more, even __blk_mq_hctx_has_pending() can't be called safely.
-	 *
-	 * And queue will be rerun in blk_mq_unquiesce_queue() if it is
-	 * quiesced.
-	 */
-	__blk_mq_run_dispatch_ops(hctx->queue, false,
-		need_run = !blk_queue_quiesced(hctx->queue) &&
-		blk_mq_hctx_has_pending(hctx));
+	need_run = blk_mq_hw_queue_need_run(hctx);
+	if (!need_run) {
+		unsigned long flags;
 
-	if (!need_run)
-		return;
+		/*
+		 * Synchronize with blk_mq_unquiesce_queue(), because we check
+		 * if hw queue is quiesced locklessly above, we need the use
+		 * ->queue_lock to make sure we see the up-to-date status to
+		 * not miss rerunning the hw queue.
+		 */
+		spin_lock_irqsave(&hctx->queue->queue_lock, flags);
+		need_run = blk_mq_hw_queue_need_run(hctx);
+		spin_unlock_irqrestore(&hctx->queue->queue_lock, flags);
+
+		if (!need_run)
+			return;
+	}
 
 	if (async || !cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask)) {
 		blk_mq_delay_run_hw_queue(hctx, 0);
