You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()
There may be a race between nvme_reap_pending_cqes() and nvme_poll(), e.g.,
when doing live reset while polling the nvme device.
CPU X CPU Y
nvme_poll()
nvme_dev_disable()
-> nvme_stop_queues()
-> nvme_suspend_io_queues()
-> nvme_suspend_queue()
-> spin_lock(&nvmeq->cq_poll_lock);
-> nvme_reap_pending_cqes()
-> nvme_process_cq() -> nvme_process_cq()
In the above scenario, the nvme_process_cq() for the same queue may be
running on both CPU X and CPU Y concurrently.
It is much more easier to reproduce the issue when CONFIG_PREEMPT is
enabled in kernel. When CONFIG_PREEMPT is disabled, it would take longer
time for nvme_stop_queues()-->blk_mq_quiesce_queue() to wait for grace
period.
This patch protects nvme_process_cq() with nvmeq->cq_poll_lock in
nvme_reap_pending_cqes().
Fixes: fa46c6f ("nvme/pci: move cqe check after device shutdown")
Signed-off-by: Dongli Zhang <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
0 commit comments