Commit b5bbc65

kadesai16 authored and rleon committed
RDMA/bnxt_re: Prevent handling any completions after qp destroy
HW may generate completions that indicate the QP is destroyed. The driver should not schedule any more completion handlers for this QP after the QP is destroyed. Since CQs are active during the QP destroy, the driver may still schedule completion handlers. This can cause a race where destroy_cq and poll_cq run simultaneously.

Snippet of the kernel panic seen while loading and unloading the bnxt_re driver in a loop. This indicates a poll after the CQ is freed.

[77786.481636] Call Trace:
[77786.481640]  <TASK>
[77786.481644]  bnxt_re_poll_cq+0x14a/0x620 [bnxt_re]
[77786.481658]  ? kvm_clock_read+0x14/0x30
[77786.481693]  __ib_process_cq+0x57/0x190 [ib_core]
[77786.481728]  ib_cq_poll_work+0x26/0x80 [ib_core]
[77786.481761]  process_one_work+0x1e5/0x3f0
[77786.481768]  worker_thread+0x50/0x3a0
[77786.481785]  ? __pfx_worker_thread+0x10/0x10
[77786.481790]  kthread+0xe2/0x110
[77786.481794]  ? __pfx_kthread+0x10/0x10
[77786.481797]  ret_from_fork+0x2c/0x50

To avoid this, complete all completion handlers before returning from destroy QP. If free_cq is called soon after destroy_qp, the IB stack will cancel the CQ work before invoking the destroy_cq verb, and this will prevent the race mentioned above.

Fixes: 1ac5a40 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Kashyap Desai <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
1 parent dc52aad commit b5bbc65

3 files changed: +31 -0 lines changed
drivers/infiniband/hw/bnxt_re/ib_verbs.c

Lines changed: 12 additions & 0 deletions
@@ -869,7 +869,10 @@ static int bnxt_re_destroy_gsi_sqp(struct bnxt_re_qp *qp)
 int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
 {
 	struct bnxt_re_qp *qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp);
+	struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp;
 	struct bnxt_re_dev *rdev = qp->rdev;
+	struct bnxt_qplib_nq *scq_nq = NULL;
+	struct bnxt_qplib_nq *rcq_nq = NULL;
 	unsigned int flags;
 	int rc;
 
@@ -903,6 +906,15 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
 	ib_umem_release(qp->rumem);
 	ib_umem_release(qp->sumem);
 
+	/* Flush all the entries of notification queue associated with
+	 * given qp.
+	 */
+	scq_nq = qplib_qp->scq->nq;
+	rcq_nq = qplib_qp->rcq->nq;
+	bnxt_re_synchronize_nq(scq_nq);
+	if (scq_nq != rcq_nq)
+		bnxt_re_synchronize_nq(rcq_nq);
+
 	return 0;
 }
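
The hunk above synchronizes the send-CQ's notification queue and then the receive-CQ's queue only if it is a different one, so a shared NQ is not drained twice. A minimal stand-alone sketch of that pattern follows. Every `toy_*` type and function here is a hypothetical stand-in for illustration, not part of the bnxt_re driver, and the "sync" is reduced to a call counter.

```c
#include <assert.h>

/* Hypothetical stand-ins; these are NOT the driver's structures. */
struct toy_nq { int sync_calls; };               /* counts drain requests */
struct toy_cq { struct toy_nq *nq; };            /* a CQ points at its NQ */
struct toy_qp { struct toy_cq *scq, *rcq; };     /* send and receive CQs  */

/* Stand-in for the synchronize step: only records that it was called. */
static void toy_sync(struct toy_nq *nq)
{
	nq->sync_calls++;
}

/* Drain each distinct NQ used by the QP exactly once. */
static void toy_flush_qp_nqs(struct toy_qp *qp)
{
	struct toy_nq *scq_nq, *rcq_nq;

	assert(qp && qp->scq && qp->rcq);
	scq_nq = qp->scq->nq;
	rcq_nq = qp->rcq->nq;

	toy_sync(scq_nq);
	if (scq_nq != rcq_nq)   /* skip the second drain when the NQ is shared */
		toy_sync(rcq_nq);
}
```

With a QP whose two CQs share one NQ, `toy_flush_qp_nqs()` bumps that NQ's counter once; with two separate NQs, each counter is bumped once.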

drivers/infiniband/hw/bnxt_re/qplib_fp.c

Lines changed: 18 additions & 0 deletions
@@ -381,6 +381,24 @@ static void bnxt_qplib_service_nq(struct tasklet_struct *t)
 	spin_unlock_bh(&hwq->lock);
 }
 
+/* bnxt_re_synchronize_nq - self polling notification queue.
+ * @nq      -     notification queue pointer
+ *
+ * This function will start polling entries of a given notification queue
+ * for all pending entries.
+ * This function is useful to synchronize notification entries while resources
+ * are going away.
+ */
+
+void bnxt_re_synchronize_nq(struct bnxt_qplib_nq *nq)
+{
+	int budget = nq->budget;
+
+	nq->budget = nq->hwq.max_elements;
+	bnxt_qplib_service_nq(&nq->nq_tasklet);
+	nq->budget = budget;
+}
+
 static irqreturn_t bnxt_qplib_nq_irq(int irq, void *dev_instance)
 {
 	struct bnxt_qplib_nq *nq = dev_instance;
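
The new helper temporarily raises the per-pass budget to the queue's capacity, runs the normal service routine once, and then restores the old budget. The sketch below models that idea on a toy queue to show why one pass is then enough to drain everything pending. It assumes a service routine that handles at most `budget` entries per call. The `toy_*` names and fields are hypothetical and are not the driver's `struct bnxt_qplib_nq`. The real routine also takes a lock and walks hardware queue entries, which this model leaves out.

```c
#include <assert.h>

/* Toy model of a notification queue; NOT the driver's structure. */
struct toy_nq {
	int pending;       /* entries waiting to be serviced       */
	int max_elements;  /* queue capacity                       */
	int budget;        /* max entries handled per service pass */
	int handled;       /* running total of serviced entries    */
};

/* Service at most nq->budget pending entries, like a budgeted tasklet. */
static void toy_service_nq(struct toy_nq *nq)
{
	int n = nq->pending < nq->budget ? nq->pending : nq->budget;

	nq->pending -= n;
	nq->handled += n;
}

/* Drain the queue in one pass by lifting the budget to the capacity. */
static void toy_synchronize_nq(struct toy_nq *nq)
{
	int budget = nq->budget;            /* remember the normal limit */

	assert(nq->pending <= nq->max_elements);
	nq->budget = nq->max_elements;      /* allow a full-queue pass   */
	toy_service_nq(nq);
	nq->budget = budget;                /* restore the normal limit  */
}
```

For example, with 10 pending entries and a budget of 4, a normal `toy_service_nq()` call leaves 6 entries behind, while `toy_synchronize_nq()` leaves none and puts the budget back to 4.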

drivers/infiniband/hw/bnxt_re/qplib_fp.h

Lines changed: 1 addition & 0 deletions
@@ -553,6 +553,7 @@ int bnxt_qplib_process_flush_list(struct bnxt_qplib_cq *cq,
 				  struct bnxt_qplib_cqe *cqe,
 				  int num_cqes);
 void bnxt_qplib_flush_cqn_wq(struct bnxt_qplib_qp *qp);
+void bnxt_re_synchronize_nq(struct bnxt_qplib_nq *nq);
 
 static inline void *bnxt_qplib_get_swqe(struct bnxt_qplib_q *que, u32 *swq_idx)
 {
