Commit 8ccd692

Dick Kennedy authored and martinkpetersen committed
scsi: lpfc: Fix RSCN timeout due to incorrect gidft counter
In configs with a large number of initiators in the same zone (>250), RSCN timeouts are seen when creating or deleting vports:

  lpfc 0000:07:00.1: 5:(0):0231 RSCN timeout Data: x0 x3

During RSCN processing driver issues GID_FT command to nameserver. A counter for number of simultaneous GID_FT commands is maintained (an unsigned value). The counter is incremented when the GID_FT is issued. If the GID_FT command fails for some reason the driver retries the GID_FT from the completion call back. But the counter was decremented before the retry was issued. When the second GID_FT completes, the callback again tries to decrement the counter, possibly wrapping to a very large non-zero value, which causes the RSCN cleanup code to not execute. Thus the RSCN timeout failure.

Do not decrement the counter on a retry. Also add defensive checks to ensure the counter is not decremented if already zero.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dick Kennedy <[email protected]>
Signed-off-by: James Smart <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
1 parent 9e3e365 commit 8ccd692
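
The wraparound described in the commit message can be reproduced outside the driver. The following is a minimal sketch, not driver code: `struct demo_vport` and the `demo_*` helpers are hypothetical names, and the 32-bit counter width is an assumption (the message only says the counter is unsigned).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vport state; not the driver's struct. */
struct demo_vport {
	uint32_t gidft_inp;	/* assumed width; the source only says "unsigned" */
};

/* Pre-fix style: decrement unconditionally. */
static void demo_put_unguarded(struct demo_vport *v)
{
	v->gidft_inp--;
}

/* Post-fix style: never decrement a counter that is already zero. */
static void demo_put_guarded(struct demo_vport *v)
{
	if (v->gidft_inp)
		v->gidft_inp--;
}

/* Count one issued command, then release it twice. */
static uint32_t demo_double_release(bool guarded)
{
	struct demo_vport v = { .gidft_inp = 0 };

	v.gidft_inp++;			/* one GID_FT issued */
	assert(v.gidft_inp == 1);

	for (int i = 0; i < 2; i++) {
		if (guarded)
			demo_put_guarded(&v);
		else
			demo_put_unguarded(&v);
	}
	return v.gidft_inp;
}
```

With these definitions, `demo_double_release(false)` returns `UINT32_MAX`, a non-zero value that a "no queries outstanding" check would never accept, while `demo_double_release(true)` returns 0.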

1 file changed: 16 additions, 6 deletions

drivers/scsi/lpfc/lpfc_ct.c

Lines changed: 16 additions & 6 deletions
@@ -713,7 +713,8 @@ lpfc_cmpl_ct_cmd_gid_ft(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 		/* This is a GID_FT completing so the gidft_inp counter was
 		 * incremented before the GID_FT was issued to the wire.
 		 */
-		vport->gidft_inp--;
+		if (vport->gidft_inp)
+			vport->gidft_inp--;
 
 		/*
 		 * Skip processing the NS response
@@ -741,11 +742,14 @@ lpfc_cmpl_ct_cmd_gid_ft(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 				goto out;
 
 			/* CT command is being retried */
-			vport->gidft_inp--;
 			rc = lpfc_ns_cmd(vport, SLI_CTNS_GID_FT,
 					 vport->fc_ns_retry, type);
 			if (rc == 0)
 				goto out;
+			else { /* Unable to send NS cmd */
+				if (vport->gidft_inp)
+					vport->gidft_inp--;
+			}
 		}
 		if (vport->fc_flag & FC_RSCN_MODE)
 			lpfc_els_flush_rscn(vport);
@@ -825,7 +829,8 @@ lpfc_cmpl_ct_cmd_gid_ft(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 					(uint32_t) CTrsp->ReasonCode,
 					(uint32_t) CTrsp->Explanation);
 		}
-		vport->gidft_inp--;
+		if (vport->gidft_inp)
+			vport->gidft_inp--;
 	}
 
 	lpfc_printf_vlog(vport, KERN_INFO, LOG_DISCOVERY,
@@ -918,7 +923,8 @@ lpfc_cmpl_ct_cmd_gid_pt(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 		/* This is a GID_PT completing so the gidft_inp counter was
 		 * incremented before the GID_PT was issued to the wire.
 		 */
-		vport->gidft_inp--;
+		if (vport->gidft_inp)
+			vport->gidft_inp--;
 
 		/*
 		 * Skip processing the NS response
@@ -942,11 +948,14 @@ lpfc_cmpl_ct_cmd_gid_pt(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 				vport->fc_ns_retry++;
 
 			/* CT command is being retried */
-			vport->gidft_inp--;
 			rc = lpfc_ns_cmd(vport, SLI_CTNS_GID_PT,
 					 vport->fc_ns_retry, GID_PT_N_PORT);
 			if (rc == 0)
 				goto out;
+			else { /* Unable to send NS cmd */
+				if (vport->gidft_inp)
+					vport->gidft_inp--;
+			}
 		}
 		if (vport->fc_flag & FC_RSCN_MODE)
 			lpfc_els_flush_rscn(vport);
@@ -1027,7 +1036,8 @@ lpfc_cmpl_ct_cmd_gid_pt(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
 					(uint32_t)CTrsp->ReasonCode,
 					(uint32_t)CTrsp->Explanation);
 		}
-		vport->gidft_inp--;
+		if (vport->gidft_inp)
+			vport->gidft_inp--;
 	}
 
 	lpfc_printf_vlog(vport, KERN_INFO, LOG_DISCOVERY,
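
The accounting change in the retry hunks can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the driver's logic: every `sketch_*` name is hypothetical, the counter is assumed to be 32 bits wide, and the retry is assumed to be reissued without incrementing the counter again, as the commit message implies. Only two completion outcomes are modeled: a failure that triggers a retry, and a success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-vport counter; not the driver's struct. */
struct sketch_vport {
	uint32_t gidft_inp;	/* queries counted as in flight (assumed width) */
};

/* Guarded release, mirroring the defensive check added by the commit. */
static void sketch_put(struct sketch_vport *v)
{
	if (v->gidft_inp)
		v->gidft_inp--;
}

/*
 * Model one completion. Returns true if a retry is now in flight.
 * fixed == false: release before the retry (pre-fix accounting).
 * fixed == true:  keep the count across a retry and release only when
 *                 the retry cannot be sent (post-fix accounting).
 */
static bool sketch_complete(struct sketch_vport *v, bool failed,
			    bool resend_ok, bool fixed)
{
	if (failed) {
		if (!fixed)
			v->gidft_inp--;	/* released too early */
		if (resend_ok)
			return true;	/* retry issued, counter not incremented */
		if (fixed)
			sketch_put(v);	/* unable to send the retry */
		return false;
	}

	/* Successful completion: final release. */
	if (fixed)
		sketch_put(v);
	else
		v->gidft_inp--;
	return false;
}

/* Issue one query, fail it once, then complete the retry if one was sent. */
static uint32_t sketch_run(bool fixed, bool resend_ok)
{
	struct sketch_vport v = { .gidft_inp = 0 };

	v.gidft_inp++;		/* initial issue counts the query */
	assert(v.gidft_inp == 1);

	if (sketch_complete(&v, true, resend_ok, fixed))
		sketch_complete(&v, false, false, fixed);
	return v.gidft_inp;
}
```

In this model, `sketch_run(false, true)` ends at `UINT32_MAX` because the retry's completion decrements a counter that was already released, whereas `sketch_run(true, true)` and `sketch_run(true, false)` both end at 0.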
