Commit 416ef71

royno-nvidia authored and kuba-moo committed
net/mlx5: Update fw fatal reporter state on PCI handlers successful recover
The devlink health fw_fatal reporter state must be set to "healthy" by explicitly calling devlink_health_reporter_state_update() after the PCI error handler has completed recovery.

This is needed when the fw_fatal reporter was triggered by a PCI error. Poll health is called and sets the reporter state to error. Health recovery fails (since EEH has not re-enabled the PCI device). The PCI handlers continue the recovery flow and succeed later, without devlink being notified.

Fix this by adding a devlink state update at the end of the PCI handler recovery process.

Fixes: 6181e5c ("devlink: add support for reporter recovery completion")
Signed-off-by: Roy Novich <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Reviewed-by: Aya Levin <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
1 parent 94d6517 commit 416ef71

1 file changed, +4 -0 lines changed
  • drivers/net/ethernet/mellanox/mlx5/core
drivers/net/ethernet/mellanox/mlx5/core/main.c

Lines changed: 4 additions & 0 deletions
@@ -1872,6 +1872,10 @@ static void mlx5_pci_resume(struct pci_dev *pdev)
 
 	err = mlx5_load_one(dev, false);
 
+	if (!err)
+		devlink_health_reporter_state_update(dev->priv.health.fw_fatal_reporter,
+						     DEVLINK_HEALTH_REPORTER_STATE_HEALTHY);
+
 	mlx5_pci_trace(dev, "Done, err = %d, device %s\n", err,
 		       !err ? "recovered" : "Failed");
 }
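
The sequence described in the commit message can be sketched as a small user-space model. This is only an illustration, not kernel code: every type and function below (mock_dev, mock_reporter, health_report_and_recover, pci_resume, reporter_state_update) is a hypothetical stand-in, and the real devlink and mlx5 logic is considerably more involved. The model assumes that a failed health recovery leaves the reporter in the error state and that nothing else clears it afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the devlink reporter states (not the kernel enum). */
enum reporter_state { STATE_HEALTHY, STATE_ERROR };

struct mock_reporter {
	enum reporter_state state;
};

struct mock_dev {
	struct mock_reporter fw_fatal; /* models the fw_fatal reporter */
	bool pci_enabled;              /* false until the PCI layer re-enables the device */
};

/* Stand-in for devlink_health_reporter_state_update(). */
static void reporter_state_update(struct mock_reporter *r, enum reporter_state s)
{
	r->state = s;
}

/*
 * Models poll health reporting a fatal error and attempting recovery.
 * The reporter goes to error; recovery fails while PCI is still disabled,
 * so the reporter stays in the error state.
 */
static int health_report_and_recover(struct mock_dev *dev)
{
	dev->fw_fatal.state = STATE_ERROR;
	if (!dev->pci_enabled)
		return -1;
	dev->fw_fatal.state = STATE_HEALTHY;
	return 0;
}

/*
 * Models the PCI resume handler. 'with_fix' selects whether the reporter
 * state is updated after a successful reload, as the patch does.
 */
static int pci_resume(struct mock_dev *dev, bool with_fix)
{
	int err;

	dev->pci_enabled = true;
	err = 0; /* stands in for a successful device reload */

	if (!err && with_fix)
		reporter_state_update(&dev->fw_fatal, STATE_HEALTHY);

	return err;
}

/* Walks through the scenario with and without the state update. */
static void demo(void)
{
	struct mock_dev before = { { STATE_HEALTHY }, false };
	struct mock_dev after = { { STATE_HEALTHY }, false };

	/* Without the update: device recovers, reporter still says error. */
	assert(health_report_and_recover(&before) != 0);
	assert(pci_resume(&before, false) == 0);
	assert(before.fw_fatal.state == STATE_ERROR);

	/* With the update: reporter state matches the recovered device. */
	assert(health_report_and_recover(&after) != 0);
	assert(pci_resume(&after, true) == 0);
	assert(after.fw_fatal.state == STATE_HEALTHY);
}
```

In this model, the only difference between the two runs is the state update after a successful reload, which mirrors the four added lines in mlx5_pci_resume().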
