Commit db9b31a

Akiva Goldberger authored and kuba-moo committed
net/mlx5: Discard command completions in internal error
Fix use after free when FW completion arrives while device is in internal
error state. Avoid calling completion handler in this case, since the
device will flush the command interface and trigger all completions
manually.

Kernel log:
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
...
RIP: 0010:refcount_warn_saturate+0xd8/0xe0
...
Call Trace:
<IRQ>
? __warn+0x79/0x120
? refcount_warn_saturate+0xd8/0xe0
? report_bug+0x17c/0x190
? handle_bug+0x3c/0x60
? exc_invalid_op+0x14/0x70
? asm_exc_invalid_op+0x16/0x20
? refcount_warn_saturate+0xd8/0xe0
cmd_ent_put+0x13b/0x160 [mlx5_core]
mlx5_cmd_comp_handler+0x5f9/0x670 [mlx5_core]
cmd_comp_notifier+0x1f/0x30 [mlx5_core]
notifier_call_chain+0x35/0xb0
atomic_notifier_call_chain+0x16/0x20
mlx5_eq_async_int+0xf6/0x290 [mlx5_core]
notifier_call_chain+0x35/0xb0
atomic_notifier_call_chain+0x16/0x20
irq_int_handler+0x19/0x30 [mlx5_core]
__handle_irq_event_percpu+0x4b/0x160
handle_irq_event+0x2e/0x80
handle_edge_irq+0x98/0x230
__common_interrupt+0x3b/0xa0
common_interrupt+0x7b/0xa0
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40

Fixes: 51d138c ("net/mlx5: Fix health error state handling")
Signed-off-by: Akiva Goldberger <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
1 parent 485d65e commit db9b31a

1 file changed: +3 -0 lines changed

drivers/net/ethernet/mellanox/mlx5/core/cmd.c

Lines changed: 3 additions & 0 deletions
@@ -1634,6 +1634,9 @@ static int cmd_comp_notifier(struct notifier_block *nb,
 	dev = container_of(cmd, struct mlx5_core_dev, cmd);
 	eqe = data;
 
+	if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
+		return NOTIFY_DONE;
+
 	mlx5_cmd_comp_handler(dev, be32_to_cpu(eqe->data.cmd.vector), false);
 
 	return NOTIFY_OK;
