Commit 428c491

Guilherme G. Piccoli authored and davem330 committed
net: ena: Add PCI shutdown handler to allow safe kexec
Currently ENA only provides the PCI remove() handler, used during rmmod for example. This is not called on shutdown/kexec path; we are potentially creating a failure scenario on kexec:

(a) Kexec is triggered, no shutdown() / remove() handler is called for ENA; instead pci_device_shutdown() clears the master bit of the PCI device, stopping all DMA transactions;

(b) Kexec reboot happens and the device gets enabled again, likely having its FW with that DMA transaction buffered; then it may trigger the (now invalid) memory operation in the new kernel, corrupting kernel memory area.

This patch aims to prevent this, by implementing a shutdown() handler quite similar to the remove() one - the difference being the handling of the netdev, which is unregistered on remove(), but following the convention observed in other drivers, it's only detached on shutdown().

This prevents an odd issue in AWS Nitro instances, in which after the 2nd kexec the next one will fail with an initrd corruption, caused by a wild DMA write to invalid kernel memory. The lspci output for the adapter present in my instance is:

00:05.0 Ethernet controller [0200]: Amazon.com, Inc. Elastic Network Adapter (ENA) [1d0f:ec20]

Suggested-by: Gavin Shan <[email protected]>
Signed-off-by: Guilherme G. Piccoli <[email protected]>
Acked-by: Sameeh Jubran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
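The failure scenario in (a) and (b) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `model_dev`, `model_driver`, `model_quiesce`, `model_pci_shutdown` and `model_reenable` are hypothetical names, not kernel APIs, and the "buffered DMA" flag encodes the commit message's guess about firmware behaviour rather than anything confirmed.

```c
/* User-space model only; none of these types or functions are the kernel's. */
#include <stdbool.h>
#include <stddef.h>

struct model_dev {
	bool bus_master;    /* models the PCI bus-master enable bit */
	bool dma_in_flight; /* assumption: firmware still holds a pending DMA */
};

struct model_driver {
	/* May be NULL, which models the ENA driver before this patch. */
	void (*shutdown)(struct model_dev *dev);
};

/* Hypothetical stand-in for what a shutdown() handler achieves:
 * the device is told to drop any outstanding work. */
static void model_quiesce(struct model_dev *dev)
{
	dev->dma_in_flight = false;
}

/* Models the shape of the shutdown step on the kexec path: call the
 * driver's handler if it has one, then clear bus mastering. */
static void model_pci_shutdown(const struct model_driver *drv,
			       struct model_dev *dev)
{
	if (drv != NULL && drv->shutdown != NULL)
		drv->shutdown(dev);
	dev->bus_master = false; /* DMA is blocked, not cancelled */
}

/* Models the new kernel re-enabling the device. Returns true when a
 * stale DMA would fire into memory the new kernel now owns. */
static bool model_reenable(struct model_dev *dev)
{
	dev->bus_master = true;
	return dev->dma_in_flight;
}
```

In this model, a driver whose `shutdown` pointer is NULL leaves `dma_in_flight` set across the simulated kexec, so `model_reenable()` reports a stale DMA; a driver that installs `model_quiesce` does not.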
1 parent c085dbf commit 428c491

1 file changed (+41, -10 lines)

drivers/net/ethernet/amazon/ena/ena_netdev.c

Lines changed: 41 additions & 10 deletions
@@ -4336,13 +4336,15 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 /*****************************************************************************/
 
-/* ena_remove - Device Removal Routine
+/* __ena_shutoff - Helper used in both PCI remove/shutdown routines
  * @pdev: PCI device information struct
+ * @shutdown: Is it a shutdown operation? If false, means it is a removal
  *
- * ena_remove is called by the PCI subsystem to alert the driver
- * that it should release a PCI device.
+ * __ena_shutoff is a helper routine that does the real work on shutdown and
+ * removal paths; the difference between those paths is with regards to whether
+ * dettach or unregister the netdevice.
  */
-static void ena_remove(struct pci_dev *pdev)
+static void __ena_shutoff(struct pci_dev *pdev, bool shutdown)
 {
 	struct ena_adapter *adapter = pci_get_drvdata(pdev);
 	struct ena_com_dev *ena_dev;
@@ -4361,13 +4363,17 @@ static void ena_remove(struct pci_dev *pdev)
 
 	cancel_work_sync(&adapter->reset_task);
 
-	rtnl_lock();
+	rtnl_lock(); /* lock released inside the below if-else block */
 	ena_destroy_device(adapter, true);
-	rtnl_unlock();
-
-	unregister_netdev(netdev);
-
-	free_netdev(netdev);
+	if (shutdown) {
+		netif_device_detach(netdev);
+		dev_close(netdev);
+		rtnl_unlock();
+	} else {
+		rtnl_unlock();
+		unregister_netdev(netdev);
+		free_netdev(netdev);
+	}
 
 	ena_com_rss_destroy(ena_dev);
 
@@ -4382,6 +4388,30 @@ static void ena_remove(struct pci_dev *pdev)
 	vfree(ena_dev);
 }
 
+/* ena_remove - Device Removal Routine
+ * @pdev: PCI device information struct
+ *
+ * ena_remove is called by the PCI subsystem to alert the driver
+ * that it should release a PCI device.
+ */
+
+static void ena_remove(struct pci_dev *pdev)
+{
+	__ena_shutoff(pdev, false);
+}
+
+/* ena_shutdown - Device Shutdown Routine
+ * @pdev: PCI device information struct
+ *
+ * ena_shutdown is called by the PCI subsystem to alert the driver that
+ * a shutdown/reboot (or kexec) is happening and device must be disabled.
+ */
+
+static void ena_shutdown(struct pci_dev *pdev)
+{
+	__ena_shutoff(pdev, true);
+}
+
 #ifdef CONFIG_PM
 /* ena_suspend - PM suspend callback
  * @pdev: PCI device information struct
@@ -4431,6 +4461,7 @@ static struct pci_driver ena_pci_driver = {
 	.id_table = ena_pci_tbl,
 	.probe = ena_probe,
 	.remove = ena_remove,
+	.shutdown = ena_shutdown,
 #ifdef CONFIG_PM
 	.suspend = ena_suspend,
 	.resume = ena_resume,
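The ordering inside `__ena_shutoff()` is easy to misread in diff form, so the following user-space sketch replays just the locking and netdev calls from the second hunk and records the order in which they run. The `stub_*` functions, `call_log`, `model_reset_log` and `model_shutoff` are hypothetical stand-ins that only append their name to a log; they do not reproduce kernel behaviour. The likely reason for the asymmetry, stated here as background rather than something the commit spells out, is that `unregister_netdev()` takes the RTNL lock itself and so must be called after `rtnl_unlock()`, while `dev_close()` expects the caller to hold it.

```c
/* User-space model of the call order in __ena_shutoff(); stubs only. */
#include <stdbool.h>
#include <string.h>

#define LOG_CAP 256
static char call_log[LOG_CAP];

/* Append one name to the space-separated call log. */
static void log_call(const char *name)
{
	size_t used = strlen(call_log);

	if (used + strlen(name) + 2 < LOG_CAP) {
		if (used > 0)
			strcat(call_log, " ");
		strcat(call_log, name);
	}
}

/* Stubs named after the kernel calls they stand in for. */
static void stub_rtnl_lock(void)           { log_call("rtnl_lock"); }
static void stub_rtnl_unlock(void)         { log_call("rtnl_unlock"); }
static void stub_ena_destroy_device(void)  { log_call("ena_destroy_device"); }
static void stub_netif_device_detach(void) { log_call("netif_device_detach"); }
static void stub_dev_close(void)           { log_call("dev_close"); }
static void stub_unregister_netdev(void)   { log_call("unregister_netdev"); }
static void stub_free_netdev(void)         { log_call("free_netdev"); }

static void model_reset_log(void)
{
	call_log[0] = '\0';
}

/* Mirrors the if/else added by the patch. */
static void model_shutoff(bool shutdown)
{
	stub_rtnl_lock();
	stub_ena_destroy_device();
	if (shutdown) {
		/* Shutdown: detach and close under the lock, keep the netdev. */
		stub_netif_device_detach();
		stub_dev_close();
		stub_rtnl_unlock();
	} else {
		/* Removal: drop the lock first, then unregister and free. */
		stub_rtnl_unlock();
		stub_unregister_netdev();
		stub_free_netdev();
	}
}
```

Calling `model_shutoff(true)` and then inspecting `call_log` shows the shutdown path; calling `model_reset_log()` followed by `model_shutoff(false)` shows the removal path.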
