Skip to content

Commit a847234

Browse files
dcuiliuw
authored andcommitted
Revert "PCI: hv: Fix a timing issue which causes kdump to fail occasionally"
This reverts commit d6af2ed. The statement "the hv_pci_bus_exit() call releases structures of all its child devices" in commit d6af2ed is not true: in the path hv_pci_probe() -> hv_pci_enter_d0() -> hv_pci_bus_exit(hdev, true): the parameter "keep_devs" is true, so hv_pci_bus_exit() does *not* release the child "struct hv_pci_dev *hpdev" that is created earlier in pci_devices_present_work() -> new_pcichild_device(). The commit d6af2ed was originally made in July 2020 for RHEL 7.7, where the old version of hv_pci_bus_exit() was used; when the commit was rebased and merged into the upstream, people didn't notice that it's not really necessary. The commit itself doesn't cause any issue, but it makes hv_pci_probe() more complicated. Revert it to facilitate some upcoming changes to hv_pci_probe(). Signed-off-by: Dexuan Cui <[email protected]> Reviewed-by: Michael Kelley <[email protected]> Acked-by: Wei Hu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]>
1 parent add9195 commit a847234

File tree

1 file changed

+34
-37
lines changed

1 file changed

+34
-37
lines changed

drivers/pci/controller/pci-hyperv.c

Lines changed: 34 additions & 37 deletions
Original file line numberDiff line numberDiff line change
@@ -3318,8 +3318,10 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
33183318
struct pci_bus_d0_entry *d0_entry;
33193319
struct hv_pci_compl comp_pkt;
33203320
struct pci_packet *pkt;
3321+
bool retry = true;
33213322
int ret;
33223323

3324+
enter_d0_retry:
33233325
/*
33243326
* Tell the host that the bus is ready to use, and moved into the
33253327
* powered-on state. This includes telling the host which region
@@ -3346,6 +3348,38 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
33463348
if (ret)
33473349
goto exit;
33483350

3351+
/*
3352+
* In certain case (Kdump) the pci device of interest was
3353+
* not cleanly shut down and resource is still held on host
3354+
* side, the host could return invalid device status.
3355+
* We need to explicitly request host to release the resource
3356+
* and try to enter D0 again.
3357+
*/
3358+
if (comp_pkt.completion_status < 0 && retry) {
3359+
retry = false;
3360+
3361+
dev_err(&hdev->device, "Retrying D0 Entry\n");
3362+
3363+
/*
3364+
* Hv_pci_bus_exit() calls hv_send_resource_released()
3365+
* to free up resources of its child devices.
3366+
* In the kdump kernel we need to set the
3367+
* wslot_res_allocated to 255 so it scans all child
3368+
* devices to release resources allocated in the
3369+
* normal kernel before panic happened.
3370+
*/
3371+
hbus->wslot_res_allocated = 255;
3372+
3373+
ret = hv_pci_bus_exit(hdev, true);
3374+
3375+
if (ret == 0) {
3376+
kfree(pkt);
3377+
goto enter_d0_retry;
3378+
}
3379+
dev_err(&hdev->device,
3380+
"Retrying D0 failed with ret %d\n", ret);
3381+
}
3382+
33493383
if (comp_pkt.completion_status < 0) {
33503384
dev_err(&hdev->device,
33513385
"PCI Pass-through VSP failed D0 Entry with status %x\n",
@@ -3591,7 +3625,6 @@ static int hv_pci_probe(struct hv_device *hdev,
35913625
struct hv_pcibus_device *hbus;
35923626
u16 dom_req, dom;
35933627
char *name;
3594-
bool enter_d0_retry = true;
35953628
int ret;
35963629

35973630
bridge = devm_pci_alloc_host_bridge(&hdev->device, 0);
@@ -3708,47 +3741,11 @@ static int hv_pci_probe(struct hv_device *hdev,
37083741
if (ret)
37093742
goto free_fwnode;
37103743

3711-
retry:
37123744
ret = hv_pci_query_relations(hdev);
37133745
if (ret)
37143746
goto free_irq_domain;
37153747

37163748
ret = hv_pci_enter_d0(hdev);
3717-
/*
3718-
* In certain case (Kdump) the pci device of interest was
3719-
* not cleanly shut down and resource is still held on host
3720-
* side, the host could return invalid device status.
3721-
* We need to explicitly request host to release the resource
3722-
* and try to enter D0 again.
3723-
* Since the hv_pci_bus_exit() call releases structures
3724-
* of all its child devices, we need to start the retry from
3725-
* hv_pci_query_relations() call, requesting host to send
3726-
* the synchronous child device relations message before this
3727-
* information is needed in hv_send_resources_allocated()
3728-
* call later.
3729-
*/
3730-
if (ret == -EPROTO && enter_d0_retry) {
3731-
enter_d0_retry = false;
3732-
3733-
dev_err(&hdev->device, "Retrying D0 Entry\n");
3734-
3735-
/*
3736-
* Hv_pci_bus_exit() calls hv_send_resources_released()
3737-
* to free up resources of its child devices.
3738-
* In the kdump kernel we need to set the
3739-
* wslot_res_allocated to 255 so it scans all child
3740-
* devices to release resources allocated in the
3741-
* normal kernel before panic happened.
3742-
*/
3743-
hbus->wslot_res_allocated = 255;
3744-
ret = hv_pci_bus_exit(hdev, true);
3745-
3746-
if (ret == 0)
3747-
goto retry;
3748-
3749-
dev_err(&hdev->device,
3750-
"Retrying D0 failed with ret %d\n", ret);
3751-
}
37523749
if (ret)
37533750
goto free_irq_domain;
37543751

0 commit comments

Comments
 (0)