Commit 320805a

kelleymh authored and liuw committed
Drivers: hv: vmbus: Fix vmbus_wait_for_unload() to scan present CPUs
vmbus_wait_for_unload() may be called in the panic path after other
CPUs are stopped. vmbus_wait_for_unload() currently loops through
online CPUs looking for the UNLOAD response message. But the values of
CONFIG_KEXEC_CORE and crash_kexec_post_notifiers affect the path used
to stop the other CPUs, and in one of the paths the stopped CPUs are
removed from cpu_online_mask. This removal happens in both x86/x64 and
arm64 architectures.

In such a case, vmbus_wait_for_unload() only checks the panic'ing CPU,
and misses the UNLOAD response message except when the panic'ing CPU
is CPU 0. vmbus_wait_for_unload() eventually times out, but only after
waiting 100 seconds.

Fix this by looping through *present* CPUs in vmbus_wait_for_unload().
The cpu_present_mask is not modified by stopping the other CPUs in the
panic path, nor should it be.

Also, in a CoCo VM the synic_message_page is not allocated in
hv_synic_alloc(), but is set and cleared in hv_synic_enable_regs() and
hv_synic_disable_regs() such that it is set only when the CPU is
online. If not all present CPUs are online when vmbus_wait_for_unload()
is called, the synic_message_page might be NULL. Add a check for this.

Fixes: cd95aad ("Drivers: hv: vmbus: handle various crash scenarios")
Cc: [email protected]
Reported-by: John Starks <[email protected]>
Signed-off-by: Michael Kelley <[email protected]>
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Wei Liu <[email protected]>
1 parent ec97e11 commit 320805a

File tree

1 file changed: +16 -2 lines changed


drivers/hv/channel_mgmt.c

Lines changed: 16 additions & 2 deletions
@@ -829,11 +829,22 @@ static void vmbus_wait_for_unload(void)
 		if (completion_done(&vmbus_connection.unload_event))
 			goto completed;
 
-		for_each_online_cpu(cpu) {
+		for_each_present_cpu(cpu) {
 			struct hv_per_cpu_context *hv_cpu
 				= per_cpu_ptr(hv_context.cpu_context, cpu);
 
+			/*
+			 * In a CoCo VM the synic_message_page is not allocated
+			 * in hv_synic_alloc(). Instead it is set/cleared in
+			 * hv_synic_enable_regs() and hv_synic_disable_regs()
+			 * such that it is set only when the CPU is online. If
+			 * not all present CPUs are online, the message page
+			 * might be NULL, so skip such CPUs.
+			 */
 			page_addr = hv_cpu->synic_message_page;
+			if (!page_addr)
+				continue;
+
 			msg = (struct hv_message *)page_addr
 				+ VMBUS_MESSAGE_SINT;
 
@@ -867,11 +878,14 @@ static void vmbus_wait_for_unload(void)
 	 * maybe-pending messages on all CPUs to be able to receive new
 	 * messages after we reconnect.
 	 */
-	for_each_online_cpu(cpu) {
+	for_each_present_cpu(cpu) {
 		struct hv_per_cpu_context *hv_cpu
 			= per_cpu_ptr(hv_context.cpu_context, cpu);
 
 		page_addr = hv_cpu->synic_message_page;
+		if (!page_addr)
+			continue;
+
 		msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT;
 		msg->header.message_type = HVMSG_NONE;
 	}
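The failure mode the commit message describes can be illustrated with a small user-space C sketch. The CPU masks, message pages, and helper names below are simplified stand-ins invented for illustration, not kernel APIs: in the panic path, stopped CPUs may be cleared from the online mask but remain in the present mask, and in a CoCo VM an offline CPU's message page is NULL.

```c
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical per-CPU state after a panic on CPU 1: the other CPUs
 * were stopped and removed from the online mask, but all four CPUs
 * remain present. */
static const int cpu_online[NR_CPUS]  = { 0, 1, 0, 0 };
static const int cpu_present[NR_CPUS] = { 1, 1, 1, 1 };

static const char *synic_message_page[NR_CPUS] = {
    NULL,     /* CPU 0: offline in a CoCo VM, so its page is NULL   */
    NULL,     /* CPU 1: the panic'ing CPU; no response landed here  */
    NULL,
    "UNLOAD", /* the UNLOAD response landed on a stopped CPU        */
};

/* Scan *present* CPUs, as the fixed vmbus_wait_for_unload() does,
 * skipping CPUs whose message page was never set up. Returns the
 * CPU holding a message, or -1 if none is found. */
static int find_unload_cpu_present(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!cpu_present[cpu])
            continue;
        if (!synic_message_page[cpu])
            continue; /* offline CPU in a CoCo VM: no page mapped */
        return cpu;
    }
    return -1;
}

/* Scanning only *online* CPUs, as the pre-fix code did, misses the
 * response unless it happens to land on the panic'ing CPU. */
static int find_unload_cpu_online(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!cpu_online[cpu] || !synic_message_page[cpu])
            continue;
        return cpu;
    }
    return -1;
}
```

With this state, the present-CPU scan finds the response on CPU 3, while the online-only scan finds nothing and would spin until the 100-second timeout in the real code.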
