Skip to content

Conversation

ricardon
Copy link
Contributor

Hi,

This pull request implements microsoft/openvmm#406. The issue describes the problem statement and proposed solution.

This code enables the changes made here: microsoft/openvmm#384.

Thank you for your consideration.

@ricardon
Copy link
Contributor Author

ricardon commented May 22, 2025 via email

@ricardon
Copy link
Contributor Author

@microsoft-github-policy-service agree [company="Intel Corporation"]

@ricardon
Copy link
Contributor Author

@microsoft-github-policy-service agree company="Intel Corporation"

@@ -359,6 +363,8 @@ struct mshv_kick_cpus {
/* For x86-64 TDX only */
#define MSHV_VTL_TDCALL _IOWR(MSHV_IOCTL, 0x32, struct mshv_tdcall)
#define MSHV_VTL_READ_VMX_CR4_FIXED1 _IOR(MSHV_IOCTL, 0x33, __u64)
#define MSHV_VTL_MAP_REDIRECTED_DEVICE_INTERRUPT _IOWR(MSHV_IOCTL, 0x38, \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that was taken recently by the KICK_CPUS ioctl. If you've rebased, please change to a free code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I didn't realize. I will update it.

* vector is mapped.
*/
intr_data.vector = ret;
ret = copy_to_user(user_arg, &intr_data, sizeof(intr_data));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might fail returning the number of bytes that weren't copied. Is that good to return from this ioctl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. This does not look correct. I'll fix the error handling.

return (long)-EFAULT;

if (!intr_data.create_mapping)
/* User space provides the hardware vector to unmap. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe move the comment above the if statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Will do.

@@ -430,13 +430,30 @@ static int mshv_vtl_get_vsm_regs(void)
return ret;
}

static void do_assert_single_proxy_intr(const u32 vector, struct mshv_vtl_run *run)
{
const u32 bank = vector / 32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the compiler should optimize this, but maybe >>5 would be good here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. This is a good improvement. I will update it.

static void do_assert_single_proxy_intr(const u32 vector, struct mshv_vtl_run *run)
{
const u32 bank = vector / 32;
const u32 masked_irr = BIT(vector % 32) & ~READ_ONCE(run->proxy_irr_blocked[bank]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

& 0x1f instead of the modulo? Thinkling if the compiler doesn't optimize, that's the hot path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Good suggestion. I will add it.

@ricardon ricardon force-pushed the rneri/redirec-posted-intr branch from 3a546de to 76e36f5 Compare May 31, 2025 00:43
ricardon added 10 commits May 30, 2025 17:45
The proxy interrupts that the VTL2 kernel relays to the VTL0 guest are
handled via the single interrupt vector of the VMBus synthetic interrupt
controller.

On TDX guests in particular, Hyper-V does not use posted interrupts for
these interrupts. This results in a TDEXIT for every interrupt and
consequent performance degradation.

To recoup performance, updated versions of Hyper-V will feature support
the posting of VTL0 interrupts directly to the VTL2 guest. In such setup,
the VTL2 guest still relays interrupts to the VTL0 guest, but
uses a dedicated VTL2 interrupt vector for each VTL0 interrupt.

Since user space in the VTL2 guest is responsible of requesting such
redirection, implement the new MAP_REDIRECTED_DEVICE_INTERRUPT IOCTL. User
space provides the vector of the VTL0 guest that wants the VTL2 kernel to
redirect. On success, return the corresponding VTL2 interrupt vector on the
payload of the IOCTL.

Reuse the same IOCTL to also request the removal of the redirection when
needed. When removing a redirection user space must provide the number of
the hardware vector.

Provide weak stub functions. Subsequent changsets will implement actual
functionality for x86.

Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * Assigned 0x39 to MAP_REDIRECTED_DEVICE_INTERRUPT as 0x38 was already
   taken. (Roman)
 * Fixed error handling for copy_to_user(). (Roman)
 * Moved comment about unmapping above the conditional statement. (Roman)
The computation of the proxy interrupt needs to be computed each time it
needs to be flagged. Optimize the computation using bit shifts and masks.

While here, fix a minor issue on a comment.

Suggested-by: Roman Kisel <[email protected]>
Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * Introduced this patch.
Proxy interrupts can be asserted in group or individually via the vector
of the VMBus synthetic interrupt controller. They can also be asserted
individually if the proxy interrupt has been redirected to a hardware
interrupt vector in the VTL2 guest.

Add a helper function to handle single proxy interrupts that both cases
can use.

No functional change intended.

Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * Rebased on the previous changeset that optimizes the calculation of
   the proxy interrupt bit.
The Hyper-V VTL driver owns all the data structures of proxy interrupts.
Add an architecture-independent handler to assert those that are
redirected (one proxy interrupt is redirected to one hardware interrupt).

The servicing of hardware interrupts is implemented in architecture-
specific code that is usually compiled as built-in. The Hyper-V driver can
be a module. The handler is out of scope of built-in code at build time. It
needs to be registered at runtime when the Hyper-V driver is loaded.

Since the registration of the handler is architecture-specific, add weak
stub functions that architectures can use to implement such registration.

Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * None
Recent versions of Hyper-V hosts use IOMMU posted interrupts to deliver
the interrupts of VTL0 passthrough devices directly the underlying VTL2
guest. These interrupts are hardware vectors in the VTL2 guest and need
an interrupt gate in the Interrupt Descriptor Table.

Reserve a block of 32 interrupt vectors immediately below the FIRST_
SYSTEM_VECTOR. This can be regarded as an expansion of the system
vectors at the expense of external vectors. This should be acceptable
since not many external vectors are used in the normal operation of the
VTL2 kernel. Moreover, the proxy interrupts that the VTL2 guest handles
come from devices.

Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * None
The symbols irq_entries_start and spurious_entries_start are tables of
IDT entry stubs that are subsequently used to populate the interrupt
descriptor table.

The code for both stubs tables is identical. Wrap it in a macro.

Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * None
Create a block interrupt gates that start at FIRST_HV_VTL_REDIRECTED_
VECTOR. Service this interrupt with the function hv_vtl_redirected_
interrupt(). Only tell the APIC that the interrupt has been handled.
Subsequent changesets will use these interrupt vectors to deliver
their corresponding proxy interrupts to a VTL0 guest.

Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * None
The MAP_REDIRECTED_DEVICE_INTERRUPT IOCTL maps an interrupt of a VTL0
guest to a hadware vector interrupt in the VTL2 kernel.

On x86, we reserve a block of 32 vectors starting at FIRST_HV_VTL_
REDIRECTED_VECTOR for this mapping.

Keep track of this redirection using an array hardware interrupt vectors
and their mapping to a proxy interrupt, if any. Note that the array starts
at zero but the first hardware vector is FIRST_HV_VTL_REDIRECTED_VECTOR.
Account for this offset as needed.

The array of mapped proxy interrupts is only updated when processing the
IOCTL. It is only read during interrupts. Serialize access using a
seqcount. This allows lockless read access for cases in which multiple
CPUs want to access the mapping data structures.

Since we are using a seqcount, writers cannot be preempted. Protect write
access using a spinlock.

Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * Used a seqcount for serialization.
Implement the hv_{setup, remove}__redirected_proxy_intr_handler() function.
Store a pointer to the handler that the Hyper-V VTL driver provides. It
will be needed when dispatching hardware interrupt vectors that have been
mapped to proxy interrupts.

Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * None
Once a redirected proxy interrupt has been configured, the IOMMU at the
host will post it directly to a hardware vector in the VTL2 guest.
Calculate the index of the proxy interrupt based on the hardware interrupt
vector and indicate that it needs to be asserted.

We only read the redirected data structures during interrupts. Protect
access using the provided seqcount. This allows lockless access for
readers.

Signed-off-by: Ricardo Neri <[email protected]>
---
Changes since v1:
 * Used a seqcount for serialization
@ricardon ricardon force-pushed the rneri/redirec-posted-intr branch from 76e36f5 to 9fe424d Compare May 31, 2025 00:46
@ricardon
Copy link
Contributor Author

I updated the patches to incorporate feedback from @romank-msft and use a seqcount to serialize access to the redirected proxy interrupts data structures.

@romank-msft
Copy link
Contributor

LGTM. Someone more familiar with the intent might want to take a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants