@@ -272,18 +272,6 @@ the VCPU file descriptor can be mmap-ed, including:
272272 KVM_CAP_DIRTY_LOG_RING, see section 8.3.
273273
274274
275- 4.6 KVM_SET_MEMORY_REGION
276- -------------------------
277-
278- :Capability: basic
279- :Architectures: all
280- :Type: vm ioctl
281- :Parameters: struct kvm_memory_region (in)
282- :Returns: 0 on success, -1 on error
283-
284- This ioctl is obsolete and has been removed.
285-
286-
2872754.7 KVM_CREATE_VCPU
288276-------------------
289277
@@ -368,17 +356,6 @@ see the description of the capability.
368356Note that the Xen shared info page, if configured, shall always be assumed
369357to be dirty. KVM will not explicitly mark it such.
370358
371- 4.9 KVM_SET_MEMORY_ALIAS
372- ------------------------
373-
374- :Capability: basic
375- :Architectures: x86
376- :Type: vm ioctl
377- :Parameters: struct kvm_memory_alias (in)
378- :Returns: 0 (success), -1 (error)
379-
380- This ioctl is obsolete and has been removed.
381-
382359
3833604.10 KVM_RUN
384361------------
@@ -1332,7 +1309,7 @@ yet and must be cleared on entry.
13321309 __u64 userspace_addr; /* start of the userspace allocated memory */
13331310 };
13341311
1335- /* for kvm_memory_region ::flags */
1312+ /* for kvm_userspace_memory_region ::flags */
13361313 #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
13371314 #define KVM_MEM_READONLY (1UL << 1)
13381315
@@ -1377,10 +1354,6 @@ the memory region are automatically reflected into the guest. For example, an
13771354mmap() that affects the region will be made visible immediately. Another
13781355example is madvise(MADV_DROP).
13791356
1380- It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
1381- The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
1382- allocation and is deprecated.
1383-
13841357
138513584.36 KVM_SET_TSS_ADDR
13861359---------------------
@@ -3293,6 +3266,7 @@ valid entries found.
32933266----------------------
32943267
32953268:Capability: KVM_CAP_DEVICE_CTRL
3269+ :Architectures: all
32963270:Type: vm ioctl
32973271:Parameters: struct kvm_create_device (in/out)
32983272:Returns: 0 on success, -1 on error
@@ -3333,6 +3307,7 @@ number.
33333307:Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
33343308 KVM_CAP_VCPU_ATTRIBUTES for vcpu device
33353309 KVM_CAP_SYS_ATTRIBUTES for system (/dev/kvm) device (no set)
3310+ :Architectures: x86, arm64, s390
33363311:Type: device ioctl, vm ioctl, vcpu ioctl
33373312:Parameters: struct kvm_device_attr
33383313:Returns: 0 on success, -1 on error
@@ -4104,80 +4079,71 @@ flags values for ``struct kvm_msr_filter_range``:
41044079``KVM_MSR_FILTER_READ``
41054080
41064081 Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
4107- indicates that a read should immediately fail, while a 1 indicates that
4108- a read for a particular MSR should be handled regardless of the default
4082+ indicates that read accesses should be denied, while a 1 indicates that
4083+ a read for a particular MSR should be allowed regardless of the default
41094084 filter action.
41104085
41114086``KVM_MSR_FILTER_WRITE``
41124087
41134088 Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
4114- indicates that a write should immediately fail, while a 1 indicates that
4115- a write for a particular MSR should be handled regardless of the default
4089+ indicates that write accesses should be denied, while a 1 indicates that
4090+ a write for a particular MSR should be allowed regardless of the default
41164091 filter action.
41174092
4118- ``KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE``
4119-
4120- Filter both read and write accesses to MSRs using the given bitmap. A 0
4121- in the bitmap indicates that both reads and writes should immediately fail,
4122- while a 1 indicates that reads and writes for a particular MSR are not
4123- filtered by this range.
4124-
41254093flags values for ``struct kvm_msr_filter``:
41264094
41274095``KVM_MSR_FILTER_DEFAULT_ALLOW``
41284096
41294097 If no filter range matches an MSR index that is getting accessed, KVM will
4130- fall back to allowing access to the MSR.
4098+ allow accesses to all MSRs by default.
41314099
41324100``KVM_MSR_FILTER_DEFAULT_DENY``
41334101
41344102 If no filter range matches an MSR index that is getting accessed, KVM will
4135- fall back to rejecting access to the MSR. In this mode, all MSRs that should
4136- be processed by KVM need to explicitly be marked as allowed in the bitmaps.
4103+ deny accesses to all MSRs by default.
41374104
4138- This ioctl allows user space to define up to 16 bitmaps of MSR ranges to
4139- specify whether a certain MSR access should be explicitly filtered for or not.
4105+ This ioctl allows userspace to define up to 16 bitmaps of MSR ranges to deny
4106+ guest MSR accesses that would normally be allowed by KVM. If an MSR is not
4107+ covered by a specific range, the "default" filtering behavior applies. Each
4108+ bitmap range covers MSRs from [base .. base+nmsrs).
41404109
4141- If this ioctl has never been invoked, MSR accesses are not guarded and the
4142- default KVM in-kernel emulation behavior is fully preserved.
4110+ If an MSR access is denied by userspace, the resulting KVM behavior depends on
4111+ whether or not KVM_CAP_X86_USER_SPACE_MSR's KVM_MSR_EXIT_REASON_FILTER is
4112+ enabled. If KVM_MSR_EXIT_REASON_FILTER is enabled, KVM will exit to userspace
4113+ on denied accesses, i.e. userspace effectively intercepts the MSR access. If
4114+ KVM_MSR_EXIT_REASON_FILTER is not enabled, KVM will inject a #GP into the guest
4115+ on denied accesses.
4116+
4117+ If an MSR access is allowed by userspace, KVM will emulate and/or virtualize
4118+ the access in accordance with the vCPU model. Note, KVM may still ultimately
4119+ inject a #GP if an access is allowed by userspace, e.g. if KVM doesn't support
4120+ the MSR, or to follow architectural behavior for the MSR.
4121+
4122+ By default, KVM operates in KVM_MSR_FILTER_DEFAULT_ALLOW mode with no MSR range
4123+ filters.
41434124
41444125Calling this ioctl with an empty set of ranges (all nmsrs == 0) disables MSR
41454126filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
41464127an error.
41474128
4148- As soon as the filtering is in place, every MSR access is processed through
4149- the filtering except for accesses to the x2APIC MSRs (from 0x800 to 0x8ff);
4150- x2APIC MSRs are always allowed, independent of the ``default_allow`` setting,
4151- and their behavior depends on the ``X2APIC_ENABLE`` bit of the APIC base
4152- register.
4153-
41544129.. warning::
4155- MSR accesses coming from nested vmentry/vmexit are not filtered.
4130+ MSR accesses as part of nested VM-Enter/VM-Exit are not filtered.
41564131 This includes both writes to individual VMCS fields and reads/writes
41574132 through the MSR lists pointed to by the VMCS.
41584133
4159- If a bit is within one of the defined ranges, read and write accesses are
4160- guarded by the bitmap's value for the MSR index if the kind of access
4161- is included in the ``struct kvm_msr_filter_range`` flags. If no range
4162- cover this particular access, the behavior is determined by the flags
4163- field in the kvm_msr_filter struct: ``KVM_MSR_FILTER_DEFAULT_ALLOW``
4164- and ``KVM_MSR_FILTER_DEFAULT_DENY``.
4165-
4166- Each bitmap range specifies a range of MSRs to potentially allow access on.
4167- The range goes from MSR index [base .. base+nmsrs]. The flags field
4168- indicates whether reads, writes or both reads and writes are filtered
4169- by setting a 1 bit in the bitmap for the corresponding MSR index.
4170-
4171- If an MSR access is not permitted through the filtering, it generates a
4172- #GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
4173- allows user space to deflect and potentially handle various MSR accesses
4174- into user space.
4134+ x2APIC MSR accesses cannot be filtered (KVM silently ignores filters that
4135+ cover any x2APIC MSRs).
41754136
41764137Note, invoking this ioctl while a vCPU is running is inherently racy. However,
41774138KVM does guarantee that vCPUs will see either the previous filter or the new
41784139filter, e.g. MSRs with identical settings in both the old and new filter will
41794140have deterministic behavior.
41804141
4142+ Similarly, if userspace wishes to intercept on denied accesses,
4143+ KVM_MSR_EXIT_REASON_FILTER must be enabled before activating any filters, and
4144+ left enabled until after all filters are deactivated. Failure to do so may
4145+ result in KVM injecting a #GP instead of exiting to userspace.
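
The allow/deny decision described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not KVM's implementation: the `ex_` names, the struct layout, and the byte-wise little-endian bit order of the bitmap are hypothetical stand-ins, and real code would use the definitions from `<linux/kvm.h>`. The first range that covers the MSR and filters the given access type decides the outcome; otherwise the default action applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the per-range flags; not the kernel's names. */
#define EX_FILTER_READ  (1u << 0)
#define EX_FILTER_WRITE (1u << 1)

/* Hypothetical model of one filter range covering [base, base + nmsrs). */
struct ex_msr_range {
    uint32_t flags;        /* EX_FILTER_READ and/or EX_FILTER_WRITE */
    uint32_t base;         /* first MSR index covered by the range */
    uint32_t nmsrs;        /* number of MSRs covered */
    const uint8_t *bitmap; /* one bit per MSR: 1 = allow, 0 = deny */
};

/*
 * Return true if the access is allowed by the modelled filter.
 * 'access' is exactly one of EX_FILTER_READ or EX_FILTER_WRITE.
 */
bool ex_msr_allowed(const struct ex_msr_range *ranges, size_t nranges,
                    bool default_allow, uint32_t msr, uint32_t access)
{
    for (size_t i = 0; i < nranges; i++) {
        const struct ex_msr_range *r = &ranges[i];

        /* Skip ranges that do not filter this kind of access. */
        if (!(r->flags & access))
            continue;
        /* Skip ranges that do not cover this MSR (end is exclusive). */
        if (msr < r->base || msr - r->base >= r->nmsrs)
            continue;

        assert(r->bitmap != NULL);
        uint32_t bit = msr - r->base;
        return ((r->bitmap[bit / 8] >> (bit % 8)) & 1u) != 0;
    }
    /* No range matched: fall back to the default action. */
    return default_allow;
}
```

In this model a range flagged only for reads has no effect on writes, so a write to a covered MSR falls through to the default action.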
4146+
418141474.98 KVM_CREATE_SPAPR_TCE_64
41824148----------------------------
41834149
@@ -5339,6 +5305,7 @@ KVM_PV_ASYNC_CLEANUP_PERFORM
53395305 union {
53405306 __u8 long_mode;
53415307 __u8 vector;
5308+ __u8 runstate_update_flag;
53425309 struct {
53435310 __u64 gfn;
53445311 } shared_info;
@@ -5416,6 +5383,14 @@ KVM_XEN_ATTR_TYPE_XEN_VERSION
54165383 event channel delivery, so responding within the kernel without
54175384 exiting to userspace is beneficial.
54185385
5386+ KVM_XEN_ATTR_TYPE_RUNSTATE_UPDATE_FLAG
5387+ This attribute is available when the KVM_CAP_XEN_HVM ioctl indicates
5388+ support for KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG. It enables the
5389+ XEN_RUNSTATE_UPDATE flag which allows guest vCPUs to safely read
5390+ other vCPUs' vcpu_runstate_info. Xen guests enable this feature via
5391+ the VM_ASST_TYPE_runstate_update_flag of the HYPERVISOR_vm_assist
5392+ hypercall.
5393+
541953944.127 KVM_XEN_HVM_GET_ATTR
54205395--------------------------
54215396
@@ -6473,31 +6448,33 @@ if it decides to decode and emulate the instruction.
64736448
64746449Used on x86 systems. When the VM capability KVM_CAP_X86_USER_SPACE_MSR is
64756450enabled, MSR accesses to registers that would invoke a #GP by KVM kernel code
6476- will instead trigger a KVM_EXIT_X86_RDMSR exit for reads and KVM_EXIT_X86_WRMSR
6451+ may instead trigger a KVM_EXIT_X86_RDMSR exit for reads and KVM_EXIT_X86_WRMSR
64776452exit for writes.
64786453
6479- The "reason" field specifies why the MSR trap occurred. User space will only
6480- receive MSR exit traps when a particular reason was requested during through
6454+ The "reason" field specifies why the MSR interception occurred. Userspace will
6455+ only receive MSR exits when a particular reason was requested through
64816456ENABLE_CAP. Currently valid exit reasons are:
64826457
64836458 KVM_MSR_EXIT_REASON_UNKNOWN - access to MSR that is unknown to KVM
64846459 KVM_MSR_EXIT_REASON_INVAL - access to invalid MSRs or reserved bits
64856460 KVM_MSR_EXIT_REASON_FILTER - access blocked by KVM_X86_SET_MSR_FILTER
64866461
6487- For KVM_EXIT_X86_RDMSR, the "index" field tells user space which MSR the guest
6488- wants to read. To respond to this request with a successful read, user space
6462+ For KVM_EXIT_X86_RDMSR, the "index" field tells userspace which MSR the guest
6463+ wants to read. To respond to this request with a successful read, userspace
64896464writes the respective data into the "data" field and must continue guest
64906465execution to ensure the read data is transferred into guest register state.
64916466
6492- If the RDMSR request was unsuccessful, user space indicates that with a "1" in
6467+ If the RDMSR request was unsuccessful, userspace indicates that with a "1" in
64936468the "error" field. This will inject a #GP into the guest when the VCPU is
64946469executed again.
64956470
6496- For KVM_EXIT_X86_WRMSR, the "index" field tells user space which MSR the guest
6497- wants to write. Once finished processing the event, user space must continue
6498- vCPU execution. If the MSR write was unsuccessful, user space also sets the
6471+ For KVM_EXIT_X86_WRMSR, the "index" field tells userspace which MSR the guest
6472+ wants to write. Once finished processing the event, userspace must continue
6473+ vCPU execution. If the MSR write was unsuccessful, userspace also sets the
64996474"error" field to "1".
65006475
6476+ See KVM_X86_SET_MSR_FILTER for details on the interaction with MSR filtering.
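
The RDMSR/WRMSR exit protocol described above might be handled along the following lines. This is a minimal sketch, assuming a hypothetical `struct ex_msr_exit` that mirrors only the fields named in the text ("error", "reason", "index", "data") and a hypothetical table of userspace-emulated MSRs; a real VMM would operate on the exit data in the mapped `struct kvm_run` and then resume the vCPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the exit fields named in the text. */
struct ex_msr_exit {
    uint8_t  error;  /* set to 1 to request a #GP on resume */
    uint32_t reason; /* why the exit happened */
    uint32_t index;  /* MSR the guest accessed */
    uint64_t data;   /* value read (RDMSR) or value to write (WRMSR) */
};

/* Hypothetical backing store for MSRs emulated in userspace. */
struct ex_msr_slot {
    uint32_t index;
    uint64_t value;
};

/* Read exit: supply the value in "data", or flag an error if unknown. */
void ex_handle_rdmsr(struct ex_msr_exit *exit,
                     const struct ex_msr_slot *slots, size_t nslots)
{
    assert(exit != NULL);
    for (size_t i = 0; i < nslots; i++) {
        if (slots[i].index == exit->index) {
            exit->data = slots[i].value;
            exit->error = 0;
            return;
        }
    }
    exit->error = 1; /* unsuccessful read */
}

/* Write exit: store "data", or flag an error if the MSR is unknown. */
void ex_handle_wrmsr(struct ex_msr_exit *exit,
                     struct ex_msr_slot *slots, size_t nslots)
{
    assert(exit != NULL);
    for (size_t i = 0; i < nslots; i++) {
        if (slots[i].index == exit->index) {
            slots[i].value = exit->data;
            exit->error = 0;
            return;
        }
    }
    exit->error = 1; /* unsuccessful write */
}
```

In both cases the caller is expected to continue vCPU execution afterwards, since the result only takes effect when the vCPU runs again.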
6477+
65016478::
65026479
65036480
@@ -7263,19 +7240,27 @@ the module parameter for the target VM.
72637240:Parameters: args[0] contains the mask of KVM_MSR_EXIT_REASON_* events to report
72647241:Returns: 0 on success; -1 on error
72657242
7266- This capability enables trapping of #GP invoking RDMSR and WRMSR instructions
7267- into user space.
7243+ This capability allows userspace to intercept RDMSR and WRMSR instructions if
7244+ access to an MSR is denied. By default, KVM injects #GP on denied accesses.
72687245
72697246When a guest requests to read or write an MSR, KVM may not implement all MSRs
72707247that are relevant to a respective system. It also does not differentiate by
72717248CPU type.
72727249
7273- To allow more fine grained control over MSR handling, user space may enable
7250+ To allow more fine grained control over MSR handling, userspace may enable
72747251this capability. With it enabled, MSR accesses that match the mask specified in
7275- args[0] and trigger a #GP event inside the guest by KVM will instead trigger
7276- KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR exit notifications which user space
7277- can then handle to implement model specific MSR handling and/or user notifications
7278- to inform a user that an MSR was not handled.
7252+ args[0] and would trigger a #GP inside the guest will instead trigger
7253+ KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR exit notifications. Userspace
7254+ can then implement model specific MSR handling and/or user notifications
7255+ to inform a user that an MSR was not emulated/virtualized by KVM.
7256+
7257+ The valid mask flags are:
7258+
7259+ KVM_MSR_EXIT_REASON_UNKNOWN - intercept accesses to unknown (to KVM) MSRs
7260+ KVM_MSR_EXIT_REASON_INVAL - intercept accesses that are architecturally
7261+ invalid according to the vCPU model and/or mode
7262+ KVM_MSR_EXIT_REASON_FILTER - intercept accesses that are denied by userspace
7263+ via KVM_X86_SET_MSR_FILTER
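
The effect of the args[0] mask can be modelled as a simple predicate. The sketch below is illustrative only: the `EX_REASON_*` constants and the function name are hypothetical stand-ins for the KVM_MSR_EXIT_REASON_* flags, and the bit positions are an assumption that real code should take from `<linux/kvm.h>`. A failing access exits to userspace only if its reason was enabled in the mask; otherwise the guest receives a #GP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the KVM_MSR_EXIT_REASON_* mask bits. */
#define EX_REASON_INVAL   (1u << 0)
#define EX_REASON_UNKNOWN (1u << 1)
#define EX_REASON_FILTER  (1u << 2)

enum ex_outcome {
    EX_INJECT_GP,         /* reason not enabled: guest gets a #GP */
    EX_EXIT_TO_USERSPACE  /* reason enabled: RDMSR/WRMSR exit is reported */
};

/*
 * 'enabled_mask' models the value passed in args[0];
 * 'reason' is the single reason bit for the failing access.
 */
enum ex_outcome ex_failed_access_outcome(uint32_t enabled_mask,
                                         uint32_t reason)
{
    /* Exactly one reason bit must be set. */
    assert(reason != 0 && (reason & (reason - 1)) == 0);
    return (enabled_mask & reason) ? EX_EXIT_TO_USERSPACE : EX_INJECT_GP;
}
```

For example, a mask containing only the filter reason leaves accesses to unknown MSRs on the #GP path.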
72797264
728072657.22 KVM_CAP_X86_BUS_LOCK_EXIT
72817266-------------------------------
@@ -7936,7 +7921,7 @@ KVM_EXIT_X86_WRMSR exit notifications.
79367921This capability indicates that KVM supports that accesses to user defined MSRs
79377922may be rejected. With this capability exposed, KVM exports new VM ioctl
79387923KVM_X86_SET_MSR_FILTER which user space can call to specify bitmaps of MSR
7939- ranges that KVM should reject access to.
7924+ ranges that KVM should deny access to.
79407925
79417926In combination with KVM_CAP_X86_USER_SPACE_MSR, this allows user space to
79427927trap and emulate MSRs that are outside of the scope of KVM as well as
@@ -8080,12 +8065,13 @@ KVM device "kvm-arm-vgic-its" when dirty ring is enabled.
80808065This capability indicates the features that Xen supports for hosting Xen
80818066PVHVM guests. Valid flags are::
80828067
8083- #define KVM_XEN_HVM_CONFIG_HYPERCALL_MSR (1 << 0)
8084- #define KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL (1 << 1)
8085- #define KVM_XEN_HVM_CONFIG_SHARED_INFO (1 << 2)
8086- #define KVM_XEN_HVM_CONFIG_RUNSTATE (1 << 3)
8087- #define KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL (1 << 4)
8088- #define KVM_XEN_HVM_CONFIG_EVTCHN_SEND (1 << 5)
8068+ #define KVM_XEN_HVM_CONFIG_HYPERCALL_MSR (1 << 0)
8069+ #define KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL (1 << 1)
8070+ #define KVM_XEN_HVM_CONFIG_SHARED_INFO (1 << 2)
8071+ #define KVM_XEN_HVM_CONFIG_RUNSTATE (1 << 3)
8072+ #define KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL (1 << 4)
8073+ #define KVM_XEN_HVM_CONFIG_EVTCHN_SEND (1 << 5)
8074+ #define KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG (1 << 6)
80898075
80908076The KVM_XEN_HVM_CONFIG_HYPERCALL_MSR flag indicates that the KVM_XEN_HVM_CONFIG
80918077ioctl is available, for the guest to set its hypercall page.
@@ -8117,6 +8103,18 @@ KVM_XEN_VCPU_ATTR_TYPE_VCPU_ID/TIMER/UPCALL_VECTOR vCPU attributes.
81178103related to event channel delivery, timers, and the XENVER_version
81188104interception.
81198105
8106+ The KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG flag indicates that KVM supports
8107+ the KVM_XEN_ATTR_TYPE_RUNSTATE_UPDATE_FLAG attribute in the KVM_XEN_SET_ATTR
8108+ and KVM_XEN_GET_ATTR ioctls. This controls whether KVM will set the
8109+ XEN_RUNSTATE_UPDATE flag in guest memory mapped vcpu_runstate_info during
8110+ updates of the runstate information. Note that versions of KVM which support
8111+ the RUNSTATE feature above, but not the RUNSTATE_UPDATE_FLAG feature, will
8112+ always set the XEN_RUNSTATE_UPDATE flag when updating the guest structure,
8113+ which is perhaps counterintuitive. When this flag is advertised, KVM will
8114+ behave more correctly, not using the XEN_RUNSTATE_UPDATE flag until/unless
8115+ specifically enabled (by the guest making the hypercall, causing the VMM
8116+ to enable the KVM_XEN_ATTR_TYPE_RUNSTATE_UPDATE_FLAG attribute).
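
The version-dependent behavior described in this paragraph can be summarized in a short model. This is a sketch under assumptions, not KVM code: the `EX_` constants copy the bit positions of KVM_XEN_HVM_CONFIG_RUNSTATE and KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG from the flag list above, and the function name is hypothetical. It answers whether KVM would set XEN_RUNSTATE_UPDATE while updating the guest's runstate area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the flag list in this section. */
#define EX_XEN_RUNSTATE             (1u << 3)
#define EX_XEN_RUNSTATE_UPDATE_FLAG (1u << 6)

/*
 * 'caps' models the flags reported for KVM_CAP_XEN_HVM;
 * 'attr_enabled' models the RUNSTATE_UPDATE_FLAG attribute value.
 */
bool ex_sets_runstate_update(uint32_t caps, bool attr_enabled)
{
    /* The question only makes sense if runstate is supported at all. */
    assert(caps & EX_XEN_RUNSTATE);

    /* Older KVM without the new flag always sets XEN_RUNSTATE_UPDATE. */
    if (!(caps & EX_XEN_RUNSTATE_UPDATE_FLAG))
        return true;

    /* Newer KVM sets it only once the attribute has been enabled. */
    return attr_enabled;
}
```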
8117+
812081188.31 KVM_CAP_PPC_MULTITCE
81218119-------------------------
81228120