@@ -4079,80 +4079,71 @@ flags values for ``struct kvm_msr_filter_range``:
``KVM_MSR_FILTER_READ``

Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
- indicates that a read should immediately fail, while a 1 indicates that
- a read for a particular MSR should be handled regardless of the default
+ indicates that read accesses should be denied, while a 1 indicates that
+ a read for a particular MSR should be allowed regardless of the default
filter action.

``KVM_MSR_FILTER_WRITE``

Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
- indicates that a write should immediately fail, while a 1 indicates that
- a write for a particular MSR should be handled regardless of the default
+ indicates that write accesses should be denied, while a 1 indicates that
+ a write for a particular MSR should be allowed regardless of the default
filter action.

- ``KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE``
-
- Filter both read and write accesses to MSRs using the given bitmap. A 0
- in the bitmap indicates that both reads and writes should immediately fail,
- while a 1 indicates that reads and writes for a particular MSR are not
- filtered by this range.
-
flags values for ``struct kvm_msr_filter``:

``KVM_MSR_FILTER_DEFAULT_ALLOW``

If no filter range matches an MSR index that is getting accessed, KVM will
- fall back to allowing access to the MSR.
+ allow accesses to all MSRs by default.

``KVM_MSR_FILTER_DEFAULT_DENY``

If no filter range matches an MSR index that is getting accessed, KVM will
- fall back to rejecting access to the MSR. In this mode, all MSRs that should
- be processed by KVM need to explicitly be marked as allowed in the bitmaps.
+ deny accesses to all MSRs by default.
+
+ This ioctl allows userspace to define up to 16 bitmaps of MSR ranges to deny
+ guest MSR accesses that would normally be allowed by KVM. If an MSR is not
+ covered by a specific range, the "default" filtering behavior applies. Each
+ bitmap range covers MSRs from [base .. base+nmsrs).

- This ioctl allows user space to define up to 16 bitmaps of MSR ranges to
- specify whether a certain MSR access should be explicitly filtered for or not.
+ If an MSR access is denied by userspace, the resulting KVM behavior depends on
+ whether or not KVM_CAP_X86_USER_SPACE_MSR's KVM_MSR_EXIT_REASON_FILTER is
+ enabled. If KVM_MSR_EXIT_REASON_FILTER is enabled, KVM will exit to userspace
+ on denied accesses, i.e. userspace effectively intercepts the MSR access. If
+ KVM_MSR_EXIT_REASON_FILTER is not enabled, KVM will inject a #GP into the guest
+ on denied accesses.

- If this ioctl has never been invoked, MSR accesses are not guarded and the
- default KVM in-kernel emulation behavior is fully preserved.
+ If an MSR access is allowed by userspace, KVM will emulate and/or virtualize
+ the access in accordance with the vCPU model. Note, KVM may still ultimately
+ inject a #GP if an access is allowed by userspace, e.g. if KVM doesn't support
+ the MSR, or to follow architectural behavior for the MSR.
+
+ By default, KVM operates in KVM_MSR_FILTER_DEFAULT_ALLOW mode with no MSR range
+ filters.

Calling this ioctl with an empty set of ranges (all nmsrs == 0) disables MSR
filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
an error.

- As soon as the filtering is in place, every MSR access is processed through
- the filtering except for accesses to the x2APIC MSRs (from 0x800 to 0x8ff);
- x2APIC MSRs are always allowed, independent of the ``default_allow`` setting,
- and their behavior depends on the ``X2APIC_ENABLE`` bit of the APIC base
- register.
-
.. warning::
- MSR accesses coming from nested vmentry/vmexit are not filtered.
+ MSR accesses as part of nested VM-Enter/VM-Exit are not filtered.
This includes both writes to individual VMCS fields and reads/writes
through the MSR lists pointed to by the VMCS.

- If a bit is within one of the defined ranges, read and write accesses are
- guarded by the bitmap's value for the MSR index if the kind of access
- is included in the ``struct kvm_msr_filter_range`` flags. If no range
- cover this particular access, the behavior is determined by the flags
- field in the kvm_msr_filter struct: ``KVM_MSR_FILTER_DEFAULT_ALLOW``
- and ``KVM_MSR_FILTER_DEFAULT_DENY``.
-
- Each bitmap range specifies a range of MSRs to potentially allow access on.
- The range goes from MSR index [base .. base+nmsrs]. The flags field
- indicates whether reads, writes or both reads and writes are filtered
- by setting a 1 bit in the bitmap for the corresponding MSR index.
-
- If an MSR access is not permitted through the filtering, it generates a
- #GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
- allows user space to deflect and potentially handle various MSR accesses
- into user space.
+ x2APIC MSR accesses cannot be filtered (KVM silently ignores filters that
+ cover any x2APIC MSRs).

Note, invoking this ioctl while a vCPU is running is inherently racy. However,
KVM does guarantee that vCPUs will see either the previous filter or the new
filter, e.g. MSRs with identical settings in both the old and new filter will
have deterministic behavior.

+ Similarly, if userspace wishes to intercept on denied accesses,
+ KVM_MSR_EXIT_REASON_FILTER must be enabled before activating any filters, and
+ left enabled until after all filters are deactivated. Failure to do so may
+ result in KVM injecting a #GP instead of exiting to userspace.
+
4.98 KVM_CREATE_SPAPR_TCE_64
----------------------------

@@ -6457,31 +6448,33 @@ if it decides to decode and emulate the instruction.

Used on x86 systems. When the VM capability KVM_CAP_X86_USER_SPACE_MSR is
enabled, MSR accesses to registers that would invoke a #GP by KVM kernel code
- will instead trigger a KVM_EXIT_X86_RDMSR exit for reads and KVM_EXIT_X86_WRMSR
+ may instead trigger a KVM_EXIT_X86_RDMSR exit for reads and KVM_EXIT_X86_WRMSR
exit for writes.

- The "reason" field specifies why the MSR trap occurred. User space will only
- receive MSR exit traps when a particular reason was requested during through
+ The "reason" field specifies why the MSR interception occurred. Userspace will
+ only receive MSR exits when a particular reason was requested through
ENABLE_CAP. Currently valid exit reasons are:

KVM_MSR_EXIT_REASON_UNKNOWN - access to MSR that is unknown to KVM
KVM_MSR_EXIT_REASON_INVAL - access to invalid MSRs or reserved bits
KVM_MSR_EXIT_REASON_FILTER - access blocked by KVM_X86_SET_MSR_FILTER

- For KVM_EXIT_X86_RDMSR, the "index" field tells user space which MSR the guest
- wants to read. To respond to this request with a successful read, user space
+ For KVM_EXIT_X86_RDMSR, the "index" field tells userspace which MSR the guest
+ wants to read. To respond to this request with a successful read, userspace
writes the respective data into the "data" field and must continue guest
execution to ensure the read data is transferred into guest register state.

- If the RDMSR request was unsuccessful, user space indicates that with a "1" in
+ If the RDMSR request was unsuccessful, userspace indicates that with a "1" in
the "error" field. This will inject a #GP into the guest when the VCPU is
executed again.

- For KVM_EXIT_X86_WRMSR, the "index" field tells user space which MSR the guest
- wants to write. Once finished processing the event, user space must continue
- vCPU execution. If the MSR write was unsuccessful, user space also sets the
+ For KVM_EXIT_X86_WRMSR, the "index" field tells userspace which MSR the guest
+ wants to write. Once finished processing the event, userspace must continue
+ vCPU execution. If the MSR write was unsuccessful, userspace also sets the
"error" field to "1".

+ See KVM_X86_SET_MSR_FILTER for details on the interaction with MSR filtering.
+
::

@@ -7247,19 +7240,27 @@ the module parameter for the target VM.
:Parameters: args[0] contains the mask of KVM_MSR_EXIT_REASON_* events to report
:Returns: 0 on success; -1 on error

- This capability enables trapping of #GP invoking RDMSR and WRMSR instructions
- into user space.
+ This capability allows userspace to intercept RDMSR and WRMSR instructions if
+ access to an MSR is denied. By default, KVM injects #GP on denied accesses.

When a guest requests to read or write an MSR, KVM may not implement all MSRs
that are relevant to a respective system. It also does not differentiate by
CPU type.

- To allow more fine grained control over MSR handling, user space may enable
+ To allow more fine grained control over MSR handling, userspace may enable
this capability. With it enabled, MSR accesses that match the mask specified in
- args[0] and trigger a #GP event inside the guest by KVM will instead trigger
- KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR exit notifications which user space
- can then handle to implement model specific MSR handling and/or user notifications
- to inform a user that an MSR was not handled.
+ args[0] and would trigger a #GP inside the guest will instead trigger
+ KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR exit notifications. Userspace
+ can then implement model specific MSR handling and/or user notifications
+ to inform a user that an MSR was not emulated/virtualized by KVM.
+
+ The valid mask flags are:
+
+ KVM_MSR_EXIT_REASON_UNKNOWN - intercept accesses to unknown (to KVM) MSRs
+ KVM_MSR_EXIT_REASON_INVAL - intercept accesses that are architecturally
+ invalid according to the vCPU model and/or mode
+ KVM_MSR_EXIT_REASON_FILTER - intercept accesses that are denied by userspace
+ via KVM_X86_SET_MSR_FILTER

7.22 KVM_CAP_X86_BUS_LOCK_EXIT
-------------------------------
@@ -7919,7 +7920,7 @@ KVM_EXIT_X86_WRMSR exit notifications.
This capability indicates that KVM supports that accesses to user defined MSRs
may be rejected. With this capability exposed, KVM exports new VM ioctl
KVM_X86_SET_MSR_FILTER which user space can call to specify bitmaps of MSR
- ranges that KVM should reject access to.
+ ranges that KVM should deny access to.

In combination with KVM_CAP_X86_USER_SPACE_MSR, this allows user space to
trap and emulate MSRs that are outside of the scope of KVM as well as
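Illustrative note, not part of the patch: the allow/deny semantics documented for KVM_X86_SET_MSR_FILTER can be modelled in a few lines of plain C. This is a minimal sketch under stated assumptions, not KVM's implementation. The `sketch_*` and `SKETCH_*` names are hypothetical stand-ins rather than the kernel uapi, and the bit order within the bitmap and the "first matching range wins" rule are assumptions made for the example. The half-open range [base .. base+nmsrs), the meaning of 0 and 1 bits, the default-allow/default-deny fallback, and the unfilterable x2APIC range 0x800 to 0x8ff come from the documentation text in the diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the documented flags; NOT the kernel uapi values. */
#define SKETCH_FILTER_READ   (1u << 0)
#define SKETCH_FILTER_WRITE  (1u << 1)
#define SKETCH_DEFAULT_ALLOW 0u
#define SKETCH_DEFAULT_DENY  1u

/* Hypothetical model of one filter range. */
struct sketch_range {
	uint32_t flags;        /* access types this range filters */
	uint32_t base;         /* first MSR index covered */
	uint32_t nmsrs;        /* range is [base .. base+nmsrs) */
	const uint8_t *bitmap; /* bit i describes MSR base+i: 1 = allow, 0 = deny */
};

/*
 * Returns true if the access is allowed by the modelled filter.
 * "access" is exactly one of SKETCH_FILTER_READ or SKETCH_FILTER_WRITE.
 */
static bool sketch_msr_allowed(const struct sketch_range *ranges, size_t nranges,
			       uint32_t default_flags, uint32_t index,
			       uint32_t access)
{
	/* x2APIC MSRs cannot be filtered, so they are always allowed here. */
	if (index >= 0x800 && index <= 0x8ff)
		return true;

	for (size_t i = 0; i < nranges; i++) {
		const struct sketch_range *r = &ranges[i];

		/* Skip ranges that do not filter this access type. */
		if (!(r->flags & access))
			continue;
		/* Skip ranges that do not cover this MSR (half-open interval). */
		if (index < r->base || index - r->base >= r->nmsrs)
			continue;

		/* Assumption: LSB-first bit order; first matching range decides. */
		uint32_t bit = index - r->base;
		return (r->bitmap[bit / 8] >> (bit % 8)) & 1u;
	}

	/* No range matched: fall back to the default action. */
	return default_flags == SKETCH_DEFAULT_ALLOW;
}
```

In this model a "deny" result corresponds to the documented outcome of either an exit to userspace (if KVM_MSR_EXIT_REASON_FILTER is enabled) or a #GP injected into the guest. An "allow" result only means KVM goes on to emulate or virtualize the access, which may itself still end in a #GP.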