@@ -1,8 +1,8 @@
 .. SPDX-License-Identifier: GPL-2.0
 
-VMbus
+VMBus
 =====
-VMbus is a software construct provided by Hyper-V to guest VMs. It
+VMBus is a software construct provided by Hyper-V to guest VMs. It
 consists of a control path and common facilities used by synthetic
 devices that Hyper-V presents to guest VMs. The control path is
 used to offer synthetic devices to the guest VM and, in some cases,
@@ -12,9 +12,9 @@ and the synthetic device implementation that is part of Hyper-V, and
 signaling primitives to allow Hyper-V and the guest to interrupt
 each other.
 
-VMbus is modeled in Linux as a bus, with the expected /sys/bus/vmbus
-entry in a running Linux guest. The VMbus driver (drivers/hv/vmbus_drv.c)
-establishes the VMbus control path with the Hyper-V host, then
+VMBus is modeled in Linux as a bus, with the expected /sys/bus/vmbus
+entry in a running Linux guest. The VMBus driver (drivers/hv/vmbus_drv.c)
+establishes the VMBus control path with the Hyper-V host, then
 registers itself as a Linux bus driver. It implements the standard
 bus functions for adding and removing devices to/from the bus.
 
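As a sketch of the driver side of this bus model, a VSC registers with the
VMBus bus through vmbus_driver_register(). The sketch below is illustrative
rather than taken from any in-tree driver: the sketch_* names are
hypothetical, while struct hv_driver, the id-table macros, and HV_NIC_GUID
are from include/linux/hyperv.h::

  #include <linux/hyperv.h>
  #include <linux/module.h>

  /* Bind to the synthetic NIC class ID purely as an example. */
  static const struct hv_vmbus_device_id sketch_id_table[] = {
          { HV_NIC_GUID, },
          { },
  };
  MODULE_DEVICE_TABLE(vmbus, sketch_id_table);

  static int sketch_probe(struct hv_device *dev,
                          const struct hv_vmbus_device_id *id)
  {
          return 0;       /* channel setup happens here; see later sections */
  }

  static struct hv_driver sketch_drv = {
          .name     = "sketch_vsc",
          .id_table = sketch_id_table,
          .probe    = sketch_probe,
  };

  static int __init sketch_init(void)
  {
          return vmbus_driver_register(&sketch_drv);
  }
  module_init(sketch_init);

  static void __exit sketch_exit(void)
  {
          vmbus_driver_unregister(&sketch_drv);
  }
  module_exit(sketch_exit);

  MODULE_LICENSE("GPL");
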
@@ -49,9 +49,9 @@ synthetic NIC is referred to as "netvsc" and the Linux driver for
 the synthetic SCSI controller is "storvsc". These drivers contain
 functions with names like "storvsc_connect_to_vsp".
 
-VMbus channels
+VMBus channels
 --------------
-An instance of a synthetic device uses VMbus channels to communicate
+An instance of a synthetic device uses VMBus channels to communicate
 between the VSP and the VSC. Channels are bi-directional and used
 for passing messages. Most synthetic devices use a single channel,
 but the synthetic SCSI controller and synthetic NIC may use multiple
@@ -73,7 +73,7 @@ write indices and some control flags, followed by the memory for the
 actual ring. The size of the ring is determined by the VSC in the
 guest and is specific to each synthetic device. The list of GPAs
 making up the ring is communicated to the Hyper-V host over the
-VMbus control path as a GPA Descriptor List (GPADL). See function
+VMBus control path as a GPA Descriptor List (GPADL). See function
 vmbus_establish_gpadl().
 
 Each ring buffer is mapped into contiguous Linux kernel virtual
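The ring buffer header mentioned above has roughly the following shape;
this is an abbreviated, illustrative rendering of struct hv_ring_buffer
from include/linux/hyperv.h, not the full definition::

  struct ring_buffer_header_sketch {
          u32 write_index;        /* byte offset the writer uses next */
          u32 read_index;         /* byte offset the reader uses next */
          u32 interrupt_mask;     /* reader sets this to suppress interrupts */
          /*
           * ... more control fields and padding out to a page boundary,
           * followed by the memory for the actual ring ...
           */
  };
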
@@ -102,9 +102,9 @@ resources. For Windows Server 2019 and later, this limit is
 approximately 1280 Mbytes. For versions prior to Windows Server
 2019, the limit is approximately 384 Mbytes.
 
-VMbus messages
+VMBus messages
 --------------
-All VMbus messages have a standard header that includes the message
+All VMBus messages have a standard header that includes the message
 length, the offset of the message payload, some flags, and a
 transactionID. The portion of the message after the header is
 unique to each VSP/VSC pair.
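The standard header corresponds to struct vmpacket_descriptor in
include/linux/hyperv.h, approximately as follows; note that lengths and
offsets are expressed in 8-byte units::

  struct vmpacket_descriptor {
          u16 type;       /* e.g., VM_PKT_DATA_INBAND or VM_PKT_COMP */
          u16 offset8;    /* offset of the payload, in 8-byte units */
          u16 len8;       /* total message length, in 8-byte units */
          u16 flags;      /* e.g., completion requested */
          u64 trans_id;   /* transactionID, echoed back in completions */
  } __packed;
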
@@ -137,7 +137,7 @@ control message contains a list of GPAs that describe the data
 buffer. For example, the storvsc driver uses this approach to
 specify the data buffers to/from which disk I/O is done.
 
-Three functions exist to send VMbus messages:
+Three functions exist to send VMBus messages:
 
 1. vmbus_sendpacket(): Control-only messages and messages with
    embedded data -- no GPAs
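For case 1, a send might look like the following sketch.
vmbus_sendpacket() and the VM_PKT_DATA_INBAND and
VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED constants are from
include/linux/hyperv.h; the message structure and sketch_* names are
hypothetical::

  #include <linux/hyperv.h>

  struct sketch_request {         /* hypothetical device-specific message */
          u32 op;
          u32 arg;
  };

  static int sketch_send_query(struct vmbus_channel *chan,
                               struct sketch_request *req)
  {
          /*
           * In-band message: header plus embedded data, no GPAs. The
           * requestid is echoed back as trans_id in the completion.
           */
          return vmbus_sendpacket(chan, req, sizeof(*req),
                                  (unsigned long)req, VM_PKT_DATA_INBAND,
                                  VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
  }
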
@@ -154,20 +154,20 @@ Historically, Linux guests have trusted Hyper-V to send well-formed
 and valid messages, and Linux drivers for synthetic devices did not
 fully validate messages. With the introduction of processor
 technologies that fully encrypt guest memory and that allow the
-guest to not trust the hypervisor (AMD SNP-SEV, Intel TDX), trusting
+guest to not trust the hypervisor (AMD SEV-SNP, Intel TDX), trusting
 the Hyper-V host is no longer a valid assumption. The drivers for
-VMbus synthetic devices are being updated to fully validate any
+VMBus synthetic devices are being updated to fully validate any
 values read from memory that is shared with Hyper-V, which includes
-messages from VMbus devices. To facilitate such validation,
+messages from VMBus devices. To facilitate such validation,
 messages read by the guest from the "in" ring buffer are copied to a
 temporary buffer that is not shared with Hyper-V. Validation is
 performed in this temporary buffer without the risk of Hyper-V
 maliciously modifying the message after it is validated but before
 it is used.
 
-VMbus interrupts
+VMBus interrupts
 ----------------
-VMbus provides a mechanism for the guest to interrupt the host when
+VMBus provides a mechanism for the guest to interrupt the host when
 the guest has queued new messages in a ring buffer. The host
 expects that the guest will send an interrupt only when an "out"
 ring buffer transitions from empty to non-empty. If the guest sends
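The copy-then-validate pattern described above reduces to roughly the
following; all sketch_* helpers and the message layout are hypothetical,
not code from any in-tree driver::

  struct sketch_msg {                             /* hypothetical layout */
          u32 len;
          /* ... */
  };

  static int sketch_recv(struct vmbus_channel *chan, void *priv_buf,
                         u32 buf_len)
  {
          struct sketch_msg *msg = priv_buf;
          u32 msg_len;

          /*
           * Copy the message out of the Hyper-V-shared "in" ring buffer
           * into a private buffer first ...
           */
          if (sketch_copy_from_ring(chan, priv_buf, buf_len, &msg_len))
                  return -EIO;                    /* hypothetical helper */

          /*
           * ... then validate only the private copy, which Hyper-V cannot
           * modify between the check and the use.
           */
          if (msg_len < sizeof(*msg) || msg->len > msg_len)
                  return -EPROTO;

          return sketch_handle(msg);              /* hypothetical */
  }
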
@@ -177,62 +177,62 @@ interrupts, the host may throttle that guest by suspending its
 execution for a few seconds to prevent a denial-of-service attack.
 
 Similarly, the host will interrupt the guest when it sends a new
-message on the VMbus control path, or when a VMbus channel "in" ring
+message on the VMBus control path, or when a VMBus channel "in" ring
 buffer transitions from empty to non-empty. Each CPU in the guest
-may receive VMbus interrupts, so they are best modeled as per-CPU
+may receive VMBus interrupts, so they are best modeled as per-CPU
 interrupts in Linux. This model works well on arm64 where a single
-per-CPU IRQ is allocated for VMbus. Since x86/x64 lacks support for
+per-CPU IRQ is allocated for VMBus. Since x86/x64 lacks support for
 per-CPU IRQs, an x86 interrupt vector is statically allocated (see
 HYPERVISOR_CALLBACK_VECTOR) across all CPUs and explicitly coded to
-call the VMbus interrupt service routine. These interrupts are
+call the VMBus interrupt service routine. These interrupts are
 visible in /proc/interrupts on the "HYP" line.
 
-The guest CPU that a VMbus channel will interrupt is selected by the
+The guest CPU that a VMBus channel will interrupt is selected by the
 guest when the channel is created, and the host is informed of that
-selection. VMbus devices are broadly grouped into two categories:
+selection. VMBus devices are broadly grouped into two categories:
 
-1. "Slow" devices that need only one VMbus channel. The devices
+1. "Slow" devices that need only one VMBus channel. The devices
    (such as keyboard, mouse, heartbeat, and timesync) generate
-   relatively few interrupts. Their VMbus channels are all
+   relatively few interrupts. Their VMBus channels are all
    assigned to interrupt the VMBUS_CONNECT_CPU, which is always
    CPU 0.
 
-2. "High speed" devices that may use multiple VMbus channels for
+2. "High speed" devices that may use multiple VMBus channels for
    higher parallelism and performance. These devices include the
-   synthetic SCSI controller and synthetic NIC. Their VMbus
+   synthetic SCSI controller and synthetic NIC. Their VMBus
    channel interrupts are assigned to CPUs that are spread out
    among the available CPUs in the VM so that interrupts on
    multiple channels can be processed in parallel.
 
-The assignment of VMbus channel interrupts to CPUs is done in the
+The assignment of VMBus channel interrupts to CPUs is done in the
 function init_vp_index(). This assignment is done outside of the
 normal Linux interrupt affinity mechanism, so the interrupts are
 neither "unmanaged" nor "managed" interrupts.
 
-The CPU that a VMbus channel will interrupt can be seen in
+The CPU that a VMBus channel will interrupt can be seen in
 /sys/bus/vmbus/devices/<deviceGUID>/channels/<channelRelID>/cpu.
 When running on later versions of Hyper-V, the CPU can be changed
 by writing a new value to this sysfs entry. Because the interrupt
 assignment is done outside of the normal Linux affinity mechanism,
 there are no entries in /proc/irq corresponding to individual
-VMbus channel interrupts.
+VMBus channel interrupts.
 
 An online CPU in a Linux guest may not be taken offline if it has
-VMbus channel interrupts assigned to it. Any such channel
+VMBus channel interrupts assigned to it. Any such channel
 interrupts must first be manually reassigned to another CPU as
 described above. When no channel interrupts are assigned to the
 CPU, it can be taken offline.
 
-When a guest CPU receives a VMbus interrupt from the host, the
+When a guest CPU receives a VMBus interrupt from the host, the
 function vmbus_isr() handles the interrupt. It first checks for
 channel interrupts by calling vmbus_chan_sched(), which looks at a
 bitmap set up by the host to determine which channels have pending
 interrupts on this CPU. If multiple channels have pending
 interrupts for this CPU, they are processed sequentially. When all
 channel interrupts have been processed, vmbus_isr() checks for and
-processes any message received on the VMbus control path.
+processes any message received on the VMBus control path.
 
-The VMbus channel interrupt handling code is designed to work
+The VMBus channel interrupt handling code is designed to work
 correctly even if an interrupt is received on a CPU other than the
 CPU assigned to the channel. Specifically, the code does not use
 CPU-based exclusion for correctness. In normal operation, Hyper-V
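The interrupt flow just described can be summarized by the following
simplified model. This is not the kernel's vmbus_isr(); the iterator and
helpers marked hypothetical do not exist, and the real code defers much
of this work to tasklets such as vmbus_on_msg_dpc()::

  /* Simplified per-CPU dispatch model of vmbus_isr(). */
  static void vmbus_isr_sketch(void)
  {
          struct vmbus_channel *chan;

          /*
           * What vmbus_chan_sched() does: scan the host-maintained bitmap
           * of channels with an interrupt pending on this CPU.
           */
          for_each_pending_channel_on_this_cpu(chan)      /* hypothetical */
                  run_channel_callback(chan);             /* hypothetical */

          /* Then handle any message on the VMBus control path. */
          process_control_path_message();                 /* hypothetical */
  }
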
@@ -242,23 +242,23 @@ when Hyper-V will make the transition. The code must work correctly
 even if there is a time lag before Hyper-V starts interrupting the
 new CPU. See comments in target_cpu_store().
 
-VMbus device creation/deletion
+VMBus device creation/deletion
 ------------------------------
 Hyper-V and the Linux guest have a separate message-passing path
 that is used for synthetic device creation and deletion. This
-path does not use a VMbus channel. See vmbus_post_msg() and
+path does not use a VMBus channel. See vmbus_post_msg() and
 vmbus_on_msg_dpc().
 
 The first step is for the guest to connect to the generic
-Hyper-V VMbus mechanism. As part of establishing this connection,
-the guest and Hyper-V agree on a VMbus protocol version they will
+Hyper-V VMBus mechanism. As part of establishing this connection,
+the guest and Hyper-V agree on a VMBus protocol version they will
 use. This negotiation allows newer Linux kernels to run on older
 Hyper-V versions, and vice versa.
 
 The guest then tells Hyper-V to "send offers". Hyper-V sends an
 offer message to the guest for each synthetic device that the VM
-is configured to have. Each VMbus device type has a fixed GUID
-known as the "class ID", and each VMbus device instance is also
+is configured to have. Each VMBus device type has a fixed GUID
+known as the "class ID", and each VMBus device instance is also
 identified by a GUID. The offer message from Hyper-V contains
 both GUIDs to uniquely (within the VM) identify the device.
 There is one offer message for each device instance, so a VM with
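Within the Linux guest, both GUIDs from the offer are available on the
struct hv_device that represents the device. The probe function below is
a hypothetical sketch; the dev_type and dev_instance field names are from
include/linux/hyperv.h::

  static int offer_probe_sketch(struct hv_device *dev,
                                const struct hv_vmbus_device_id *id)
  {
          /*
           * dev_type is the class ID; dev_instance identifies this
           * particular device instance within the VM.
           */
          dev_info(&dev->device, "class %pUl, instance %pUl\n",
                   &dev->dev_type, &dev->dev_instance);
          return 0;
  }
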
@@ -275,7 +275,7 @@ type based on the class ID, and invokes the correct driver to set up
 the device. Driver/device matching is performed using the standard
 Linux mechanism.
 
-The device driver probe function opens the primary VMbus channel to
+The device driver probe function opens the primary VMBus channel to
 the corresponding VSP. It allocates guest memory for the channel
 ring buffers and shares the ring buffer with the Hyper-V host by
 giving the host a list of GPAs for the ring buffer memory. See
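In practice, a probe function usually does all of this through
vmbus_open(), which allocates the ring buffer memory and establishes the
GPADL for it. The ring sizes and sketch_* names below are illustrative::

  static void sketch_onchannel(void *context)
  {
          /* Called when the "in" ring buffer transitions to non-empty. */
  }

  static int open_probe_sketch(struct hv_device *dev,
                               const struct hv_vmbus_device_id *id)
  {
          /*
           * Open the primary channel with a 16-Kbyte ring buffer in each
           * direction; real drivers choose device-specific sizes.
           */
          return vmbus_open(dev->channel, 16 * 1024, 16 * 1024,
                            NULL, 0,            /* no initial user data */
                            sketch_onchannel, dev);
  }
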
@@ -285,7 +285,7 @@ Once the ring buffer is set up, the device driver and VSP exchange
 setup messages via the primary channel. These messages may include
 negotiating the device protocol version to be used between the Linux
 VSC and the VSP on the Hyper-V host. The setup messages may also
-include creating additional VMbus channels, which are somewhat
+include creating additional VMBus channels, which are somewhat
 mis-named as "sub-channels" since they are functionally
 equivalent to the primary channel once they are created.
 
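When a VSC asks the VSP to create sub-channels, it can be notified as
each one is offered by registering a callback on the primary channel.
vmbus_set_sc_create_callback() is the real kernel interface; the rest of
this sketch, including the ring sizes and sketch_* names, is
illustrative::

  static void sc_create_sketch(struct vmbus_channel *new_sc)
  {
          /*
           * Open the new sub-channel just as the primary channel was
           * opened (sketch_onchannel is from the previous sketch); real
           * drivers also set the channel's target CPU here.
           */
          vmbus_open(new_sc, 16 * 1024, 16 * 1024, NULL, 0,
                     sketch_onchannel, new_sc);
  }

  static int request_subchannels_sketch(struct hv_device *dev)
  {
          vmbus_set_sc_create_callback(dev->channel, sc_create_sketch);

          /*
           * ... then send the device-specific "create sub-channels"
           * setup message to the VSP over the primary channel ...
           */
          return 0;
  }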