Commit a01efb4

Author: Adam Delo
VFIO/PCIe Passthrough configuration support (#247)
* added basic support for PCIe passthrough for Intel and AMD CPUs

  Feature can be enabled via `pve_pcie_passthrough_enabled`. Mediated devices are supported, but disabled by default since not all boards support GVT-g. Interrupt remapping can also be disabled for boards that do not support it.

* moved GRUB update task to a handler to deduplicate tasks

* added handler for updating initramfs when updating modprobe configuration

* added support for certain PCIe passthrough configurations

  Role variables have been added to allow stubbing PCI devices via Vendor:Product ID when GRUB boots, blocking the loading of modules (e.g. nvidia drivers) via `softdep`, enabling GPU OVMF passthrough, and disabling DMA translation by the hypervisor for passthrough devices.

* added new section for PCIe passthrough in documentation

* added ability to configure KVM module to ignore MSRs and disable logging ignored MSRs

  This fixes issues with certain applications in Windows guests.
1 parent c678974 commit a01efb4

File tree

8 files changed

+261
-16
lines changed

README.md

Lines changed: 60 additions & 0 deletions
@@ -386,6 +386,15 @@ pve_check_for_kernel_update: true # Runs a script on the host to check kernel ve
 pve_reboot_on_kernel_update: false # If set to true, will automatically reboot the machine on kernel updates
 pve_reboot_on_kernel_update_delay: 60 # Number of seconds to wait before and after a reboot process to proceed with next task in cluster mode
 pve_remove_old_kernels: true # Currently removes kernel from main Debian repository
+pve_pcie_passthrough_enabled: false # Set this to true to enable PCIe passthrough.
+pve_iommu_passthrough_mode: false # Set this to true to allow VMs to bypass the DMA translation. This might increase performance for IOMMU passthrough.
+pve_iommu_unsafe_interrupts: false # Set this to true if your system doesn't support interrupt remapping.
+pve_mediated_devices_enabled: false # Set this to true if your device supports GVT-g and you wish to enable split functionality.
+pve_pcie_ovmf_enabled: false # Set this to true to enable GPU OVMF PCI passthrough.
+pve_pci_device_ids: [] # List of PCI device IDs (see https://pve.proxmox.com/wiki/Pci_passthrough#GPU_Passthrough).
+pve_vfio_blacklist_drivers: [] # List of device drivers to blacklist from the Proxmox host (see https://pve.proxmox.com/wiki/PCI(e)_Passthrough).
+pve_pcie_ignore_msrs: false # Set this to true when passing devices through to a Windows VM to prevent the VM from crashing.
+pve_pcie_report_msrs: true # Set this to false to stop warnings about ignored MSRs from being logged to dmesg.
 pve_watchdog: none # Set this to "ipmi" if you want to configure a hardware watchdog. Proxmox uses a software watchdog (nmi_watchdog) by default.
 pve_watchdog_ipmi_action: power_cycle # Can be one of "reset", "power_cycle", and "power_off".
 pve_watchdog_ipmi_timeout: 10 # Number of seconds the watchdog should wait
@@ -760,6 +769,56 @@ nodes).
 `pve_ceph_osds` by default creates unencrypted ceph volumes. To use encrypted
 volumes the parameter `encrypted` has to be set per drive to `true`.
 
+## PCIe Passthrough
+
+This role can be configured to allow PCI device passthrough from the Proxmox host to VMs. This feature is not enabled by default since not all motherboards and CPUs support it. To enable passthrough, the host CPU and motherboard must support hardware virtualization with an IOMMU (VT-d on Intel-based systems and AMD-Vi on AMD-based systems). Refer to the manuals of all components to determine whether this feature is supported. Naming conventions vary, but the feature is usually referred to as IOMMU, VT-d, or AMD-Vi.
+
+By enabling this feature, dedicated devices (such as a GPU or USB devices) can be passed through to the VMs. Along with dedicated devices, various integrated devices such as Intel or AMD integrated GPUs can also be passed through to VMs.
+
+Some devices can take advantage of mediated usage. Mediated devices can be passed through to multiple VMs to share resources, while still remaining usable by the host system. Splitting of devices is not always supported and should be validated before being enabled to prevent errors. Refer to the manual of the device you want to pass through to determine whether the device is capable of mediated usage. (Currently this role only supports GVT-g; SR-IOV is not currently supported and must be enabled manually after role completion.)
+
+The following is an example configuration which enables PCIe passthrough:
+
+```yaml
+pve_pcie_passthrough_enabled: true
+pve_iommu_passthrough_mode: true
+pve_iommu_unsafe_interrupts: false
+pve_mediated_devices_enabled: false
+pve_pcie_ovmf_enabled: false
+pve_pci_device_ids:
+  - id: "10de:1381"
+  - id: "10de:0fbc"
+pve_vfio_blacklist_drivers:
+  - name: "radeon"
+  - name: "nouveau"
+  - name: "nvidia"
+pve_pcie_ignore_msrs: false
+pve_pcie_report_msrs: true
+```
+
+`pve_pcie_passthrough_enabled` is required to use any PCIe passthrough functionality. Without this enabled, all other PCIe related fields will be unused.
+
+`pve_iommu_passthrough_mode` enables IOMMU passthrough mode, which might increase device performance. It allows VMs to bypass the default DMA translation that would normally be performed by the hypervisor. Instead, VMs pass DMA requests directly to the hardware IOMMU.
+
+`pve_iommu_unsafe_interrupts` must be enabled to allow PCI passthrough if your system doesn't support interrupt remapping. You can check whether the system supports interrupt remapping by using `dmesg | grep 'remapping'`. If you see one of the following lines:
+
+- "AMD-Vi: Interrupt remapping enabled"
+- "DMAR-IR: Enabled IRQ remapping in x2apic mode" ('x2apic' can be different on old CPUs, but should still work)
+
+then interrupt remapping is supported and you do not need to enable unsafe interrupts. Be aware that enabling this value can make your system unstable.
+
+`pve_mediated_devices_enabled` enables GVT-g support for integrated devices such as Intel iGPUs. Not all devices support GVT-g, so it is recommended to check your specific device beforehand to ensure it is supported.
+
+`pve_pcie_ovmf_enabled` enables GPU OVMF PCI passthrough. When using OVMF you should select 'OVMF' as the BIOS option for the VM instead of 'SeaBIOS' within Proxmox. This setting will try to opt devices out of VGA arbitration if possible.
+
+`pve_pci_device_ids` is a list of vendor and device IDs to be passed through to VMs from the host. See the section 'GPU Passthrough' on the [Proxmox WIKI](https://pve.proxmox.com/wiki/Pci_passthrough) to find your specific vendor and device IDs. When setting this value, you must specify an 'id' for each element in the array.
+
+`pve_vfio_blacklist_drivers` is a list of drivers to be excluded/blacklisted from the host. This is required when passing through a PCI device to prevent the host from using the device before it can be assigned to a VM. When setting this value, you must specify a 'name' for each element in the array.
+
+`pve_pcie_ignore_msrs` prevents some Windows applications like GeForce Experience, Passmark Performance Test and SiSoftware Sandra from crashing the VM. This value is only required when passing PCI devices to Windows based systems.
+
+`pve_pcie_report_msrs` can be used to enable or disable logging of MSR warnings. If you see a lot of warning messages in your 'dmesg' system log, this value can be used to silence MSR warnings.
+
 ## Developer Notes
 
 When developing new features or fixing something in this role, you can test out
@@ -802,6 +861,7 @@ PendaGTP ([@PendaGTP](https://github.com/PendaGTP)) - Ceph support
 John Marion ([@jmariondev](https://github.com/jmariondev))
 foerkede ([@foerkede](https://github.com/foerkede)) - ZFS storage support
 Guiffo Joel ([@futuriste](https://github.com/futuriste)) - Pool configuration support
+Adam Delo ([@ol3d](https://github.com/ol3d)) - PCIe Passthrough Support
 
 [Full list of contributors](https://github.com/lae/ansible-role-proxmox/graphs/contributors)
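
The README's interrupt remapping check (`dmesg | grep 'remapping'`) can also be run from a playbook before deciding on `pve_iommu_unsafe_interrupts`. The following is a minimal sketch, not part of this commit: the task names and the `_remap_check` variable are hypothetical, and it assumes the play runs with privileges sufficient to read the kernel log.

```yaml
# Illustrative only: these tasks are not part of the role.
- name: Look for interrupt remapping messages in the kernel log
  ansible.builtin.shell: "dmesg | grep -i 'remapping'"
  register: _remap_check   # hypothetical variable name
  changed_when: false      # read-only check
  failed_when: false       # grep exits non-zero when nothing matches

- name: Report whether unsafe interrupts may be needed
  ansible.builtin.debug:
    msg: >-
      {{ 'Interrupt remapping appears to be enabled'
         if ('remapping enabled' in _remap_check.stdout | lower
             or 'enabled irq remapping' in _remap_check.stdout | lower)
         else 'No interrupt remapping message found; pve_iommu_unsafe_interrupts may be needed' }}
```

The two substrings correspond to the AMD-Vi and DMAR-IR messages quoted in the README; other kernels may word these messages differently, so treat a negative result as a prompt to inspect the log by hand.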

defaults/main.yml

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,15 @@ pve_reboot_on_kernel_update_delay: 60
 pve_remove_old_kernels: true
 pve_run_system_upgrades: false
 pve_run_proxmox_upgrades: true
+pve_pcie_passthrough_enabled: false
+pve_iommu_passthrough_mode: false
+pve_iommu_unsafe_interrupts: false
+pve_mediated_devices_enabled: false
+pve_pcie_ovmf_enabled: false
+pve_pci_device_ids: []
+pve_vfio_blacklist_drivers: []
+pve_pcie_ignore_msrs: false
+pve_pcie_report_msrs: true
 pve_watchdog: none
 pve_watchdog_ipmi_action: power_cycle
 pve_watchdog_ipmi_timeout: 10

handlers/main.yml

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,3 +32,12 @@
     name: ceph.service
     state: restarted
     daemon_reload: true
+
+- name: update-initramfs
+  command: update-initramfs -u -k all
+
+- name: update-grub
+  command: update-grub
+  register: _pve_grub_update
+  failed_when: ('error' in _pve_grub_update.stderr)
+  tags: skiponlxc

tasks/disable_nmi_watchdog.yml

Lines changed: 1 addition & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -30,11 +30,4 @@
     dest: /etc/default/grub
     line: 'GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX nmi_watchdog=0"'
     insertafter: '^GRUB_CMDLINE_LINUX="'
-  register: _pve_grub
-
-- name: Update GRUB configuration
-  command: update-grub
-  register: _pve_grub_update
-  failed_when: ('error' in _pve_grub_update.stderr)
-  when: "_pve_grub is changed"
-  tags: skiponlxc
+  notify: update-grub

tasks/kernel_module_cleanup.yml

Lines changed: 72 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -20,14 +20,7 @@
     dest: /etc/default/grub
     line: 'GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX nmi_watchdog=0"'
     state: absent
-  register: _pve_grub
-
-- name: Update GRUB configuration
-  command: update-grub
-  register: _pve_grub_update
-  failed_when: ('error' in _pve_grub_update.stderr)
-  when: "_pve_grub is changed"
-  tags: skiponlxc
+  notify: update-grub
 
 - name: Remove ipmi_watchdog modprobe configuration
   file:
@@ -46,3 +39,74 @@
   notify:
     - restart watchdog-mux
   when: "pve_watchdog != 'ipmi'"
+
+- name: Modify vfio IOMMU references and configuration in default grub
+  ansible.builtin.blockinfile:
+    dest: /etc/default/grub
+    state: absent
+    marker: "# {mark}: IOMMU default grub configuration (managed by ansible)."
+  notify: update-grub
+  when: >
+    (not pve_pcie_passthrough_enabled | bool) or
+    ((not 'GenuineIntel' in ansible_processor | unique) and
+    (not pve_iommu_passthrough_mode | bool) and
+    (not pve_mediated_devices_enabled | bool) and
+    (not pve_pci_device_ids | length > 0))
+
+- name: Remove modprobe.d configuration files
+  notify: update-initramfs
+  block:
+    - name: Remove vfio config file
+      ansible.builtin.file:
+        dest: /etc/modprobe.d/vfio.conf
+        state: absent
+      when: >
+        (not pve_pcie_passthrough_enabled | bool) or
+        ((not pve_pci_device_ids | length > 0) and
+        (not pve_vfio_blacklist_drivers | length > 0) and
+        (not pve_pcie_ovmf_enabled | bool))
+
+    - name: Remove driver blacklist config file
+      ansible.builtin.file:
+        dest: /etc/modprobe.d/blacklist.conf
+        state: absent
+      when: >
+        (not pve_pcie_passthrough_enabled | bool) or
+        (not pve_vfio_blacklist_drivers | length > 0)
+
+    - name: Remove kvm config file
+      ansible.builtin.file:
+        dest: /etc/modprobe.d/kvm.conf
+        state: absent
+      when: >
+        (not pve_pcie_passthrough_enabled | bool) or
+        ((not pve_pcie_ignore_msrs | bool) and
+        (pve_pcie_report_msrs | bool))
+
+    - name: Disable declaring IOMMU unsafe interrupts on init
+      ansible.builtin.file:
+        dest: /etc/modprobe.d/iommu_unsafe_interrupts.conf
+        state: absent
+      when: >
+        (not pve_pcie_passthrough_enabled | bool) or
+        (not pve_iommu_unsafe_interrupts | bool)
+
+- name: Remove all GVT-g configuration
+  notify: update-initramfs
+  block:
+    - name: Remove modules list for GVT-g
+      ansible.builtin.blockinfile:
+        dest: /etc/modules
+        state: absent
+        marker: "# {mark}: Modules required for GVT-g (managed by ansible)."
+      when: >
+        (not pve_pcie_passthrough_enabled | bool) or
+        (not pve_mediated_devices_enabled | bool)
+
+    - name: Remove modules list required for PCI passthrough
+      ansible.builtin.blockinfile:
+        dest: /etc/modules
+        state: absent
+        marker: "# {mark}: Modules required for PCI passthrough (managed by ansible)."
+      when: >
+        (not pve_pcie_passthrough_enabled | bool)

tasks/main.yml

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -201,6 +201,9 @@
   when:
     - "'pve-no-subscription' in pve_repository_line"
 
+- import_tasks: pcie_passthrough.yml
+  when: "pve_pcie_passthrough_enabled | bool"
+
 - import_tasks: kernel_updates.yml
 
 - import_tasks: ipmi_watchdog.yml

tasks/pcie_passthrough.yml

Lines changed: 93 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,93 @@
+---
+- name: Modify vfio IOMMU references and configuration in default grub
+  ansible.builtin.blockinfile:
+    dest: /etc/default/grub
+    marker: "# {mark}: IOMMU default grub configuration (managed by ansible)."
+    content: "\
+      {% if '\"GenuineIntel\" in ansible_processor | unique' %}GRUB_CMDLINE_LINUX=\"$GRUB_CMDLINE_LINUX intel_iommu=on\"\n{% endif %}\
+      {% if (pve_iommu_passthrough_mode | bool) %}GRUB_CMDLINE_LINUX=\"$GRUB_CMDLINE_LINUX iommu=pt\"\n{% endif %}\
+      {% if (pve_mediated_devices_enabled | bool) %}GRUB_CMDLINE_LINUX=\"$GRUB_CMDLINE_LINUX i915.enable_gvt=1 i915.enable_guc=0\"\n{% endif %}\
+      {% if (pve_pci_device_ids | length > 0) %}GRUB_CMDLINE_LINUX=\"$GRUB_CMDLINE_LINUX vfio-pci.ids={% for k in pve_pci_device_ids %}{{ k.id }}{% if k != (pve_pci_device_ids | last) %},{% endif %}{% endfor %}\"{% endif %}"
+    insertafter: '^GRUB_CMDLINE_LINUX=""'
+    mode: "0640"
+  notify: update-grub
+  when: >
+    ('GenuineIntel' in ansible_processor | unique) or
+    (pve_iommu_passthrough_mode | bool) or
+    (pve_mediated_devices_enabled | bool) or
+    (pve_pci_device_ids | length > 0)
+
+- name: Create/Modify modprobe.d configuration files
+  notify: update-initramfs
+  block:
+    - name: Specify vfio configuration options
+      ansible.builtin.blockinfile:
+        dest: /etc/modprobe.d/vfio.conf
+        marker: "# {mark}: VFIO driver configuration options (managed by ansible)."
+        content: "\
+          {% if (pve_vfio_blacklist_drivers | length > 0) %}{% for k in pve_vfio_blacklist_drivers %}softdep {{ k.name }} pre: vfio-pci\n{% endfor %}{% endif %}\
+          {% if (pve_pcie_ovmf_enabled | bool) %}options vfio-pci disable_vga=1\n{% endif %}\
+          {% if (pve_pci_device_ids | length > 0) %}options vfio-pci
+          ids={% for k in pve_pci_device_ids %}{{ k.id }}{% if k != (pve_pci_device_ids | last) %},{% endif %}{% endfor %}{% endif %}"
+        mode: "0640"
+        create: true
+      when: >
+        (pve_vfio_blacklist_drivers | length > 0) or
+        (pve_pci_device_ids | length > 0) or
+        (pve_pcie_ovmf_enabled | bool)
+
+    - name: Blacklist drivers from host
+      ansible.builtin.blockinfile:
+        dest: /etc/modprobe.d/blacklist.conf
+        marker: "# {mark}: Blacklist drivers from host (managed by ansible)."
+        content: "{% for k in pve_vfio_blacklist_drivers %}blacklist {{ k.name }}\n{% endfor %}"
+        mode: "0640"
+        create: true
+      when: >
+        (pve_vfio_blacklist_drivers | length > 0)
+
+    - name: Specify kvm configuration options
+      ansible.builtin.blockinfile:
+        dest: /etc/modprobe.d/kvm.conf
+        marker: "# {mark}: VFIO driver configuration options (managed by ansible)."
+        content: "\
+          {% if (pve_pcie_ignore_msrs | bool) %}options kvm ignore_msrs=1\n{% endif %}\
+          {% if (not pve_pcie_report_msrs | bool) %}options kvm report_ignored_msrs=0{% endif %}"
+        mode: "0640"
+        create: true
+      when: >
+        (pve_pcie_ignore_msrs | bool) or
+        (not pve_pcie_report_msrs | bool)
+
+    - name: Enable IOMMU Interrupt Remapping
+      ansible.builtin.blockinfile:
+        dest: /etc/modprobe.d/iommu_unsafe_interrupts.conf
+        marker: "# {mark}: IOMMU Interrupt Remapping configuration (managed by ansible)."
+        content: "options vfio_iommu_type1 allow_unsafe_interrupts=1"
+        mode: "0640"
+        create: true
+      when: >
+        (pve_iommu_unsafe_interrupts | bool)
+
+- name: Modify required modules list
+  notify: update-initramfs
+  block:
+    - name: Modify modules list for PCIe Passthrough
+      ansible.builtin.blockinfile:
+        dest: /etc/modules
+        marker: "# {mark}: Modules required for PCI passthrough (managed by ansible)."
+        content: |
+          vfio
+          vfio_iommu_type1
+          vfio_pci
+          vfio_virqfd
+
+    - name: Modify modules list for GVT-g
+      ansible.builtin.blockinfile:
+        dest: /etc/modules
+        marker: "# {mark}: Modules required for GVT-g (managed by ansible)."
+        content: |
+          kvmgt
+          mdev
+      when: >
+        (pve_mediated_devices_enabled | bool)
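
Two details of the GRUB template in this file may be easier to see in isolation. First, the `intel_iommu=on` condition wraps the whole membership test in quotes, so Jinja evaluates a non-empty string literal, which is always true. Second, the `vfio-pci.ids=` list is assembled with a loop and a last-element comparison. The sketch below is not part of the commit: it assumes the intended Intel test is the unquoted form already used in the task's `when:` clause, and it uses the standard `map` and `join` filters as one possible equivalent for the ID list. The debug task itself is purely illustrative.

```yaml
# Illustrative preview task, not part of the role.
- name: Preview kernel command-line additions
  ansible.builtin.debug:
    msg: >-
      {% if 'GenuineIntel' in ansible_processor | unique %}intel_iommu=on{% endif %}
      {% if pve_pci_device_ids | length > 0 %}vfio-pci.ids={{ pve_pci_device_ids | map(attribute='id') | join(',') }}{% endif %}
```

With the two example IDs from the README, the ID portion evaluates to `vfio-pci.ids=10de:1381,10de:0fbc`. Using `join` also avoids comparing each element against the last one, which would drop a separator if the list's final ID were repeated earlier in the list.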

tests/vagrant/group_vars/all

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,20 @@ pve_extra_packages:
 pve_check_for_kernel_update: true
 pve_reboot_on_kernel_update: true
 pve_run_system_upgrades: true
+pve_pcie_passthrough_enabled: true
+pve_iommu_passthrough_mode: true
+pve_iommu_unsafe_interrupts: true
+pve_mediated_devices_enabled: true
+pve_pcie_ovmf_enabled: true
+pve_pci_device_ids:
+  - id: "10de:1381"
+  - id: "10de:0fbc"
+pve_vfio_blacklist_drivers:
+  - name: "radeon"
+  - name: "nouveau"
+  - name: "nvidia"
+pve_pcie_ignore_msrs: true
+pve_pcie_report_msrs: false
 pve_zfs_enabled: yes
 pve_zfs_zed_email: root@localhost
 pve_cluster_enabled: yes
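
These test variables enable every passthrough option, so a follow-up check could confirm that the rendered modprobe configuration contains the configured device IDs. The following is a minimal sketch under stated assumptions, not part of the commit: the `_vfio_conf` variable and task names are hypothetical, and the tasks assume privilege escalation because the file is created with mode `0640`.

```yaml
# Illustrative verification tasks, not part of the role or its tests.
- name: Read the rendered vfio configuration
  ansible.builtin.slurp:
    src: /etc/modprobe.d/vfio.conf
  register: _vfio_conf   # hypothetical variable name
  become: true

- name: Assert every configured device ID is present
  ansible.builtin.assert:
    that:
      - "item.id in (_vfio_conf.content | b64decode)"
  loop: "{{ pve_pci_device_ids }}"
```

`slurp` returns the file base64-encoded, hence the `b64decode` filter before the substring test. Checking each ID separately keeps the assertion independent of the exact whitespace the template produces.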
