2 GPU passthrough
SUSE OpenStack Cloud Crowbar GPU passthrough functionality gives the nova instance direct access to the GPU device for better performance. This section demonstrates the steps to pass through an NVIDIA GPU card supported by SUSE OpenStack Cloud Crowbar.
When using PCI Passthrough, resizing the VM to the same host with the same PCI card is not supported.
2.1 Leveraging PCI Passthrough
Leveraging PCI passthrough on a SUSE OpenStack Cloud 9 Compute Node takes two parts: preparing the Compute Node, and preparing nova and glance. Follow the procedures below in sequence.
Procedure 2.1: Preparing the Compute Node
There should be no kernel drivers or binaries with direct access to the PCI device. If there are kernel modules, they should be blacklisted.
For example, it is common to have a nouveau driver from when the node was installed. This driver is a graphics driver for NVIDIA-based GPUs. It must be blacklisted as shown in this example.
root # echo 'blacklist nouveau' >> /etc/modprobe.d/nouveau-default.conf
The file location and its contents are important; the name of the file is your choice. Other drivers can be blacklisted in the same manner, possibly including NVIDIA drivers.
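The blacklist entry can be checked with a small script. The following is a minimal sketch, not part of the product: the helper names `is_blacklisted` and `is_loaded` are hypothetical, and the sketch assumes the blacklist files end in `.conf` as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel module is blacklisted in a
# modprobe.d-style directory (default: /etc/modprobe.d).
is_blacklisted() {
    module=$1
    conf_dir=${2:-/etc/modprobe.d}
    # Match lines such as "blacklist nouveau", allowing extra whitespace.
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*\$" "$conf_dir"/*.conf
}

# Hypothetical helper: succeed if a module appears in lsmod-style output
# read from standard input (the first column is the module name).
is_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

On the compute node, `is_blacklisted nouveau && echo "blacklisted"` checks the configuration, and `lsmod | is_loaded nouveau && echo "still loaded"` checks the running kernel. A module that is already loaded stays loaded until it is unloaded or the node is rebooted.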
On the host, iommu_groups is necessary and may already be enabled. To check if IOMMU is enabled:
root # virt-host-validate
.....
QEMU: Checking if IOMMU is enabled by kernel : WARN (IOMMU appears to be disabled in kernel. Add intel_iommu=on to kernel cmdline arguments)
.....
To modify the kernel cmdline as suggested in the warning, edit the file /etc/default/grub and append intel_iommu=on to the GRUB_CMDLINE_LINUX_DEFAULT variable. Then run update-bootloader.
A reboot is required for iommu_groups to be enabled.
After the reboot, check that IOMMU is enabled:
root # virt-host-validate
.....
QEMU: Checking if IOMMU is enabled by kernel : PASS
.....
Confirm IOMMU groups are available by finding the group associated with your PCI device (for example, an NVIDIA GPU):
root # lspci -nn | grep -i nvidia
84:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100 PCIe 16GB] [10de:1db4] (rev a1)
In this example, 84:00.0 is the address of the PCI device. The vendor ID is 10de. The product ID is 1db4.
Confirm that the devices are available for passthrough:
root # ls -ld /sys/kernel/iommu_groups/*/devices/*84:00.?/
drwxr-xr-x 3 root root 0 Nov 19 17:00 /sys/kernel/iommu_groups/56/devices/0000:84:00.0/
Note: With PCI passthrough, only an entire IOMMU group can be passed. Parts of the group cannot be passed. In this example, the IOMMU group is 56.
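The address, vendor ID, product ID and IOMMU group gathered above can also be collected by a script. The following is a rough sketch under stated assumptions: the helper names `parse_lspci_ids` and `iommu_group_members` are hypothetical, and the optional second argument of `iommu_group_members` exists only so the function can be tried against a test directory instead of the real sysfs tree.

```shell
#!/bin/sh
# Hypothetical helper: print "ADDRESS VENDOR_ID PRODUCT_ID" for each line of
# `lspci -nn` output read from standard input.
parse_lspci_ids() {
    sed -nE 's/^([0-9a-fA-F:.]+) .*\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\].*$/\1 \2 \3/p'
}

# Hypothetical helper: list every device that shares an IOMMU group with the
# given PCI address (including the domain, for example 0000:84:00.0).
# The second argument overrides the sysfs root and is meant for testing only.
iommu_group_members() {
    addr=$1
    root=${2:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/"$addr"; do
        # Skip the unexpanded glob when the address is in no group.
        [ -e "$dev" ] || continue
        group_dir=$(dirname "$(dirname "$dev")")
        echo "group $(basename "$group_dir"):"
        ls "$group_dir/devices"
    done
}
```

For the device in this example, `lspci -nn | grep -i nvidia | parse_lspci_ids` would print the address and the two IDs on one line, and `iommu_group_members 0000:84:00.0` would list the members of the group. If the group contains more than one device, the note above applies: the group can only be passed as a whole.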
2.1.1 Preparing nova and glance for passthrough
Information about configuring nova is available in the documentation at https://docs.openstack.org/nova/rocky/admin/pci-passthrough.html. Services like nova-compute, nova-scheduler and nova-api must be configured.
We suggest creating a new nova configuration file specific to PCI passthrough under /etc/nova/nova.conf.d instead of modifying an existing configuration file (for example, 100-nova.conf or 101-nova-placement.conf), to avoid side effects caused by chef-client.service. Note that the new file, for example 102-nova-pcipassthru.conf, is not recorded in any nova proposal and is not managed by chef-client.service. Keep a backup of it in case of future incidents.
Here is an example configuration in which a single PCI device is passed through to the nova instance:
Log in to the compute host, change to the /etc/nova/nova.conf.d directory, and list all the files:
root # cd /etc/nova/nova.conf.d/
root # ls -al
Create a new nova configuration file in this directory and name it appropriately. Note that configuration files are read and applied in lexicographic order. For example: 102-nova-pcipassthru.conf.
Add the following configuration entries to the file. The values are specific to your compute node, as obtained in Procedure 2.1, “Preparing the Compute Node”. The entries whitelist the PCI device and define the PCI alias for it. Example:
root # cat /etc/nova/nova.conf.d/102-nova-pcipassthru.conf
[pci]
passthrough_whitelist = [{ "address": "0000:84:00.0" }]
alias = { "vendor_id":"10de", "product_id":"1db4", "device_type":"type-PCI", "name":"a1" }
The example above shows how to configure a PCI alias a1 to request a PCI device with a vendor_id of 10de and a product_id of 1db4.
Once the file is updated, restart nova compute on the compute node:
root # systemctl restart openstack-nova-compute
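Because the values differ per compute node, the file contents can be generated from the address and IDs found earlier. The following is a minimal sketch, assuming the same entry layout as the example above; the function name `render_compute_pci_conf` is hypothetical and not part of nova or Crowbar.

```shell
#!/bin/sh
# Hypothetical helper: print a [pci] section for a compute node.
# Arguments: PCI address, vendor ID, product ID, alias name.
render_compute_pci_conf() {
    addr=$1
    vendor=$2
    product=$3
    name=$4
    # The layout mirrors the example file shown in this section.
    cat <<EOF
[pci]
passthrough_whitelist = [{ "address": "$addr" }]
alias = { "vendor_id":"$vendor", "product_id":"$product", "device_type":"type-PCI", "name":"$name" }
EOF
}
```

Running `render_compute_pci_conf 0000:84:00.0 10de 1db4 a1` prints the entries for the device in this example. Redirecting the output to /etc/nova/nova.conf.d/102-nova-pcipassthru.conf and restarting openstack-nova-compute would then apply them.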
Repeat the following steps on all controller nodes. Log in to the controller host, change to the /etc/nova/nova.conf.d directory, and list all the files in it:
root # cd /etc/nova/nova.conf.d/
root # ls -al
Create a new nova configuration file in this directory, named so that it sorts lexicographically after the existing files. For example: 102-nova-pcipassthru.conf.
Add the following configuration entries to the file. On controller nodes, the entries specify only the PCI alias for the device on the compute host. For example:
root # cat /etc/nova/nova.conf.d/102-nova-pcipassthru.conf
[pci]
alias = { "vendor_id":"10de", "product_id":"1db4", "device_type":"type-PCI", "name":"a1" }
The example above shows how to configure a PCI alias a1 to request a PCI device with a vendor_id of 10de and a product_id of 1db4.
Once the file is updated, restart nova api on the controller node:
root # systemctl restart openstack-nova-api
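Since the alias is defined on the compute node and again on every controller node, a quick comparison can catch a mismatch between the files. The following is a rough sketch: the helper name `pci_alias_names` is hypothetical, it assumes each alias entry sits on a single line as in the examples, and it compares alias names only, not the vendor or product IDs.

```shell
#!/bin/sh
# Hypothetical helper: print the alias names defined in a nova configuration
# file, one per line, sorted. Assumes one alias entry per line.
pci_alias_names() {
    grep -E '^[[:space:]]*alias[[:space:]]*=' "$1" |
        sed -nE 's/.*"name"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/p' |
        sort
}
```

With a copy of the compute node file at hand, `[ "$(pci_alias_names compute.conf)" = "$(pci_alias_names 102-nova-pcipassthru.conf)" ] && echo "alias names match"` reports whether both files define the same set of names. The file name compute.conf is a placeholder.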
2.1.2 Flavor Creation
Log in to the controller node and create a new flavor or update an existing flavor with the property "pci_passthrough:alias". For example:
root # source .openrc
root # openstack flavor create --ram 8192 --disk 100 --vcpus 8 gpuflavor
root # openstack flavor set gpuflavor --property "pci_passthrough:alias"="a1:1"
In the property "pci_passthrough:alias"="a1:1", the a1 before the : references the alias name as provided in the configuration entries, while the number 1 after the : tells nova that a single GPU should be assigned.
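The structure of the property value can be illustrated by splitting it into its parts. The following is a minimal sketch for illustration only; the function name `parse_alias_spec` is hypothetical, and nova performs its own parsing of this value.

```shell
#!/bin/sh
# Hypothetical helper: split a pci_passthrough:alias value such as
# "a1:1,a2:1" into one "NAME COUNT" pair per line.
parse_alias_spec() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F: 'NF == 2 { print $1, $2 }'
}
```

For example, `parse_alias_spec "a1:1"` prints `a1 1`: the alias name followed by the number of devices requested. The value currently set on a flavor can be read with `openstack flavor show gpuflavor -f value -c properties`.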
Boot an instance with the flavor created in the previous step. Make sure the VM becomes ACTIVE. Log in to the virtual instance and verify that the GPU is visible from the guest by running the lspci command on the guest.
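The boot and verification step might look like the following sketch. The image name, network name and instance name (IMAGE, NETWORK, gpu-vm) are placeholders, and the helper `count_nvidia_devices` is hypothetical; it only counts matching lines in lspci output and assumes the passed-through device is an NVIDIA GPU.

```shell
#!/bin/sh
# On the controller node (IMAGE, NETWORK and gpu-vm are placeholders):
#   openstack server create --flavor gpuflavor --image IMAGE --network NETWORK gpu-vm
#   openstack server show gpu-vm -f value -c status    # should report ACTIVE

# Hypothetical helper for use inside the guest: count the NVIDIA devices in
# lspci output read from standard input. Prints 0 when none are found.
count_nvidia_devices() {
    grep -ci 'nvidia' || true
}
```

Inside the guest, `lspci | count_nvidia_devices` should print 1 for the single-GPU flavor created above.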
2.1.3 Additional examples
Example 1: Multiple compute hosts
Compute host x:
[pci]
passthrough_whitelist = [{"address": "0000:84:00.0"}]
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
Compute host y:
[pci]
passthrough_whitelist = [{"address": "0000:85:00.0"}]
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
Controller nodes:
[pci]
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
Example 2: Multiple PCI devices on the same host
Compute host z:
[pci]
passthrough_whitelist = [{"vendor_id": "10de", "product_id": "1db4"}, {"vendor_id": "10de", "product_id": "1db1"}]
alias = {"vendor_id": "10de", "name": "a2", "device_type": "type-PCI", "product_id": "1db1"}
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
Controller nodes:
[pci]
alias = {"vendor_id": "10de", "name": "a2", "device_type": "type-PCI", "product_id": "1db1"}
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
To pass both devices to the instance, set the property "pci_passthrough:alias"="a1:1,a2:1":
root # source .openrc
root # openstack flavor create --ram 8192 --disk 100 --vcpus 8 gpuflavor2
root # openstack flavor set gpuflavor2 --property "pci_passthrough:alias"="a1:1,a2:1"
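When a flavor requests several aliases, each requested name has to be defined in the nova configuration. The following is a rough sketch of such a check: the helper name `missing_aliases` is hypothetical, and it assumes one alias entry per line, as in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: print every alias name requested in a
# pci_passthrough:alias value that is not defined in the given nova
# configuration file. Prints nothing when all names are defined.
missing_aliases() {
    spec=$1
    conf=$2
    printf '%s\n' "$spec" | tr ',' '\n' | cut -d: -f1 |
        while read -r name; do
            grep -Eq "^[[:space:]]*alias[[:space:]]*=.*\"name\"[[:space:]]*:[[:space:]]*\"${name}\"" "$conf" ||
                echo "$name"
        done
}
```

On a controller node, `missing_aliases "a1:1,a2:1" /etc/nova/nova.conf.d/102-nova-pcipassthru.conf` prints nothing if both aliases are defined in that file.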