Applies to SUSE OpenStack Cloud Crowbar 9

2 GPU passthrough

The GPU passthrough functionality of SUSE OpenStack Cloud Crowbar gives a nova instance direct access to the GPU device for better performance. This section describes how to pass through an NVIDIA GPU card supported by SUSE OpenStack Cloud Crowbar.

Note

When using PCI Passthrough, resizing the VM to the same host with the same PCI card is not supported.

2.1 Leveraging PCI Passthrough

Leveraging PCI passthrough on a SUSE OpenStack Cloud 9 Compute Node involves two parts: preparing the Compute Node, and preparing nova and glance. Follow the procedures below in sequence:

Procedure 2.1: Preparing the Compute Node

No kernel drivers or binaries should have direct access to the PCI device. Any kernel modules that do should be blacklisted.
For example, the nouveau driver, a graphics driver for NVIDIA-based GPUs, is commonly present from when the node was installed. It must be blacklisted as shown in this example.
```
root # echo 'blacklist nouveau' >> /etc/modprobe.d/nouveau-default.conf
```
The file location and its contents are important; the name of the file is your choice. Other drivers, possibly including NVIDIA drivers, can be blacklisted in the same manner.
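To confirm that a blacklist entry is in place, the modprobe.d directory can be searched for it. The following is a minimal sketch, not part of the product tooling: the function name `is_blacklisted` and its optional directory argument are illustrative assumptions.

```shell
# Sketch: succeed if a kernel module is blacklisted in a modprobe.d directory.
# The function name and the optional directory argument are illustrative;
# the directory defaults to /etc/modprobe.d.
is_blacklisted() (
    module=$1
    conf_dir=${2:-/etc/modprobe.d}
    # Match lines of the form "blacklist <module>", allowing surrounding
    # whitespace. Commented-out lines do not match.
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" \
        "$conf_dir"/*.conf
)

# Example use on the compute node:
#   is_blacklisted nouveau && echo "nouveau is blacklisted"
```

Whether the module is currently loaded is a separate question; `lsmod | grep nouveau` shows that.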
IOMMU groups must be enabled on the host, and may already be. To check whether IOMMU is enabled:
```
root # virt-host-validate
.....
QEMU: Checking if IOMMU is enabled by kernel
: WARN (IOMMU appears to be disabled in kernel. Add intel_iommu=on to kernel cmdline arguments)
.....
```
To modify the kernel command line as suggested in the warning, edit the file /etc/default/grub and append intel_iommu=on to the GRUB_CMDLINE_LINUX_DEFAULT variable. Then run update-bootloader.
A reboot is required for iommu_groups to be enabled.
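The edit to /etc/default/grub can also be scripted. The sketch below is one possible approach. It assumes that GRUB_CMDLINE_LINUX_DEFAULT is set on a single line with a double-quoted value; the function name `add_kernel_arg` is an illustrative assumption. Back up the file before trying it.

```shell
# Sketch: append a kernel argument to GRUB_CMDLINE_LINUX_DEFAULT unless it is
# already present. Assumes a single-line, double-quoted value such as
#   GRUB_CMDLINE_LINUX_DEFAULT="splash=silent quiet"
# If no such line exists, the file is left unchanged.
# The argument must not contain "/" or "&" (they are special to sed).
add_kernel_arg() (
    arg=$1
    grub_file=${2:-/etc/default/grub}
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]${arg}[\" ]" "$grub_file"; then
        echo "${arg} already present in ${grub_file}"
    else
        tmp=$(mktemp) || exit 1
        status=0
        # Insert the argument just before the closing quote, then write the
        # result back in place so that ownership and permissions are kept.
        sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${arg}\"/" \
            "$grub_file" > "$tmp" && cat "$tmp" > "$grub_file" || status=1
        rm -f "$tmp"
        exit "$status"
    fi
)

# Example use, followed by the steps described above:
#   add_kernel_arg intel_iommu=on
#   update-bootloader
```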

After the reboot, check that IOMMU is enabled:

```
root # virt-host-validate
.....
QEMU: Checking if IOMMU is enabled by kernel
: PASS
.....
```

Confirm that IOMMU groups are available by finding the group associated with your PCI device (for example, an NVIDIA GPU):
```
root # lspci -nn | grep -i nvidia
84:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100 PCIe 16GB] [10de:1db4] (rev a1)
```
In this example, 84:00.0 is the address of the PCI device. The vendor ID is 10de. The product ID is 1db4.
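The three values can also be extracted from the `lspci -nn` line with a small filter. This is a sketch only; the function name `parse_lspci_line` and the output format are illustrative assumptions.

```shell
# Sketch: split one line of `lspci -nn` output (read from standard input)
# into the address, vendor ID and product ID. The function name and the
# key=value output format are illustrative.
parse_lspci_line() {
    # The [vvvv:pppp] pair is the only bracketed field containing a colon
    # between two groups of four hexadecimal digits.
    sed -n 's/^\([0-9a-fA-F:.]*\) .*\[\([0-9a-fA-F]\{4\}\):\([0-9a-fA-F]\{4\}\)\].*$/address=\1 vendor_id=\2 product_id=\3/p'
}

# Example use:
#   lspci -nn | grep -i nvidia | parse_lspci_line
# For the line shown above, this prints:
#   address=84:00.0 vendor_id=10de product_id=1db4
```

The nova configuration later in this section uses the address with its PCI domain prefix (0000:84:00.0). Running `lspci -D -nn` prints addresses in that form.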
Confirm that the devices are available for passthrough:
```
root # ls -ld /sys/kernel/iommu_groups/*/devices/*84:00.?/
drwxr-xr-x 3 root root 0 Nov 19 17:00 /sys/kernel/iommu_groups/56/devices/0000:84:00.0/
```
Note
With PCI passthrough, only an entire IOMMU group can be passed. Parts of the group cannot be passed. In this example, the IOMMU group is 56.
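Because an IOMMU group can only be passed as a whole, it helps to see every device that shares a group with the GPU. The following sketch lists them. The function name `iommu_group_members` is an illustrative assumption, and the second argument exists only so that the function can be tried against a test directory instead of the real sysfs tree.

```shell
# Sketch: list every device that shares an IOMMU group with a PCI address.
# The function name is illustrative. The optional second argument defaults
# to the real sysfs location.
iommu_group_members() (
    address=$1                                  # for example 0000:84:00.0
    groups_dir=${2:-/sys/kernel/iommu_groups}
    for dev in "$groups_dir"/*/devices/"$address"; do
        # Skip the unexpanded pattern when the address is in no group.
        [ -e "$dev" ] || continue
        group_dir=$(dirname "$(dirname "$dev")")
        echo "group $(basename "$group_dir"):"
        ls "$group_dir/devices"
    done
)

# Example use on the compute node:
#   iommu_group_members 0000:84:00.0
```

If the output lists more than the GPU itself, all listed devices belong to the same group.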

2.1.1 Preparing nova and glance for passthrough

Information about configuring nova is available in the documentation at https://docs.openstack.org/nova/rocky/admin/pci-passthrough.html. Services like nova-compute, nova-scheduler and nova-api must be configured.

Note

Create a new configuration file specific to PCI passthrough under /etc/nova/nova.conf.d instead of modifying nova.conf or the existing files in that directory (for example, 100-nova.conf or 101-nova-placement.conf). This avoids side effects caused by chef-client.service. Note that the new file, for example 102-nova-pcipassthru.conf, is not recorded in any nova proposal and is not managed by chef-client.service. Keep a backup of it in case of future incidents.

The following example configures a single PCI device to be passed through to a nova instance:

Procedure 2.2: Configure the Compute Host

Log in to the compute host, change to the directory /etc/nova/nova.conf.d, and list all the files:
```
root # cd /etc/nova/nova.conf.d/
root # ls -al
```
Create a new nova configuration file under this folder and name the file appropriately, for example 102-nova-pcipassthru.conf. Note that configuration files are read and applied in lexicographic order.
Add the following configuration entries to the file. The values are specific to your compute node, as obtained in Procedure 2.1, “Preparing the Compute Node”. The entries whitelist the PCI device and define a PCI alias for it. Example:
```
root # cat /etc/nova/nova.conf.d/102-nova-pcipassthru.conf
[pci]
passthrough_whitelist = [{ "address": "0000:84:00.0" }]
alias = { "vendor_id":"10de", "product_id":"1db4", "device_type":"type-PCI", "name":"a1" }
```
The example above shows how to configure a PCI alias a1 to request a PCI device with a vendor_id of 10de and a product_id of 1db4.
Once the file is updated, restart the nova-compute service on the compute node:
```
root # systemctl restart openstack-nova-compute
```
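The configuration file from this procedure can also be generated from the values found in Procedure 2.1. The sketch below prints the same [pci] section as the example above. The function name `render_pci_conf` and its argument order are illustrative assumptions.

```shell
# Sketch: print the [pci] section for a compute host.
# Arguments (illustrative order):
#   $1 = PCI address, $2 = vendor ID, $3 = product ID, $4 = alias name
render_pci_conf() {
    printf '[pci]\n'
    printf 'passthrough_whitelist = [{ "address": "%s" }]\n' "$1"
    printf 'alias = { "vendor_id":"%s", "product_id":"%s", "device_type":"type-PCI", "name":"%s" }\n' \
        "$2" "$3" "$4"
}

# Example use on the compute host:
#   render_pci_conf 0000:84:00.0 10de 1db4 a1 \
#       > /etc/nova/nova.conf.d/102-nova-pcipassthru.conf
```

Review the generated file before restarting the service.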

Procedure 2.3: Configure the Controller nodes

Perform the following steps on all controller nodes. Log in to the controller host, change to the directory /etc/nova/nova.conf.d, and list all the files under that folder:
```
root # cd /etc/nova/nova.conf.d/
root # ls -al
```
Create a new nova configuration file under this folder and name it so that it sorts after the existing files in lexicographic order. For example: 102-nova-pcipassthru.conf.
Add the following configuration entries to the file. On controller nodes, the entries only specify the PCI alias for the device on the compute host. For example:
```
root # cat /etc/nova/nova.conf.d/102-nova-pcipassthru.conf
[pci]
alias = { "vendor_id":"10de", "product_id":"1db4", "device_type":"type-PCI", "name":"a1" }
```
The example above shows how to configure a PCI alias a1 to request a PCI device with a vendor_id of 10de and a product_id of 1db4.
Once the file is updated, restart the nova-api service on the controller node:
```
root # systemctl restart openstack-nova-api
```

2.1.2 Flavor Creation

Log in to the controller node and create a new flavor or update an existing flavor with the property "pci_passthrough:alias". For example:

```
root # source .openrc
root # openstack flavor create --ram 8192 --disk 100 --vcpus 8 gpuflavor
root # openstack flavor set gpuflavor --property "pci_passthrough:alias"="a1:1"
```

In the property, "pci_passthrough:alias"="a1:1", a1 before the : references the alias name as provided in the configuration entries while the number 1 after the : tells nova that a single GPU should be assigned.

Boot an instance with the flavor created in the previous step.

Make sure the VM becomes ACTIVE. Log in to the virtual instance and verify that the GPU is visible from the guest by running the lspci command on the guest.

2.1.3 Additional examples

Example 1: Multiple compute hosts

Compute host x:

```
[pci]
passthrough_whitelist = [{"address": "0000:84:00.0"}]
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
```

Compute host y:

```
[pci]
passthrough_whitelist = [{"address": "0000:85:00.0"}]
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
```

Controller nodes:

```
[pci]
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
```

Example 2: Multiple PCI devices on the same host

Compute host z:

```
[pci]
passthrough_whitelist = [{"vendor_id": "10de", "product_id": "1db4"}, {"vendor_id": "10de", "product_id": "1db1"}]
alias = {"vendor_id": "10de", "name": "a2", "device_type": "type-PCI", "product_id": "1db1"}
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
```

Controller nodes:

```
[pci]
alias = {"vendor_id": "10de", "name": "a2", "device_type": "type-PCI", "product_id": "1db1"}
alias = {"vendor_id": "10de", "name": "a1", "device_type": "type-PCI", "product_id": "1db4"}
```

To pass both devices to the instance, set the flavor property "pci_passthrough:alias"="a1:1,a2:1":

```
root # source .openrc
root # openstack flavor create --ram 8192 --disk 100 --vcpus 8 gpuflavor2
root # openstack flavor set gpuflavor2 --property "pci_passthrough:alias"="a1:1,a2:1"
```
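The value of "pci_passthrough:alias" is a comma-separated list of name:count entries. The sketch below only illustrates how such a value breaks down; it does not call nova. The function name `describe_alias_spec` is an illustrative assumption.

```shell
# Sketch: break a "pci_passthrough:alias" value into alias names and
# device counts. The function name is illustrative.
describe_alias_spec() {
    # Split on commas, then on the colon in each "name:count" entry.
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=: read -r name count; do
        echo "alias=$name count=$count"
    done
}

# Example use:
#   describe_alias_spec "a1:1,a2:1"
# prints:
#   alias=a1 count=1
#   alias=a2 count=1
```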