B Configuring GPU Pass-Through for NVIDIA cards #
B.1 Introduction #
This article describes how to assign an NVIDIA GPU on the host machine to a virtualized guest.
B.2 Prerequisites #
GPU pass-through is supported on the AMD64/Intel 64 architecture only.
The host operating system needs to be SLES 12 SP3 or newer.
The instructions in this article are based on NVIDIA V100 and T1000 cards and are meant for GPU computation purposes only.
Verify that you are using an NVIDIA Tesla product—Maxwell, Pascal, or Volta.
To be able to manage the host system, you need an additional display card on the host that you can use when configuring the GPU Pass-Through, or a functional SSH environment.
B.3 Configuring the host #
B.3.1 Verify the host environment #
Verify that the host operating system is SLES 12 SP3 or newer:
> cat /etc/issue
Welcome to SUSE Linux Enterprise Server 15 (x86_64) - Kernel \r (\l).
Verify that the host supports VT-d technology and that it is already enabled in the firmware settings:
> dmesg | grep -e "Directed I/O"
[ 12.819760] DMAR: Intel(R) Virtualization Technology for Directed I/O
If VT-d is not enabled in the firmware, enable it and reboot the host.
Verify that the host has an extra GPU or VGA card:
> lspci | grep -i "vga"
07:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. \
  MGA G200e [Pilot] ServerEngines (SEP1) (rev 05)
With a Tesla V100 card:
> lspci | grep -i nvidia
03:00.0 3D controller: NVIDIA Corporation GV100 [Tesla V100 PCIe] (rev a1)
With a T1000 Mobile (available on Dell 5540):
> lspci | grep -i nvidia
01:00.0 3D controller: NVIDIA Corporation TU117GLM [Quadro T1000 Mobile] (rev a1)
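The bus address printed in the first column is needed again in later steps. As a minimal sketch, the helper below extracts it from lspci output. The function name nvidia_pci_addresses is hypothetical and not part of any SUSE tool.

```shell
# Hypothetical helper: print the bus address (first column) of every
# NVIDIA device found in lspci output supplied on standard input.
nvidia_pci_addresses() {
    awk 'tolower($0) ~ /nvidia/ { print $1 }'
}
```

On the host, it could be used as: lspci | nvidia_pci_addresses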
B.3.2 Enable IOMMU #
IOMMU is disabled by default. You need to enable it at boot time in the /etc/default/grub configuration file.
For Intel-based hosts:
GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt rd.driver.pre=vfio-pci"
For AMD-based hosts:
GRUB_CMDLINE_LINUX="iommu=pt amd_iommu=on rd.driver.pre=vfio-pci"
After you save the modified /etc/default/grub file, re-generate the main GRUB 2 configuration file /boot/grub2/grub.cfg:
> sudo grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot the host and verify that IOMMU is enabled:
> dmesg | grep -e DMAR -e IOMMU
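In addition to reading the kernel log, you can check that the parameter from /etc/default/grub reached the running kernel. The following is a sketch only. The function name iommu_cmdline_ok is hypothetical, and it tests just for the two parameters used in this article.

```shell
# Hypothetical helper: succeed if the kernel command line passed as $1
# contains intel_iommu=on or amd_iommu=on as a separate word.
iommu_cmdline_ok() {
    case " $1 " in
        *" intel_iommu=on "*|*" amd_iommu=on "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After the reboot, it could be called as: iommu_cmdline_ok "$(cat /proc/cmdline)" && echo "IOMMU parameter is set"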
B.3.3 Blacklist the Nouveau driver #
Because we want to assign the NVIDIA card to a VM guest, we need to prevent the host OS's built-in driver for NVIDIA GPUs from using the card. The open source NVIDIA driver is called nouveau. Edit the file /etc/modprobe.d/50-blacklist.conf and append the following line to its end:
blacklist nouveau
B.3.4 Configure VFIO and isolate the GPU used for pass-through #
Find the card vendor and model IDs. Use the bus number identified in Section B.3.1, “Verify the host environment”, for example 03:00.0:
> lspci -nn | grep 03:00.0
03:00.0 3D controller [0302]: NVIDIA Corporation GV100 [Tesla V100 PCIe] [10de:1db4] (rev a1)
Create the file vfio.conf in the /etc/modprobe.d/ directory with the following content:
options vfio-pci ids=10de:1db4
Note
Verify that your card does not need an extra ids= parameter. For some cards, you must specify the audio device too, so that device's ID must also be added to the list, otherwise you will not be able to use the card.
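When several IDs are needed, the options line can be derived from the lspci -nn output instead of being typed by hand. This is a minimal sketch, assuming the IDs appear as [vendor:device] in lowercase hexadecimal as in the output above. The function name vfio_ids_line is hypothetical.

```shell
# Hypothetical helper: read `lspci -nn` lines on standard input and print
# the matching "options vfio-pci ids=..." line. Only bracketed
# vendor:device pairs such as [10de:1db4] are picked up; the class code
# ([0302]) and the model name in brackets are ignored.
vfio_ids_line() {
    ids=$(grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]' | paste -sd, -)
    [ -n "$ids" ] || return 1
    printf 'options vfio-pci ids=%s\n' "$ids"
}
```

For example, lspci -nn | grep 03:00 | vfio_ids_line prints a line with one ID per function of the card. Review the result before writing it to /etc/modprobe.d/vfio.conf.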
B.3.5 Load the VFIO driver #
There are three ways you can load the VFIO driver.
B.3.5.1 Including the driver in the initrd file #
Create the file /etc/dracut.conf.d/gpu-passthrough.conf and add the following content (mind the leading whitespace):
add_drivers+=" vfio vfio_iommu_type1 vfio_pci vfio_virqfd"
Re-generate the initrd file:
> sudo dracut --force /boot/initrd $(uname -r)
B.3.5.2 Adding the driver to the list of auto-loaded modules #
Create the file /etc/modules-load.d/vfio-pci.conf and add the following content, one module name per line:
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel
B.3.5.3 Loading the driver manually #
To load the driver manually at run-time, execute the following command:
> sudo modprobe vfio-pci
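Whichever method you choose, you can compare the lsmod output against the modules you expect. The helper below is a sketch with a hypothetical name, missing_modules. Note that a module built into the kernel does not appear in lsmod, so a reported name is a hint rather than proof of a problem.

```shell
# Hypothetical helper: read `lsmod` output on standard input and print
# every module name given as an argument that is not listed there.
missing_modules() {
    loaded=$(awk 'NR > 1 { print $1 }')   # skip the header line
    for mod in "$@"; do
        printf '%s\n' "$loaded" | grep -qx -- "$mod" || printf '%s\n' "$mod"
    done
}
```

For example: lsmod | missing_modules vfio vfio_iommu_type1 vfio_pci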
B.3.6 Disable MSR for Microsoft Windows guests #
For Microsoft Windows guests, we recommend configuring KVM to ignore accesses to unhandled MSRs (model-specific registers) to avoid the guest crashing. Create the file /etc/modprobe.d/kvm.conf and add the following content:
options kvm ignore_msrs=1
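After the kvm module has been reloaded or the host rebooted, the current value of the option can be read from sysfs, where a boolean module parameter is reported as Y or N. A minimal sketch follows. The function name kvm_ignores_msrs is hypothetical, and the optional argument exists only so that the check can be tried against a test file.

```shell
# Hypothetical helper: succeed if the kvm module parameter ignore_msrs
# is enabled. $1 optionally overrides the sysfs path.
kvm_ignores_msrs() {
    param_file=${1:-/sys/module/kvm/parameters/ignore_msrs}
    [ -r "$param_file" ] || return 2      # kvm module not loaded
    [ "$(cat "$param_file")" = "Y" ]
}
```

For example: kvm_ignores_msrs && echo "ignore_msrs is active"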
B.3.7 Install and enable UEFI firmware #
For proper GPU Pass-Through functionality, the host needs to boot using UEFI firmware (that is, not using a legacy-style BIOS boot sequence).
Install the qemu-ovmf package which includes UEFI firmware images:
> sudo zypper install qemu-ovmf
Get the list of OVMF bin and vars files by filtering the results of the following command:
> rpm -ql qemu-ovmf
Enable OVMF in the libvirt QEMU configuration in the file /etc/libvirt/qemu.conf by using the list obtained from the previous step. It should look similar to the following:
nvram = [
  "/usr/share/qemu/ovmf-x86_64-4m.bin:/usr/share/qemu/ovmf-x86_64-4m-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-4m-code.bin:/usr/share/qemu/ovmf-x86_64-4m-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-smm-ms-code.bin:/usr/share/qemu/ovmf-x86_64-smm-ms-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-smm-opensuse-code.bin:/usr/share/qemu/ovmf-x86_64-smm-opensuse-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-ms-4m-code.bin:/usr/share/qemu/ovmf-x86_64-ms-4m-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-smm-suse-code.bin:/usr/share/qemu/ovmf-x86_64-smm-suse-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-ms-code.bin:/usr/share/qemu/ovmf-x86_64-ms-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-smm-code.bin:/usr/share/qemu/ovmf-x86_64-smm-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-opensuse-4m-code.bin:/usr/share/qemu/ovmf-x86_64-opensuse-4m-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-suse-4m-code.bin:/usr/share/qemu/ovmf-x86_64-suse-4m-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-suse-code.bin:/usr/share/qemu/ovmf-x86_64-suse-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-opensuse-code.bin:/usr/share/qemu/ovmf-x86_64-opensuse-vars.bin",
  "/usr/share/qemu/ovmf-x86_64-code.bin:/usr/share/qemu/ovmf-x86_64-vars.bin",
]
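Pairing the files by hand is error-prone, so the entries can be generated from the package file list. The following is a sketch under one assumption: each firmware image named *-code.bin belongs to the *-vars.bin file with the same prefix. Entries that do not follow this naming still need to be added manually. The function name ovmf_nvram_pairs is hypothetical.

```shell
# Hypothetical helper: read a file list (for example from
# `rpm -ql qemu-ovmf`) on standard input and print one quoted
# "code:vars" entry for every *-code.bin file whose matching
# *-vars.bin file is also in the list.
ovmf_nvram_pairs() {
    files=$(cat)
    printf '%s\n' "$files" | grep -- '-code\.bin$' | while IFS= read -r code; do
        vars="${code%-code.bin}-vars.bin"
        if printf '%s\n' "$files" | grep -qxF -- "$vars"; then
            printf '"%s:%s",\n' "$code" "$vars"
        fi
    done
}
```

For example, rpm -ql qemu-ovmf | ovmf_nvram_pairs prints lines that can be pasted between the brackets of the nvram setting.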
B.3.8 Reboot the host machine #
For most of the changes in the above steps to take effect, you need to reboot the host machine:
> sudo shutdown -r now
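After the reboot, it is worth confirming that the card is bound to vfio-pci and not to nouveau. The lspci -k option prints a "Kernel driver in use" line for each device. The helper below, with the hypothetical name pci_driver_in_use, is a minimal sketch that extracts the driver name from that output.

```shell
# Hypothetical helper: read `lspci -k` output for one device on standard
# input and print the name of the kernel driver in use (nothing is
# printed if no driver is bound).
pci_driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p' | head -n 1
}
```

For example, lspci -k -s 03:00.0 | pci_driver_in_use is expected to print vfio-pci when the configuration from the previous sections is active.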
B.4 Configuring the guest #
This section describes how to configure the guest virtual machine so that it can use the host's NVIDIA GPU. Use Virtual Machine Manager or virt-install to install the guest VM. Find more details in Chapter 10, Guest installation.
B.4.1 Requirements for the guest configuration #
During the guest VM installation, select and configure the following devices:
Use Q35 chipset if possible.
Install the guest VM using UEFI firmware.
Add the following emulated devices:
Graphic: Spice or VNC
Device: qxl, VGA, or Virtio
Find more information in Section 14.6, “Video”.
Add the host PCI device (03:00.0 in our example) to the guest. Find more information in Section 14.12, “Assigning a host PCI device to a VM Guest”.
For best performance, we recommend using virtio drivers for the network card and storage.
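libvirt lists host PCI devices under node device names such as pci_0000_03_00_0, which is the bus address with the PCI domain prepended and the separators replaced by underscores. As a sketch, the hypothetical helper pci_nodedev_name converts the address shown by lspci into that form. It assumes PCI domain 0000 when the address has no domain part.

```shell
# Hypothetical helper: convert an lspci bus address (03:00.0 or
# 0000:03:00.0) into the libvirt node device name (pci_0000_03_00_0).
pci_nodedev_name() {
    addr=$1
    case $addr in
        ????:??:??.?) ;;                     # domain already present
        ??:??.?) addr="0000:$addr" ;;        # assume PCI domain 0000
        *) echo "unexpected PCI address: $addr" >&2; return 1 ;;
    esac
    printf 'pci_%s\n' "$(printf '%s' "$addr" | tr ':.' '__')"
}
```

The result can be passed to virsh nodedev-dumpxml to inspect the device before adding it to the guest, for example: virsh nodedev-dumpxml "$(pci_nodedev_name 03:00.0)"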
B.4.2 Install the graphic card driver #
B.4.2.1 Linux guest #
Download the driver RPM package from http://www.nvidia.com/download/driverResults.aspx/131159/en-us.
Install the downloaded RPM package:
> sudo rpm -i nvidia-diag-driver-local-repo-sles123-390.30-1.0-1.x86_64.rpm
Refresh repositories and install cuda-drivers. This step will be different for non-SUSE distributions:
> sudo zypper refresh && sudo zypper install cuda-drivers
Reboot the guest VM:
> sudo shutdown -r now
Because the installer needs to compile the NVIDIA driver modules, install the gcc-c++ and kernel-devel packages.
Disable Secure Boot on the guest, because NVIDIA's driver modules are unsigned. On SUSE distributions, you can use the YaST GRUB 2 module to disable Secure Boot. Find more information in Section 13.1.1, “Implementation on SUSE Linux Enterprise Server”.
Download the driver installation script from https://www.nvidia.com/Download/index.aspx?lang=en-us, make it executable, and run it to complete the driver installation:
> chmod +x NVIDIA-Linux-x86_64-460.73.01.run
> sudo ./NVIDIA-Linux-x86_64-460.73.01.run
Download CUDA drivers from https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=SLES&target_version=15&target_type=rpmlocal and install following the on-screen instructions.
After you have installed the NVIDIA drivers, the Virtual Machine Manager display will lose its connection to the guest OS. To access the guest VM, you must either log in via ssh, change to the console interface, or install a dedicated VNC server in the guest. To avoid a flickering screen, stop and disable the display manager:
> sudo systemctl stop display-manager && sudo systemctl disable display-manager
Change the directory to the CUDA sample templates:
> cd /usr/local/cuda-9.1/samples/0_Simple/simpleTemplates
Compile and run the simpleTemplates file:
> make && ./simpleTemplates
runTest<float,32>
GPU Device 0: "Tesla V100-PCIE-16GB" with compute capability 7.0
CUDA device [Tesla V100-PCIE-16GB] has 80 Multi-Processors
Processing time: 495.006000 (ms)
Compare OK
runTest<int,64>
GPU Device 0: "Tesla V100-PCIE-16GB" with compute capability 7.0
CUDA device [Tesla V100-PCIE-16GB] has 80 Multi-Processors
Processing time: 0.203000 (ms)
Compare OK
[simpleTemplates] -> Test Results: 0 Failures
B.4.2.2 Microsoft Windows guest #
Before you install the NVIDIA drivers, you need to hide the hypervisor from the drivers by using the <hidden state='on'/> directive in the guest's libvirt definition.
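In the libvirt domain XML, the hidden element belongs inside the kvm element of the features section. To confirm that the directive is present in a guest's definition, the XML can be checked from the host. This is a minimal sketch. The function name hypervisor_hidden is hypothetical, and GUEST_NAME in the usage line is a placeholder for your guest's name.

```shell
# Hypothetical helper: succeed if the domain XML supplied on standard
# input contains <hidden state='on'/> (single or double quotes).
hypervisor_hidden() {
    grep -Eq "<hidden[[:space:]]+state=['\"]on['\"][[:space:]]*/>"
}
```

For example: virsh dumpxml GUEST_NAME | hypervisor_hidden && echo "hypervisor is hidden"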
Download and install the NVIDIA driver from https://www.nvidia.com/Download/index.aspx.
Download and install the CUDA toolkit from https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64.
Find some NVIDIA demo samples in the directory Program Files\Nvidia GPU Computing Toolkit\CUDA\v10.2\extras\demo_suite on the guest.