This is unreleased documentation for SUSE® Virtualization v1.7 (Dev).
NVIDIA Driver Toolkit
The nvidia-driver-toolkit add-on allows you to deploy out-of-band NVIDIA GRID KVM drivers to your existing SUSE Virtualization clusters.
The toolkit only includes the correct SUSE Virtualization OS image, build utilities, and kernel headers that allow NVIDIA drivers to be compiled and loaded from the container. You must download the NVIDIA KVM drivers using a valid NVIDIA subscription. For guidance on identifying the correct driver for your NVIDIA GPU, see the NVIDIA documentation. Each new SUSE Virtualization version is released with the correct
The SUSE Virtualization ISO does not include the nvidia-driver-toolkit container image. Because of its size, the image is pulled from Docker Hub by default. If you have an air-gapped environment, you can download and push the image to your private registry. The Image Repository and Image Tag fields on the nvidia-driver-toolkit screen provide information about the image that you must download.
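For an air-gapped environment, mirroring the image could look like the following sketch. It assumes a machine with the docker CLI, internet access, and push access to your private registry. The repository, tag, and registry host shown here are hypothetical placeholders; replace them with the values from the Image Repository and Image Tag fields and with your own registry address.

```shell
#!/bin/sh
# Hypothetical placeholders: take the real repository and tag from the
# Image Repository and Image Tag fields on the nvidia-driver-toolkit screen.
SRC_REPO="${SRC_REPO:-example/nvidia-driver-toolkit}"
SRC_TAG="${SRC_TAG:-v0.0.0}"
PRIVATE_REGISTRY="${PRIVATE_REGISTRY:-registry.example.internal:5000}"

# Build the image reference as it will appear in the private registry.
mirror_ref() {
  printf '%s/%s:%s\n' "$PRIVATE_REGISTRY" "$SRC_REPO" "$SRC_TAG"
}

# Pull the image, retag it for the private registry, and push it.
mirror_image() {
  target="$(mirror_ref)"
  docker pull "$SRC_REPO:$SRC_TAG" &&
    docker tag "$SRC_REPO:$SRC_TAG" "$target" &&
    docker push "$target"
}

# Nothing is pulled or pushed unless the script is called with "run".
if [ "${1:-}" = "run" ]; then
  mirror_image
fi
```

After the push succeeds, the Image Repository and Image Tag fields can be updated to point to the mirrored image.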
To enable the add-on, you must specify the HTTP location where the NVIDIA vGPU KVM driver file is located. You can also update the image repository and image tag, if necessary. Once the add-on is enabled, an nvidia-driver-toolkit daemonset is deployed to the cluster.
On pod startup, the ENTRYPOINT script downloads the NVIDIA driver from the specified driver location, installs the driver, and loads the kernel drivers.
The pcidevices-controller add-on can now leverage this add-on to manage the lifecycle of the vGPU devices on nodes containing supported GPU devices.
Installing different NVIDIA driver versions
NVIDIA driver versions can vary across cluster nodes. If you want to install a specific driver version on a node, you must annotate the node before starting the nvidia-driver-toolkit add-on.
kubectl annotate nodes {node name} sriovgpu.harvesterhci.io/custom-driver=https://[driver location]
The add-on installs the specified driver upon starting.
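When several nodes need different drivers, the annotation command above can be wrapped in a small helper. The following is a minimal sketch, not part of the add-on: the function names, node names, and driver URLs are hypothetical, and the `DRY_RUN` switch is an illustration that prints each command instead of running it.

```shell
#!/bin/sh
# Annotation key taken from the command above.
ANNOTATION_KEY="sriovgpu.harvesterhci.io/custom-driver"

# Run a command, or only print it when DRY_RUN=1 (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Annotate one node with the driver URL it should install.
# $1 = node name, $2 = driver URL
set_custom_driver() {
  run kubectl annotate nodes "$1" "$ANNOTATION_KEY=$2"
}

# Print the node's annotations so the value can be checked.
# $1 = node name
show_annotations() {
  run kubectl get node "$1" -o "jsonpath={.metadata.annotations}"
}

# Example calls (hypothetical node names and URLs):
#   set_custom_driver node-a https://drivers.example.internal/driver-a.run
#   set_custom_driver node-b https://drivers.example.internal/driver-b.run
#   show_annotations node-a
```

Setting `DRY_RUN=1` before calling the functions is a convenient way to review the commands before they touch the cluster.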
If an NVIDIA driver was previously installed, you must restart the pod to trigger the installation process again.
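One way to restart the pod is to delete it and let the daemonset recreate it, which is standard Kubernetes behavior. The sketch below assumes that the toolkit pod names contain `nvidia-driver-toolkit`; that naming, the function names, and the node name in the example call are assumptions, so check the actual pod names in your cluster first.

```shell
#!/bin/sh
# Read "namespace name node" rows on stdin and print "namespace name"
# for pods on the given node whose name contains nvidia-driver-toolkit
# (assumed naming).
# $1 = node name
filter_toolkit_pods() {
  awk -v node="$1" '$2 ~ /nvidia-driver-toolkit/ && $3 == node { print $1, $2 }'
}

# Delete the matching pod on one node; the daemonset then starts a new
# pod, which runs the driver installation again.
# $1 = node name
restart_toolkit_pod() {
  kubectl get pods --all-namespaces --no-headers \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,NODE:.spec.nodeName |
    filter_toolkit_pods "$1" |
    while read -r ns name; do
      kubectl delete pod "$name" --namespace "$ns"
    done
}

# Example call (hypothetical node name):
#   restart_toolkit_pod node-a
```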