Installing the NVIDIA GPU Operator
Prerequisites
You have the kubectl and helm CLIs available on a client machine.
You can run the following commands to install the Helm CLI:
$ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
&& chmod 700 get_helm.sh \
&& ./get_helm.sh
All worker nodes or node groups that run GPU workloads in the Kubernetes cluster must run the same operating system version to use the NVIDIA GPU Driver container. Alternatively, if you pre-install the NVIDIA GPU Driver on the nodes, then you can run different operating systems.
For worker nodes or node groups that run CPU workloads only, the nodes can run any operating system because the GPU Operator does not perform any configuration or management of nodes for CPU-only workloads.
Nodes must be configured with a container engine such as CRI-O or containerd.
If your cluster uses Pod Security Admission (PSA) to restrict the behavior of pods, label the namespace for the Operator to set the enforcement policy to privileged:
$ kubectl create ns gpu-operator
$ kubectl label --overwrite ns gpu-operator pod-security.kubernetes.io/enforce=privileged
Node Feature Discovery (NFD) is a dependency for the Operator on each node. By default, NFD master and worker are automatically deployed by the Operator. If NFD is already running in the cluster, then you must disable deploying NFD when you install the Operator.
One way to determine if NFD is already running in the cluster is to check for an NFD label on your nodes:
$ kubectl get nodes -o json | jq '.items[].metadata.labels | keys | any(startswith("feature.node.kubernetes.io"))'
If the command output is true, then NFD is already running in the cluster.
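The NFD check can feed directly into the install command. The following is a minimal sketch, not part of the official procedure: the helper name nfd_helm_flag is hypothetical, and it assumes that the chart option for disabling the bundled NFD is nfd.enabled. The helper reads the true/false lines that the jq filter prints (one per node) and prints the Helm flag only when at least one node already carries an NFD label.

```shell
# nfd_helm_flag (hypothetical helper): reads one true/false value per line on
# stdin, as printed by the jq filter, and prints the Helm flag that disables
# the bundled NFD when any node already has an NFD label.
# Assumption: the chart option is named nfd.enabled.
nfd_helm_flag() {
  if grep -qx 'true'; then
    printf '%s\n' '--set nfd.enabled=false'
  fi
}

# Intended use against a live cluster (not executed here):
#   kubectl get nodes -o json \
#     | jq '.items[].metadata.labels | keys | any(startswith("feature.node.kubernetes.io"))' \
#     | nfd_helm_flag
```

If the helper prints nothing, no node reported an NFD label and the default installation can deploy NFD.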
Procedure
Add the NVIDIA Helm repository:
$ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
&& helm repo update
Install the GPU Operator.
Install the Operator with the default configuration:
$ helm install --wait --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator
Install the Operator and specify configuration options:
$ helm install --wait --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator \
--set <option-name>=<option-value>
Refer to the Chart Customization Options and Common Deployment Scenarios for more information.
Chart Customization Options
The following options are available when using the Helm chart.
These options can be used with --set when installing with Helm.
| Parameter | Description | Default |
|---|---|---|
| | When set to | |
| | When set to Pods can specify | |
| | When set to | |
| | Map of custom annotations to add to all GPU Operator managed pods. | |
| | Map of custom labels to add to all GPU Operator managed pods. | |
| | By default, the Operator deploys NVIDIA drivers as a container on the system. Set this value to | |
| | The images are downloaded from NGC. Specify another image repository when using custom driver images. | |
| | Controls whether the driver daemonset should build and load the | |
| | Indicate if MOFED is directly pre-installed on the host. This is used to build and load | |
| | By default, the driver container has an initial delay of | |
| | When set to | |
| | When set to | |
| | Version of the NVIDIA datacenter driver supported by the Operator. If you set | Depends on the version of the Operator. See the Component Matrix for more information on supported drivers. |
| | The GPU Operator deploys NVIDIA Kata Manager when this field is | |
| | Controls the strategy to be used with MIG on supported NVIDIA GPUs. Options are either | |
| | The MIG manager watches for changes to the MIG geometry and applies reconfiguration as needed. By default, the MIG manager only runs on nodes with GPUs that support MIG (for example, A100). | |
| | Deploys Node Feature Discovery plugin as a daemonset. Set this variable to | |
| | Installs node feature rules that are related to confidential computing. NFD uses the rules to detect security features in CPUs and NVIDIA GPUs. Set this variable to | |
| | Map of custom labels that will be added to all GPU Operator managed pods. | |
| | The GPU operator deploys | |
| | By default, the Operator deploys the NVIDIA Container Toolkit ( | |
Common Deployment Scenarios
The following common deployment scenarios and sample commands apply best to bare metal hosts or virtual machines with GPU passthrough.
Specifying the Operator Namespace
Both the Operator and operands are installed in the same namespace.
The namespace is configurable and is specified during installation.
For example, to install the GPU Operator in the nvidia-gpu-operator namespace:
$ helm install --wait --generate-name \
-n nvidia-gpu-operator --create-namespace \
nvidia/gpu-operator
If you do not specify a namespace during installation, all GPU Operator components are installed in the default namespace.
Preventing Installation of Operands on Some Nodes
By default, the GPU Operator operands are deployed on all GPU worker nodes in the cluster.
GPU worker nodes are identified by the presence of the label feature.node.kubernetes.io/pci-10de.present=true.
The value 0x10de is the PCI vendor ID that is assigned to NVIDIA.
To prevent operands from being deployed on a GPU worker node, label the node with nvidia.com/gpu.deploy.operands=false:
$ kubectl label nodes $NODE nvidia.com/gpu.deploy.operands=false
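When several nodes need the label, it can help to generate the commands first and review them before running anything. The following sketch assumes a hypothetical helper named print_label_cmds and placeholder node names; it only prints the kubectl commands and does not contact the cluster.

```shell
# print_label_cmds (hypothetical helper): prints one kubectl label command per
# node name given as an argument. Nothing is executed against the cluster.
print_label_cmds() {
  for node in "$@"; do
    printf 'kubectl label nodes %s nvidia.com/gpu.deploy.operands=false\n' "$node"
  done
}

# Placeholder node names, for illustration only.
print_label_cmds gpu-node-1 gpu-node-2
```

After reviewing the output, you can run the printed commands individually or pipe them to sh.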
Installation on Red Hat Enterprise Linux
In this scenario, use the NVIDIA Container Toolkit image that is built on UBI 8:
$ helm install --wait --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator \
--set toolkit.version=1.13.4-ubi8
Replace the 1.13.4 value in the preceding command with the version that is supported with the NVIDIA GPU Operator.
Refer to the GPU Operator Component Matrix on the platform support page.
When using RHEL 8 with Kubernetes, SELinux must be enabled in either permissive or enforcing mode for use with the GPU Operator. Additionally, network-restricted environments are not supported.
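The SELinux requirement can be checked on each node before installing. The following sketch assumes that the getenforce utility is available on the node and uses a hypothetical helper named selinux_mode_ok; it accepts the mode string that getenforce prints and succeeds only for Enforcing or Permissive.

```shell
# selinux_mode_ok (hypothetical helper): returns 0 when the given SELinux mode
# is one the GPU Operator can work with (Enforcing or Permissive), 1 otherwise.
selinux_mode_ok() {
  case "$1" in
    Enforcing|Permissive) return 0 ;;
    *) return 1 ;;
  esac
}

# Intended use on a RHEL node (assumes getenforce is installed; not run here):
#   selinux_mode_ok "$(getenforce)" || echo "SELinux is disabled on this node"
```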
Pre-Installed NVIDIA GPU Drivers
In this scenario, the NVIDIA GPU driver is already installed on the worker nodes that have GPUs:
$ helm install --wait --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator \
--set driver.enabled=false
Pre-Installed NVIDIA GPU Drivers and NVIDIA Container Toolkit
In this scenario, the NVIDIA GPU driver and the NVIDIA Container Toolkit are already installed on the worker nodes that have GPUs.
Tip
This scenario applies to NVIDIA DGX Systems that run NVIDIA Base OS.
Before installing the Operator, ensure that the default runtime is set to nvidia.
Refer to Configuration in the NVIDIA Container Toolkit documentation for more information.
Install the Operator with the following options:
$ helm install --wait --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator \
--set driver.enabled=false \
--set toolkit.enabled=false
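As an alternative to repeating --set flags, the same two options can be kept in a Helm values file and passed with -f. The following is a sketch under stated assumptions: the file name gpu-operator-values.yaml is hypothetical, and the install command is only printed for review, not run.

```shell
# Hypothetical file name; any writable path works.
VALUES_FILE=${VALUES_FILE:-gpu-operator-values.yaml}

# YAML equivalent of: --set driver.enabled=false --set toolkit.enabled=false
cat > "$VALUES_FILE" <<'EOF'
driver:
  enabled: false
toolkit:
  enabled: false
EOF

# Print the install command for review instead of running it.
printf '%s\n' "helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator -f $VALUES_FILE"
```

A values file is easier to keep under version control than a long command line, which helps when the same options must be applied to several clusters.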
Pre-Installed NVIDIA Container Toolkit (but no drivers)
In this scenario, the NVIDIA Container Toolkit is already installed on the worker nodes that have GPUs.
Configure the toolkit to use /run/nvidia/driver as the root directory of the driver installation, because this is the path mounted by the driver container:
$ sudo sed -i 's/^#root/root/' /etc/nvidia-container-runtime/config.toml
Install the Operator with the following options (which will provision a driver):
$ helm install --wait --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator \
--set toolkit.enabled=false
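Before editing the toolkit configuration in place, you might want to preview what the sed expression from this scenario changes. The following sketch runs the same expression, without -i, against a temporary sample file. The sample contents are an assumption about what a config.toml with a commented-out root entry looks like; your file may differ.

```shell
# Sample file standing in for /etc/nvidia-container-runtime/config.toml.
# The contents are an assumption for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
EOF

# Same expression as the documented command, without -i, so the result goes
# to stdout and the file is left untouched.
preview=$(sed 's/^#root/root/' "$sample")
printf '%s\n' "$preview"
rm -f "$sample"
```

Only lines that begin with #root are uncommented; other commented settings are left unchanged.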
Running a Custom Driver Image
If you want to run a specific driver version, such as 465.27, you can build a custom driver container image. Follow these steps:
Rebuild the driver container by specifying the $DRIVER_VERSION argument when building the Docker image. For reference, the driver container Dockerfiles are available on the Git repository at https://gitlab.com/nvidia/container-images/driver.
Build the container using the appropriate Dockerfile. For example:
$ docker build --pull \
--build-arg DRIVER_VERSION=465.27 \
-t nvidia/driver:465.27-ubuntu20.04 \
--file Dockerfile .
Ensure that the driver container is tagged as shown in the example by using the driver:<version>-<os> schema.
Specify the new driver image and repository by overriding the defaults in the Helm install command. For example:
$ helm install --wait --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator \
--set driver.repository=docker.io/nvidia \
--set driver.version="465.27"
These instructions are provided for reference and evaluation purposes. Custom configurations that do not use the standard releases of the GPU Operator from NVIDIA receive limited support.
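Because the image tag must follow the driver:<version>-<os> schema, it can be useful to compose the reference in one place and reuse it for the build and push steps. The following sketch uses a hypothetical helper named driver_image_ref; the repository, version, and OS values are the ones from the example in this section.

```shell
# driver_image_ref (hypothetical helper): composes an image reference of the
# form <repository>/driver:<version>-<os> from its three arguments.
driver_image_ref() {
  printf '%s/driver:%s-%s\n' "$1" "$2" "$3"
}

# Values taken from the example in this section.
driver_image_ref nvidia 465.27 ubuntu20.04
```

Keeping the version in a single variable and deriving the tag from it avoids a mismatch between the image you build and the driver.version value you pass to Helm.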
Specifying Configuration Options for containerd
When you use containerd as the container runtime, the following configuration options are used with the container-toolkit deployed with GPU Operator:
toolkit:
  env:
  - name: CONTAINERD_CONFIG
    value: /etc/containerd/config.toml
  - name: CONTAINERD_SOCKET
    value: /run/containerd/containerd.sock
  - name: CONTAINERD_RUNTIME_CLASS
    value: nvidia
  - name: CONTAINERD_SET_AS_DEFAULT
    value: "true"
These options are defined as follows:
- CONTAINERD_CONFIG
  The path on the host to the containerd config you would like to have updated with support for the nvidia-container-runtime. By default this will point to /etc/containerd/config.toml (the default location for containerd). It should be customized if your containerd installation is not in the default location.
- CONTAINERD_SOCKET
  The path on the host to the socket file used to communicate with containerd. The operator will use this to send a SIGHUP signal to the containerd daemon to reload its config. By default this will point to /run/containerd/containerd.sock (the default location for containerd). It should be customized if your containerd installation is not in the default location.
- CONTAINERD_RUNTIME_CLASS
  The name of the Runtime Class you would like to associate with the nvidia-container-runtime. Pods launched with a runtimeClassName equal to CONTAINERD_RUNTIME_CLASS will always run with the nvidia-container-runtime. The default CONTAINERD_RUNTIME_CLASS is nvidia.
- CONTAINERD_SET_AS_DEFAULT
  A flag indicating whether you want to set nvidia-container-runtime as the default runtime used to launch all containers. When set to false, only containers in pods with a runtimeClassName equal to CONTAINERD_RUNTIME_CLASS will be run with the nvidia-container-runtime. The default value is true.
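These environment variables can also be passed on the Helm command line as indexed toolkit.env entries. Writing the indices by hand is error-prone, so the following sketch uses a hypothetical helper named toolkit_env_flags that prints the flags for a list of NAME=VALUE pairs. It uses --set-string for the values so that a value such as true stays a string. The helper only prints text; it does not run Helm.

```shell
# toolkit_env_flags (hypothetical helper): for each NAME=VALUE argument, prints
# the pair of Helm flags that sets toolkit.env[i].name and toolkit.env[i].value,
# numbering the entries from 0 in argument order.
toolkit_env_flags() {
  i=0
  for pair in "$@"; do
    name=${pair%%=*}    # text before the first "="
    value=${pair#*=}    # text after the first "="
    printf '%s toolkit.env[%d].name=%s\n' '--set' "$i" "$name"
    printf '%s toolkit.env[%d].value=%s\n' '--set-string' "$i" "$value"
    i=$((i + 1))
  done
}

# Default containerd values from this section.
toolkit_env_flags \
  CONTAINERD_CONFIG=/etc/containerd/config.toml \
  CONTAINERD_SOCKET=/run/containerd/containerd.sock \
  CONTAINERD_RUNTIME_CLASS=nvidia \
  CONTAINERD_SET_AS_DEFAULT=true
```

Review the printed flags and append them to your helm install command. Some shells treat the square brackets as glob characters, so quote the flags if you paste them into such a shell.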
Rancher Kubernetes Engine 2
For Rancher Kubernetes Engine 2 (RKE2), set the following in the ClusterPolicy.
toolkit:
  env:
  - name: CONTAINERD_CONFIG
    value: /var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl
  - name: CONTAINERD_SOCKET
    value: /run/k3s/containerd/containerd.sock
  - name: CONTAINERD_RUNTIME_CLASS
    value: nvidia
  - name: CONTAINERD_SET_AS_DEFAULT
    value: "true"
These options can be passed to GPU Operator during install time as below.
helm install gpu-operator -n gpu-operator --create-namespace \
nvidia/gpu-operator $HELM_OPTIONS \
--set toolkit.env[0].name=CONTAINERD_CONFIG \
--set toolkit.env[0].value=/var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl \
--set toolkit.env[1].name=CONTAINERD_SOCKET \
--set toolkit.env[1].value=/run/k3s/containerd/containerd.sock \
--set toolkit.env[2].name=CONTAINERD_RUNTIME_CLASS \
--set toolkit.env[2].value=nvidia \
--set toolkit.env[3].name=CONTAINERD_SET_AS_DEFAULT \
--set-string toolkit.env[3].value=true
MicroK8s
For MicroK8s, set the following in the ClusterPolicy.
toolkit:
  env:
  - name: CONTAINERD_CONFIG
    value: /var/snap/microk8s/current/args/containerd-template.toml
  - name: CONTAINERD_SOCKET
    value: /var/snap/microk8s/common/run/containerd.sock
  - name: CONTAINERD_RUNTIME_CLASS
    value: nvidia
  - name: CONTAINERD_SET_AS_DEFAULT
    value: "true"
These options can be passed to GPU Operator during install time as below.
helm install gpu-operator -n gpu-operator --create-namespace \
nvidia/gpu-operator $HELM_OPTIONS \
--set toolkit.env[0].name=CONTAINERD_CONFIG \
--set toolkit.env[0].value=/var/snap/microk8s/current/args/containerd-template.toml \
--set toolkit.env[1].name=CONTAINERD_SOCKET \
--set toolkit.env[1].value=/var/snap/microk8s/common/run/containerd.sock \
--set toolkit.env[2].name=CONTAINERD_RUNTIME_CLASS \
--set toolkit.env[2].value=nvidia \
--set toolkit.env[3].name=CONTAINERD_SET_AS_DEFAULT \
--set-string toolkit.env[3].value=true
Verification: Running Sample GPU Applications
CUDA VectorAdd
In the first example, let’s run a simple CUDA sample, which adds two vectors together:
Create a file, such as cuda-vectoradd.yaml, with contents like the following:
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-vectoradd
    image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
    resources:
      limits:
        nvidia.com/gpu: 1
Run the pod:
$ kubectl apply -f cuda-vectoradd.yaml
The pod starts, runs the vectorAdd command, and then exits.
View the logs from the container:
$ kubectl logs pod/cuda-vectoradd
Example Output
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
Remove the stopped pod:
$ kubectl delete -f cuda-vectoradd.yaml
Example Output
pod "cuda-vectoradd" deleted
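If you run this verification from a script, you can check the logs for the success line instead of reading them by eye. The following sketch uses a hypothetical helper named vectoradd_passed that reads log text on stdin and succeeds only when it contains the Test PASSED line shown in the example output. Run it before you delete the pod.

```shell
# vectoradd_passed (hypothetical helper): reads log text on stdin and returns 0
# only if it contains the "Test PASSED" line printed by the sample.
vectoradd_passed() {
  grep -q 'Test PASSED'
}

# Intended use against a live cluster (not executed here):
#   kubectl logs pod/cuda-vectoradd | vectoradd_passed && echo "GPU workload OK"
```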
Jupyter Notebook
You can perform the following steps to deploy Jupyter Notebook in your cluster:
Create a file, such as tf-notebook.yaml, with contents like the following example:
---
apiVersion: v1
kind: Service
metadata:
  name: tf-notebook
  labels:
    app: tf-notebook
spec:
  type: NodePort
  ports:
  - port: 80
    name: http
    targetPort: 8888
    nodePort: 30001
  selector:
    app: tf-notebook
---
apiVersion: v1
kind: Pod
metadata:
  name: tf-notebook
  labels:
    app: tf-notebook
spec:
  securityContext:
    fsGroup: 0
  containers:
  - name: tf-notebook
    image: tensorflow/tensorflow:latest-gpu-jupyter
    resources:
      limits:
        nvidia.com/gpu: 1
    ports:
    - containerPort: 8888
      name: notebook
Apply the manifest to deploy the pod and start the service:
$ kubectl apply -f tf-notebook.yaml
Check the pod status:
$ kubectl get pod tf-notebook
Example Output
NAMESPACE   NAME          READY   STATUS    RESTARTS   AGE
default     tf-notebook   1/1     Running   0          3m45s
Because the manifest includes a service, get the external port for the notebook:
$ kubectl get svc tf-notebook
Example Output
NAME          TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
tf-notebook   NodePort   10.106.229.20   <none>        80:30001/TCP   4m41s
Get the token for the Jupyter notebook:
$ kubectl logs tf-notebook
Example Output
[I 21:50:23.188 NotebookApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret
[I 21:50:23.390 NotebookApp] Serving notebooks from local directory: /tf
[I 21:50:23.391 NotebookApp] The Jupyter Notebook is running at:
[I 21:50:23.391 NotebookApp] http://tf-notebook:8888/?token=3660c9ee9b225458faaf853200bc512ff2206f635ab2b1d9
[I 21:50:23.391 NotebookApp]  or http://127.0.0.1:8888/?token=3660c9ee9b225458faaf853200bc512ff2206f635ab2b1d9
[I 21:50:23.391 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 21:50:23.394 NotebookApp]
    To access the notebook, open this file in a browser:
        file:///root/.local/share/jupyter/runtime/nbserver-1-open.html
    Or copy and paste one of these URLs:
        http://tf-notebook:8888/?token=3660c9ee9b225458faaf853200bc512ff2206f635ab2b1d9
     or http://127.0.0.1:8888/?token=3660c9ee9b225458faaf853200bc512ff2206f635ab2b1d9
The notebook should now be accessible from your browser at this URL: http://your-machine-ip:30001/?token=3660c9ee9b225458faaf853200bc512ff2206f635ab2b1d9.
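Copying the token out of the logs by hand is easy to get wrong. The following sketch shows one way to extract it and build the URL. The helper names notebook_token and notebook_url are hypothetical, and the sketch assumes that the token consists of lowercase hexadecimal characters, as in the example output.

```shell
# notebook_token (hypothetical helper): reads notebook logs on stdin and prints
# the first token found after "?token=".
# Assumption: the token is made of lowercase hexadecimal characters.
notebook_token() {
  sed -n 's/.*[?]token=\([0-9a-f][0-9a-f]*\).*/\1/p' | head -n 1
}

# notebook_url (hypothetical helper): builds the browser URL from a node IP or
# host name, the NodePort, and the token.
notebook_url() {
  printf 'http://%s:%s/?token=%s\n' "$1" "$2" "$3"
}

# Intended use against a live cluster (not executed here):
#   token=$(kubectl logs tf-notebook | notebook_token)
#   notebook_url your-machine-ip 30001 "$token"
```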
Installation on Commercially Supported Kubernetes Platforms
| Product | Documentation |
|---|---|
| Red Hat OpenShift 4 using RHCOS worker nodes | |
| VMware vSphere with Tanzu and NVIDIA AI Enterprise | |
| Google Cloud Anthos | |