Platform Support for the NVIDIA GPU Operator
NVIDIA GPU Operator Versioning
To understand the NVIDIA GPU Operator life cycle policy, it is important to know how the Operator is versioned.
As of September 2022, the NVIDIA GPU Operator follows calendar versioning. NVIDIA GPU Operator v22.9.0 is the first release that follows calendar versioning and version 1.11 is the last release that follows the old versioning scheme.
Using version 22.9.0 as an example:
The first two fields in the version are in the format YY.MM. They represent the major version and also when the Operator was initially released. In this example, the Operator was released in September 2022. Zero padding is omitted for the month to remain compatible with semantic versioning.
The third segment, 0 in the example, represents the dot release. Dot releases typically include fixes for bugs or CVEs but can include minor features, such as support for a new NVIDIA GPU driver.
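The versioning scheme described above can be illustrated with a short parser. This is a minimal sketch, not part of the Operator: the names `OperatorVersion` and `parse_calver` are hypothetical, and mapping YY to the year 20YY is an assumption made for illustration.

```python
import re
from typing import NamedTuple


class OperatorVersion(NamedTuple):
    year: int   # full year, assuming YY means 20YY
    month: int  # release month of the major version, 1-12
    dot: int    # dot release number


# YY.MM.dot, optional leading "v", month written without zero padding.
_CALVER = re.compile(r"v?(\d{2})\.([1-9]|1[0-2])\.(\d+)")


def parse_calver(version: str) -> OperatorVersion:
    """Split a calendar version such as 'v22.9.0' into its three fields."""
    match = _CALVER.fullmatch(version)
    if match is None:
        raise ValueError(f"not a calendar version: {version!r}")
    yy, mm, dot = (int(group) for group in match.groups())
    return OperatorVersion(2000 + yy, mm, dot)


print(parse_calver("v22.9.0"))
```

Versions from the old scheme, such as 1.11.0, do not match the pattern and raise `ValueError`, which keeps the two schemes from being confused.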
NVIDIA GPU Operator Life Cycle
The NVIDIA GPU Operator life cycle policy provides a predictable support policy and timeline of when new Operator versions are released.
Starting with version v22.9.0, a new major Operator version will be released every 6 months. Therefore, the next major release of the NVIDIA GPU Operator will be released in March 2023 and will be named v23.3.0.
Every major release of the Operator, starting with v22.9.0, is maintained for 12 months. Bug fixes and CVE fixes are released throughout the 12 months, while minor feature updates are released only within the first 6 months.
This life cycle enables Operator users to use a given version for up to 12 months. It also gives users a 6-month period in which to plan the transition to the next major version.
The product life cycle and versioning are subject to change in the future.
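The timeline arithmetic in this policy can be worked through in a few lines. The sketch below is illustrative only: the function names are hypothetical, and it works at month granularity because the policy does not state exact dates.

```python
def add_months(year: int, month: int, count: int) -> tuple[int, int]:
    """Return the (year, month) that falls `count` months after the given month."""
    index = year * 12 + (month - 1) + count
    return index // 12, index % 12 + 1


def release_name(year: int, month: int) -> str:
    """Format a major release name as vYY.MM.0, without zero padding."""
    return f"v{year % 100}.{month}.0"


def life_cycle(year: int, month: int) -> dict[str, tuple[int, int]]:
    """Key months for a major release, following the 6 and 12 month policy."""
    return {
        "next_major_release": add_months(year, month, 6),
        "minor_features_until": add_months(year, month, 6),
        "maintained_until": add_months(year, month, 12),
    }


plan = life_cycle(2022, 9)  # v22.9.0, released September 2022
print(release_name(*plan["next_major_release"]))  # v23.3.0
print(plan["maintained_until"])  # (2023, 9)
```

For v22.9.0 this reproduces the March 2023 date and the v23.3.0 name given above, and places the end of maintenance in September 2023.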
Note
Upgrades are only supported within a major release or to the next major release.
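The upgrade rule in this note can be expressed as a small check. This is a sketch under stated assumptions: it handles only calendar versions (v22.9.0 and later), it assumes the next major release is the one dated six months later, and the function names are hypothetical. The transition from 1.11 to v22.9.0 is not modeled.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split 'vYY.MM.dot' into (year, month, dot), assuming YY means 20YY."""
    yy, mm, dot = (int(part) for part in version.lstrip("v").split("."))
    return 2000 + yy, mm, dot


def next_major(year: int, month: int) -> tuple[int, int]:
    """Major release six months after the given one (assumed cadence)."""
    index = year * 12 + (month - 1) + 6
    return index // 12, index % 12 + 1


def is_supported_upgrade(current: str, target: str) -> bool:
    """True if target is a later dot release of the same major release,
    or any release of the next major release."""
    cur_year, cur_month, cur_dot = parse(current)
    tgt_year, tgt_month, tgt_dot = parse(target)
    if (cur_year, cur_month) == (tgt_year, tgt_month):
        return tgt_dot > cur_dot
    return (tgt_year, tgt_month) == next_major(cur_year, cur_month)


print(is_supported_upgrade("v22.9.0", "v22.9.2"))  # True
print(is_supported_upgrade("v22.9.2", "v23.3.0"))  # True
print(is_supported_upgrade("v22.9.0", "v23.9.0"))  # False
```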
GPU Operator Component Matrix
The following table shows the operands and default operand versions that correspond to an Operator version.
When post-release testing confirms support for newer versions of operands, these updates are identified as recommended updates to a GPU Operator version. Refer to upgrade for information about upgrading the Operator.
Release
NVIDIA GPU Driver | NVIDIA Driver Manager for K8s | NVIDIA Container Toolkit | NVIDIA Kubernetes Device Plugin | DCGM Exporter
Node Feature Discovery | NVIDIA GPU Feature Discovery for Kubernetes | NVIDIA MIG Manager for Kubernetes | DCGM
Validator for NVIDIA GPU Operator | NVIDIA KubeVirt GPU Device Plugin | NVIDIA vGPU Device Manager | NVIDIA GDS Driver
v22.9.2
v0.10.1
v22.9.1
v0.2.0
v22.9.1
v0.10.1
v22.9.1
v0.2.0
v22.9.0
v0.10.1
v22.9.0
v0.2.0
N/A
1.11
v0.10.1
v1.11.0
v0.1.0
N/A
1.10
0.8.2
v1.10.0
N/A
N/A
N/A
1.9.1
0.8.2
v1.9.1
N/A
N/A
N/A
1.9.0
0.8.2
v1.9.0
N/A
N/A
N/A
1.8.2
0.8.2
v1.8.2
N/A
N/A
N/A
1.8.1
0.8.2
v1.8.1
N/A
N/A
N/A
1.8.0
0.8.2
v1.8.0
N/A
N/A
N/A
1.7.1
N/A
0.8.2
N/A
v1.7.1
N/A
N/A
N/A
1.7.0
N/A
0.6.0
N/A
v1.7.0
N/A
N/A
N/A
1.6.2
N/A
0.6.0
N/A
N/A
N/A
N/A
N/A
N/A
1.6.1
N/A
0.6.0
N/A
N/A
N/A
N/A
N/A
N/A
1.6.0
N/A
0.6.0
N/A
N/A
N/A
N/A
N/A
N/A
1.5.2
N/A
0.6.0
N/A
N/A
N/A
N/A
N/A
N/A
1.5.1
N/A
0.6.0
N/A
N/A
N/A
N/A
N/A
N/A
1.5.0
N/A
0.6.0
N/A
N/A
N/A
N/A
N/A
N/A
1.4.0
N/A
0.6.0
N/A
N/A
N/A
N/A
N/A
N/A
1.3.0
N/A
0.6.0
N/A
N/A
N/A
N/A
N/A
N/A
1.2.0
N/A
0.6.0
N/A
N/A
N/A
N/A
N/A
N/A
N/A
1.1.0
N/A
0.5.0
N/A
N/A
N/A
N/A
N/A
N/A
N/A
Note
The driver version can differ with NVIDIA vGPU because it depends on the driver version downloaded from the NVIDIA vGPU Software Portal.
The GPU Operator is supported on all R450, R470, R510, R515, R520, and R525 NVIDIA data center production drivers. For a list of supported data center driver versions, visit this link.
Supported NVIDIA GPUs and Systems
The following NVIDIA data center GPUs are supported on x86 based platforms:
Product | Architecture |
---|---|
NVIDIA HGX H100 | NVIDIA Hopper and NVSwitch |
NVIDIA H100 | NVIDIA Hopper |
NVIDIA L40 | NVIDIA Ada |
NVIDIA L4 | NVIDIA Ada |
NVIDIA DGX A100 | A100 and NVSwitch |
NVIDIA HGX A100 | A100 and NVSwitch |
NVIDIA A800 | NVIDIA Ampere |
NVIDIA A100 | NVIDIA Ampere |
NVIDIA A100X | NVIDIA Ampere |
NVIDIA A40 | NVIDIA Ampere |
NVIDIA A30 | NVIDIA Ampere |
NVIDIA A30X | NVIDIA Ampere |
NVIDIA A16 | NVIDIA Ampere |
NVIDIA A10 | NVIDIA Ampere |
NVIDIA A2 | NVIDIA Ampere |

Product | Architecture |
---|---|
NVIDIA T4 | Turing |
NVIDIA V100 | Volta |
NVIDIA P100 | Pascal |
NVIDIA P40 | Pascal |
NVIDIA P4 | Pascal |

Product | Architecture |
---|---|
NVIDIA RTX A6000 | NVIDIA Ampere / Ada |
NVIDIA RTX A5000 | NVIDIA Ampere |
NVIDIA RTX A4000 | NVIDIA Ampere |
NVIDIA RTX A8000 | Turing |
NVIDIA RTX A6000 | Turing |
NVIDIA RTX A5000 | Turing |
NVIDIA RTX A4000 | Turing |
NVIDIA T1000 | Turing |
NVIDIA T600 | Turing |
NVIDIA T400 | Turing |
Supported ARM Based Platforms
The following NVIDIA data center GPUs are supported:
Product | Architecture |
---|---|
NVIDIA A100X | Ampere |
NVIDIA A30X | Ampere |
AWS EC2 G5g instances | Turing |
Note
The GPU Operator only supports platforms using discrete GPUs. NVIDIA Jetson, or other embedded products with integrated GPUs, are not supported.
Note
The R520 Data Center Driver is not supported for ARM.
Supported Deployment Options, Hypervisors, and NVIDIA vGPU Based Products
The GPU Operator has been validated in the following scenarios:
Deployment Options |
---|
Bare Metal |
Virtual machines with GPU Passthrough |
Virtual machines with NVIDIA vGPU based products |
Hypervisors (On-premises)
Hypervisors |
---|
VMware vSphere 7 and 8 |
Red Hat Enterprise Linux KVM |
Red Hat Virtualization (RHV) |
NVIDIA vGPU based products
NVIDIA vGPU based products |
---|
NVIDIA vGPU (NVIDIA AI Enterprise) |
NVIDIA vCompute Server |
NVIDIA RTX Virtual Workstation |
Note
GPU Operator is supported with NVIDIA vGPU 12.0+.
Supported Operating Systems and Kubernetes Platforms
The GPU Operator has been validated in the following scenarios:
Note
The Kubernetes community supports only the last three minor releases as of v1.17. Older releases may be supported through enterprise distributions of Kubernetes such as Red Hat OpenShift.
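As an illustration of the three-release window mentioned in this note, the following sketch lists the minor releases covered for a given latest release. The function name and the `window` parameter are hypothetical, and the latest release must be supplied by the caller.

```python
def community_supported_minors(latest: str, window: int = 3) -> list[str]:
    """List the `window` most recent minor releases, newest first."""
    major, minor = (int(part) for part in latest.lstrip("v").split(".")[:2])
    oldest = max(minor - window, -1)  # never go below minor release 0
    return [f"{major}.{m}" for m in range(minor, oldest, -1)]


print(community_supported_minors("1.26"))  # ['1.26', '1.25', '1.24']
```

Older releases that appear in the tables in this section, such as 1.21, fall outside this window when 1.26 is the latest release, which is why the note points to enterprise distributions.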
Operating System | Kubernetes | Red Hat OpenShift | VMware vSphere with Tanzu | Rancher Kubernetes Engine 2 | HPE Ezmeral Runtime Enterprise |
---|---|---|---|---|---|
Ubuntu 18.04 LTS | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | | | | |
Ubuntu 20.04 LTS | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | | 7.0 U3c, 8.0 | 1.21, 1.22, 1.23, 1.24, 1.25 | |
Ubuntu 22.04 LTS | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | | | | |
CentOS 7 | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | | | | |
Red Hat Core OS | | 4.9, 4.10, 4.11, 4.12 | | | |
Red Hat Enterprise Linux 8.4, 8.6, 8.7 | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | | | 1.21, 1.22, 1.23, 1.24, 1.25 | |
Red Hat Enterprise Linux 8.4, 8.5 | | | | | 5.5 |
Note
Red Hat OpenShift is supported on the AWS (G4, G5, P3, P4), Azure (NC-T4-v3, NC-v3, ND-A100-v4), and GCP (T4, V100, A100) based instances.
Operating System | Kubernetes | Red Hat OpenShift | VMware vSphere with Tanzu | Rancher Kubernetes Engine 2 |
---|---|---|---|---|
Ubuntu 20.04 LTS | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | | 7.0 U3c, 8.0 | 1.21, 1.22, 1.23, 1.24, 1.25 |
Ubuntu 22.04 LTS | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | | | |
Red Hat Core OS | | 4.9, 4.10, 4.11, 4.12 | | |
Red Hat Enterprise Linux 8.4, 8.6, 8.7 | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | | | 1.21, 1.22, 1.23, 1.24, 1.25 |
Supported Container Runtimes
The GPU Operator has been validated in the following scenarios:
Operating System | Containerd 1.4 - 1.6 | CRI-O |
---|---|---|
Ubuntu 18.04 LTS | Yes | Yes |
Ubuntu 20.04 LTS | Yes | Yes |
Ubuntu 22.04 LTS | Yes | Yes |
CentOS 7 | Yes | No |
Red Hat Core OS (RHCOS) | No | Yes |
Red Hat Enterprise Linux 8 | Yes | Yes |
Note
The GPU Operator has been validated with version 2 of the containerd config file.
NVIDIA AI Enterprise Support Matrix
The latest version of NVIDIA AI Enterprise supports the following scenarios:
Operating System | Kubernetes | Red Hat OpenShift | VMware vSphere with Tanzu |
---|---|---|---|
Ubuntu 20.04 LTS | 1.21, 1.22, 1.23, 1.24, 1.25 | | 7.0 U3c, 8.0 |
Ubuntu 22.04 LTS | 1.21, 1.22, 1.23, 1.24, 1.25 | | |
Red Hat Core OS | | 4.9.9+, 4.10, 4.11 | |
Note
Red Hat OpenShift is supported on the AWS (G4, G5, P3, P4), Azure (NC-T4-v3, NC-v3, ND-A100-v4), and GCP (T4, V100, A100) based instances.
Support for KubeVirt
KubeVirt v0.36.0 is supported with the following operating systems and Kubernetes versions.
Operating System | Kubernetes | Red Hat OpenShift |
---|---|---|
Ubuntu 20.04 LTS | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | |
Ubuntu 22.04 LTS | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 | |
Red Hat Core OS | | 4.11 |
Support for GPUDirect RDMA
Supported operating systems and NVIDIA GPU Drivers with GPUDirect RDMA.
Operating System | 470 GPU Driver | 510 GPU Driver | 515 GPU Driver | 520 GPU Driver | 525 GPU Driver |
---|---|---|---|---|---|
Ubuntu 20.04 and 22.04 LTS with Network Operator 1.4 | 470.161.03 | 510.108.03 | 515.86.01 | 520.61.07 | 525.60.13 |
Red Hat OpenShift 4.10 and 4.11 with Network Operator 1.4 | 470.161.03 | 510.108.03 | 515.86.01 | 520.61.07 | 525.60.13 |
CentOS 7 with MOFED installed on the node | 470.161.03 | 510.108.03 | 515.86.01 | 520.61.07 | 525.60.13 |
For more information on GPUDirect RDMA refer to this document.
Support for GPUDirect Storage
Supported operating systems and NVIDIA GPU Drivers with GPUDirect Storage.
Operating System | GPU Driver (GDS Driver) |
---|---|
Ubuntu 20.04 LTS with Network Operator 1.4 | 525.60.13 (2.14.13) |
Ubuntu 22.04 LTS with Network Operator 1.4 | 525.60.13 (2.14.13) |
Note
GPUDirect Storage is not supported with secure boot. It is supported only with local NVMe and remote NFS storage.
Additional Supported Container Management Tools
Helm v3
Red Hat Operator Lifecycle Manager (OLM)
Previous GPU Operator Releases
The following tables provide a historical view of the GPU Operator support matrix.
GPU Operator Release | Kubernetes | OpenShift | Anthos |
---|---|---|---|
v22.9.0 | v1.21+ | 4.9, 4.10, 4.11 | Supported |
1.11 | v1.21+ | 4.9, 4.10, 4.11 | Supported |
1.10 | v1.21+ | 4.9, 4.10 | Supported |
1.9 | v1.19+ | 4.8, 4.9 | Supported |
1.8 | v1.18+ | 4.7, 4.8, 4.9 | Supported |
1.7 | v1.18+ | 4.5, 4.6, 4.7 | Supported |
1.6 | v1.16+ | 4.5, 4.6, 4.7 | Supported |
1.5 | v1.13+ | 4.4.29+, 4.5, 4.6 | Supported |
1.4 | v1.13+ | 4.4.29+, 4.5, 4.6 | Supported |
1.3 | v1.13+ | 4.4.29+, 4.5, 4.6 | Supported |
1.2 | v1.13+ | Not supported | Supported |
1.1.7 | v1.13+ | 4.1, 4.2, 4.3, 4.4 | Supported |
1.1 | v1.13+ | Not supported | Not supported |
1.0 | v1.13+ | Not supported | Not supported |

GPU Operator Release | Kubernetes | Red Hat OpenShift | Anthos |
---|---|---|---|
v22.9.0 | v1.21+ | 4.9, 4.10, 4.11 | Not Supported |
1.11 | v1.21+ | 4.9, 4.10, 4.11 | Not Supported |
1.10 | v1.21+ | 4.9, 4.10 | Not Supported |
1.9 | v1.19+ | 4.8, 4.9 | Not Supported |
1.8 | v1.18+ | 4.7, 4.8 | Not Supported |
1.7 | v1.18+ | 4.6, 4.7, 4.8 | Not Supported |
1.6 | v1.16+ | 4.6, 4.7 | Not Supported |
1.5 | v1.13+ | 4.6 | Not Supported |

GPU Operator Release | Kubernetes | OpenShift | vSphere with Tanzu | Release |
---|---|---|---|---|
v22.9.1 | v1.21+ | 4.9.9+, 4.10, 4.11 | Supported | 3.0 and 1.4 |
v22.9.0 | v1.21+ | 4.9.9+, 4.10, 4.11 | Supported | 1.3 and 2.3 |
1.11.1 | v1.21+ | 4.9.9+, 4.10, 4.11 | Supported | 2.2 |
1.11 | v1.21+ | 4.9.9+, 4.10, 4.11 | Supported | 2.1 |
1.10.1 | v1.21+ | 4.9.9+, 4.10 | Supported | 2.0 |
1.9.1 | v1.21+ | Not Supported | Supported | 1.1 |
1.8.1 | v1.21+ | Not Supported | Not Supported | 1.0 |