Platform Support

This document provides an overview of the supported GPUs and system platform configurations.

NVIDIA GPU Operator Versioning

To understand the NVIDIA GPU Operator life cycle policy, it is important to know how the NVIDIA GPU Operator is versioned.

As of September 2022, the NVIDIA GPU Operator follows calendar versioning. NVIDIA GPU Operator v22.9.0 is the first release that follows calendar versioning, and NVIDIA GPU Operator 1.11 is therefore the last release that follows the old versioning scheme.

The following paragraphs explain how to interpret an NVIDIA GPU Operator release that follows calendar versioning, using v22.9.0 as the example.

The first two segments of the version have the format YY.MM. Together they represent the major version and indicate when that major version was first released. In this example, the NVIDIA GPU Operator was released in September 2022. The month is not zero-padded, so that the version remains compatible with semantic versioning.

The third segment, ‘.0’ in this example, represents a dot release. Dot releases typically include fixes for bugs or CVEs, but can also include minor features such as support for a new NVIDIA GPU driver.
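
The versioning rules above can be illustrated with a minimal Python sketch. The names parse_calver and OperatorVersion are hypothetical and are not part of any NVIDIA tooling. The sketch assumes a plain YY.MM.patch tag with an optional leading "v" and a year in the 2000s.

```python
import re
from typing import NamedTuple


class OperatorVersion(NamedTuple):
    year: int   # four-digit year derived from YY (assumes the 2000s)
    month: int  # release month of the major version, 1-12
    patch: int  # dot release number


def parse_calver(tag: str) -> OperatorVersion:
    """Parse a calendar-versioned tag such as 'v22.9.0' (hypothetical helper)."""
    match = re.fullmatch(r"v?(\d{2})\.(\d{1,2})\.(\d+)", tag)
    if match is None:
        # Old-scheme versions such as '1.11' end up here.
        raise ValueError(f"not a YY.MM.patch version: {tag!r}")
    yy, mm, patch = (int(part) for part in match.groups())
    if not 1 <= mm <= 12:
        raise ValueError(f"month out of range in {tag!r}")
    return OperatorVersion(2000 + yy, mm, patch)
```

With this sketch, parse_calver("v22.9.0") returns year 2022, month 9, and patch 0, while an old-scheme version such as "1.11" is rejected.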

NVIDIA GPU Operator Life Cycle

The NVIDIA GPU Operator life cycle policy provides a predictable support policy and timeline of when new NVIDIA GPU Operator versions are released.

Starting with the NVIDIA GPU Operator v22.9.0, a new major GPU Operator version will be released every 6 months. Therefore, the next major release of the NVIDIA GPU Operator will be released in March 2023 and will be named v23.3.0.

Every major release of the NVIDIA GPU Operator, starting with v22.9.0, is maintained for 12 months. Fixes for bugs and CVEs are released throughout the 12 months, while minor feature updates are released only within the first six months.

This life cycle allows NVIDIA GPU Operator users to use a given NVIDIA GPU Operator version for up to 12 months. It also gives users a 6-month period in which to plan the transition to the next major NVIDIA GPU Operator version.
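
As a rough illustration of this timeline, the following Python sketch derives the key dates from the month in which a major release was published. The function names are hypothetical, and the sketch assumes the 6-month cadence and 12-month maintenance window hold exactly as stated above.

```python
from typing import Dict, Tuple


def add_months(year: int, month: int, delta: int) -> Tuple[int, int]:
    """Return the (year, month) that lies `delta` months after the input."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


def life_cycle(year: int, month: int) -> Dict[str, object]:
    """Derive key dates for a major release published in (year, month)."""
    next_year, next_month = add_months(year, month, 6)
    return {
        # The month is not zero-padded, matching the versioning scheme.
        "next_major": f"v{next_year % 100}.{next_month}.0",
        # Minor feature updates stop after the first six months.
        "minor_features_until": (next_year, next_month),
        # Bug and CVE fixes continue for the full 12 months.
        "maintained_until": add_months(year, month, 12),
    }


print(life_cycle(2022, 9))
```

For v22.9.0 this gives v23.3.0 as the next major release, which matches the text above, and September 2023 as the end of the maintenance window under this reading of the policy.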

The product life cycle and versioning are subject to change in the future.

Note

  • Upgrades are only supported within a major release or to the next major release.
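
The upgrade rule in this note can be sketched as a small check. This is an illustration only: is_supported_upgrade is a hypothetical helper, it handles calendar-versioned releases only, and it assumes the next major release is exactly six months after the current one.

```python
from typing import Tuple


def split_version(tag: str) -> Tuple[int, int, int]:
    """Split 'v22.9.0' into (22, 9, 0). Assumes a plain YY.MM.patch tag."""
    yy, mm, patch = (int(part) for part in tag.lstrip("v").split("."))
    return yy, mm, patch


def next_major(yy: int, mm: int) -> Tuple[int, int]:
    """Assumption: the next major release is exactly six months later."""
    index = yy * 12 + (mm - 1) + 6
    return index // 12, index % 12 + 1


def is_supported_upgrade(current: str, target: str) -> bool:
    cur_yy, cur_mm, cur_patch = split_version(current)
    tgt_yy, tgt_mm, tgt_patch = split_version(target)
    if (cur_yy, cur_mm) == (tgt_yy, tgt_mm):
        # Same major release: only a later dot release counts as an upgrade.
        return tgt_patch > cur_patch
    # Different major release: only the immediately following one qualifies.
    return (tgt_yy, tgt_mm) == next_major(cur_yy, cur_mm)
```

Under these assumptions, moving from v22.9.0 to v22.9.2 or from v22.9.2 to v23.3.0 passes the check, while skipping a major release, for example from v22.9.0 to v23.9.0, does not.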

GPU Operator Component Matrix

The following table shows the operands and default operand versions that correspond to a GPU Operator version.

When post-release testing confirms support for newer versions of operands, these updates are identified as recommended updates to a GPU Operator version. Refer to Upgrading the GPU Operator for more information.

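Release v23.3.2

  NVIDIA Driver Manager for K8s: v0.6.1
  NVIDIA Container Toolkit: 1.13.0
  NVIDIA Kubernetes Device Plugin: 0.14.0
  DCGM Exporter: 3.1.7-3.1.4
  Node Feature Discovery: v0.12.1
  NVIDIA GPU Feature Discovery for Kubernetes: 0.8.0
  NVIDIA MIG Manager for Kubernetes: 0.5.2
  DCGM: 3.1.7-1 (default)
  Validator for NVIDIA GPU Operator: v23.3.2
  NVIDIA KubeVirt GPU Device Plugin: v1.2.1
  NVIDIA vGPU Device Manager: v0.2.1
  NVIDIA GDS Driver: 2.15.1

Release v23.3.1

  NVIDIA Driver Manager for K8s: v0.6.1
  NVIDIA Container Toolkit: 1.13.0
  NVIDIA Kubernetes Device Plugin: 0.14.0
  DCGM Exporter: 3.1.7-3.1.4
  Node Feature Discovery: v0.12.1
  NVIDIA GPU Feature Discovery for Kubernetes: 0.8.0
  NVIDIA MIG Manager for Kubernetes: 0.5.2
  DCGM: 3.1.7-1 (default)
  Validator for NVIDIA GPU Operator: v23.3.1
  NVIDIA KubeVirt GPU Device Plugin: v1.2.1
  NVIDIA vGPU Device Manager: v0.2.1
  NVIDIA GDS Driver: 2.15.1

Release v23.3.0

  NVIDIA Driver Manager for K8s: v0.6.1
  NVIDIA Container Toolkit: 1.13.0
  NVIDIA Kubernetes Device Plugin: 0.14.0
  DCGM Exporter: 3.1.7-3.1.4
  Node Feature Discovery: v0.12.1
  NVIDIA GPU Feature Discovery for Kubernetes: 0.8.0
  NVIDIA MIG Manager for Kubernetes: 0.5.2
  DCGM: 3.1.7-1 (default)
  Validator for NVIDIA GPU Operator: v23.3.0
  NVIDIA KubeVirt GPU Device Plugin: v1.2.1
  NVIDIA vGPU Device Manager: v0.2.1
  NVIDIA GDS Driver: 2.15.1

Release v22.9.2

  NVIDIA Driver Manager for K8s: v0.6.0
  NVIDIA Container Toolkit: 1.11.0
  NVIDIA Kubernetes Device Plugin: 0.13.0
  DCGM Exporter: 3.1.3-3.1.2
  Node Feature Discovery: v0.10.1
  NVIDIA GPU Feature Discovery for Kubernetes: 0.7.0
  NVIDIA MIG Manager for Kubernetes: 0.5.0
  DCGM: 3.1.6 (recommended), 3.1.3-1 (default)
  Validator for NVIDIA GPU Operator: v22.9.1
  NVIDIA KubeVirt GPU Device Plugin: v1.2.1
  NVIDIA vGPU Device Manager: v0.2.0
  NVIDIA GDS Driver: 2.14.13

Release v22.9.1

  NVIDIA Driver Manager for K8s: v0.5.1
  NVIDIA Container Toolkit: 1.11.0
  NVIDIA Kubernetes Device Plugin: 0.13.0
  DCGM Exporter: 3.1.3-3.1.2
  Node Feature Discovery: v0.10.1
  NVIDIA GPU Feature Discovery for Kubernetes: 0.7.0
  NVIDIA MIG Manager for Kubernetes: 0.5.0
  DCGM: 3.1.3-1
  Validator for NVIDIA GPU Operator: v22.9.1
  NVIDIA KubeVirt GPU Device Plugin: v1.2.1
  NVIDIA vGPU Device Manager: v0.2.0
  NVIDIA GDS Driver: 2.14.13

Release v22.9.0

  NVIDIA GPU Driver: 520.61.05, 515.65.01 (default)
  NVIDIA Driver Manager for K8s: v0.4.2
  NVIDIA Container Toolkit: 1.11.0
  NVIDIA Kubernetes Device Plugin: 0.12.3
  DCGM Exporter: 3.0.4-3.0.0
  Node Feature Discovery: v0.10.1
  NVIDIA GPU Feature Discovery for Kubernetes: 0.6.2
  NVIDIA MIG Manager for Kubernetes: 0.5.0
  DCGM: 3.0.4-1
  Validator for NVIDIA GPU Operator: v22.9.0
  NVIDIA KubeVirt GPU Device Plugin: v1.2.1
  NVIDIA vGPU Device Manager: v0.2.0
  NVIDIA GDS Driver: N/A

Release 1.11

  NVIDIA Driver Manager for K8s: v0.4.0
  NVIDIA Container Toolkit: 1.10.0
  NVIDIA Kubernetes Device Plugin: 0.12.2
  DCGM Exporter: 2.4.5-2.6.7
  Node Feature Discovery: v0.10.1
  NVIDIA GPU Feature Discovery for Kubernetes: 0.6.1
  NVIDIA MIG Manager for Kubernetes: 0.4.2
  DCGM: 2.4.5-1
  Validator for NVIDIA GPU Operator: v1.11.0
  NVIDIA KubeVirt GPU Device Plugin: v1.1.2
  NVIDIA vGPU Device Manager: v0.1.0
  NVIDIA GDS Driver: N/A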

Note

  • The driver version can differ with NVIDIA vGPU, because it depends on the driver version downloaded from the NVIDIA vGPU Software Portal.

  • The GPU Operator is supported on all R450, R470, R510, R515, R520, and R525 NVIDIA data center production drivers. For a list of supported data center driver versions, visit this link.

Supported NVIDIA GPUs and Systems

The following NVIDIA data center GPUs are supported on x86 based platforms:

Product          Architecture
---------------  --------------------------
NVIDIA H800      NVIDIA Hopper
NVIDIA HGX H100  NVIDIA Hopper and NVSwitch
NVIDIA H100      NVIDIA Hopper
NVIDIA L40       NVIDIA Ada
NVIDIA L4        NVIDIA Ada
NVIDIA DGX A100  A100 and NVSwitch
NVIDIA HGX A100  A100 and NVSwitch
NVIDIA A800      NVIDIA Ampere
NVIDIA A100      NVIDIA Ampere
NVIDIA A100X     NVIDIA Ampere
NVIDIA A40       NVIDIA Ampere
NVIDIA A30       NVIDIA Ampere
NVIDIA A30X      NVIDIA Ampere
NVIDIA A16       NVIDIA Ampere
NVIDIA A10       NVIDIA Ampere
NVIDIA A2        NVIDIA Ampere

Note

  • The Hopper (H100) GPU is supported only on x86 servers.

  • The GPU Operator supports DGX A100 with DGX OS 5.1+ and Red Hat OpenShift using Red Hat Core OS. For installation instructions, see here for DGX OS 5.1+ and here for Red Hat OpenShift.

Supported ARM Based Platforms

The following NVIDIA data center GPUs are supported:

Product                Architecture
---------------------  ------------
NVIDIA A100X           Ampere
NVIDIA A30X            Ampere
AWS EC2 G5g instances  Turing

Note

The GPU Operator only supports platforms using discrete GPUs. NVIDIA Jetson, or other embedded products with integrated GPUs, are not supported.

Note

The R520 Data Center Driver is not supported for ARM.

Supported Deployment Options, Hypervisors, and NVIDIA vGPU Based Products

The GPU Operator has been validated in the following scenarios:

Deployment Options

  • Bare Metal

  • Virtual machines with GPU Passthrough

  • Virtual machines with NVIDIA vGPU based products

Hypervisors (On-premises)

  • VMware vSphere 7 and 8

  • Red Hat Enterprise Linux KVM

  • Red Hat Virtualization (RHV)

NVIDIA vGPU based products

  • NVIDIA vGPU (NVIDIA AI Enterprise)

  • NVIDIA vCompute Server

  • NVIDIA RTX Virtual Workstation

Note

GPU Operator is supported with NVIDIA vGPU 12.0+.

Supported Operating Systems and Kubernetes Platforms

The GPU Operator has been validated in the following scenarios:

Note

The Kubernetes community supports only the last three minor releases as of v1.17. Older releases may be supported through enterprise distributions of Kubernetes such as Red Hat OpenShift.

Operating System

Kubernetes

Red Hat OpenShift

VMware vSphere with Tanzu

Rancher Kubernetes Engine 2

HPE Ezmeral Runtime Enterprise

Canonical MicroK8s

Ubuntu 18.04 LTS

1.21, 1.22, 1.23, 1.24, 1.25, 1.26

Ubuntu 20.04 LTS

1.21—1.27

7.0 U3c, 8.0 U1

1.21—1.27

Ubuntu 22.04 LTS

1.21—1.27

1.26

CentOS 7

1.21—1.27

Red Hat Core OS

4.9, 4.10, 4.11, 4.12, 4.13

Red Hat Enterprise Linux 8.4, 8.6, 8.7, 8.8

1.21—1.27

1.21—1.27

Red Hat Enterprise Linux 8.4, 8.5

5.5

Note

Red Hat OpenShift is supported on the AWS (G4, G5, P3, P4), Azure (NC-T4-v3, NC-v3, ND-A100-v4), and GCP (T4, V100, A100) based instances.

Supported Container Runtimes

The GPU Operator has been validated in the following scenarios:

Operating System            Containerd 1.4 - 1.7  CRI-O
--------------------------  --------------------  -----
Ubuntu 18.04 LTS            Yes                   Yes
Ubuntu 20.04 LTS            Yes                   Yes
Ubuntu 22.04 LTS            Yes                   Yes
CentOS 7                    Yes                   No
Red Hat Core OS (RHCOS)     No                    Yes
Red Hat Enterprise Linux 8  Yes                   Yes

Note

The GPU Operator has been validated with version 2 of the containerd config file.

NVIDIA AI Enterprise Support Matrix

The latest version of NVIDIA AI Enterprise supports the following scenarios:

Operating System  Kubernetes                    Red Hat OpenShift   VMware vSphere with Tanzu
----------------  ----------------------------  ------------------  -------------------------
Ubuntu 20.04 LTS  1.21, 1.22, 1.23, 1.24, 1.25  -                   7.0 U3c, 8.0 U1
Ubuntu 22.04 LTS  1.21, 1.22, 1.23, 1.24, 1.25  -                   -
Red Hat Core OS   -                             4.9.9+, 4.10, 4.11  -

Note

Red Hat OpenShift is supported on the AWS (G4, G5, P3, P4), Azure (NC-T4-v3, NC-v3, ND-A100-v4), and GCP (T4, V100, A100) based instances.

Support for KubeVirt and OpenShift Virtualization

Red Hat OpenShift Virtualization is based on KubeVirt.

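Operating System  Kubernetes    KubeVirt         KubeVirt  OpenShift Virtualization  OpenShift Virtualization
                                GPU Passthrough  vGPU      GPU Passthrough           vGPU
----------------  ------------  ---------------  --------  ------------------------  ------------------------
Ubuntu 20.04 LTS  1.21 to 1.27  0.36+            0.59.1+   -                         -
Ubuntu 22.04 LTS  1.21 to 1.27  0.36+            0.59.1+   -                         -
Red Hat Core OS   -             -                -         4.11, 4.12, 4.13          4.13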

You can run GPU passthrough and NVIDIA vGPU in the same cluster as long as you use a software version that meets both requirements.

NVIDIA vGPU is incompatible with KubeVirt v0.58.0, v0.58.1, and v0.59.0, as well as OpenShift Virtualization 4.12.0—4.12.2. Starting with KubeVirt v0.58.2 and v0.59.1, and OpenShift Virtualization 4.12.3 and 4.13, you must set the DisableMDEVConfiguration feature gate. Refer to GPU Operator with KubeVirt or NVIDIA GPU Operator with OpenShift Virtualization.
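
The KubeVirt part of this rule can be expressed as a small lookup. The sketch below is illustrative only: vgpu_status is a hypothetical helper, it assumes plain three-part version tags without pre-release suffixes, and it makes no statement about versions older than v0.58.0 because the paragraph above does not cover them.

```python
from typing import Tuple

# KubeVirt versions listed above as incompatible with NVIDIA vGPU.
INCOMPATIBLE = {(0, 58, 0), (0, 58, 1), (0, 59, 0)}


def parse(tag: str) -> Tuple[int, ...]:
    """Turn 'v0.59.1' into (0, 59, 1). Assumes a plain X.Y.Z tag."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def vgpu_status(kubevirt_tag: str) -> str:
    """Classify a KubeVirt version according to the note above."""
    version = parse(kubevirt_tag)
    if version in INCOMPATIBLE:
        return "incompatible"
    if version >= (0, 58, 2):
        # Covers v0.58.2 and later, and v0.59.1 and later.
        return "set DisableMDEVConfiguration"
    return "not covered by this note"
```

For example, vgpu_status("v0.59.0") reports the version as incompatible, while vgpu_status("v0.59.1") indicates that the DisableMDEVConfiguration feature gate must be set.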

Support for GPUDirect RDMA

The following operating systems and NVIDIA GPU drivers are supported with GPUDirect RDMA:

GPU Driver      Ubuntu 20.04 and 22.04 LTS  Red Hat OpenShift 4.10 and 4.11  CentOS 7 with MOFED
                with Network Operator 1.4   with Network Operator 1.4        installed on the node
--------------  --------------------------  -------------------------------  ---------------------
470 GPU Driver  470.161.03                  470.161.03                       470.161.03
510 GPU Driver  510.108.03                  510.108.03                       510.108.03
515 GPU Driver  515.86.01                   515.86.01                        515.86.01
520 GPU Driver  520.61.07                   520.61.07                        520.61.07
525 GPU Driver  525.105.17                  525.105.17                       525.105.17

For more information on GPUDirect RDMA refer to this document.

Support for GPUDirect Storage

The following operating systems and NVIDIA GPU drivers are supported with GPUDirect Storage:

Platform                                    GPU Driver (GDS Driver)
------------------------------------------  -----------------------
Ubuntu 20.04 LTS with Network Operator 1.4  525.105.17 (2.15.1)
Ubuntu 22.04 LTS with Network Operator 1.4  525.105.17 (2.15.1)
Red Hat OpenShift Container Platform 4.11   525.105.17 (2.15.1)

Note

GPUDirect Storage is not supported with secure boot. Supported storage types are local NVMe and remote NFS.

Additional Supported Container Management Tools

  • Helm v3

  • Red Hat Operator Lifecycle Manager (OLM)

Previous GPU Operator Releases

The following table provides a historical view of the GPU Operator support matrix.

GPU Operator Release  Kubernetes  OpenShift              Anthos
--------------------  ----------  ---------------------  -------------
v23.3.0               v1.21+      4.9, 4.10, 4.11, 4.12  Supported
v22.9.2               v1.21+      4.9, 4.10, 4.11, 4.12  Supported
v22.9.1               v1.21+      4.9, 4.10, 4.11        Supported
v22.9.0               v1.21+      4.9, 4.10, 4.11        Supported
1.11                  v1.21+      4.9, 4.10, 4.11        Supported
1.10                  v1.21+      4.9, 4.10              Supported
1.9                   v1.19+      4.8, 4.9               Supported
1.8                   v1.18+      4.7, 4.8, 4.9          Supported
1.7                   v1.18+      4.5, 4.6, 4.7          Supported
1.6                   v1.16+      4.5, 4.6, 4.7          Supported
1.5                   v1.13+      4.4.29+, 4.5, 4.6      Supported
1.4                   v1.13+      4.4.29+, 4.5, 4.6      Supported
1.3                   v1.13+      4.4.29+, 4.5, 4.6      Supported
1.2                   v1.13+      Not supported          Supported
1.1.7                 v1.13+      4.1, 4.2, 4.3, 4.4     Supported
1.1                   v1.13+      Not supported          Not supported
1.0                   v1.13+      Not supported          Not supported