Platform Support

This document provides an overview of the GPU Operator lifecycle and of the supported GPUs and platform configurations.

GPU Operator Lifecycle

GPU Operator versions are expressed as x.y.z (<major, minor, patch>) and follow semver terminology.

Only the most recent release of the GPU Operator is maintained through z patch updates. All prior releases of the GPU Operator are deprecated (and unsupported) when a new x.y version of the GPU Operator is released.

The product lifecycle and versioning are subject to change in the future.
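The lifecycle rule above can be illustrated with a short Python sketch. It is not part of the GPU Operator; the helper names (`parse_version`, `maintained_line`, `is_maintained`) are hypothetical, and the release strings are written in three-part form purely for illustration. The sketch assumes that "maintained" means "belongs to the newest x.y line in the list you pass in".

```python
from __future__ import annotations

from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse 'x.y.z' (a leading 'v' is tolerated) into a Version tuple."""
    parts = text.strip().lstrip("v").split(".")
    if len(parts) != 3:
        raise ValueError(f"expected x.y.z, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return Version(major, minor, patch)


def maintained_line(releases: list[str]) -> tuple[int, int]:
    """Return the (major, minor) pair of the newest release in the list."""
    newest = max(parse_version(r) for r in releases)
    return (newest.major, newest.minor)


def is_maintained(release: str, releases: list[str]) -> bool:
    """Assumption: only the newest x.y line still receives z patch updates."""
    v = parse_version(release)
    return (v.major, v.minor) == maintained_line(releases)


# Illustrative release list, not an authoritative release history.
known = ["1.8.2", "1.9.1", "1.10.0"]
print(maintained_line(known))
print(is_maintained("1.9.1", known))
```

Comparing parsed integer tuples rather than raw strings matters here: as strings, "1.9.1" sorts after "1.10.0", which would pick the wrong release line.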

GPU Operator Component Matrix

| Release | NVIDIA GPU Driver | NVIDIA Driver Manager for K8s | NVIDIA Container Toolkit | NVIDIA Kubernetes Device Plugin | DCGM Exporter | Node Feature Discovery | NVIDIA GPU Feature Discovery for Kubernetes | NVIDIA MIG Manager for Kubernetes | DCGM | Validator for NVIDIA GPU Operator | NVIDIA KubeVirt GPU Device Plugin | NVIDIA vGPU Device Manager |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.11 | 515.48.07 (default), 510.47.03, 470.129.06, 450.191.01 | v0.4.0 | 1.10.0 | 0.12.2 | 2.4.5-2.6.7 | v0.10.1 | 0.6.1 | 0.4.2 | 2.4.5-1 | v1.11.0 | v1.1.2 | v0.1.0 |
| 1.10 | 510.47.03 | v0.3.0 | 1.9.0 | 0.11.0 | 2.3.4-2.6.4 | 0.8.2 | 0.5.0 | 0.3.0 | 2.3.4.1 | v1.10.0 | N/A | N/A |
| 1.9.1 | 470.82.01 | v0.2.0 | 1.7.2 | 0.10.0 | 2.3.1-2.6.1 | 0.8.2 | 0.4.1 | 0.2.0 | 2.3.1 | v1.9.1 | N/A | N/A |
| 1.9.0 | 470.82.01 | v0.2.0 | 1.7.2 | 0.10.0 | 2.3.1-2.6.0 | 0.8.2 | 0.4.1 | 0.2.0 | 2.3.1 | v1.9.0 | N/A | N/A |
| 1.8.2 | 470.57.02 | v0.1.0 | 1.7.1 | 0.9.0 | 2.2.9-2.4.0 | 0.8.2 | 0.4.1 | 0.1.3 | 2.2.3 | v1.8.2 | N/A | N/A |
| 1.8.1 | 470.57.02 | v0.1.0 | 1.6.0 | 0.9.0 | 2.2.9-2.4.0 | 0.8.2 | 0.4.1 | 0.1.2 | 2.2.3 | v1.8.1 | N/A | N/A |
| 1.8.0 | 470.57.02 | v0.1.0 | 1.6.0 | 0.9.0 | 2.2.9-2.4.0 | 0.8.2 | 0.4.1 | 0.1.2 | 2.2.3 | v1.8.0 | N/A | N/A |
| 1.7.1 | 460.73.01 | N/A | 1.5.0 | 0.9.0 | 2.1.8-2.4.0-rc.2 | 0.8.2 | 0.4.1 | 0.1.0 | N/A | v1.7.1 | N/A | N/A |
| 1.7.0 | 460.73.01 | N/A | 1.5.0 | 0.9.0 | 2.1.8-2.4.0-rc.2 | 0.6.0 | 0.4.1 | 0.1.0 | N/A | v1.7.0 | N/A | N/A |
| 1.6.2 | 460.32.03 | N/A | 1.4.7 | 0.8.2 | 2.2.0 | 0.6.0 | 0.4.1 | N/A | N/A | N/A | N/A | N/A |
| 1.6.1 | 460.32.03 | N/A | 1.4.6 | 0.8.2 | 2.2.0 | 0.6.0 | 0.4.1 | N/A | N/A | N/A | N/A | N/A |
| 1.6.0 | 460.32.03 | N/A | 1.4.5 | 0.8.2 | 2.2.0 | 0.6.0 | 0.4.1 | N/A | N/A | N/A | N/A | N/A |
| 1.5.2 | 450.80.02 | N/A | 1.4.4 | 0.8.1 | 2.1.2 | 0.6.0 | 0.4.0 | N/A | N/A | N/A | N/A | N/A |
| 1.5.1 | 450.80.02 | N/A | 1.4.3 | 0.7.3 | 2.1.2 | 0.6.0 | 0.3.0 | N/A | N/A | N/A | N/A | N/A |
| 1.5.0 | 450.80.02 | N/A | 1.4.2 | 0.7.3 | 2.1.2 | 0.6.0 | 0.3.0 | N/A | N/A | N/A | N/A | N/A |
| 1.4.0 | 450.80.02 | N/A | 1.4.0 | 0.7.1 | 2.1.2 | 0.6.0 | 0.2.2 | N/A | N/A | N/A | N/A | N/A |
| 1.3.0 | 450.80.02 | N/A | 1.3.0 | 0.7.0 | 2.1.0 | 0.6.0 | 0.2.1 | N/A | N/A | N/A | N/A | N/A |
| 1.2.0 | 450.80.02 | N/A | 1.3.0 | 0.7.0 | 2.1.0-rc.2 | 0.6.0 | N/A | N/A | N/A | N/A | N/A | N/A |
| 1.1.0 | 440.64.00 | N/A | 1.0.5 | 1.0.0-beta4 | 1.7.2 | 0.5.0 | N/A | N/A | N/A | N/A | N/A | N/A |

Note

  • The driver version may differ with NVIDIA vGPU, because it depends on the driver version downloaded from the NVIDIA vGPU Software Portal.

  • The GPU Operator is supported on all R450, R470, R510 and R515 NVIDIA datacenter production drivers. For a list of supported datacenter driver versions, visit this link.
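As a rough illustration of the driver note above, the following Python sketch checks whether a driver version string falls into one of the listed branches. It assumes that the branch is the first dotted component of the version string (for example, 515.48.07 belongs to the 515 branch); the names `SUPPORTED_BRANCHES`, `driver_branch` and `is_supported_branch` are hypothetical. Belonging to a supported branch does not by itself mean a specific driver build is supported, so treat this as a first-pass filter only.

```python
# Branches named in the note above: R450, R470, R510 and 515.
SUPPORTED_BRANCHES = {450, 470, 510, 515}


def driver_branch(driver_version: str) -> int:
    """Return the branch number, assumed to be the first dotted component."""
    return int(driver_version.strip().split(".")[0])


def is_supported_branch(driver_version: str) -> bool:
    """True if the driver version belongs to one of the listed branches."""
    return driver_branch(driver_version) in SUPPORTED_BRANCHES


print(is_supported_branch("515.48.07"))
print(is_supported_branch("460.73.01"))
```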

Supported NVIDIA GPUs/Systems

The following NVIDIA datacenter/enterprise GPUs are supported on x86 based platforms:

| Product | Architecture |
|---|---|
| NVIDIA DGX A100 | A100 and NVSwitch |
| NVIDIA HGX A100 | A100 and NVSwitch |
| NVIDIA A100 | NVIDIA Ampere |
| NVIDIA A100X | NVIDIA Ampere |
| NVIDIA A40 | NVIDIA Ampere |
| NVIDIA A30 | NVIDIA Ampere |
| NVIDIA A30X | NVIDIA Ampere |
| NVIDIA A16 | NVIDIA Ampere |
| NVIDIA A10 | NVIDIA Ampere |
| NVIDIA A2 | NVIDIA Ampere |

Note

The GPU Operator supports DGX A100 with DGX OS 5.1+ and Red Hat OpenShift using Red Hat Core OS. For installation instructions, see here for DGX OS 5.1+ and here for Red Hat OpenShift.

Supported ARM based platforms

The following NVIDIA datacenter/enterprise GPUs are supported:

| Product | Architecture |
|---|---|
| NVIDIA A100X | Ampere |
| NVIDIA A30X | Ampere |
| AWS EC2 G5g instances | Turing |

Note

The GPU Operator only supports platforms using discrete GPUs. Jetson and other embedded products with integrated GPUs are not supported.

Supported deployment options, hypervisors and NVIDIA vGPU based products

The GPU Operator has been validated in the following scenarios:

Deployment Options

  • Bare Metal

  • Virtual machines with GPU Passthrough

  • Virtual machines with NVIDIA vGPU based products

Hypervisors (On-premises)

  • VMware vSphere 7

  • Red Hat Enterprise Linux KVM

  • Red Hat Virtualization (RHV)

NVIDIA vGPU based products

  • NVIDIA vGPU (NVIDIA AI Enterprise)

  • NVIDIA vCompute Server

  • NVIDIA RTX Virtual Workstation

Note

GPU Operator is supported with NVIDIA vGPU 12.0+

Supported Operating Systems and Kubernetes platforms

The GPU Operator has been validated in the following scenarios:

Note

The Kubernetes community supports only the last three minor releases as of v1.17. Older releases may be supported through enterprise distributions of Kubernetes such as Red Hat OpenShift.

| Operating System | Kubernetes | Red Hat OpenShift | VMware vSphere with Tanzu |
|---|---|---|---|
| Ubuntu 18.04 LTS | 1.21, 1.22, 1.23, 1.24 | | |
| Ubuntu 20.04 LTS | 1.21, 1.22, 1.23, 1.24 | | VMware vSphere 7.0 U3c |
| Ubuntu 22.04 LTS | 1.21, 1.22, 1.23, 1.24 | | |
| CentOS 7 | 1.21, 1.22, 1.23, 1.24 | | |
| Red Hat Core OS | | 4.9, 4.10 | |

Note

Red Hat OpenShift is supported on AWS (G4, G5, P3 and P4), Azure (NC-T4-v3, NC-v3 and ND-A100-v4) and GCP (T4, V100 and A100 based instances).

Supported Container Runtimes

The GPU Operator has been validated in the following scenarios:

| Operating System | Containerd 1.4+ | CRI-O |
|---|---|---|
| Ubuntu 18.04 LTS | Yes | No |
| Ubuntu 20.04 LTS | Yes | No |
| Ubuntu 22.04 LTS | Yes | No |
| CentOS 7 | Yes | No |
| Red Hat Core OS (RHCOS) | No | Yes |

Note

The GPU Operator has been validated with version 2 of the containerd config file.
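To check which config file version a node uses, one option is to read the top-level `version` key of the containerd configuration. The Python sketch below is a deliberately simple line scanner rather than a full TOML parser, and the function name `containerd_config_version` is hypothetical. It assumes the key appears before the first table header, as top-level TOML keys must, and it returns None when the key is absent.

```python
from __future__ import annotations

import re


def containerd_config_version(config_text: str) -> int | None:
    """Return the top-level `version` value of a containerd config, if present."""
    for raw in config_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if line.startswith("["):
            break  # the first table header ends the top-level keys
        match = re.fullmatch(r"version\s*=\s*(\d+)", line)
        if match:
            return int(match.group(1))
    return None


# Minimal illustrative config text, not a complete containerd configuration.
sample = "version = 2\n\n[plugins]\n"
print(containerd_config_version(sample) == 2)
```

On a real node you would pass in the contents of the containerd config file; its location (commonly `/etc/containerd/config.toml`) depends on how containerd was installed.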

NVIDIA AI Enterprise support matrix

The latest version of NVIDIA AI Enterprise supports the following scenarios:

| Platform | Ubuntu 20.04 LTS | Ubuntu 22.04 LTS | Red Hat Core OS (RHCOS) |
|---|---|---|---|
| Kubernetes | 1.21, 1.22, 1.23, 1.24 | 1.21, 1.22, 1.23, 1.24 | |
| Red Hat OpenShift | | | 4.9.9+, 4.10 |
| VMware vSphere with Tanzu | VMware vSphere 7.0 U3c | | |

Note

Red Hat OpenShift is supported on AWS (G4, G5, P3 and P4), Azure (NC-T4-v3, NC-v3 and ND-A100-v4) and GCP (T4, V100 and A100 based instances).

Support for GPUDirect RDMA

Supported operating systems and NVIDIA GPU Drivers with GPUDirect RDMA.

| Platform | 470 GPU Driver | 510 GPU Driver | 515 GPU Driver |
|---|---|---|---|
| Ubuntu 20.04 LTS with Network Operator 1.2 | 470.129.06 | 510.47.03 | 515.48.07 |
| Red Hat OpenShift 4.10 with Network Operator 1.2 | 470.129.06 | 510.47.03 | 515.48.07 |
| CentOS 7 with MOFED installed on the node | 470.129.06 | 510.47.03 | 515.48.07 |

For more information on GPUDirect RDMA refer to this document.

Additional supported container management tools:

  • Helm v3

  • Red Hat Operator Lifecycle Manager (OLM)

Technical Preview

The following features are in technical preview. Try them out, share feedback, and contribute.

  • Red Hat OpenShift 4.10 on ARM Server Base System Architecture (SBSA) systems.

  • KubeVirt with GPU Passthrough and NVIDIA vGPU.

Previous GPU Operator Releases

The following table provides a historical view of the GPU Operator support matrix.

| GPU Operator Release | Kubernetes | OpenShift | Anthos |
|---|---|---|---|
| 1.11 | v1.21+ | 4.9, 4.10 | Supported |
| 1.10 | v1.21+ | 4.9, 4.10 | Supported |
| 1.9 | v1.19+ | 4.8, 4.9 | Supported |
| 1.8 | v1.18+ | 4.7, 4.8, 4.9 | Supported |
| 1.7 | v1.18+ | 4.5, 4.6, 4.7 | Supported |
| 1.6 | v1.16+ | 4.5, 4.6, 4.7 | Supported |
| 1.5 | v1.13+ | 4.4.29+, 4.5, 4.6 | Supported |
| 1.4 | v1.13+ | 4.4.29+, 4.5, 4.6 | Supported |
| 1.3 | v1.13+ | 4.4.29+, 4.5, 4.6 | Supported |
| 1.2 | v1.13+ | Not supported | Supported |
| 1.1.7 | v1.13+ | 4.1, 4.2, 4.3, 4.4 | Supported |
| 1.1 | v1.13+ | Not supported | Not supported |
| 1.0 | v1.13+ | Not supported | Not supported |
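The Kubernetes column of the Previous GPU Operator Releases table can be encoded as a small lookup for scripted pre-flight checks. The Python sketch below is one possible way to do that; the names `MIN_KUBERNETES` and `kubernetes_ok` are hypothetical, and the check covers only the lower bound ("v1.21+" is read as "1.21 or newer"). It does not confirm that a newer Kubernetes version has actually been validated.

```python
# Minimum Kubernetes (major, minor) per GPU Operator release, copied from
# the Previous GPU Operator Releases table.
MIN_KUBERNETES = {
    "1.11": (1, 21), "1.10": (1, 21), "1.9": (1, 19),
    "1.8": (1, 18), "1.7": (1, 18), "1.6": (1, 16),
    "1.5": (1, 13), "1.4": (1, 13), "1.3": (1, 13),
    "1.2": (1, 13), "1.1.7": (1, 13), "1.1": (1, 13), "1.0": (1, 13),
}


def kubernetes_ok(operator_release: str, kubernetes_version: str) -> bool:
    """True if the Kubernetes version meets the release's listed minimum."""
    parts = kubernetes_version.strip().lstrip("v").split(".")
    major, minor = int(parts[0]), int(parts[1])
    return (major, minor) >= MIN_KUBERNETES[operator_release]


print(kubernetes_ok("1.11", "v1.24.3"))
print(kubernetes_ok("1.11", "1.20"))
```

An unknown operator release raises KeyError, which is deliberate: a release missing from the table should not silently pass the check.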