Platform Support

This document provides an overview of the supported GPUs and system platform configurations.

GPUs

The following NVIDIA datacenter/enterprise GPUs are supported:

Product                         GPU Architecture
------------------------------  ----------------
Datacenter A-series Products
  NVIDIA A100                   NVIDIA Ampere
  NVIDIA A100X                  NVIDIA Ampere
  NVIDIA A40                    NVIDIA Ampere
  NVIDIA A30                    NVIDIA Ampere
  NVIDIA A30X                   NVIDIA Ampere
  NVIDIA A16                    NVIDIA Ampere
  NVIDIA A10                    NVIDIA Ampere
  NVIDIA A2                     NVIDIA Ampere
Datacenter T-series Products
  NVIDIA T4                     Turing
Datacenter V-series Products
  NVIDIA V100                   Volta
Datacenter P-series Products
  NVIDIA Tesla P100             Pascal
  NVIDIA Tesla P40              Pascal
  NVIDIA Tesla P4               Pascal
RTX-Series / T-Series Products
  NVIDIA RTX A6000              NVIDIA Ampere
  NVIDIA RTX A5000              NVIDIA Ampere
  NVIDIA RTX A4000              NVIDIA Ampere
  Quadro RTX 8000               Turing
  Quadro RTX 6000               Turing
  Quadro RTX 5000               Turing
  Quadro RTX 4000               Turing
  NVIDIA T1000                  Turing
  NVIDIA T600                   Turing
  NVIDIA T400                   Turing

The following NVIDIA server platforms are supported:

Product                         Architecture
------------------------------  -----------------
Datacenter A-series Products
  NVIDIA HGX A100               A100 and NVSwitch
  NVIDIA DGX A100               A100 and NVSwitch

Note

The GPU Operator supports NVIDIA A100X and A30X running on the x86 host or on the DPU’s Arm processor.

Note

The GPU Operator supports DGX A100 with DGX OS 5.1+ and DGX A100 with OpenShift Container Platform (OCP) using RHCOS. For installation instructions, refer to the respective guides for DGX OS 5.1+ and for OCP.

Note

The GPU Operator supports only platforms that use discrete GPUs. Jetson and other embedded products with integrated GPUs are not supported.

Kubernetes Platforms

The following Kubernetes platforms are supported:

  • Kubernetes v1.21+

  • VMware vSphere with Tanzu

  • Red Hat OpenShift 4 using Red Hat Enterprise Linux CoreOS (RHCOS) and the CRI-O container runtime. See the OpenShift guide for getting started.

  • Google Cloud Anthos. See the user guide for getting started.

Note

Technical Preview: Red Hat OpenShift 4.10 on Arm Server Base System Architecture (SBSA) systems. Raise issues on GitHub.

Note

The Kubernetes community supports only the last three minor releases as of v1.17. Older releases may be supported through enterprise distributions of Kubernetes such as Red Hat OpenShift. See the prerequisites for enabling monitoring in Kubernetes releases before v1.16.

The following table shows which Kubernetes platforms each GPU Operator release supports.

GPU Operator Release  Kubernetes  OpenShift               Anthos
--------------------  ----------  ----------------------  -------------
1.10                  v1.21+      4.9 and 4.10            Supported
1.9                   v1.19+      4.8 and 4.9             Supported
1.8                   v1.18+      4.7, 4.8 and 4.9        Supported
1.7                   v1.18+      4.5, 4.6 and 4.7        Supported
1.6                   v1.16+      4.5, 4.6 and 4.7        Supported
1.5                   v1.13+      4.4.29+, 4.5 and 4.6    Supported
1.4                   v1.13+      4.4.29+, 4.5 and 4.6    Supported
1.3                   v1.13+      4.4.29+, 4.5 and 4.6    Supported
1.2                   v1.13+      Not supported           Supported
1.1.7                 v1.13+      4.1, 4.2, 4.3, and 4.4  Supported
1.1                   v1.13+      Not supported           Not supported
1.0                   v1.13+      Not supported           Not supported
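As a rough illustration of how the support matrix above might be consulted programmatically, the following Python sketch transcribes the Kubernetes column into a lookup table and checks a cluster version against it. The names MIN_KUBERNETES, parse_kubernetes_version, and is_kubernetes_supported are hypothetical and are not part of the GPU Operator. The sketch covers only the upstream Kubernetes column and ignores OpenShift and Anthos.

```python
import re
from typing import Tuple

# Minimum upstream Kubernetes (major, minor) per GPU Operator release,
# transcribed by hand from the Kubernetes column of the support matrix.
# This table is an illustration, not an official data source.
MIN_KUBERNETES = {
    "1.10": (1, 21),
    "1.9": (1, 19),
    "1.8": (1, 18),
    "1.7": (1, 18),
    "1.6": (1, 16),
    "1.5": (1, 13),
    "1.4": (1, 13),
    "1.3": (1, 13),
    "1.2": (1, 13),
    "1.1.7": (1, 13),
    "1.1": (1, 13),
    "1.0": (1, 13),
}


def parse_kubernetes_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.21.3' or '1.21'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_kubernetes_supported(operator_release: str, kubernetes_version: str) -> bool:
    """Return True if the cluster version meets the minimum for the release."""
    if operator_release not in MIN_KUBERNETES:
        raise KeyError(f"release not in the matrix: {operator_release!r}")
    minimum = MIN_KUBERNETES[operator_release]
    # Tuples of integers compare numerically, so (1, 9) < (1, 21) as intended.
    return parse_kubernetes_version(kubernetes_version) >= minimum
```

For example, is_kubernetes_supported("1.10", "v1.20.15") evaluates to False under this table, because release 1.10 lists v1.21+ as its minimum. Comparing integer tuples rather than raw strings avoids ordering "1.9" after "1.21".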

Note

GPU Operator versions are expressed as x.y.z, or <major, minor, patch>, and follow semver terminology.

Only the most recent release of the GPU Operator is maintained through z patch updates. All prior releases of the GPU Operator are deprecated (and unsupported) when a new x.y version of the GPU Operator is released.

The product lifecycle and versioning are subject to change in the future.
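The maintenance rule in this note can be sketched in a few lines of Python. The helper names parse_release and is_maintained are hypothetical, and the sketch assumes that a two-part version such as 1.10 is equivalent to 1.10.0 and that the caller supplies the list of published releases.

```python
from typing import Iterable, Tuple


def parse_release(version: str) -> Tuple[int, int, int]:
    """Split an x.y.z version string into (major, minor, patch) integers.

    A missing patch component is treated as 0 (an assumption of this sketch).
    """
    parts = [int(part) for part in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    major, minor, patch = parts[:3]
    return major, minor, patch


def is_maintained(version: str, published: Iterable[str]) -> bool:
    """Return True if `version` belongs to the most recent x.y release line.

    Only the newest x.y line receives z patch updates; every earlier
    line is treated as deprecated once a newer x.y exists.
    """
    latest_major, latest_minor, _ = max(parse_release(v) for v in published)
    major, minor, _ = parse_release(version)
    return (major, minor) == (latest_major, latest_minor)
```

With the releases 1.8.2, 1.9.0, 1.9.1, 1.10, and 1.10.1 as input, only 1.10 and 1.10.1 count as maintained. Parsing into integers matters here because a plain string comparison would rank "1.9.1" above "1.10.1".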

Deployment Scenarios

The GPU Operator has been validated in the following scenarios:

Linux distributions

The following Linux distributions are supported:

  • Ubuntu 18.04.z and 20.04.z LTS

  • DGX OS 5.1+

  • Red Hat Enterprise Linux CoreOS (RHCOS) for use with OpenShift 4.9 and 4.10

  • CentOS 7

In addition, the following container management tools are supported:

  • Helm v3

  • Docker CE 19.03+

  • containerd 1.4+

  • CRI-O with OpenShift 4 using Red Hat Enterprise Linux CoreOS (RHCOS)

Supported Platforms with NVIDIA AI Enterprise

The following platforms are supported. Refer to the NVIDIA AI Enterprise Documentation for more detailed information.

  • Ubuntu 20.04.z LTS bare metal

  • Red Hat OpenShift 4.9.9+ and 4.10 with RHCOS on bare metal

  • Red Hat OpenShift 4.9.9+ and 4.10 with RHCOS on VMware vSphere 7.0 Update 2+

  • VMware vSphere 7.0 Update 2+ with Ubuntu 20.04 guest operating systems

  • VMware vSphere with Tanzu (7.0 U3c) with Ubuntu 20.04 guest operating systems

Supported NVIDIA vGPU Products

NVIDIA vGPU 12.0+ with the following software products:

  • NVIDIA Virtual Compute Server (C-Series)

  • NVIDIA RTX Virtual Workstation (vWS)

Supported Hypervisors with NVIDIA vGPU

The following virtualization platforms are supported. Refer to the NVIDIA vGPU Documentation for more detailed information.

  • VMware vSphere 7

  • Red Hat Enterprise Linux KVM

  • Red Hat Virtualization (RHV)

Note

The GPU Operator deploys the NVIDIA driver as a container. For this reason, desktop environments (for example, workstations with GPUs and a display) are not supported.

GPU Operator Component Matrix

         NVIDIA     NVIDIA Driver    NVIDIA Container  NVIDIA K8s     NVIDIA            Node Feature  NVIDIA GPU         NVIDIA MIG       NVIDIA
Release  Driver     Manager for K8s  Toolkit           Device Plugin  DCGM-Exporter     Discovery     Feature Discovery  Manager for K8s  DCGM
-------  ---------  ---------------  ----------------  -------------  ----------------  ------------  -----------------  ---------------  -------
1.10.1   510.47.03  v0.3.0           1.9.0             0.11.0         2.3.4-2.6.4       0.10.1        0.5.0              0.3.0            2.3.4.1
1.10     510.47.03  v0.3.0           1.9.0             0.11.0         2.3.4-2.6.4       0.10.1        0.5.0              0.3.0            2.3.4.1
1.9.1    470.82.01  v0.2.0           1.7.2             0.10.0         2.3.1-2.6.1       0.8.2         0.4.1              0.2.0            2.3.1
1.9.0    470.82.01  v0.2.0           1.7.2             0.10.0         2.3.1-2.6.0       0.8.2         0.4.1              0.2.0            2.3.1
1.8.2    470.57.02  v0.1.0           1.7.1             0.9.0          2.2.9-2.4.0       0.8.2         0.4.1              0.1.3            2.2.3
1.8.1    470.57.02  v0.1.0           1.6.0             0.9.0          2.2.9-2.4.0       0.8.2         0.4.1              0.1.2            2.2.3
1.8.0    470.57.02  v0.1.0           1.6.0             0.9.0          2.2.9-2.4.0       0.8.2         0.4.1              0.1.2            2.2.3
1.7.1    460.73.01  N/A              1.5.0             0.9.0          2.1.8-2.4.0-rc.2  0.8.2         0.4.1              0.1.0            N/A
1.7.0    460.73.01  N/A              1.5.0             0.9.0          2.1.8-2.4.0-rc.2  0.6.0         0.4.1              0.1.0            N/A
1.6.2    460.32.03  N/A              1.4.7             0.8.2          2.2.0             0.6.0         0.4.1              N/A              N/A
1.6.1    460.32.03  N/A              1.4.6             0.8.2          2.2.0             0.6.0         0.4.1              N/A              N/A
1.6.0    460.32.03  N/A              1.4.5             0.8.2          2.2.0             0.6.0         0.4.1              N/A              N/A
1.5.2    450.80.02  N/A              1.4.4             0.8.1          2.1.2             0.6.0         0.4.0              N/A              N/A
1.5.1    450.80.02  N/A              1.4.3             0.7.3          2.1.2             0.6.0         0.3.0              N/A              N/A
1.5.0    450.80.02  N/A              1.4.2             0.7.3          2.1.2             0.6.0         0.3.0              N/A              N/A
1.4.0    450.80.02  N/A              1.4.0             0.7.1          2.1.2             0.6.0         0.2.2              N/A              N/A
1.3.0    450.80.02  N/A              1.3.0             0.7.0          2.1.0             0.6.0         0.2.1              N/A              N/A
1.2.0    450.80.02  N/A              1.3.0             0.7.0          2.1.0-rc.2        0.6.0         N/A                N/A              N/A
1.1.0    440.64.00  N/A              1.0.5             1.0.0-beta4    1.7.2             0.5.0         N/A                N/A              N/A

Note

  • The driver version may differ with NVIDIA vGPU, because it depends on the driver version downloaded from the NVIDIA vGPU Software Portal.

  • The GPU Operator is supported on all R450, R470 and R510 NVIDIA datacenter production drivers. For a list of supported datacenter driver versions, visit this link.

GPUDirect RDMA

For more information on GPUDirect RDMA, refer to this document.

The following Linux distributions are supported:

  • Ubuntu 20.04 LTS

  • Red Hat OpenShift 4.10 using RHCOS

The following NVIDIA drivers are supported:

  • R470 datacenter drivers (470.57.02+)

  • R510 datacenter drivers (510.47.03+)

Note

On Red Hat OpenShift, GPUDirect RDMA with R470 datacenter drivers is supported only from driver version 470.103.01 onward.
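The driver requirements in this section can be expressed as a small check. The Python sketch below is illustrative only: supports_gpudirect_rdma and the constants are hypothetical names, and the sketch assumes that any driver branch other than R470 and R510 is unsupported for GPUDirect RDMA, which goes no further than what the lists above state.

```python
from typing import Tuple

# Minimum driver version per branch for GPUDirect RDMA, as listed above.
MIN_DRIVER = {
    470: (470, 57, 2),
    510: (510, 47, 3),
}
# Stricter R470 minimum that applies on Red Hat OpenShift.
MIN_R470_OPENSHIFT = (470, 103, 1)


def parse_driver_version(version: str) -> Tuple[int, int, int]:
    """Convert a driver string such as '470.57.02' to (470, 57, 2)."""
    branch, minor, patch = (int(part) for part in version.split("."))
    return branch, minor, patch


def supports_gpudirect_rdma(driver_version: str, openshift: bool = False) -> bool:
    """Return True if the driver meets the GPUDirect RDMA minimum.

    Branches not listed in MIN_DRIVER are treated as unsupported
    (an assumption of this sketch).
    """
    parsed = parse_driver_version(driver_version)
    minimum = MIN_DRIVER.get(parsed[0])
    if minimum is None:
        return False
    if openshift and parsed[0] == 470:
        minimum = MIN_R470_OPENSHIFT
    return parsed >= minimum
```

Under these assumptions, driver 470.82.01 passes the check on Ubuntu but not on OpenShift, where the R470 minimum is 470.103.01. The comparison uses integer tuples because comparing the strings directly would place "470.103.01" before "470.57.02".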