Platform Support

This document provides an overview of the GPUs and system platform configurations supported by the GPU Operator.

GPUs

The following NVIDIA datacenter/enterprise GPUs are supported:

Product              GPU Architecture
-------------------  ----------------
Datacenter A-series Products
  NVIDIA A100        NVIDIA Ampere
  NVIDIA A40         NVIDIA Ampere
  NVIDIA A30         NVIDIA Ampere
  NVIDIA A16         NVIDIA Ampere
  NVIDIA A10         NVIDIA Ampere
Datacenter T-series Products
  NVIDIA T4          Turing
Datacenter V-series Products
  NVIDIA V100        Volta
Datacenter P-series Products
  NVIDIA Tesla P100  Pascal
  NVIDIA Tesla P40   Pascal
  NVIDIA Tesla P4    Pascal
RTX-Series / T-Series Products
  NVIDIA RTX A6000   NVIDIA Ampere
  NVIDIA RTX A5000   NVIDIA Ampere
  NVIDIA RTX A4000   NVIDIA Ampere
  Quadro RTX 8000    Turing
  Quadro RTX 6000    Turing
  Quadro RTX 5000    Turing
  Quadro RTX 4000    Turing
  NVIDIA T1000       Turing
  NVIDIA T600        Turing
  NVIDIA T400        Turing

The following NVIDIA server platforms are supported:

Product              Architecture
-------------------  -----------------
Datacenter A-series Products
  NVIDIA HGX A100    A100 and NVSwitch
  NVIDIA DGX A100    A100 and NVSwitch

Note

The GPU Operator supports DGX A100 with DGX OS 5.1+ and DGX A100 with OpenShift Container Platform (OCP) using RHCOS. For installation instructions, see the respective guides for DGX OS 5.1+ and for OCP.

Note

The GPU Operator supports only platforms that use discrete GPUs. Jetson and other embedded products with integrated GPUs are not supported.

Container Platforms

The following Kubernetes platforms are supported:

  • Kubernetes v1.19+

  • VMware vSphere with Tanzu

  • Red Hat OpenShift 4 using Red Hat Enterprise Linux CoreOS (RHCOS) and CRI-O container runtime. See the OpenShift guide for getting started.

  • Google Cloud Anthos. See the user guide for getting started.

Note

As of v1.17, the Kubernetes community supports only the last three minor releases. Older releases may be supported through enterprise distributions of Kubernetes, such as Red Hat OpenShift. See the prerequisites for enabling monitoring in Kubernetes releases before v1.16.

The following table includes the support matrix of the GPU Operator releases and supported container platforms.

GPU Operator Release  Kubernetes  OpenShift               Anthos
--------------------  ----------  ----------------------  -------------
1.9                   v1.19+      4.8 and 4.9             Supported
1.8                   v1.18+      4.7, 4.8 and 4.9        Supported
1.7                   v1.18+      4.5, 4.6 and 4.7        Supported
1.6                   v1.16+      4.5, 4.6 and 4.7        Supported
1.5                   v1.13+      4.4.29+, 4.5 and 4.6    Supported
1.4                   v1.13+      4.4.29+, 4.5 and 4.6    Supported
1.3                   v1.13+      4.4.29+, 4.5 and 4.6    Supported
1.2                   v1.13+      Not supported           Supported
1.1.7                 v1.13+      4.1, 4.2, 4.3, and 4.4  Supported
1.1                   v1.13+      Not supported           Not supported
1.0                   v1.13+      Not supported           Not supported
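As an illustration of how the Kubernetes column of this matrix might be consulted programmatically, the following Python sketch encodes the minimum Kubernetes version for each GPU Operator release and checks a cluster version against it. The names `MIN_KUBERNETES`, `parse_minor`, and `kubernetes_supported` are hypothetical helpers introduced for this example and are not part of the GPU Operator. The data is copied from the matrix; the sketch assumes the "+" suffix means "this minor version or newer" and does not model any upper bound.

```python
import re
from typing import Tuple

# Minimum Kubernetes (major, minor) per GPU Operator release,
# copied from the support matrix above.
MIN_KUBERNETES = {
    "1.9": (1, 19),
    "1.8": (1, 18),
    "1.7": (1, 18),
    "1.6": (1, 16),
    "1.5": (1, 13),
    "1.4": (1, 13),
    "1.3": (1, 13),
    "1.2": (1, 13),
    "1.1.7": (1, 13),
    "1.1": (1, 13),
    "1.0": (1, 13),
}


def parse_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.21.3' or '1.19'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def kubernetes_supported(operator_release: str, kubernetes_version: str) -> bool:
    """Return True if the cluster version meets the release's listed minimum."""
    return parse_minor(kubernetes_version) >= MIN_KUBERNETES[operator_release]


print(kubernetes_supported("1.9", "v1.21.3"))  # True
print(kubernetes_supported("1.9", "v1.18.0"))  # False
```

Comparing `(major, minor)` tuples rather than strings avoids ordering mistakes such as treating "1.9" as newer than "1.19".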

Note

GPU Operator versions are expressed as x.y.z, or <major, minor, patch>, and follow semver terminology.

Only the most recent release of the GPU Operator is maintained through z patch updates. All prior releases of the GPU Operator are deprecated (and unsupported) when a new x.y version of the GPU Operator is released.

The product lifecycle and versioning are subject to change in the future.
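The maintenance rule in this note can be sketched in Python as follows. This is a minimal illustration, assuming full x.y.z version strings and assuming that "most recent release" means the highest x.y pair among the published versions. The function names `parse_version`, `maintained_line`, and `is_maintained` are hypothetical and are not provided by the GPU Operator.

```python
from typing import Iterable, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split 'x.y.z' into (major, minor, patch) integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def maintained_line(released: Iterable[str]) -> Tuple[int, int]:
    """Return the newest (major, minor) pair among the released versions."""
    return max(parse_version(version)[:2] for version in released)


def is_maintained(version: str, released: Iterable[str]) -> bool:
    """Return True if the version belongs to the newest x.y line."""
    return parse_version(version)[:2] == maintained_line(released)


released = ["1.8.0", "1.8.1", "1.8.2", "1.9.0", "1.9.1"]
print(maintained_line(released))         # (1, 9)
print(is_maintained("1.9.1", released))  # True
print(is_maintained("1.8.2", released))  # False
```

Under this reading, every 1.8.z release becomes unsupported as soon as 1.9.0 is published, and only the 1.9 line continues to receive z patch updates.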

Linux Distributions

The following Linux distributions are supported:

  • Ubuntu 18.04.z, 20.04.z LTS

  • DGX OS 5.1+

  • Red Hat Enterprise Linux CoreOS (RHCOS) for use with OpenShift 4.8 and 4.9

  • CentOS 7

In addition, the following container management tools are supported:

  • Helm v3

  • Docker CE 19.03+

  • containerd 1.4+

  • CRI-O with OpenShift 4 using Red Hat Enterprise Linux CoreOS (RHCOS)
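The minimum versions in the list above ("19.03+" and "1.4+") can be checked with a short Python sketch like the one below. The names `MIN_TOOL_VERSIONS`, `numeric_parts`, and `meets_minimum` are hypothetical helpers, and the tool keys are labels chosen for this example. Helm is left out because the list names a major version (v3) rather than a minimum, and CRI-O is left out because no version is given. How the installed version string is obtained is outside the scope of the sketch.

```python
import re
from typing import Tuple

# Minimum versions taken from the list above ("19.03+" and "1.4+").
MIN_TOOL_VERSIONS = {
    "docker-ce": (19, 3),
    "containerd": (1, 4),
}


def numeric_parts(version: str) -> Tuple[int, ...]:
    """Turn '19.03.15' into (19, 3, 15), ignoring non-numeric characters."""
    return tuple(int(number) for number in re.findall(r"\d+", version))


def meets_minimum(tool: str, version: str) -> bool:
    """Compare only as many leading components as the minimum specifies."""
    minimum = MIN_TOOL_VERSIONS[tool]
    return numeric_parts(version)[: len(minimum)] >= minimum


print(meets_minimum("docker-ce", "19.03.15"))  # True
print(meets_minimum("docker-ce", "18.09.9"))   # False
print(meets_minimum("containerd", "1.4.0"))    # True
```

Converting each component to an integer matters for Docker's zero-padded minor versions: "03" is compared as 3, not as a string.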

GPU Operator Component Matrix

Release

NVIDIA Driver

NVIDIA Driver Manager for K8s

NVIDIA Container Toolkit

NVIDIA K8s Device Plugin

NVIDIA DCGM-Exporter

Node Feature Discovery

NVIDIA GPU Feature Discovery

NVIDIA MIG Manager for K8s

NVIDIA DCGM

1.9.1

470.82.01

v0.2.0

1.7.2

0.10.0

2.3.1-2.6.1

0.8.2

0.4.1

0.2.0

2.3.1

1.9.0

470.82.01

v0.2.0

1.7.2

0.10.0

2.3.1-2.6.0

0.8.2

0.4.1

0.2.0

2.3.1

1.8.2

470.57.02

v0.1.0

1.7.1

0.9.0

2.2.9-2.4.0

0.8.2

0.4.1

0.1.3

2.2.3

1.8.1

470.57.02

v0.1.0

1.6.0

0.9.0

2.2.9-2.4.0

0.8.2

0.4.1

0.1.2

2.2.3

1.8.0

470.57.02

v0.1.0

1.6.0

0.9.0

2.2.9-2.4.0

0.8.2

0.4.1

0.1.2

2.2.3

1.7.1

460.73.01

N/A

1.5.0

0.9.0

2.1.8-2.4.0-rc.2

0.8.2

0.4.1

0.1.0

N/A

1.7.0

460.73.01

N/A

1.5.0

0.9.0

2.1.8-2.4.0-rc.2

0.6.0

0.4.1

0.1.0

N/A

1.6.2

460.32.03

N/A

1.4.7

0.8.2

2.2.0

0.6.0

0.4.1

N/A

N/A

1.6.1

460.32.03

N/A

1.4.6

0.8.2

2.2.0

0.6.0

0.4.1

N/A

N/A

1.6.0

460.32.03

N/A

1.4.5

0.8.2

2.2.0

0.6.0

0.4.1

N/A

N/A

1.5.2

450.80.02

N/A

1.4.4

0.8.1

2.1.2

0.6.0

0.4.0

N/A

N/A

1.5.1

450.80.02

N/A

1.4.3

0.7.3

2.1.2

0.6.0

0.3.0

N/A

N/A

1.5.0

450.80.02

N/A

1.4.2

0.7.3

2.1.2

0.6.0

0.3.0

N/A

N/A

1.4.0

450.80.02

N/A

1.4.0

0.7.1

2.1.2

0.6.0

0.2.2

N/A

N/A

1.3.0

450.80.02

N/A

1.3.0

0.7.0

2.1.0

0.6.0

0.2.1

N/A

N/A

1.2.0

450.80.02

N/A

1.3.0

0.7.0

2.1.0-rc.2

0.6.0

N/A

N/A

N/A

1.1.0

440.64.00

N/A

1.0.5

1.0.0-beta4

1.7.2

0.5.0

N/A

N/A

N/A

Note

  • The driver version may differ with NVIDIA vGPU, because it depends on the driver version downloaded from the NVIDIA vGPU Software Portal.

  • The GPU Operator is supported on all R450, R460 and R470 NVIDIA datacenter production drivers. For a list of supported datacenter driver versions, visit this link.
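To relate the driver versions in the component matrix to the driver branches named in this note, the Python sketch below derives a branch label from a version string and checks it against the three listed branches. It assumes that the branch label is "R" followed by the first component of the version (for example, 470.82.01 belongs to R470). The names `SUPPORTED_BRANCHES`, `LISTED_DRIVER`, `driver_branch`, and `driver_supported` are hypothetical and introduced only for this illustration; the version data is copied from the matrix.

```python
from typing import Dict

# Driver branches named in the note above.
SUPPORTED_BRANCHES = {"R450", "R460", "R470"}

# NVIDIA Driver column of the component matrix, keyed by GPU Operator release.
LISTED_DRIVER: Dict[str, str] = {
    "1.9.1": "470.82.01", "1.9.0": "470.82.01",
    "1.8.2": "470.57.02", "1.8.1": "470.57.02", "1.8.0": "470.57.02",
    "1.7.1": "460.73.01", "1.7.0": "460.73.01",
    "1.6.2": "460.32.03", "1.6.1": "460.32.03", "1.6.0": "460.32.03",
    "1.5.2": "450.80.02", "1.5.1": "450.80.02", "1.5.0": "450.80.02",
    "1.4.0": "450.80.02", "1.3.0": "450.80.02", "1.2.0": "450.80.02",
    "1.1.0": "440.64.00",
}


def driver_branch(driver_version: str) -> str:
    """Map '470.82.01' to 'R470' (assumed branch naming)."""
    return "R" + driver_version.split(".")[0]


def driver_supported(driver_version: str) -> bool:
    """Return True if the version falls in one of the listed branches."""
    return driver_branch(driver_version) in SUPPORTED_BRANCHES


# Releases whose listed driver is outside the R450/R460/R470 branches.
outside = [rel for rel, drv in LISTED_DRIVER.items() if not driver_supported(drv)]
print(outside)  # ['1.1.0']
```

Only release 1.1.0, which lists a 440-series driver, falls outside the branches named in the note.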

Supported Platforms with NVIDIA AI Enterprise

The following platforms are supported. Refer to the NVIDIA AI Enterprise Documentation for more detailed information.

  • VMware vSphere 7.0 Update 2+ with Ubuntu 20.04 guest operating systems

  • Ubuntu 20.04.z LTS bare metal

  • VMware vSphere with Tanzu (7.0 U3) with Ubuntu 20.04 guest operating systems

Supported NVIDIA vGPU Products

NVIDIA vGPU 12.0+ with the following software products:

  • NVIDIA Virtual Compute Server (C-Series)

  • NVIDIA RTX Virtual Workstation (vWS)

Supported Hypervisors with NVIDIA vGPU

The following virtualization platforms are supported. Refer to the NVIDIA vGPU Documentation for more detailed information.

  • VMware vSphere 7

  • Red Hat Enterprise Linux KVM

  • Red Hat Virtualization (RHV)

Deployment Scenarios

The GPU Operator has been validated in the following scenarios:

Note

The GPU Operator deploys the NVIDIA driver as a container. Because of this, desktop environments (for example, workstations with GPUs and an attached display) are not supported.