About the NVIDIA GPU Operator
Kubernetes provides access to special hardware resources such as NVIDIA GPUs, NICs, InfiniBand adapters, and other devices through the device plugin framework. However, configuring and managing nodes with these hardware resources requires setting up multiple software components, such as drivers, container runtimes, and other libraries, a process that is difficult and error-prone. The NVIDIA GPU Operator uses the operator framework within Kubernetes to automate the management of all NVIDIA software components needed to provision GPUs. These components include the NVIDIA drivers (to enable CUDA), the Kubernetes device plugin for GPUs, the NVIDIA Container Toolkit, automatic node labeling using GPU Feature Discovery (GFD), DCGM-based monitoring, and others.
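To illustrate what the device plugin framework means for a workload, the following minimal sketch builds a Pod manifest that requests a GPU as an extended resource. It rests on assumptions not stated in this document: that the NVIDIA device plugin advertises GPUs under the resource name `nvidia.com/gpu`, and that `nvidia-smi` is made available inside the container by the NVIDIA Container Toolkit. The function name `gpu_pod_manifest`, the pod name, and the image reference are hypothetical placeholders.

```python
import json


def gpu_pod_manifest(name, image, gpus=1):
    """Build a Pod manifest (as a dict) that requests `gpus` NVIDIA GPUs.

    Assumption: the device plugin exposes GPUs as the extended
    resource "nvidia.com/gpu". Extended resources are requested
    under `resources.limits` in the container spec.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Assumption: nvidia-smi is available in the container
                    # once the GPU software stack is set up on the node.
                    "command": ["nvidia-smi"],
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical pod name and image; substitute a CUDA-capable image.
    manifest = gpu_pod_manifest("gpu-smoke-test", "example.invalid/cuda-base:latest")
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` on a cluster where the GPU software stack is already running. The pod would stay unscheduled on a cluster with no node advertising the GPU resource.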
Documentation
The following documents cover getting started, platform support, and release notes.
Getting Started
The Getting Started guide includes information on installing the GPU Operator in a Kubernetes cluster.
Release Notes
Refer to Release Notes for information about releases.
Platform Support
The Platform Support page describes the supported platform configurations.
Licenses and Contributing
The NVIDIA GPU Operator source code is licensed under Apache 2.0, and contributions are accepted with a Developer Certificate of Origin (DCO). See the contributing document for more information on how to contribute and on the release artifacts.
The NVIDIA GPU Operator includes components governed by the following NVIDIA End User License Agreements. By installing and using the GPU Operator, you accept the terms and conditions of these licenses.
NVIDIA Deep Learning Container license.
NVIDIA Driver: The license for the NVIDIA datacenter drivers is available at https://www.nvidia.com/content/DriverDownloads/licence.php.
NVIDIA Data Center GPU Manager (DCGM): The license for the NVIDIA DCGM is available on the product page.
Because the underlying images may include components licensed under open-source licenses such as the GPL, the sources for these components are archived on the CUDA opensource index.
The following table lists the licenses for each component.
Component | Artifact Type | Artifact Licenses | Source Code License |
---|---|---|---|
NVIDIA GPU Operator | Helm Chart | | |
NVIDIA GPU Operator | Image | | |
NVIDIA GPU Feature Discovery | Image | | |
NVIDIA GPU Driver | Image | NVIDIA DEEP LEARNING CONTAINER LICENSE and NVIDIA GPU Driver | |
NVIDIA Container Toolkit | Image | | |
NVIDIA Kubernetes Device Plugin | Image | | |
NVIDIA MIG Manager for Kubernetes | Image | | |
Validator for NVIDIA GPU Operator | Image | | |
NVIDIA DCGM | Image | | |
NVIDIA DCGM Exporter | Image | | |
NVIDIA Driver Manager for Kubernetes | Image | | |
NVIDIA KubeVirt GPU Device Plugin | Image | | |
NVIDIA vGPU Device Manager | Image | | |
NVIDIA FS | Image | NVIDIA DEEP LEARNING CONTAINER LICENSE and NVIDIA GPU Driver | |
NVIDIA Confidential Computing Manager for Kubernetes | Image | | |
NVIDIA Kata Manager for Kubernetes | Image | | |