Introduction
NVIDIA Fleet Command deploys container-based applications on GPU-accelerated Kubernetes clusters using Helm charts. This guide explains how to build applications that are compatible with Fleet Command and how to set up a development environment.
Overview of Fleet Command Technologies
Containerization bundles an application with the configuration files, libraries, and dependencies it needs, so that it runs consistently and remains portable across environments. While containers are an excellent way to bundle and run applications, a deployment environment must also manage the containers that run those applications and ensure that there is no downtime.
Kubernetes is an open-source platform for managing containerized applications. It maintains container life cycles without human intervention: for example, if a container goes down, Kubernetes starts another container to replace it.
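The replace-on-failure behavior described above can be pictured as a reconciliation loop that compares a desired state with the observed state. The following is a toy sketch of that idea only, not how Kubernetes is implemented; `ToyCluster`, `reconcile`, and the `app-N` container names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyCluster:
    """Hypothetical in-memory stand-in for a cluster; not a Kubernetes API."""

    desired_replicas: int
    running: set = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1))

    def reconcile(self) -> list:
        """Start containers until the running count matches the desired count."""
        started = []
        while len(self.running) < self.desired_replicas:
            name = f"app-{next(self._ids)}"
            self.running.add(name)
            started.append(name)
        return started


cluster = ToyCluster(desired_replicas=2)
cluster.reconcile()                  # initial start: app-1 and app-2
cluster.running.discard("app-1")     # simulate a container going down
replacements = cluster.reconcile()   # a replacement is started automatically
print(replacements)
```

In this sketch the operator only declares how many copies should run; the loop decides what to start, which is the sense in which no human intervention is needed.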
Fleet Command leverages these technologies to provide a fully supported cloud-native platform that securely deploys, manages, and scales your applications across a distributed edge infrastructure.
Note
Fleet Command Software Stack 2.0 is available in Fleet Command version 1.3.0 and above.
Fleet Command Software Stack
Applications deployed on NVIDIA Fleet Command must run on the following software stack:
Ubuntu 20.04.1 LTS
CUDA 11.6
NVIDIA Driver 510.47.03 (Pre-compiled signed driver)
Containerd 1.4.9-1
Kubernetes 1.22.5
NVIDIA Container Runtime v3.5.0-1
Helm 3.8.2
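When preparing a development machine, it can help to compare its component versions against the list above. The following is a minimal sketch under stated assumptions: the `FLEET_COMMAND_STACK` table is copied from this list, while `normalize`, `find_mismatches`, and the `dev_machine` values are hypothetical names and data introduced for illustration. How the installed versions are collected is left out.

```python
# Versions copied from the software stack list above.
FLEET_COMMAND_STACK = {
    "Ubuntu": "20.04.1",
    "CUDA": "11.6",
    "NVIDIA Driver": "510.47.03",
    "Containerd": "1.4.9-1",
    "Kubernetes": "1.22.5",
    "NVIDIA Container Runtime": "v3.5.0-1",
    "Helm": "3.8.2",
}


def normalize(version: str) -> str:
    """Drop surrounding whitespace and a leading 'v' so 'v3.8.2' equals '3.8.2'."""
    return version.strip().lstrip("vV")


def find_mismatches(installed: dict) -> dict:
    """Return {component: (expected, found)} for each differing or missing component."""
    mismatches = {}
    for component, expected in FLEET_COMMAND_STACK.items():
        found = installed.get(component)
        if found is None or normalize(found) != normalize(expected):
            mismatches[component] = (expected, found)
    return mismatches


# Hypothetical development machine: Helm reported with a 'v' prefix,
# Kubernetes at a different version than the stack requires.
dev_machine = dict(FLEET_COMMAND_STACK, Helm="v3.8.2", Kubernetes="1.23.0")
print(find_mismatches(dev_machine))
# prints {'Kubernetes': ('1.22.5', '1.23.0')}
```

The comparison is an exact string match after normalization, which suits a stack that pins specific versions; it does not attempt to judge whether a newer version would be compatible.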
The environment also includes the NVIDIA libraries listed below with NVIDIA Driver 470.103.01:
libnvidia-cfg1-470-server:amd64
libnvidia-common-470-server
libnvidia-compute-470-server:amd64
libnvidia-container-tools
libnvidia-container1:amd64
libnvidia-decode-470-server:amd64
libnvidia-encode-470-server:amd64
libnvidia-fbc1-470-server:amd64
libnvidia-gl-470-server:amd64
libnvidia-ifr1-470-server:amd64
linux-modules-nvidia-470-server-5.4.0-90-generic
linux-objects-nvidia-470-server-5.4.0-90-generic
linux-signatures-nvidia-5.4.0-90-generic
nvidia-compute-utils-470-server
nvidia-container-runtime
nvidia-container-toolkit
nvidia-kernel-common-470-server
nvidia-utils-470-server
Details on replicating this software stack, including instructions for setting it up manually for development purposes, are provided later in this guide, in the Development Environment section.