Introduction

NVIDIA Fleet Command deploys container-based applications on GPU-accelerated Kubernetes clusters using Helm charts. This guide explains how to build applications that are compatible with Fleet Command and how to set up a development environment.

Overview of Fleet Command Technologies

Containerization bundles an application with the configuration files, libraries, and dependencies it needs, so that it is portable and runs efficiently across environments. Containers are an excellent way to package and run applications, but in a deployment environment the containers themselves must also be managed so that the applications experience no downtime.

Kubernetes is an open-source platform for managing containerized applications. For example, if a container goes down, another container needs to start in its place, and Kubernetes can handle this automatically. In this way, Kubernetes maintains container life cycles without human intervention.

Fleet Command leverages these technologies to provide a fully supported cloud-native platform that securely deploys, manages, and scales your applications across a distributed edge infrastructure.

Note

Fleet Command Software Stack 2.0 is available in Fleet Command version 1.3.0 and above.

Fleet Command Software Stack

Applications deployed on NVIDIA Fleet Command must run on the following software stack:

  • Ubuntu 20.04.1 LTS

  • CUDA 11.6

    • NVIDIA Driver 510.47.03 (Pre-compiled signed driver)

  • Containerd 1.4.9-1

  • Kubernetes 1.22.5

  • NVIDIA Container Runtime v3.5.0-1

  • Helm 3.8.2
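When preparing a development machine, it can help to compare the versions found on it against the list above. The following is a minimal sketch of such a comparison, not an official Fleet Command tool. The component keys, the `normalize` and `compare_stack` helpers, and the shape of the `found` dictionary are all hypothetical names chosen for illustration. It assumes the installed versions have already been collected by some other means.

```python
from typing import Dict, Optional, Tuple

# Expected versions, copied from the software stack list above.
# The dictionary keys are illustrative labels, not official component names.
EXPECTED_STACK: Dict[str, str] = {
    "ubuntu": "20.04.1",
    "cuda": "11.6",
    "nvidia-driver": "510.47.03",
    "containerd": "1.4.9-1",
    "kubernetes": "1.22.5",
    "nvidia-container-runtime": "3.5.0-1",
    "helm": "3.8.2",
}


def normalize(version: str) -> str:
    """Strip whitespace and a leading 'v' so that 'v3.8.2' matches '3.8.2'."""
    version = version.strip()
    return version[1:] if version[:1] in ("v", "V") else version


def compare_stack(found: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {component: (expected, found_or_None)} for every mismatch.

    `found` maps the same keys as EXPECTED_STACK to the versions detected
    on the machine; a missing key is reported with None.
    """
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, expected in EXPECTED_STACK.items():
        actual = found.get(name)
        if actual is None or normalize(actual) != normalize(expected):
            mismatches[name] = (expected, actual)
    return mismatches


if __name__ == "__main__":
    # Hypothetical detected versions, for demonstration only.
    detected = dict(EXPECTED_STACK, kubernetes="1.23.0")
    for component, (expected, actual) in compare_stack(detected).items():
        print(f"{component}: expected {expected}, found {actual}")
```

An empty result means every listed component matches. The comparison is exact string equality after normalization, which is deliberately strict. A real check might accept compatible patch releases instead.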

The environment also includes the NVIDIA libraries listed below with NVIDIA Driver 470.103.01:

  • libnvidia-cfg1-470-server:amd64

  • libnvidia-common-470-server

  • libnvidia-compute-470-server:amd64

  • libnvidia-container-tools

  • libnvidia-container1:amd64

  • libnvidia-decode-470-server:amd64

  • libnvidia-encode-470-server:amd64

  • libnvidia-fbc1-470-server:amd64

  • libnvidia-gl-470-server:amd64

  • libnvidia-ifr1-470-server:amd64

  • linux-modules-nvidia-470-server-5.4.0-90-generic

  • linux-objects-nvidia-470-server-5.4.0-90-generic

  • linux-signatures-nvidia-5.4.0-90-generic

  • nvidia-compute-utils-470-server

  • nvidia-container-runtime

  • nvidia-container-toolkit

  • nvidia-kernel-common-470-server

  • nvidia-utils-470-server
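The package list above can be checked against a machine in a similar way. The sketch below is illustrative only, and `EXPECTED_PACKAGES`, `strip_arch`, and `missing_packages` are hypothetical names. It assumes the installed package names are available one per line, for example as captured from a command such as `dpkg-query -W -f='${binary:Package}\n'`. Architecture suffixes such as `:amd64` are ignored when matching.

```python
from typing import Iterable, List

# Package names copied from the list above.
EXPECTED_PACKAGES: List[str] = [
    "libnvidia-cfg1-470-server:amd64",
    "libnvidia-common-470-server",
    "libnvidia-compute-470-server:amd64",
    "libnvidia-container-tools",
    "libnvidia-container1:amd64",
    "libnvidia-decode-470-server:amd64",
    "libnvidia-encode-470-server:amd64",
    "libnvidia-fbc1-470-server:amd64",
    "libnvidia-gl-470-server:amd64",
    "libnvidia-ifr1-470-server:amd64",
    "linux-modules-nvidia-470-server-5.4.0-90-generic",
    "linux-objects-nvidia-470-server-5.4.0-90-generic",
    "linux-signatures-nvidia-5.4.0-90-generic",
    "nvidia-compute-utils-470-server",
    "nvidia-container-runtime",
    "nvidia-container-toolkit",
    "nvidia-kernel-common-470-server",
    "nvidia-utils-470-server",
]


def strip_arch(name: str) -> str:
    """Drop an architecture suffix: 'pkg:amd64' becomes 'pkg'."""
    return name.split(":", 1)[0]


def missing_packages(
    installed_lines: Iterable[str],
    expected: Iterable[str] = EXPECTED_PACKAGES,
) -> List[str]:
    """Return the expected packages that do not appear in installed_lines.

    Blank lines and surrounding whitespace in the input are ignored.
    """
    installed = {
        strip_arch(line.strip()) for line in installed_lines if line.strip()
    }
    return [pkg for pkg in expected if strip_arch(pkg) not in installed]


if __name__ == "__main__":
    # Hypothetical input: only two of the expected packages are installed.
    sample = ["libnvidia-cfg1-470-server:amd64", "nvidia-utils-470-server"]
    for pkg in missing_packages(sample):
        print("missing:", pkg)
```

This checks package names only. It does not verify package versions or that the kernel modules match the running kernel.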

Instructions for manually replicating this software stack for development purposes are provided later in this guide, in the Development Environment section.