Developing Your Application for Fleet Command
Before proceeding, set up a Development Environment that contains the components of the Fleet Command stack to use for developing your application and ensuring it is compatible with Fleet Command.
A container is a standard unit of software that packages code and all of its dependencies so the application runs quickly and reliably from one computing environment to another. A container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries, and settings. Container images become containers at runtime.
Why Containers?
The key benefits of using containers include:
Install your application, dependencies, and environment variables one time into the container image, rather than on each system you run on.
There is no risk of conflict with libraries that are installed by others.
Containers allow the use of multiple different deep learning frameworks, which may have conflicting software dependencies on the same system.
After you build your application into a container, you can run it on many other systems without installing any software.
Legacy accelerated compute applications can be containerized and deployed on newer systems, on premises, or in the cloud.
Specific GPU resources can be allocated to a container for isolation and better performance.
You can easily share, collaborate, and test applications across different environments.
Multiple instances of a given deep learning framework can be run concurrently, each having one or more specific GPUs assigned.
Containers can resolve network-port conflicts between applications by mapping container-ports to specific externally visible ports when launching the container.
Fleet Command Container Requirements
Applications deployed using NVIDIA Fleet Command must meet these requirements:
Be container-based.
Run and be supported in the software stack/environment listed in the software stack requirements above.
Deployable via Helm chart from either:
Your NGC Private Registry (recommended for maximum security)
The NGC Catalog.
A public registry.
For debugging, logging information should be printed to stdout or stderr.
Note that the root filesystem on a secured Fleet Command system is immutable; therefore, writing to the rootfs (e.g., deploying tools, etc.) is not allowed. As long as the applications are containerized and do not attempt to modify the root file system on the host, they should not encounter any problems.
Containers can run in privileged mode when the “Allow Privileged Containers” option is enabled while creating a deployment on Fleet Command. This also allows applications to write to the root filesystem on the host.
Apply CPU and memory limits to your pods. This can help manage the resources on your worker nodes and prevent a malfunctioning microservice from impacting other microservices.
Set up liveness and readiness probes for your container. Unless your container completely crashes, Kubernetes will not know it’s unhealthy unless you create an endpoint or mechanism that can report container status. Alternatively, make sure your container halts and crashes if unhealthy.
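As a sketch of the two recommendations above, a container spec can set resource limits alongside liveness and readiness probes. The image name, port, and /healthz path below are placeholders, not Fleet Command requirements:

```yaml
# Hypothetical pod spec illustrating resource limits and health probes.
# Image name, port, and /healthz path are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: sample-app
spec:
  containers:
    - name: sample-app
      image: nvcr.io/<org-name>/sample-app:1.0.0
      resources:
        requests:
          cpu: "250m"
          memory: "256Mi"
        limits:            # prevents a malfunctioning service from starving others
          cpu: "1"
          memory: "1Gi"
      livenessProbe:       # Kubernetes restarts the container if this fails
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:      # traffic is withheld until this succeeds
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 5
```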
Use base container images from trusted sources such as nvcr.io or Red Hat Quay. If using community-supported images, use only images provided by communities that you trust.
The container should include the model or provide a mechanism for obtaining models. Refer to the Helm chart requirements below for the NGC authentication option using ModelPullSecret.
Building a Container
Here’s an example of a deep learning container built from an NVIDIA CUDA base image to deploy on Fleet Command, where /app/ contains all of your source code:
FROM nvcr.io/nvidia/cuda:11.3.0-base-ubuntu18.04

# Set up the environment
RUN apt-get update && apt-get install --no-install-recommends --no-install-suggests -y \
    curl unzip python3 python3-pip

# Copy the application from the local path to the container path
COPY app/ /app/
WORKDIR /app

# Install the dependencies
RUN pip3 install -r /app/requirements.txt

ENV MODEL_TYPE='EfficientDet'
ENV DATASET_LINK='HIDDEN'
ENV TRAIN_TIME_SEC=100

CMD ["python3", "train.py"]
Build the Docker image with the command below from the directory containing the Dockerfile, where <org-name> is your NVIDIA organization name and <team-name> is your NVIDIA organization team name.
$ docker build -t nvcr.io/<org-name>/<team-name>/deep-learning:cuda-11.3 .
If you do not have an NVIDIA organization team name, you do not need to include that in the above command.
Now publish the above container to your NGC Private Registry.
Helm is an application package manager that runs on top of Kubernetes, similar to what DEB/RPM packages are for Linux or JAR/WAR packages are for Java-based applications. Helm charts help you define, install, and upgrade even the most complex Kubernetes applications.
Why Helm?
Helm makes Kubernetes resources easier and faster to manage.
Improves productivity: Helm simplifies the deployment of complex applications that require configuration of other applications; for example, Jenkins Helm charts or WordPress Helm charts.
Sharing and reusability: Developers can use existing charts available on public repositories, which speeds up the development. Some examples are MySQL, CMS, etc.
Easier to start with Kubernetes: Developers can create a boilerplate template for their application's Kubernetes resources without spending too much time writing manifests manually.
Ease of CI/CD: By using environment-specific values files and value overrides, developers can create multiple CI environments using one set of Helm charts. This avoids duplicating deployment manifests per environment.
Kubernetes resources manifest validation: The developer can validate configuration changes without deploying to a Kubernetes cluster with the Helm client. Developers can render the templates on a control machine and validate the resource definitions.
Helm V3 Terminologies
Before we deep dive into the Helm workflow, here are some basic definitions:
Charts: Packages in the Helm world are called charts.
A chart is a collection of files inside a particular directory tree that describe a related set of templates. The directory name is the name of the chart (without versioning information).
When charts are packaged as “archives”, these chart directories are packaged into a .tgz with their filename containing their version (using the SemVer2 versioning format). Here’s the typical format for an archive name: chart-name-{semVer2}.tgz.
Release: When a chart is installed, the Helm client creates a release, referenced by name, to track that installation. A single chart may be installed many times into the same cluster, creating many different releases. For example, one can install three PostgreSQL databases by running helm install three times with a different release name each time.
A single release can be updated multiple times. A sequential counter is used to track releases as they change. After the first helm install, a release will have release number 1 (revision). Each time a release is upgraded or rolled back, the release number is incremented.
Chart Version: Chart version is used for tracking the changes in charts. Any time the contents of the chart change, we have to update the chart version.
Application Version: Applications are containerized and tagged using SemVer versioning. In the CICD workflow, this is called the application version. Examples of an application with versioning are nginx-1.0.0 and WordPress-1.5.7. Each application container is published to a Docker registry using a container image and container tag. The container tag references the application version.
appVersion: To keep a one-to-one mapping between the chart and application version, Helm provides an appVersion field in Chart.yaml (explained later). A good practice is to keep appVersion the same as the application version.
Values: Helm charts are templates, and Values provide the runtime config to be injected into the charts. This can be considered analogous to the parameters for a release (used to customize the chart).
Client: Command line interface used for interacting with the server component.
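The relationship between chart version and appVersion described above can be illustrated with a minimal Chart.yaml; the name and version numbers here are illustrative, not from any real chart:

```yaml
# Illustrative Chart.yaml: version tracks changes to the chart itself,
# appVersion tracks the containerized application it deploys.
apiVersion: v2
name: sample-app
description: A sample application chart
version: 0.2.1        # chart version: bump whenever chart contents change
appVersion: "1.5.7"   # application version: matches the container image tag
```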
Fleet Command Helm Chart Requirements
Once your containers are created, you will need to create a Helm chart to deploy them on Fleet Command. The requirements for creating Helm charts are listed below:
Deployable from either:
Your NGC Private Registry. (Recommended for maximum security.)
The NGC Catalog.
A public repo.
Applications should be deployable via a single Helm chart.
For applications that require multiple Helm charts, the user is responsible for deploying and updating in the appropriate sequence.
Privileged pods should not be used. Refer to Containerizing Your Application for more information on privileged containers and Security Overrides.
Pods are restricted from deploying in the following namespaces:
kube-system
kube-public
kube-node-lease
egx-system
efa
helm
When using your NGC Private Registry by entering your NGC API Key into the Fleet Command UI:
Use ImagePullSecret for pulling container images.
Use ModelPullSecret for pulling models.
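For example, a pod spec pulling a container image from your NGC Private Registry might reference the pull secret as in the following sketch; the image name is a placeholder, and the secret name follows the imagepullsecret convention shown later in this guide:

```yaml
# Sketch of a pod spec pulling an image from the NGC Private Registry.
# The image name is a placeholder.
spec:
  imagePullSecrets:
    - name: imagepullsecret   # secret created from your NGC API Key
  containers:
    - name: sample-app
      image: nvcr.io/<org-name>/sample-app:1.0.0
```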
If an endpoint or endpoints are required when applications are brought online, expose them to the user via instructions provided in the NOTES.txt or README file at the root of the Helm chart.
Fleet Command administrators can enter additional configuration options during deployment. Helm charts should be designed to expose these configuration options, and application documentation should clearly outline what configuration options are available and their specific format.
If your application requires ingress from outside the Kubernetes cluster (e.g., an app UI or console), the simplest solution would be to define the service to be exposed as a NodePort with a port range of 30000 to 32767.
If a port is not explicitly specified, provide users instructions on obtaining the port in the application’s documentation. Additionally, add the steps to remotely access the system and commands for getting a service address.
For more advanced solutions, an ingress controller can be used in place of using NodePorts.
Configure the application services for the appropriate service type via the appProtocol field. For more information on this field, refer to https://kubernetes.io/docs/concepts/services-networking/service/#application-protocol.
If Fleet Command Remote Application Access is enabled in settings, application services supporting web applications should preferably have an appProtocol value of http or https. If your backend service is serving https, set the appProtocol field to https. If the field is empty, Fleet Command will assume a value of http, unless the targetPort is 443/https, in which case Fleet Command will assume the appProtocol field to be https. For more information, refer to the example service spec configuration.
Remote Application Access currently only supports TCP in the service spec protocol field.
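A service spec following these rules might look like the sketch below; the service name and port numbers are placeholders:

```yaml
# Sketch of a NodePort service for a web application used with
# Remote Application Access. Name and ports are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: sample-app-webui
spec:
  type: NodePort
  ports:
    - name: http
      port: 8080
      targetPort: 8080
      nodePort: 30080       # must fall in the 30000-32767 range
      protocol: TCP         # Remote Application Access supports TCP only
      appProtocol: http     # identifies this as a web application service
```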
To enable remote application access, Fleet Command recommends adding an automatic redirect to your web application to redirect from the root location to the correct path for your application. Refer to Configuring your Remote Application for more information.
If your application needs to configure the automatic redirect from the root index, include the nginx proxy into your Helm chart and use the following example nginx configuration to redirect from the root index to the application URL.
user nginx;
worker_processes 1;

events {
    worker_connections 10240;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    include /etc/nginx/mime.types;
    fastcgi_buffers 8 16k;
    fastcgi_buffer_size 32k;
    client_max_body_size 24M;
    client_body_buffer_size 128k;
    client_header_buffer_size 5120k;
    large_client_header_buffers 16 5120k;

    server {
        listen 80 default_server;
        server_name localhost;
        root /var/www/html;

        location / {
            rewrite ^ $scheme://$http_host/demo/index.html break;
            proxy_buffering off;
            proxy_cache_bypass $http_upgrade;
            proxy_buffers 4 256k;
            proxy_buffer_size 128k;
            proxy_busy_buffers_size 256k;
            proxy_connect_timeout 4s;
            proxy_read_timeout 86400s;
            proxy_send_timeout 12s;
            proxy_http_version 1.1;
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
If your host network doesn’t generate an FQDN, Fleet Command will generate an FQDN for each system based on the system name. For example, a system named mysystem would have an FQDN of mysystem.egx.nvidia.com. If your chart specifies system name values for use on other deployment platforms, ensure that the system hostname matches the location system name.
Your application is expected to use basic Layer 3 pod networking supported by standard Kubernetes CNI implementations, such as Calico.
Your application is expected to use basic storage via local storage Volumes.
For local (on the edge system) storage, the paths which are allowed to be mounted by default to your application are /opt, /tmp, /mnt, and /etc/localtime. For additional information, visit: https://kubernetes.io/docs/concepts/storage/volumes/#hostpath
If access to any HostPath on the edge system is required, select the appropriate security override options when creating a Deployment for your application. For more information on Security Overrides, refer to the section in the User Guide.
For storing data to external storage, your application can use NFS volumes to mount an existing NFS server export. Your NFS server must have ‘no_root_squash’ enabled. For additional information and examples, visit: https://kubernetes.io/docs/concepts/storage/volumes/#nfs
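The local hostPath and external NFS storage options above can be combined in one pod spec, as in the following sketch; the image name, NFS server address, and paths are placeholders:

```yaml
# Sketch of local (hostPath) and external (NFS) volumes.
# Image name, server address, and paths are placeholders.
spec:
  containers:
    - name: sample-app
      image: nvcr.io/<org-name>/sample-app:1.0.0
      volumeMounts:
        - name: local-data
          mountPath: /data/local
        - name: nfs-data
          mountPath: /data/shared
  volumes:
    - name: local-data
      hostPath:
        path: /opt/sample-app     # /opt is mountable by default on the edge system
    - name: nfs-data
      nfs:
        server: 192.0.2.10        # export must have no_root_squash enabled
        path: /exports/sample-app
```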
If your Helm chart uses dependencies (i.e., your Helm chart has a ‘charts/’ folder), you need to update the chart dependencies using “helm dep up” or “helm dependency update” before packaging the application. For example, if you have a Helm chart called “sample-iva-app,” you would run the following before pushing the chart to the repository:
$ helm dep up sample-iva-app
$ helm package sample-iva-app
Due to a Kubernetes naming limitation, the length of a Kubernetes object name in the Helm chart must be less than 63 characters. Kubernetes objects include deployments, pods, services, secrets, etc. In the following example, the length of the service name in the application Helm chart is over 63 characters, so deployment of this application will fail:
Kubernetes service name: long-overflowing-sample-test-deploy-name-1234567890-video-analytics-demo-app-webui
Building a Helm Chart
Helm Chart Creation
Here is an example of a DeepStream Helm chart to help you build a Helm chart deployed with Fleet Command. Run the command below to create a sample Helm chart. This generates a Helm chart with an nginx Docker container; we will update it to DeepStream later in this section.
$ helm create deepstream
Once the Helm chart is created, the DeepStream directory structure appears as follows:
deepstream
|-- Chart.yaml # A YAML file containing information about the chart
|-- charts # Contains any charts upon which this chart depends.
|-- templates # Contains templates that, when rendered with values, will
# generate valid Kubernetes manifest files.
| |-- NOTES.txt # OPTIONAL: A plain text file containing short usage notes
| |-- _helpers.tpl # go template helpers that can be used throughout the chart.
| |-- deployment.yaml
| |-- ingress.yaml
| `-- service.yaml
`-- values.yaml # The default configuration values for this chart
# It is common to have multiple values files, per environment.
Modify values.yaml
Next, modify values.yaml with the values indicated by “USE THIS VALUE” below.
image:
repository: nvcr.io/nvidia/deepstream # USE THIS VALUE
pullPolicy: IfNotPresent
# Overrides the image tag whose default is the chart appVersion.
tag: 5.1-21.02-samples # USE THIS VALUE
imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""
serviceAccount:
# Specifies whether a service account should be created
create: false # USE THIS VALUE
# Annotations to add to the service account
annotations: {}
# The name of the service account to use.
# If not set and create is true, a name is generated using the fullname template
name: ""
podAnnotations: {}
podSecurityContext: {}
# fsGroup: 2000
securityContext: {}
# capabilities:
# drop:
# - ALL
# readOnlyRootFilesystem: true
# runAsNonRoot: true
# runAsUser: 1000
service:
type: NodePort # USE THIS VALUE
port: 8554 # USE THIS VALUE
nodeport: 31113 # USE THIS VALUE
Additional configuration in values.yaml may be needed if deploying the application on a MIG-enabled system. Refer to the MIG configuration for Applications section below for details.
Modify deployment.yaml
Update deployment.yaml in the templates directory with the values below after the image line, and remove the liveness and readiness probes.
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
command:
- sh
- -c
- cd /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app; sed -ie 's/rows=2/rows=1/g;s/columns=2/columns=1/g;s/num-sources=4/num-sources=1/g;s/batch-size=4/batch-size=1/g;s/batch-size=16/batch-size=1/g;s/file-loop=0/file-loop=1/g;/\[sink2\]/,/enable/s/enable=0/enable=1/;/\[sink2\]/,/sync/s/sync=0/sync=1/;/\[sink0\]/,/type/s/type=2/type=1/' source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt; sed -ie '/\[sink0\]/a rtsp-port=8554\ncodec=1' source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt; sed -ie '/\[sink2\]/a container=1\noutput-file=out.mp4' source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt; deepstream-app -c /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- name: http
containerPort: 8554
protocol: TCP
Modify the container port from 80 to 8554, as shown above.
Modify service.yaml
Next, update service.yaml in the templates directory with the “USE THIS VALUE” values below:
spec:
type: {{ .Values.service.type }}
ports:
- port: {{ .Values.service.port }}
targetPort: http
protocol: TCP
appProtocol: http
name: http # USE THIS VALUE
nodePort: {{ .Values.service.nodeport }}
ImagePullSecrets for Images
If your image exists in the NVIDIA private registry, you need to add imagePullSecrets to values.yaml as shown below.
imagePullSecrets:
- name: imagepullsecret
Helm Chart Validation
Run the below command to validate the Helm chart:
$ helm lint deepstream
==> Linting deepstream
[INFO] Chart.yaml: icon is recommended
1 chart(s) linted, no failures
Helm Chart Packaging
Run the below command to create a Helm chart package:
$ helm package deepstream
Successfully packaged chart and saved it to: /home/nvidia/deepstream-0.1.0.tgz
Next, follow the NGC Helm Charts guide to publish the Helm chart on NGC.
MIG configuration for Applications
If you plan to use the MIG feature on Fleet Command, consider the below scenarios and configure the Helm chart to support backward application compatibility.
MIG Configuration Strategies
There are two MIG strategies available on Fleet Command. For more information about MIG strategies, refer to the MIG Strategies documentation.
In Fleet Command, if Application Compatibility is On, the system is configured for the Single MIG strategy. If the Application Compatibility mode is Off, the Mixed MIG strategy applies.
To support MIG configuration for the application, configure the Helm chart with the GPU type and number of GPUs in values.yaml, making it easy to modify the deployment when MIG profiles change.
gpuType: nvidia.com/gpu
numGPUs: 1
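The deployment template can then consume these values in the container’s resource limits, as in the following sketch:

```yaml
# Sketch of a deployment template fragment referencing the gpuType and
# numGPUs values, so MIG profile changes only require a values update.
resources:
  limits:
    {{ .Values.gpuType }}: {{ .Values.numGPUs }}
```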
Single MIG Strategy
When a system has all GPUs with the same MIG configuration, Application Compatibility mode will be On. In this configuration, a Pod spec that was previously working will not need to be changed, because the nvidia.com/gpu field is interpreted in a backward-compatible manner.
The Deployment Configuration field can be left blank or set with the following values:
Mixed MIG Strategy
When a system has GPUs with different MIG configurations, or one GPU in MIG mode and another not, Application Compatibility mode will be Off. In this configuration, the deployment configuration has to be changed to request a specific MIG device. This requires that the Helm chart is configured with the GPU type and number of GPUs in values.yaml.
To change the deployment configuration for MIG, nvidia.com/gpu must be changed to a specific MIG resource, for example nvidia.com/mig-3g.20gb: 1. Without that change, the deployment will show a FAILED status in the Fleet Command UI.
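With the gpuType and numGPUs values described earlier, switching to the Mixed strategy becomes a values change rather than a template change; for example, to request one 3g.20gb MIG device on an A100 system:

```yaml
# Values override for a Mixed MIG strategy deployment (A100, 3g.20gb profile).
gpuType: nvidia.com/mig-3g.20gb
numGPUs: 1
```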
| MIG Available Options | GPU Labels | Supported GPUs |
|---|---|---|
| 2 MIGs of 3c.20gb | nvidia.com/mig-3g.20gb | A100 |
| 3 MIGs of 2c.10gb | nvidia.com/mig-2g.10gb | A100 |
| 7 MIGs of 1c.5gb | nvidia.com/mig-1g.5gb | A100 |
| 2 MIGs of 2c.12gb | nvidia.com/mig-2g.12gb | A30 |
| 4 MIGs of 1c.6gb | nvidia.com/mig-1g.6gb | A30 |
Multi-node Strategy
MIG configurations are configured on a per-system basis within Fleet Command. Each individual system will maintain its own MIG configuration, even if they are in the same location.
As a result, depending on the MIG configuration of each system and the deployment configuration of the application, the application may be scheduled on only one, or even none of the systems if the GPU label specified does not match any of the configured MIG options for the systems.
Multiple deployments may be required to schedule the application on different MIG configurations in order to specify the correct GPU labels for each system configuration.