Developing Your Application for Fleet Command

Note

Before proceeding, set up a Development Environment that contains the components of the Fleet Command stack to use for developing your application and ensuring it is compatible with Fleet Command.

Containerizing Your Application

A container is a standard unit of software that packages up code and all of its dependencies so that the application runs quickly and reliably from one computing environment to another. A container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries, and settings. Container images become containers at runtime.

Why Containers?

The key benefits of using containers include:

Install your application, dependencies, and environment variables one time into the container image, rather than on each system you run on.
There is no risk of conflict with libraries that are installed by others.
Containers allow the use of multiple different deep learning frameworks, which may have conflicting software dependencies on the same system.
After you build your application into a container, you can run it in many other places, especially on other systems, without installing any additional software.
Legacy accelerated compute applications can be containerized and deployed on newer systems, on premises, or in the cloud.
Specific GPU resources can be allocated to a container for isolation and better performance.
You can easily share, collaborate, and test applications across different environments.
Multiple instances of a given deep learning framework can be run concurrently, each having one or more specific GPUs assigned.
Containers can resolve network-port conflicts between applications by mapping container-ports to specific externally visible ports when launching the container.

Fleet Command Container Requirements

Applications deployed using NVIDIA Fleet Command must meet these requirements:

Be container-based.
Run and be supported in the software stack/environment listed above in the software stack requirements.
Deployable via Helm chart from one of the following:
- Your NGC Private Registry (recommended for maximum security)
- The NGC Catalog.
- A public registry.
For debugging, logging information should be printed to stdout or stderr.
Note that the root filesystem on a secured Fleet Command system is immutable; therefore, writing to the rootfs (e.g., deploying tools, etc.) is not allowed. As long as the applications are containerized and do not attempt to modify the root file system on the host, they should not encounter any problems.
Containers can run in privileged mode when the “Allow PrivilegedContainers” option is enabled while creating a deployment on Fleet Command. This also allows applications to write to the root filesystem on the host.
Apply CPU and memory limits to your pods. This can help manage the resources on your worker nodes and prevent a malfunctioning microservice from impacting other microservices.
Set up liveness and readiness probes for your container. Short of a complete crash, Kubernetes cannot tell that your container is unhealthy unless you provide an endpoint or other mechanism that reports container status. Alternatively, make sure your container halts and crashes if unhealthy.
Use base container images from trusted sources such as nvcr.io or Red Hat Quay. If you use community-supported images, use only images provided by communities that you trust.
The application should contain the model or have its own provision for obtaining models. Refer to the Helm chart requirements below for the option of authenticating to NGC with ModelPullSecret.
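
As a rough illustration of the resource-limit and probe recommendations above, the following Deployment shows one way they might be expressed. The names, image path, /healthz and /ready endpoints, port 8080, and all numeric values are placeholders chosen for this sketch, not values required by Fleet Command.

```yaml
# Hypothetical Deployment: names, image, endpoints, and numbers are illustrative only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: sample-app
          image: nvcr.io/<org-name>/<team-name>/sample-app:1.0.0
          ports:
            - name: http
              containerPort: 8080
          # CPU and memory requests/limits keep a misbehaving pod from starving others.
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
          # Restart the container if the (assumed) health endpoint stops responding.
          livenessProbe:
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 10
            periodSeconds: 10
          # Route traffic to the pod only once the (assumed) readiness endpoint succeeds.
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 5
```

If the application has no HTTP endpoint, an exec or tcpSocket probe can fill the same role.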

Building a Container

Here’s an example of a deep learning container, built from an NVIDIA CUDA base image, for deployment on Fleet Command.

In this example, the local app/ directory contains all of your source code and is copied to /app/ in the container:

FROM nvcr.io/nvidia/cuda:11.3.0-base-ubuntu18.04

# Set up the environment (-y keeps apt-get non-interactive during the build)
RUN apt-get update && \
    apt-get install --no-install-recommends --no-install-suggests -y curl && \
    apt-get install -y unzip python3 python3-pip

# Copy the application from the local path to the container path
COPY app/ /app/
WORKDIR /app

# Install the dependencies
RUN pip3 install -r /app/requirements.txt

ENV MODEL_TYPE='EfficientDet'
ENV DATASET_LINK='HIDDEN'
ENV TRAIN_TIME_SEC=100

# Default command: start training (only the last CMD in a Dockerfile takes effect)
CMD ["python3", "train.py"]

Build the Docker image by running the command below in the directory that contains the above Dockerfile. In the command, <org-name> is your NVIDIA organization name and <team-name> is your NVIDIA organization team name.

$ docker build -t nvcr.io/<org-name>/<team-name>/deep-learning:cuda-11.3 .

Note

If you do not have an NVIDIA organization team name, you do not need to include that in the above command.

Now publish the above container to your NGC Private Registry.

Creating a Helm Chart for your Application

Helm is a package manager for Kubernetes applications. It is similar to what DEB/RPM packages are for Linux or what JAR/WAR archives are for Java-based applications. Helm charts help you define, install, and upgrade even the most complex Kubernetes applications.

Why Helm?

Helm makes managing Kubernetes resources easier and faster.

Improves productivity: Helm simplifies the deployment of complex applications that require configuration of other applications; for example, Jenkins Helm charts or WordPress Helm charts.
Sharing and reusability: Developers can use existing charts available on public repositories, which speeds up the development. Some examples are MySQL, CMS, etc.
Easier to start with Kubernetes: Developers can create a boilerplate template for their application’s Kubernetes resources without spending too much time writing manifests manually.
Ease of CI/CD: By using environment-specific values and value overrides, developers can create multiple CI environments from one set of Helm charts. This avoids duplicating deployment manifests per environment.
Kubernetes resources manifest validation: The developer can validate configuration changes without deploying to a Kubernetes cluster with the Helm client. Developers can render the templates on a control machine and validate the resource definitions.

Helm V3 Terminologies

Before we deep dive into the Helm workflow, here are some basic definitions:

Charts: Packages in the Helm world are called charts.
- A chart is a collection of files inside a particular directory tree that describe a related set of templates. The directory name is the name of the chart (without versioning information).
- When charts are packaged as “archives”, these chart directories are packaged into a .tgz with their filename containing their version (using the SemVer2 versioning format). Here’s the typical format for an archive name: chart-name-{semVer2}.tgz.
Release: When a chart is installed, the Helm client creates a release, referenced by name, to track that installation.
- A single chart may be installed many times into the same cluster, creating many different releases. For example, you can install three PostgreSQL databases by running helm install three times, each with a different release name.
- A single release can be updated multiple times. A sequential counter is used to track releases as they change. After a first Helm install, a release will have release number 1 (revision). Each time a release is upgraded or rolled back, the release number will be incremented.
Chart Version: Chart version is used for tracking the changes in charts. Any time the contents of the chart change, we have to update the chart version.
Application Version: Applications are containerized and tagged using SemVer versioning. In the CI/CD workflow, this is called the application version. Examples of an application with versioning are nginx-1.0.0 and WordPress-1.5.7. Each application container is published to a Docker registry using a container image and container tag. The container tag references the application version.
appVersion: To keep a one-to-one mapping between the chart and application version, Helm provides an appVersion field in Chart.yaml (explained later). A good practice is to keep appVersion the same as the application version.
Values: Helm charts are templates, and Values provide the runtime config to be injected into the charts. This can be considered analogous to the parameters for a release (used to customize the chart).
Client: The helm command-line interface, which in Helm 3 communicates directly with the Kubernetes API server.
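
To make the distinction between the chart version and appVersion concrete, here is a minimal Chart.yaml sketch. The name, description, and version numbers are illustrative assumptions rather than values taken from a real chart.

```yaml
# Chart.yaml (illustrative values)
apiVersion: v2          # chart API version used by Helm 3
name: deepstream
description: A Helm chart for a sample application
type: application
version: 0.1.0          # chart version: bump whenever the chart contents change
appVersion: "5.1"       # application version: typically matches the container image tag
```

With this file, helm package would produce an archive named deepstream-0.1.0.tgz, following the chart-name-{semVer2}.tgz pattern described above.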

Fleet Command Helm Chart Requirements

Once your containers are created, you will need to create a Helm chart to deploy them on Fleet Command. The requirements for creating Helm charts are listed below:

Deployable from one of the following:
- Your NGC Private Registry. (Recommended for maximum security.)
- The NGC Catalog.
- A public repo.
Applications should be deployable via a single Helm chart.
- For applications that require multiple Helm charts, the user is responsible for deploying and updating in the appropriate sequence.
Privileged pods should not be used. Refer to Containerizing Your Application for more information on privileged containers and Security Overrides.
Pods must not be deployed to the following reserved namespaces:
- kube-system
- kube-public
- kube-node-lease
- egx-system
- efa
- helm
When using your NGC Private Registry by entering your NGC API Key into the Fleet Command UI:
- Use ImagePullSecret for pulling container images.
- Use ModelPullSecret for pulling models.
If one or more endpoints are required when applications are brought online, describe how to reach them in the NOTES.txt/README file at the root of the Helm chart.
Fleet Command administrators can enter additional configuration options during deployment. Helm charts should be designed to expose these configuration options, and application documentation should clearly outline what configuration options are available and their specific format.
If your application requires ingress from outside the Kubernetes cluster (e.g., an app UI or console), the simplest solution is to expose the service as a NodePort with a port in the range 30000 to 32767.
- If a port is not explicitly specified, provide users with instructions for obtaining the port in the application’s documentation. Additionally, include the steps to remotely access the system and the commands for getting a service address.
- For more advanced solutions, an ingress controller can be used in place of using NodePorts.
Configure the application services for the appropriate service type via the appProtocol field. For more information on this field, refer to https://kubernetes.io/docs/concepts/services-networking/service/#application-protocol.
- If Fleet Command Remote Application Access is enabled in settings, then application services supporting web applications should preferably have an appProtocol value of http or https. If your backend service is serving ‘https’, set the appProtocol field to https. If the field is empty, Fleet Command will assume the value of http, unless the targetPort is 443/https in which case Fleet Command will assume the appProtocol field to be https. For more information, refer to the example service spec configuration.
- Remote Application Access currently only supports TCP in the service spec protocol field.
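
As a sketch of the NodePort and appProtocol guidance above, a Service manifest might look like the following. The service name, selector label, and port numbers are hypothetical; only the 30000 to 32767 range, the TCP protocol, and the appProtocol values come from the requirements themselves.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: sample-webui           # hypothetical name
spec:
  type: NodePort
  selector:
    app: sample-app            # hypothetical pod label
  ports:
    - name: http
      protocol: TCP            # Remote Application Access supports TCP only
      appProtocol: http        # use https if the backend serves HTTPS
      port: 8080               # port exposed inside the cluster
      targetPort: 8080         # port the container listens on
      nodePort: 31080          # must fall within 30000-32767
```

Pinning nodePort explicitly means the application's documentation can state the port up front instead of asking users to look it up after deployment.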

To enable remote application access, Fleet Command recommends adding an automatic redirect to your web application to redirect from the root location to the correct path for your application. Refer to Configuring your Remote Application for more information.

If your application needs to configure the automatic redirect from the root index, include the nginx proxy into your Helm chart and use the following example nginx configuration to redirect from the root index to the application URL.

user nginx;
worker_processes  1;
events {
  worker_connections  10240;
}
http {
  sendfile on;
  tcp_nopush on;
  tcp_nodelay on;
  keepalive_timeout 65;
  types_hash_max_size 2048;
  include /etc/nginx/mime.types;
  fastcgi_buffers 8 16k;
  fastcgi_buffer_size 32k;

  client_max_body_size 24M;
  client_body_buffer_size 128k;

  client_header_buffer_size 5120k;
  large_client_header_buffers 16 5120k;
  server {
    listen       80 default_server;
    server_name  localhost;
    root /var/www/html;
    location / {
        rewrite ^ $scheme://$http_host/demo/index.html break;
        proxy_buffering off;
        proxy_cache_bypass $http_upgrade;
        proxy_buffers 4 256k;
        proxy_buffer_size 128k;
        proxy_busy_buffers_size 256k;
        proxy_connect_timeout 4s;
        proxy_read_timeout 86400s;
        proxy_send_timeout 12s;
        proxy_http_version 1.1;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
  }
}

If your host network doesn’t generate an FQDN, Fleet Command will generate an FQDN for each system based on the system name. For example, a system named mysystem would have an FQDN of mysystem.egx.nvidia.com. If your chart specifies system name values for use on other deployment platforms, ensure that the system hostname matches the location system name.
Your application is expected to use basic Layer 3 pod-networking supported by standard K8S CNI implementations, such as Calico.
Your application is expected to use essential storage via local storage Volumes.
- For local (on the edge system) storage, the paths which are allowed to be mounted by default to your application are /opt, /tmp, /mnt, and /etc/localtime. For additional information, visit: https://kubernetes.io/docs/concepts/storage/volumes/#hostpath
- If access to any HostPath on the edge system is required, select the appropriate security override options when creating a Deployment for your application. For more information on Security Overrides, refer to the section in the User Guide.
- For storing data to external storage, your application can use NFS volumes to mount an existing NFS server export. Your NFS server must have ‘no_root_squash’ enabled. For additional information and examples, visit: https://kubernetes.io/docs/concepts/storage/volumes/#nfs
If your Helm chart uses dependencies (i.e., your Helm chart has a ‘charts/’ folder), you need to update the chart’s dependencies using “helm dep up” or “helm dependency update” before packaging the application. For example, if you have a Helm chart called “sample-iva-app”, you would run the following before pushing the chart to the repository:

$ helm dep up sample-iva-app
$ helm package sample-iva-app
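
For the local and NFS storage options described earlier in this list, the following pod spec is a minimal sketch of how the two volume types might be declared. The pod name, image, mount paths, NFS server, and export path are all placeholders, and the sketch assumes that a subdirectory of an allowed path such as /opt is also mountable by default.

```yaml
# Hypothetical pod spec; names, paths, and the NFS server are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: storage-demo
spec:
  containers:
    - name: app
      image: nvcr.io/<org-name>/<team-name>/sample-app:1.0.0
      volumeMounts:
        - name: local-data
          mountPath: /data
        - name: shared-data
          mountPath: /shared
  volumes:
    # Local storage on the edge system, under one of the default allowed paths.
    - name: local-data
      hostPath:
        path: /opt/sample-app
        type: DirectoryOrCreate
    # External storage from an existing NFS export (server needs no_root_squash).
    - name: shared-data
      nfs:
        server: nfs.example.com
        path: /exports/sample-app
```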

Important

Due to a Kubernetes naming limitation, the name of a Kubernetes object in the Helm chart must not exceed 63 characters. Kubernetes objects include deployments, pods, services, secrets, etc. In the following example,

Kubernetes service name: long-overflowing-sample-test-deploy-name-1234567890-video-analytics-demo-app-webui

the length of the service name in the application Helm chart is over 63 characters. As a result, the deployment of this application will fail.
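
One way to stay within the limit is to truncate generated names in the chart templates. The fragment below is a sketch that assumes the chart was scaffolded with helm create deepstream, which defines a deepstream.fullname helper; the -webui suffix is hypothetical.

```yaml
# templates/service.yaml (fragment)
# Assumes the "deepstream.fullname" helper generated by `helm create deepstream`.
apiVersion: v1
kind: Service
metadata:
  # trunc 63 caps the name length; trimSuffix removes a dangling hyphen left by truncation.
  name: {{ printf "%s-webui" (include "deepstream.fullname" .) | trunc 63 | trimSuffix "-" }}
```

Because truncation can cut off the suffix that distinguishes one object from another, keeping release and chart names short is the more robust fix.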

Building a Helm Chart

Helm Chart Creation

Here is an example of a DeepStream Helm chart to help you build a Helm chart that can be deployed with Fleet Command. Run the below command to create a sample Helm chart. This generates a Helm chart that uses an nginx Docker container. We will update it to use DeepStream later in this section.

$ helm create deepstream

Once the Helm chart is created, the deepstream directory structure appears as follows:

deepstream
|-- Chart.yaml           # A YAML file containing information about the chart
|-- charts               # Contains any charts upon which this chart depends.
|-- templates            # Contains templates that, when rendered with values, will
                         # generate valid Kubernetes manifest files.
|   |-- NOTES.txt        # OPTIONAL: A plain text file containing short usage notes
|   |-- _helpers.tpl     # go template helpers that can be used throughout the chart.
|   |-- deployment.yaml
|   |-- ingress.yaml
|   `-- service.yaml
`-- values.yaml          # The default configuration values for this chart
                         # It is common to have multiple values files, per environment.

Modify values.yaml

Next, modify the values.yaml with the values indicated by “USE THIS VALUE” below.

image:
    repository: nvcr.io/nvidia/deepstream  # USE THIS VALUE
    pullPolicy: IfNotPresent
    # Overrides the image tag whose default is the chart appVersion.
    tag: 5.1-21.02-samples # USE THIS VALUE

imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""

serviceAccount:
    # Specifies whether a service account should be created
    create: false  # USE THIS VALUE
    # Annotations to add to the service account
    annotations: {}
    # The name of the service account to use.
    # If not set and create is true, a name is generated using the fullname template
    name: ""

podAnnotations: {}

podSecurityContext: {}
    # fsGroup: 2000

securityContext: {}
    # capabilities:
    # drop:
    # - ALL
    # readOnlyRootFilesystem: true
    # runAsNonRoot: true
    # runAsUser: 1000

service:
    type: NodePort   # USE THIS VALUE
    port: 8554       # USE THIS VALUE
    nodeport: 31113  # USE THIS VALUE

Note

Additional configuration in values.yaml may be needed if deploying the application on a MIG-enabled system. Refer to the MIG configuration for Applications section below for details.

Modify deployment.yaml

Update the deployment.yaml in the templates directory with the values below, placed after the image line, and remove the liveness and readiness probes.

image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
command:
- sh
- -c
- cd /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app; sed -ie 's/rows=2/rows=1/g;s/columns=2/columns=1/g;s/num-sources=4/num-sources=1/g;s/batch-size=4/batch-size=1/g;s/batch-size=16/batch-size=1/g;s/file-loop=0/file-loop=1/g;/\[sink2\]/,/enable/s/enable=0/enable=1/;/\[sink2\]/,/sync/s/sync=0/sync=1/;/\[sink0\]/,/type/s/type=2/type=1/' source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt;  sed -ie '/\[sink0\]/a rtsp-port=8554\ncodec=1' source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt; sed -ie '/\[sink2\]/a container=1\noutput-file=out.mp4' source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt; deepstream-app -c /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
  - name: http
    containerPort: 8554
    protocol: TCP

Note

Change the containerPort from 80 to 8554, as shown above.

Modify service.yaml

Next, update the service.yaml in the templates directory with the “USE THIS VALUE” values below:

spec:
    type: {{ .Values.service.type }}
    ports:
        - port: {{ .Values.service.port }}
          targetPort: http
          protocol: TCP
          appProtocol: http
          name: http  # USE THIS VALUE
          nodePort: {{ .Values.service.nodeport }}

ImagePullSecrets for Images

If your image resides in an NGC Private Registry, you need to add ImagePullSecrets to the values.yaml as shown below.

imagePullSecrets:
- name: imagepullsecret

Helm Chart Validation

Run the below command to validate the Helm chart:

$ helm lint deepstream
==> Linting deepstream
[INFO] Chart.yaml: icon is recommended

1 chart(s) linted, no failures

Helm Chart Packaging

Run the below command to create a Helm chart package:

$ helm package deepstream
Successfully packaged chart and saved it to: /home/nvidia/deepstream-0.1.0.tgz

Next, follow the NGC Helm Charts guide to publish the Helm chart on NGC.

MIG configuration for Applications

If you plan to use the MIG feature on Fleet Command, consider the below scenarios and configure the Helm chart to support backward application compatibility.

MIG Configuration Strategies

There are two MIG strategies available on Fleet Command. For more information about MIG strategies, refer to the MIG Strategies documentation.

Note

In Fleet Command, if Application Compatibility is On, the system is configured for the Single MIG strategy. If the Application Compatibility mode is Off, the Mixed MIG strategy applies.

To support the configuration of the application for MIG, configure the Helm chart with the GPU type and number of GPUs in values.yaml to help modify the deployment when MIG profiles have changed.

gpuType: nvidia.com/gpu
numGPUs: 1
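
A chart can consume these two values in its deployment template so that the requested GPU resource changes with the values rather than with the template itself. The fragment below is a sketch of that idea; placing it under the container's resources.limits is an assumption about how the chart is laid out.

```yaml
# templates/deployment.yaml (container fragment)
resources:
  limits:
    # Renders to, for example, "nvidia.com/gpu: 1" with the values shown above.
    {{ .Values.gpuType }}: {{ .Values.numGPUs }}
```

With this in place, switching to a MIG device only requires overriding gpuType and numGPUs at deployment time.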

Single MIG Strategy

When a system has all GPUs with the same MIG Config, Application Compatibility mode will be On. In this configuration, a Pod Spec that was previously working will not need to be changed because the nvidia.com/gpu field is interpreted in a backward-compatible manner.

The Deployment Configuration field can be left blank or with the following values:
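
For instance, assuming the chart reads the gpuType and numGPUs keys introduced above, a Deployment Configuration for the Single strategy would plausibly just restate the defaults:

```yaml
gpuType: nvidia.com/gpu   # generic GPU resource, valid when Application Compatibility is On
numGPUs: 1
```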

Mixed MIG Strategy

When a system has GPUs with different MIG configurations, or one GPU in MIG mode and another not, Application Compatibility mode will be Off. In this configuration, the deployment configuration has to be changed to request a specific MIG device. This requires that the Helm chart is configured with the GPU type and number of GPUs in values.yaml.

To change the deployment configuration for MIG, nvidia.com/gpu must be changed to a MIG-specific resource such as nvidia.com/mig-3g.20gb: 1. Without that change, the Deployment will show a FAILED status in the Fleet Command UI.
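
Assuming the chart exposes the gpuType and numGPUs keys described earlier, the override entered as the deployment configuration for a system with 3g.20gb MIG devices might look like this sketch:

```yaml
gpuType: nvidia.com/mig-3g.20gb   # MIG-specific resource label from the table below
numGPUs: 1
```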

MIG Available Options	GPU Labels	Supported GPUs
2 MIGs of 3g.20gb	nvidia.com/mig-3g.20gb	A100
3 MIGs of 2g.10gb	nvidia.com/mig-2g.10gb	A100
7 MIGs of 1g.5gb	nvidia.com/mig-1g.5gb	A100
2 MIGs of 2g.12gb	nvidia.com/mig-2g.12gb	A30
4 MIGs of 1g.6gb	nvidia.com/mig-1g.6gb	A30

Multi-node Strategy

MIG configurations are configured on a per-system basis within Fleet Command. Each individual system will maintain its own MIG configuration, even if they are in the same location.

As a result, depending on the MIG configuration of each system and the deployment configuration of the application, the application may be scheduled on only one, or even none of the systems if the GPU label specified does not match any of the configured MIG options for the systems.

Multiple deployments may be required to schedule the application on different MIG configurations in order to specify the correct GPU labels for each system configuration.
