Hardware Requirements
Intelligent Virtual Assistant
NVIDIA AI Workflows can be deployed on-premises or on a cloud service provider (CSP). This workflow requires a minimum of two GPU-enabled nodes to run the provided example workload. Production deployments should be performed in a high-availability (HA) environment.
The following hardware specification is recommended for both the transcription and virtual assistant workflows:
Two instances, each with the following:
1x A10/A30/A100 (any GPU with 24 GB or more of memory)
12 vCPU cores
64 GB RAM
500 GB HDD
Ports 443 and 80 allowed for ingress and egress
DNS - A wildcard DNS A record must be created for the system, in addition to the DNS A record for the system itself. Reverse lookup (PTR) records should also exist for both entries.
Note: If the DNS entries are only resolvable within a local network, such as within a corporate domain, and not directly resolvable by the VMIs, a manual reverse lookup entry can be made in /etc/hosts on the systems, pointing 127.0.0.1 to the DNS FQDN, as a workaround.
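As a quick sanity check, the following Python sketch verifies the forward (A) and reverse (PTR) lookups and confirms that ports 80 and 443 are reachable. The names iva.example.com and app.iva.example.com are hypothetical placeholders; substitute your system's FQDN and any name covered by the wildcard record.

import socket

FQDN = "iva.example.com"          # hypothetical: the A record for the system itself
WILDCARD = "app.iva.example.com"  # hypothetical: any name under the wildcard A record

for name in (FQDN, WILDCARD):
    addr = socket.gethostbyname(name)        # forward (A record) lookup
    print(f"{name} -> {addr}")
    host, _, _ = socket.gethostbyaddr(addr)  # reverse (PTR record) lookup
    print(f"{addr} -> {host}")

# Confirm the ingress ports are reachable on the instance.
for port in (80, 443):
    with socket.create_connection((FQDN, port), timeout=5):
        print(f"port {port} is reachable")

If the /etc/hosts workaround from the note above is used, the entry takes the form of 127.0.0.1 followed by the FQDN (for example, 127.0.0.1 iva.example.com).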
Follow the steps below to set up instances that meet the above requirements before proceeding to the next section.
We will start by deploying two On-Demand NVIDIA AI Enterprise VMIs in the cloud, meeting the above specifications. One of these instances will be used for the training pipeline, and the other will be used for the inference pipeline. You can find this VMI in the marketplace of each major CSP; follow the instructions from the listing to provision the instances.
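For illustration only, the sketch below shows how two such instances might be provisioned programmatically on AWS with boto3. The AMI ID is a hypothetical placeholder for the NVIDIA AI Enterprise VMI from your marketplace subscription, and g5.4xlarge (1x A10G, 16 vCPU, 64 GiB RAM) is one instance type that meets the specification above; the marketplace listing's own instructions take precedence.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical; use the VMI's AMI ID from the listing
    InstanceType="g5.4xlarge",        # assumption: meets the GPU/vCPU/RAM specification
    MinCount=2,
    MaxCount=2,                       # one instance per pipeline
    BlockDeviceMappings=[
        {"DeviceName": "/dev/sda1",   # root device name varies by AMI
         "Ebs": {"VolumeSize": 500}}, # 500 GB disk
    ],
)
for instance in response["Instances"]:
    print(instance["InstanceId"])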
Once the instances have been provisioned, if applicable, refer to the NVIDIA AI Enterprise Cloud Guide to authorize the instances and activate your subscription.
If applicable, once your subscription has been activated, review the Prerequisites section to ensure you can access the Enterprise Catalog and create an NGC API Key if you do not already have one.
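A common way to exercise the NGC API Key is logging in to the NGC container registry at nvcr.io, where the username is the literal string $oauthtoken and the password is the key itself. The minimal sketch below assumes the key has been exported as the NGC_API_KEY environment variable.

import os
import subprocess

api_key = os.environ["NGC_API_KEY"]  # assumes: export NGC_API_KEY=<your key>

# For nvcr.io the username is always the literal string "$oauthtoken";
# the API key is supplied as the password via stdin.
subprocess.run(
    ["docker", "login", "nvcr.io", "--username", "$oauthtoken", "--password-stdin"],
    input=api_key.encode(),
    check=True,
)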
After the Hardware Requirements and Prerequisites sections have been completed, move on to the Cloud Native Software Requirements section.