Step 3: Install Cloud Native Software Stack


NVIDIA AI Workflows are based on cloud-native platforms running Kubernetes. As part of this workflow, we will deploy an example Kubernetes cluster that meets the hardware requirements, termed NVIDIA Cloud Native Stack, along with a set of components, termed NVIDIA Cloud Native Service Add-on Pack, that serve as implementation examples for integrating the AI application with enterprise-like authentication, monitoring, reporting, and load balancing.

  1. Ensure that the VMI provisioned from the previous Hardware Requirements section is accessible.

  2. We will install NVIDIA Cloud Native Stack on the NVIDIA AI Enterprise VMI that has been provisioned.

    • Open an SSH web console to the VMI you provisioned via the cloud provider’s methods.

    • Once an SSH session has been established, create a local user with sudo access via the following commands:

      sudo adduser nvidia

      sudo usermod -aG sudo nvidia
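
      To verify the new account before continuing (an optional sanity check, not part of the installer's documented steps), switch to it and confirm sudo access. The sudo whoami command should print root; exit returns you to your original session:

      su - nvidia
      sudo whoami
      exit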


    • Next, clone the NVIDIA Cloud Native Stack repo via the following command:

      git clone https://github.com/NVIDIA/cloud-native-stack.git
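
      This guide assumes the 6.3 values file is available on the repository's default branch. If you instead need to match a specific release, you can list the repository's available tags and check one out (an optional step; tag names vary by release):

      git -C cloud-native-stack tag -l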


    • Navigate to the playbooks directory within the repo:

      cd cloud-native-stack/playbooks


    • Edit the hosts file: uncomment the localhost line and replace the ansible_ssh_user, password, and sudo user fields with the credentials of the local user you created previously (an illustrative entry is shown after the command below). Once this is done, save and exit the file.

      nano hosts
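
      For reference, an uncommented entry might look like the following. This is an illustrative sketch with placeholder passwords; the exact fields and layout of the hosts file may vary between releases:

      [master]
      localhost ansible_ssh_user=nvidia ansible_ssh_pass=<password> ansible_sudo_pass=<password> ansible_ssh_common_args='-o StrictHostKeyChecking=no'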


    • Next, edit the cnc_values_6.3.yaml file, updating the cnc_docker and cnc_nvidia_driver parameters from no to yes (the resulting lines are shown after the command below). Setting these values to yes ensures that a compatible version of Docker and the NVIDIA GPU driver are installed and available for use in this workflow, helping developers evaluate the Docker experience with Cloud Native Stack. Save and exit the file.

      nano cnc_values_6.3.yaml
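
      After editing, the two parameters in cnc_values_6.3.yaml should read as follows (all other values in the file are left unchanged):

      cnc_docker: yes
      cnc_nvidia_driver: yes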


    • Run the installer for NVIDIA Cloud Native Stack via the following command:

      bash setup.sh install

      Note

      The installer may fail if dpkg was interrupted or did not complete during instance provisioning. If this occurs, run the following command to resolve the issue, then retry the installation.

      sudo dpkg --configure -a
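
      If dpkg reports that its lock is still held, a first-boot automatic update may be in progress. As an optional diagnostic (assuming lsof is installed, as it typically is on Ubuntu), you can check which process holds the lock before retrying:

      sudo lsof /var/lib/dpkg/lock-frontend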


    • After the installation completes, create the kubeconfig for your user:

      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config


    • Next, check that the Kubernetes cluster is up and running by running the following command:

      kubectl get pods -A
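
      All pods should reach a Running or Completed status within a few minutes. As an optional additional check, you can also confirm that the node itself reports Ready:

      kubectl get nodes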


    • Finally, we’ll install Local Path Provisioner to provide a storage class to use for the rest of the workflow:

      kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.23/deploy/local-path-storage.yaml
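
      To confirm the storage class was created, list the cluster's storage classes. If later components expect a default storage class, you can optionally mark local-path as the default using the standard Kubernetes annotation (shown second; skip it if another default is already set):

      kubectl get storageclass
      kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'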


  3. Once NVIDIA Cloud Native Stack has been set up, we will install NVIDIA Cloud Native Service Add-on Pack. Refer to the NVIDIA Cloud Native Service Add-on Pack Deployment Guide for instructions to set up the prerequisites and deploy the software stack.

  4. After the add-on pack has been installed, proceed to the Deployment Steps section to continue setting up the workflow.

More information about NVIDIA Cloud Native Stack can be found in the cloud-native-stack GitHub repository (https://github.com/NVIDIA/cloud-native-stack).

More information about NVIDIA Cloud Native Service Add-on Pack can be found in its Deployment Guide, referenced above.
