Set Up Using Deployment Scripts#

Start a minikube cluster and install the NeMo microservices platform using the automated deployment scripts provided by NVIDIA.

Note

This minikube cluster setup tutorial is designed for the Beginner Tutorials, which run small fine-tuning, evaluation, and inference workloads on smaller LLMs such as llama-3.1-8b-instruct and meta-llama/llama-3.2-1b-instruct. If you want to run AI workloads at a larger scale, set up the NeMo microservices platform on a larger Kubernetes cluster. For more information, refer to About Admin Setup.

Before You Begin#

Check the Requirements before you begin.
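
If you want to confirm that the required tooling is in place before running the scripts, the following commands print the versions of the software the deployment relies on. Compare the output against the versions listed in the Requirements.

    docker --version
    minikube version
    kubectl version --client
    helm version --short
    # Confirm that the host driver can see the NVIDIA GPU(s).
    nvidia-smi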

To Start a Minikube Cluster and Install the NeMo Microservices Platform#

Follow the steps below to start a minikube cluster and install the NeMo microservices platform using the automated deployment scripts provided by NVIDIA.

  1. Download the following files into a directory on your local machine.

    • create-nmp-deployment.sh: NeMo platform deployment script. Use this script to set up a minikube cluster and install the NeMo microservices platform on it.

    • destroy-nmp-deployment.sh: Clean-up script. Use this script to delete the minikube cluster and the local files used during the deployment.

    • demo-values.yaml: Optional. The create-nmp-deployment.sh script uses this values file or creates one if it doesn’t exist. This values file is for installing the NeMo microservices platform Helm chart, nemo-microservices-helm-chart-25.6.0.tgz, on the minikube cluster for demonstration purposes. Save this file in the same directory as the scripts.

      Note

      This demo-values.yaml file is for a minimal setup of the NeMo microservices platform on a minikube cluster. For production-grade deployments, you’ll need additional configurations for:

      • Database disaster recovery

      • Multi-node training support in Volcano

      • Persistent volume claims (PVC) with proper storage classes

      For production-grade deployments, refer to the Admin Setup section.

      Note

      The NeMo Data Store microservice, configured with the demo values, includes a 2GB persistent volume that can accommodate roughly twenty LoRA adapters. Attempting to run multiple LoRA fine-tuning sessions or even a single full supervised fine-tuning (SFT) job within this demo setup will result in failure.

  2. Make the scripts executable.

    chmod +x create-nmp-deployment.sh destroy-nmp-deployment.sh
    
  3. Create an NGC API key following the instructions at Generating NGC API Keys.

  4. Export the NGC API key into your shell environment using the following command:

    export NGC_API_KEY=<your-ngc-api-key>
    
  5. Go to build.nvidia.com and generate an NVIDIA API key.

  6. Export the NVIDIA API key into your shell environment using the following command:

    export NVIDIA_API_KEY=<your-nvidia-api-key>
    
  7. Run the create-nmp-deployment.sh script to set up the minikube cluster and the NeMo microservices platform.

    ./create-nmp-deployment.sh --helm-chart-url https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nemo-microservices-helm-chart-25.6.0.tgz
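
     The full deployment can take some time. While the script runs, you can optionally watch the platform pods come up from a second terminal (minikube points kubectl at the new cluster by default):

    kubectl get pods -A -w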
    
  8. Move on to the Beginner Tutorials to learn how to use the capabilities of the NeMo microservices.

  9. After you’re done with the Beginner Tutorials, run the destroy-nmp-deployment.sh script to delete the minikube cluster and clean up the local files used during the deployment.

    ./destroy-nmp-deployment.sh
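
     To confirm that the cleanup succeeded, you can check that the minikube profile no longer exists:

    minikube profile list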
    

NeMo Platform Deployment Script Overview#

The create-nmp-deployment.sh script automates the deployment of the NeMo microservices platform on a minikube cluster. Here’s a detailed breakdown of its functionality:

Deployment Script Phases#

Phase 0: Run pre-flight checks

  • Verifies system requirements.

  • Checks required software.

Phase 1: Set up minikube

  • Initializes a fresh minikube cluster with the Docker driver and Docker container runtime.

  • Configures unlimited CPU/memory resources.

  • Enables GPU access.

  • Enables ingress addon.
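
  For reference, the minikube invocation this phase performs looks roughly like the following. The exact flags live in the script, so treat this as an illustrative sketch rather than the script's literal commands.

    minikube start --driver=docker --container-runtime=docker \
      --cpus=no-limit --memory=no-limit --gpus=all
    minikube addons enable ingress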

Phase 2: Create secrets

  • Creates Kubernetes secrets:

    • Docker registry secret (nvcrimagepullsecret)

    • NGC API key secret (ngc-api)

    • NVIDIA API key secret (nvidia-api)
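
  These map to standard kubectl commands along the following lines; the key names inside the generic secrets are assumptions for illustration, and the script defines the actual contents.

    kubectl create secret docker-registry nvcrimagepullsecret \
      --docker-server=nvcr.io \
      --docker-username='$oauthtoken' \
      --docker-password="$NGC_API_KEY"
    # The literal key names below are assumptions; the script sets the real ones.
    kubectl create secret generic ngc-api --from-literal=NGC_API_KEY="$NGC_API_KEY"
    kubectl create secret generic nvidia-api --from-literal=NVIDIA_API_KEY="$NVIDIA_API_KEY"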

Phase 3: Install the Helm chart

  • Downloads the NeMo Microservices Helm chart.

  • Creates a default demo-values.yaml file if it doesn’t exist.

  • Installs the Volcano scheduler.

  • Deploys the NeMo platform components.
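
  The download-and-install step is roughly equivalent to the Helm commands below; the release name and target namespace are assumptions, since the script controls those.

    helm pull https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nemo-microservices-helm-chart-25.6.0.tgz \
      --username='$oauthtoken' --password="$NGC_API_KEY"
    # "nemo" is an assumed release name for illustration.
    helm install nemo nemo-microservices-helm-chart-25.6.0.tgz -f demo-values.yaml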

Phase 4: Verify pod health

  • Monitors pod initialization (30-minute timeout).

  • Checks for critical errors (e.g., ImagePullBackOff).

  • Collects diagnostic information.

  • Validates pod health status.
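
  To run a similar check by hand, list the pods and look for the failure states the script watches for:

    kubectl get pods -A
    kubectl get pods -A | grep -E 'ImagePullBackOff|ErrImagePull|CrashLoopBackOff'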

Phase 5: Configure DNS

  • Updates /etc/hosts with DNS entries:

    <minikube-ip> nemo.test
    <minikube-ip> nim.test
    <minikube-ip> data-store.test
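
  If you ever need to recreate these entries by hand, you can pull the cluster IP from minikube and append them yourself:

    MINIKUBE_IP=$(minikube ip)
    printf '%s nemo.test\n%s nim.test\n%s data-store.test\n' \
      "$MINIKUBE_IP" "$MINIKUBE_IP" "$MINIKUBE_IP" | sudo tee -a /etc/hosts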
    

Phase 6: Deploy the Meta Llama NIM

  • Deploys the Meta Llama 3.1 8B Instruct NIM with the following configuration:

    • Model: llama-3.1-8b-instruct

    • Storage: 25GB PVC

    • Resources: 1 GPU

Phase 7: Wait for NIM readiness

  • Monitors deployment status (15-minute timeout).

  • Collects diagnostics if issues occur.

  • Ensures NIM reaches READY state.

Phase 8: Verify the NIM endpoint

  • Tests the NIM endpoint responsiveness.

  • Validates the models API endpoint.

  • Confirms the deployment functionality.
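
  This check amounts to calling the NIM's OpenAI-compatible models route through the nim.test host configured in Phase 5, for example:

    curl http://nim.test/v1/models

  A healthy deployment returns a JSON list of models that should include llama-3.1-8b-instruct. The exact path can vary with your ingress configuration, so treat this as a representative check rather than the script's literal request.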

Script Usage#

./create-nmp-deployment.sh [OPTIONS]

Options:
  --helm-chart-url URL    Override default helm chart URL
  --values-file FILE      Specify additional values file(s)
  --help                  Show help message
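
For example, to install a specific chart version together with your own overrides layered on top of demo-values.yaml (the overrides file name here is only a placeholder):

    ./create-nmp-deployment.sh \
      --helm-chart-url https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nemo-microservices-helm-chart-25.6.0.tgz \
      --values-file my-overrides.yaml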