Set Up Using Deployment Scripts#

Start a minikube cluster and install the NeMo microservices platform using the automated deployment scripts provided by NVIDIA.

Note

This minikube cluster setup tutorial is designed for the Beginner Tutorials, which run small fine-tuning, evaluation, and inference workloads on smaller LLMs such as llama-3.1-8b-instruct and meta-llama/llama-3.2-1b-instruct. If you want to run AI workloads at a larger scale, set up the NeMo microservices platform on a larger Kubernetes cluster. For more information, refer to About Admin Setup.

Before You Begin#

Check the Requirements before you begin.
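
If you want to confirm that the required tooling is in place before running the scripts, the following commands print the versions of the software the deployment relies on. Compare the output against the versions listed in the Requirements.

    docker --version
    minikube version
    kubectl version --client
    helm version --short
    # Confirm that the host driver can see the NVIDIA GPU(s).
    nvidia-smi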

To Start a Minikube Cluster and Install the NeMo Microservices Platform#

Follow the steps below to start a minikube cluster and install the NeMo microservices platform using the automated deployment scripts provided by NVIDIA.

  1. Download the following files into a directory on your local machine.

    • create-nmp-deployment.sh: NeMo platform deployment script. Use this script to set up a minikube cluster and install the NeMo microservices platform on it.

    • destroy-nmp-deployment.sh: Clean-up script. Use this script to delete the minikube cluster and the local files used during the deployment.

    • demo-values.yaml: Optional. The create-nmp-deployment.sh script uses this values file or creates one if it doesn’t exist. This values file is for installing the NeMo microservices platform Helm chart, nemo-microservices-helm-chart-25.6.0.tgz, on the minikube cluster for demonstration purposes. Save this file in the same directory as the scripts.

      Note

      This demo-values.yaml file is for a minimal setup of the NeMo microservices platform on a minikube cluster. For production-grade deployments, you’ll need additional configurations for:

      • Database disaster recovery

      • Multi-node training support in Volcano

      • Persistent volume claims (PVC) with proper storage classes

      For production-grade deployments, refer to the Admin Setup section.

      Note

      The NeMo Data Store microservice, configured with the demo values, includes a 2GB persistent volume that can accommodate roughly twenty LoRA adapters. Attempting to run multiple LoRA fine-tuning sessions or even a single full supervised fine-tuning (SFT) job within this demo setup will result in failure.

  2. Make the scripts executable.

    chmod +x create-nmp-deployment.sh destroy-nmp-deployment.sh
    
  3. Create an NGC API key following the instructions at Generating NGC API Keys.

  4. Export the NGC API key into your shell environment using the following command:

    export NGC_API_KEY=<your-ngc-api-key>
    
  5. Go to build.nvidia.com and generate an NVIDIA API key.

  6. Export the NVIDIA API key into your shell environment using the following command:

    export NVIDIA_API_KEY=<your-nvidia-api-key>
    
  7. Run the create-nmp-deployment.sh script to set up the minikube cluster and the NeMo microservices platform.

    ./create-nmp-deployment.sh --helm-chart-url https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nemo-microservices-helm-chart-25.6.0.tgz
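
     The full deployment can take some time. While the script runs, you can optionally watch the platform pods come up from a second terminal (minikube points kubectl at the new cluster by default):

    kubectl get pods -A -w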
    
  8. Move on to the Beginner Tutorials to learn how to use the capabilities of the NeMo microservices.

  9. After you’re done with the Beginner Tutorials, run the destroy-nmp-deployment.sh script to delete the minikube cluster and clean up the local files used during the deployment.

    ./destroy-nmp-deployment.sh
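
     To confirm that the cleanup succeeded, you can check that the minikube profile no longer exists:

    minikube profile list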
    

NeMo Platform Deployment Script Overview#

The create-nmp-deployment.sh script automates the deployment of the NeMo microservices platform on a minikube cluster. Here’s a detailed breakdown of its functionality:

Deployment Script Phases#

Phase 0: Run pre-flight checks

  • Verifies system requirements.

  • Checks required software.

Phase 1: Set up minikube

  • Initializes a fresh minikube cluster with the Docker driver and Docker container runtime.

  • Configures unlimited CPU/memory resources.

  • Enables GPU access.

  • Enables ingress addon.
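
  For reference, the minikube invocation this phase performs looks roughly like the following. The exact flags live in the script, so treat this as an illustrative sketch rather than the script's literal commands.

    minikube start --driver=docker --container-runtime=docker \
      --cpus=no-limit --memory=no-limit --gpus=all
    minikube addons enable ingress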

Phase 2: Create secrets

  • Creates Kubernetes secrets:

    • Docker registry secret (nvcrimagepullsecret)

    • NGC API key secret (ngc-api)

    • NVIDIA API key secret (nvidia-api)
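
  These map to standard kubectl commands along the following lines; the key names inside the generic secrets are assumptions for illustration, and the script defines the actual contents.

    kubectl create secret docker-registry nvcrimagepullsecret \
      --docker-server=nvcr.io \
      --docker-username='$oauthtoken' \
      --docker-password="$NGC_API_KEY"
    # The literal key names below are assumptions; the script sets the real ones.
    kubectl create secret generic ngc-api --from-literal=NGC_API_KEY="$NGC_API_KEY"
    kubectl create secret generic nvidia-api --from-literal=NVIDIA_API_KEY="$NVIDIA_API_KEY"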

Phase 3: Install the Helm chart

  • Downloads the NeMo Microservices Helm chart.

  • Creates a default demo-values.yaml file if it doesn’t exist.

  • Installs the Volcano scheduler.

  • Deploys the NeMo platform components.
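
  The download-and-install step is roughly equivalent to the Helm commands below; the release name and target namespace are assumptions, since the script controls those.

    helm pull https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nemo-microservices-helm-chart-25.6.0.tgz \
      --username='$oauthtoken' --password="$NGC_API_KEY"
    # "nemo" is an assumed release name for illustration.
    helm install nemo nemo-microservices-helm-chart-25.6.0.tgz -f demo-values.yaml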

Phase 4: Verify pod health

  • Monitors pod initialization (30-minute timeout).

  • Checks for critical errors (e.g., ImagePullBackOff).

  • Collects diagnostic information.

  • Validates pod health status.
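
  To run a similar check by hand, list the pods and look for the failure states the script watches for:

    kubectl get pods -A
    kubectl get pods -A | grep -E 'ImagePullBackOff|ErrImagePull|CrashLoopBackOff'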

Phase 5: Configure DNS

  • Updates /etc/hosts with DNS entries:

    <minikube-ip> nemo.test
    <minikube-ip> nim.test
    <minikube-ip> data-store.test
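
  If you ever need to recreate these entries by hand, you can pull the cluster IP from minikube and append them yourself:

    MINIKUBE_IP=$(minikube ip)
    printf '%s nemo.test\n%s nim.test\n%s data-store.test\n' \
      "$MINIKUBE_IP" "$MINIKUBE_IP" "$MINIKUBE_IP" | sudo tee -a /etc/hosts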
    

Phase 6: Deploy the Meta Llama NIM

  • Deploys the Meta Llama 3.1 8B Instruct NIM with the following configuration:

    • Model: llama-3.1-8b-instruct

    • Storage: 25GB PVC

    • Resources: 1 GPU

Phase 7: Wait for NIM readiness

  • Monitors deployment status (15-minute timeout).

  • Collects diagnostics if issues occur.

  • Ensures NIM reaches READY state.

Phase 8: Verify the NIM endpoint

  • Tests the NIM endpoint responsiveness.

  • Validates the models API endpoint.

  • Confirms the deployment functionality.
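
  This check amounts to calling the NIM's OpenAI-compatible models route through the nim.test host configured in Phase 5, for example:

    curl http://nim.test/v1/models

  A healthy deployment returns a JSON list of models that should include llama-3.1-8b-instruct. The exact path can vary with your ingress configuration, so treat this as a representative check rather than the script's literal request.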

Script Usage#

./create-nmp-deployment.sh [OPTIONS]

Options:
  --helm-chart-url URL    Override default helm chart URL
  --values-file FILE      Specify additional values file(s)
  --help                  Show help message
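
For example, to install a specific chart version together with your own overrides layered on top of demo-values.yaml (the overrides file name here is only a placeholder):

    ./create-nmp-deployment.sh \
      --helm-chart-url https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nemo-microservices-helm-chart-25.6.0.tgz \
      --values-file my-overrides.yaml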