Set Up Manually#
To set up the minikube cluster and the NeMo microservices platform manually, follow the steps in the following sections.
Note
This minikube cluster setup tutorial is designed for the Beginner Tutorials, which run small fine-tuning, evaluation, and inference workloads on smaller LLMs such as llama-3.1-8b-instruct and meta-llama/llama-3.2-1b-instruct. If you want to run AI workloads at a larger scale, set up the NeMo microservices platform on a larger Kubernetes cluster. For more information, refer to About Admin Setup.
Before You Begin#
Check the Requirements before you begin.
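The commands on this page call the docker, minikube, kubectl, helm, and yq command-line tools. The following is a minimal sketch, not an official check, for confirming that they are on your PATH before you start. The helper name missing_tools is hypothetical.

```shell
# Hypothetical helper: prints the name of each listed tool that is not on PATH.
# Prints nothing when every tool is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example call (left commented out so the definition has no side effects):
# missing_tools docker minikube kubectl helm yq
```

Empty output means every tool was found. Any name that is printed needs to be installed first.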
Start Minikube#
Download minikube following the minikube start guide in the minikube documentation.
Refer to the instructions for Using NVIDIA GPUs with minikube and follow the steps until you reach the command to start minikube. When you reach that command, run the following command instead. It removes the CPU and memory limits so that the cluster has enough of both.
minikube start \
--driver docker \
--container-runtime docker \
--cpus no-limit \
--memory no-limit \
--gpus all
Enable the minikube ingress addon.
minikube addons enable ingress
Install the NeMo Microservices Platform#
Use the NVIDIA NeMo Microservices Helm Chart to install the NeMo microservices platform.
Create an NGC API key following the instructions at Generating NGC API Keys. The NGC API Key is used to fetch Docker images required by the platform.
Export the NGC API Key into your shell environment using the following command:
export NGC_API_KEY=<your-ngc-api-key>
Set up NGC secrets in the default namespace of the cluster using the following commands:
kubectl create secret \
docker-registry nvcrimagepullsecret \
--docker-server=nvcr.io \
--docker-username='$oauthtoken' \
--docker-password=$NGC_API_KEY

kubectl create secret generic ngc-api \
--from-literal=NGC_API_KEY=$NGC_API_KEY
Go to build.nvidia.com and generate an NVIDIA API key.
Export the NVIDIA API key into your shell environment using the following command:
export NVIDIA_API_KEY=<your-nvidia-api-key>
Store the NVIDIA_API_KEY value in the Kubernetes nvidia-api secret.
kubectl create secret generic nvidia-api \
--from-literal=NVIDIA_API_KEY=$NVIDIA_API_KEY
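The secret-creation commands above read NGC_API_KEY and NVIDIA_API_KEY from the shell environment, so an unset variable passes an empty value to kubectl. The following is a minimal sketch of a guard you could run before creating the secrets. The helper name require_env is hypothetical.

```shell
# Hypothetical helper: returns 0 only if every named variable is set and
# non-empty. Each missing name is reported on stderr.
require_env() {
  rc=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example call (left commented out so the definition has no side effects):
# require_env NGC_API_KEY NVIDIA_API_KEY && echo "keys are set"
```

Because the helper uses eval, pass it only variable names that you typed yourself.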
Download the NeMo Microservices Helm Chart:
helm fetch --untar https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nemo-microservices-helm-chart-25.6.0.tgz \
--username='$oauthtoken' \
--password=$NGC_API_KEY
Download the demo-values.yaml file.
Note
This demo-values.yaml file is for a minimal setup of the NeMo microservices platform on a minikube cluster. For production-grade deployments, you’ll need additional configurations for:
Database disaster recovery
Multi-node training support in Volcano
Persistent volume claims (PVC) with proper storage classes
For production-grade deployments, refer to the Admin Setup section.
Note
The NeMo Data Store microservice, configured with the demo values, includes a 2GB persistent volume that can accommodate roughly twenty LoRA adapters. Attempting to run multiple LoRA fine-tuning sessions or even a single full supervised fine-tuning (SFT) job within this demo setup will result in failure.
Install the Volcano scheduler before installing the chart:
VOLCANO_VERSION=$(yq '.dependencies[] | select(.name=="volcano") | .version' nemo-microservices-helm-chart/Chart.yaml)
kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/v${VOLCANO_VERSION}/installer/volcano-development.yaml
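If the yq query returns nothing, VOLCANO_VERSION is empty and the manifest URL points at a path that does not exist. The following is a minimal sketch of a guard that builds the URL only when the version looks usable. The helper name volcano_manifest_url is hypothetical, and the check for the literal string null assumes that your yq prints null for a missing field.

```shell
# Hypothetical helper: prints the Volcano manifest URL for a version string,
# or fails when the version is empty or the literal string "null".
volcano_manifest_url() {
  version=$1
  if [ -z "$version" ] || [ "$version" = "null" ]; then
    echo "Volcano version is empty; check the yq query and the Chart.yaml path" >&2
    return 1
  fi
  echo "https://raw.githubusercontent.com/volcano-sh/volcano/v${version}/installer/volcano-development.yaml"
}

# Example call (left commented out so the definition has no side effects):
# url=$(volcano_manifest_url "$VOLCANO_VERSION") && kubectl apply -f "$url"
```

Chaining with && means kubectl apply runs only when a URL was actually produced.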
Install the chart:
helm --namespace default install \
nemo nemo-microservices-helm-chart-25.6.0.tgz \
-f demo-values.yaml \
--set guardrails.nvcfAPIKeySecretName="nvidia-api"
The pods require approximately 30 minutes to download images, start the containers, and establish stable communication. During this time, it is normal for pods to be in a pending or restarting state.
Verify that the pods are in the ready state:
kubectl get pods
Confirm that pods with a running status display 1/1 or 2/2.
If pods do not enter the Running or Completed state after 30 minutes, run kubectl events to check for image or Kubernetes errors. To diagnose issues on a specific deployment, add --for deployment/<deployment name> to target that deployment. You can also investigate software issues using kubectl logs <name of concerning pod>.
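With many pods, scanning the kubectl get pods output by eye is error-prone. The following is a minimal sketch of a filter that lists only the pods that still need attention. The helper name not_ready_pods is hypothetical, and it assumes the default column order of kubectl get pods (NAME, READY, STATUS).

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints the name of each pod that is neither Completed nor fully ready
# (READY shown as n/n, for example 1/1 or 2/2).
not_ready_pods() {
  awk '{
    split($2, ready, "/")
    if ($3 != "Completed" && ready[1] != ready[2]) print $1
  }'
}

# Example call (left commented out so the definition has no side effects):
# kubectl get pods --no-headers | not_ready_pods
```

Empty output means every pod is either fully ready or Completed. Any name that is printed is a candidate for kubectl events or kubectl logs.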
Configure DNS Resolution#
Display the ingress resources:
kubectl get ingress
The following is an example output.
NAME                            CLASS    HOSTS                      ADDRESS   PORTS   AGE
nemo-microservices-helm-chart   <none>   nim.test,data-store.test             80      34m
Export an environment variable with the accessible IP address of your ingress controller:
export NEMO_HOST=$(minikube ip)
Add host name entries to the /etc/hosts file that map the *.test ingress hosts to the accessible IP address. Make a backup of the /etc/hosts file before you make the changes.
sudo cp /etc/hosts /etc/hosts.bak
echo -e "$NEMO_HOST nemo.test\n$NEMO_HOST nim.test\n$NEMO_HOST data-store.test\n" | sudo tee -a /etc/hosts
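Because tee -a always appends, running the command above a second time adds duplicate entries. The following is a minimal sketch of a variant that prints only the entries that are not yet in the hosts file. The helper name missing_host_entries is hypothetical.

```shell
# Hypothetical helper: for each host name, prints an "IP host" line unless the
# hosts file already lists that name as a whole field on a non-comment line.
# Arguments: hosts file path, IP address, then one or more host names.
missing_host_entries() {
  hosts_file=$1
  ip=$2
  shift 2
  for host in "$@"; do
    if ! awk -v h="$host" '
      $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
      END { exit !found }
    ' "$hosts_file"; then
      printf '%s %s\n' "$ip" "$host"
    fi
  done
}

# Example call (left commented out so the definition has no side effects):
# missing_host_entries /etc/hosts "$NEMO_HOST" nemo.test nim.test data-store.test | sudo tee -a /etc/hosts
```

The helper only checks whether a host name is present. It does not update an existing entry that points at an outdated IP address.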
To learn more about how the hosts and their default path rules are configured, refer to Ingress Setup for Production Environment.
Tip
After you complete the steps in this section, the minikube cluster is ready with the NeMo microservices platform installed. Proceed to the Beginner Tutorials to learn how to use the capabilities of the NeMo microservices.
Clean Up#
After you’re done with the Beginner Tutorials, delete the minikube cluster to clean up.
Warning
This deletes the minikube cluster, the NeMo microservices platform installed on it, and all associated resources. Do not run this command unless you are done with the tutorial and want to delete the minikube cluster.
minikube delete
Restore the /etc/hosts file from the backup to remove the host name entries for the *.test ingress hosts.
sudo cp /etc/hosts.bak /etc/hosts
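Restoring the backup also discards any other changes made to /etc/hosts since the backup was taken. If that is a concern, the following minimal sketch removes only the three entries added on this page. The helper name strip_nemo_hosts and the hosts.clean file name are hypothetical.

```shell
# Hypothetical helper: copies a hosts file from stdin to stdout and drops each
# entry that names nemo.test, nim.test, or data-store.test.
strip_nemo_hosts() {
  awk '
    $1 ~ /^#/ { print; next }   # keep comment lines unchanged
    {
      for (i = 2; i <= NF; i++)
        if ($i == "nemo.test" || $i == "nim.test" || $i == "data-store.test") next
      print
    }
  '
}

# Example call (left commented out so the definition has no side effects):
# strip_nemo_hosts < /etc/hosts > hosts.clean
```

Review hosts.clean before copying it over /etc/hosts with sudo cp.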
Deploy the NeMo Microservices Platform to a Production-Grade Kubernetes Cluster#
For more information about deploying the NeMo microservices platform to a production-grade Kubernetes cluster, proceed to the Admin Setup section.