Demo Cluster Requirements

Meet the following requirements to set up a demo cluster with the NeMo microservices platform and run the Beginner Platform Tutorials.

Note

The demo cluster requirements are specific to running the getting started tutorials with the Llama 3.1 8B Instruct model. To use a larger model, re-evaluate and adjust the requirements accordingly.

System Requirements

The following are the common requirements for running the Beginner Platform Tutorials.

  • A single-node NVIDIA GPU cluster on a Linux host with cluster-admin permissions.

  • At least 300 GB of free disk space.

  • Two NVIDIA GPUs (B200, A100 80 GB, or H100 80 GB) with no other workloads running on them:

    • One GPU for model fine-tuning.

    • One GPU for a meta/llama-3.1-8b-instruct NIM microservice for inference.
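The disk and GPU requirements above can be checked before you start. The following is a minimal sketch, not an official preflight tool: it assumes `nvidia-smi` is on the PATH and that the root file system (`/`) is where the cluster stores its data. The function names and the `/` default are hypothetical, so adjust them to your host.

```python
import shutil
import subprocess

MIN_FREE_DISK_GB = 300  # free disk space required by the list above
MIN_GPU_COUNT = 2       # one GPU for fine-tuning, one for the NIM microservice


def free_disk_gb(path="/"):
    """Return free space in GB for the file system holding `path` (assumed location)."""
    return shutil.disk_usage(path).free / 10**9


def parse_gpu_list(csv_text):
    """Parse `name, memory.total` CSV rows from nvidia-smi into (name, MiB) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        name, memory_mib = (field.strip() for field in line.rsplit(",", 1))
        gpus.append((name, int(memory_mib)))
    return gpus


def query_gpus():
    """Ask nvidia-smi for GPU names and memory; return [] if it is unavailable."""
    command = [
        "nvidia-smi",
        "--query-gpu=name,memory.total",
        "--format=csv,noheader,nounits",
    ]
    try:
        result = subprocess.run(command, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_gpu_list(result.stdout)


def check_host(free_gb, gpus):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if free_gb < MIN_FREE_DISK_GB:
        problems.append(f"only {free_gb:.0f} GB free, need {MIN_FREE_DISK_GB} GB")
    if len(gpus) < MIN_GPU_COUNT:
        problems.append(f"found {len(gpus)} GPU(s), need {MIN_GPU_COUNT}")
    return problems


if __name__ == "__main__":
    gpus = query_gpus()
    for name, memory_mib in gpus:
        print(f"GPU: {name} ({memory_mib} MiB)")
    for problem in check_host(free_disk_gb(), gpus):
        print(f"Requirement not met: {problem}")
```

The sketch only counts GPUs and prints their names and memory. It does not verify the GPU model or whether other workloads are already running on the GPUs, so confirm those by hand.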

Software Requirements

NVIDIA developed and tested this tutorial on a minikube cluster that meets the following prerequisites.

The minikube cluster setup tutorial uses the following minikube features:

  • The minikube ingress addon.

  • Standard storage class using host path volumes provided by the default storage provisioner.

    The host file system for the host path volumes must support file locking. During customization with NeMo Customizer, NeMo Operator starts an entity handler pod that runs the Hugging Face CLI. The CLI requires a file system, such as EXT4, that supports file locking.
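To check the file-locking requirement ahead of time, you can probe the directory that will back the host path volumes. The following is a minimal sketch that assumes a successful `flock`-style advisory lock is a reasonable proxy for what the Hugging Face CLI needs. The function name is hypothetical, and the temporary directory in the example is a stand-in for your actual volume path.

```python
import fcntl
import tempfile


def supports_file_locking(directory):
    """Return True if an exclusive, non-blocking flock succeeds on a temp file in `directory`."""
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        try:
            fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # The file system rejected the lock request (for example, locking is unsupported).
            return False
        fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
        return True


if __name__ == "__main__":
    # Stand-in path: replace with the host directory that backs your host path volumes.
    candidate = tempfile.gettempdir()
    print(f"{candidate}: file locking supported = {supports_file_locking(candidate)}")
```

A `False` result suggests moving the volumes to a file system such as EXT4. A `True` result shows only that basic advisory locking works in that directory.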