Hardware and Software Requirements for NeMo Microservices#

This document outlines the hardware and software requirements for running the NeMo microservices.

Requirements for NeMo Microservices Platform#

The following are the hardware and software requirements for installing the NeMo microservices as a platform.

Hardware Requirements#

| Component | Minimum Requirement | Recommended |
|-----------|---------------------|-------------|
| System | Single-node GPU cluster on Linux | Enterprise-grade multi-node GPU cluster on Linux |
| GPUs | Two A100 80GB or H100 80GB GPUs | Multiple A100 80GB or H100 80GB GPUs |
| Storage | 200 GB or more of free disk space | High-performance storage system |
| Network | Standard datacenter connectivity | High-bandwidth, low-latency networking |
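The 200 GB free-space minimum can be checked before installation. The following is a minimal sketch rather than an official tool: the function names are hypothetical, the path to inspect is an assumption (use the mount point that will back your node storage), and it treats 1 GB as 10**9 bytes because the table does not say whether decimal or binary units are meant.

```python
import shutil

# Minimum free disk space from the hardware requirements table.
REQUIRED_FREE_GB = 200


def free_gb(path="/"):
    """Return free space at `path` in decimal gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 10**9


def has_enough_space(free, required=REQUIRED_FREE_GB):
    """Return True when the measured free space meets the minimum."""
    return free >= required


if __name__ == "__main__":
    measured = free_gb("/")  # assumption: "/" is the volume of interest
    status = "OK" if has_enough_space(measured) else "below minimum"
    print(f"{measured:.1f} GB free: {status}")
```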

Software Requirements#

| Component | Requirement | Notes |
|-----------|-------------|-------|
| Operating System | Linux with cluster-admin level permissions | Base for container deployments. |
| Container Runtime | NVIDIA Container Toolkit v1.16.2 or later | Required for building and running GPU-accelerated containers. See Installing the NVIDIA Container Toolkit for details. |
| NVIDIA Driver | Version 560.35.03 or later | Required for GPU access. See NVIDIA Driver Installation Guide for details. |
| Kubernetes | Version 1.26.0 or later | Required for cluster management. See Kubernetes Installation Guide for details. |
| Helm | Refer to Helm Charts Prerequisites in the NGC Catalog User Guide | Required for deployment management. See Helm Installation Guide for details. |
| NGC Access | Valid NGC API key | Required for accessing Helm charts and container images from NGC Catalog. |
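Because three of these requirements are "version X or later", a small comparison helper can make a pre-install check less error-prone than comparing version strings by eye. The sketch below is illustrative only: the component keys and function names are hypothetical, and it assumes you have already obtained the installed version strings by whatever means your environment provides. It compares only the leading dotted numeric part, so suffixes such as `-rc.1` are ignored.

```python
import re

# Minimum versions copied from the software requirements table.
# The dictionary keys are hypothetical labels, not official package names.
MINIMUMS = {
    "nvidia-container-toolkit": "1.16.2",
    "nvidia-driver": "560.35.03",
    "kubernetes": "1.26.0",
}


def parse_version(text):
    """Turn 'v1.26.0' or '560.35.03' into a tuple of ints, padded to 3 parts."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    parts = [int(p) for p in match.group(1).split(".")]
    parts += [0] * (3 - len(parts))  # so that '1.26' compares equal to '1.26.0'
    return tuple(parts)


def meets_minimum(component, installed):
    """Return True if `installed` is at least the documented minimum."""
    return parse_version(installed) >= parse_version(MINIMUMS[component])


if __name__ == "__main__":
    # Example inputs; replace with the versions reported by your own cluster.
    print(meets_minimum("kubernetes", "v1.29.4"))
    print(meets_minimum("nvidia-driver", "550.54.15"))
```

Comparing integer tuples rather than raw strings avoids the common mistake of treating "1.9" as newer than "1.26".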

Service-specific Requirements#

The following sections describe the hardware and software requirements for installing the NeMo microservices individually.

NeMo Customizer#

The NeMo Customizer service enables model fine-tuning.

Hardware Requirements:

  • NVIDIA A100 80GB (Ampere, compute capability 8.0) or H100 80GB (Hopper, compute capability 9.0) GPUs.

  • Support for distributed training with high-speed interconnects on supported clusters (Azure AKS, AWS EKS, Oracle OKE).

  • Sufficient memory to hold model weights and training data.

  • Recommended: 8+ CPU cores, 32+ GB memory per training node.

  • Model checkpoint storage: 25 GB to 100 GB or more, depending on which models are offered and how many.

  • Ephemeral node storage: 200 GB for decompressing training images and other storage needs.

Software Requirements:

  • NVIDIA Container Toolkit for GPU access.

  • Container storage interface (CSI) that supports file locking.

  • Integration with NeMo Data Store service for storing files such as training data and model artifacts.

  • Integration with NeMo Entity Store service for metadata and project management.

  • Optional integration with NeMo Evaluator service for model quality assessment.
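Whether a given volume supports file locking can be probed directly from a pod or node that mounts it. The following is a rough sketch under stated assumptions: the function name is hypothetical, it tests POSIX advisory locks through Python's `fcntl.lockf`, and it runs in a single process, so it only shows that the filesystem accepts lock requests, not that locks are honored correctly across nodes.

```python
import fcntl
import os
import tempfile


def supports_file_locking(directory):
    """Return True if a POSIX advisory lock can be taken on a file in `directory`."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Request an exclusive, non-blocking lock, then release it.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    except OSError:
        # Filesystems without lock support typically fail here.
        return False
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    # Assumption: replace with the mount path of the volume under test.
    print(supports_file_locking(tempfile.gettempdir()))
```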

Datastore#

The NeMo Data Store service stores files such as training data and model artifacts.

Hardware Requirements:

  • No specific GPU requirements.

  • Sufficient storage for data artifacts.

  • Recommended: 2 CPU cores, 1 GB memory.

  • Storage: minimum 20 GB, scaled according to dataset sizes.

Software Requirements:

  • Ubuntu 22.04 base container.

  • Storage for file-based artifacts.

  • Git and Git LFS support.

  • Persistent volumes for artifact storage.

  • Hugging Face libraries for dataset management.

  • Integration with NeMo Entity Store service for metadata management.

  • File locking support in storage system.

Deployment Management#

The NeMo Deployment Management service handles the deployment of NIM for LLMs.

Hardware Requirements:

  • No GPU requirements.

  • Minimal compute resources.

  • Recommended: 1 CPU core, 500 MiB memory.

  • Storage: <2 GiB for operational data.

Software Requirements:

  • Lightweight, minimal container image.

  • Kubernetes cluster with proper RBAC setup.

  • Access to create and manage Kubernetes custom resources.

  • Storage class support for dynamic provisioning.

  • Integration with NeMo Entity Store service for model and deployment metadata.

  • Access to NGC container registry.

  • Optional integration with NeMo Evaluator service for deployment quality gates.

NIM Proxy#

The NIM Proxy service provides a unified API endpoint for model inference.

Hardware Requirements:

  • No GPU requirements.

  • Minimal compute resources.

  • Recommended: 1 CPU core, 500 MiB memory.

  • Storage: <2 GiB for operational data.

Software Requirements:

  • Lightweight, minimal container image.

  • Connection to deployed model endpoints.

  • Load balancing capabilities for multiple model endpoints.

  • Ingress controller for external access.

  • Integration with NeMo Deployment Management service for endpoint information.

  • Integration with NeMo Entity Store service for endpoint metadata.

  • Optional integration with NeMo Guardrails service for output filtering.

Evaluator#

The NeMo Evaluator service handles model quality assessment.

Hardware Requirements:

  • No GPU requirements for the service itself (inference is handled by NIM).

  • Evaluator Container: 2 CPU cores, 1 GB memory recommended.

  • Evaluation Jobs: 2 CPU cores, 1 GB memory recommended per job.

  • Storage: 10 GB or more for evaluation results and metrics.

Software Requirements:

  • PostgreSQL database: 2 CPU cores, 1 GB memory recommended.

  • Argo Workflows: 1 CPU core, 2 GB memory recommended.

  • Milvus (optional; used for Retriever and RAG evaluations): 64 CPU cores, 32 GB memory recommended.

  • Integration with NIM Proxy service for model inference.

  • Integration with NeMo Data Store service for storing evaluation results and accessing datasets.

  • Integration with NeMo Entity Store service for model metadata.
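For capacity planning, the recommended figures above can be added up. The sketch below is a back-of-the-envelope estimate, not an official sizing formula: it assumes the per-component recommendations are simply additive and that every evaluation job runs concurrently. The function and constant names are hypothetical.

```python
# (CPU cores, memory in GB) recommendations from the Evaluator lists above.
EVALUATOR_CONTAINER = (2, 1)
POSTGRESQL = (2, 1)
ARGO_WORKFLOWS = (1, 2)
PER_EVALUATION_JOB = (2, 1)
MILVUS = (64, 32)  # optional component


def evaluator_footprint(concurrent_jobs, with_milvus=False):
    """Return (total CPU cores, total memory in GB) for the given setup."""
    parts = [EVALUATOR_CONTAINER, POSTGRESQL, ARGO_WORKFLOWS]
    parts += [PER_EVALUATION_JOB] * concurrent_jobs
    if with_milvus:
        parts.append(MILVUS)
    cpu = sum(c for c, _ in parts)
    memory_gb = sum(m for _, m in parts)
    return cpu, memory_gb


if __name__ == "__main__":
    print(evaluator_footprint(3))                    # (11, 7)
    print(evaluator_footprint(3, with_milvus=True))  # (75, 39)
```

With no jobs running, the baseline under these assumptions is 5 CPU cores and 4 GB of memory; each concurrent job adds 2 cores and 1 GB.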

Guardrails#

The NeMo Guardrails service provides safety and compliance controls for model outputs.

Hardware Requirements:

  • No GPU requirements.

  • Minimal compute resources.

  • Recommended: 1 CPU core, 500 MiB memory.

  • Storage: <2 GiB for embedding models and configurations.

Software Requirements:

  • Pre-loaded embedding model for semantic analysis.

  • Configuration store for guardrails definitions.

  • Connection to inference endpoints.

  • Policy storage and management.

  • Integration with NIM Proxy service for inference request interception.

  • Integration with NeMo Entity Store service for configuration storage.

  • Optional integration with NeMo Evaluator service for safety metrics.

Entity Store#

The NeMo Entity Store service manages metadata for models, datasets, and other entities.

Hardware Requirements:

  • No GPU requirements.

  • Minimal compute resources.

  • Recommended: 1 CPU core, 500 MiB memory.

  • Storage: <2 GiB for entity metadata.

Software Requirements:

  • Database for metadata storage.

  • API service for entity management.

  • Persistent storage for metadata.

  • No dependencies on other NeMo microservices (core service).

NeMo Operator#

The NeMo Operator handles custom resource definitions for NeMo Customizer jobs.

Hardware Requirements:

  • No GPU requirements.

  • Minimal compute resources.

  • Recommended: 1 CPU core, 500 MiB memory.

  • Storage: <2 GiB for operational data.

Software Requirements:

  • Lightweight, minimal container image.

  • Kubernetes cluster with proper RBAC setup.

  • Access to create and manage Kubernetes custom resources.

  • Storage class support for dynamic provisioning.

  • Integration with NeMo Customizer service for job management.

  • Integration with NeMo Entity Store service for job metadata.

Client Requirements#

The following are the client-side requirements for data scientists, machine learning engineers (MLEs), and other end users accessing the platform.

| Component | Requirement | Purpose |
|-----------|-------------|---------|
| Kubernetes Access | kubectl and relevant credentials | Monitoring and troubleshooting |
| NGC API Key | Valid credentials | Accessing models and containers |
| REST API Client | HTTP client or command-line tools | Interacting with services |
| Hugging Face CLI | Installed Python package | Dataset management |
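A quick way to confirm that a workstation has the command-line tools listed above is to look them up on `PATH`. This is a minimal sketch with assumed executable names: `kubectl` comes from the table, while `huggingface-cli` and `curl` are assumptions standing in for the Hugging Face CLI and an HTTP client, so adjust the list to match the tools you actually use. It does not validate credentials or the NGC API key.

```python
import shutil

# Executable names are assumptions; edit to match your environment.
REQUIRED_TOOLS = ["kubectl", "huggingface-cli", "curl"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing client tools:", ", ".join(missing))
    else:
        print("All client tools found on PATH.")
```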