Deploy NeMo Curator
Use the following admin guides to set up NeMo Curator in a production environment.
Prerequisites
Before deploying NeMo Curator in a production environment, review the following requirements:
System: Ubuntu 22.04/20.04, Python 3.10+
Hardware: Multi-core CPU, 16 GB+ RAM (optional: NVIDIA GPU with 16 GB+ VRAM)
Software: Dask, container runtime (Docker/Singularity), cluster management tools
Infrastructure: Shared storage, high-bandwidth networking
For detailed system, hardware, and software requirements, see Production Deployment Requirements.
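As a rough illustration of the checklist above, the following sketch reports a few of the listed prerequisites on a single host using only the Python standard library. It is not part of NeMo Curator: the function names `total_ram_gb` and `preflight` are hypothetical, the presence of `nvidia-smi` on the PATH is used only as a proxy for an installed NVIDIA driver, and the RAM lookup assumes a Linux-like system.

```python
import os
import platform
import shutil
import sys

# Minimum Python version taken from the prerequisites list above.
MIN_PYTHON = (3, 10)


def total_ram_gb():
    """Return total physical RAM in GiB, or None if it cannot be determined.

    Uses os.sysconf, which is available on Linux; other platforms may not
    support these names, in which case None is returned.
    """
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None
    return pages * page_size / 1024**3


def preflight():
    """Collect a small, hypothetical preflight report for one host."""
    return {
        "os": platform.system(),
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "cpu_count": os.cpu_count(),
        # Reported value is usually slightly below the nominal RAM size,
        # so compare it against the 16 GB guideline by eye.
        "ram_gb": total_ram_gb(),
        # Assumption: nvidia-smi on PATH implies an NVIDIA driver is installed.
        "gpu_driver_found": shutil.which("nvidia-smi") is not None,
        "container_runtime_found": any(
            shutil.which(cmd) is not None for cmd in ("docker", "singularity")
        ),
    }


if __name__ == "__main__":
    for key, value in preflight().items():
        print(f"{key}: {value}")
```

Running the script on each node gives a quick view of which items need attention. It does not check shared storage, networking, or cluster management tools, which depend on the specific site.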
Deployment Options
Deploy NeMo Curator on Kubernetes clusters using the Dask Operator, the GPU Operator, and PersistentVolumeClaim (PVC) storage. Covers setup, storage, cluster creation, module execution, and cleanup.
Run NeMo Curator on Slurm clusters with shared filesystems. Covers job scripts, Dask cluster setup, module execution, monitoring, and advanced Python-based job submission.
After Deployment
Once your infrastructure is running, configure NeMo Curator for your environment. See the Configuration Guide for deployment-specific settings, environment variables, and storage credentials.