Production Deployment Requirements#
This page details the system, hardware, and software requirements for deploying NeMo Curator in production environments.
System Requirements#
Operating System: Ubuntu 22.04/20.04 (recommended)
Python: 3.10, 3.11, or 3.12
packaging >= 22.0
Hardware Requirements#
CPU Requirements#
Multi-core CPU; the number of cores determines how many workers can process data in parallel
Memory: Minimum 16GB RAM recommended for text processing
For large datasets: 32GB+ RAM recommended
Memory requirements scale with dataset size and number of workers
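A quick pre-flight check of available cores and RAM can confirm that a node meets these recommendations before you size workers. The sketch below assumes the third-party psutil package is installed; it is not among the dependencies listed on this page.

```python
# Pre-flight check of CPU cores and RAM before sizing workers.
# Assumes the third-party `psutil` package is installed (pip install psutil);
# it is not listed among the requirements on this page.
import psutil

cores = psutil.cpu_count(logical=False) or psutil.cpu_count()
ram_gb = psutil.virtual_memory().total / 1024**3

print(f"Physical CPU cores: {cores}")
print(f"Total RAM: {ram_gb:.1f} GB")

# Guidance from the requirements above: 16 GB minimum for text processing,
# 32 GB+ recommended for large datasets.
if ram_gb < 16:
    print("Warning: below the 16 GB minimum recommended for text processing.")
elif ram_gb < 32:
    print("OK for moderate workloads; 32 GB+ recommended for large datasets.")
```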
GPU Requirements (Optional but Recommended)#
GPU: NVIDIA GPU with Volta™ architecture or higher
Compute capability 7.0+ required
Memory: Minimum 16GB VRAM for GPU-accelerated operations
For video processing: 21GB+ VRAM (reducible with optimization)
For large-scale deduplication: 32GB+ VRAM recommended
CUDA: 12.0 or later with a compatible NVIDIA driver
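To confirm that each GPU meets the compute-capability, VRAM, and CUDA-driver requirements above, you can query the NVIDIA Management Library. This is a minimal sketch assuming the nvidia-ml-py package (imported as pynvml) is installed, which is not listed among the requirements on this page.

```python
# Verify compute capability, VRAM, and the CUDA version supported by the driver.
# Assumes the `nvidia-ml-py` package (imported as pynvml) is installed.
import pynvml

pynvml.nvmlInit()
cuda_driver = pynvml.nvmlSystemGetCudaDriverVersion()  # e.g. 12040 -> CUDA 12.4
print(f"CUDA supported by driver: {cuda_driver // 1000}.{(cuda_driver % 1000) // 10}")

for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    name = name.decode() if isinstance(name, bytes) else name
    major, minor = pynvml.nvmlDeviceGetCudaComputeCapability(handle)
    vram_gb = pynvml.nvmlDeviceGetMemoryInfo(handle).total / 1024**3
    ok = "OK" if (major, minor) >= (7, 0) and vram_gb >= 16 else "below recommendation"
    print(f"GPU {i}: {name}, compute capability {major}.{minor}, "
          f"{vram_gb:.0f} GB VRAM ({ok})")

pynvml.nvmlShutdown()
```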
Software Dependencies#
Core Dependencies#
Python 3.10+ with required packages for distributed computing
RAPIDS libraries (cuDF) for GPU-accelerated deduplication operations
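A short import check in the target environment can confirm that the interpreter version and the RAPIDS stack are in place. This is only a sanity check, not an exhaustive dependency audit.

```python
# Sanity-check the Python version and the GPU-accelerated dependencies.
import sys

print(f"Python: {sys.version.split()[0]}")  # should be 3.10, 3.11, or 3.12

try:
    import cudf
    print(f"cuDF available: {cudf.__version__}")
except ImportError:
    print("cuDF not installed: GPU-accelerated deduplication will be unavailable.")
```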
Container Support (Recommended)#
Docker or Podman for containerized deployment
Access to NVIDIA NGC registry for official containers
Network Requirements#
Reliable network connectivity between nodes
High-bandwidth network for large dataset transfers
InfiniBand recommended for multi-node GPU clusters
Storage Requirements#
Capacity: plan for 3-5x the size of the input datasets (a sizing sketch follows this section), to cover:
Input data storage
Intermediate processing files
Output data storage
Performance: High-throughput storage system recommended
SSD storage preferred for frequently accessed data
Parallel filesystem for multi-node access
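As a quick worked example of the 3-5x capacity rule above, the sketch below estimates how much storage to provision for a given input size; the helper name and multipliers are illustrative only, not part of NeMo Curator.

```python
# Apply the 3-5x storage rule of thumb to an input corpus size (illustrative).
def required_storage_tb(input_tb: float, low: float = 3.0, high: float = 5.0) -> tuple[float, float]:
    """Return the (minimum, comfortable) storage capacity in TB."""
    return input_tb * low, input_tb * high

low_tb, high_tb = required_storage_tb(10.0)  # a hypothetical 10 TB input corpus
print(f"Plan for roughly {low_tb:.0f}-{high_tb:.0f} TB of storage.")
```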
Deployment-Specific Requirements#
Resource quotas configured for GPU and memory allocation
Performance Considerations#
Memory Management#
Monitor memory usage across distributed workers
Configure appropriate memory limits per worker
Use memory-efficient data formats (e.g., Parquet)
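A minimal sketch of these settings, assuming a Dask-based distributed setup (the TCP/UCX protocol choice under Network Optimization below also points to Dask); worker counts, memory limits, and paths here are illustrative and should be sized to your hardware.

```python
# Local Dask cluster with an explicit per-worker memory limit (illustrative).
from dask.distributed import Client, LocalCluster

cluster = LocalCluster(
    n_workers=4,           # one worker per group of cores
    threads_per_worker=2,
    memory_limit="8GiB",   # per-worker cap; workers spill/pause near this limit
)
client = Client(cluster)
print(client)

# Parquet keeps the memory footprint lower than JSONL/CSV for intermediate data.
import dask.dataframe as dd

df = dd.read_parquet("data/input/")              # illustrative path
df.to_parquet("data/intermediate/", write_index=False)
```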
GPU Optimization#
Ensure CUDA drivers are compatible with RAPIDS versions
Configure GPU memory pools (RMM) for optimal performance
Monitor GPU utilization and memory usage
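Below is a minimal sketch of enabling an RMM memory pool so that GPU allocations reuse a pre-reserved block of VRAM instead of issuing repeated cudaMalloc calls; the pool size is illustrative and should be set according to your GPU's VRAM. If you launch workers with dask-cuda, the equivalent per-worker setting is the rmm_pool_size argument shown in the Network Optimization example below.

```python
# Enable an RMM memory pool so cuDF allocations come from a pre-reserved slab
# of GPU memory. The pool size below is illustrative.
import rmm

rmm.reinitialize(
    pool_allocator=True,
    initial_pool_size=8 * 1024**3,   # start with an 8 GiB pool
)

import cudf  # subsequent cuDF allocations now draw from the RMM pool
```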
Network Optimization#
Use high-bandwidth interconnects for multi-node deployments
Configure appropriate network protocols (TCP vs UCX)
Optimize data transfer patterns to minimize network overhead
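A minimal sketch of a GPU worker cluster using UCX instead of TCP, assuming the dask-cuda and ucx-py packages are installed; these flags vary across dask-cuda versions, so verify them against the version you deploy.

```python
# GPU cluster using the UCX transport with NVLink/InfiniBand enabled
# (illustrative; requires dask-cuda and ucx-py).
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

cluster = LocalCUDACluster(
    protocol="ucx",           # UCX transport instead of the default TCP
    enable_nvlink=True,       # NVLink for intra-node GPU-to-GPU transfers
    enable_infiniband=True,   # InfiniBand for inter-node transfers
    rmm_pool_size="16GB",     # per-GPU RMM pool (see GPU Optimization above)
)
client = Client(cluster)
```

UCX with NVLink and InfiniBand is mainly worthwhile on multi-GPU, multi-node clusters; on a single node connected by plain Ethernet, the default TCP protocol is usually sufficient.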