Requirements | NeMo Curator

This page lists the system, hardware, and software requirements for deploying NeMo Curator in production environments.

System Requirements

Operating System: Ubuntu 22.04/20.04 (recommended)
Python: Python 3.10, 3.11, or 3.12
- packaging >= 22.0

Python 3.10 support will be removed in NeMo Curator 26.06; 26.04 is the last release that supports it. Standardize production environments on Python 3.11 or later before upgrading to 26.06. See the 26.04 release notes for details.
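As a quick pre-install check, the version constraints above can be verified from Python itself. This is a minimal sketch using only the standard library, not part of NeMo Curator; the helper names (`python_supported`, `python_deprecated`, `packaging_ok`) are hypothetical and chosen for illustration.

```python
import re
import sys
from importlib import metadata

# Values taken from the requirements listed above.
SUPPORTED_PYTHON = {(3, 10), (3, 11), (3, 12)}
MIN_PACKAGING = (22, 0)


def python_supported(version_info=sys.version_info):
    """Return True if the interpreter's major.minor version is supported."""
    return (version_info[0], version_info[1]) in SUPPORTED_PYTHON


def python_deprecated(version_info=sys.version_info):
    """Return True for Python 3.10, which 26.06 will no longer support."""
    return (version_info[0], version_info[1]) == (3, 10)


def packaging_ok(installed=None):
    """Compare the major.minor of the installed `packaging` against 22.0."""
    if installed is None:
        try:
            installed = metadata.version("packaging")
        except metadata.PackageNotFoundError:
            return False
    match = re.match(r"(\d+)(?:\.(\d+))?", installed)
    if match is None:
        return False
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return (major, minor) >= MIN_PACKAGING


if __name__ == "__main__":
    print("Python supported:", python_supported())
    print("Python 3.10 (deprecated):", python_deprecated())
    print("packaging >= 22.0:", packaging_ok())
```

Running the script on a target host prints three booleans. A `True` on the second line means the environment should move to Python 3.11 or later before upgrading to 26.06.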

Hardware Requirements

CPU Requirements

Multi-core CPU with sufficient cores for parallel processing
Memory: At least 16GB RAM recommended for text processing
- For large datasets: 32GB+ RAM recommended
- Memory requirements scale with dataset size and number of workers

GPU Requirements (Optional but Recommended)

GPU: NVIDIA GPU with Volta™ architecture or higher
- Compute capability 7.0+ required
- Memory: Minimum 16GB VRAM for GPU-accelerated operations
- For video processing: 21GB+ VRAM (reducible with optimization)
- For large-scale deduplication: 32GB+ VRAM recommended
CUDA: CUDA 12.0 or above with compatible drivers
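The GPU thresholds above can be checked against what `nvidia-smi` reports. This is a sketch under stated assumptions: it assumes a driver recent enough to expose the `compute_cap` query field, and the helper names are hypothetical. The 16 GiB default is also an assumption about how to read "16GB"; cards marketed as 16 GB can report slightly less usable memory, so adjust `min_vram_mib` as needed.

```python
import shutil
import subprocess

# Query one CSV row per GPU: name, compute capability, total memory in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,compute_cap,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(text):
    """Parse nvidia-smi CSV rows into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Split from the right so a comma in the GPU name cannot break parsing.
        name, cap, mem = (field.strip() for field in line.rsplit(",", 2))
        gpus.append({
            "name": name,
            "compute_cap": tuple(int(part) for part in cap.split(".")),
            "memory_mib": int(mem),
        })
    return gpus


def meets_requirements(gpu, min_cap=(7, 0), min_vram_mib=16 * 1024):
    """Check compute capability 7.0+ and a minimum VRAM threshold (assumed 16 GiB)."""
    return gpu["compute_cap"] >= min_cap and gpu["memory_mib"] >= min_vram_mib


def query_gpus():
    """Return parsed GPU info, or an empty list if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return []
    return parse_gpu_rows(subprocess.check_output(QUERY, text=True))


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu["name"], "OK" if meets_requirements(gpu) else "below requirements")
```

For video processing or large-scale deduplication, pass a higher `min_vram_mib` that matches the figures listed above.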

Software Dependencies

Core Dependencies

Python 3.10, 3.11, or 3.12 with required packages for distributed computing
RAPIDS libraries (cuDF) for GPU-accelerated deduplication operations

Container Support (Recommended)

Docker or Podman for containerized deployment
Access to NVIDIA NGC registry for official containers

Network Requirements

Reliable network connectivity between nodes
High-bandwidth network for large dataset transfers
InfiniBand recommended for multi-node GPU clusters

Storage Requirements

Capacity: Plan for 3-5x the size of the input datasets, covering:
- Input data storage
- Intermediate processing files
- Output data storage
Performance: High-throughput storage system recommended
- SSD storage preferred for frequently accessed data
- Parallel filesystem for multi-node access
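The 3-5x capacity guideline above translates into simple arithmetic. This is a minimal sketch with hypothetical helper names (`storage_budget`, `has_headroom`); it checks free space on a single path and does not model how a parallel filesystem distributes data.

```python
import shutil


def storage_budget(input_bytes, low=3, high=5):
    """Return the (low, high) capacity range implied by the 3-5x guideline."""
    return input_bytes * low, input_bytes * high


def has_headroom(path, input_bytes, factor=5):
    """Return True if `path` has at least `factor` times the input size free."""
    return shutil.disk_usage(path).free >= input_bytes * factor


if __name__ == "__main__":
    two_tb = 2 * 10**12  # example: a 2 TB input dataset
    low, high = storage_budget(two_tb)
    print(f"Plan for {low / 10**12:.0f}-{high / 10**12:.0f} TB of storage")
    print("Enough free space here:", has_headroom(".", two_tb))
```

For a 2 TB input dataset this gives a planning range of 6 to 10 TB across input, intermediate, and output data.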

Deployment-Specific Requirements

Resource quotas configured for GPU and memory allocation

Performance Considerations

Memory Management

Monitor memory usage across distributed workers
Configure appropriate memory limits per worker
Use memory-efficient data formats (e.g., Parquet)
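One way to derive a starting per-worker memory limit is to reserve a share of host RAM for the operating system and other processes, then split the remainder evenly across workers. This is a sketch of that heuristic, not a NeMo Curator setting; the function name and the 20% default reserve are assumptions to tune per workload.

```python
def per_worker_memory_limit(total_bytes, num_workers, reserve_fraction=0.2):
    """Split usable RAM evenly across workers after reserving a fraction.

    reserve_fraction is an assumed safety margin for the OS and other
    processes; it is not a value prescribed by NeMo Curator.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if not 0 <= reserve_fraction < 1:
        raise ValueError("reserve_fraction must be in [0, 1)")
    usable = int(total_bytes * (1 - reserve_fraction))
    return usable // num_workers


if __name__ == "__main__":
    gib = 1024**3
    limit = per_worker_memory_limit(32 * gib, num_workers=8, reserve_fraction=0.25)
    print(f"Per-worker limit: {limit / gib:.1f} GiB")
```

The result is only a starting point. Monitor actual usage across workers and adjust the limit, since memory requirements scale with dataset size and worker count.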

GPU Optimization

Ensure CUDA drivers are compatible with RAPIDS versions
Configure GPU memory pools (RMM) for optimal performance
Monitor GPU utilization and memory usage

Network Optimization

Use high-bandwidth interconnects for multi-node deployments
Configure appropriate network protocols (TCP vs UCX)
Optimize data transfer patterns to minimize network overhead