---
description: >-
  Technical reference for NeMo Curator's infrastructure components including
  distributed computing, memory management, and GPU acceleration
categories:
  - reference
tags:
  - infrastructure
  - distributed
  - gpu-accelerated
  - memory-management
  - docker
  - performance
personas:
  - admin-focused
  - mle-focused
  - devops-focused
difficulty: reference
content_type: reference
modality: universal
---

# Infrastructure References

This section provides technical reference documentation for the NeMo Curator infrastructure components used across all modalities (text, image, video).

***

## Infrastructure Components

* Optimize memory usage when processing large datasets. Tags: partitioning, batching, monitoring
* Leverage NVIDIA GPUs for faster data processing. Tags: cuda, rmm, performance
* Continue interrupted operations across large datasets. Tags: checkpoints, recovery, batching
* Available environments and configurations in NeMo Curator containers, including build arguments and video-specific environments. Tags: docker, conda, environments
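
As a rough illustration of the checkpoint-and-recovery idea listed above, the following is a minimal, library-agnostic sketch in plain Python. It does not use or represent the NeMo Curator API: the function name `process_in_batches`, the JSON checkpoint file, and the `handle_batch` callback are all hypothetical names chosen for this example. The sketch assumes work can be split into fixed-size batches and that recording completed batch indices on disk is enough to resume.

```python
import json
import tempfile
from pathlib import Path


def process_in_batches(items, batch_size, checkpoint_path, handle_batch):
    """Process items in fixed-size batches, skipping batches already marked done.

    Hypothetical helper for illustration only; not part of NeMo Curator.
    Returns the list of batch indices processed during this call.
    """
    checkpoint_path = Path(checkpoint_path)
    done = set()
    if checkpoint_path.exists():
        # Reload the indices of batches completed by an earlier run.
        done = set(json.loads(checkpoint_path.read_text()))

    processed_now = []
    for start in range(0, len(items), batch_size):
        batch_id = start // batch_size
        if batch_id in done:
            continue  # Completed before the interruption; do not redo it.
        handle_batch(items[start:start + batch_size])
        done.add(batch_id)
        # Persist after every batch so an interruption loses at most one batch.
        checkpoint_path.write_text(json.dumps(sorted(done)))
        processed_now.append(batch_id)
    return processed_now


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = Path(tmp) / "done.json"
        data = list(range(10))
        seen = []
        first = process_in_batches(data, 4, ckpt, seen.extend)
        # A second run finds every batch in the checkpoint and does no work.
        second = process_in_batches(data, 4, ckpt, seen.extend)
        print(first, second)
```

Writing the checkpoint after each batch trades some I/O for a small recovery window; a real pipeline might checkpoint less often or write the file atomically, depending on batch cost.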