Infrastructure References

This section provides technical reference documentation for NeMo Curator’s infrastructure components, which are shared across all modalities (text, image, and video).


Infrastructure Components

Memory Management

Optimize memory usage when processing large datasets

Memory Management Guide

GPU Acceleration

Leverage NVIDIA GPUs for faster data processing

GPU Processing Guide

Resumable Processing

Continue interrupted operations across large datasets

Resumable Processing

Container Environments

Environments and configurations available in NeMo Curator containers, including build arguments and video-specific environments

Container Environments