---
description: >-
  Reference documentation for container environments, configurations, and
  deployment variables in NeMo Curator
categories:
  - reference
tags:
  - docker
  - configuration
  - deployment
  - gpu-accelerated
  - environments
personas:
  - admin-focused
  - devops-focused
  - mle-focused
difficulty: reference
content_type: reference
modality: universal
---

# Container Environments

Deploy NeMo Curator in containerized environments for reproducible, scalable data curation pipelines with pre-configured dependencies and optimized runtime settings.

## Overview

NeMo Curator provides official Docker containers with all dependencies pre-installed and optimized for production workloads. Containers offer:

* **Reproducible Environments**: Consistent software stack across development, testing, and production
* **Simplified Deployment**: No manual dependency installation or environment configuration
* **GPU Acceleration**: Pre-configured CUDA, cuDNN, and NVIDIA libraries for optimal performance
* **Multi-Modal Support**: Built-in support for text, image, video, and audio curation
* **Cloud-Ready**: Compatible with Kubernetes, Docker Swarm, and cloud container orchestration services

**When to use containers:**

* Production deployments requiring consistency and reliability
* Multi-node cluster processing with identical environments
* CI/CD pipelines for automated data curation workflows
* Quick prototyping without local environment setup
* GPU-accelerated processing in cloud environments

## Available Containers

### Main NeMo Curator Container

The primary container includes comprehensive support for all curation modalities:

**Container image:** `nvcr.io/nvidia/nemo-curator:{{ container_version }}`

**Supported modalities:**

* ✅ Text curation (CPU/GPU)
* ✅ Image curation (GPU required)
* ✅ Video curation (GPU required, FFmpeg included)
* ✅ Audio curation (GPU required for ASR)

**Pre-installed components:**

* NeMo Curator with all optional dependencies (`[all]` extras)
* CUDA 12.8.1 with cuDNN
* Python 3.12 with uv package manager
* FFmpeg 8+ with NVENC support (for video processing)
* Ray, Dask, and other distributed computing frameworks
* NVIDIA-optimized Python packages
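
To fetch the image, the registry path above can be combined with a release tag. The following is a minimal sketch, assuming Docker is installed on the host. `CONTAINER_VERSION` and its `CHANGE_ME` value are placeholders standing in for `{{ container_version }}`, not a real tag.

```bash
# Assumption: CONTAINER_VERSION stands in for the {{ container_version }}
# placeholder; CHANGE_ME is not a real tag and must be replaced.
CONTAINER_VERSION="CHANGE_ME"
IMAGE="nvcr.io/nvidia/nemo-curator:${CONTAINER_VERSION}"

# Dry run: prints the command. Remove the leading `echo` to pull the image.
echo docker pull "$IMAGE"
```

Keeping the image reference in one variable makes it easier to reuse the same tag across pull, run, and orchestration steps.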

### Curator Environment

| Property         | Value                                                                                                     |
| ---------------- | --------------------------------------------------------------------------------------------------------- |
| Python Version   | 3.12                                                                                                      |
| CUDA Version     | 12.8.1 (configurable)                                                                                     |
| Operating System | Ubuntu 24.04 (configurable)                                                                               |
| Base Image       | `nvidia/cuda:${CUDA_VER}-cudnn-devel-${LINUX_VER}`                                                        |
| Package Manager  | uv (Ultrafast Python package installer)                                                                   |
| Installation     | NeMo Curator installed with all optional dependencies (`[all]` extras) using uv with NVIDIA index         |
| Environment Path | Virtual environment at `/opt/venv`. Activate with `source /opt/venv/env.sh` after entering the container. |
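
The activation step from the Environment Path row can be checked with a short script run inside the container. This sketch assumes only the `/opt/venv/env.sh` path listed in the table. The fallback message is an illustrative addition for the case where the script runs outside the container.

```bash
# Path taken from the Environment Path row of the table.
ENV_SCRIPT="/opt/venv/env.sh"

if [ -f "$ENV_SCRIPT" ]; then
  # Activate the virtual environment in the current shell.
  . "$ENV_SCRIPT"
  # Show which interpreter is active and its version
  # (the table lists Python 3.12).
  command -v python
  python --version
else
  echo "Not inside the Curator container: $ENV_SCRIPT not found"
fi
```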

***

## Container Build Arguments

The main container accepts these build-time arguments for environment customization:

| Argument           | Default       | Description              |
| ------------------ | ------------- | ------------------------ |
| `CUDA_VER`         | `12.8.1`      | CUDA version             |
| `LINUX_VER`        | `ubuntu24.04` | Base OS version          |
| `CURATOR_ENV`      | `ci`          | Curator environment type |
| `NVIDIA_BUILD_ID`  | `<unknown>`   | NVIDIA build identifier  |
| `NVIDIA_BUILD_REF` | -             | NVIDIA build reference   |
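
Build arguments such as these are typically passed with `docker build --build-arg`. The sketch below uses the defaults from the table. The Dockerfile path and the image tag are hypothetical values chosen for illustration, so adjust them to match your checkout.

```bash
# Defaults copied from the table; change them to customize the build.
CUDA_VER="12.8.1"
LINUX_VER="ubuntu24.04"
CURATOR_ENV="ci"

# Hypothetical values for illustration: set these to your Dockerfile and tag.
DOCKERFILE="path/to/Dockerfile"
IMAGE_TAG="nemo-curator:custom"

# The base image name is derived from the two version arguments.
BASE_IMAGE="nvidia/cuda:${CUDA_VER}-cudnn-devel-${LINUX_VER}"
echo "Base image: $BASE_IMAGE"

# Dry run: prints the command. Remove the leading `echo` to build.
echo docker build \
  --build-arg "CUDA_VER=$CUDA_VER" \
  --build-arg "LINUX_VER=$LINUX_VER" \
  --build-arg "CURATOR_ENV=$CURATOR_ENV" \
  -f "$DOCKERFILE" \
  -t "$IMAGE_TAG" \
  .
```

With the defaults, the derived base image is `nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04`. When overriding `CUDA_VER` or `LINUX_VER`, it is worth confirming that a matching `nvidia/cuda` tag exists before starting the build.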

***

## Environment Usage Examples

### Text Curation

Text curation runs in the default container environment and uses CPU or GPU workers, depending on the module.

### Image Curation

Image curation requires GPU-enabled workers in the container environment.
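
One way to give the container GPU access is Docker's `--gpus` flag. This sketch assumes the NVIDIA Container Toolkit is installed on the host and that `nvidia-smi` is reachable inside the container. `CHANGE_ME` is a placeholder tag, not a real release.

```bash
# Assumption: CHANGE_ME stands in for the {{ container_version }} placeholder.
CONTAINER_VERSION="CHANGE_ME"
IMAGE="nvcr.io/nvidia/nemo-curator:${CONTAINER_VERSION}"

# --gpus all exposes every host GPU to the container.
# Smoke test: list the GPUs visible inside the container, then exit.
GPU_CHECK="docker run --rm --gpus all $IMAGE nvidia-smi"
# Interactive session with the same GPU access.
GPU_SHELL="docker run --rm -it --gpus all $IMAGE bash"

# Dry run: print the commands; copy one into a terminal to execute it.
echo "$GPU_CHECK"
echo "$GPU_SHELL"
```

Dropping `--gpus all` starts the same image without GPU access, which may be enough for CPU-only text modules.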
