This guide shows how to install Curator and run your first video curation pipeline.
The example pipeline processes a list of videos, splitting each into 10‑second clips using a fixed stride. It then generates clip‑level embeddings for downstream tasks such as duplicate removal and similarity search.
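To make the fixed-stride idea concrete, here is a minimal sketch of how clip boundaries could be computed for a single video. It is not Curator's implementation: the function name `fixed_stride_spans` and the `min_len_s` cutoff for a short trailing clip are assumptions made for illustration.

```python
def fixed_stride_spans(duration_s, clip_len_s=10.0, min_len_s=2.0):
    """Return (start, end) spans in seconds that tile a video with a fixed stride.

    min_len_s is a hypothetical cutoff: a trailing clip shorter than this is dropped.
    """
    spans = []
    start = 0.0
    while start < duration_s:
        end = min(start + clip_len_s, duration_s)
        if end - start >= min_len_s:
            spans.append((start, end))
        start += clip_len_s  # stride equals clip length, so clips do not overlap
    return spans


# A 35-second video yields three full clips and one 5-second remainder.
print(fixed_stride_spans(35.0))
```

With a stride equal to the clip length, the spans never overlap, so every frame belongs to at most one clip.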
This quickstart guide demonstrates how to:
What you build: A video processing pipeline that:
To use NeMo Curator’s video curation capabilities, ensure your system meets these requirements:
h264_nvenc (recommended for performance; requires an NVENC-equipped GPU; note that A100 and H100 do not include NVENC)
libvpx-vp9 (for non-NVENC GPUs; produces VP9 in .mp4)
If uv is not installed, refer to the Installation Guide for setup instructions, or install it quickly with:
Create and activate a virtual environment, then choose an install option:
Curator’s video pipelines rely on FFmpeg for decoding and encoding. If you plan to encode clips (using --transcode-encoder h264_nvenc or --transcode-encoder libvpx-vp9), install FFmpeg with NVENC and libvpx-vp9 support. The maintained install script bundles both.
Use the maintained script in the repository to build and install FFmpeg with NVIDIA NVENC and libvpx-vp9 support. The script enables --enable-cuda-nvcc, --enable-libnpp, and --enable-libvpx.
Processing H.264/HEVC/AV1 inputs? You might still need a software decoder — even with NVENC/NVDEC.
Curator runs ffprobe inside CPU-only Ray actors (VideoReader, ClipWriter) for metadata extraction. Those actors can’t open NVDEC decoders, so without a software h264/hevc/av1 decoder your inputs are silently skipped (SoftwareCodecMissingError in the logs).
Run the bundled installer inside the container to add software decoder support — no image rebuild needed:
See Software H.264/HEVC/AV1 Codec Support for the full picture.
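To find out whether a given input is one of the affected codecs, you can ask ffprobe for the codec name of the first video stream. The sketch below is an illustration, not part of Curator: the helper names and the `NEEDS_SOFTWARE_DECODER` set are assumptions, and only standard ffprobe options are used.

```python
import json
import subprocess

# Codec names as ffprobe reports them for H.264, HEVC, and AV1 streams.
NEEDS_SOFTWARE_DECODER = {"h264", "hevc", "av1"}


def codec_from_ffprobe_json(text):
    """Return the codec name of the first stream in ffprobe's JSON output, or None."""
    streams = json.loads(text).get("streams", [])
    return streams[0].get("codec_name") if streams else None


def probe_video_codec(path):
    """Run ffprobe on a file and return the codec name of its first video stream."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-select_streams", "v:0",
            "-show_entries", "stream=codec_name",
            "-of", "json",
            path,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return codec_from_ffprobe_json(result.stdout)
```

For example, `probe_video_codec("input.mp4") in NEEDS_SOFTWARE_DECODER` (with a hypothetical file name) tells you whether that file depends on a software decoder being present.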
Refer to Clip Encoding to choose encoders and verify NVENC support on your system.
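As a quick local check, the sketch below lists the encoders that your FFmpeg build reports and looks for the two mentioned in this guide. The helper names are assumptions for illustration. Note that an encoder appearing in the list only shows that FFmpeg was built with it; it does not prove that the GPU in the machine has NVENC hardware.

```python
import shutil
import subprocess

WANTED = ("h264_nvenc", "libvpx-vp9")


def parse_encoders(listing):
    """Extract encoder names from the text printed by `ffmpeg -encoders`."""
    names = set()
    seen_separator = False
    for line in listing.splitlines():
        stripped = line.strip()
        if not seen_separator:
            # The legend ends with a dashed separator; encoder rows follow it.
            seen_separator = stripped.startswith("------")
            continue
        parts = stripped.split()
        if len(parts) >= 2:
            names.add(parts[1])  # columns: flags, name, description
    return names


def available_encoders():
    """Return the encoder names of the ffmpeg on PATH, or an empty set if absent."""
    if shutil.which("ffmpeg") is None:
        return set()
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_encoders(result.stdout)


if __name__ == "__main__":
    found = available_encoders()
    for name in WANTED:
        print(name, "available" if name in found else "missing")
```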
Embeddings convert each video clip into a numeric vector that captures visual and semantic content. Curator uses these vectors to:
NeMo Curator supports two embedding model families:
Cosmos-Embed1 (default): Available in three variants—cosmos-embed1-224p, cosmos-embed1-336p, and cosmos-embed1-448p—which differ in input resolution and accuracy/VRAM tradeoff. All variants are automatically downloaded to MODEL_DIR on first run.
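To illustrate how clip embeddings support duplicate removal and similarity search, the following sketch compares toy vectors with cosine similarity. It is not Curator's deduplication code: the three-dimensional vectors, the clip names, and the 0.95 threshold are made up for illustration, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def near_duplicates(embeddings, threshold=0.95):
    """Return pairs of clip names whose similarity meets the (hypothetical) threshold."""
    names = sorted(embeddings)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if cosine_similarity(embeddings[first], embeddings[second]) >= threshold:
                pairs.append((first, second))
    return pairs


clips = {
    "clip_a": [1.0, 0.0, 0.0],
    "clip_b": [0.99, 0.05, 0.0],  # nearly the same direction as clip_a
    "clip_c": [0.0, 1.0, 0.0],
}
print(near_duplicates(clips))  # [('clip_a', 'clip_b')]
```

The pairwise loop is quadratic in the number of clips, which is fine for a demonstration but not for a large dataset.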
Model links:
For this quickstart, the following steps set up support for Cosmos-Embed1-224p.
For most use cases, you only need to create a model directory. The required model files will be downloaded automatically on first run.
Create a model directory:
You can reuse the same <MODEL_DIR> across runs.
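If you prefer to prepare the directory from Python, a minimal sketch looks like this. The helper name `ensure_model_dir` is an assumption for illustration; because it is idempotent, calling it before every run is safe when you reuse the same directory.

```python
from pathlib import Path


def ensure_model_dir(path):
    """Create the model directory (and any missing parents) if it does not exist yet."""
    model_dir = Path(path).expanduser()
    model_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return model_dir
```

For example, `ensure_model_dir("~/curator_models")` uses a hypothetical location; substitute your own <MODEL_DIR>.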
No additional setup is required. The model will be downloaded automatically when first used.
Organize input videos and output locations before running the pipeline.
Local: For local file processing. Define paths like:
S3: For cloud storage (AWS S3, MinIO, etc.). Configure credentials in ~/.aws/credentials and use s3:// paths for --video-dir and --output-clip-path.
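Before pointing the pipeline at s3:// paths, it can help to confirm that the credentials file is present and contains a usable profile. The sketch below reads the standard INI layout of ~/.aws/credentials with Python's configparser. The helper names are assumptions for illustration, and the check does not contact S3 or validate the keys.

```python
import configparser
from pathlib import Path


def list_aws_profiles(credentials_path="~/.aws/credentials"):
    """Return profile names in the credentials file that define an access key."""
    parser = configparser.ConfigParser()
    # read() silently skips a missing file, so the result is then an empty list.
    parser.read(Path(credentials_path).expanduser())
    return [
        section
        for section in parser.sections()
        if parser.has_option(section, "aws_access_key_id")
    ]


def is_s3_uri(path):
    """True when a path uses the s3:// scheme."""
    return path.startswith("s3://")


if __name__ == "__main__":
    print("profiles found:", list_aws_profiles())
```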
S3 usage notes:
Use the example script from https://github.com/NVIDIA-NeMo/Curator/tree/main/tutorials/video/getting-started to read videos, split into clips, and write outputs. This runs a Ray pipeline with XennaExecutor under the hood.
What this command does:
Reads input videos from $DATA_DIR
Writes the resulting clips and outputs to $OUT_DIR
Using a config file: The example script accepts many command-line arguments. For complex configurations, you can store arguments in a file and pass them with the @ prefix:
echo '--video-dir /data/videos --output-clip-path /data/output --splitting-algorithm fixed_stride --fixed-stride-split-duration 10.0 --embedding-algorithm cosmos-embed1-224p --transcode-encoder h264_nvenc' > my_config.txt
python tutorials/video/getting-started/video_split_clip_example.py @my_config.txt
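The @ prefix is a standard feature of Python's argparse (`fromfile_prefix_chars`). The sketch below shows the mechanism with a stand-in parser; it assumes, without confirmation from the source, that the example script relies on argparse in this way. By default argparse treats each line of the file as one argument, so this sketch overrides `convert_arg_line_to_args` to allow several space-separated arguments per line. Only three of the flags from the config above are declared here.

```python
import argparse
import os
import tempfile


class WhitespaceArgParser(argparse.ArgumentParser):
    """Parser that splits each line of an @file on whitespace."""

    def convert_arg_line_to_args(self, arg_line):
        return arg_line.split()


def build_parser():
    # Stand-in parser for illustration; the real script defines many more options.
    parser = WhitespaceArgParser(fromfile_prefix_chars="@")
    parser.add_argument("--video-dir")
    parser.add_argument("--output-clip-path")
    parser.add_argument("--fixed-stride-split-duration", type=float, default=10.0)
    return parser


with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "my_config.txt")
    with open(config_path, "w") as handle:
        handle.write("--video-dir /data/videos --output-clip-path /data/output\n")
        handle.write("--fixed-stride-split-duration 10.0\n")
    args = build_parser().parse_args(["@" + config_path])

print(args.video_dir, args.output_clip_path, args.fixed_stride_split_duration)
```

If the script uses argparse's default behavior instead, place each flag and each value on its own line in the config file.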
After successful execution, the output directory will contain:
File descriptions:
Example manifest entry:
Check GPU activity with nvidia-smi during processing
Use the --verbose flag for debugging and monitoring
Use nvidia-smi dmon to track GPU usage during processing
Explore the Video Curation documentation. For encoding guidance, refer to Clip Encoding.