Get Started with Video Curation#
This guide shows how to install Curator and run your first video curation pipeline.
The example pipeline processes a list of videos, splitting each into 10‑second clips using a fixed stride. It then generates clip‑level embeddings for downstream tasks such as duplicate removal and similarity search.
Prerequisites#
To use NeMo Curator’s video curation modules, ensure you meet the following requirements:
OS: Ubuntu 24.04/22.04/20.04 (required for GPU-accelerated processing)
Python: 3.10, 3.11, or 3.12
uv (for package management and installation)
NVIDIA GPU (required)
Volta™ or higher (compute capability 7.0+)
CUDA 12 or above
With default settings, the full splitting-plus-captioning example can use up to 38 GB of VRAM. You can reduce this to about 21 GB by lowering batch sizes and using FP8 where available.
FFmpeg 7+ on your system path. For H.264, ensure an encoder is available: h264_nvenc (GPU) or libopenh264/libx264 (CPU).
Git (required for some model dependencies)
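You can quickly confirm these prerequisites from a shell. This is a minimal sanity check; the exact output depends on your driver and OS:
# Verify Python, GPU driver, FFmpeg, and Git are available
python3 --version
nvidia-smi
ffmpeg -version | head -n 1
git --version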
Tip
If you don’t have uv installed, refer to the Installation Guide for setup instructions, or install it quickly with:
curl -LsSf https://astral.sh/uv/0.8.22/install.sh | sh
source $HOME/.local/bin/env
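Confirm the installation:
uv --version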
Install#
Create and activate a virtual environment, then choose an install option:
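For the pip-based options below, one way to create and activate an environment with uv (the source-install options create .venv for you via uv sync):
# Create and activate a virtual environment (use Python 3.10–3.12)
uv venv .venv
source .venv/bin/activate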
Note
Cosmos-Embed1 (the default) is generally better than InternVideo2 for most video embedding tasks. Consider using Cosmos-Embed1 (cosmos-embed1-224p) unless you have specific requirements for InternVideo2.
Option 1: Install from PyPI:
uv pip install torch wheel_stub psutil setuptools setuptools_scm
uv pip install --no-build-isolation "nemo-curator[video_cuda12]"
Option 2: Install from source:
git clone https://github.com/NVIDIA-NeMo/Curator.git
cd Curator
uv sync --extra video_cuda12 --all-groups
source .venv/bin/activate
Option 3: Install from PyPI with InternVideo2 support:
# Install base dependencies
uv pip install torch wheel_stub psutil setuptools setuptools_scm
uv pip install --no-build-isolation "nemo-curator[video_cuda12]"
# Clone and set up InternVideo2
git clone https://github.com/OpenGVLab/InternVideo.git
cd InternVideo
git checkout 09d872e5093296c6f36b8b3a91fc511b76433bf7
# Download and apply NeMo Curator patch
curl -fsSL https://raw.githubusercontent.com/NVIDIA/NeMo-Curator/main/external/intern_video2_multimodal.patch -o intern_video2_multimodal.patch
patch -p1 < intern_video2_multimodal.patch
cd ..
# Add InternVideo2 to the environment
uv pip install InternVideo/InternVideo2/multi_modality
Option 4: Install from source with InternVideo2 support:
git clone https://github.com/NVIDIA-NeMo/Curator.git
cd Curator
uv sync --extra video_cuda12 --all-groups
bash external/intern_video2_installation.sh
uv add InternVideo/InternVideo2/multi_modality
source .venv/bin/activate
Option 5: NeMo Curator is also available as a standalone container:
# Pull the container
docker pull nvcr.io/nvidia/nemo-curator:25.09
# Run the container
docker run --gpus all -it --rm nvcr.io/nvidia/nemo-curator:25.09
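To make your input videos, outputs, and model weights visible inside the container, mount them with standard Docker volume flags (the host paths here are illustrative):
# Mount host directories into the container
docker run --gpus all -it --rm \
  -v /path/to/videos:/data \
  -v /path/to/output_clips:/outputs \
  -v /path/to/models:/models \
  nvcr.io/nvidia/nemo-curator:25.09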
See also
For details on container environments and configurations, see Container Environments.
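Whichever install option you choose, a quick import check confirms the package is on your path (a minimal smoke test):
python -c "import nemo_curator; print('nemo_curator imported OK')"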
Install FFmpeg and Encoders#
Curator’s video pipelines rely on FFmpeg for decoding and encoding. If you plan to encode clips (for example, using --transcode-encoder libopenh264 or h264_nvenc), install FFmpeg with the corresponding encoders.
Use the maintained script in the repository to build and install FFmpeg with libopenh264 and NVIDIA NVENC support. The script enables --enable-libopenh264, --enable-cuda-nvcc, and --enable-libnpp.
Script source: docker/common/install_ffmpeg.sh
curl -fsSL https://raw.githubusercontent.com/NVIDIA-NeMo/Curator/main/docker/common/install_ffmpeg.sh -o install_ffmpeg.sh
chmod +x install_ffmpeg.sh
sudo bash install_ffmpeg.sh
Confirm that FFmpeg is on your PATH and that at least one H.264 encoder is available:
ffmpeg -hide_banner -version | head -n 5
ffmpeg -encoders | grep -E "h264_nvenc|libopenh264|libx264" | cat
If encoders are missing, reinstall FFmpeg with the required options or use the Debian/Ubuntu script above.
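Optionally, run a short encode against FFmpeg's built-in test source to confirm an encoder works end to end (swap h264_nvenc for libopenh264 to test the CPU path):
ffmpeg -hide_banner -f lavfi -i testsrc=duration=1:size=1280x720:rate=30 -c:v h264_nvenc -f null -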
Refer to Clip Encoding to choose encoders and verify NVENC support on your system.
Choose Embedding Model#
Embeddings convert each video clip into a numeric vector that captures visual and semantic content. Curator uses these vectors to:
Remove near-duplicate clips during duplicate removal
Enable similarity search and clustering
Support downstream analysis such as caption verification
You can choose between two embedding models:
Cosmos-Embed1 (default): Available in three variants (cosmos-embed1-224p, cosmos-embed1-336p, and cosmos-embed1-448p), which differ in input resolution and the accuracy/VRAM tradeoff. All variants are automatically downloaded to MODEL_DIR on first run.
InternVideo2 (IV2): Open model that requires the IV2 checkpoint and BERT model files to be available locally; higher VRAM usage.
For this quickstart, we’re going to set up support for Cosmos-Embed1-224p.
Prepare Model Weights#
For most use cases, you only need to create a model directory. The required model files will be downloaded automatically on first run.
Create a model directory (MODEL_DIR is the path you define in Set Up Data Directories below):
mkdir -p "$MODEL_DIR"
Tip
You can reuse the same MODEL_DIR across runs. No additional setup is required; the model is downloaded automatically when first used.
Set Up Data Directories#
Store input videos locally or on S3-compatible storage.
Local: Define paths, for example:
DATA_DIR=/path/to/videos
OUT_DIR=/path/to/output_clips
MODEL_DIR=/path/to/models
S3: Configure credentials in ~/.aws/credentials and use s3:// paths for --video-dir and --output-clip-path.
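If you use S3, a minimal ~/.aws/credentials sketch looks like this (placeholder values shown; substitute your own keys):
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY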
Run the Splitting Pipeline Example#
Use the following example script to read videos, split them into clips, and write outputs. This runs a Ray pipeline with XennaExecutor under the hood.
python -m nemo_curator.examples.video.video_split_clip_example \
--video-dir "$DATA_DIR" \
--model-dir "$MODEL_DIR" \
--output-clip-path "$OUT_DIR" \
--splitting-algorithm fixed_stride \
--fixed-stride-split-duration 10.0 \
--embedding-algorithm cosmos-embed1-224p \
--transcode-encoder libopenh264 \
--verbose
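After the run completes, you can spot-check what was written (this assumes outputs land under $OUT_DIR; the exact directory layout may vary):
# List a few generated files
find "$OUT_DIR" -type f | head -n 10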
Options#
The example script supports the following options:
| Option | Values or Description |
|---|---|
| --video-dir | Input video directory (local path or s3:// URI) |
| --model-dir | Directory for model weights (downloaded on first run) |
| --output-clip-path | Output location for generated clips (local path or s3:// URI) |
| --splitting-algorithm | Splitting method, for example fixed_stride |
| --fixed-stride-split-duration | Clip duration in seconds for fixed_stride (for example, 10.0) |
| --embedding-algorithm | cosmos-embed1-224p, cosmos-embed1-336p, cosmos-embed1-448p, or internvideo2 |
| --transcode-encoder | H.264 encoder; use NVENC when available (for example, h264_nvenc) or libopenh264/libx264 on CPU |
| --verbose | Enable verbose logging |
The script also supports flags that enable captioning and preview generation; run it with --help for the complete list of options.
Tip
To use InternVideo2 instead, set --embedding-algorithm internvideo2.
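For example, the quickstart command above becomes (assuming you completed one of the InternVideo2 install options):
python -m nemo_curator.examples.video.video_split_clip_example \
--video-dir "$DATA_DIR" \
--model-dir "$MODEL_DIR" \
--output-clip-path "$OUT_DIR" \
--splitting-algorithm fixed_stride \
--fixed-stride-split-duration 10.0 \
--embedding-algorithm internvideo2 \
--transcode-encoder libopenh264 \
--verbose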
Next Steps#
Explore the Video Curation documentation. For encoding guidance, refer to Clip Encoding.