Video Data Loading
Load video data for curation using NeMo Curator.
How it Works
NeMo Curator loads videos with a composite stage that discovers files and extracts metadata:
- VideoReader decomposes into a partitioning stage plus a reader stage.
- Local paths use FilePartitioningStage to list files; remote URLs (for example, s3://, gcs://, http(s)://) use ClientPartitioningStage backed by fsspec.
- For remote datasets, you can optionally supply an explicit file list using ClientPartitioningStage.input_list_json_path.
- VideoReaderStage downloads bytes (local or via FSPath) and calls video.populate_metadata() to extract resolution, fps, duration, encoding format, and other fields.
- Set video_limit to cap discovery; use None for unlimited. Set verbose=True to log detailed per-video information.
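The local/remote split described above can be illustrated with a small standalone sketch. This is not NeMo Curator's code: the helper name pick_partitioning_stage is hypothetical, and the rule "no URL scheme (or file://) means local" is an assumption made for illustration with POSIX-style paths.

```python
from urllib.parse import urlparse


def pick_partitioning_stage(input_path: str) -> str:
    """Return the name of the partitioning stage that would apply to a path.

    Hypothetical helper for illustration only; it mirrors the rule in the
    text, not the library's actual dispatch logic.
    """
    scheme = urlparse(input_path).scheme
    # Assumption: a path with no scheme (or an explicit file://) is local.
    if scheme in ("", "file"):
        return "FilePartitioningStage"
    # Anything else (s3://, gcs://, https://, ...) is treated as remote.
    return "ClientPartitioningStage"


print(pick_partitioning_stage("/data/videos/"))
print(pick_partitioning_stage("s3://bucket/path/"))
```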
Local and Cloud
Use VideoReader to load videos from local paths or remote URLs.
Local Paths
- Examples: /data/videos/, /mnt/datasets/av/
- Uses FilePartitioningStage to recursively discover files.
- Filters by extensions: .mp4, .mov, .avi, .mkv, .webm.
- Set video_limit to cap discovery during testing (None means unlimited).
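To make the discovery behavior concrete, here is a minimal standard-library sketch of recursive listing with an extension filter and a limit. It only approximates what FilePartitioningStage is described as doing; the function name discover_videos, the sorted traversal order, and the case-insensitive extension match are assumptions.

```python
from pathlib import Path

# Extensions listed in the text as the default video filter.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def discover_videos(root, video_limit=None):
    """Recursively collect video files under root (hypothetical helper).

    video_limit=None means unlimited, matching the documented convention.
    """
    found = []
    for path in sorted(Path(root).rglob("*")):
        # Stop as soon as the cap is reached.
        if video_limit is not None and len(found) >= video_limit:
            break
        # Assumption: extensions are compared case-insensitively.
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS:
            found.append(path)
    return found
```

Calling discover_videos("/data/videos/", video_limit=10) would return at most ten matching paths, which is handy for a quick test run before processing a full dataset.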
Remote Paths
- Examples: s3://bucket/path/, gcs://bucket/path/, https://host/path/, and other fsspec-supported protocols such as s3a:// and abfs://.
- Uses ClientPartitioningStage backed by fsspec to list files.
- Optional input_list_json_path allows explicit file lists under a root prefix.
- Wraps entries as FSPath for efficient byte access during reading.
Use an object storage prefix (for example, s3://my-bucket/videos/) to stream from cloud storage. Configure credentials in your environment or client configuration.
Example
Explicit File List (JSON)
For remote datasets, ClientPartitioningStage can use an explicit file list JSON. Each entry must be an absolute path under the specified root.
JSON Format
If any entry is outside the root, the stage raises an error.
Example
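The root check can be sketched in plain Python. This is an illustration, not the library's implementation: it assumes the JSON file is a flat array of absolute path strings, and both the helper name load_file_list and the use of ValueError are hypothetical choices.

```python
import json


def load_file_list(json_text: str, root: str) -> list[str]:
    """Parse an explicit file list and verify every entry is under root.

    Assumption: the JSON document is a flat array of absolute paths.
    """
    entries = json.loads(json_text)
    # Normalize the root so "s3://bucket/videos" and "s3://bucket/videos/"
    # behave the same and "s3://bucket/videos-old/..." is rejected.
    prefix = root.rstrip("/") + "/"
    for entry in entries:
        if not entry.startswith(prefix):
            # Assumption: the real stage's error type may differ.
            raise ValueError(f"{entry!r} is outside root {root!r}")
    return entries


listing = '["s3://my-bucket/videos/a.mp4", "s3://my-bucket/videos/sub/b.mkv"]'
paths = load_file_list(listing, "s3://my-bucket/videos/")
```

Validating the whole list up front means a misplaced entry fails fast, before any bytes are downloaded.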
Supported File Types
The loader filters these video extensions by default:
- .mp4
- .mov
- .avi
- .mkv
- .webm
Metadata on Load
After a successful read, the loader populates the following metadata fields for each video:
- size (bytes)
- width, height
- framerate
- num_frames
- duration (seconds)
- video_codec, pixel_format, audio_codec
- bit_rate_k
With verbose=True, the loader logs size, resolution, fps, duration, weight, and bit rate for each processed video.
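As a rough picture of the populated record, the fields listed above can be modeled as a simple dataclass. The class name VideoMetadataSketch, the field types, and the summary format are assumptions for illustration; the library's own metadata class and log format may differ, and the reading of bit_rate_k as kilobits per second is a guess based on the name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoMetadataSketch:
    """Hypothetical stand-in for the metadata populated on load."""
    size: int                    # bytes
    width: int
    height: int
    framerate: float
    num_frames: int
    duration: float              # seconds
    video_codec: str
    pixel_format: str
    audio_codec: Optional[str]   # assumed optional for silent videos
    bit_rate_k: int              # assumed to be kilobits per second


def summarize(meta: VideoMetadataSketch) -> str:
    """Build a one-line summary similar in spirit to a verbose log entry."""
    return (
        f"size={meta.size}B res={meta.width}x{meta.height} "
        f"fps={meta.framerate:.2f} duration={meta.duration:.1f}s "
        f"bit_rate={meta.bit_rate_k}K"
    )
```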