stages.video.clipping.clip_extraction_stages#

Module Contents#

Classes#

ClipTranscodingStage

Stage that transcodes video clips into a standardized format.

FixedStrideExtractorStage

Stage that extracts video clips using fixed-length intervals.

API#

class stages.video.clipping.clip_extraction_stages.ClipTranscodingStage#

Bases: nemo_curator.stages.base.ProcessingStage[nemo_curator.tasks.video.VideoTask, nemo_curator.tasks.video.VideoTask]

Stage that transcodes video clips into a standardized format.

This stage handles the conversion of video clips using FFmpeg, supporting both software (libx264, libopenh264) and hardware (NVENC) encoding with configurable parameters.
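The software and hardware encoding paths can be pictured as two variants of one FFmpeg command line. The sketch below is illustrative only: `build_transcode_cmd` is a hypothetical helper, the file names are placeholders, and the stage's actual FFmpeg invocation is not documented on this page. It uses only standard FFmpeg options (`-ss`, `-t`, `-i`, `-c:v`, `-threads`, `-hwaccel`).

```python
from __future__ import annotations


def build_transcode_cmd(
    src: str,
    dst: str,
    start_s: float,
    duration_s: float,
    encoder: str = "libx264",
    encoder_threads: int = 1,
    use_hwaccel: bool = False,
) -> list[str]:
    """Hypothetical helper: assemble an FFmpeg argument list for one clip."""
    cmd = ["ffmpeg", "-y"]  # -y overwrites the output file if it exists
    if use_hwaccel:
        # Request CUDA-accelerated decoding of the input.
        cmd += ["-hwaccel", "cuda"]
    cmd += [
        "-ss", f"{start_s:.3f}",     # seek to the clip start (input option)
        "-t", f"{duration_s:.3f}",   # read only this many seconds
        "-i", src,
        "-c:v", encoder,             # e.g. libx264, libopenh264, h264_nvenc
        "-threads", str(encoder_threads),
        dst,
    ]
    return cmd


# Software encoding with the documented default encoder.
software = build_transcode_cmd("in.mp4", "clip0.mp4", 0.0, 10.0)

# Hardware (NVENC) encoding; assumes an FFmpeg build with NVENC support.
hardware = build_transcode_cmd(
    "in.mp4", "clip0.mp4", 0.0, 10.0, encoder="h264_nvenc", use_hwaccel=True
)
print(" ".join(software))
print(" ".join(hardware))
```

Returning a list of arguments rather than a shell string lets the command be passed to `subprocess.run` without shell quoting concerns.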

Args:

- num_cpus_per_worker: Number of CPUs per worker.
- encoder: Video encoder to use.
- encoder_threads: Number of threads per encoder.
- encode_batch_size: Number of clips to encode in parallel.
- nb_streams_per_gpu: Number of streams per GPU.
- use_hwaccel: Whether to use hardware acceleration.
- use_input_bit_rate: Whether to use the input video's bit rate.
- num_clips_per_chunk: Number of clips per chunk. If a video has more clips than this, the clips are split into chunks and a separate VideoTask is created for each chunk.
- verbose: Whether to print verbose logs.
- ffmpeg_verbose: Whether to print verbose FFmpeg logs.
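The chunking behavior described for `num_clips_per_chunk` can be sketched as follows. This is a minimal illustration, not the stage's implementation: `split_into_chunks` is a hypothetical helper and the clip names are placeholders.

```python
from __future__ import annotations

from typing import Sequence, TypeVar

T = TypeVar("T")


def split_into_chunks(items: Sequence[T], chunk_size: int) -> list[list[T]]:
    """Hypothetical helper: split items into consecutive chunks of at most chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(items[i : i + chunk_size]) for i in range(0, len(items), chunk_size)]


# 70 placeholder clips with the documented default of 32 clips per chunk.
clips = [f"clip_{i}" for i in range(70)]
chunks = split_into_chunks(clips, 32)
chunk_sizes = [len(chunk) for chunk in chunks]
print(chunk_sizes)  # [32, 32, 6]
```

Under this reading, each chunk would become its own VideoTask, so a video with 70 clips would produce three downstream tasks.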

encode_batch_size: int#

16

encoder: str#

'libx264'

encoder_threads: int#

1

ffmpeg_verbose: bool#

False

inputs() tuple[list[str], list[str]]#

Define stage input requirements.

Returns (tuple[list[str], list[str]]): Tuple of (required_top_level_attributes, required_data_attributes), where:

- required_top_level_attributes: List of task attributes that must be present.
- required_data_attributes: List of attributes within the data that must be present.
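To make the two-list contract concrete, the sketch below checks a task-like object against such a tuple. Everything here is an assumption for illustration: `missing_requirements` is a hypothetical helper, the task is a `SimpleNamespace` stand-in rather than a real VideoTask, and the attribute names are placeholders, not the values this stage actually returns.

```python
from __future__ import annotations

from types import SimpleNamespace


def missing_requirements(task: object, requirements: tuple[list[str], list[str]]) -> list[str]:
    """Hypothetical helper: list required attributes that the task does not have."""
    top_level_attrs, data_attrs = requirements
    # First list: attributes expected directly on the task.
    missing = [name for name in top_level_attrs if not hasattr(task, name)]
    # Second list: attributes expected on the task's data object.
    data = getattr(task, "data", None)
    missing += [f"data.{name}" for name in data_attrs if not hasattr(data, name)]
    return missing


# Placeholder requirement tuple in the same shape that inputs() returns.
requirements = (["data"], ["source_bytes", "clips"])

# Stand-in task whose data object lacks the "clips" attribute.
task = SimpleNamespace(data=SimpleNamespace(source_bytes=b"\x00"))
missing = missing_requirements(task, requirements)
print(missing)  # ['data.clips']
```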

name: str#

'clip_transcoding'

nb_streams_per_gpu: int#

3

num_clips_per_chunk: int#

32

num_cpus_per_worker: float#

6.0

outputs() tuple[list[str], list[str]]#

Define stage output specification.

Returns (tuple[list[str], list[str]]): Tuple of (output_top_level_attributes, output_data_attributes), where:

- output_top_level_attributes: List of task attributes this stage adds or modifies.
- output_data_attributes: List of attributes within the data that this stage adds or modifies.

process(
task: nemo_curator.tasks.video.VideoTask,
) nemo_curator.tasks.video.VideoTask#

Process a task and return the result. For this stage, the input type X and the output type Y are both nemo_curator.tasks.video.VideoTask.

Args:

- task (X): Input task to process.

Returns (Y | list[Y]):

- Single task: For 1-to-1 transformations.
- List of tasks: For 1-to-many transformations (e.g., readers).
- None: If the task should be filtered out.

ray_stage_spec() dict[str, Any]#

Ray stage specification for this stage.

setup(
worker_metadata: nemo_curator.backends.base.WorkerMetadata | None = None,
) None#

Setup method called once before processing begins. Override this method to perform any initialization that should happen once per worker.

Args:

- worker_metadata (WorkerMetadata, optional): Information about the worker (provided by some backends).

use_hwaccel: bool#

False

use_input_bit_rate: bool#

False

verbose: bool#

False

class stages.video.clipping.clip_extraction_stages.FixedStrideExtractorStage#

Bases: nemo_curator.stages.base.ProcessingStage[nemo_curator.tasks.video.VideoTask, nemo_curator.tasks.video.VideoTask]

Stage that extracts video clips using fixed-length intervals.

This stage splits videos into clips of specified length and stride, ensuring each clip meets minimum length requirements and optionally limiting total clips.
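The interaction of `clip_len_s`, `clip_stride_s`, `min_clip_length_s`, and `limit_clips` can be sketched as below. This is one plausible reading, not the stage's confirmed logic: `fixed_stride_spans` is a hypothetical helper, and the sketch assumes that a final clip is truncated at the end of the video, that clips shorter than `min_clip_length_s` are dropped, and that a `limit_clips` value of 0 or less means no limit.

```python
from __future__ import annotations


def fixed_stride_spans(
    duration_s: float,
    clip_len_s: float,
    clip_stride_s: float,
    min_clip_length_s: float,
    limit_clips: int = 0,
) -> list[tuple[float, float]]:
    """Hypothetical helper: compute (start, end) spans in seconds for one video."""
    if clip_len_s <= 0 or clip_stride_s <= 0:
        raise ValueError("clip_len_s and clip_stride_s must be positive")
    spans: list[tuple[float, float]] = []
    index = 0
    while True:
        # Multiply instead of accumulating to avoid floating-point drift.
        start = index * clip_stride_s
        if start >= duration_s:
            break
        end = min(start + clip_len_s, duration_s)  # assumed: truncate at video end
        if end - start >= min_clip_length_s:       # assumed: drop too-short clips
            spans.append((start, end))
        if limit_clips > 0 and len(spans) >= limit_clips:
            break
        index += 1
    return spans


# A 25-second video cut into 10-second clips with a 10-second stride.
spans = fixed_stride_spans(25.0, 10.0, 10.0, min_clip_length_s=2.0)
print(spans)  # [(0.0, 10.0), (10.0, 20.0), (20.0, 25.0)]
```

With `min_clip_length_s=6.0` the trailing 5-second span would be dropped under these assumptions, and a stride smaller than the clip length would yield overlapping clips.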

clip_len_s: float#

None

clip_stride_s: float#

None

inputs() tuple[list[str], list[str]]#

Define stage input requirements.

Returns (tuple[list[str], list[str]]): Tuple of (required_top_level_attributes, required_data_attributes), where:

- required_top_level_attributes: List of task attributes that must be present.
- required_data_attributes: List of attributes within the data that must be present.

limit_clips: int#

None

min_clip_length_s: float#

None

name: str#

'fixed_stride_extractor'

outputs() tuple[list[str], list[str]]#

Define stage output specification.

Returns (tuple[list[str], list[str]]): Tuple of (output_top_level_attributes, output_data_attributes), where:

- output_top_level_attributes: List of task attributes this stage adds or modifies.
- output_data_attributes: List of attributes within the data that this stage adds or modifies.

process(
task: nemo_curator.tasks.video.VideoTask,
) nemo_curator.tasks.video.VideoTask#

Process a task and return the result. For this stage, the input type X and the output type Y are both nemo_curator.tasks.video.VideoTask.

Args:

- task (X): Input task to process.

Returns (Y | list[Y]):

- Single task: For 1-to-1 transformations.
- List of tasks: For 1-to-many transformations (e.g., readers).
- None: If the task should be filtered out.

verbose: bool#

False