Curation Parameters
The following is a description of all available JSON parameters for the curation pipeline:
pipeline
(string): The curation pipeline method, which can be either “split” or “shard”. We recommend using “split”.

args
(JSON object): An object containing the following parameters:

input_video_path
(string): (Only applicable for S3 storage) S3 URI containing the videos to be processed. Video data in any prefix contained within this path will be processed.

output_clip_path
(string): (Only applicable for S3 storage) S3 URI where results will be written.

input_s3_profile_name
(string): (Only applicable for S3 storage) Specifies the AWS profile name to use for reading data from S3. The default value is default.

output_s3_profile_name
(string): (Only applicable for S3 storage) Specifies the AWS profile name to use for writing data to S3. The default value is default.

generate_embeddings
(Boolean): If True, the curator service will generate embeddings for each video. The default value is True.

generate_previews
(Boolean): If True, the curator service will generate a preview image for each video. The default value is True.

generate_captions
(Boolean): If True, the curator service will generate text captions for each video. The default value is True.

splitting_algorithm
(string): Specifies the algorithm used to segment videos. The following options are available:

“transnetv2”: Segments videos using the TransNetV2 algorithm, which detects obvious cuts/transitions in videos.
“panda70m”: Segments videos using the PANDA70M algorithm, which detects more subtle transitions in videos, but is more computationally intensive than TransNetV2.
“fixed_stride”: Segments videos into clips of uniform length.
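As an illustrative sketch of how these parameters fit together (bucket names and paths here are hypothetical), a minimal configuration using the “split” pipeline with fixed-length clips might look like:

```json
{
  "pipeline": "split",
  "args": {
    "input_video_path": "s3://my-bucket/raw-videos/",
    "output_clip_path": "s3://my-bucket/clips/",
    "splitting_algorithm": "fixed_stride",
    "fixed_stride_split_duration": 10
  }
}
```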
captioning_prompt_variant
(string): The type of text prompt used to generate captions for each video. The following options are available:

“default”: A general text prompt is used.
“av”: A text prompt specific to recordings from an autonomous vehicle camera is used.
“av-surveillance”: A text prompt specific to recordings from a fixed camera (e.g. a surveillance camera) is used.
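For example, to caption autonomous-vehicle footage with the AV-specific prompt, the variant can be selected alongside the storage parameters (paths hypothetical):

```json
{
  "pipeline": "split",
  "args": {
    "input_video_path": "s3://av-bucket/drives/",
    "output_clip_path": "s3://av-bucket/clips/",
    "generate_captions": true,
    "captioning_prompt_variant": "av"
  }
}
```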
captioning_prompt_text
(string): The text prompt used to generate captions for each video. If this string is empty, the captioning_prompt_variant parameter is used to determine the prompt. A non-empty string will override the captioning_prompt_variant parameter.

nvdec_for_clipping
(integer): If splitting_algorithm is set to “panda70m”, this value indicates the number of GPUs to use for hardware decoding with the PANDA70M algorithm. The default value is 0, since hardware decoding detracts from other GPU-intensive parts of the pipeline.

fixed_stride_split_duration
(integer): If splitting_algorithm is set to “fixed_stride”, this value specifies the length of each video clip (in seconds). The default value is 10.

encoder
(string): Specifies the video encoder, which can be either “libopenh264” or “h264_nvenc”. The default value is “libopenh264”.

use_hwaccel_for_transcoding
(Boolean): If True, the curator service will use hardware acceleration for the decoding portion of transcoding. The default value is False.

captioning_algorithm
(string): Specifies the captioning algorithm to use, which can be either “qwen” or “vila-32b”. The default value is “qwen”.

qwen_batch_size
(integer): If captioning_algorithm is set to “qwen”, this value specifies the batch size to use for the Qwen algorithm. The default value is 8.

fp8_weights_for_qwen
(Boolean): If captioning_algorithm is set to “qwen”, this value indicates whether to enable FP8 weight quantization for the Qwen algorithm.

limit
(integer): Specifies the maximum number of videos to process.

limit_clips
(integer): Specifies the maximum number of clips to process. The default value is 0 (i.e. there is no limit on the number of clips).

num_cpu_workers_download
(integer): (Only applicable for S3 storage) Specifies the number of CPU workers used to download raw data into the pipeline. The default value is 4.

num_cpu_workers_clipwriter
(integer): (Only applicable for S3 storage) Specifies the number of CPU workers used to write data to S3 storage. The default value is 8.

motion_filter
(string): Controls motion filtering behavior. The following options are available:

“disable”: (default) Do not generate motion scores or perform filtering.
“score-only”: Enable motion scoring without filtering.
“enable”: Filter clips based on motion thresholds.
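As a sketch (paths hypothetical), the “score-only” mode can be used to inspect motion scores before committing to a filtering threshold:

```json
{
  "pipeline": "split",
  "args": {
    "input_video_path": "s3://my-bucket/raw-videos/",
    "output_clip_path": "s3://my-bucket/clips/",
    "motion_filter": "score-only"
  }
}
```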
motion_global_mean_threshold
(float): Sets the threshold for global average motion filtering when motion_filter is set to “enable”. Clips with scores below the threshold will be filtered out. The default value is 0.00098.

motion_per_patch_min_256_threshold
(float): Sets the threshold for average motion filtering in any 256x256 pixel patch when motion_filter is set to “enable”. Clips with scores below the threshold will be filtered out. The default value is 0.000001.

aesthetic_threshold
(float): If specified, filter out clips with an aesthetic score below this threshold. Setting a threshold of 0.0 will enable aesthetic scoring without filtering any clips.
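Putting the filtering options together, a hedged sketch of a configuration that enables motion filtering with the documented default thresholds plus aesthetic filtering (the paths and the aesthetic threshold value of 3.5 are hypothetical, chosen only for illustration):

```json
{
  "pipeline": "split",
  "args": {
    "input_video_path": "s3://my-bucket/raw-videos/",
    "output_clip_path": "s3://my-bucket/clips/",
    "motion_filter": "enable",
    "motion_global_mean_threshold": 0.00098,
    "motion_per_patch_min_256_threshold": 0.000001,
    "aesthetic_threshold": 3.5
  }
}
```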