Curation Parameters
The following is a description of all available JSON parameters for the curation pipeline:
pipeline
(string): The curation pipeline method, which can be either “split” or “shard”. We recommend using “split”.
args
(JSON object): An object containing the following parameters:
generate_embeddings
(Boolean): If True, the curator service will generate embeddings for each video. The default value is True.
generate_previews
(Boolean): If True, the curator service will generate a preview image for each video. The default value is True.
generate_captions
(Boolean): If True, the curator service will generate text captions for each video. The default value is True.
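For example, a configuration that generates embeddings only, skipping previews and captions, might look like the following sketch (this assumes any parameter left out falls back to its documented default):

```json
{
  "pipeline": "split",
  "args": {
    "generate_embeddings": true,
    "generate_previews": false,
    "generate_captions": false
  }
}
```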
splitting_algorithm
(string): Specifies the algorithm used to segment videos. The following options are available:
“transnetv2”: Segments videos using the TransNetV2 algorithm, which detects obvious cuts/transitions in videos.
“panda70m”: Segments videos using the PANDA70M algorithm, which detects more subtle transitions in videos, but is more computationally intensive than TransNetV2.
“fixed_stride”: Segments videos into clips of uniform length.
captioning_prompt_variant
(string): The type of text prompt used to generate captions for each video. The following options are available:
“default”: A general text prompt is used.
“av”: A text prompt specific to recordings from an autonomous vehicle camera is used.
“av-surveillance”: A text prompt specific to recordings from a fixed camera (e.g. a surveillance camera) is used.
captioning_prompt_text
(string): The text prompt used to generate captions for each video. If this string is empty, the captioning_prompt_variant parameter is used to determine the prompt. A non-empty string will override the captioning_prompt_variant parameter.
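For example, to caption footage from an autonomous vehicle camera with the built-in prompt, you might select the “av” variant and leave captioning_prompt_text empty; supplying a non-empty captioning_prompt_text would override the variant instead:

```json
{
  "pipeline": "split",
  "args": {
    "captioning_prompt_variant": "av",
    "captioning_prompt_text": ""
  }
}
```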
nvdec_for_clipping
(integer): If splitting_algorithm is set to “panda70m”, this value specifies the number of GPUs to use for hardware decoding with the panda70m algorithm. The default value is 0, because hardware decoding takes GPU resources away from other GPU-intensive stages of the pipeline.
fixed_stride_split_duration
(integer): If splitting_algorithm is set to “fixed_stride”, this value specifies the length of each video clip (in seconds). The default value is 10.
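For example, a sketch that splits every video into uniform 5-second clips (5 is an illustrative value; the default is 10):

```json
{
  "pipeline": "split",
  "args": {
    "splitting_algorithm": "fixed_stride",
    "fixed_stride_split_duration": 5
  }
}
```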
encoder
(string): Specifies the video encoder, which can be either “libopenh264” or “h264_nvenc”. The default value is “libopenh264”.
use_hwaccel_for_transcoding
(Boolean): If True, the curator service will use hardware acceleration for the decoding portion of transcoding. The default value is False.
captioning_algorithm
(string): Specifies the captioning algorithm to use, which can be either “qwen” or “vila-32b”. The default value is “qwen”.
qwen_batch_size
(integer): If captioning_algorithm is set to “qwen”, this value specifies the batch size to use for the Qwen algorithm. The default value is 8.
fp8_weights_for_qwen
(Boolean): If captioning_algorithm is set to “qwen”, this value indicates whether to enable FP8 weight quantization for the Qwen algorithm.
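For example, a sketch of a Qwen captioning configuration that doubles the batch size and enables FP8 weight quantization; both values are illustrative, and whether they are appropriate depends on your available GPU memory:

```json
{
  "pipeline": "split",
  "args": {
    "captioning_algorithm": "qwen",
    "qwen_batch_size": 16,
    "fp8_weights_for_qwen": true
  }
}
```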
limit
(integer): Specifies the maximum number of videos to process.
limit_clips
(integer): Specifies the maximum number of clips to process. The default value is 0 (i.e. there is no limit on the number of clips).
num_cpu_workers_download
(integer): (Only applicable for S3 storage) Specifies the number of CPU workers used to download raw data into the pipeline. The default value is 4.
num_cpu_workers_clipwriter
(integer): (Only applicable for S3 storage) Specifies the number of CPU workers used to write data to S3 storage. The default value is 8.
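Putting these together, a complete request body might look like the following sketch. Where a default is documented above, that default is shown; splitting_algorithm and captioning_prompt_variant have no documented default, so “transnetv2” and “default” are illustrative choices here, and parameters without a documented default (limit, fp8_weights_for_qwen) are omitted:

```json
{
  "pipeline": "split",
  "args": {
    "generate_embeddings": true,
    "generate_previews": true,
    "generate_captions": true,
    "splitting_algorithm": "transnetv2",
    "captioning_prompt_variant": "default",
    "captioning_prompt_text": "",
    "nvdec_for_clipping": 0,
    "fixed_stride_split_duration": 10,
    "encoder": "libopenh264",
    "use_hwaccel_for_transcoding": false,
    "captioning_algorithm": "qwen",
    "qwen_batch_size": 8,
    "limit_clips": 0,
    "num_cpu_workers_download": 4,
    "num_cpu_workers_clipwriter": 8
  }
}
```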