nemo_curator.models.prompt_formatter
nemo_curator.models.prompt_formatter
Module Contents
Classes
Data
API
Unified prompt formatter for VLM models using HuggingFace AutoProcessor.
Supports both Qwen and Nemotron model variants. Uses AutoProcessor.from_pretrained() to load the appropriate tokenizer and chat template from HuggingFace Hub or a local path.
Convert video inputs to numpy array in (T, H, W, C) format.
Create a message for Qwen models.
Generate inputs for Nemotron models.
Nemotron requires video metadata (fps, frames_indices) for vLLM processing.
Generate inputs for Qwen models.
Generate inputs for video and text data based on prompt_variant.
Parameters:
Text prompt to be included with the input.
Pre-processed video inputs (tensor or numpy array).
Whether to regenerate the text prompt even if cached.
Frames per second of the input video (used for Nemotron metadata).
Returns: dict[str, Any]
dict containing:
- “prompt”: The processed text prompt with chat template applied
- “multi_modal_data”: Dictionary containing processed “video” inputs