
# nemo_curator.models.prompt_formatter

## Module Contents

### Classes

| Name                                                                       | Description                                                              |
| -------------------------------------------------------------------------- | ------------------------------------------------------------------------ |
| [`PromptFormatter`](#nemo_curator-models-prompt_formatter-PromptFormatter) | Unified prompt formatter for VLM models using HuggingFace AutoProcessor. |

### Data

[`VARIANT_MAPPING`](#nemo_curator-models-prompt_formatter-VARIANT_MAPPING)

### API

<Anchor id="nemo_curator-models-prompt_formatter-PromptFormatter">
  <CodeBlock showLineNumbers={false} wordWrap={true}>
    ```python
    class nemo_curator.models.prompt_formatter.PromptFormatter(
        prompt_variant: str
    )
    ```
  </CodeBlock>
</Anchor>

<Indent>
  Unified prompt formatter for VLM models using HuggingFace AutoProcessor.

  Supports both Qwen and Nemotron model variants. Uses AutoProcessor.from\_pretrained()
  to load the appropriate tokenizer and chat template from HuggingFace Hub or a local path.

  <ParamField path="processor" />

  <Anchor id="nemo_curator-models-prompt_formatter-PromptFormatter-_convert_to_numpy">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.models.prompt_formatter.PromptFormatter._convert_to_numpy(
          video_inputs: torch.Tensor | numpy.ndarray
      ) -> numpy.ndarray
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Convert video inputs to numpy array in (T, H, W, C) format.
  </Indent>
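  The accepted input layouts are not spelled out above, but a minimal self-contained sketch of such a conversion, assuming channels-first `(T, C, H, W)` arrays are the alternative layout (an assumption; the real method also handles `torch.Tensor` inputs via `.numpy()`), might look like:

  ```python
  import numpy as np

  def convert_to_thwc(video) -> np.ndarray:
      """Sketch: coerce a (T, C, H, W) or (T, H, W, C) video to (T, H, W, C).

      Assumes the channel axis holds 1, 3, or 4 channels. This is an
      illustration of the target layout, not the library's implementation.
      """
      arr = np.asarray(video)
      if arr.ndim != 4:
          raise ValueError(f"expected a 4-D video, got shape {arr.shape}")
      # Treat axis 1 as channels only if it looks like one and the last axis does not.
      if arr.shape[1] in (1, 3, 4) and arr.shape[-1] not in (1, 3, 4):
          arr = arr.transpose(0, 2, 3, 1)  # (T, C, H, W) -> (T, H, W, C)
      return arr

  frames = np.zeros((8, 3, 224, 224), dtype=np.uint8)  # channels-first input
  print(convert_to_thwc(frames).shape)  # (8, 224, 224, 3)
  ```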

  <Anchor id="nemo_curator-models-prompt_formatter-PromptFormatter-_create_qwen_message">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.models.prompt_formatter.PromptFormatter._create_qwen_message(
          prompt: str
      ) -> list[dict[str, typing.Any]]
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Create a message for Qwen models.
  </Indent>
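  The return type suggests the common chat-message shape used by Qwen-VL models: one user turn whose content interleaves a video placeholder and the text prompt. The structure below is a hand-built sketch of that format, not the method's actual output:

  ```python
  from typing import Any

  def create_qwen_message(prompt: str) -> list[dict[str, Any]]:
      """Sketch of a Qwen-VL-style chat message (format assumed, not
      taken from the library): a single user turn with video + text."""
      return [
          {
              "role": "user",
              "content": [
                  {"type": "video"},
                  {"type": "text", "text": prompt},
              ],
          }
      ]

  message = create_qwen_message("Describe the clip.")
  print(message[0]["role"])  # user
  ```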

  <Anchor id="nemo_curator-models-prompt_formatter-PromptFormatter-_generate_nemotron_inputs">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.models.prompt_formatter.PromptFormatter._generate_nemotron_inputs(
          prompt: str,
          video_inputs: torch.Tensor | numpy.ndarray | None,
          fps: float
      ) -> dict[str, typing.Any]
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Generate inputs for Nemotron models.

    Nemotron requires video metadata (fps, frames\_indices) for vLLM processing.
  </Indent>
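  As an illustration of the kind of metadata involved, the hypothetical helper below builds a payload with the two fields named in the docstring (`fps`, `frames_indices`), assuming frames were sampled uniformly; the `duration` field and the exact structure vLLM expects are assumptions here:

  ```python
  from typing import Any

  def build_video_metadata(num_frames: int, fps: float) -> dict[str, Any]:
      """Hypothetical sketch of per-video metadata for Nemotron/vLLM.

      Assumes `num_frames` frames sampled uniformly at `fps`; the real
      payload layout is defined by the vLLM integration, not shown here.
      """
      return {
          "fps": fps,
          "frames_indices": list(range(num_frames)),
          "duration": num_frames / fps,  # seconds spanned by the samples
      }

  meta = build_video_metadata(num_frames=16, fps=2.0)
  print(meta["duration"])  # 8.0
  ```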

  <Anchor id="nemo_curator-models-prompt_formatter-PromptFormatter-_generate_qwen_inputs">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.models.prompt_formatter.PromptFormatter._generate_qwen_inputs(
          prompt: str,
          video_inputs: torch.Tensor | None,
          override_text_prompt: bool,
          fps: float = 2.0
      ) -> dict[str, typing.Any]
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Generate inputs for Qwen models.
  </Indent>

  <Anchor id="nemo_curator-models-prompt_formatter-PromptFormatter-generate_inputs">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.models.prompt_formatter.PromptFormatter.generate_inputs(
          prompt: str,
          video_inputs: torch.Tensor | numpy.ndarray | None = None,
          override_text_prompt: bool = False,
          fps: float = 2.0
      ) -> dict[str, typing.Any]
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Generate inputs for video and text data based on prompt\_variant.

    **Parameters:**

    <ParamField path="prompt" type="str">
      Text prompt to be included with the input.
    </ParamField>

    <ParamField path="video_inputs" type="torch.Tensor | np.ndarray | None" default="None">
      Pre-processed video inputs (tensor or numpy array).
    </ParamField>

    <ParamField path="override_text_prompt" type="bool" default="False">
      Whether to regenerate the text prompt even if cached.
    </ParamField>

    <ParamField path="fps" type="float" default="2.0">
      Frames per second of the input video (used for Nemotron metadata).
    </ParamField>

    **Returns:** `dict[str, Any]`

    dict containing:

    * "prompt": The processed text prompt with chat template applied
    * "multi\_modal\_data": Dictionary containing processed "video" inputs
  </Indent>
</Indent>
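<Indent>
  Putting it together, a call such as `PromptFormatter("qwen2.5").generate_inputs(prompt, frames)` should yield a dict with the two documented keys. The snippet below is a hand-built illustration of that return shape, not actual library output (the real `"prompt"` value comes from the model's chat template):

  ```python
  import numpy as np

  frames = np.zeros((8, 224, 224, 3), dtype=np.uint8)  # (T, H, W, C) video

  # Hand-built stand-in for the documented return value of generate_inputs.
  result = {
      "prompt": "<templated prompt text>",
      "multi_modal_data": {"video": frames},
  }

  print(sorted(result))  # ['multi_modal_data', 'prompt']
  ```
</Indent>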

<Anchor id="nemo_curator-models-prompt_formatter-VARIANT_MAPPING">
  <CodeBlock showLineNumbers={false} wordWrap={true}>
    ```python
    nemo_curator.models.prompt_formatter.VARIANT_MAPPING: dict[str, str] = {'qwen2.5': 'Qwen/Qwen2.5-VL-7B-Instruct', 'qwen3': 'Qwen/Qwen3-VL-8B-Instruct',...
    ```
  </CodeBlock>
</Anchor>
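<Indent>
  The mapping resolves a short variant key (the `prompt_variant` constructor argument) to a HuggingFace checkpoint ID for `AutoProcessor.from_pretrained()`. A sketch of such a lookup, using only the two entries visible in the truncated listing above:

  ```python
  # Only the two entries shown above; the full mapping is truncated in this listing.
  VARIANT_MAPPING = {
      "qwen2.5": "Qwen/Qwen2.5-VL-7B-Instruct",
      "qwen3": "Qwen/Qwen3-VL-8B-Instruct",
  }

  def resolve_checkpoint(prompt_variant: str) -> str:
      """Look up the HF checkpoint for a variant, failing loudly on typos."""
      try:
          return VARIANT_MAPPING[prompt_variant]
      except KeyError:
          raise ValueError(
              f"unknown prompt_variant {prompt_variant!r}; "
              f"expected one of {sorted(VARIANT_MAPPING)}"
          ) from None

  print(resolve_checkpoint("qwen3"))  # Qwen/Qwen3-VL-8B-Instruct
  ```
</Indent>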