---
layout: overview
slug: nemo-curator/nemo_curator/models/prompt_formatter
title: nemo_curator.models.prompt_formatter
---

## Module Contents

### Classes

| Name                                                                       | Description |
| -------------------------------------------------------------------------- | ----------- |
| [`PromptFormatter`](#nemo_curator-models-prompt_formatter-PromptFormatter) | Formats text and video inputs into model prompts based on a prompt variant. |

### Data

[`VARIANT_MAPPING`](#nemo_curator-models-prompt_formatter-VARIANT_MAPPING)

### API

<Anchor id="nemo_curator-models-prompt_formatter-PromptFormatter">
  <CodeBlock showLineNumbers={false} wordWrap={true}>
    ```python
    class nemo_curator.models.prompt_formatter.PromptFormatter(
        prompt_variant: str
    )
    ```
  </CodeBlock>
</Anchor>

<Indent>
  <ParamField path="processor" />

  <Anchor id="nemo_curator-models-prompt_formatter-PromptFormatter-create_message">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.models.prompt_formatter.PromptFormatter.create_message(
          prompt: str
      ) -> list[dict[str, typing.Any]]
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Create a message.

    **Parameters:**

    <ParamField path="prompt" type="str">
      The text prompt to create a message for.
    </ParamField>

    **Returns:** `list[dict[str, Any]]`

    List of messages for the VLM model including the text prompt and video.
  </Indent>
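The documented return type is `list[dict[str, Any]]`. As a hedged illustration only, the sketch below builds a message list in the shape commonly used for VLM chat inputs; the exact keys (`"role"`, `"content"`, `"type"`) are assumptions, not confirmed by the source.

```python
from typing import Any


def create_message_sketch(prompt: str) -> list[dict[str, Any]]:
    # Hypothetical sketch: a single user turn carrying a video placeholder
    # and the text prompt. The key names are assumptions based on common
    # VLM chat formats, not taken from nemo_curator itself.
    return [
        {
            "role": "user",
            "content": [
                {"type": "video"},                 # placeholder for the video input
                {"type": "text", "text": prompt},  # the text prompt
            ],
        }
    ]


messages = create_message_sketch("Describe what happens in this clip.")
```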

  <Anchor id="nemo_curator-models-prompt_formatter-PromptFormatter-generate_inputs">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.models.prompt_formatter.PromptFormatter.generate_inputs(
          prompt: str,
          video_inputs: torch.Tensor | None = None,
          override_text_prompt: bool = False
      ) -> dict[str, typing.Any]
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Generate inputs for video and text data based on `prompt_variant`.

    Processes video and text inputs to create the input for the model. It handles both video and
    image inputs, decoding video and applying preprocessing if needed, and creates a structured
    input dictionary containing the processed prompt and multimodal data.

    **Parameters:**

    <ParamField path="prompt" type="str">
      Text prompt to be included with the input.
    </ParamField>

    <ParamField path="video_inputs" type="torch.Tensor | None" default="None">
      Pre-processed video inputs. Cannot be None when video data is to be
      passed to the model.
    </ParamField>

    <ParamField path="override_text_prompt" type="bool" default="False">
      Whether the text prompt should be overridden.
    </ParamField>

    **Returns:** `dict[str, Any]`

    dict containing:

    * "prompt": The processed text prompt with chat template applied
    * "multi_modal_data": Dictionary containing processed "image" and/or "video" inputs
  </Indent>
</Indent>
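The return shape documented for `generate_inputs` can be sketched as follows. This is a hedged illustration of the dict structure only: the chat-template step is elided, `generate_inputs_sketch` is a hypothetical stand-in, and a plain nested list substitutes for the `torch.Tensor` video input.

```python
from typing import Any


def generate_inputs_sketch(
    prompt: str,
    video_inputs=None,
    override_text_prompt: bool = False,
) -> dict[str, Any]:
    # Real code applies the processor's chat template to the prompt;
    # here it is passed through unchanged for illustration.
    multi_modal_data: dict[str, Any] = {}
    if video_inputs is not None:
        multi_modal_data["video"] = video_inputs
    return {"prompt": prompt, "multi_modal_data": multi_modal_data}


# A nested list stands in for a torch.Tensor of video frames.
inputs = generate_inputs_sketch("Describe the clip.", video_inputs=[[0.0] * 4])
```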

<Anchor id="nemo_curator-models-prompt_formatter-VARIANT_MAPPING">
  <CodeBlock showLineNumbers={false} wordWrap={true}>
    ```python
    nemo_curator.models.prompt_formatter.VARIANT_MAPPING = {'qwen': 'Qwen/Qwen2.5-VL-7B-Instruct'}
    ```
  </CodeBlock>
</Anchor>
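For illustration, a prompt variant name resolves to a Hugging Face model ID through this mapping. The `resolve_variant` helper below is hypothetical; it is not necessarily how `PromptFormatter` handles unknown variants.

```python
VARIANT_MAPPING = {"qwen": "Qwen/Qwen2.5-VL-7B-Instruct"}


def resolve_variant(prompt_variant: str) -> str:
    # Hypothetical helper: map a variant name to its model ID,
    # failing loudly on unknown variants.
    try:
        return VARIANT_MAPPING[prompt_variant]
    except KeyError as e:
        raise ValueError(f"Unknown prompt variant: {prompt_variant!r}") from e


model_id = resolve_variant("qwen")
```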
