nemo_curator.models.nemotron_h_vl


Module Contents

Classes

| Name | Description |
| --- | --- |
| LLM | - |
| NemotronHVL | NemotronH hybrid Mamba-Attention VLM for video captioning. |
| SamplingParams | - |

Data

EXPECTED_VIDEO_TAG_PARTS

NemotronVariant

VIDEO_TAG_SPLIT_MAX

VLLM_AVAILABLE

_NEMOTRON_REVISION_INFO

_NEMOTRON_VARIANTS_INFO

API

class nemo_curator.models.nemotron_h_vl.LLM()
class nemo_curator.models.nemotron_h_vl.NemotronHVL(
model_dir: str,
model_variant: nemo_curator.models.nemotron_h_vl.NemotronVariant = 'nemotron',
caption_batch_size: int = 8,
max_output_tokens: int = 512,
stage2_prompt_text: str | None = None,
verbose: bool = False
)

Bases: ModelInterface

NemotronH hybrid Mamba-Attention VLM for video captioning.

Supports multiple checkpoint variants from HuggingFace:

  • nemotron / nemotron-bf16: BF16 precision (default)
  • nemotron-fp8: FP8 quantized
  • nemotron-nvfp4: NVFP4 quantized

Models are automatically downloaded from HuggingFace on first use.

_hf_model_id = _NEMOTRON_VARIANTS_INFO[self._normalized_variant]

_normalized_variant: NemotronVariant = 'nemotron-bf16'

model_id_names: list[str]

Return HuggingFace model ID for the selected variant.

stage2_prompt

weight_file = str(Path(model_dir) / self._hf_model_id)
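
The `weight_file` expression joins the base directory with the HuggingFace model ID, so the weights land in a nested `<org>/<model>` subdirectory. The following is a minimal sketch of that join, not the library's code: `weight_dir` and `VARIANT_TO_MODEL_ID` are hypothetical names, and only the `'nemotron'` entry is reproduced because the other entries are truncated in this reference.

```python
from pathlib import Path

# Hypothetical stand-in for _NEMOTRON_VARIANTS_INFO. Only the 'nemotron'
# entry is fully visible in this reference, so the others are omitted.
VARIANT_TO_MODEL_ID = {
    "nemotron": "nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16",
}


def weight_dir(model_dir: str, hf_model_id: str) -> str:
    """Join the base directory and the model ID the way weight_file does."""
    # The "/" inside the model ID becomes a directory separator, so the
    # result is <model_dir>/<org>/<model>.
    return str(Path(model_dir) / hf_model_id)


print(weight_dir("/models", VARIANT_TO_MODEL_ID["nemotron"]))
```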
nemo_curator.models.nemotron_h_vl.NemotronHVL._refine_caption_prompt(
original_prompt: str,
refinement_text: str
) -> str

Create a refined prompt for stage 2 captioning.

nemo_curator.models.nemotron_h_vl.NemotronHVL.download_weights_on_node(
model_dir: str,
variant: nemo_curator.models.nemotron_h_vl.NemotronVariant = 'nemotron'
) -> None
classmethod

Download NemotronH VL weights from HuggingFace.

Models are automatically downloaded from HuggingFace Hub on first use. Supports multiple quantization variants for different performance/memory tradeoffs.

Parameters:

model_dir
str

Base directory for model weights. The model will be downloaded to a subdirectory named after the HuggingFace model ID.

variant
NemotronVariant. Defaults to 'nemotron'.

Model variant to download. Options:

  • 'nemotron' or 'nemotron-bf16': BF16 precision (default)
  • 'nemotron-fp8': FP8 quantized
  • 'nemotron-nvfp4': NVFP4 quantized

nemo_curator.models.nemotron_h_vl.NemotronHVL.generate(
videos: list[dict[str, typing.Any]],
generate_stage2_caption: bool = False,
batch_size: int = 16
) -> list[str]
nemo_curator.models.nemotron_h_vl.NemotronHVL.setup() -> None
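
The signatures above can be combined into a captioning call roughly as follows. This is a hedged sketch: `caption_videos` is a hypothetical helper, the call order (download, construct, `setup()`, `generate()`) is an assumption rather than something this reference states, and the keys expected in each `videos` dict are not documented here, so the list is passed through unchanged.

```python
from typing import Any


def caption_videos(model_dir: str, videos: list[dict[str, Any]]) -> list[str]:
    """Caption prepared video inputs with the FP8 variant (assumed workflow)."""
    # Imported lazily so the sketch can be loaded without NeMo Curator installed.
    from nemo_curator.models.nemotron_h_vl import NemotronHVL

    # Fetch the weights for the chosen variant into model_dir.
    NemotronHVL.download_weights_on_node(model_dir, variant="nemotron-fp8")

    # Arguments mirror the constructor signature; the values are its defaults
    # except for model_variant.
    model = NemotronHVL(
        model_dir,
        model_variant="nemotron-fp8",
        caption_batch_size=8,
        max_output_tokens=512,
    )

    # Assumption: setup() must run before generate().
    model.setup()

    return model.generate(videos, generate_stage2_caption=False, batch_size=16)
```

Passing `generate_stage2_caption=True` would request the stage 2 caption; the `stage2_prompt_text` constructor argument presumably feeds that second pass.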
class nemo_curator.models.nemotron_h_vl.SamplingParams()
nemo_curator.models.nemotron_h_vl.EXPECTED_VIDEO_TAG_PARTS = 2
nemo_curator.models.nemotron_h_vl.NemotronVariant = Literal['nemotron', 'nemotron-bf16', 'nemotron-fp8', 'nemotron-nvfp4']
nemo_curator.models.nemotron_h_vl.VIDEO_TAG_SPLIT_MAX = 1
nemo_curator.models.nemotron_h_vl.VLLM_AVAILABLE = True
nemo_curator.models.nemotron_h_vl._NEMOTRON_REVISION_INFO: Final = {'nemotron': '5d250e2e111dc5e1434131bdf3d590c27a878ade', 'nemotron-bf16': '5d250...
nemo_curator.models.nemotron_h_vl._NEMOTRON_VARIANTS_INFO: Final = {'nemotron': 'nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16', 'nemotron-bf16': 'nvi...
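
Because `NemotronVariant` is a `typing.Literal`, its allowed values can be read back at runtime with `typing.get_args`. The sketch below uses that to reject unknown variant strings early; `check_variant` is a hypothetical helper, and the alias is redefined locally so the example runs without NeMo Curator installed.

```python
from typing import Literal, get_args

# Local copy of the alias shown in this reference.
NemotronVariant = Literal["nemotron", "nemotron-bf16", "nemotron-fp8", "nemotron-nvfp4"]


def check_variant(value: str) -> str:
    """Return value unchanged if it names a known variant, else raise."""
    allowed = get_args(NemotronVariant)  # tuple of the four literal strings
    if value not in allowed:
        raise ValueError(f"unknown variant {value!r}; expected one of {allowed}")
    return value


print(check_variant("nemotron-fp8"))
```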