bridge.models.nemotron_vl.modeling_nemotron_vl#
Module Contents#
Classes#
| Class | Description |
|---|---|
| NemotronVLModel | A stub Megatron implementation of a Nemotron Vision-Language model. |
API#
- class bridge.models.nemotron_vl.modeling_nemotron_vl.NemotronVLModel(
- config: Optional[megatron.bridge.models.nemotron_vl.nemotron_vl_provider.NemotronNano12Bv2VLModelProvider] = None,
- *,
- llava_model: Optional[megatron.core.models.multimodal.llava_model.LLaVAModel] = None,
- pre_process: bool | None = True,
- post_process: bool | None = True,
- vp_stage: Optional[int] = None,
- )
Bases: megatron.core.transformer.module.MegatronModule

A stub Megatron implementation of a Nemotron Vision-Language model.
At the moment the class only supports language-only forward passes. Vision inputs will raise
`NotImplementedError` until a reference vision encoder is open-sourced.

Initialization
Create a wrapper that exposes an existing :class:`LLaVAModel` via the Bridge API.

Parameters:

- llava_model: A fully-assembled instance of :class:`~megatron.core.models.multimodal.llava_model.LLaVAModel`.
- config: (Optional) The provider used to generate the model. If omitted, we fall back to `llava_model.config`.

- set_input_tensor(input_tensor)#
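`set_input_tensor` is the standard Megatron hook used under pipeline parallelism: the schedule stores the previous stage's activations on the module so that the next `forward` call consumes them instead of embedding the input tokens itself. A minimal sketch, assuming `model` is a constructed `NemotronVLModel` (see the sketch under `forward` below) acting as a non-first pipeline stage; the tensor name and shape are illustrative, not from the source:

```python
import torch

# Illustrative activations handed over from the previous pipeline stage.
# [sequence length, micro-batch size, hidden size] is the usual Megatron layout.
prev_stage_activations = torch.empty(128, 1, 4096)

model.set_input_tensor(prev_stage_activations)
# A subsequent model(...) call on this stage now reads prev_stage_activations
# rather than computing embeddings from input_ids.
```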
- forward(*args, **kwargs)#
Delegate the forward pass to the wrapped :class:`LLaVAModel`.
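A minimal usage sketch: wrap an already-assembled `LLaVAModel` and run a language-only forward pass. Building the `LLaVAModel` itself is elided behind a hypothetical `build_llava_model()` helper, since it requires a full Megatron parallel-state and config setup; the keyword names in the call mirror `LLaVAModel.forward` and are assumptions to be checked against the installed megatron-core version.

```python
import torch

from megatron.bridge.models.nemotron_vl.modeling_nemotron_vl import NemotronVLModel

llava = build_llava_model()              # hypothetical helper; any assembled LLaVAModel works
model = NemotronVLModel(llava_model=llava)  # config is omitted, so it falls back to llava.config

seq_len, batch = 128, 1
input_ids = torch.randint(0, 32000, (batch, seq_len))            # illustrative vocab size
position_ids = torch.arange(seq_len).unsqueeze(0)

# Language-only pass: forward() delegates directly to the wrapped LLaVAModel.
# Passing real image tensors would raise NotImplementedError.
logits = model(
    images=None,
    input_ids=input_ids,
    position_ids=position_ids,
    attention_mask=None,   # causal masking is typically applied internally
)
```

A real run additionally requires Megatron's distributed and model-parallel state to be initialized before the model is built.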
- freeze(
- *,
- freeze_language_model: bool = False,
- freeze_vision_model: bool = False,
- freeze_vision_projection: bool = False,
- )
Freeze selected sub-modules by turning off `requires_grad`.
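A hedged sketch of selective freezing, assuming `model` is the `NemotronVLModel` instance from the sketch above; the flag names come from the signature, while the sanity-check loop is purely illustrative:

```python
# Freeze the language model and vision encoder; leave the vision projection trainable.
model.freeze(
    freeze_language_model=True,
    freeze_vision_model=True,
    freeze_vision_projection=False,
)

# Sanity check: only projection parameters should still require gradients.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors remain")
```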