nemo_automodel.components.models.qwen3_vl_moe.model#
Module Contents#
Classes#
| Fp32SafeQwen3VLMoeTextRotaryEmbedding | Ensure inv_freq stays in float32. |
| Fp32SafeQwen3VLMoeVisionRotaryEmbedding | Ensure the vision rotary inv_freq buffer remains float32. |
| Qwen3VLMoeTextModelBackend | Qwen3-VL text decoder rebuilt on top of the Qwen3-MoE block implementation. |
| Qwen3VLMoeForConditionalGeneration | Qwen3-VL conditional generation model using the Qwen3-MoE backend components. |
Data#
API#
- class nemo_automodel.components.models.qwen3_vl_moe.model.Fp32SafeQwen3VLMoeTextRotaryEmbedding#
Bases:
transformers.models.qwen3_vl_moe.modeling_qwen3_vl_moe.Qwen3VLMoeTextRotaryEmbedding

Ensure inv_freq stays in float32.
- _apply(fn: Any, recurse: bool = True)#
- class nemo_automodel.components.models.qwen3_vl_moe.model.Fp32SafeQwen3VLMoeVisionRotaryEmbedding#
Bases:
transformers.models.qwen3_vl_moe.modeling_qwen3_vl_moe.Qwen3VLMoeVisionRotaryEmbedding

Ensure the vision rotary inv_freq buffer remains float32.
- _apply(fn: Any, recurse: bool = True)#
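Both Fp32Safe rotary classes override torch.nn.Module._apply so that module-wide dtype casts (for example model.to(torch.bfloat16)) cannot downcast the rotary inv_freq buffer. The sketch below illustrates that general pattern only; the class name, dimensions, and buffer setup are illustrative and not the library's implementation.

```python
import torch
from torch import nn


class Fp32SafeRotarySketch(nn.Module):
    """Illustrative pattern: keep a rotary inv_freq buffer in float32 across dtype casts."""

    def __init__(self, dim: int = 64, base: float = 10000.0):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    def _apply(self, fn, recurse: bool = True):
        # Let nn.Module move/cast parameters and buffers as usual ...
        module = super()._apply(fn, recurse=recurse)
        # ... then force the frequency buffer back to float32 so half-precision
        # casts (e.g. model.to(torch.bfloat16)) do not degrade rotary precision.
        module.inv_freq = module.inv_freq.to(torch.float32)
        return module


rope = Fp32SafeRotarySketch().to(torch.bfloat16)
assert rope.inv_freq.dtype == torch.float32
```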
- class nemo_automodel.components.models.qwen3_vl_moe.model.Qwen3VLMoeModel#
Bases:
transformers.models.qwen3_vl_moe.modeling_qwen3_vl_moe.Qwen3VLMoeModel

- property layers#
- property embed_tokens#
- property norm#
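The layers, embed_tokens, and norm properties suggest that Qwen3VLMoeModel keeps the familiar Hugging Face attribute layout by delegating to an inner text model. A minimal sketch of that delegation pattern follows; the language_model attribute name and the wrapped class are assumptions, not the actual implementation.

```python
from torch import nn


class DelegatingVLModelSketch(nn.Module):
    """Hypothetical sketch: expose a wrapped text model's submodules via read-only properties."""

    def __init__(self, language_model: nn.Module):
        super().__init__()
        # Assumed attribute name; the real class may store the text backend differently.
        self.language_model = language_model

    @property
    def layers(self):
        return self.language_model.layers

    @property
    def embed_tokens(self):
        return self.language_model.embed_tokens

    @property
    def norm(self):
        return self.language_model.norm
```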
- class nemo_automodel.components.models.qwen3_vl_moe.model.Qwen3VLMoeTextModelBackend(
- config: transformers.models.qwen3_vl_moe.configuration_qwen3_vl_moe.Qwen3VLMoeTextConfig,
- backend: nemo_automodel.components.moe.utils.BackendConfig,
- *,
- moe_config: nemo_automodel.components.moe.layers.MoEConfig | None = None,
)#

Bases:
torch.nn.Module

Qwen3-VL text decoder rebuilt on top of the Qwen3-MoE block implementation.
Initialization
- forward(
- input_ids: torch.Tensor | None = None,
- *,
- inputs_embeds: torch.Tensor | None = None,
- attention_mask: torch.Tensor | None = None,
- position_ids: torch.Tensor | None = None,
- cache_position: torch.Tensor | None = None,
- visual_pos_masks: torch.Tensor | None = None,
- deepstack_visual_embeds: list[torch.Tensor] | None = None,
- padding_mask: torch.Tensor | None = None,
- past_key_values: Any | None = None,
- use_cache: bool | None = None,
- **attn_kwargs: Any,
)#
- _deepstack_process(
- hidden_states: torch.Tensor,
- visual_pos_masks: torch.Tensor | None,
- visual_embeds: torch.Tensor,
)#
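Going by its parameters, _deepstack_process merges per-level visual embeddings into the hidden states at the positions flagged by visual_pos_masks. The snippet below is only a guess at that behaviour, written as a standalone helper; the real method may differ in details such as in-place updates or dtype handling.

```python
import torch


def deepstack_merge_sketch(
    hidden_states: torch.Tensor,     # (batch, seq_len, hidden)
    visual_pos_masks: torch.Tensor,  # (batch, seq_len) bool mask of visual token positions
    visual_embeds: torch.Tensor,     # (num_visual_tokens, hidden)
) -> torch.Tensor:
    # Assumed behaviour: add the visual embeddings onto the hidden states of
    # the visual token positions, leaving text positions untouched.
    merged = hidden_states.clone()
    merged[visual_pos_masks] = merged[visual_pos_masks] + visual_embeds.to(merged.dtype)
    return merged
```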
- get_input_embeddings() → torch.nn.Module#
- set_input_embeddings(value: torch.nn.Module) → None#
- init_weights(buffer_device: torch.device | None = None) → None#
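A minimal construction-and-forward sketch for the text backend, assuming Qwen3VLMoeTextConfig and BackendConfig are default-constructible; both defaults are placeholders, so adjust them to your actual setup.

```python
import torch
from transformers.models.qwen3_vl_moe.configuration_qwen3_vl_moe import Qwen3VLMoeTextConfig

from nemo_automodel.components.models.qwen3_vl_moe.model import Qwen3VLMoeTextModelBackend
from nemo_automodel.components.moe.utils import BackendConfig

config = Qwen3VLMoeTextConfig()  # default (full-size) config; downsize for quick experiments
backend = BackendConfig()        # assumes the defaults fit your environment
model = Qwen3VLMoeTextModelBackend(config, backend)
model.init_weights()

input_ids = torch.randint(0, config.vocab_size, (1, 16))
# The extras (attention_mask, position_ids, visual_pos_masks, ...) are keyword-only
# and default to None, so a plain text-only forward pass needs just input_ids.
output = model(input_ids=input_ids)
```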
- class nemo_automodel.components.models.qwen3_vl_moe.model.Qwen3VLMoeForConditionalGeneration(
- config: transformers.models.qwen3_vl_moe.configuration_qwen3_vl_moe.Qwen3VLMoeConfig,
- moe_config: nemo_automodel.components.moe.layers.MoEConfig | None = None,
- backend: nemo_automodel.components.moe.utils.BackendConfig | None = None,
- **kwargs,
)#

Bases:
transformers.models.qwen3_vl_moe.modeling_qwen3_vl_moe.Qwen3VLMoeForConditionalGeneration, nemo_automodel.components.moe.fsdp_mixin.MoEFSDPSyncMixin

Qwen3-VL conditional generation model using the Qwen3-MoE backend components.
Initialization
- classmethod from_config(
- config: transformers.models.qwen3_vl_moe.configuration_qwen3_vl_moe.Qwen3VLMoeConfig,
- moe_config: nemo_automodel.components.moe.layers.MoEConfig | None = None,
- backend: nemo_automodel.components.moe.utils.BackendConfig | None = None,
- **kwargs,
)#
- classmethod from_pretrained(
- pretrained_model_name_or_path: str,
- *model_args,
- **kwargs,
)#
- forward(
- input_ids: torch.Tensor | None = None,
- *,
- position_ids: torch.Tensor | None = None,
- attention_mask: torch.Tensor | None = None,
- padding_mask: torch.Tensor | None = None,
- inputs_embeds: torch.Tensor | None = None,
- cache_position: torch.Tensor | None = None,
- **kwargs: Any,
)#
- initialize_weights(
- buffer_device: torch.device | None = None,
- dtype: torch.dtype = torch.bfloat16,
)#
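Two hedged entry points, following the classmethods documented above: building a fresh model from a config and then materializing weights, or loading a pretrained checkpoint. The repository id is a placeholder, and default-constructing Qwen3VLMoeConfig is assumed to work but yields a full-size model.

```python
import torch
from transformers.models.qwen3_vl_moe.configuration_qwen3_vl_moe import Qwen3VLMoeConfig

from nemo_automodel.components.models.qwen3_vl_moe.model import Qwen3VLMoeForConditionalGeneration

# Fresh weights: build from a config, then initialize parameters/buffers.
config = Qwen3VLMoeConfig()  # downsized or customized configs also go here
model = Qwen3VLMoeForConditionalGeneration.from_config(config)
model.initialize_weights(dtype=torch.bfloat16)

# Pretrained weights: the checkpoint id below is illustrative only.
pretrained = Qwen3VLMoeForConditionalGeneration.from_pretrained(
    "Qwen/Qwen3-VL-30B-A3B-Instruct"
)
```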
- nemo_automodel.components.models.qwen3_vl_moe.model.ModelClass#
None