core.models.mimo.config.base_configs

Module Contents

Classes

MimoModelConfig

Configuration for a multi-modal model.

API

class core.models.mimo.config.base_configs.MimoModelConfig

Configuration for a multi-modal model.

Parameters:
  • language_model_spec (ModuleSpec) – Specification for the language model.

  • modality_submodules_spec (Dict[str, ModuleSpec]) – Dictionary mapping modality names to their submodule specifications.

  • special_token_ids (Dict[str, int]) – Dictionary mapping modality names to their special token IDs, for example {"vision": -200, "audio": 32000}. These IDs act as placeholders in input_ids, marking the positions where the corresponding modality embeddings are inserted.
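
To illustrate how placeholder IDs in special_token_ids mark insertion points, the following is a minimal sketch in plain Python. It is not the library's implementation: the helper name find_placeholder_positions and the sample input_ids sequence are hypothetical, and only the example mapping {"vision": -200, "audio": 32000} comes from the parameter description above.

```python
from typing import Dict, List

# Example mapping from the parameter description above.
special_token_ids: Dict[str, int] = {"vision": -200, "audio": 32000}


def find_placeholder_positions(
    input_ids: List[int], special_token_ids: Dict[str, int]
) -> Dict[str, List[int]]:
    """Hypothetical helper: for each modality, list the indices in
    input_ids that hold that modality's placeholder token ID."""
    return {
        modality: [i for i, tok in enumerate(input_ids) if tok == token_id]
        for modality, token_id in special_token_ids.items()
    }


# Hypothetical token sequence: two vision placeholders and one audio placeholder.
input_ids = [1, 15, -200, -200, 42, 32000, 2]
positions = find_placeholder_positions(input_ids, special_token_ids)
print(positions)  # {'vision': [2, 3], 'audio': [5]}
```

A model consuming this configuration would presumably write the vision embeddings at indices 2 and 3 and the audio embedding at index 5, leaving the remaining positions to the ordinary text embeddings.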

language_model_spec: megatron.core.transformer.spec_utils.ModuleSpec

‘field(…)’

modality_submodules_spec: Dict[str, megatron.core.transformer.spec_utils.ModuleSpec]

‘field(…)’

special_token_ids: Dict[str, int]

‘field(…)’
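
The shape of the three fields can be sketched with stand-in dataclasses so that the example runs without Megatron installed. Everything named StandIn… below is hypothetical, as are the placeholder module names and the check_modalities_consistent helper; the real MimoModelConfig uses megatron.core.transformer.spec_utils.ModuleSpec, and its field defaults (shown above only as field(…)) are not reproduced here. The consistency check is an assumption about sensible usage, not a documented requirement.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class StandInModuleSpec:
    """Hypothetical stand-in for ModuleSpec; holds only a module reference."""
    module: Any


@dataclass
class StandInMimoModelConfig:
    """Hypothetical mirror of the three documented fields.
    All fields are required here; the real defaults are not known from this page."""
    language_model_spec: StandInModuleSpec
    modality_submodules_spec: Dict[str, StandInModuleSpec]
    special_token_ids: Dict[str, int]


def check_modalities_consistent(config: StandInMimoModelConfig) -> bool:
    """Assumed sanity check: every modality with a submodule spec also has
    a placeholder token ID, and vice versa."""
    return set(config.modality_submodules_spec) == set(config.special_token_ids)


# Placeholder strings stand in for real module classes.
config = StandInMimoModelConfig(
    language_model_spec=StandInModuleSpec(module="language-model-placeholder"),
    modality_submodules_spec={
        "vision": StandInModuleSpec(module="vision-submodule-placeholder"),
        "audio": StandInModuleSpec(module="audio-submodule-placeholder"),
    },
    special_token_ids={"vision": -200, "audio": 32000},
)
print(check_modalities_consistent(config))  # True
```

Keying both dictionaries by the same modality names keeps each submodule paired with the placeholder ID that marks where its embeddings belong.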