bridge.models.mimo.mimo_config#
Module Contents#
Classes#
ModuleParallelismConfig | Parallelism config for a single module in a MIMO model.
MimoParallelismConfig | Configuration for multi-module (MIMO) heterogeneous parallelism.
API#
- class bridge.models.mimo.mimo_config.ModuleParallelismConfig#
Parallelism config for a single module in a MIMO model.
- tensor_model_parallel_size: int#
1
- pipeline_model_parallel_size: int#
1
- context_parallel_size: int#
1
- expert_tensor_parallel_size: int#
1
- data_parallel_size: Optional[int]#
None
- rank_offset: int#
0
- property total_model_parallel_size: int#
- property total_ranks: int#
- finalize(world_size: Optional[int]) → None#
Compute data_parallel_size if unset, and validate parallelism constraints.
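The relationship between the fields above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class `SketchModuleParallelism` is hypothetical, and it assumes that the model-parallel size is the product of the TP, PP, and CP sizes and that an unset `data_parallel_size` is derived by dividing the world size by that product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchModuleParallelism:
    """Hypothetical stand-in for ModuleParallelismConfig (illustration only)."""

    tensor_model_parallel_size: int = 1
    pipeline_model_parallel_size: int = 1
    context_parallel_size: int = 1
    data_parallel_size: Optional[int] = None
    rank_offset: int = 0

    @property
    def total_model_parallel_size(self) -> int:
        # Assumption: model-parallel size is TP * PP * CP.
        return (
            self.tensor_model_parallel_size
            * self.pipeline_model_parallel_size
            * self.context_parallel_size
        )

    @property
    def total_ranks(self) -> int:
        # Assumption: a module occupies model-parallel size * DP ranks.
        if self.data_parallel_size is None:
            raise ValueError("data_parallel_size is unset; call finalize() first")
        return self.total_model_parallel_size * self.data_parallel_size

    def finalize(self, world_size: Optional[int]) -> None:
        # Derive DP only when it was left unset, then check divisibility.
        mp = self.total_model_parallel_size
        if self.data_parallel_size is None:
            if world_size is None:
                raise ValueError("world_size is required to derive data_parallel_size")
            if world_size % mp != 0:
                raise ValueError(f"world_size {world_size} is not divisible by {mp}")
            self.data_parallel_size = world_size // mp


# TP=2 and PP=2 on 8 ranks leaves room for 2 data-parallel replicas.
cfg = SketchModuleParallelism(tensor_model_parallel_size=2, pipeline_model_parallel_size=2)
cfg.finalize(world_size=8)
```

Under these assumptions, `cfg.data_parallel_size` is 2 and `cfg.total_ranks` is 8 after `finalize`.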
- class bridge.models.mimo.mimo_config.MimoParallelismConfig#
Configuration for multi-module (MIMO) heterogeneous parallelism.
Note: Phase 1 supports only heterogeneous deployment, in which each module can have its own parallelism configuration and rank offset.
The language module must be named MIMO_LANGUAGE_MODULE_KEY (“language”) in module_parallelisms.
- module_parallelisms: dict[str, bridge.models.mimo.mimo_config.ModuleParallelismConfig]#
None
- special_token_ids: dict[str, int]#
‘field(…)’
- get_parallelism(module_name: str)
- property module_names: list[str]#
- property total_world_size: int#
Compute total world size from module rank ranges.
- _validate_heterogeneous() → None#
Validate heterogeneous deployment: no overlapping rank ranges.
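A rank-range overlap check of this kind might look like the sketch below. It is an illustration under stated assumptions, not the library's code: the helpers `rank_range`, `find_overlaps`, and `total_world_size` are hypothetical, each module is assumed to occupy the half-open range `[rank_offset, rank_offset + total_ranks)`, and the total world size is assumed to be the largest range end.

```python
from typing import Dict, List, Tuple

RankRange = Tuple[int, int]  # half-open: [start, end)


def rank_range(rank_offset: int, total_ranks: int) -> RankRange:
    """Ranks occupied by one module (assumed layout)."""
    return rank_offset, rank_offset + total_ranks


def find_overlaps(ranges: Dict[str, RankRange]) -> List[Tuple[str, str]]:
    """Return neighbouring module pairs whose rank ranges overlap.

    After sorting by start rank, any overlap in the set implies that at
    least one adjacent pair overlaps, so an empty result means the ranges
    are disjoint.
    """
    ordered = sorted(ranges.items(), key=lambda item: item[1][0])
    overlaps = []
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        if start_b < end_a:
            overlaps.append((name_a, name_b))
    return overlaps


def total_world_size(ranges: Dict[str, RankRange]) -> int:
    """Largest range end across all modules (assumed definition)."""
    return max(end for _, end in ranges.values())


# "vision" is a hypothetical second module name; "language" is required by the config.
ok = {"language": rank_range(0, 8), "vision": rank_range(8, 4)}
bad = {"language": rank_range(0, 8), "vision": rank_range(6, 4)}
```

Here `find_overlaps(ok)` is empty and `total_world_size(ok)` is 12, while `find_overlaps(bad)` reports the `("language", "vision")` pair because ranks 6 and 7 are claimed twice.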
- _validate_parallelism_constraints() → None#
Validate parallelism constraints for cross-module communication:
TP sizes must be powers of 2.
DP sizes must be pairwise divisible (for any two modules, one DP size divides the other).
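The two constraints can be expressed as a short check. The sketch below is illustrative only: `is_power_of_two` and `check_constraints` are hypothetical helpers that take plain dictionaries of per-module sizes, and the error messages are invented for the example.

```python
from itertools import combinations
from typing import Dict, List


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ... (a power of two has exactly one set bit)."""
    return n > 0 and (n & (n - 1)) == 0


def check_constraints(tp_sizes: Dict[str, int], dp_sizes: Dict[str, int]) -> List[str]:
    """Collect human-readable violations of the two constraints."""
    errors = []
    for name, tp in tp_sizes.items():
        if not is_power_of_two(tp):
            errors.append(f"{name}: TP size {tp} is not a power of 2")
    # Every pair of modules must have DP sizes where one divides the other.
    for (name_a, dp_a), (name_b, dp_b) in combinations(dp_sizes.items(), 2):
        if dp_a % dp_b != 0 and dp_b % dp_a != 0:
            errors.append(f"{name_a}/{name_b}: DP sizes {dp_a} and {dp_b} are not divisible")
    return errors


# "vision" is a hypothetical module name used only for this example.
valid = check_constraints({"language": 4, "vision": 2}, {"language": 4, "vision": 2})
bad_tp = check_constraints({"language": 3, "vision": 2}, {"language": 4, "vision": 2})
bad_dp = check_constraints({"language": 4, "vision": 2}, {"language": 4, "vision": 6})
```

A TP size of 1 passes the power-of-two check, so the default value remains valid. DP sizes 4 and 6 fail because neither divides the other.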
- finalize(world_size: int) → None#
Finalize parallelism config: compute data_parallel_size and validate.
- Parameters:
world_size – Total number of ranks in the distributed world. MIMO requires a distributed environment, so this must always be provided.