bridge.recipes.olmoe.olmoe_7b#

Module Contents#

Functions#

| Function | Description |
|---|---|
| `_get_olmoe_pipeline_layout` | Get pipeline layout for OLMoE-7B based on PP and VP size. |
| `olmoe_7b_pretrain_config` | Return a pre-training config for OLMoE-7B (7B total, ~1B active). |
| `olmoe_7b_sft_config` | Return a full SFT config for OLMoE-7B (7B total, ~1B active). |
| `olmoe_7b_peft_config` | Return a PEFT config for OLMoE-7B (7B total, ~1B active). |

API#

bridge.recipes.olmoe.olmoe_7b._get_olmoe_pipeline_layout(pp_size: int, vp_size: int)#

Get pipeline layout for OLMoE-7B based on PP and VP size.
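A minimal call sketch follows. This page does not document the helper's return value, so the import path and the handling of the result are assumptions, not a supported usage pattern.

```python
# Sketch only: assumes the module is importable under the megatron.bridge
# package, as the return annotations elsewhere on this page suggest.
# The helper is private (leading underscore), so this is illustrative.
from megatron.bridge.recipes.olmoe.olmoe_7b import _get_olmoe_pipeline_layout

# Compute a layout for pipeline-parallel size 2 with 2 virtual pipeline stages.
layout = _get_olmoe_pipeline_layout(pp_size=2, vp_size=2)
```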

bridge.recipes.olmoe.olmoe_7b.olmoe_7b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for OLMoE-7B (7B total, ~1B active).

Recommended parallelism: TP=1, PP=1, EP=8. Uses a precision-aware optimizer with bf16 gradients/moments.
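As a hedged sketch, a pretraining script might obtain the container and apply the recommended parallelism before launch. The call itself follows the signature above; the override attribute names (e.g. `cfg.model.expert_model_parallel_size`) are assumptions about `ConfigContainer`'s layout, not confirmed by this page.

```python
# Sketch only. The import path follows the return-type annotation above;
# the override field names are hypothetical -- check ConfigContainer.
from megatron.bridge.recipes.olmoe.olmoe_7b import olmoe_7b_pretrain_config

cfg = olmoe_7b_pretrain_config()

# Recommended parallelism from this page: TP=1, PP=1, EP=8.
cfg.model.tensor_model_parallel_size = 1    # assumed field name
cfg.model.pipeline_model_parallel_size = 1  # assumed field name
cfg.model.expert_model_parallel_size = 8    # assumed field name
```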

bridge.recipes.olmoe.olmoe_7b.olmoe_7b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for OLMoE-7B (7B total, ~1B active).

Default parallelism: TP=1, PP=1, EP=8, SP=False

Returns:

ConfigContainer with all settings pre-configured for OLMoE-7B SFT.
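A minimal sketch of obtaining the SFT config; only the documented call is shown, and how the container is consumed downstream is left to the training entrypoint.

```python
# Sketch only: import path assumed from the return-type annotation above.
from megatron.bridge.recipes.olmoe.olmoe_7b import olmoe_7b_sft_config

# Returns a ConfigContainer pre-configured for OLMoE-7B SFT with the
# defaults noted above (TP=1, PP=1, EP=8, SP=False).
cfg = olmoe_7b_sft_config()
```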

bridge.recipes.olmoe.olmoe_7b.olmoe_7b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for OLMoE-7B (7B total, ~1B active).

Default parallelism: TP=1, PP=1, EP=1, SP=False

Parameters:

peft_scheme – PEFT scheme: “lora”, “dora”, or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for OLMoE-7B PEFT.
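A short sketch exercising the documented `peft_scheme` parameter with its string forms; constructing a custom `megatron.bridge.peft.base.PEFT` subclass is omitted since its API is not shown on this page.

```python
# Sketch only: import path assumed from the return-type annotation above.
from megatron.bridge.recipes.olmoe.olmoe_7b import olmoe_7b_peft_config

lora_cfg = olmoe_7b_peft_config()                     # default: peft_scheme="lora"
dora_cfg = olmoe_7b_peft_config(peft_scheme="dora")   # documented string form

# Per the signature, peft_scheme may also be a custom PEFT instance
# (a subclass of megatron.bridge.peft.base.PEFT).
```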