bridge.recipes.gpt_oss.gpt_oss#
Module Contents#
Functions#
| Function | Summary |
|---|---|
| _enable_gpt_oss_hopper_fp8_current_scaling | Enable Hopper FP8 current scaling for GPT-OSS recipes. |
| gpt_oss_20b_pretrain_config | Return a pre-training config for GPT-OSS 20B variant. |
| gpt_oss_120b_pretrain_config | Return a pre-training config for GPT-OSS 120B variant. |
| gpt_oss_20b_pretrain_fp8_current_scaling_config | Return a pre-training config for GPT-OSS 20B with Hopper FP8 current scaling. |
| gpt_oss_20b_sft_config | Return a full SFT config for GPT-OSS 20B. |
| gpt_oss_120b_sft_config | Return a full SFT config for GPT-OSS 120B. |
| gpt_oss_20b_sft_fp8_current_scaling_config | Return a full SFT config for GPT-OSS 20B with Hopper FP8 current scaling. |
| gpt_oss_20b_peft_config | Return a PEFT config for GPT-OSS 20B. |
| gpt_oss_120b_peft_config | Return a PEFT config for GPT-OSS 120B. |
| gpt_oss_20b_peft_fp8_current_scaling_config | Return a PEFT config for GPT-OSS 20B with Hopper FP8 current scaling. |
| _enable_gpt_oss_blackwell_mxfp8 | Enable Blackwell MXFP8 for GPT-OSS recipes. |
| gpt_oss_20b_pretrain_mxfp8_config | Return a pre-training config for GPT-OSS 20B with Blackwell MXFP8. |
| gpt_oss_20b_sft_mxfp8_config | Return a full SFT config for GPT-OSS 20B with Blackwell MXFP8. |
| gpt_oss_20b_peft_mxfp8_config | Return a PEFT config for GPT-OSS 20B with Blackwell MXFP8. |
API#
- bridge.recipes.gpt_oss.gpt_oss._enable_gpt_oss_hopper_fp8_current_scaling(cfg: megatron.bridge.training.config.ConfigContainer)#
Enable Hopper FP8 current scaling for GPT-OSS recipes.
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_20b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for GPT-OSS 20B variant.
Recommended parallelism: TP=2, PP=4, EP=4 (tensor, pipeline, and expert model parallelism, respectively).
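The recipe functions return a fully populated ConfigContainer that can be adjusted before launching. A minimal sketch, assuming the module is importable under the megatron.bridge package; the parallelism field names shown in comments are assumptions about ConfigContainer's layout, not documented here:

```python
from megatron.bridge.recipes.gpt_oss.gpt_oss import gpt_oss_20b_pretrain_config

# Build the pre-configured pre-training recipe for the 20B variant.
cfg = gpt_oss_20b_pretrain_config()

# The recommended parallelism (TP=2, PP=4, EP=4) comes pre-set. Individual
# fields can be overridden before training; these names are hypothetical:
# cfg.model.tensor_model_parallel_size = 2
# cfg.model.pipeline_model_parallel_size = 4
# cfg.model.expert_model_parallel_size = 4
```

The same pattern applies to every recipe in this module: call the config function, tweak the returned container, then hand it to your training entrypoint.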
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_120b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for GPT-OSS 120B variant.
Recommended parallelism: TP=2, PP=4, EP=16
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_20b_pretrain_fp8_current_scaling_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for GPT-OSS 20B with Hopper FP8 current scaling.
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_20b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for GPT-OSS 20B.
Default parallelism: TP=1, PP=1, EP=8
- Returns:
ConfigContainer with all settings pre-configured for GPT-OSS 20B SFT.
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_120b_sft_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for GPT-OSS 120B.
Default parallelism: TP=1, PP=4, EP=8
- Returns:
ConfigContainer with all settings pre-configured for GPT-OSS 120B SFT.
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_20b_sft_fp8_current_scaling_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for GPT-OSS 20B with Hopper FP8 current scaling.
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_20b_peft_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora') → megatron.bridge.training.config.ConfigContainer#
Return a PEFT config for GPT-OSS 20B.
Default parallelism: TP=1, PP=1, EP=1
- Parameters:
peft_scheme – PEFT scheme: “lora”, “dora”, or a custom PEFT instance.
- Returns:
ConfigContainer with all settings pre-configured for GPT-OSS 20B PEFT.
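The peft_scheme parameter accepts either of the documented string shorthands or a PEFT instance. A hedged sketch, assuming the module is importable under megatron.bridge:

```python
from megatron.bridge.recipes.gpt_oss.gpt_oss import gpt_oss_20b_peft_config

# Default scheme is LoRA ("lora").
lora_cfg = gpt_oss_20b_peft_config()

# "dora" is the other documented string shorthand.
dora_cfg = gpt_oss_20b_peft_config(peft_scheme="dora")

# Alternatively, pass a custom megatron.bridge.peft.base.PEFT instance for
# full control; constructing one is library-specific and omitted here.
```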
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_120b_peft_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora') → megatron.bridge.training.config.ConfigContainer#
Return a PEFT config for GPT-OSS 120B.
Default parallelism: TP=1, PP=1, EP=8
- Parameters:
peft_scheme – PEFT scheme: “lora”, “dora”, or a custom PEFT instance.
- Returns:
ConfigContainer with all settings pre-configured for GPT-OSS 120B PEFT.
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_20b_peft_fp8_current_scaling_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora') → megatron.bridge.training.config.ConfigContainer#
Return a PEFT config for GPT-OSS 20B with Hopper FP8 current scaling.
- bridge.recipes.gpt_oss.gpt_oss._enable_gpt_oss_blackwell_mxfp8(cfg: megatron.bridge.training.config.ConfigContainer)#
Enable Blackwell MXFP8 for GPT-OSS recipes.
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_20b_pretrain_mxfp8_config() → megatron.bridge.training.config.ConfigContainer#
Return a pre-training config for GPT-OSS 20B with Blackwell MXFP8.
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_20b_sft_mxfp8_config() → megatron.bridge.training.config.ConfigContainer#
Return a full SFT config for GPT-OSS 20B with Blackwell MXFP8.
- bridge.recipes.gpt_oss.gpt_oss.gpt_oss_20b_peft_mxfp8_config(peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora') → megatron.bridge.training.config.ConfigContainer#
Return a PEFT config for GPT-OSS 20B with Blackwell MXFP8.
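The two low-precision families documented above are hardware-specific: FP8 current scaling targets Hopper GPUs, MXFP8 targets Blackwell. A hedged sketch of selecting a pre-training recipe by compute capability (the SM-major thresholds of 9 for Hopper and 10 for Blackwell follow NVIDIA's published architecture versions and are an assumption of this example, not part of this module):

```python
import torch

from megatron.bridge.recipes.gpt_oss.gpt_oss import (
    gpt_oss_20b_pretrain_config,
    gpt_oss_20b_pretrain_fp8_current_scaling_config,
    gpt_oss_20b_pretrain_mxfp8_config,
)

major, _minor = torch.cuda.get_device_capability()
if major >= 10:
    # Blackwell: use the MXFP8 recipe.
    cfg = gpt_oss_20b_pretrain_mxfp8_config()
elif major == 9:
    # Hopper: use FP8 current scaling.
    cfg = gpt_oss_20b_pretrain_fp8_current_scaling_config()
else:
    # Older architectures: fall back to the plain recipe.
    cfg = gpt_oss_20b_pretrain_config()
```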