bridge.recipes.llama.llama3#

Module Contents#

Functions#

llama32_1b_pretrain_config

Return a pre-training config for Llama 3.2 1B.

llama32_3b_pretrain_config

Return a pre-training config for Llama 3.2 3B.

llama3_8b_pretrain_config

Return a pre-training config for Llama 3 8B.

llama3_8b_16k_pretrain_config

Return a pre-training config for Llama 3 8B 16K.

llama3_8b_64k_pretrain_config

Return a pre-training config for Llama 3 8B 64K.

llama3_8b_128k_pretrain_config

Return a pre-training config for Llama 3 8B 128K.

llama3_8b_low_precision_pretrain_config

Return a low-precision (FP8 Current Scaling/MXFP8/NVFP4) pre-training config for Llama 3 8B.

llama3_70b_pretrain_config

Return a pre-training config for Llama 3 70B.

llama3_70b_16k_pretrain_config

Return a pre-training config for Llama 3 70B 16K.

llama3_70b_64k_pretrain_config

Return a pre-training config for Llama 3 70B 64K.

llama31_8b_pretrain_config

Return a pre-training config for Llama 3.1 8B.

llama31_70b_pretrain_config

Return a pre-training config for Llama 3.1 70B.

llama31_405b_pretrain_config

Return a pre-training config for Llama 3.1 405B.

llama32_1b_sft_config

Return a full SFT config for Llama 3.2 1B.

llama32_3b_sft_config

Return a full SFT config for Llama 3.2 3B.

llama3_8b_sft_config

Return a full SFT config for Llama 3 8B.

llama31_8b_sft_config

Return a full SFT config for Llama 3.1 8B.

llama3_70b_sft_config

Return a full SFT config for Llama 3 70B.

llama31_70b_sft_config

Return a full SFT config for Llama 3.1 70B.

llama31_405b_sft_config

Return a full SFT config for Llama 3.1 405B.

llama32_1b_peft_config

Return a PEFT config for Llama 3.2 1B.

llama32_3b_peft_config

Return a PEFT config for Llama 3.2 3B.

llama3_8b_peft_config

Return a PEFT config for Llama 3 8B.

llama31_8b_peft_config

Return a PEFT config for Llama 3.1 8B.

llama3_70b_peft_config

Return a PEFT config for Llama 3 70B.

llama31_70b_peft_config

Return a PEFT config for Llama 3.1 70B.

llama31_405b_peft_config

Return a PEFT config for Llama 3.1 405B.

Data#

API#

bridge.recipes.llama.llama3.SEQUENCE_LENGTH_16K: int#

16384

bridge.recipes.llama.llama3.SEQUENCE_LENGTH_64K: int#

65536

bridge.recipes.llama.llama3.SEQUENCE_LENGTH_128K: int#

131072
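
These constants mirror the context windows of the long-context recipes below and can be reused when checking or overriding a returned config. A minimal sketch; the `megatron.` import prefix is inferred from the return annotations on this page, and the `model.seq_length` field path is an assumption for illustration:

```python
from megatron.bridge.recipes.llama.llama3 import (
    SEQUENCE_LENGTH_128K,
    llama3_8b_128k_pretrain_config,
)

cfg = llama3_8b_128k_pretrain_config()

# The seq_length field path is an assumption for illustration;
# only the constant itself (131072) is documented above.
assert cfg.model.seq_length == SEQUENCE_LENGTH_128K
```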

bridge.recipes.llama.llama3.llama32_1b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3.2 1B.

Recommended parallelism: TP=1, PP=1, CP=1.
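
Each recipe returns a plain ConfigContainer, so individual fields can be overridden after construction and before training is launched. A minimal sketch; the `train.train_iters` and `train.global_batch_size` attribute paths are illustrative assumptions, not documented on this page:

```python
from megatron.bridge.recipes.llama.llama3 import llama32_1b_pretrain_config

cfg = llama32_1b_pretrain_config()

# Override selected fields before launching training.
# Attribute paths below are illustrative assumptions; consult the
# ConfigContainer definition for the actual layout.
cfg.train.train_iters = 100
cfg.train.global_batch_size = 32
```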

bridge.recipes.llama.llama3.llama32_3b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3.2 3B.

Recommended parallelism: TP=1, PP=1, CP=1.

bridge.recipes.llama.llama3.llama3_8b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3 8B.

Recommended parallelism: TP=1, PP=1, CP=2.

bridge.recipes.llama.llama3.llama3_8b_16k_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3 8B 16K.

Recommended parallelism: TP=4, PP=2, CP=2, SP=True.

bridge.recipes.llama.llama3.llama3_8b_64k_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3 8B 64K.

Recommended parallelism: TP=4, PP=2, CP=4, SP=True.

bridge.recipes.llama.llama3.llama3_8b_128k_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3 8B 128K.

Recommended parallelism: TP=4, PP=2, CP=8, SP=True.

bridge.recipes.llama.llama3.llama3_8b_low_precision_pretrain_config(
mixed_precision_recipe: str,
) → megatron.bridge.training.config.ConfigContainer#

Return a low-precision (FP8 Current Scaling/MXFP8/NVFP4) pre-training config for Llama 3 8B.

Parameters:

mixed_precision_recipe (str) –

The mixed precision recipe to use. Valid options are:

  • "bf16_with_mxfp8_mixed"

  • "bf16_with_fp8_current_scaling_mixed"

  • "bf16_with_nvfp4_mixed"

Returns:

The pre-training configuration for Llama 3 8B.

Return type:

ConfigContainer
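
For example, to select one of the three recipes listed above (a minimal sketch; the `megatron.` import prefix is inferred from the return annotation):

```python
from megatron.bridge.recipes.llama.llama3 import llama3_8b_low_precision_pretrain_config

# Any of the three documented strings is accepted:
# "bf16_with_mxfp8_mixed", "bf16_with_fp8_current_scaling_mixed",
# "bf16_with_nvfp4_mixed".
cfg = llama3_8b_low_precision_pretrain_config(
    mixed_precision_recipe="bf16_with_mxfp8_mixed",
)
```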

bridge.recipes.llama.llama3.llama3_70b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3 70B.

Recommended parallelism: TP=4, PP=4, VPP=5, CP=2, SP=True with CommOverlap.

bridge.recipes.llama.llama3.llama3_70b_16k_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3 70B 16K.

Recommended parallelism: TP=8, PP=2, CP=2, SP=True with CommOverlap.

bridge.recipes.llama.llama3.llama3_70b_64k_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3 70B 64K.

Recommended parallelism: TP=8, PP=4, CP=8, SP=True with CommOverlap.

bridge.recipes.llama.llama3.llama31_8b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3.1 8B.

Recommended parallelism: TP=1, PP=1, CP=2.

bridge.recipes.llama.llama3.llama31_70b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3.1 70B.

Recommended parallelism: TP=4, PP=4, VPP=5, CP=2, SP=True with CommOverlap, seq=128K.

bridge.recipes.llama.llama3.llama31_405b_pretrain_config() → megatron.bridge.training.config.ConfigContainer#

Return a pre-training config for Llama 3.1 405B.

Recommended parallelism: TP=8, PP=8, VPP=2, CP=4, SP=True with CommOverlap, seq=128K.

bridge.recipes.llama.llama3.llama32_1b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Llama 3.2 1B.

Default parallelism: TP=1, PP=1

Returns:

ConfigContainer with all settings pre-configured for Llama 3.2 1B SFT.
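
The SFT recipes are used the same way as the pre-training recipes; in practice the run is usually pointed at an existing pretrained checkpoint before launch. A sketch; the `checkpoint.pretrained_checkpoint` field path is an assumption for illustration:

```python
from megatron.bridge.recipes.llama.llama3 import llama32_1b_sft_config

cfg = llama32_1b_sft_config()

# Point the run at an existing pretrained checkpoint before fine-tuning.
# The field path is an assumption, shown for illustration only.
cfg.checkpoint.pretrained_checkpoint = "/path/to/pretrained/checkpoint"
```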

bridge.recipes.llama.llama3.llama32_3b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Llama 3.2 3B.

Default parallelism: TP=1, PP=1

Returns:

ConfigContainer with all settings pre-configured for Llama 3.2 3B SFT.

bridge.recipes.llama.llama3.llama3_8b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Llama 3 8B.

Default parallelism: TP=2, PP=1

Returns:

ConfigContainer with all settings pre-configured for Llama 3 8B SFT.

bridge.recipes.llama.llama3.llama31_8b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Llama 3.1 8B.

Default parallelism: TP=2, PP=1

Returns:

ConfigContainer with all settings pre-configured for Llama 3.1 8B SFT.

bridge.recipes.llama.llama3.llama3_70b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Llama 3 70B.

Default parallelism: TP=8, PP=4

Returns:

ConfigContainer with all settings pre-configured for Llama 3 70B SFT.

bridge.recipes.llama.llama3.llama31_70b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Llama 3.1 70B.

Default parallelism: TP=8, PP=4

Returns:

ConfigContainer with all settings pre-configured for Llama 3.1 70B SFT.

bridge.recipes.llama.llama3.llama31_405b_sft_config() → megatron.bridge.training.config.ConfigContainer#

Return a full SFT config for Llama 3.1 405B.

Default parallelism: TP=8, PP=16, SP=True. Total: 128 GPUs (16 nodes)

Returns:

ConfigContainer with all settings pre-configured for Llama 3.1 405B SFT.

bridge.recipes.llama.llama3.llama32_1b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Llama 3.2 1B.

Default parallelism: TP=1, PP=1

Parameters:

peft_scheme – The PEFT scheme to use: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Llama 3.2 1B PEFT.
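
The peft_scheme argument accepts the documented string shortcuts directly; for example (a minimal sketch):

```python
from megatron.bridge.recipes.llama.llama3 import llama32_1b_peft_config

# Default scheme is "lora"; "dora" is the other documented shortcut.
lora_cfg = llama32_1b_peft_config()
dora_cfg = llama32_1b_peft_config(peft_scheme="dora")
```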

bridge.recipes.llama.llama3.llama32_3b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Llama 3.2 3B.

Default parallelism: TP=1, PP=1

Parameters:

peft_scheme – The PEFT scheme to use: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Llama 3.2 3B PEFT.

bridge.recipes.llama.llama3.llama3_8b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Llama 3 8B.

Default parallelism: TP=1, PP=1

Parameters:

peft_scheme – The PEFT scheme to use: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Llama 3 8B PEFT.

bridge.recipes.llama.llama3.llama31_8b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Llama 3.1 8B.

Default parallelism: TP=1, PP=1

Parameters:

peft_scheme – The PEFT scheme to use: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Llama 3.1 8B PEFT.

bridge.recipes.llama.llama3.llama3_70b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Llama 3 70B.

Default parallelism: TP=8, PP=1

Parameters:

peft_scheme – The PEFT scheme to use: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Llama 3 70B PEFT.

bridge.recipes.llama.llama3.llama31_70b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Llama 3.1 70B.

Default parallelism: TP=8, PP=1

Parameters:

peft_scheme – The PEFT scheme to use: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Llama 3.1 70B PEFT.

bridge.recipes.llama.llama3.llama31_405b_peft_config(
peft_scheme: str | megatron.bridge.peft.base.PEFT = 'lora',
) → megatron.bridge.training.config.ConfigContainer#

Return a PEFT config for Llama 3.1 405B.

Default parallelism: TP=4, PP=8, VPP=8, SP=True. Total: 32 GPUs (4 nodes)

Parameters:

peft_scheme – The PEFT scheme to use: "lora", "dora", or a custom PEFT instance.

Returns:

ConfigContainer with all settings pre-configured for Llama 3.1 405B PEFT.
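
Alternatively, a custom megatron.bridge.peft.base.PEFT instance can be passed in place of a string. A sketch under assumptions: MyCustomPEFT is hypothetical, and the `transform` hook shown here follows the NeMo-style PEFT base class, which may differ from the actual abstract interface:

```python
from megatron.bridge.peft.base import PEFT
from megatron.bridge.recipes.llama.llama3 import llama31_405b_peft_config

class MyCustomPEFT(PEFT):
    """Hypothetical scheme; implement the hooks required by the PEFT base class."""

    def transform(self, module, name=None, prefix=None):
        # Adapt or freeze `module` here; returning it unchanged is a no-op.
        # (Hook name and signature assumed from NeMo-style PEFT bases.)
        return module

cfg = llama31_405b_peft_config(peft_scheme=MyCustomPEFT())
```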