bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining

Module Contents

Functions

_copy_embedding_to_output_layer

Initialize output_layer from embedding weights for untied diffusion_head.

_nemotron_labs_diffusion_cpt_config

nemotron_labs_diffusion_3b_finetune_config

Return a continuous pretraining (CPT) config for NemotronLabsDiffusion 3B. Default: tensor parallel size (TP)=1, micro batch size (MBS)=1.

nemotron_labs_diffusion_8b_finetune_config

Return a CPT config for NemotronLabsDiffusion 8B. Default: TP=4, MBS=1.

nemotron_labs_diffusion_14b_finetune_config

Return a CPT config for NemotronLabsDiffusion 14B. Default: TP=8, MBS=1.
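
The defaults listed above can be collected into a small lookup table. The sketch below is illustrative and not part of the module: `DEFAULTS` and `data_parallel_size` are hypothetical names, and the arithmetic assumes pipeline and context parallel sizes of 1, which this page does not state.

```python
# Defaults as documented on this page: tensor parallel size (TP) and
# micro batch size (MBS) for each model size.
DEFAULTS = {
    "3b": {"tp": 1, "mbs": 1},
    "8b": {"tp": 4, "mbs": 1},
    "14b": {"tp": 8, "mbs": 1},
}


def data_parallel_size(size: str, world_size: int) -> int:
    """Return how many model replicas fit on `world_size` GPUs.

    Assumption (not stated on this page): pipeline and context parallel
    sizes are both 1, so each replica occupies exactly TP GPUs.
    """
    tp = DEFAULTS[size]["tp"]
    if world_size % tp != 0:
        raise ValueError(f"world_size={world_size} is not divisible by TP={tp}")
    return world_size // tp


if __name__ == "__main__":
    # Show the replica count for a single 8-GPU node under each default.
    for size in DEFAULTS:
        print(size, data_parallel_size(size, world_size=8))
```

Under these assumptions, the 14B default fills one 8-GPU node with a single replica, while the 3B default leaves room for eight replicas on the same node.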

API

bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining._copy_embedding_to_output_layer(models)

Initialize output_layer from embedding weights for untied diffusion_head.
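
As a rough illustration of what initializing an untied output layer from the embedding means, the sketch below copies a weight matrix so that both layers start with equal values but are stored separately. It uses plain Python lists in place of tensors; `ToyModel` and `copy_embedding_to_output_layer` are hypothetical stand-ins, and the real function may handle the `models` argument differently, since this page does not describe its internals.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyModel:
    """Hypothetical stand-in for a model with untied input/output weights."""

    embedding: List[List[float]]
    output_layer: List[List[float]] = field(default_factory=list)


def copy_embedding_to_output_layer(models: List[ToyModel]) -> None:
    """Give each model's output layer its own copy of the embedding weights."""
    for model in models:
        # A deep copy keeps the layers untied: later updates to one matrix
        # do not affect the other.
        model.output_layer = copy.deepcopy(model.embedding)


if __name__ == "__main__":
    model = ToyModel(embedding=[[0.1, 0.2], [0.3, 0.4]])
    copy_embedding_to_output_layer([model])
    print(model.output_layer == model.embedding)  # equal values
    print(model.output_layer is model.embedding)  # but separate storage
```

The copy, rather than a shared reference, is the point of the sketch: tied weights would share one matrix, whereas an untied head only borrows the embedding values as a starting point.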

bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining._nemotron_labs_diffusion_cpt_config(
hf_path,
tensor_model_parallel_size,
micro_batch_size,
tokenizer_model,
data_paths=None,
data_args_path=None,
peft=None,
) -> megatron.bridge.training.config.ConfigContainer

bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining.nemotron_labs_diffusion_3b_finetune_config(
data_paths=None,
data_args_path=None,
hf_path=None,
peft=None,
) -> megatron.bridge.training.config.ConfigContainer

Return a CPT config for NemotronLabsDiffusion 3B. Default: TP=1, MBS=1.

bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining.nemotron_labs_diffusion_8b_finetune_config(
data_paths=None,
data_args_path=None,
hf_path=None,
peft=None,
) -> megatron.bridge.training.config.ConfigContainer

Return a CPT config for NemotronLabsDiffusion 8B. Default: TP=4, MBS=1.

bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining.nemotron_labs_diffusion_14b_finetune_config(
data_paths=None,
data_args_path=None,
hf_path=None,
peft=None,
) -> megatron.bridge.training.config.ConfigContainer

Return a CPT config for NemotronLabsDiffusion 14B. Default: TP=8, MBS=1.
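
The three public recipes share one signature and differ only in model size, so a caller can pick one by name. The sketch below is a hypothetical convenience wrapper, not part of the module: `RECIPE_MODULE`, `recipe_name`, and `build_cpt_config` are illustrative names, and the import path is copied from this page's heading, so it may need adjusting to match how the package is installed.

```python
import importlib

# Module path as printed on this page; treat it as an assumption about
# the installed package layout.
RECIPE_MODULE = "bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining"
SIZES = ("3b", "8b", "14b")


def recipe_name(size: str) -> str:
    """Map a size label such as '8b' to the documented recipe function name."""
    key = size.lower()
    if key not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    return f"nemotron_labs_diffusion_{key}_finetune_config"


def build_cpt_config(size, data_paths=None, data_args_path=None, hf_path=None, peft=None):
    """Import the recipe module lazily and call the recipe for `size`.

    The keyword arguments mirror the documented signature and are passed
    through unchanged.
    """
    module = importlib.import_module(RECIPE_MODULE)
    recipe = getattr(module, recipe_name(size))
    return recipe(
        data_paths=data_paths,
        data_args_path=data_args_path,
        hf_path=hf_path,
        peft=peft,
    )
```

Importing inside `build_cpt_config` keeps the name lookup usable on machines where the recipe package is not installed; only the actual config construction requires it.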