bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining
Module Contents
Functions
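| _copy_embedding_to_output_layer | Initialize output_layer from embedding weights for untied diffusion_head. |
| nemotron_labs_diffusion_3b_finetune_config | Return a CPT config for NemotronLabsDiffusion 3B. Default: TP=1, MBS=1. |
| nemotron_labs_diffusion_8b_finetune_config | Return a CPT config for NemotronLabsDiffusion 8B. Default: TP=4, MBS=1. |
| nemotron_labs_diffusion_14b_finetune_config | Return a CPT config for NemotronLabsDiffusion 14B. Default: TP=8, MBS=1. |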
API
- bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining._copy_embedding_to_output_layer(models)
Initialize output_layer from embedding weights for untied diffusion_head.
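The docstring above describes seeding an untied output projection from the embedding matrix. A minimal PyTorch sketch of that idea follows. It is not the module's implementation: the ToyDiffusionLM class, its embedding and output_layer attribute names, and the assumption that models is a list of model chunks are all hypothetical, chosen only for illustration.

```python
import torch
from torch import nn


class ToyDiffusionLM(nn.Module):
    """Hypothetical stand-in for one model chunk with an untied output head."""

    def __init__(self, vocab_size: int, hidden_size: int) -> None:
        super().__init__()
        # Token embedding: weight shape (vocab_size, hidden_size).
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        # Untied head: its own parameter with the same (vocab_size, hidden_size) shape.
        self.output_layer = nn.Linear(hidden_size, vocab_size, bias=False)


def copy_embedding_to_output_layer(models: list) -> None:
    """Overwrite each chunk's output projection with a copy of its embedding matrix."""
    for model in models:
        with torch.no_grad():
            # copy_ writes values in place, so the two parameters stay separate tensors.
            model.output_layer.weight.copy_(model.embedding.weight)


models = [ToyDiffusionLM(vocab_size=32, hidden_size=8)]
copy_embedding_to_output_layer(models)
```

Because the values are copied rather than shared, the head starts from the embedding weights but can diverge from them during training, which is what distinguishes an untied head from a tied one.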
- bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining._nemotron_labs_diffusion_cpt_config(hf_path, tensor_model_parallel_size, micro_batch_size, tokenizer_model, data_paths=None, data_args_path=None, peft=None)
- bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining.nemotron_labs_diffusion_3b_finetune_config(data_paths=None, data_args_path=None, hf_path=None, peft=None)
Return a CPT config for NemotronLabsDiffusion 3B. Default: TP=1, MBS=1.
- bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining.nemotron_labs_diffusion_8b_finetune_config(data_paths=None, data_args_path=None, hf_path=None, peft=None)
Return a CPT config for NemotronLabsDiffusion 8B. Default: TP=4, MBS=1.
- bridge.diffusion.recipes.nemotron_labs_diffusion.continuous_pretraining.nemotron_labs_diffusion_14b_finetune_config(data_paths=None, data_args_path=None, hf_path=None, peft=None)
Return a CPT config for NemotronLabsDiffusion 14B. Default: TP=8, MBS=1.
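The three public recipes share one signature and differ only in their documented defaults, which suggests that each forwards to the private helper with a fixed tensor parallel size (TP) and micro batch size (MBS). The sketch below illustrates that pattern under stated assumptions: sketch_cpt_config and CPT_DEFAULTS are hypothetical names, the plain dict stands in for whatever config object the real functions return, and only the TP/MBS values come from the docstrings above.

```python
from typing import Any

# (tensor parallel size, micro batch size) per model size, as stated in the docstrings.
CPT_DEFAULTS = {
    "3b": (1, 1),
    "8b": (4, 1),
    "14b": (8, 1),
}


def sketch_cpt_config(
    size: str,
    data_paths: Any = None,
    data_args_path: Any = None,
    hf_path: Any = None,
    peft: Any = None,
) -> dict:
    """Hypothetical stand-in: combine per-size defaults with caller arguments."""
    tp, mbs = CPT_DEFAULTS[size]
    return {
        "hf_path": hf_path,
        "tensor_model_parallel_size": tp,
        "micro_batch_size": mbs,
        "data_paths": data_paths,
        "data_args_path": data_args_path,
        "peft": peft,
    }


# Placeholder dataset path, for illustration only.
config_8b = sketch_cpt_config("8b", data_paths=["/path/to/dataset"])
print(config_8b["tensor_model_parallel_size"], config_8b["micro_batch_size"])
```

Keeping the per-size defaults in one table makes the differences between the 3B, 8B, and 14B recipes easy to audit; how the real helper resolves hf_path=None or tokenizer_model is not documented here and is left out of the sketch.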