nemo_automodel.components.distributed.pipelining.config
Pipeline parallel configuration class.
Design principle:
- Device mesh (world_mesh, moe_mesh) is passed separately to from_pretrained/from_config
- PipelineConfig contains scheduling, splitting, and runtime options
- loss_fn is included here since it’s only used for pipelining
- Axis names are inferred automatically from device_mesh in _instantiate_pipeline
Module Contents
Classes
API
Dataclass
Configuration for pipeline parallel training.
Note: The device mesh (world_mesh, moe_mesh) is passed separately, as arguments to from_pretrained/from_config. Pipeline parallelism is enabled when pp_size > 1. Axis names are inferred automatically from the device mesh structure.
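To illustrate the separation described in the note, here is a minimal sketch and not the library's implementation. `PipelineConfigSketch` and `pipeline_enabled` are hypothetical names, the field types and defaults are assumptions, and the mesh is modeled as a plain dict of axis sizes with an assumed axis name `"pp"` rather than a real device mesh.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PipelineConfigSketch:
    """Hypothetical stand-in; field types and defaults are assumptions."""

    pp_schedule: Optional[str] = None
    pp_microbatch_size: int = 1
    layers_per_stage: Optional[int] = None
    loss_fn: Optional[Callable] = None


def pipeline_enabled(mesh_shape: Dict[str, int]) -> bool:
    """Return True when the (assumed) "pp" axis has more than one rank."""
    return mesh_shape.get("pp", 1) > 1


# The config carries scheduling options only; the mesh is supplied separately.
config = PipelineConfigSketch(pp_microbatch_size=2)
mesh_shape = {"pp": 2, "dp": 4}  # stand-in for a device mesh's axis sizes
enabled = pipeline_enabled(mesh_shape)
```

Keeping the mesh out of the config means the same scheduling options can be reused across different parallel layouts.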
dtype
layers_per_stage
loss_fn
module_fqns_per_model_part
patch_causal_lm_model
patch_inner_model
patch_stage_backward_maybe_with_nosync
pp_batch_size
pp_microbatch_size
pp_schedule
pp_schedule_csv
pp_seq_len
round_virtual_stages_to_pp_multiple
scale_grads_in_schedule
Convert config to dictionary.
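A dictionary conversion for a dataclass-based config could look like the sketch below. It is an assumption about the general approach, not the actual method body: `PipelineConfigDictSketch` is a hypothetical class, only a few of the attribute names listed above are reproduced, and their types and defaults are guesses.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict, Optional


@dataclass
class PipelineConfigDictSketch:
    """Hypothetical stand-in; field types and defaults are assumptions."""

    pp_schedule: Optional[str] = None
    pp_microbatch_size: int = 1
    scale_grads_in_schedule: bool = False
    loss_fn: Optional[Callable] = None

    def to_dict(self) -> Dict[str, Any]:
        """Return a shallow field-name to value mapping."""
        return {f.name: getattr(self, f.name) for f in fields(self)}


cfg_dict = PipelineConfigDictSketch(pp_microbatch_size=4).to_dict()
```

A shallow mapping built from `dataclasses.fields` keeps callables such as `loss_fn` as direct references instead of copying them.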