# nemo_automodel.components.distributed.pipelining.config

Pipeline parallel configuration class.

Design principles:

* Device mesh (world\_mesh, moe\_mesh) is passed separately to from\_pretrained/from\_config
* PipelineConfig contains scheduling, splitting, and runtime options
* loss\_fn is included here since it's only used for pipelining
* Axis names are inferred automatically from device\_mesh in \_instantiate\_pipeline

## Module Contents

### Classes

| Name                                                                                        | Description                                   |
| ------------------------------------------------------------------------------------------- | --------------------------------------------- |
| [`PipelineConfig`](#nemo_automodel-components-distributed-pipelining-config-PipelineConfig) | Configuration for pipeline parallel training. |

### API

```python
class nemo_automodel.components.distributed.pipelining.config.PipelineConfig(
    pp_schedule: typing.Optional[str] = '1f1b',
    pp_schedule_csv: typing.Optional[str] = None,
    pp_microbatch_size: int = 1,
    pp_batch_size: int = 1,
    layers_per_stage: typing.Optional[int] = None,
    round_virtual_stages_to_pp_multiple: typing.Optional[typing.Literal['up', 'down']] = None,
    module_fqns_per_model_part: typing.Optional[typing.List[typing.List[str]]] = None,
    patch_inner_model: bool = True,
    patch_causal_lm_model: bool = True,
    patch_stage_backward_maybe_with_nosync: bool = False,
    dtype: typing.Optional[torch.dtype] = None,
    scale_grads_in_schedule: bool = False,
    loss_fn: typing.Optional[typing.Callable] = None,
    pp_seq_len: typing.Optional[int] = None
)
```

Dataclass

Configuration for pipeline parallel training.

Note: The device mesh (world\_mesh, moe\_mesh) is not part of this config; it is
passed separately as an argument to from\_pretrained/from\_config. Pipeline
parallelism is enabled when pp\_size > 1. Axis names are inferred automatically
from the device mesh structure.
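
As a minimal sketch of how these fields fit together, the snippet below defines a local stand-in dataclass (`PipelineConfigSketch`, a hypothetical name) that mirrors a subset of the documented fields and defaults, so it runs without NeMo AutoModel or PyTorch installed. The `num_microbatches` helper is an assumption about how `pp_batch_size` and `pp_microbatch_size` relate; this page does not confirm it. In real code you would import `PipelineConfig` from `nemo_automodel.components.distributed.pipelining.config` instead.

```python
from dataclasses import dataclass
from typing import Callable, List, Literal, Optional


@dataclass
class PipelineConfigSketch:
    """Hypothetical stand-in mirroring a subset of the documented PipelineConfig fields."""

    pp_schedule: Optional[str] = "1f1b"
    pp_schedule_csv: Optional[str] = None
    pp_microbatch_size: int = 1
    pp_batch_size: int = 1
    layers_per_stage: Optional[int] = None
    round_virtual_stages_to_pp_multiple: Optional[Literal["up", "down"]] = None
    module_fqns_per_model_part: Optional[List[List[str]]] = None
    scale_grads_in_schedule: bool = False
    loss_fn: Optional[Callable] = None
    pp_seq_len: Optional[int] = None


def num_microbatches(cfg: PipelineConfigSketch) -> int:
    """Assumed relationship: each pipeline batch is split into equal microbatches."""
    if cfg.pp_batch_size % cfg.pp_microbatch_size != 0:
        raise ValueError("pp_batch_size must be divisible by pp_microbatch_size")
    return cfg.pp_batch_size // cfg.pp_microbatch_size


# Only scheduling and splitting options live here; the device mesh is supplied elsewhere.
cfg = PipelineConfigSketch(pp_batch_size=8, pp_microbatch_size=2, layers_per_stage=4)
print(cfg.pp_schedule, num_microbatches(cfg))
```

Unspecified fields keep their documented defaults, so `pp_schedule` stays `'1f1b'` here.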

```python
nemo_automodel.components.distributed.pipelining.config.PipelineConfig.to_dict() -> typing.Dict[str, typing.Any]
```

Convert the config to a dictionary.
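
The following sketch illustrates one plausible shape for such a conversion, using a hypothetical stand-in class (`ToDictSketch`) with a few of the documented fields. It assumes a shallow, field-by-field mapping; the actual `to_dict` implementation in NeMo AutoModel may differ, for example in how it handles `loss_fn` or `dtype`.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict, Optional


@dataclass
class ToDictSketch:
    """Hypothetical stand-in; not the library's PipelineConfig."""

    pp_schedule: Optional[str] = "1f1b"
    pp_microbatch_size: int = 1
    pp_batch_size: int = 1
    loss_fn: Optional[Callable] = None

    def to_dict(self) -> Dict[str, Any]:
        # Shallow conversion (an assumption): values such as callables
        # are stored by reference rather than copied.
        return {f.name: getattr(self, f.name) for f in fields(self)}


d = ToDictSketch(pp_batch_size=4).to_dict()
print(sorted(d))
```

A shallow mapping is used here instead of `dataclasses.asdict` because `asdict` deep-copies field values, which is rarely what you want for a loss function object.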