# nemo_automodel.components.distributed.mesh_utils

Device mesh construction and access utilities for distributed training.

## Module Contents

### Classes

| Name                                                                       | Description                                   |
| -------------------------------------------------------------------------- | --------------------------------------------- |
| [`_MeshSpec`](#nemo_automodel-components-distributed-mesh_utils-_MeshSpec) | Named mesh shape plus derived flattened axes. |

### Functions

| Name                                                                                                                       | Description                                                               |
| -------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- |
| [`_create_device_meshes`](#nemo_automodel-components-distributed-mesh_utils-_create_device_meshes)                         | Create raw device meshes based on distributed config type.                |
| [`_create_fsdp2_device_mesh`](#nemo_automodel-components-distributed-mesh_utils-_create_fsdp2_device_mesh)                 | Create the FSDP2 root mesh and optional MoE mesh.                         |
| [`_create_megatron_fsdp_device_mesh`](#nemo_automodel-components-distributed-mesh_utils-_create_megatron_fsdp_device_mesh) | Create the Megatron FSDP mesh.                                            |
| [`_create_moe_mesh`](#nemo_automodel-components-distributed-mesh_utils-_create_moe_mesh)                                   | -                                                                         |
| [`_degree`](#nemo_automodel-components-distributed-mesh_utils-_degree)                                                     | -                                                                         |
| [`_infer_dp_size`](#nemo_automodel-components-distributed-mesh_utils-_infer_dp_size)                                       | -                                                                         |
| [`_init_named_mesh`](#nemo_automodel-components-distributed-mesh_utils-_init_named_mesh)                                   | -                                                                         |
| [`_mesh_device_type`](#nemo_automodel-components-distributed-mesh_utils-_mesh_device_type)                                 | -                                                                         |
| [`_register_flattened_axes`](#nemo_automodel-components-distributed-mesh_utils-_register_flattened_axes)                   | -                                                                         |
| [`_require_size_one`](#nemo_automodel-components-distributed-mesh_utils-_require_size_one)                                 | -                                                                         |
| [`_unflatten_compat`](#nemo_automodel-components-distributed-mesh_utils-_unflatten_compat)                                 | Compatibility shim for DeviceMesh.\_unflatten(), added in PyTorch 2.10.   |
| [`_validate_mesh_spec`](#nemo_automodel-components-distributed-mesh_utils-_validate_mesh_spec)                             | -                                                                         |
| [`get_flat_mesh`](#nemo_automodel-components-distributed-mesh_utils-get_flat_mesh)                                         | Access a 1D submesh by parallelism name (e.g. `"dp"`, `"tp"`, `"dp_cp"`). |
| [`get_fsdp_dp_mesh`](#nemo_automodel-components-distributed-mesh_utils-get_fsdp_dp_mesh)                                   | Return the DP mesh for FSDP2 without losing the original root mesh.       |
| [`get_submesh`](#nemo_automodel-components-distributed-mesh_utils-get_submesh)                                             | Access a submesh by parallelism dim names.                                |

### Data

[`__all__`](#nemo_automodel-components-distributed-mesh_utils-__all__)

### API

```python
class nemo_automodel.components.distributed.mesh_utils._MeshSpec(
    shape: tuple[int, ...],
    axes: tuple[nemo_automodel.components.distributed.mesh.MeshAxisName, ...],
    flattened_axes: dict[nemo_automodel.components.distributed.mesh.MeshAxisName, tuple[nemo_automodel.components.distributed.mesh.MeshAxisName, ...]] = dict()
)
```

Dataclass

Named mesh shape plus derived flattened axes.
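To make the three fields concrete, the following is a toy stand-in rather than the library's class. It uses plain strings in place of `MeshAxisName`; the class name `ToyMeshSpec`, the helper `flattened_sizes`, and the axis names and sizes in the example are all assumptions made for illustration. It shows the general relationship that a flattened axis has a size equal to the product of the axes it merges.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass(frozen=True)
class ToyMeshSpec:
    """Hypothetical stand-in for _MeshSpec that uses plain string axis names."""

    shape: tuple[int, ...]
    axes: tuple[str, ...]
    flattened_axes: dict[str, tuple[str, ...]] = field(default_factory=dict)

    def flattened_sizes(self) -> dict[str, int]:
        """Size of each flattened axis: the product of its component axis sizes."""
        size_of = dict(zip(self.axes, self.shape))
        return {
            name: prod(size_of[axis] for axis in components)
            for name, components in self.flattened_axes.items()
        }


# Illustrative values only: a 16-rank mesh with two derived (flattened) axes.
spec = ToyMeshSpec(
    shape=(2, 4, 2),
    axes=("dp_replicate", "dp_shard", "cp"),
    flattened_axes={
        "dp_shard_cp": ("dp_shard", "cp"),
        "dp": ("dp_replicate", "dp_shard"),
    },
)
print(spec.flattened_sizes())
```

The real `_MeshSpec` may validate or derive its fields differently; the sketch only mirrors the documented constructor signature.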

```python
nemo_automodel.components.distributed.mesh_utils._create_device_meshes(
    strategy_config: nemo_automodel.components.distributed.config.DistributedStrategyConfig,
    parallelism: nemo_automodel.components.distributed.mesh.ParallelismSizes,
    world_size: int
) -> tuple[torch.distributed.device_mesh.DeviceMesh | None, torch.distributed.device_mesh.DeviceMesh | None]
```

Create raw device meshes based on distributed config type.

```python
nemo_automodel.components.distributed.mesh_utils._create_fsdp2_device_mesh(
    parallelism: nemo_automodel.components.distributed.mesh.ParallelismSizes,
    world_size: int
) -> tuple[torch.distributed.device_mesh.DeviceMesh, torch.distributed.device_mesh.DeviceMesh | None]
```

Create the FSDP2 root mesh and optional MoE mesh.

```python
nemo_automodel.components.distributed.mesh_utils._create_megatron_fsdp_device_mesh(
    parallelism: nemo_automodel.components.distributed.mesh.ParallelismSizes,
    world_size: int
) -> torch.distributed.device_mesh.DeviceMesh
```

Create the Megatron FSDP mesh.

```python
nemo_automodel.components.distributed.mesh_utils._create_moe_mesh(
    device_mesh: torch.distributed.device_mesh.DeviceMesh,
    ep_shard_size: int,
    ep_size: int
) -> torch.distributed.device_mesh.DeviceMesh
```

```python
nemo_automodel.components.distributed.mesh_utils._degree(
    value: int | None
) -> int
```

```python
nemo_automodel.components.distributed.mesh_utils._infer_dp_size(
    dp_size: int | None,
    world_size: int,
    non_dp_size: int,
    expression: str,
    factors: tuple[int, ...]
) -> int
```

```python
nemo_automodel.components.distributed.mesh_utils._init_named_mesh(
    spec: nemo_automodel.components.distributed.mesh_utils._MeshSpec
) -> torch.distributed.device_mesh.DeviceMesh
```

```python
nemo_automodel.components.distributed.mesh_utils._mesh_device_type() -> str
```

```python
nemo_automodel.components.distributed.mesh_utils._register_flattened_axes(
    device_mesh: torch.distributed.device_mesh.DeviceMesh,
    flattened_axes: dict[nemo_automodel.components.distributed.mesh.MeshAxisName, tuple[nemo_automodel.components.distributed.mesh.MeshAxisName, ...]]
) -> None
```

```python
nemo_automodel.components.distributed.mesh_utils._require_size_one(
    strategy_name: str,
    size: int | None,
    feature_name: str
) -> None
```

```python
nemo_automodel.components.distributed.mesh_utils._unflatten_compat(
    flat_mesh: torch.distributed.device_mesh.DeviceMesh,
    axis: int,
    sizes: tuple,
    names: tuple
) -> torch.distributed.device_mesh.DeviceMesh
```

Compatibility shim for DeviceMesh.\_unflatten(), added in PyTorch 2.10.

```python
nemo_automodel.components.distributed.mesh_utils._validate_mesh_spec(
    spec: nemo_automodel.components.distributed.mesh_utils._MeshSpec
) -> None
```

```python
nemo_automodel.components.distributed.mesh_utils.get_flat_mesh(
    device_mesh: torch.distributed.device_mesh.DeviceMesh,
    name: str
) -> torch.distributed.device_mesh.DeviceMesh
```

Access a 1D submesh by parallelism name (e.g. `"dp"`, `"tp"`, `"dp_cp"`).

PyTorch 2.11 deprecates `root_mesh["name"]` for dimensions created via
`_flatten()`. This function reads the `_flatten()` result directly instead.

**Parameters:**

* `device_mesh`: Any DeviceMesh (root or submesh).
* `name`: Parallelism dimension name.

```python
nemo_automodel.components.distributed.mesh_utils.get_fsdp_dp_mesh(
    device_mesh: torch.distributed.device_mesh.DeviceMesh,
    dp_replicate_name: str = MeshAxisName.DP_REPLICATE,
    dp_shard_cp_name: str = MeshAxisName.DP_SHARD_CP
) -> torch.distributed.device_mesh.DeviceMesh
```

Return the DP mesh for FSDP2 without losing the original root mesh.

`get_submesh()` may rebuild a fresh DeviceMesh when asked to compose native
and flattened dims like `("dp_replicate", "dp_shard_cp")`. That is fine
for many local operations, but FSDP2 expects its DP mesh to share the same
root mesh as the TP/EP meshes. On multi-node TP runs, a rebuilt mesh can break
group construction in non-obvious ways.

Prefer native dimensions whenever possible:

* cp=1, dp\_replicate=1  -> `device_mesh["dp_shard"]`
* cp=1, dp\_replicate>1  -> `device_mesh[("dp_replicate", "dp_shard")]`
* cp>1, dp\_replicate=1  -> `device_mesh[("dp_shard", "cp")]`

When both CP and replicated DP are active we fall back to `get_submesh()`
because the composed mesh is genuinely multi-level.
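The selection rule above can be sketched as a small pure-Python function. This is an illustration of the documented cases only, not the library's code: the function name `pick_native_dp_dims` and its integer arguments are hypothetical, and it returns dimension names instead of indexing a real `DeviceMesh`.

```python
from typing import Optional


def pick_native_dp_dims(cp_size: int, dp_replicate_size: int) -> Optional[tuple[str, ...]]:
    """Return the native dim names to index with, or None for the get_submesh() fallback.

    Hypothetical helper: it mirrors the three documented native cases and
    signals the fourth (CP and replicated DP both active) with None.
    """
    if cp_size == 1 and dp_replicate_size == 1:
        # Documented as device_mesh["dp_shard"]; shown here as a 1-tuple for uniformity.
        return ("dp_shard",)
    if cp_size == 1:
        return ("dp_replicate", "dp_shard")
    if dp_replicate_size == 1:
        return ("dp_shard", "cp")
    # Both CP and replicated DP are active: compose via get_submesh() instead.
    return None


for cp, rep in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    print(f"cp={cp}, dp_replicate={rep} -> {pick_native_dp_dims(cp, rep)}")
```

Keeping the first three cases on native dimensions means the result is sliced from the original root mesh, which is the property the description says FSDP2 relies on.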

```python
nemo_automodel.components.distributed.mesh_utils.get_submesh(
    device_mesh: torch.distributed.device_mesh.DeviceMesh,
    names: tuple
) -> torch.distributed.device_mesh.DeviceMesh
```

Access a submesh by parallelism dim names.

Handles all cases: single dims, multi-dim slices, and combinations that
include `_flatten()`-created dims (e.g. `("dp_replicate", "dp_shard_cp")`).
For the latter, finds the parent `_flatten()` result and calls `_unflatten()`
to decompose it into the requested shape.

**Parameters:**

* `device_mesh`: Any DeviceMesh (root or submesh).
* `names`: Tuple of dimension names.
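To show what a combination such as `("dp_replicate", "dp_shard_cp")` means in terms of ranks, here is a pure-Python sketch that does not use `torch` or this module. It assumes a hypothetical 16-rank mesh with axes `("dp_replicate", "dp_shard", "cp", "tp")`, each of size 2, with ranks laid out in row-major order. The helper names `rank_of` and `flattened_view` are illustrative.

```python
from itertools import product

# Assumed example mesh: axis names and sizes are illustrative only.
AXES = ("dp_replicate", "dp_shard", "cp", "tp")
SHAPE = (2, 2, 2, 2)


def rank_of(coords: tuple[int, ...]) -> int:
    """Row-major rank of a coordinate in the full mesh."""
    rank = 0
    for size, coord in zip(SHAPE, coords):
        rank = rank * size + coord
    return rank


def flattened_view(tp_coord: int) -> list[list[int]]:
    """Ranks in the ("dp_replicate", "dp_shard_cp") view for one fixed tp coordinate."""
    rows = []
    for r in range(SHAPE[0]):
        # Flattening merges dp_shard and cp into a single axis of size 2 * 2 = 4.
        row = [
            rank_of((r, s, c, tp_coord))
            for s, c in product(range(SHAPE[1]), range(SHAPE[2]))
        ]
        rows.append(row)
    return rows


print(flattened_view(0))  # [[0, 2, 4, 6], [8, 10, 12, 14]]
```

Each row is one `dp_replicate` group, and each row's four entries are the ranks that the merged `dp_shard_cp` axis spans for that group.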

```python
nemo_automodel.components.distributed.mesh_utils.__all__ = ['_create_device_meshes', '_create_fsdp2_device_mesh', '_create_megatron_fsdp_de...
```