nemo_automodel.components.models.deepseek_v4.fsdp
Module Contents
Functions
Data
API
Return the 1D PyTorch FSDP2 group used for HCA graph alignment.
HCA graph alignment is an FSDP/FSDP2 parameter-sync invariant: ranks that synchronize the same sharded HCA parameters must agree on whether the HCA compressor path participates in backward. This DeepSeek-V4 wrapper derives that synchronization domain from its 1D PyTorch FSDP2 mesh, which may be named or unnamed. A multi-dimensional mesh needs an explicit owner dimension to avoid reducing across unrelated parallel groups. Until an owner dimension can be specified, disable HCA graph alignment rather than use a broader or wrong group.
Apply FSDP2 to DeepSeek-V4 without mixing fp32 and bf16 params in one unit.
This is intentionally model-specific. DeepSeek-V4 keeps a small set of reference-sensitive tensors in fp32, while the existing DeepEP path expects the transformer block itself to remain the main FSDP unit.
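The dtype constraint can be illustrated with a small, framework-free sketch. It assumes parameters are described as `(name, dtype)` pairs; the helper names and the parameter names in the usage lines are hypothetical and do not reflect how this module actually partitions DeepSeek-V4 parameters.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_params_by_dtype(
    named_dtypes: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Hypothetical helper: bucket parameter names by dtype, preserving order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, dtype in named_dtypes:
        groups[dtype].append(name)
    return dict(groups)


def is_uniform_unit(dtypes: Iterable[str]) -> bool:
    """True when a candidate unit holds at most one dtype."""
    return len(set(dtypes)) <= 1


# Hypothetical parameter names, for illustration only.
params = [
    ("block.weight_a", "bf16"),
    ("block.sensitive_scale", "fp32"),
    ("block.weight_b", "bf16"),
]
groups = group_params_by_dtype(params)
```

Here a unit containing all three parameters would mix dtypes, while each bucket in `groups` is uniform and could be wrapped separately.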