nemo_automodel._transformers.mfu#

AutoMFU: Automatic Model FLOPs Utilization calculator.

This module provides automatic MFU calculation for various model architectures through an interface similar to HuggingFace AutoModel.

Module Contents#

Classes#

AutoMFU

Auto MFU calculator - provides MFU calculation for various model architectures.

Functions#

get_device_flops

Get theoretical device FLOPS in a requested unit.

Data#

API#

nemo_automodel._transformers.mfu.logger#

‘getLogger(…)’

nemo_automodel._transformers.mfu._DEVICE_FLOPS: Dict[str, float]#

None

nemo_automodel._transformers.mfu._UNIT_TO_SCALE#

None

nemo_automodel._transformers.mfu._UNWRAP_ATTRS#

('module', '_orig_mod', '_fsdp_wrapped_module', 'model')
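The attribute names above suggest that wrapped models (for example DDP, torch.compile, or FSDP wrappers) are peeled back until a config can be read. The following is a minimal sketch of that idea, not the library's implementation: `unwrap_config`, its `max_depth` guard, and the fallback of returning the object itself are assumptions made for illustration.

```python
from types import SimpleNamespace

# Attribute names copied from _UNWRAP_ATTRS in this module's reference.
UNWRAP_ATTRS = ("module", "_orig_mod", "_fsdp_wrapped_module", "model")


def unwrap_config(obj, max_depth=8):
    """Hypothetical helper: follow wrapper attributes until `.config` is found."""
    current = obj
    for _ in range(max_depth):
        config = getattr(current, "config", None)
        if config is not None:
            return config
        for attr in UNWRAP_ATTRS:
            inner = getattr(current, attr, None)
            if inner is not None:
                current = inner  # descend one wrapper level
                break
        else:
            # No wrapper attribute left: assume the object is already a config.
            return current
    return current


# Stand-ins for a config, a model, and two nested wrappers.
config = SimpleNamespace(hidden_size=4096)
model = SimpleNamespace(config=config)
ddp_like = SimpleNamespace(module=model)
compiled_like = SimpleNamespace(_orig_mod=ddp_like)

print(unwrap_config(compiled_like) is config)
print(unwrap_config(config) is config)
```

The depth limit is only a guard against wrappers that point back at themselves; the real helper may handle that case differently.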

nemo_automodel._transformers.mfu._CONFIG_ALIAS_ATTRS#

(('n_embd', 'hidden_size'), ('n_layer', 'num_hidden_layers'), ('n_head', 'num_attention_heads'), ('n…
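The pairs above look like (legacy name, common name) mappings for config fields. As a hedged sketch, assuming the first name in each pair is copied to the second when the second is missing, alias filling could work as shown below. `ensure_aliases` is a hypothetical helper, and only the three pairs visible in the truncated listing are used.

```python
from types import SimpleNamespace

# Only the pairs visible in the truncated listing; the real tuple contains more.
ALIAS_PAIRS = (
    ("n_embd", "hidden_size"),
    ("n_layer", "num_hidden_layers"),
    ("n_head", "num_attention_heads"),
)


def ensure_aliases(config):
    """Hypothetical helper: expose legacy fields under their common names."""
    for legacy, common in ALIAS_PAIRS:
        # Assumed direction: fill the common name from the legacy one, never overwrite.
        if not hasattr(config, common) and hasattr(config, legacy):
            setattr(config, common, getattr(config, legacy))


# A GPT-2 style config stand-in that only carries the legacy names.
gpt2_like = SimpleNamespace(n_embd=768, n_layer=12, n_head=12)
ensure_aliases(gpt2_like)
print(gpt2_like.hidden_size, gpt2_like.num_hidden_layers, gpt2_like.num_attention_heads)
```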

nemo_automodel._transformers.mfu.get_device_flops(
unit: str = 'T',
device_name: Optional[str] = None,
) float#

Get theoretical device FLOPS in a requested unit.

Parameters:
  • unit – One of B/K/M/G/T/P. Default T (TFLOPs/s).

  • device_name – Optional explicit device name for lookup. If None, the current torch device name is inferred.

Returns:

Theoretical FLOPS in requested unit. Returns float("inf") for unknown devices.
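The lookup and unit conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the table contents, the substring match on the device name, the meaning of "B" as the unscaled base unit, and the `ValueError` for unknown units are all assumed. The 989e12 figure is an illustrative dense BF16 peak for an H100 and may differ from the value stored in `_DEVICE_FLOPS`.

```python
# Assumed scale factors; "B" is taken to mean the base unit (plain FLOPS).
UNIT_TO_SCALE = {"B": 1.0, "K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12, "P": 1e15}

# Illustrative peak value only; the real table is not shown in this reference.
DEVICE_FLOPS = {"h100": 989e12}


def device_flops_sketch(device_name, unit="T"):
    """Hypothetical stand-in for get_device_flops with an explicit device name."""
    if unit not in UNIT_TO_SCALE:
        raise ValueError(f"unknown unit: {unit!r}")
    peak = float("inf")  # documented behavior: unknown devices yield infinity
    name = device_name.lower()
    for key, value in DEVICE_FLOPS.items():
        if key in name:
            peak = value
            break
    return peak / UNIT_TO_SCALE[unit]


print(device_flops_sketch("NVIDIA H100 80GB HBM3"))
print(device_flops_sketch("some-unknown-accelerator"))
```

Returning infinity for an unknown device means any MFU computed against it comes out as zero rather than raising an error, so a suspiciously low MFU is worth checking against the device table.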

class nemo_automodel._transformers.mfu.AutoMFU(config: transformers.PretrainedConfig, device: str = 'h100')#

Auto MFU calculator - provides MFU calculation for various model architectures.

This class provides a HuggingFace AutoModel-like interface for calculating Model FLOPs Utilization (MFU) during training.

Initialization

Initialize AutoMFU with a model config.

Parameters:
  • config – HuggingFace PretrainedConfig object

  • device – Device name (e.g. "h100")

classmethod register_device(device: str, peak_tflops: float) None#

Register or override a device peak TFLOPs entry used for MFU calculation.

classmethod from_config(
config_or_path_or_model: Union[transformers.PretrainedConfig, str, os.PathLike[str], object],
device: str = 'h100',
**kwargs,
) nemo_automodel._transformers.mfu.AutoMFU#

Create AutoMFU from a config object, model object, or model path/ID.

Parameters:
  • config_or_path_or_model – Either a PretrainedConfig object, a model object (the .config attribute will be extracted), or a model ID/local path.

  • device – Device name (e.g. "h100")

  • **kwargs – Additional arguments passed to AutoConfig.from_pretrained when loading from model ID/path.

Returns:

AutoMFU instance
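The three accepted input kinds can be told apart with a small dispatch. The sketch below assumes the order of checks and uses a hypothetical `resolve_config` helper with an injected `load_config` callable; in real use that callable would be `AutoConfig.from_pretrained`, which the parameter description names as the receiver of `**kwargs`.

```python
import os
from types import SimpleNamespace


def resolve_config(config_or_path_or_model, load_config, **kwargs):
    """Hypothetical helper: reduce a config, model, or path/ID to a config object."""
    if isinstance(config_or_path_or_model, (str, os.PathLike)):
        # Model ID or local path: delegate to the loader, forwarding kwargs.
        return load_config(config_or_path_or_model, **kwargs)
    inner = getattr(config_or_path_or_model, "config", None)
    if inner is not None:
        # Model object: use its .config attribute.
        return inner
    # Otherwise assume the object already is a config.
    return config_or_path_or_model


def fake_loader(path, **kwargs):
    """Stand-in loader so the sketch runs without network access."""
    return SimpleNamespace(source=os.fspath(path), **kwargs)


cfg = SimpleNamespace(hidden_size=1024)
model = SimpleNamespace(config=cfg)

print(resolve_config(cfg, fake_loader) is cfg)
print(resolve_config(model, fake_loader) is cfg)
print(resolve_config("some/model-id", fake_loader, revision="main").source)
```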

classmethod from_pretrained(
model_id_or_local_path_or_model: Union[str, os.PathLike[str], object],
device: str = 'h100',
**kwargs,
) nemo_automodel._transformers.mfu.AutoMFU#

Create AutoMFU from model ID, local path, or a model object.

Parameters:
  • model_id_or_local_path_or_model – Model ID (e.g. "meta-llama/llama-3-70b"), local path, or model object (the .config attribute will be extracted)

  • device – Device name (e.g. "h100")

  • **kwargs – Additional arguments passed to AutoConfig.from_pretrained

Returns:

AutoMFU instance

__call__(
input_ids_or_tensor: Union[torch.Tensor, Tuple[int, int]],
time_delta: float,
world_size: int,
) Optional[float]#

Calculate MFU percentage.

Parameters:
  • input_ids_or_tensor – Either a tensor of shape (batch_size, seq_len) or a (batch_size, seq_len) tuple

  • time_delta – Time taken for the forward/backward pass, in seconds

  • world_size – Number of GPUs used for training

Returns:

MFU as a percentage, or None if the model is not supported
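MFU is conventionally the achieved FLOPs per second divided by the aggregate theoretical peak of all devices. The sketch below shows that arithmetic under explicit assumptions: the FLOPs count covers the whole step across all devices, the peak is given per device in TFLOPs, and `approx_training_flops` uses the common 6 × parameters × tokens estimate for dense transformers. Both function names are hypothetical, and the library's per-architecture FLOPs formulas are not shown in this reference.

```python
def approx_training_flops(num_params, batch_size, seq_len):
    """Rough forward+backward FLOPs for a dense transformer (assumed 6*N*tokens rule)."""
    return 6.0 * num_params * batch_size * seq_len


def mfu_percent(flops_per_step, time_delta, world_size, peak_tflops_per_device):
    """Hypothetical helper: achieved throughput as a percentage of aggregate peak."""
    if time_delta <= 0 or world_size <= 0:
        raise ValueError("time_delta and world_size must be positive")
    achieved_flops_per_sec = flops_per_step / time_delta
    peak_flops_per_sec = world_size * peak_tflops_per_device * 1e12
    return 100.0 * achieved_flops_per_sec / peak_flops_per_sec


# Illustrative numbers only: an 8B-parameter model, 8 sequences of 4096 tokens,
# a 1.0 s step on 8 devices with an assumed 989 TFLOPs peak each.
flops = approx_training_flops(8e9, batch_size=8, seq_len=4096)
print(f"{mfu_percent(flops, time_delta=1.0, world_size=8, peak_tflops_per_device=989.0):.1f}%")
```

Whether the batch passed to `__call__` is the global batch or the per-rank batch changes the result by a factor of `world_size`, so it is worth confirming against the source before comparing numbers across runs.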

get_flops(
input_ids_or_tensor: Union[torch.Tensor, Tuple[int, int]],
) Optional[float]#

Calculate FLOPs for given input shape.

Parameters:
  • input_ids_or_tensor – Either a tensor of shape (batch_size, seq_len) or a (batch_size, seq_len) tuple

Returns:

FLOPs as a float, or None if the model is not supported
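Because the argument may be either a tensor or a plain tuple, the shape has to be normalized before any FLOPs formula is applied. One way to do this, sketched with a hypothetical `batch_and_seq_len` helper and a stand-in tensor class so the example runs without torch:

```python
def batch_and_seq_len(input_ids_or_tensor):
    """Hypothetical helper: return (batch_size, seq_len) from a tensor or a tuple."""
    # Tensors expose .shape; a tuple is treated as the shape itself.
    shape = tuple(getattr(input_ids_or_tensor, "shape", input_ids_or_tensor))
    if len(shape) != 2:
        raise ValueError(f"expected (batch_size, seq_len), got {shape}")
    batch_size, seq_len = (int(dim) for dim in shape)
    return batch_size, seq_len


class FakeTensor:
    """Minimal stand-in for a 2-D tensor; only the .shape attribute is used."""

    def __init__(self, shape):
        self.shape = shape


print(batch_and_seq_len((4, 2048)))
print(batch_and_seq_len(FakeTensor((2, 512))))
```

Rejecting anything that is not two-dimensional is a choice made for this sketch; the actual method may accept or reshape other inputs.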

static _unwrap_config(config_or_model: object)#
static _ensure_common_config_aliases(config: object) None#