nemo_automodel._transformers.mfu#
AutoMFU: Automatic Model FLOPs Utilization calculator.
With an interface similar to Hugging Face's AutoModel, this module provides automatic MFU calculation for various model architectures.
Module Contents#
Classes#
AutoMFU – Auto MFU calculator; provides MFU calculation for various model architectures.
Functions#
get_device_flops – Get theoretical device FLOPS in a requested unit.
Data#
API#
- nemo_automodel._transformers.mfu.logger#
'getLogger(…)'
- nemo_automodel._transformers.mfu._DEVICE_FLOPS: Dict[str, float]#
None
- nemo_automodel._transformers.mfu._UNIT_TO_SCALE#
None
- nemo_automodel._transformers.mfu._UNWRAP_ATTRS#
('module', '_orig_mod', '_fsdp_wrapped_module', 'model')
- nemo_automodel._transformers.mfu._CONFIG_ALIAS_ATTRS#
(('n_embd', 'hidden_size'), ('n_layer', 'num_hidden_layers'), ('n_head', 'num_attention_heads'), ('n…
- nemo_automodel._transformers.mfu.get_device_flops(
- unit: str = 'T',
- device_name: Optional[str] = None,
- )#
Get theoretical device FLOPS in a requested unit.
- Parameters:
unit – One of 'B', 'K', 'M', 'G', 'T', 'P'. Default 'T' (TFLOPs/s).
device_name – Optional explicit device name for lookup. If None, the current torch device name is inferred.
- Returns:
Theoretical FLOPS in the requested unit. Returns float("inf") for unknown devices.
- class nemo_automodel._transformers.mfu.AutoMFU(config: transformers.PretrainedConfig, device: str = 'h100')#
Auto MFU calculator - provides MFU calculation for various model architectures.
This class provides a HuggingFace AutoModel-like interface for calculating Model FLOPs Utilization (MFU) during training.
Initialization
Initialize AutoMFU with a model config.
- Parameters:
config – HuggingFace PretrainedConfig object
device – Device name (e.g. "h100")
- classmethod register_device(device: str, peak_tflops: float) → None#
Register or override a device peak TFLOPs entry used for MFU calculation.
- classmethod from_config(
- config_or_path_or_model: Union[transformers.PretrainedConfig, str, os.PathLike[str], object],
- device: str = 'h100',
- **kwargs,
- )#
Create AutoMFU from a config object, model object, or model path/ID.
- Parameters:
config_or_path_or_model – Either a PretrainedConfig object, a model object (the .config attribute will be extracted), or a model ID/local path.
device – Device name (e.g. "h100")
**kwargs – Additional arguments passed to AutoConfig.from_pretrained when loading from a model ID/path.
- Returns:
AutoMFU instance
- classmethod from_pretrained(
- model_id_or_local_path_or_model: Union[str, os.PathLike[str], object],
- device: str = 'h100',
- **kwargs,
- )#
Create AutoMFU from model ID, local path, or a model object.
- Parameters:
model_id_or_local_path_or_model – Model ID (e.g., "meta-llama/llama-3-70b"), local path, or model object (the .config attribute will be extracted)
device – Device name (e.g. "h100")
**kwargs – Additional arguments passed to AutoConfig.from_pretrained
- Returns:
AutoMFU instance
- __call__(
- input_ids_or_tensor: Union[torch.Tensor, Tuple[int, int]],
- time_delta: float,
- world_size: int,
- )#
Calculate MFU percentage.
- Parameters:
input_ids_or_tensor – Either a tensor (batch_size, seq_len) or a tuple of (batch_size, seq_len)
time_delta – Time taken for forward/backward pass in seconds
world_size – Number of GPUs used for training
- Returns:
MFU as a percentage, or None if the model is not supported
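MFU is achieved FLOPS divided by theoretical peak FLOPS. A self-contained sketch of the arithmetic, assuming the peak is per-GPU and scales linearly with `world_size` (the function name and example numbers are illustrative):

```python
def mfu_percent(flops_per_step: float, time_delta: float,
                world_size: int, peak_flops_per_gpu: float) -> float:
    achieved = flops_per_step / time_delta  # FLOPS actually sustained
    peak = world_size * peak_flops_per_gpu  # aggregate theoretical peak
    return 100.0 * achieved / peak

# E.g. 4.2e15 FLOPs per step in 1.0 s on 8 GPUs at 989 TFLOPS each:
print(round(mfu_percent(4.2e15, 1.0, 8, 989e12), 2))  # 53.08
```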
- get_flops(
- input_ids_or_tensor: Union[torch.Tensor, Tuple[int, int]],
- )#
Calculate FLOPs for given input shape.
- Parameters:
input_ids_or_tensor – Either a tensor (batch_size, seq_len) or a tuple of (batch_size, seq_len)
- Returns:
FLOPs as a float, or None if the model is not supported
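A common way to estimate transformer training FLOPs from an input shape is the 6·N·T approximation (roughly 2·N FLOPs per token for the forward pass plus 4·N for the backward pass). This is a sketch only; the library may use a more detailed per-layer formula derived from the config.

```python
def estimate_train_flops(num_params: int, batch_size: int, seq_len: int) -> float:
    # 6 * parameters * tokens: standard forward+backward approximation.
    tokens = batch_size * seq_len
    return 6.0 * num_params * tokens

# 7e9-parameter model, batch of 8 sequences of 4096 tokens:
print(estimate_train_flops(7_000_000_000, 8, 4096))
```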
- static _unwrap_config(config_or_model: object)#
- static _ensure_common_config_aliases(config: object) → None#