nemo_export.utils.utils#

Module Contents#

Functions#

is_nemo2_checkpoint

Checks if the checkpoint is in NeMo 2.0 format.

prepare_directory_for_export

Prepares the model_dir path for TensorRT-LLM / vLLM export.

is_nemo_tarfile

Checks if the path exists and points to a packed NeMo 1 checkpoint.

torch_dtype_from_precision

Maps a PyTorch Lightning (PTL) precision type to the corresponding PyTorch parameter data type.

get_model_device_type

Find the device type the model is assigned to and ensure consistency.

get_example_inputs

Gets example data to feed to the model during ONNX export.

validate_fp8_network

Checks the network to ensure it is compatible with FP8 precision.

API#

nemo_export.utils.utils.is_nemo2_checkpoint(checkpoint_path: str) bool[source]#

Checks if the checkpoint is in NeMo 2.0 format.

Parameters:

checkpoint_path (str) – Path to a checkpoint.

Returns:

True if the path points to a NeMo 2.0 checkpoint; otherwise False.

Return type:

bool
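As an illustration, the check can be sketched as a directory-layout test. The helper below is a hypothetical sketch, assuming that a NeMo 2.0 checkpoint is a directory containing a `context` subdirectory (NeMo 1 checkpoints are single packed `.nemo` files instead); the actual detection logic in `nemo_export` may differ:

```python
from pathlib import Path


def is_nemo2_checkpoint_sketch(checkpoint_path: str) -> bool:
    # Assumption: a NeMo 2.0 checkpoint is an unpacked directory
    # that contains a 'context' subdirectory.
    return (Path(checkpoint_path) / "context").is_dir()
```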

nemo_export.utils.utils.prepare_directory_for_export(
model_dir: Union[str, pathlib.Path],
delete_existing_files: bool,
subdir: Optional[str] = None,
) None[source]#

Prepares the model_dir path for TensorRT-LLM / vLLM export.

Makes sure that the model_dir directory exists and is empty.

Parameters:
  • model_dir (Union[str, pathlib.Path]) – Path to the target directory for the export.

  • delete_existing_files (bool) – If True, delete any existing files in the directory before export.

  • subdir (Optional[str]) – Subdirectory to create inside the model_dir.

Returns:

None
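The behavior described above ("exists and is empty") can be sketched with `pathlib` and `shutil`. This is a hypothetical reimplementation under the assumption that a non-empty directory is cleared only when `delete_existing_files=True` and is an error otherwise; the real function's error handling may differ:

```python
import shutil
from pathlib import Path
from typing import Optional, Union


def prepare_directory_sketch(
    model_dir: Union[str, Path],
    delete_existing_files: bool,
    subdir: Optional[str] = None,
) -> None:
    # Resolve the final export target (optionally a subdirectory).
    target = Path(model_dir) / subdir if subdir else Path(model_dir)
    if target.exists() and any(target.iterdir()):
        if not delete_existing_files:
            raise RuntimeError(
                f"{target} is not empty; pass delete_existing_files=True to clear it"
            )
        shutil.rmtree(target)  # drop stale export artifacts
    # Ensure the directory exists and is empty afterwards.
    target.mkdir(parents=True, exist_ok=True)
```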

nemo_export.utils.utils.is_nemo_tarfile(path: str) bool[source]#

Checks if the path exists and points to a packed NeMo 1 checkpoint.

Parameters:

path (str) – Path to possible checkpoint.

Returns:

True if a NeMo 1 checkpoint exists at the path and is in '.nemo' format.

Return type:

bool
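The two conditions in the return description (the path exists, and the file is in `.nemo` format) suggest a simple sketch. The helper below is an assumed approximation, not the library's implementation:

```python
import os


def is_nemo_tarfile_sketch(path: str) -> bool:
    # Assumption: a packed NeMo 1 checkpoint is a single regular
    # file on disk with a '.nemo' suffix.
    return os.path.isfile(path) and path.endswith(".nemo")
```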

nemo_export.utils.utils.torch_dtype_from_precision(
precision: Union[int, str],
megatron_amp_O2: bool = True,
) torch.dtype[source]#

Maps a PyTorch Lightning (PTL) precision type to the corresponding PyTorch parameter data type.

Parameters:
  • precision (Union[int, str]) – The PTL precision type used.

  • megatron_amp_O2 (bool) – A flag indicating if Megatron AMP O2 is enabled.

Returns:

The corresponding PyTorch data type based on the provided precision.

Return type:

torch.dtype
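To illustrate the mapping, here is a minimal sketch that returns dtype *names* as strings so it runs without torch installed; the real function returns `torch.dtype` objects. The precision keys follow common Lightning conventions, and the fall-back to float32 when `megatron_amp_O2` is disabled is an assumption, not confirmed by this reference:

```python
from typing import Union


def torch_dtype_from_precision_sketch(
    precision: Union[int, str],
    megatron_amp_O2: bool = True,
) -> str:
    # Assumption: with Megatron AMP O2 disabled, parameters stay in float32.
    if not megatron_amp_O2:
        return "float32"
    key = str(precision)
    if key in ("16", "16-mixed"):
        return "float16"
    if key in ("bf16", "bf16-mixed"):
        return "bfloat16"
    if key in ("32", "32-true"):
        return "float32"
    raise ValueError(f"Unsupported precision: {precision!r}")
```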

nemo_export.utils.utils.get_model_device_type(module: torch.nn.Module) str[source]#

Find the device type the model is assigned to and ensure consistency.
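The consistency check can be sketched over plain device-type strings rather than a `torch.nn.Module`, so the example runs without torch. The helper name and the exact error behavior are hypothetical; the assumption is that every parameter exposes a device whose type is e.g. 'cpu' or 'cuda', and that mixed device types are an error:

```python
from typing import Iterable


def get_device_type_sketch(param_device_types: Iterable[str]) -> str:
    # Collect the distinct device types seen across all parameters.
    types = set(param_device_types)
    if len(types) > 1:
        raise ValueError(
            f"Model parameters are spread across device types: {sorted(types)}"
        )
    # Assumption: a model with no parameters defaults to 'cpu'.
    return types.pop() if types else "cpu"
```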

nemo_export.utils.utils.get_example_inputs(
tokenizer: transformers.PreTrainedTokenizerBase,
device: Optional[Union[str, torch.device]] = None,
) Dict[str, torch.Tensor][source]#

Gets example data to feed to the model during ONNX export.

Parameters:
  • tokenizer (PreTrainedTokenizerBase) – Tokenizer to use for generating example inputs.

  • device (Optional[Union[str, torch.device]]) – Device to which the example inputs should be moved.

Returns:

Dictionary of tokenizer outputs.

nemo_export.utils.utils.validate_fp8_network(network) None[source]#

Checks the network to ensure it is compatible with FP8 precision.

Raises:

ValueError – If the network does not contain Q/DQ FP8 layers.