nemo_export.utils.utils#

Module Contents#

Functions#

is_nemo2_checkpoint

Checks if the checkpoint is in NeMo 2.0 format.

prepare_directory_for_export

Prepares the model_dir path for TensorRT-LLM / vLLM export.

is_nemo_tarfile

Checks if the path exists and points to a packed NeMo 1 checkpoint.

torch_dtype_from_precision

Maps a PyTorch Lightning (PTL) precision type to the corresponding PyTorch parameter data type.

get_model_device_type

Find the device type the model is assigned to and ensure consistency.

get_example_inputs

Gets example data to feed to the model during ONNX export.

validate_fp8_network

Checks the network to ensure it is compatible with FP8 precision.

API#

nemo_export.utils.utils.is_nemo2_checkpoint(checkpoint_path: str) bool[source]#

Checks if the checkpoint is in NeMo 2.0 format.

Parameters:

checkpoint_path (str) – Path to a checkpoint.

Returns:

True if the path points to a NeMo 2.0 checkpoint; otherwise False.

Return type:

bool
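As an illustration, the check can be sketched as a directory-layout test. The helper below is a hypothetical sketch, assuming that a NeMo 2.0 checkpoint is a directory containing a `context` subdirectory (NeMo 1 checkpoints are single packed `.nemo` files instead); the actual detection logic in `nemo_export` may differ:

```python
from pathlib import Path


def is_nemo2_checkpoint_sketch(checkpoint_path: str) -> bool:
    # Assumption: a NeMo 2.0 checkpoint is an unpacked directory
    # that contains a 'context' subdirectory.
    return (Path(checkpoint_path) / "context").is_dir()
```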

nemo_export.utils.utils.prepare_directory_for_export(
model_dir: Union[str, pathlib.Path],
delete_existing_files: bool,
subdir: Optional[str] = None,
) None[source]#

Prepares the model_dir path for TensorRT-LLM / vLLM export.

Makes sure that the model_dir directory exists and is empty.

Parameters:
  • model_dir (Union[str, pathlib.Path]) – Path to the target directory for the export.

  • delete_existing_files (bool) – If True, delete any existing files in the directory before export.

  • subdir (Optional[str]) – Subdirectory to create inside the model_dir.

Returns:

None
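The behavior described above ("exists and is empty") can be sketched with `pathlib` and `shutil`. This is a hypothetical reimplementation under the assumption that a non-empty directory is cleared only when `delete_existing_files=True` and is an error otherwise; the real function's error handling may differ:

```python
import shutil
from pathlib import Path
from typing import Optional, Union


def prepare_directory_sketch(
    model_dir: Union[str, Path],
    delete_existing_files: bool,
    subdir: Optional[str] = None,
) -> None:
    # Resolve the final export target (optionally a subdirectory).
    target = Path(model_dir) / subdir if subdir else Path(model_dir)
    if target.exists() and any(target.iterdir()):
        if not delete_existing_files:
            raise RuntimeError(
                f"{target} is not empty; pass delete_existing_files=True to clear it"
            )
        shutil.rmtree(target)  # drop stale export artifacts
    # Ensure the directory exists and is empty afterwards.
    target.mkdir(parents=True, exist_ok=True)
```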

nemo_export.utils.utils.is_nemo_tarfile(path: str) bool[source]#

Checks if the path exists and points to a packed NeMo 1 checkpoint.

Parameters:

path (str) – Path to possible checkpoint.

Returns:

True if a NeMo 1 checkpoint exists at the path and is in '.nemo' format.

Return type:

bool
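The two conditions in the return description (the path exists, and the file is in `.nemo` format) suggest a simple sketch. The helper below is an assumed approximation, not the library's implementation:

```python
import os


def is_nemo_tarfile_sketch(path: str) -> bool:
    # Assumption: a packed NeMo 1 checkpoint is a single regular
    # file on disk with a '.nemo' suffix.
    return os.path.isfile(path) and path.endswith(".nemo")
```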

nemo_export.utils.utils.torch_dtype_from_precision(
precision: Union[int, str],
megatron_amp_O2: bool = True,
) torch.dtype[source]#

Maps a PyTorch Lightning (PTL) precision type to the corresponding PyTorch parameter data type.

Parameters:
  • precision (Union[int, str]) – The PTL precision type used.

  • megatron_amp_O2 (bool) – A flag indicating if Megatron AMP O2 is enabled.

Returns:

The corresponding PyTorch data type based on the provided precision.

Return type:

torch.dtype
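To illustrate the mapping, here is a minimal sketch that returns dtype *names* as strings so it runs without torch installed; the real function returns `torch.dtype` objects. The precision keys follow common Lightning conventions, and the fall-back to float32 when `megatron_amp_O2` is disabled is an assumption, not confirmed by this reference:

```python
from typing import Union


def torch_dtype_from_precision_sketch(
    precision: Union[int, str],
    megatron_amp_O2: bool = True,
) -> str:
    # Assumption: with Megatron AMP O2 disabled, parameters stay in float32.
    if not megatron_amp_O2:
        return "float32"
    key = str(precision)
    if key in ("16", "16-mixed"):
        return "float16"
    if key in ("bf16", "bf16-mixed"):
        return "bfloat16"
    if key in ("32", "32-true"):
        return "float32"
    raise ValueError(f"Unsupported precision: {precision!r}")
```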

nemo_export.utils.utils.get_model_device_type(module: torch.nn.Module) str[source]#

Find the device type the model is assigned to and ensure consistency.
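The consistency check can be sketched over plain device-type strings rather than a `torch.nn.Module`, so the example runs without torch. The helper name and the exact error behavior are hypothetical; the assumption is that every parameter exposes a device whose type is e.g. 'cpu' or 'cuda', and that mixed device types are an error:

```python
from typing import Iterable


def get_device_type_sketch(param_device_types: Iterable[str]) -> str:
    # Collect the distinct device types seen across all parameters.
    types = set(param_device_types)
    if len(types) > 1:
        raise ValueError(
            f"Model parameters are spread across device types: {sorted(types)}"
        )
    # Assumption: a model with no parameters defaults to 'cpu'.
    return types.pop() if types else "cpu"
```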

nemo_export.utils.utils.get_example_inputs(
tokenizer: transformers.PreTrainedTokenizerBase,
device: Optional[Union[str, torch.device]] = None,
) Dict[str, torch.Tensor][source]#

Gets example data to feed to the model during ONNX export.

Parameters:
  • tokenizer (PreTrainedTokenizerBase) – Tokenizer to use for generating example inputs.

  • device (Optional[Union[str, torch.device]]) – Device to which the example inputs should be moved.

Returns:

Dictionary of tokenizer outputs.

nemo_export.utils.utils.validate_fp8_network(network) None[source]#

Checks the network to ensure it is compatible with FP8 precision.

Raises:

ValueError – If the network does not contain Q/DQ FP8 layers.