nemo_export.trt_llm.nemo_ckpt_loader.nemo_file#

Module Contents#

Functions#

load_extra_state_from_bytes

Loads single extra_state from bytes storage.

rename_extra_states

Preprocesses extra states for Megatron export.

update_tokenizer_paths

Updates tokenizer paths in the tokenizer config.

get_tokenizer_from_nemo2_context

Retrieve tokenizer configuration from NeMo 2.0 context and instantiate the tokenizer.

get_tokenizer

Loads the tokenizer from the decoded NeMo weights directory.

build_tokenizer

Builds the tokenizer for TensorRT-LLM export.

load_nemo_config

Load the model configuration from a NeMo checkpoint.

get_model_type

Determine the model type from a NeMo checkpoint for TensorRT-LLM engine build or vLLM model converters.

get_weights_dtype

Determine the weights data type from a NeMo checkpoint for TensorRT-LLM engine build.

load_distributed_model_weights

Loads model weights in torch_dist format from the model path.

load_nemo_model

Unified model loading for TensorRT-LLM export.

Data#

API#

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.LOGGER = 'getLogger(...)'#
nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.EXTRA_STATE = 'extra_state'#
nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.load_extra_state_from_bytes(
val: Optional[Union[torch.Tensor, io.BytesIO]],
) → Optional[dict]#

Loads single extra_state from bytes storage.

Parameters:

val (torch.Tensor | BytesIO) – Bytes storage of extra_state

Returns:

Deserialized extra_state, or None if the bytes storage is empty.

Return type:

Optional[dict]
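
As a rough illustration of the contract described above (empty storage yields None, otherwise a deserialized dict), the following sketch uses only the standard library. It is not the library's implementation: `load_extra_state_sketch` is a hypothetical name, `pickle` stands in for whatever deserializer the real function uses, and the `torch.Tensor` input case is not covered.

```python
import io
import pickle
from typing import Optional


def load_extra_state_sketch(val: Optional[io.BytesIO]) -> Optional[dict]:
    """Hypothetical stand-in: deserialize one extra_state from an in-memory buffer."""
    if val is None:
        return None
    # Measure the buffer size without consuming it.
    val.seek(0, io.SEEK_END)
    size = val.tell()
    val.seek(0)
    if size == 0:
        # Empty bytes storage: nothing to deserialize.
        return None
    # Assumption: pickle is used here only as a stand-in deserializer.
    return pickle.load(val)


# Round-trip a small dictionary through an in-memory buffer.
buffer = io.BytesIO()
pickle.dump({"scale": 0.5}, buffer)
state = load_extra_state_sketch(buffer)

# An empty buffer maps to None, mirroring the documented return value.
empty = load_extra_state_sketch(io.BytesIO())
```

Callers should therefore treat None as "no extra state present" rather than as an error.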

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.rename_extra_states(
state_dict: Dict[str, Any],
) → Dict[str, Any]#

Preprocesses extra states for Megatron export.

Parameters:

state_dict (dict) – Model state dictionary

Returns:

Model state dictionary, with extra states consumable by mcore export

Return type:

dict

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.update_tokenizer_paths(
tokenizer_config: Dict,
unpacked_checkpoints_dir,
)#

Updates tokenizer paths in the tokenizer config.

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.get_tokenizer_from_nemo2_context(model_context_dir: pathlib.Path)#

Retrieve tokenizer configuration from NeMo 2.0 context and instantiate the tokenizer.

Parameters:

model_context_dir (Path) – Path to the model context directory.

Returns:

The instantiated tokenizer; the concrete class depends on the tokenizer configuration.

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.get_tokenizer(
tokenizer_dir_or_path: Union[str, pathlib.Path],
) → transformers.PreTrainedTokenizer#

Loads the tokenizer from the decoded NeMo weights directory.

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.build_tokenizer(tokenizer)#

Builds the tokenizer for TensorRT-LLM export.

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.load_nemo_config(
nemo_ckpt: Union[str, pathlib.Path],
) → Dict[Any, Any]#

Load the model configuration from a NeMo checkpoint.

This function handles both NeMo 1.0 and NeMo 2.0 checkpoint structures. For NeMo 2.0, it reads the configuration from the ‘context/model.yaml’ file.

Parameters:

nemo_ckpt (Union[str, Path]) – Path to the NeMo checkpoint file or directory.

Returns:

The configuration dictionary.

Return type:

Dict[Any, Any]
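
The NeMo 2.0 branch can be pictured as a check for `context/model.yaml` under the checkpoint directory. The helper below is a hypothetical sketch using only `pathlib`; `nemo2_config_path` is not part of this module, and the real function's detection logic and its NeMo 1.0 handling may differ.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union


def nemo2_config_path(nemo_ckpt: Union[str, Path]) -> Optional[Path]:
    """Hypothetical helper: return <ckpt>/context/model.yaml if it exists, else None."""
    candidate = Path(nemo_ckpt) / "context" / "model.yaml"
    return candidate if candidate.is_file() else None


# Build a throwaway directory that mimics the NeMo 2.0 layout named above.
with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "model_ckpt"
    (ckpt / "context").mkdir(parents=True)
    (ckpt / "context" / "model.yaml").write_text("_target_: example\n")

    found = nemo2_config_path(ckpt)
    found_name = found.name if found else None

    # A path without the context file is not recognized as NeMo 2.0 here.
    missing = nemo2_config_path(Path(tmp) / "other_ckpt")
```

A None result in this sketch only means the NeMo 2.0 file was not found; it says nothing about whether the path is a valid NeMo 1.0 checkpoint.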

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.get_model_type(
nemo_ckpt: Union[str, pathlib.Path],
use_vllm_type: bool = False,
) → Optional[str]#

Determine the model type from a NeMo checkpoint for TensorRT-LLM engine build or vLLM model converters.

Parameters:
  • nemo_ckpt (Union[str, Path]) – Path to the NeMo checkpoint file.

  • use_vllm_type (bool) – If True, uses vLLM model type names for known model converters.

Returns:

The model type if it can be determined, otherwise None.

Return type:

Optional[str]
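
Because the return type is Optional[str], callers need to handle the None case explicitly. One possible pattern, shown here as a sketch, is a small wrapper that converts None into a clear error. `require_value` and `fake_lookup` are hypothetical names introduced for illustration, and the "llama" value is a placeholder rather than a documented return value.

```python
from typing import Callable, Optional


def require_value(lookup: Callable[[str], Optional[str]], nemo_ckpt: str, what: str) -> str:
    """Hypothetical wrapper: turn a None result into an explicit error."""
    value = lookup(nemo_ckpt)
    if value is None:
        raise ValueError(f"Could not determine {what} for checkpoint: {nemo_ckpt}")
    return value


# Stand-in lookup used only for this demonstration. In practice one would pass
# get_model_type, or a functools.partial of it that sets use_vllm_type=True.
def fake_lookup(path: str) -> Optional[str]:
    return "llama" if path.endswith("llama.nemo") else None


model_type = require_value(fake_lookup, "/models/llama.nemo", "model type")

try:
    require_value(fake_lookup, "/models/unknown.nemo", "model type")
    failed = False
except ValueError:
    failed = True
```

The same wrapper shape would apply to any lookup in this module that returns Optional[str].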

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.get_weights_dtype(
nemo_ckpt: Union[str, pathlib.Path],
) → Optional[str]#

Determine the weights data type from a NeMo checkpoint for TensorRT-LLM engine build.

Parameters:

nemo_ckpt (Union[str, Path]) – Path to the NeMo checkpoint file.

Returns:

The dtype if it can be determined, otherwise None.

Return type:

Optional[str]

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.load_distributed_model_weights(
nemo_checkpoint: Union[str, pathlib.Path],
mcore_scales_format: Optional[bool] = None,
) → Dict[str, Any]#

Loads model weights in torch_dist format from the model path.

Parameters:
  • nemo_checkpoint (str | Path) – Path to the NeMo checkpoint.

  • mcore_scales_format (bool) – Deprecated flag for local vs. megatron.core export.

Returns:

Model state dictionary.

Return type:

dict

nemo_export.trt_llm.nemo_ckpt_loader.nemo_file.load_nemo_model(
nemo_ckpt: Union[str, pathlib.Path],
nemo_export_dir: Union[str, pathlib.Path],
)#

Unified model loading for TensorRT-LLM export.