Core APIs#
Base class for all NeMo models#
- class nemo.core.ModelPT(*args: Any, **kwargs: Any)[source]#
Bases:
LightningModule
,Model
Interface for PyTorch Lightning-based NeMo models
- register_artifact(config_path: str, src: str, verify_src_exists: bool = True)[source]#
Register model artifacts with this function. These artifacts (files) will be included inside .nemo file when model.save_to(“mymodel.nemo”) is called.
How it works:
- It always returns an existing absolute path which can be used during the model constructor call. Exception: if src is None or "", nothing is done and src is returned as-is.
- It adds a (config_path, model_utils.ArtifactItem()) pair to self.artifacts.
- If src is an existing local path, it is returned in absolute form; if src starts with "nemo_file:unique_artifact_name", the .nemo file is untarred to a temporary folder and the actual existing path is returned; otherwise, an error is raised.
WARNING: call .register_artifact only inside your model's constructor. The returned path is not guaranteed to exist after you have exited the constructor.
- Parameters
config_path (str) – Artifact key. Usually corresponds to the model config.
src (str) – Path to artifact.
verify_src_exists (bool) – If set to False, then the artifact is optional and register_artifact will return None even if src is not found. Defaults to True.
- Returns
If src is not None or empty, returns an absolute path that is guaranteed to exist for the lifetime of the model instance.
- Return type
str
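For illustration, a constructor might register a tokenizer file as an artifact. This is a sketch; the config key, tokenizer attribute, and build_tokenizer helper are hypothetical:
```python
from nemo.core import ModelPT

class MyModel(ModelPT):
    def __init__(self, cfg, trainer=None):
        super().__init__(cfg=cfg, trainer=trainer)
        # Hypothetical artifact: a tokenizer file referenced from the config.
        # The returned absolute path is valid here, inside the constructor, and
        # the file will be packed into the .nemo archive on save_to().
        tokenizer_path = self.register_artifact(
            config_path="tokenizer.model_path",
            src=cfg.tokenizer.model_path,
        )
        self.tokenizer = build_tokenizer(tokenizer_path)  # hypothetical helper
```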
- has_native_or_submodules_artifacts() bool [source]#
Returns True if the model itself has artifacts or any of its submodules have artifacts.
- register_nemo_submodule(name: str, config_field: str, model: ModelPT) None [source]#
Adds a NeMo model as a submodule. The submodule can be accessed via the name attribute on the parent NeMo model it was registered on (self). When saving, the whole parent model (self) is saved as a single model together with the artifacts of the child submodule, and the submodule config is saved to the config_field of the parent model. This method is necessary to create a nested model, e.g.
```python
class ParentModel(ModelPT):
    def __init__(self, cfg, trainer=None):
        super().__init__(cfg=cfg, trainer=trainer)

        # annotate type for autocompletion and type checking (optional)
        self.child_model: Optional[ChildModel] = None
        if cfg.get("child_model") is not None:
            self.register_nemo_submodule(
                name="child_model",
                config_field="child_model",
                model=ChildModel(self.cfg.child_model, trainer=trainer),
            )
        # ... other code
```
- Parameters
name – name of the attribute for the submodule
config_field – field in config, where submodule config should be saved
model – NeMo model, instance of ModelPT
- named_nemo_modules(prefix_name: str = '', prefix_config: str = '') Iterator[Tuple[str, str, ModelPT]] [source]#
Returns an iterator over all NeMo submodules recursively, yielding tuples of (attribute path, path in config, submodule), starting from the core module
- Parameters
prefix_name – prefix for the name path
prefix_config – prefix for the path in config
- Returns
Iterator over (attribute path, path in config, submodule), starting from (prefix, self)
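For example, a short sketch that lists every registered submodule of a model (parent_model stands for any ModelPT instance):
```python
# Enumerate all NeMo submodules recursively, including the model itself.
for attr_path, config_path, submodule in parent_model.named_nemo_modules():
    print(f"attribute={attr_path!r} config={config_path!r} type={type(submodule).__name__}")
```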
- save_to(save_path: str)[source]#
- Saves model instance (weights and configuration) into a .nemo file.
You can use the "restore_from" method to fully restore the instance from the .nemo file.
- A .nemo file is an archive (tar.gz) containing:
model_config.yaml - the model configuration in .yaml format. You can deserialize this into the cfg argument for the model's constructor.
model_weights.ckpt - the model checkpoint.
- Parameters
save_path – Path to .nemo file where model instance should be saved
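A minimal save/restore round trip might look like this (a sketch; the file name is illustrative):
```python
# Persist weights, config, and registered artifacts into a single archive ...
model.save_to("mymodel.nemo")
# ... and later rebuild an equivalent instance from that one file.
restored_model = type(model).restore_from("mymodel.nemo")
```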
- classmethod restore_from(restore_path: str, override_config_path: Optional[Union[omegaconf.OmegaConf, str]] = None, map_location: Optional[torch.device] = None, strict: bool = True, return_config: bool = False, save_restore_connector: Optional[SaveRestoreConnector] = None, trainer: Optional[pytorch_lightning.Trainer] = None)[source]#
Restores model instance (weights and configuration) from .nemo file.
- Parameters
restore_path – path to .nemo file from which model should be instantiated
override_config_path – path to a yaml config that will override the internal config file or an OmegaConf / DictConfig object representing the model config.
map_location – Optional torch.device() to map the instantiated model to a device. By default (None), it will select a GPU if available, falling back to CPU otherwise.
strict – Passed to load_state_dict. By default True.
return_config – If set to true, will return just the underlying config of the restored model as an OmegaConf DictConfig object without instantiating the model.
trainer – Optional, a pytorch lightning Trainer object that will be forwarded to the instantiated model’s constructor.
save_restore_connector (SaveRestoreConnector) – Can be overridden to add custom save and restore logic.
Example:
```python
model = nemo.collections.asr.models.EncDecCTCModel.restore_from('asr.nemo')
assert isinstance(model, nemo.collections.asr.models.EncDecCTCModel)
```
- Returns
An instance of type cls or its underlying config (if return_config is set).
- classmethod load_from_checkpoint(checkpoint_path: str, *args, map_location: Optional[Union[Dict[str, str], str, torch.device, int, Callable]] = None, hparams_file: Optional[str] = None, strict: bool = True, **kwargs)[source]#
Loads a ModelPT from a checkpoint, performing some additional maintenance during restoration. For documentation, please refer to LightningModule.load_from_checkpoint().
- abstract setup_training_data(train_data_config: Union[omegaconf.DictConfig, Dict])[source]#
Sets up the data loader to be used in training.
- Parameters
train_data_config – training data layer parameters.
- abstract setup_validation_data(val_data_config: Union[omegaconf.DictConfig, Dict])[source]#
Sets up the data loader to be used in validation.
- Parameters
val_data_config – validation data layer parameters.
- setup_test_data(test_data_config: Union[omegaconf.DictConfig, Dict])[source]#
(Optionally) Sets up the data loader to be used in testing.
- Parameters
test_data_config – test data layer parameters.
- setup_multiple_validation_data(val_data_config: Union[omegaconf.DictConfig, Dict])[source]#
(Optionally) Sets up data loaders to be used in validation, with support for multiple data loaders.
- Parameters
val_data_config – validation data layer parameters.
- setup_multiple_test_data(test_data_config: Union[omegaconf.DictConfig, Dict])[source]#
(Optionally) Sets up data loaders to be used in testing, with support for multiple data loaders.
- Parameters
test_data_config – test data layer parameters.
- setup_optimization(optim_config: Optional[Union[omegaconf.DictConfig, Dict]] = None, optim_kwargs: Optional[Dict[str, Any]] = None)[source]#
Prepares an optimizer from a string name and its optional config parameters.
- Parameters
optim_config –
A dictionary containing the following keys:
”lr”: mandatory key for learning rate. Will raise ValueError if not provided.
”optimizer”: string name pointing to one of the available optimizers in the registry. If not provided, defaults to “adam”.
”opt_args”: Optional list of strings, in the format “arg_name=arg_value”. The list of “arg_value” will be parsed and a dictionary of optimizer kwargs will be built and supplied to instantiate the optimizer.
optim_kwargs – A dictionary with additional kwargs for the optimizer. Used for non-primitive types that are not compatible with OmegaConf.
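As a sketch, an equivalent call with an explicit config could look like this (values are illustrative):
```python
from omegaconf import OmegaConf

optim_config = OmegaConf.create({
    "optimizer": "adam",                              # name from the optimizer registry
    "lr": 1e-3,                                       # mandatory key
    "opt_args": ["weight_decay=0.001", "eps=1e-08"],  # parsed into optimizer kwargs
})
model.setup_optimization(optim_config=optim_config)
```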
- setup_optimizer_param_groups()[source]#
Used to create param groups for the optimizer. As an example, this can be used to specify per-layer learning rates:
```python
optim.SGD([
    {'params': model.base.parameters()},
    {'params': model.classifier.parameters(), 'lr': 1e-3},
], lr=1e-2, momentum=0.9)
```
See https://pytorch.org/docs/stable/optim.html for more information. By default, ModelPT will use self.parameters(). Override this method to add custom param groups. In the config file, add ‘optim_param_groups’ to support different LRs for different components (unspecified params will use the default LR):
```yaml
model:
  optim_param_groups:
    encoder:
      lr: 1e-4
      momentum: 0.8
    decoder:
      lr: 1e-3
  optim:
    lr: 3e-3
    momentum: 0.9
```
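Alternatively, a subclass can override the method directly. A minimal sketch, assuming the model exposes encoder and decoder attributes:
```python
def setup_optimizer_param_groups(self):
    # Two param groups: the decoder gets its own learning rate, while the
    # encoder falls back to the default lr from the optim config.
    self._optimizer_param_groups = [
        {"params": self.encoder.parameters()},
        {"params": self.decoder.parameters(), "lr": 1e-3},
    ]
```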
- setup(stage: Optional[str] = None)[source]#
Called at the beginning of fit, validate, test, or predict. This is called on every process when using DDP.
- Parameters
stage – fit, validate, test or predict
- validation_epoch_end(outputs: Union[List[Dict[str, torch.Tensor]], List[List[Dict[str, torch.Tensor]]]]) Optional[Dict[str, Dict[str, torch.Tensor]]] [source]#
Default epoch-end hook for the validation set, which automatically supports multiple data loaders via multi_validation_epoch_end.
If multi-dataset support is not required, override this method entirely in your subclass. In that case, there is no need to implement multi_validation_epoch_end either.
Note
If more than one data loader exists, and they all provide val_loss, only the val_loss of the first data loader will be used by default. This default can be changed by passing the special key val_dl_idx: int inside the validation_ds config.
- Parameters
outputs – Single or nested list of tensor outputs from one or more data loaders.
- Returns
A dictionary containing the union of all items from individual data_loaders, along with merged logs from all data loaders.
- test_epoch_end(outputs: Union[List[Dict[str, torch.Tensor]], List[List[Dict[str, torch.Tensor]]]]) Optional[Dict[str, Dict[str, torch.Tensor]]] [source]#
Default epoch-end hook for the test set, which automatically supports multiple data loaders via multi_test_epoch_end.
If multi-dataset support is not required, override this method entirely in your subclass. In that case, there is no need to implement multi_test_epoch_end either.
Note
If more than one data loader exists, and they all provide test_loss, only the test_loss of the first data loader will be used by default. This default can be changed by passing the special key test_dl_idx: int inside the test_ds config.
- Parameters
outputs – Single or nested list of tensor outputs from one or more data loaders.
- Returns
A dictionary containing the union of all items from individual data_loaders, along with merged logs from all data loaders.
- multi_validation_epoch_end(outputs: List[Dict[str, torch.Tensor]], dataloader_idx: int = 0) Optional[Dict[str, Dict[str, torch.Tensor]]] [source]#
Adds support for multiple validation datasets. Should be overridden by the subclass so as to obtain appropriate logs for each of the dataloaders.
- Parameters
outputs – Same as that provided by LightningModule.validation_epoch_end() for a single dataloader.
dataloader_idx – int representing the index of the dataloader.
- Returns
A dictionary of values, optionally containing a sub-dict log, such that the values in the log will be pre-pended by the dataloader prefix.
- multi_test_epoch_end(outputs: List[Dict[str, torch.Tensor]], dataloader_idx: int = 0) Optional[Dict[str, Dict[str, torch.Tensor]]] [source]#
Adds support for multiple test datasets. Should be overridden by the subclass so as to obtain appropriate logs for each of the dataloaders.
- Parameters
outputs – Same as that provided by LightningModule.test_epoch_end() for a single dataloader.
dataloader_idx – int representing the index of the dataloader.
- Returns
A dictionary of values, optionally containing a sub-dict log, such that the values in the log will be pre-pended by the dataloader prefix.
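A typical override of either hook reduces the per-dataloader outputs. A sketch, assuming each step returned a val_loss tensor:
```python
import torch

def multi_validation_epoch_end(self, outputs, dataloader_idx: int = 0):
    # Average the per-batch losses for this particular dataloader.
    val_loss_mean = torch.stack([x["val_loss"] for x in outputs]).mean()
    # Values under 'log' are prepended with this dataloader's prefix.
    return {"val_loss": val_loss_mean, "log": {"val_loss": val_loss_mean}}
```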
- get_validation_dataloader_prefix(dataloader_idx: int = 0) str [source]#
Get the name of one or more data loaders, which will be prepended to all logs.
- Parameters
dataloader_idx – Index of the data loader.
- Returns
str name of the data loader at index provided.
- get_test_dataloader_prefix(dataloader_idx: int = 0) str [source]#
Get the name of one or more data loaders, which will be prepended to all logs.
- Parameters
dataloader_idx – Index of the data loader.
- Returns
str name of the data loader at index provided.
- maybe_init_from_pretrained_checkpoint(cfg: omegaconf.OmegaConf, map_location: str = 'cpu')#
Initializes a given model with the parameters obtained via specific config arguments. The state dict of the provided model will be updated with strict=False, so exact matching of model parameters is not required.
- Initializations:
init_from_nemo_model: Str path to a .nemo model in order to load state_dict from single nemo file; if loading from multiple files, pass in a dict where the values have the following fields:
path: Str path to .nemo model
include: Optional list of strings, at least one of which needs to be contained in parameter name to be loaded from this .nemo file. Default: everything is included.
exclude: Optional list of strings, which can be used to exclude any parameter containing one of these strings from being loaded from this .nemo file. Default: nothing is excluded.
hydra usage example:
```yaml
init_from_nemo_model:
  model0:
    path: <path/to/model1>
    include: ["encoder"]
  model1:
    path: <path/to/model2>
    include: ["decoder"]
    exclude: ["embed"]
```
- init_from_pretrained_model: Str name of a pretrained model checkpoint (obtained via cloud).
The model will be downloaded (or a cached copy will be used), instantiated and then its state dict will be extracted. If loading from multiple models, you can pass in a dict with the same format as for init_from_nemo_model, except with “name” instead of “path”
- init_from_ptl_ckpt: Str path to a PyTorch Lightning checkpoint file. It will be loaded and
its state dict will be extracted. If loading from multiple files, you can pass in a dict with the same format as for init_from_nemo_model.
- Parameters
cfg – The config used to instantiate the model. It need only contain one of the above keys.
map_location – str or torch.device() which represents where the intermediate state dict (from the pretrained model or checkpoint) will be loaded.
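A sketch of the call, assuming the config requests encoder weights from an existing .nemo file (the path is illustrative):
```python
from omegaconf import OmegaConf

cfg = OmegaConf.create({
    "init_from_nemo_model": {
        "model0": {
            "path": "/path/to/pretrained.nemo",  # illustrative path
            "include": ["encoder"],              # load only encoder parameters
        },
    },
})
model.maybe_init_from_pretrained_checkpoint(cfg)
```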
- classmethod extract_state_dict_from(restore_path: str, save_dir: str, split_by_module: bool = False, save_restore_connector: Optional[SaveRestoreConnector] = None)[source]#
Extract the state dict(s) from a provided .nemo tarfile and save it to a directory.
- Parameters
restore_path – path to .nemo file from which state dict(s) should be extracted
save_dir – directory in which the saved state dict(s) should be stored
split_by_module – bool flag which determines whether the output checkpoint should be for the entire Model or for the individual modules that comprise the Model.
save_restore_connector (SaveRestoreConnector) – Can be overridden to add custom save and restore logic.
Example
To convert the .nemo tarfile into a single Model-level PyTorch checkpoint:
```python
state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from('asr.nemo', './asr_ckpts')
```
To restore a model from a Model-level checkpoint:
```python
model = nemo.collections.asr.models.EncDecCTCModel(cfg)  # or any other method of restoration
model.load_state_dict(torch.load('./asr_ckpts/model_weights.ckpt'))
```
To convert the .nemo tarfile into multiple Module-level PyTorch checkpoints:
```python
state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from('asr.nemo', './asr_ckpts', split_by_module=True)
```
To restore a module from a Module-level checkpoint:
```python
model = nemo.collections.asr.models.EncDecCTCModel(cfg)  # or any other method of restoration

# load the individual components
model.preprocessor.load_state_dict(torch.load('./asr_ckpts/preprocessor.ckpt'))
model.encoder.load_state_dict(torch.load('./asr_ckpts/encoder.ckpt'))
model.decoder.load_state_dict(torch.load('./asr_ckpts/decoder.ckpt'))
```
- Returns
The state dict that was loaded from the original .nemo checkpoint
- prepare_test(trainer: Trainer) bool [source]#
Helper method to check whether the model can safely be tested on a dataset after training (or loading a checkpoint).
```python
trainer = Trainer()
if model.prepare_test(trainer):
    trainer.test(model)
```
- Returns
bool which declares the model safe to test. Emits warnings to guide the user when it has to return False.
- set_trainer(trainer: pytorch_lightning.Trainer)[source]#
Set an instance of Trainer object.
- Parameters
trainer – PyTorch Lightning Trainer object.
- set_world_size(trainer: pytorch_lightning.Trainer)[source]#
Determines the world size from the PyTorch Lightning Trainer and then updates AppState.
- Parameters
trainer (Trainer) – PyTorch Lightning Trainer object
- summarize(max_depth: int = 1) pytorch_lightning.utilities.model_summary.ModelSummary [source]#
Summarize this LightningModule.
- Parameters
max_depth – The maximum depth of layer nesting that the summary will include. A value of 0 turns the layer summary off. Default: 1.
- Returns
The model summary object
- property num_weights#
Utility property that returns the total number of parameters of the Model.
- trainer()#
- property cfg#
Property that holds the finalized internal config of the model.
Note
Changes to this config are not reflected in the state of the model. Please create a new model using an updated config to properly update the model.
- on_train_start()[source]#
PyTorch Lightning hook: https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html#on-train-start We use it here to copy the relevant config for dynamic freezing.
- on_train_batch_start(batch: Any, batch_idx: int, unused: int = 0) Optional[int] [source]#
PyTorch Lightning hook: https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html#on-train-batch-start We use it here to enable nsys profiling and dynamic freezing.
- on_train_batch_end(outputs, batch: Any, batch_idx: int, unused: int = 0) None [source]#
PyTorch Lightning hook: https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html#on-train-batch-end We use it here to enable nsys profiling.
- on_train_end()[source]#
PyTorch Lightning hook: https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html#on-train-end We use it here to clean up the dynamic freezing config.
- cuda(device=None)[source]#
- PTL overrides this method, changing the default PyTorch behavior of a module.
The PTL LightningModule override moves the module to device 0 if device is None. See the PTL method here: Lightning-AI/lightning
Here we override it again to maintain the default PyTorch nn.Module behavior: pytorch/pytorch
Moves all model parameters and buffers to the GPU.
This also makes associated parameters and buffers different objects, so it should be called before constructing the optimizer if the module will live on the GPU while being optimized.
Note
This method modifies the module in-place.
- Parameters
device (int, optional) – if specified, all parameters will be copied to that device
- Returns
self
- Return type
Module
Base Neural Module class#
- class nemo.core.NeuralModule(*args: Any, **kwargs: Any)[source]#
Bases:
Module
,Typing
,Serialization
,FileIO
Abstract class offering interface shared between all PyTorch Neural Modules.
- property num_weights#
Utility property that returns the total number of parameters of NeuralModule.
Base Mixin classes#
- class nemo.core.Typing[source]
Bases:
ABC
An interface which endows a module with neural types
- property input_types: Optional[Dict[str, NeuralType]]
Define these to enable input neural type checks
- property output_types: Optional[Dict[str, NeuralType]]
Define these to enable output neural type checks
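A sketch of a module that opts in to type checking by defining both properties (the axes and element types chosen here are illustrative):
```python
from nemo.core import NeuralModule
from nemo.core.neural_types import LogitsType, NeuralType, SpectrogramType

class MyEncoder(NeuralModule):
    @property
    def input_types(self):
        # [batch, dimension, time] spectrogram input
        return {"audio_signal": NeuralType(('B', 'D', 'T'), SpectrogramType())}

    @property
    def output_types(self):
        # [batch, time, dimension] logits output
        return {"logits": NeuralType(('B', 'T', 'D'), LogitsType())}
```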
- _validate_input_types(input_types=None, ignore_collections=False, **kwargs)[source]
This function does a few things.
- It ensures that len(self.input_types <non-optional>) <= len(kwargs) <= len(self.input_types).
- For each (keyword name, keyword value) pair passed as input to the wrapped function:
  - Check that the keyword name exists in the list of valid self.input_types names.
  - Check whether the keyword value has the neural_type property. If it does, perform a comparative check and assert that the neural types are compatible (SAME or GREATER).
  - Check whether the keyword value is a container type (list or tuple). If so, perform the elementwise neural type test above on each element of the nested structure, recursively.
- Parameters
input_types – Either the input_types defined at class level, or the local function overridden type definition.
ignore_collections – For backward compatibility, container support can be disabled explicitly using this flag. When set to True, all nesting is ignored and nest-depth checks are skipped.
kwargs – Dictionary of argument_name:argument_value pairs passed to the wrapped function upon call.
- _attach_and_validate_output_types(out_objects, ignore_collections=False, output_types=None)[source]
This function does a few things.
- It ensures that len(out_objects) == len(self.output_types).
- If the output is a tensor (or a list/tuple of lists/tuples ... of tensors), it attaches a neural_type to it. For objects without the neural_type attribute, such as plain Python objects (dictionaries, lists, primitive data types, structs), no neural_type is attached.
Note: tensor.neural_type is only checked during _validate_input_types, which is called prior to forward().
- Parameters
output_types – Either the output_types defined at class level, or the local function overridden type definition.
ignore_collections – For backward compatibility, container support can be disabled explicitly using this flag. When set to True, all nesting is ignored and nest-depth checks are skipped.
out_objects – The outputs of the wrapped function.
- __check_neural_type(obj, metadata: TypecheckMetadata, depth: int, name: Optional[str] = None)
Recursively tests whether the obj satisfies the semantic neural type assertion. Can include shape checks if shape information is provided.
- Parameters
obj – Any python object that can be assigned a value.
metadata – TypecheckMetadata object.
depth – Current depth of recursion.
name – Optional name of the source obj, used when an error occurs.
- __attach_neural_type(obj, metadata: TypecheckMetadata, depth: int, name: Optional[str] = None)
Recursively attach neural types to a given object - as long as it can be assigned some value.
- Parameters
obj – Any python object that can be assigned a value.
metadata – TypecheckMetadata object.
depth – Current depth of recursion.
name – Optional name of the source obj, used when an error occurs.
- class nemo.core.Serialization[source]
Bases:
ABC
- classmethod from_config_dict(config: DictConfig, trainer: Optional[Trainer] = None)[source]
Instantiates object using DictConfig-based configuration
- to_config_dict() omegaconf.DictConfig [source]
Returns the object's configuration as a config dictionary
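A round-trip sketch (module stands for any instance of a class that inherits Serialization):
```python
# Serialize the instance's configuration ...
cfg = module.to_config_dict()
# ... and build a fresh, randomly initialized instance from it.
clone = type(module).from_config_dict(cfg)
```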
- class nemo.core.FileIO[source]
Bases:
ABC
- save_to(save_path: str)[source]
Standardized method to save a tarfile containing the checkpoint, config, and any additional artifacts. Implemented via nemo.core.connectors.save_restore_connector.SaveRestoreConnector.save_to().
- Parameters
save_path – str, path to where the file should be saved.
- classmethod restore_from(restore_path: str, override_config_path: Optional[str] = None, map_location: Optional[torch.device] = None, strict: bool = True, return_config: bool = False, trainer: Optional[Trainer] = None, save_restore_connector: SaveRestoreConnector = None)[source]
Restores model instance (weights and configuration) from a .nemo file
- Parameters
restore_path – path to .nemo file from which model should be instantiated
override_config_path – path to a yaml config that will override the internal config file or an OmegaConf / DictConfig object representing the model config.
map_location – Optional torch.device() to map the instantiated model to a device. By default (None), it will select a GPU if available, falling back to CPU otherwise.
strict – Passed to load_state_dict. By default True
return_config – If set to true, will return just the underlying config of the restored model as an OmegaConf DictConfig object without instantiating the model.
trainer – An optional Trainer object, passed to the model constructor.
save_restore_connector – An optional SaveRestoreConnector object that defines the implementation of the restore_from() method.
- classmethod from_config_file(path2yaml_file: str)[source]
Instantiates an instance of a NeMo model from a YAML config file. Weights will be initialized randomly.
- Parameters
path2yaml_file – path to a yaml file with the model configuration.
- to_config_file(path2yaml_file: str)[source]
Saves the current instance's configuration to a YAML config file. Weights will not be saved.
- Parameters
path2yaml_file – path to the yaml file where the model configuration will be saved.
Base Connector classes#
- class nemo.core.connectors.save_restore_connector.SaveRestoreConnector[source]#
Bases:
object
- save_to(model: nemo_classes.ModelPT, save_path: str)[source]#
Saves model instance (weights and configuration) into a .nemo file. You can use the "restore_from" method to fully restore the instance from the .nemo file.
- A .nemo file is an archive (tar.gz) containing:
model_config.yaml - the model configuration in .yaml format. You can deserialize this into the cfg argument for the model's constructor.
model_weights.ckpt - the model checkpoint.
- Parameters
model – ModelPT object to be saved.
save_path – Path to .nemo file where model instance should be saved
- load_config_and_state_dict(calling_cls, restore_path: str, override_config_path: Optional[Union[omegaconf.OmegaConf, str]] = None, map_location: Optional[torch.device] = None, strict: bool = True, return_config: bool = False, trainer: Optional[pytorch_lightning.trainer.trainer.Trainer] = None)[source]#
Restores model instance (weights and configuration) from a .nemo file
- Parameters
restore_path – path to .nemo file from which model should be instantiated
override_config_path – path to a yaml config that will override the internal config file or an OmegaConf / DictConfig object representing the model config.
map_location – Optional torch.device() to map the instantiated model to a device. By default (None), it will select a GPU if available, falling back to CPU otherwise.
strict – Passed to load_state_dict. By default True
return_config – If set to true, will return just the underlying config of the restored model as an OmegaConf DictConfig object without instantiating the model.
Example
```python
model = nemo.collections.asr.models.EncDecCTCModel.restore_from('asr.nemo')
assert isinstance(model, nemo.collections.asr.models.EncDecCTCModel)
```
- Returns
An instance of type cls or its underlying config (if return_config is set).
- modify_state_dict(conf, state_dict)[source]#
Utility method that allows modifying the state dict before loading parameters into a model.
- Parameters
conf – A model-level OmegaConf object.
state_dict – The state dict restored from the checkpoint.
- Returns
A potentially modified state dict.
- load_instance_with_state_dict(instance, state_dict, strict)[source]#
Utility method that loads a model instance with the (potentially modified) state dict.
- Parameters
instance – ModelPT subclass instance.
state_dict – The state dict (which may have been modified)
strict – Bool, whether to perform strict checks when loading the state dict.
- restore_from(calling_cls, restore_path: str, override_config_path: Optional[Union[omegaconf.OmegaConf, str]] = None, map_location: Optional[torch.device] = None, strict: bool = True, return_config: bool = False, trainer: Optional[pytorch_lightning.trainer.trainer.Trainer] = None)[source]#
Restores model instance (weights and configuration) from a .nemo file
- Parameters
restore_path – path to .nemo file from which model should be instantiated
override_config_path – path to a yaml config that will override the internal config file or an OmegaConf / DictConfig object representing the model config.
map_location – Optional torch.device() to map the instantiated model to a device. By default (None), it will select a GPU if available, falling back to CPU otherwise.
strict – Passed to load_state_dict. By default True
return_config – If set to true, will return just the underlying config of the restored model as an OmegaConf DictConfig object without instantiating the model.
trainer – An optional Trainer object, passed to the model constructor.
Example
```python
model = nemo.collections.asr.models.EncDecCTCModel.restore_from('asr.nemo')
assert isinstance(model, nemo.collections.asr.models.EncDecCTCModel)
```
- Returns
An instance of type cls or its underlying config (if return_config is set).
- extract_state_dict_from(restore_path: str, save_dir: str, split_by_module: bool = False)[source]#
Extract the state dict(s) from a provided .nemo tarfile and save it to a directory.
- Parameters
restore_path – path to .nemo file from which state dict(s) should be extracted
save_dir – directory in which the saved state dict(s) should be stored
split_by_module – bool flag which determines whether the output checkpoint should be for the entire Model or for the individual modules that comprise the Model.
Example
To convert the .nemo tarfile into a single Model-level PyTorch checkpoint:
```python
state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from('asr.nemo', './asr_ckpts')
```
To restore a model from a Model-level checkpoint:
```python
model = nemo.collections.asr.models.EncDecCTCModel(cfg)  # or any other method of restoration
model.load_state_dict(torch.load('./asr_ckpts/model_weights.ckpt'))
```
To convert the .nemo tarfile into multiple Module-level PyTorch checkpoints:
```python
state_dict = nemo.collections.asr.models.EncDecCTCModel.extract_state_dict_from('asr.nemo', './asr_ckpts', split_by_module=True)
```
To restore a module from a Module-level checkpoint:
```python
model = nemo.collections.asr.models.EncDecCTCModel(cfg)  # or any other method of restoration

# load the individual components
model.preprocessor.load_state_dict(torch.load('./asr_ckpts/preprocessor.ckpt'))
model.encoder.load_state_dict(torch.load('./asr_ckpts/encoder.ckpt'))
model.decoder.load_state_dict(torch.load('./asr_ckpts/decoder.ckpt'))
```
- Returns
The state dict that was loaded from the original .nemo checkpoint
- register_artifact(model, config_path: str, src: str, verify_src_exists: bool = True)[source]#
Register model artifacts with this function. These artifacts (files) will be included inside .nemo file when model.save_to(“mymodel.nemo”) is called.
How it works:
- It always returns an existing absolute path which can be used during the model constructor call. Exception: if src is None or "", nothing is done and src is returned as-is.
- It adds a (config_path, model_utils.ArtifactItem()) pair to self.artifacts.
- If src is an existing local path, it is returned in absolute form; if src starts with "nemo_file:unique_artifact_name", the .nemo file is untarred to a temporary folder and the actual existing path is returned; otherwise, an error is raised.
WARNING: call .register_artifact only inside your model's constructor. The returned path is not guaranteed to exist after you have exited the constructor.
- Parameters
model – ModelPT object to register artifact for.
config_path (str) – Artifact key. Usually corresponds to the model config.
src (str) – Path to artifact.
verify_src_exists (bool) – If set to False, then the artifact is optional and register_artifact will return None even if src is not found. Defaults to True.
- Returns
- If src is not None or empty, returns an absolute path that is guaranteed to exist for the lifetime of the model instance.
- Return type
str
- property model_config_yaml: str#
- property model_weights_ckpt: str#
- property model_extracted_dir: Optional[str]#
Neural Type checking#
- class nemo.core.classes.common.typecheck(input_types: Union[TypeState, Dict[str, NeuralType]] = TypeState.UNINITIALIZED, output_types: Union[TypeState, Dict[str, NeuralType]] = TypeState.UNINITIALIZED, ignore_collections: bool = False)[source]#
Bases:
object
A decorator which performs input-output neural type checks, and attaches neural types to the output of the function that it wraps.
Requires that the class inherit from
Typing
in order to perform type checking, and will raise an error if that is not the case.
Usage (class-level type support):
```python
@typecheck()
def fn(self, arg1, arg2, ...):
    ...
```
Usage (function-level type support):
```python
@typecheck(input_types=..., output_types=...)
def fn(self, arg1, arg2, ...):
    ...
```
Points to be noted:
The parentheses () in @typecheck() are necessary.
Without them, you will encounter TypeError: __init__() takes 1 positional argument but X were given.
The function can take any number of positional arguments during definition.
When you call this function, all arguments must be passed using kwargs only.
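Putting this together, a minimal end-to-end sketch (the module and its types are illustrative):
```python
import torch
from nemo.core import NeuralModule
from nemo.core.classes.common import typecheck
from nemo.core.neural_types import NeuralType, VoidType

class AddOne(NeuralModule):
    @property
    def input_types(self):
        return {"x": NeuralType(('B',), VoidType())}

    @property
    def output_types(self):
        return {"y": NeuralType(('B',), VoidType())}

    @typecheck()
    def forward(self, *, x):
        # The wrapper validates 'x' against input_types and attaches
        # the declared neural type to the returned tensor.
        return x + 1.0

m = AddOne()
out = m(x=torch.zeros(4))  # arguments must be passed as kwargs
```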
- __call__(enabled=None, adapter=None, proxy=<class 'FunctionWrapper'>)[source]#
Wrapper method that can be used on any function of a class that implements Typing. By default, it will utilize the input_types and output_types properties of the class inheriting Typing.
Local function-level overrides can be provided by supplying dictionaries as arguments to the decorator.
- Parameters
input_types – Union[TypeState, Dict[str, NeuralType]]. By default, uses the global input_types.
output_types – Union[TypeState, Dict[str, NeuralType]]. By default, uses the global output_types.
ignore_collections – Bool. Determines if container types should be asserted for depth checks, or if depth checks are skipped entirely.
- class TypeState(value)[source]#
Bases:
Enum
Placeholder to denote the default value of the type information provided. If the constructor of this decorator is used to override the class-level type definition, this enum value indicates that types will be overridden.
- UNINITIALIZED = 0#
Neural Type classes#
- class nemo.core.neural_types.NeuralType(axes: Optional[Tuple] = None, elements_type: ElementType = VoidType, optional=False)[source]#
Bases:
object
This is the main class representing the neural type concept. It is used to represent the types of inputs and outputs.
- Parameters
axes (Optional[Tuple]) – a tuple of AxisType objects representing the semantics of what varying each axis means. You can use a short, string-based form here. For example, ('B', 'C', 'H', 'W') would correspond to the NCHW format frequently used in computer vision, and ('B', 'T', 'D') is frequently used for signal processing and means [batch, time, dimension/channel].
elements_type (ElementType) – an instance of ElementType class representing the semantics of what is stored inside the tensor. For example: logits (LogitsType), log probabilities (LogprobType), etc.
optional (bool) – By default, this is false. If set to True, it means that the input to the port of this type can be optional.
- compare(second) NeuralTypeComparisonResult [source]#
Performs neural type comparison of self with second. When you chain two modules’ inputs/outputs via __call__ method, this comparison will be called to ensure neural type compatibility.
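For example, a sketch constructing and comparing two types (the element types come from nemo.core.neural_types):
```python
from nemo.core.neural_types import (
    AcousticEncodedRepresentation,
    LogitsType,
    NeuralType,
)

# [batch, dimension, time] encoder output
encoded = NeuralType(('B', 'D', 'T'), AcousticEncodedRepresentation())
# [batch, time, dimension] logits; optional=True marks an optional input port
logits = NeuralType(('B', 'T', 'D'), LogitsType(), optional=True)

print(encoded.compare(logits))  # a NeuralTypeComparisonResult member
```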
- class nemo.core.neural_types.axes.AxisType(kind: AxisKindAbstract, size: Optional[int] = None, is_list=False)[source]#
Bases:
object
This class represents axis semantics and (optionally) its dimensionality.
- Parameters
kind (AxisKindAbstract) – what kind of axis it is, for example Batch, Height, etc.
size (int, optional) – specify if the axis should have a fixed size. By default it is set to None; you typically do not want to set it for Batch and Time.
is_list (bool, default=False) – whether this is a list axis or a tensor axis.
- class nemo.core.neural_types.elements.ElementType[source]#
Bases:
ABC
Abstract class defining semantics of the tensor elements. We are relying on Python for inheritance checking
- property type_parameters: Dict#
Override this property to parametrize your type. For example, you can specify a 'storage' type such as float, int, or bool with the 'dtype' keyword. As another example, if you want to represent a signal with a particular property (say, sample frequency), you can put sample_freq->value in there. When two types are compared, their type_parameters must match.
- property fields: Optional[Tuple]#
This should be used to logically represent tuples/structures. For example, if you want to represent a bounding box (x, y, width, height) you can put a tuple with names ('x', 'y', 'w', 'h') in here. Under the hood this should be converted to the last tensor dimension of fixed size = len(fields). When two types are compared, their fields must match.
- compare(second) NeuralTypeComparisonResult [source]#
- class nemo.core.neural_types.comparison.NeuralTypeComparisonResult(value)[source]#
Bases:
Enum
The result of comparing two neural type objects for compatibility. When comparing A.compare_to(B):
- SAME = 0#
- LESS = 1#
- GREATER = 2#
- DIM_INCOMPATIBLE = 3#
- TRANSPOSE_SAME = 4#
- CONTAINER_SIZE_MISMATCH = 5#
- INCOMPATIBLE = 6#
- SAME_TYPE_INCOMPATIBLE_PARAMS = 7#
- UNCHECKED = 8#
Experiment manager#
- class nemo.utils.exp_manager.exp_manager(trainer: pytorch_lightning.Trainer, cfg: Optional[Union[omegaconf.DictConfig, Dict]] = None)[source]#
exp_manager is a helper function used to manage folders for experiments. It follows the PyTorch Lightning paradigm of exp_dir/model_or_experiment_name/version. If the lightning trainer has a logger, exp_manager will get exp_dir, name, and version from the logger. Otherwise it will use the exp_dir and name arguments to create the logging directory. exp_manager also allows for explicit folder creation via explicit_log_dir.
The version can be a datetime string or an integer. The datetime version can be disabled by setting use_datetime_version to False. exp_manager optionally creates TensorBoardLogger, WandBLogger, DLLogger, MLFlowLogger, ClearMLLogger, and ModelCheckpoint objects from pytorch lightning. It copies sys.argv and git information, if available, to the logging directory, and creates a log file for each process to log its output into.
exp_manager additionally has a resume feature (resume_if_exists) which can be used to continue training from the constructed log_dir. When you need to continue training repeatedly (such as on a cluster where you need multiple consecutive jobs), you need to avoid creating version folders. Therefore, from v1.0.0, when resume_if_exists is set to True, creating the version folders is skipped.
- Parameters
trainer (pytorch_lightning.Trainer) – The lightning trainer.
cfg (DictConfig, dict) –
Can have the following keys:
- explicit_log_dir (str, Path): Can be used to override exp_dir/name/version folder creation. Defaults to
None, which will use exp_dir, name, and version to construct the logging directory.
- exp_dir (str, Path): The base directory to create the logging directory. Defaults to None, which logs to
./nemo_experiments.
- name (str): The name of the experiment. Defaults to None which turns into “default” via name = name or
”default”.
- version (str): The version of the experiment. Defaults to None which uses either a datetime string or
lightning’s TensorboardLogger system of using version_{int}.
use_datetime_version (bool): Whether to use a datetime string for version. Defaults to True.
- resume_if_exists (bool): Whether this experiment is resuming from a previous run. If True, it sets
trainer._checkpoint_connector.resume_from_checkpoint_fit_path so that the trainer should auto-resume. exp_manager will move files under log_dir to log_dir/run_{int}. Defaults to False. From v1.0.0, when resume_if_exists is True, we would not create version folders to make it easier to find the log folder for next runs.
- resume_past_end (bool): exp_manager errors out if resume_if_exists is True and a checkpoint matching
*end.ckpt exists, indicating that a previous training run fully completed. This behaviour can be disabled by setting resume_past_end to True, in which case the *end.ckpt will be loaded. Defaults to False.
- resume_ignore_no_checkpoint (bool): exp_manager errors out if resume_if_exists is True and no checkpoint
could be found. This behaviour can be disabled, in which case exp_manager will print a message and continue without restoring, by setting resume_ignore_no_checkpoint to True. Defaults to False.
- create_tensorboard_logger (bool): Whether to create a tensorboard logger and attach it to the pytorch
lightning trainer. Defaults to True.
- summary_writer_kwargs (dict): A dictionary of kwargs that can be passed to lightning’s TensorboardLogger
class. Note that log_dir is passed by exp_manager and cannot exist in this dict. Defaults to None.
- create_wandb_logger (bool): Whether to create a Weights and Biases logger and attach it to the pytorch
lightning trainer. Defaults to False.
- wandb_logger_kwargs (dict): A dictionary of kwargs that can be passed to lightning’s WandBLogger
class. Note that name and project are required parameters if create_wandb_logger is True. Defaults to None.
- create_mlflow_logger (bool): Whether to create an MLFlow logger and attach it to the pytorch lightning
trainer. Defaults to False.
mlflow_logger_kwargs (dict): optional parameters for the MLFlow logger
- create_dllogger_logger (bool): Whether to create a DLLogger logger and attach it to the pytorch lightning
trainer. Defaults to False.
dllogger_logger_kwargs (dict): optional parameters for the DLLogger logger
- create_clearml_logger (bool): Whether to create a ClearML logger and attach it to the pytorch lightning
trainer. Defaults to False.
clearml_logger_kwargs (dict): optional parameters for the ClearML logger
- create_checkpoint_callback (bool): Whether to create a ModelCheckpoint callback and attach it to the
pytorch lightning trainer. The ModelCheckpoint saves the top 3 models with the best "val_loss", the most recent checkpoint under *last.ckpt, and the final checkpoint after training completes under *end.ckpt. Defaults to True.
create_early_stopping_callback (bool): Flag to decide if early stopping should be used to stop training. Default is False.
See EarlyStoppingParams dataclass above.
create_preemption_callback (bool): Flag to decide whether to enable preemption callback to save checkpoints and exit training
immediately upon preemption. Default is True.
- files_to_copy (list): A list of files to copy to the experiment logging directory. Defaults to None which
copies no files.
- log_local_rank_0_only (bool): Whether to only create log files for local rank 0. Defaults to False.
Set this to True if you are using DDP with many GPUs and do not want many log files in your exp dir.
- log_global_rank_0_only (bool): Whether to only create log files for global rank 0. Defaults to False.
Set this to True if you are using DDP with many GPUs and do not want many log files in your exp dir.
- max_time (str): The maximum wall clock time per run. This is intended to be used on clusters where you want
a checkpoint to be saved after this specified time and be able to resume from that checkpoint. Defaults to None.
- Returns
- The final logging directory where logging files are saved. Usually the concatenation of
exp_dir, name, and version.
- Return type
log_dir (Path)
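A typical usage sketch (the directory and experiment name are illustrative):
```python
import pytorch_lightning as pl
from nemo.utils.exp_manager import exp_manager

trainer = pl.Trainer(max_epochs=10)
log_dir = exp_manager(
    trainer,
    {
        "exp_dir": "./nemo_experiments",      # base directory for logs
        "name": "my_experiment",              # experiment name
        "resume_if_exists": True,             # auto-resume from the latest checkpoint
        "resume_ignore_no_checkpoint": True,  # first run: continue without a checkpoint
    },
)
```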
- class nemo.utils.exp_manager.ExpManagerConfig(explicit_log_dir: Optional[str] = None, exp_dir: Optional[str] = None, name: Optional[str] = None, version: Optional[str] = None, use_datetime_version: Optional[bool] = True, resume_if_exists: Optional[bool] = False, resume_past_end: Optional[bool] = False, resume_ignore_no_checkpoint: Optional[bool] = False, create_tensorboard_logger: Optional[bool] = True, summary_writer_kwargs: Optional[Dict[Any, Any]] = None, create_wandb_logger: Optional[bool] = False, wandb_logger_kwargs: Optional[Dict[Any, Any]] = None, create_mlflow_logger: Optional[bool] = False, mlflow_logger_kwargs: Optional[MLFlowParams] = MLFlowParams(experiment_name=None, tracking_uri=None, tags=None, save_dir='./mlruns', prefix='', artifact_location=None, run_id=None), create_dllogger_logger: Optional[bool] = False, dllogger_logger_kwargs: Optional[DLLoggerParams] = DLLoggerParams(verbose=False, stdout=False, json_file='./dllogger.json'), create_clearml_logger: Optional[bool] = False, clearml_logger_kwargs: Optional[ClearMLParams] = ClearMLParams(project=None, task=None, connect_pytorch=False, model_name=None, tags=None, log_model=False, log_cfg=False, log_metrics=False), create_checkpoint_callback: Optional[bool] = True, checkpoint_callback_params: Optional[CallbackParams] = CallbackParams(filepath=None, dirpath=None, filename=None, monitor='val_loss', verbose=True, save_last=True, save_top_k=3, save_weights_only=False, mode='min', auto_insert_metric_name=True, every_n_epochs=1, every_n_train_steps=None, train_time_interval=None, prefix=None, postfix='.nemo', save_best_model=False, always_save_nemo=False, save_nemo_on_train_end=True, model_parallel_size=None, save_on_train_epoch_end=False), create_early_stopping_callback: Optional[bool] = False, early_stopping_callback_params: Optional[EarlyStoppingParams] = EarlyStoppingParams(monitor='val_loss', mode='min', min_delta=0.001, patience=10, verbose=True, strict=True, check_finite=True, stopping_threshold=None, divergence_threshold=None, check_on_train_epoch_end=None, log_rank_zero_only=False), create_preemption_callback: Optional[bool] = True, files_to_copy: Optional[List[str]] = None, log_step_timing: Optional[bool] = True, step_timing_kwargs: Optional[StepTimingParams] = StepTimingParams(reduction='mean', sync_cuda=False, buffer_size=1), log_local_rank_0_only: Optional[bool] = False, log_global_rank_0_only: Optional[bool] = False, disable_validation_on_resume: Optional[bool] = True, ema: Optional[EMAParams] = EMAParams(enable=False, decay=0.999, cpu_offload=False, validate_original_weights=False, every_n_steps=1), max_time_per_run: Optional[str] = None)[source]#
Bases:
object
Experiment Manager config for validation of passed arguments.
- explicit_log_dir: Optional[str] = None#
- exp_dir: Optional[str] = None#
- name: Optional[str] = None#
- version: Optional[str] = None#
- use_datetime_version: Optional[bool] = True#
- resume_if_exists: Optional[bool] = False#
- resume_past_end: Optional[bool] = False#
- resume_ignore_no_checkpoint: Optional[bool] = False#
- create_tensorboard_logger: Optional[bool] = True#
- summary_writer_kwargs: Optional[Dict[Any, Any]] = None#
- create_wandb_logger: Optional[bool] = False#
- wandb_logger_kwargs: Optional[Dict[Any, Any]] = None#
- create_mlflow_logger: Optional[bool] = False#
- mlflow_logger_kwargs: Optional[MLFlowParams] = MLFlowParams(experiment_name=None, tracking_uri=None, tags=None, save_dir='./mlruns', prefix='', artifact_location=None, run_id=None)#
- create_dllogger_logger: Optional[bool] = False#
- dllogger_logger_kwargs: Optional[DLLoggerParams] = DLLoggerParams(verbose=False, stdout=False, json_file='./dllogger.json')#
- create_clearml_logger: Optional[bool] = False#
- clearml_logger_kwargs: Optional[ClearMLParams] = ClearMLParams(project=None, task=None, connect_pytorch=False, model_name=None, tags=None, log_model=False, log_cfg=False, log_metrics=False)#
- create_checkpoint_callback: Optional[bool] = True#
- checkpoint_callback_params: Optional[CallbackParams] = CallbackParams(filepath=None, dirpath=None, filename=None, monitor='val_loss', verbose=True, save_last=True, save_top_k=3, save_weights_only=False, mode='min', auto_insert_metric_name=True, every_n_epochs=1, every_n_train_steps=None, train_time_interval=None, prefix=None, postfix='.nemo', save_best_model=False, always_save_nemo=False, save_nemo_on_train_end=True, model_parallel_size=None, save_on_train_epoch_end=False)#
- create_early_stopping_callback: Optional[bool] = False#
- early_stopping_callback_params: Optional[EarlyStoppingParams] = EarlyStoppingParams(monitor='val_loss', mode='min', min_delta=0.001, patience=10, verbose=True, strict=True, check_finite=True, stopping_threshold=None, divergence_threshold=None, check_on_train_epoch_end=None, log_rank_zero_only=False)#
- create_preemption_callback: Optional[bool] = True#
- files_to_copy: Optional[List[str]] = None#
- log_step_timing: Optional[bool] = True#
- step_timing_kwargs: Optional[StepTimingParams] = StepTimingParams(reduction='mean', sync_cuda=False, buffer_size=1)#
- log_local_rank_0_only: Optional[bool] = False#
- log_global_rank_0_only: Optional[bool] = False#
- disable_validation_on_resume: Optional[bool] = True#
- ema: Optional[EMAParams] = EMAParams(enable=False, decay=0.999, cpu_offload=False, validate_original_weights=False, every_n_steps=1)#
- max_time_per_run: Optional[str] = None#
Exportable#
- class nemo.core.classes.exportable.Exportable[source]#
Bases:
ABC
This interface should be implemented by particular classes derived from nemo.core.NeuralModule or nemo.core.ModelPT. It gives these entities the ability to be exported for deployment to formats such as ONNX.
- Usage:
```python
# exporting pre-trained model to ONNX file for deployment.
model.eval()
model.to('cuda')  # or to('cpu') if you don't have a GPU

model.export('mymodel.onnx')  # all arguments apart from the output path are optional
```
- property input_module#
- property output_module#
- export(output: str, input_example=None, verbose=False, do_constant_folding=True, onnx_opset_version=None, check_trace: Union[bool, List[torch.Tensor]] = False, dynamic_axes=None, check_tolerance=0.01, export_modules_as_functions=False, keep_initializers_as_inputs=None)[source]#
Exports the model to the specified format. The format is inferred from the file extension of the output file.
Args:
- output (str): Output file name. The file extension must be .onnx, .pt, or .ts, and is used to select the export
path of the model.
- input_example (list or dict): Example input to the model’s forward function. This is used to
trace the model and export it to ONNX/TorchScript. If the model takes multiple inputs, then input_example should be a list of input examples. If the model takes named inputs, then input_example should be a dictionary of input examples.
- verbose (bool): If True, will print out a detailed description of the model’s export steps, along with
the internal trace logs of the export process.
- do_constant_folding (bool): If True, will execute constant folding optimization on the model’s graph
before exporting. This is ONNX specific.
- onnx_opset_version (int): The ONNX opset version to export the model to. If None, will use a reasonable
default version.
- check_trace (bool): If True, will verify that the model's output matches the output of the traced
model, up to some tolerance.
- dynamic_axes (dict): A dictionary mapping input and output names to their dynamic axes. This is
used to specify the dynamic axes of the model’s inputs and outputs. If the model takes multiple inputs, then dynamic_axes should be a list of dictionaries. If the model takes named inputs, then dynamic_axes should be a dictionary of dictionaries. If None, will use the dynamic axes of the input_example derived from the NeuralType of the input and output of the model.
- check_tolerance (float): The tolerance to use when checking the model’s output against the traced
model’s output. This is only used if check_trace is True. Note the high tolerance is used because the traced model is not guaranteed to be 100% accurate.
- export_modules_as_functions (bool): If True, will export the model’s submodules as functions. This is
ONNX specific.
- keep_initializers_as_inputs (bool): If True, will keep the model’s initializers as inputs in the onnx graph.
This is ONNX specific.
- Returns
A tuple of two outputs. Item 0 in the output is a list of outputs, the outputs of each subnet exported. Item 1 in the output is a list of string descriptions. The description of each subnet exported can be used for logging purposes.
- property disabled_deployment_input_names#
Implement this method to return a set of input names disabled for export
- property disabled_deployment_output_names#
Implement this method to return a set of output names disabled for export
- property supported_export_formats#
Implement this method to return a set of export formats supported. Default is all types.
- property input_names#
- property output_names#
- property input_types_for_export#
- property output_types_for_export#