NeMo NLP collection API#

Model Classes#

Modules#

class nemo.collections.nlp.modules.BertModule(*args: Any, **kwargs: Any)[source]#

Bases: nemo.core.classes.module.NeuralModule, nemo.core.classes.exportable.Exportable

input_example(max_batch=1, max_dim=256)[source]#

Generates input examples for tracing etc.

Returns

A tuple of input examples.

property input_types: Optional[Dict[str, nemo.core.neural_types.neural_type.NeuralType]]#

Define these to enable input neural type checks

property output_types: Optional[Dict[str, nemo.core.neural_types.neural_type.NeuralType]]#

Define these to enable output neural type checks

restore_weights(restore_path: str)[source]#

Restores module/model’s weights
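
A minimal usage sketch (not part of the NeMo reference itself): encoder is assumed to be any BertModule subclass, for example one returned by get_lm_model described further below. input_example() supplies tracing inputs for the export() method inherited from Exportable, and restore_weights() reloads a saved checkpoint; the export() call and its arguments reflect common NeMo usage and may differ between versions.

    def export_encoder(encoder, onnx_path: str, ckpt_path: str = None):
        # Reload previously saved weights into the module, if a checkpoint is given.
        if ckpt_path is not None:
            encoder.restore_weights(restore_path=ckpt_path)

        # input_example() returns a tuple of tensors suitable for tracing;
        # max_batch / max_dim bound the batch size and sequence length.
        example = encoder.input_example(max_batch=2, max_dim=128)

        # BertModule is Exportable, so the inherited export() can trace the
        # module with the generated example (onnx_path is a hypothetical output file).
        encoder.export(onnx_path, input_example=example)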

class nemo.collections.nlp.modules.AlbertEncoder(*args: Any, **kwargs: Any)[source]#

Bases: transformers.AlbertModel, nemo.collections.nlp.modules.common.bert_module.BertModule

Wraps the HuggingFace transformers implementation for easy use within NeMo.

forward(input_ids, attention_mask, token_type_ids)[source]#
class nemo.collections.nlp.modules.BertEncoder(*args: Any, **kwargs: Any)[source]#

Bases: transformers.BertModel, nemo.collections.nlp.modules.common.bert_module.BertModule

Wraps the HuggingFace transformers implementation for easy use within NeMo.

forward(input_ids, attention_mask=None, token_type_ids=None)[source]#
class nemo.collections.nlp.modules.DistilBertEncoder(*args: Any, **kwargs: Any)[source]#

Bases: transformers.DistilBertModel, nemo.collections.nlp.modules.common.bert_module.BertModule

Wraps the HuggingFace transformers implementation for easy use within NeMo.

forward(input_ids, attention_mask, token_type_ids=None)[source]#
class nemo.collections.nlp.modules.RobertaEncoder(*args: Any, **kwargs: Any)[source]#

Bases: transformers.RobertaModel, nemo.collections.nlp.modules.common.bert_module.BertModule

Wraps the HuggingFace transformers implementation for easy use within NeMo.

forward(input_ids, attention_mask, token_type_ids)[source]#
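
All four wrappers share the same forward signature, so a single hedged sketch covers them; encoder stands for any of the classes above, and the vocabulary size and tensor shapes are illustrative only.

    import torch

    def run_encoder(encoder, vocab_size: int = 30522, batch_size: int = 2, seq_len: int = 16):
        # Dummy inputs: token ids, a mask marking real tokens, and segment ids.
        input_ids = torch.randint(0, vocab_size, (batch_size, seq_len))
        attention_mask = torch.ones(batch_size, seq_len, dtype=torch.long)
        token_type_ids = torch.zeros(batch_size, seq_len, dtype=torch.long)
        # Returns token-level hidden states of shape [batch_size, seq_len, hidden_size].
        return encoder(
            input_ids=input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
        )
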
class nemo.collections.nlp.modules.SequenceClassifier(*args: Any, **kwargs: Any)[source]#

Bases: nemo.collections.nlp.modules.common.classifier.Classifier

forward(hidden_states)[source]#
property output_types: Optional[Dict[str, nemo.core.neural_types.neural_type.NeuralType]]#

Define these to enable output neural type checks
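
A hedged example of feeding encoder output into the classifier head; the constructor keywords below (hidden_size, num_classes, num_layers, dropout) follow typical NeMo usage and should be checked against the class signature for your version.

    import torch
    from nemo.collections.nlp.modules import SequenceClassifier

    classifier = SequenceClassifier(hidden_size=768, num_classes=3, num_layers=2, dropout=0.1)

    # hidden_states would normally come from a BertModule; random values here.
    hidden_states = torch.randn(4, 16, 768)           # [batch, seq_len, hidden_size]
    logits = classifier(hidden_states=hidden_states)   # per-sequence class scores, [batch, num_classes]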

class nemo.collections.nlp.modules.SequenceRegression(*args: Any, **kwargs: Any)[source]#

Bases: nemo.collections.nlp.modules.common.classifier.Classifier

Parameters
  • hidden_size – the hidden size of the mlp head on the top of the encoder

  • num_layers – number of the linear layers of the mlp head on the top of the encoder

  • activation – type of activations between layers of the mlp head

  • dropout – the dropout used for the mlp head

  • use_transformer_init – initializes the weights with the same approach used in Transformer

  • idx_conditioned_on – index of the token to use as the sequence representation for the regression task; defaults to the first token

forward(hidden_states: torch.Tensor) torch.Tensor[source]#

Forward pass through the module.

Parameters

hidden_states – hidden states for each token in a sequence, for example, BERT module output

property output_types: Optional[Dict[str, nemo.core.neural_types.neural_type.NeuralType]]#

Define these to enable output neural type checks
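
A hedged sketch mirroring the classifier example above; the constructor keywords follow the parameters listed for this class, and the forward pass is expected to return one regression value per sequence.

    import torch
    from nemo.collections.nlp.modules import SequenceRegression

    regressor = SequenceRegression(hidden_size=768, num_layers=2, dropout=0.1)

    hidden_states = torch.randn(4, 16, 768)            # e.g. BertModule output
    preds = regressor(hidden_states=hidden_states)      # expected shape: [batch]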

class nemo.collections.nlp.modules.SequenceTokenClassifier(*args: Any, **kwargs: Any)[source]#

Bases: nemo.collections.nlp.modules.common.classifier.Classifier

forward(hidden_states)[source]#
property output_types: Optional[Dict[str, nemo.core.neural_types.neural_type.NeuralType]]#

Define these to enable output neural type checks
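
A hedged sketch for this joint sequence-and-token head (as used for intent/slot models); the constructor arguments num_intents and num_slots and the two returned tensors are assumptions based on typical NeMo usage and should be verified against the class signature.

    import torch
    from nemo.collections.nlp.modules import SequenceTokenClassifier

    # num_intents / num_slots are assumed constructor arguments for this sketch.
    head = SequenceTokenClassifier(hidden_size=768, num_intents=10, num_slots=20)

    hidden_states = torch.randn(4, 16, 768)
    intent_logits, slot_logits = head(hidden_states=hidden_states)
    # intent_logits: one prediction per sequence, [batch, num_intents]
    # slot_logits:   one prediction per token,    [batch, seq_len, num_slots]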

nemo.collections.nlp.modules.get_lm_model(config_dict: Optional[dict] = None, config_file: Optional[str] = None, vocab_file: Optional[str] = None, trainer: Optional[pytorch_lightning.Trainer] = None, cfg: Optional[omegaconf.DictConfig] = None) nemo.collections.nlp.modules.common.bert_module.BertModule[source]#

Helper function to instantiate a language model encoder, either from scratch or from a pretrained model. If only a pretrained_model_name is provided, a pretrained model is returned. If a configuration is passed, whether as a file or a dictionary, the model is initialized with random weights.

Parameters
  • config_dict – the model configuration as a dictionary

  • config_file – path to the model configuration file

  • vocab_file – path to vocab_file to be used with Megatron-LM

  • trainer – an instance of a PyTorch Lightning trainer

  • cfg – a model configuration

Returns

A BertModule, either pretrained or randomly initialized depending on the arguments
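
A hedged sketch of loading a pretrained encoder; the exact keys expected under cfg (here language_model.pretrained_model_name) depend on the NeMo model configuration schema and are an assumption, not part of this reference.

    from omegaconf import OmegaConf
    from nemo.collections.nlp.modules import get_lm_model

    cfg = OmegaConf.create(
        {"language_model": {"pretrained_model_name": "bert-base-uncased"}, "tokenizer": None}
    )

    encoder = get_lm_model(cfg=cfg)      # pretrained weights are downloaded/loaded
    print(type(encoder).__name__)        # a BertModule subclass, e.g. BertEncoder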

nemo.collections.nlp.modules.get_pretrained_lm_models_list(include_external: bool = False) List[str][source]#

Returns the list of supported pretrained model names

Parameters

include_external – if True, includes all HuggingFace model names, not only the language models supported in NeMo.
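
A minimal listing sketch:

    from nemo.collections.nlp.modules import get_pretrained_lm_models_list

    # Only the language models NeMo supports directly; pass include_external=True
    # to list all HuggingFace model names as well.
    for name in get_pretrained_lm_models_list(include_external=False):
        print(name)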

nemo.collections.nlp.modules.common.megatron.get_megatron_lm_models_list() List[str][source]#

Returns the list of supported Megatron-LM models
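
And the Megatron-LM counterpart, imported here via the module path given above:

    from nemo.collections.nlp.modules.common.megatron import get_megatron_lm_models_list

    print(get_megatron_lm_models_list())   # names of the supported Megatron-LM models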

Datasets#