Common Collection
The common collection contains components that can be used across all collections.
Tokenizers
Wrapper of HuggingFace AutoTokenizer https://huggingface.co/transformers/model_doc/auto.html#autotokenizer.
- nemo.collections.common.tokenizers.AutoTokenizer.bos_id
- nemo.collections.common.tokenizers.AutoTokenizer.cls_id
- nemo.collections.common.tokenizers.AutoTokenizer.eos_id
- nemo.collections.common.tokenizers.AutoTokenizer.mask_id
- nemo.collections.common.tokenizers.AutoTokenizer.name
- nemo.collections.common.tokenizers.AutoTokenizer.pad_id
- nemo.collections.common.tokenizers.AutoTokenizer.sep_id
- nemo.collections.common.tokenizers.AutoTokenizer.unk_id
- nemo.collections.common.tokenizers.AutoTokenizer.vocab_size
SentencePiece tokenizer https://github.com/google/sentencepiece.
- nemo.collections.common.tokenizers.SentencePieceTokenizer.bos_id
- nemo.collections.common.tokenizers.SentencePieceTokenizer.cls_id
- nemo.collections.common.tokenizers.SentencePieceTokenizer.eos_id
- nemo.collections.common.tokenizers.SentencePieceTokenizer.name
- nemo.collections.common.tokenizers.SentencePieceTokenizer.pad_id
- nemo.collections.common.tokenizers.SentencePieceTokenizer.sep_id
Inherit this class to implement a new tokenizer.
- nemo.collections.common.tokenizers.TokenizerSpec.__init__
  Initialize self. See help(type(self)) for accurate signature.
- nemo.collections.common.tokenizers.TokenizerSpec.name
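To illustrate what inheriting from a tokenizer base class can look like, here is a minimal, self-contained sketch. It does not import NeMo: `TokenizerSpecSketch` is a hypothetical local stand-in for the base class, and the method names `text_to_ids` and `ids_to_text`, as well as the behaviour of `name`, are assumptions to check against the `TokenizerSpec` shipped with your installed version.

```python
from abc import ABC, abstractmethod


class TokenizerSpecSketch(ABC):
    """Hypothetical stand-in for the base class; NOT the NeMo class itself."""

    @abstractmethod
    def text_to_ids(self, text):
        """Convert a string into a list of integer token ids."""

    @abstractmethod
    def ids_to_text(self, ids):
        """Convert a list of integer token ids back into a string."""

    @property
    def name(self):
        # Assumption: the tokenizer name is simply the class name.
        return type(self).__name__


class WhitespaceTokenizer(TokenizerSpecSketch):
    """Toy tokenizer that splits on whitespace (illustrative only)."""

    def __init__(self, vocab):
        # Id 0 is reserved for unknown tokens; known tokens start at 1.
        self.unk_id = 0
        self.token_to_id = {tok: i + 1 for i, tok in enumerate(vocab)}
        self.id_to_token = {i: tok for tok, i in self.token_to_id.items()}
        self.id_to_token[self.unk_id] = "<unk>"

    @property
    def vocab_size(self):
        return len(self.id_to_token)

    def text_to_ids(self, text):
        return [self.token_to_id.get(tok, self.unk_id) for tok in text.split()]

    def ids_to_text(self, ids):
        return " ".join(self.id_to_token[i] for i in ids)
```

A real subclass would inherit from `TokenizerSpec` directly and implement every abstract method that class declares.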
Losses
Sums several losses into one.
param num_inputs: | number of input losses |
param weights: | a list of coefficients for merging losses |
- nemo.collections.common.losses.AggregatorLoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.AggregatorLoss.output_types
  Returns definitions of module output ports.
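The idea of summing several losses with optional coefficients can be sketched in plain Python. The function name `aggregate_losses` is hypothetical, and the sketch works on floats rather than tensors; it illustrates the arithmetic only and is not the module's actual implementation.

```python
def aggregate_losses(losses, weights=None):
    """Sum a list of loss values, optionally scaling each by a coefficient."""
    if weights is None:
        # Without weights, every loss contributes equally.
        return sum(losses)
    if len(weights) != len(losses):
        raise ValueError("weights must have one coefficient per input loss")
    return sum(w * loss for w, loss in zip(weights, losses))
```

For example, `aggregate_losses([1.0, 2.0], weights=[0.5, 0.25])` scales the first loss by 0.5 and the second by 0.25 before adding them.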
CrossEntropyLoss
- nemo.collections.common.losses.CrossEntropyLoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.CrossEntropyLoss.output_types
  Returns definitions of module output ports.
MSELoss
- nemo.collections.common.losses.MSELoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.MSELoss.output_types
  Returns definitions of module output ports.
Calculates cross-entropy loss with label smoothing for a batch of sequences.
SmoothedCrossEntropyLoss:
1) excludes padding tokens from the loss calculation
2) allows the use of label smoothing regularization
3) allows the loss to be calculated for only the desired number of last tokens
param label_smoothing: | label smoothing regularization coefficient |
type label_smoothing: | float |
param predict_last_k: | number of last tokens to calculate the loss for. 0 (default): calculate the loss on the entire sequence (e.g., NMT); 1: calculate the loss on the last token only (e.g., LM evaluation). Intermediate values control the trade-off between evaluation time (proportional to the number of batches) and evaluation performance (proportional to the number of context tokens) |
type predict_last_k: | int |
param pad_id: | padding id |
type pad_id: | int |
param eps: | a small number to avoid division by zero |
type eps: | float |
- nemo.collections.common.losses.SmoothedCrossEntropyLoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.SmoothedCrossEntropyLoss.output_types
  Returns definitions of module output ports.
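The three behaviours described for SmoothedCrossEntropyLoss (padding exclusion, label smoothing, and restricting the loss to the last k tokens) can be sketched in plain Python. This is an illustration under stated assumptions, not the module's code: the function name `smoothed_cross_entropy` is hypothetical, inputs are nested lists of log-probabilities instead of tensors, and the smoothing mixes the target log-probability with the mean log-probability over the vocabulary, which may differ in scaling from the library's formula.

```python
def smoothed_cross_entropy(log_probs, labels, pad_id=0, label_smoothing=0.0,
                           predict_last_k=0, eps=1e-6):
    """Average smoothed negative log-likelihood over non-padding tokens.

    log_probs: nested lists shaped [batch][seq_len][vocab_size]
    labels:    nested lists shaped [batch][seq_len]
    """
    total, count = 0.0, 0
    for seq_log_probs, seq_labels in zip(log_probs, labels):
        # A slice starting at -0 is the whole list, so predict_last_k=0
        # keeps every position; k > 0 keeps only the last k positions.
        positions = list(zip(seq_log_probs, seq_labels))[-predict_last_k:]
        for token_log_probs, label in positions:
            if label == pad_id:
                continue  # padding tokens do not contribute to the loss
            target_term = token_log_probs[label]
            uniform_term = sum(token_log_probs) / len(token_log_probs)
            total -= ((1.0 - label_smoothing) * target_term
                      + label_smoothing * uniform_term)
            count += 1
    # eps guards against division by zero when every token is padding.
    return total / (count + eps)
```

With `label_smoothing=0.0` this reduces to ordinary cross-entropy averaged over the non-padding tokens that remain after the last-k slice.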
Implements the start and end loss of a span, e.g. for question answering.
- nemo.collections.common.losses.SpanningLoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.SpanningLoss.output_types
  Returns definitions of module output ports.
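A span loss for question answering is commonly computed as one cross-entropy term over candidate start positions and another over candidate end positions. The sketch below shows that idea for a single example in plain Python. The names `cross_entropy` and `span_loss` are hypothetical, and averaging the two terms is an assumption borrowed from common extractive QA setups rather than something this page states.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax of the target index, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def span_loss(start_logits, end_logits, start_position, end_position):
    """Average the start-position and end-position cross-entropy terms."""
    start_loss = cross_entropy(start_logits, start_position)
    end_loss = cross_entropy(end_logits, end_position)
    return (start_loss + end_loss) / 2.0
```

Each logits list holds one score per token position in the passage; the loss falls as the scores at the true start and end positions rise relative to the others.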