Common Collection
The common collection contains components shared across all NeMo collections.
Tokenizers
AutoTokenizer: a wrapper around the HuggingFace AutoTokenizer (https://huggingface.co/transformers/model_doc/auto.html#autotokenizer).
- nemo.collections.common.tokenizers.AutoTokenizer.bos_id
- nemo.collections.common.tokenizers.AutoTokenizer.cls_id
- nemo.collections.common.tokenizers.AutoTokenizer.eos_id
- nemo.collections.common.tokenizers.AutoTokenizer.mask_id
- nemo.collections.common.tokenizers.AutoTokenizer.name
- nemo.collections.common.tokenizers.AutoTokenizer.pad_id
- nemo.collections.common.tokenizers.AutoTokenizer.sep_id
- nemo.collections.common.tokenizers.AutoTokenizer.unk_id
- nemo.collections.common.tokenizers.AutoTokenizer.vocab_size
SentencePieceTokenizer: a wrapper around Google's SentencePiece tokenizer (https://github.com/google/sentencepiece).
- nemo.collections.common.tokenizers.SentencePieceTokenizer.bos_id
- nemo.collections.common.tokenizers.SentencePieceTokenizer.cls_id
- nemo.collections.common.tokenizers.SentencePieceTokenizer.eos_id
- nemo.collections.common.tokenizers.SentencePieceTokenizer.name
- nemo.collections.common.tokenizers.SentencePieceTokenizer.pad_id
- nemo.collections.common.tokenizers.SentencePieceTokenizer.sep_id
TokenizerSpec: inherit this class to implement a new tokenizer.
- nemo.collections.common.tokenizers.TokenizerSpec.__init__(self, /, *args, **kwargs)
  Initialize self. See help(type(self)) for accurate signature.
- nemo.collections.common.tokenizers.TokenizerSpec.name
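As a rough illustration of what subclassing the spec involves, here is a minimal standalone sketch of a whitespace tokenizer. The method names (text_to_tokens, text_to_ids, etc.) and the base class below are assumptions modeled on typical tokenizer interfaces; check the TokenizerSpec source in nemo.collections.common.tokenizers for the exact abstract methods. A local stand-in base class is used so the example runs without NeMo installed.

```python
class TokenizerSpec:
    # Stand-in for nemo.collections.common.tokenizers.TokenizerSpec,
    # so this sketch is self-contained.
    def text_to_tokens(self, text): raise NotImplementedError
    def tokens_to_text(self, tokens): raise NotImplementedError
    def text_to_ids(self, text): raise NotImplementedError
    def ids_to_text(self, ids): raise NotImplementedError


class WhitespaceTokenizer(TokenizerSpec):
    """Toy tokenizer: splits on whitespace, maps unknown words to <unk>."""

    def __init__(self, vocab):
        self.vocab = list(vocab) + ["<unk>"]
        self.tok2id = {t: i for i, t in enumerate(self.vocab)}
        self.unk_id = self.tok2id["<unk>"]

    @property
    def vocab_size(self):
        return len(self.vocab)

    def text_to_tokens(self, text):
        return text.split()

    def tokens_to_text(self, tokens):
        return " ".join(tokens)

    def text_to_ids(self, text):
        return [self.tok2id.get(t, self.unk_id) for t in self.text_to_tokens(text)]

    def ids_to_text(self, ids):
        return self.tokens_to_text([self.vocab[i] for i in ids])
```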
Losses
AggregatorLoss
Sums several losses into one.
- param num_inputs: number of input losses
- param weights: a list of coefficients for merging losses
- nemo.collections.common.losses.AggregatorLoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.AggregatorLoss.output_types
  Returns definitions of module output ports.
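The aggregation itself is a weighted sum. A pure-Python sketch of the arithmetic, with scalar loss values standing in for the torch tensors the real module operates on:

```python
def aggregate_losses(losses, weights=None):
    """Weighted sum of scalar losses; equal weights when none are given.

    Illustrative stand-in for what an aggregator loss computes --
    the NeMo module does the same arithmetic on torch tensors.
    """
    if weights is None:
        weights = [1.0] * len(losses)
    if len(weights) != len(losses):
        raise ValueError("need one weight per loss")
    return sum(w * l for w, l in zip(weights, losses))
```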
CrossEntropyLoss
- nemo.collections.common.losses.CrossEntropyLoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.CrossEntropyLoss.output_types
  Returns definitions of module output ports.
MSELoss
- nemo.collections.common.losses.MSELoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.MSELoss.output_types
  Returns definitions of module output ports.
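For reference, the quantity an MSE loss computes is just the mean of squared prediction errors; a pure-Python sketch (the real module operates on batched torch tensors):

```python
def mse_loss(preds, targets):
    """Mean squared error over paired predictions and targets --
    a pure-Python sketch of what an MSE loss module computes."""
    if len(preds) != len(targets):
        raise ValueError("length mismatch")
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
```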
SmoothedCrossEntropyLoss
Calculates cross-entropy loss with label smoothing for a batch of sequences.
SmoothedCrossEntropyLoss: 1) excludes padding tokens from the loss calculation; 2) supports label smoothing regularization; 3) can restrict the loss to the desired number of last tokens.
- param label_smoothing: label smoothing regularization coefficient
- type label_smoothing: float
- param predict_last_k: sets the number of last tokens on which the loss is calculated, for example: 0 (default) calculates the loss on the entire sequence (e.g., NMT); 1 calculates the loss on the last token only (e.g., LM evaluation). Intermediate values control the trade-off between evaluation time (proportional to the number of batches) and evaluation performance (proportional to the number of context tokens).
- type predict_last_k: int
- param pad_id: padding id
- type pad_id: int
- param eps: a small number to avoid division by zero
- type eps: float
- nemo.collections.common.losses.SmoothedCrossEntropyLoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.SmoothedCrossEntropyLoss.output_types
  Returns definitions of module output ports.
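A pure-Python sketch of the two core behaviors, padding exclusion and label smoothing. The smoothing variant below (weight 1 - eps on the target, the remaining eps spread uniformly over the other vocabulary entries) is one common formulation and is an assumption, not necessarily the exact formula NeMo uses; the real module works on batched torch tensors.

```python
import math


def smoothed_cross_entropy(log_probs, labels, pad_id, label_smoothing=0.0):
    """Label-smoothed cross entropy over a token sequence, skipping pads.

    log_probs: per-position lists of log-probabilities over the vocabulary.
    labels: target token ids, one per position.
    Illustrative sketch only -- see the lead-in for the assumptions made.
    """
    eps = label_smoothing
    total, count = 0.0, 0
    for lp, y in zip(log_probs, labels):
        if y == pad_id:
            continue  # exclude padding tokens from the loss
        vocab = len(lp)
        off = eps / (vocab - 1) if vocab > 1 else 0.0
        # target gets weight (1 - eps); the rest is spread uniformly
        nll = -(1.0 - eps) * lp[y] - off * sum(lp[i] for i in range(vocab) if i != y)
        total += nll
        count += 1
    return total / max(count, 1)
```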
SpanningLoss
Implements the start and end loss of a span, e.g., for Question Answering.
- nemo.collections.common.losses.SpanningLoss.input_types
  Returns definitions of module input ports.
- nemo.collections.common.losses.SpanningLoss.output_types
  Returns definitions of module output ports.
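The usual extractive-QA span loss averages the cross-entropy of the predicted start position with that of the predicted end position. A pure-Python sketch under that assumption (the real module operates on batched torch tensors):

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def spanning_loss(start_logits, end_logits, start_pos, end_pos):
    """Average of the cross-entropy losses for the start and end
    positions of an answer span -- an illustrative sketch of the
    standard extractive-QA span loss."""
    start_loss = -log_softmax(start_logits)[start_pos]
    end_loss = -log_softmax(end_logits)[end_pos]
    return (start_loss + end_loss) / 2.0
```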
Metrics
- class nemo.collections.common.metrics.Perplexity(*args: Any, **kwargs: Any)
  Bases: pytorch_lightning.metrics.
  This class computes the mean perplexity of distributions in the last dimension of inputs. It is a wrapper around the torch.distributions.Categorical.perplexity method. You have to provide either probs or logits to the update() method. The class computes perplexities for the distributions passed to update() in the probs or logits arguments and averages them. Results are reduced between workers via SUM operations. See PyTorch Lightning Metrics for metric usage instructions.
- Parameters
  - compute_on_step – Forward only calls update() and returns None if this is set to False. default: True
  - dist_sync_on_step – Synchronize metric state across processes at each forward() before returning the value at the step.
  - process_group – Specify the process group on which synchronization is called. default: None (which selects the entire world)
  - validate_args – If True, values of update() method parameters are checked: logits must not contain NaNs, and the last dim of probs has to be a valid probability distribution.
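The perplexity of a categorical distribution is exp of its entropy, which is what torch.distributions.Categorical.perplexity returns; averaging over a batch of distributions gives the mean perplexity this metric tracks. A pure-Python sketch of that computation (the metric itself works on torch tensors and handles distributed reduction):

```python
import math


def mean_perplexity(probs):
    """Mean perplexity of a batch of categorical distributions,
    mirroring exp(entropy) per distribution -- an illustrative
    pure-Python sketch of what the Perplexity metric averages."""
    perplexities = []
    for dist in probs:
        # entropy of a categorical distribution; 0 * log(0) is taken as 0
        entropy = -sum(p * math.log(p) for p in dist if p > 0)
        perplexities.append(math.exp(entropy))
    return sum(perplexities) / len(perplexities)
```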