Important
NeMo 2.0 is an experimental feature and currently released in the dev container only: nvcr.io/nvidia/nemo:dev. Please refer to the Migration Guide for information on getting started.
Losses
- class nemo.collections.common.losses.AggregatorLoss(*args: Any, **kwargs: Any)
Sums several losses into one.
- Parameters
num_inputs – number of input losses
weights – a list of coefficients for weighting each loss in the sum
- __init__(num_inputs: int = 2, weights: Optional[List[float]] = None)
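The aggregation semantics can be sketched in plain Python (an illustrative sketch only; the actual NeMo class operates on PyTorch tensors inside a module):

```python
def aggregate_losses(losses, weights=None):
    """Sum scalar losses, optionally scaling each by a coefficient.

    Mirrors AggregatorLoss semantics: with no weights the losses are
    summed as-is; otherwise each loss is multiplied by its weight.
    """
    if weights is None:
        return sum(losses)
    assert len(weights) == len(losses), "one weight per input loss"
    return sum(w * l for w, l in zip(weights, losses))
```

With `weights=None`, `aggregate_losses([1.0, 2.0])` is simply `3.0`; with `weights=[0.5, 2.0]` it becomes `0.5 * 1.0 + 2.0 * 2.0 = 4.5`.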
- class nemo.collections.common.losses.CrossEntropyLoss(*args: Any, **kwargs: Any)
- __init__(logits_ndim=2, weight=None, reduction='mean', ignore_index=-100)
- Parameters
logits_ndim (int) – number of dimensions (or rank) of the logits tensor
weight (list) – list of rescaling weights, one per class
reduction (str) – type of the reduction over the batch
ignore_index (int) – target value that is excluded from the loss calculation
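To make the parameters concrete, here is a minimal pure-Python sketch of what the loss computes for 2-D logits (illustrative only; the NeMo class operates on PyTorch tensors and supports higher-rank logits via logits_ndim):

```python
import math

def cross_entropy(logits, labels, weight=None, reduction="mean", ignore_index=-100):
    """Softmax cross-entropy over rows of 2-D logits (semantic sketch)."""
    losses, coeffs = [], []
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # targets equal to ignore_index do not contribute
        m = max(row)  # shift for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        w = weight[label] if weight is not None else 1.0
        losses.append(w * (log_z - row[label]))
        coeffs.append(w)
    if reduction == "sum":
        return sum(losses)
    # 'mean' averages by the total class weight, matching PyTorch's convention
    return sum(losses) / sum(coeffs)
```

For uniform logits over two classes the loss is log 2 ≈ 0.693, and rows labeled with ignore_index are skipped entirely.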
- class nemo.collections.common.losses.MSELoss(*args: Any, **kwargs: Any)
- __init__(reduction: str = 'mean')
- Parameters
reduction – type of the reduction over the batch
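The effect of the reduction argument can be shown with a small pure-Python sketch (the NeMo class itself works on PyTorch tensors):

```python
def mse_loss(preds, targets, reduction="mean"):
    """Squared error per element, reduced over the batch ('mean' or 'sum')."""
    errs = [(p - t) ** 2 for p, t in zip(preds, targets)]
    return sum(errs) if reduction == "sum" else sum(errs) / len(errs)
```

For preds [1.0, 3.0] against targets [0.0, 1.0], the squared errors are [1.0, 4.0]; 'mean' yields 2.5 while 'sum' yields 5.0.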
- class nemo.collections.common.losses.SmoothedCrossEntropyLoss(*args: Any, **kwargs: Any)
Calculates Cross-entropy loss with label smoothing for a batch of sequences.
SmoothedCrossEntropyLoss: 1) excludes padding tokens from the loss calculation; 2) supports label smoothing regularization; 3) can restrict the loss to the desired number of final tokens; 4) per_token_reduction – if False, disables per-token reduction.
- Parameters
label_smoothing (float) – label smoothing regularization coefficient
predict_last_k (int) – number of final tokens to calculate the loss for; 0 (default) computes the loss over the entire sequence (e.g., NMT), 1 computes it on the last token only (e.g., LM evaluation). Intermediate values trade off evaluation time (proportional to the number of batches) against evaluation performance (proportional to the number of context tokens)
pad_id (int) – padding id
eps (float) – small constant added to avoid division by zero
- __init__(pad_id: Optional[int] = None, label_smoothing: Optional[float] = 0.0, predict_last_k: Optional[int] = 0, eps: float = 1e-06, per_token_reduction: bool = True)
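The core of the computation, padding exclusion plus label smoothing, can be sketched in pure Python. This uses the standard label-smoothing target (probability 1 - smoothing on the true class, the remainder spread uniformly over the vocabulary); it is a sketch of the idea, not NeMo's exact implementation, and omits predict_last_k and per_token_reduction for brevity:

```python
import math

def smoothed_cross_entropy(log_probs, labels, label_smoothing=0.0, pad_id=None):
    """Label-smoothed NLL averaged over non-padding positions (sketch).

    log_probs: per-position lists of log-probabilities over the vocabulary.
    labels: per-position target token ids.
    """
    total, count = 0.0, 0
    vocab = len(log_probs[0])
    for lp, label in zip(log_probs, labels):
        if pad_id is not None and label == pad_id:
            continue  # padding tokens are excluded from the loss
        # mix the one-hot target with a uniform distribution
        target = [label_smoothing / vocab] * vocab
        target[label] += 1.0 - label_smoothing
        total += -sum(t * p for t, p in zip(target, lp))
        count += 1
    return total / count
```

With label_smoothing=0.0 this reduces to the ordinary negative log-likelihood of the true token; increasing label_smoothing penalizes over-confident predictions by keeping some target mass on the other classes.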