Important

You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.

Losses#

class nemo.collections.common.losses.AggregatorLoss(*args: Any, **kwargs: Any)#

Sums several losses into one.

Parameters:
  • num_inputs – number of input losses

  • weights – a list of coefficients, one per loss, applied when summing the losses

__init__(
num_inputs: int = 2,
weights: List[float] | None = None,
)#
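
The weighted sum performed by AggregatorLoss can be sketched in plain Python. The function name and behavior below are illustrative only, not the NeMo API; NeMo's class operates on tensors inside a module's forward pass:

```python
from typing import List, Optional

def aggregate_losses(losses: List[float], weights: Optional[List[float]] = None) -> float:
    """Sum several loss values, optionally scaling each by a coefficient.

    Mirrors the idea behind AggregatorLoss: with no weights the losses
    are summed as-is; otherwise each loss is multiplied by its weight.
    """
    if weights is None:
        return sum(losses)
    if len(weights) != len(losses):
        raise ValueError("need exactly one weight per loss")
    return sum(w * l for w, l in zip(weights, losses))

# Example: combine a main loss and a down-weighted auxiliary loss.
total = aggregate_losses([0.7, 0.3], weights=[1.0, 0.5])  # 0.7 + 0.15 = 0.85
```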
class nemo.collections.common.losses.CrossEntropyLoss(*args: Any, **kwargs: Any)#
__init__(
logits_ndim=2,
weight=None,
reduction='mean',
ignore_index=-100,
)#
Parameters:
  • logits_ndim (int) – number of dimensions (or rank) of the logits tensor

  • weight (list) – a list of rescaling weights, one per class

  • reduction (str) – type of the reduction over the batch
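
The parameters above follow the usual cross-entropy convention (as in torch.nn.CrossEntropyLoss). A minimal pure-Python sketch of class-weighted cross-entropy with "mean" reduction, assuming 2-D logits (the function name is illustrative, not the NeMo API):

```python
import math
from typing import List, Optional

def weighted_cross_entropy(
    logits: List[List[float]],
    labels: List[int],
    weight: Optional[List[float]] = None,
    reduction: str = "mean",
) -> float:
    """Cross-entropy over a batch of [num_examples, num_classes] logits.

    The per-example loss is -log softmax(logits)[label], optionally
    rescaled by a per-class weight; "mean" divides by the sum of the
    applied weights, "sum" just adds the terms.
    """
    per_example, applied_weights = [], []
    for row, label in zip(logits, labels):
        log_norm = math.log(sum(math.exp(x) for x in row))
        nll = log_norm - row[label]  # -log softmax(row)[label]
        w = weight[label] if weight is not None else 1.0
        per_example.append(w * nll)
        applied_weights.append(w)
    if reduction == "sum":
        return sum(per_example)
    return sum(per_example) / sum(applied_weights)  # "mean"

loss = weighted_cross_entropy([[2.0, 0.0], [0.0, 2.0]], [0, 1])
```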

class nemo.collections.common.losses.MSELoss(*args: Any, **kwargs: Any)#
__init__(reduction: str = 'mean')#
Parameters:

reduction – type of the reduction over the batch
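
The reduction modes behave as in torch.nn.MSELoss. A minimal sketch over flat lists of values (illustrative, not the NeMo API):

```python
def mse(preds, targets, reduction="mean"):
    """Mean squared error with the standard "mean"/"sum"/"none" reductions."""
    sq = [(p - t) ** 2 for p, t in zip(preds, targets)]
    if reduction == "none":
        return sq          # per-element squared errors
    if reduction == "sum":
        return sum(sq)
    return sum(sq) / len(sq)  # "mean"

result = mse([1.0, 2.0], [0.0, 4.0])  # squared errors 1 and 4 -> mean 2.5
```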

class nemo.collections.common.losses.SmoothedCrossEntropyLoss(*args: Any, **kwargs: Any)#

Calculates Cross-entropy loss with label smoothing for a batch of sequences.

SmoothedCrossEntropyLoss:
  1) excludes padding tokens from the loss calculation
  2) supports label smoothing regularization
  3) can compute the loss over only a desired number of last tokens
  4) per_token_reduction – if False, disables per-token reduction

Parameters:
  • label_smoothing (float) – label smoothing regularization coefficient

  • predict_last_k (int) – the number of last tokens to calculate the loss for: 0 (default) computes the loss on the entire sequence (e.g., NMT); 1 computes the loss on the last token only (e.g., LM evaluation). Intermediate values trade off evaluation time (proportional to the number of batches) against evaluation performance (proportional to the number of context tokens)

  • pad_id (int) – padding id

  • eps (float) – a small epsilon to avoid division by zero

__init__(
pad_id: int | None = None,
label_smoothing: float | None = 0.0,
predict_last_k: int | None = 0,
eps: float = 1e-06,
per_token_reduction: bool = True,
)#
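
A common formulation of label-smoothed cross-entropy with padding exclusion can be sketched as follows. This is an illustrative pure-Python reading of the parameters above, not NeMo's tensor implementation; it uses the standard mixture of the one-hot target with a uniform distribution over the vocabulary:

```python
import math
from typing import List, Optional

def smoothed_cross_entropy(
    log_probs: List[List[float]],   # per-token log-probabilities, [seq_len, vocab]
    labels: List[int],
    pad_id: Optional[int] = None,
    label_smoothing: float = 0.0,
    eps: float = 1e-6,
) -> float:
    """Label-smoothed NLL over a token sequence, skipping padding tokens.

    With smoothing coefficient a and vocabulary size V, the target
    distribution mixes the one-hot label with a uniform distribution:
    loss = -(1 - a) * log p[label] - (a / V) * sum_v log p[v].
    eps guards against dividing by zero when every token is padding.
    """
    total, n_tokens = 0.0, 0
    for row, label in zip(log_probs, labels):
        if pad_id is not None and label == pad_id:
            continue  # padding tokens are excluded from the loss
        uniform_term = sum(row) / len(row)  # (1/V) * sum_v log p[v]
        total += -(1.0 - label_smoothing) * row[label] - label_smoothing * uniform_term
        n_tokens += 1
    return total / (n_tokens + eps)
```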
class nemo.collections.common.losses.SpanningLoss(*args: Any, **kwargs: Any)#

Implements the start-position and end-position loss of a span, e.g., for Question Answering.

__init__()#
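
For extractive QA, the usual span loss averages the cross-entropy of the predicted start position and the predicted end position. A minimal sketch of that formulation (illustrative only; the function names are not the NeMo API):

```python
import math
from typing import List

def log_softmax(logits: List[float]) -> List[float]:
    """Numerically direct log-softmax over a 1-D list of logits."""
    norm = math.log(sum(math.exp(x) for x in logits))
    return [x - norm for x in logits]

def span_loss(
    start_logits: List[float],
    end_logits: List[float],
    start_pos: int,
    end_pos: int,
) -> float:
    """Average of the start- and end-position cross-entropy terms,
    the standard objective for span prediction in QA."""
    start_nll = -log_softmax(start_logits)[start_pos]
    end_nll = -log_softmax(end_logits)[end_pos]
    return (start_nll + end_nll) / 2.0
```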