Important
You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.
Losses#
- class nemo.collections.common.losses.AggregatorLoss(*args: Any, **kwargs: Any)#
Sums several losses into one.
- Parameters:
num_inputs – number of input losses
weights – a list of coefficients for weighting the losses
- __init__(num_inputs: int = 2, weights: List[float] | None = None)#
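The aggregation above amounts to a weighted sum of scalar losses. The sketch below illustrates that computation; the function name and signature are illustrative, not the NeMo API.

```python
# Illustrative sketch of what an aggregator loss computes: a weighted
# sum of several scalar losses. When no weights are given, the losses
# are summed with unit weights.
def aggregate_losses(losses, weights=None):
    if weights is None:
        weights = [1.0] * len(losses)
    if len(weights) != len(losses):
        raise ValueError("need exactly one weight per loss")
    return sum(w * loss for w, loss in zip(weights, losses))

# Example: two losses merged with coefficients 1.0 and 0.25.
total = aggregate_losses([0.5, 2.0], weights=[1.0, 0.25])
```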
- class nemo.collections.common.losses.CrossEntropyLoss(*args: Any, **kwargs: Any)#
- __init__(logits_ndim=2, weight=None, reduction='mean', ignore_index=-100)#
- Parameters:
logits_ndim (int) – number of dimensions (or rank) of the logits tensor
weight (list) – list of rescaling weights applied to each class
reduction (str) – type of the reduction over the batch
ignore_index (int) – target value that is ignored and does not contribute to the loss
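The sketch below shows the arithmetic these parameters control: a per-item negative log-likelihood over the class axis, with `ignore_index` positions skipped and the `reduction` applied over the remainder. It is an illustration of the computation, not the NeMo implementation.

```python
import math

# Illustrative cross-entropy over a list of logit rows (one row of
# [num_classes] scores per item). Positions whose label equals
# `ignore_index` are excluded from the loss, as in the parameter above.
def cross_entropy(logits, labels, ignore_index=-100, reduction="mean"):
    losses = []
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # ignored targets contribute nothing
        z = max(row)  # subtract max for numerical stability
        log_sum = z + math.log(sum(math.exp(x - z) for x in row))
        losses.append(log_sum - row[label])  # -log softmax(row)[label]
    if reduction == "sum":
        return sum(losses)
    return sum(losses) / len(losses)  # 'mean' over non-ignored items
```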
- class nemo.collections.common.losses.MSELoss(*args: Any, **kwargs: Any)#
- __init__(reduction: str = 'mean')#
- Parameters:
reduction – type of the reduction over the batch
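For contrast, a minimal sketch of mean-squared error with the same kind of `reduction` switch over the batch; names here are illustrative only.

```python
# Illustrative MSE: squared error per item, then the chosen reduction.
def mse(preds, targets, reduction="mean"):
    errs = [(p - t) ** 2 for p, t in zip(preds, targets)]
    if reduction == "sum":
        return sum(errs)
    if reduction == "none":
        return errs  # unreduced per-item errors
    return sum(errs) / len(errs)  # 'mean' over the batch
```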
- class nemo.collections.common.losses.SmoothedCrossEntropyLoss(*args: Any, **kwargs: Any)#
Calculates Cross-entropy loss with label smoothing for a batch of sequences.
SmoothedCrossEntropyLoss:
- excludes padding tokens from the loss calculation
- supports label smoothing regularization
- can compute the loss over only a desired number of last tokens
- per_token_reduction – if False, disables the per-token reduction
- Parameters:
label_smoothing (float) – label smoothing regularization coefficient
predict_last_k (int) – number of last tokens to calculate the loss for: 0 (default) computes the loss over the entire sequence (e.g., NMT); 1 computes it on the last token only (e.g., LM evaluation). Intermediate values trade off evaluation time (proportional to the number of batches) against evaluation performance (proportional to the number of context tokens).
pad_id (int) – padding id
eps (float) – small constant added to avoid division by zero
- __init__(pad_id: int | None = None, label_smoothing: float | None = 0.0, predict_last_k: int | None = 0, eps: float = 1e-06, per_token_reduction: bool = True)#
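The padding-exclusion and label-smoothing behaviors described above can be sketched as follows. This is an illustration of the math under the stated parameters, not the NeMo source; the function name and the expectation of per-token log-probabilities are assumptions for the example.

```python
import math

# Illustrative label-smoothed cross-entropy over a sequence of
# per-token log-probability rows. Padding tokens (t == pad_id) are
# excluded; the smoothed target mixes the true class with a uniform
# distribution over all classes, weighted by `label_smoothing`.
def smoothed_xent(log_probs, targets, pad_id, label_smoothing=0.0):
    num_classes = len(log_probs[0])
    total, count = 0.0, 0
    for lp, t in zip(log_probs, targets):
        if t == pad_id:
            continue  # exclude padding from the loss
        uniform = sum(lp) / num_classes  # uniform smoothing component
        total += -((1.0 - label_smoothing) * lp[t] + label_smoothing * uniform)
        count += 1
    return total / max(count, 1)  # mean over non-padding tokens
```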