nemo_automodel.components.loss.chunked_ce#

Module Contents#

Classes#

ChunkedCrossEntropy

Chunked cross-entropy loss.

Functions#

compute_cross_entropy

Computes the cross-entropy loss between logits and targets.

Data#

_compiled_compute_cross_entropy

API#

nemo_automodel.components.loss.chunked_ce._compiled_compute_cross_entropy#

None

nemo_automodel.components.loss.chunked_ce.compute_cross_entropy(
logits: torch.Tensor,
targets: torch.Tensor,
ignore_index=-100,
)[source]#

Computes the cross-entropy loss between logits and targets.

Parameters:
  • logits (torch.Tensor) – Model predictions of shape (sequence_length, num_classes).

  • targets (torch.Tensor) – Ground-truth labels of shape (sequence_length,).

  • ignore_index (int, optional) – Target value that is ignored when computing the loss. Defaults to -100.

Returns:

The sum of cross-entropy losses over the sequence.

Return type:

torch.Tensor
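
A minimal usage sketch; the tensor shapes and the padded positions below are illustrative assumptions rather than values taken from the library:

```python
import torch

from nemo_automodel.components.loss.chunked_ce import compute_cross_entropy

# Illustrative shapes: a sequence of 8 tokens over a 32-entry vocabulary.
seq_len, num_classes = 8, 32
logits = torch.randn(seq_len, num_classes)
targets = torch.randint(0, num_classes, (seq_len,))

# Positions labelled with ignore_index (-100), e.g. padding, are excluded
# from the loss.
targets[-2:] = -100

loss = compute_cross_entropy(logits, targets)  # 0-dim tensor: summed loss
```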

class nemo_automodel.components.loss.chunked_ce.ChunkedCrossEntropy(
chunk_len: int = 32,
compile: bool = True,
ignore_index: int = -100,
)[source]#

Chunked cross-entropy loss.

Initialization

Parameters:
  • chunk_len (int, optional) – The size of each chunk. The sequence will be split along the first dimension into chunks of this length (see the sketch after this parameter list). Defaults to 32.

  • compile (bool, optional) – If True, uses the compiled compute_cross_entropy function. Defaults to True.

  • ignore_index (int, optional) – Target value that is ignored when computing the loss. Defaults to -100.
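
The chunking strategy can be pictured with the sketch below. This is a conceptual illustration of splitting into chunk_len-sized pieces and accumulating a summed loss, not the module's internal implementation:

```python
import torch
import torch.nn.functional as F

# Conceptual illustration only: flatten batch and sequence, split along the
# first dimension into chunks of chunk_len, and accumulate the summed loss
# chunk by chunk so the cross-entropy intermediates stay small.
chunk_len, vocab_size = 32, 1024
flat_logits = torch.randn(100, vocab_size)          # (batch * seq_len, vocab_size)
flat_labels = torch.randint(0, vocab_size, (100,))  # (batch * seq_len,)

total = torch.zeros(())
for lg, lb in zip(flat_logits.split(chunk_len), flat_labels.split(chunk_len)):
    total = total + F.cross_entropy(lg, lb, reduction="sum", ignore_index=-100)
```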

__call__(
logits: torch.Tensor,
labels: torch.Tensor,
mask: Optional[torch.Tensor] = None,
) → torch.Tensor[source]#

Computes the cross-entropy loss in fixed-size chunks so that long sequences can be processed without evaluating the loss over the entire flattened sequence at once.

Parameters:
  • logits (torch.Tensor) – Model output logits of shape [batch_size, seq_len, vocab_size].

  • labels (torch.Tensor) – Ground-truth labels of shape [batch_size, seq_len].

  • mask (torch.Tensor, optional) – Boolean mask indicating valid positions (1) and positions to ignore (0). Defaults to None.

Returns:

The sum of cross-entropy losses over the sequence.

Return type:

torch.Tensor
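
A hedged end-to-end sketch; the shapes are illustrative, and compile=False is chosen only to keep the example light-weight (the default compiles the inner compute_cross_entropy):

```python
import torch

from nemo_automodel.components.loss.chunked_ce import ChunkedCrossEntropy

loss_fn = ChunkedCrossEntropy(chunk_len=32, compile=False)

# Illustrative shapes for a small batch of model outputs.
batch_size, seq_len, vocab_size = 2, 128, 1024
logits = torch.randn(batch_size, seq_len, vocab_size)
labels = torch.randint(0, vocab_size, (batch_size, seq_len))

# Optional boolean mask: True/1 positions contribute to the loss, False/0
# positions (e.g. padding) are ignored.
mask = torch.ones(batch_size, seq_len, dtype=torch.bool)
mask[:, -16:] = False

loss = loss_fn(logits, labels, mask=mask)  # 0-dim tensor: summed loss over valid positions
```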