nemo_curator.stages.text.models.utils
Module Contents
Functions
Data
API
Clip the tokens to the smallest size possible, that is, trim padding so the sequences are no longer than they need to be.
Parameters:
token_o
The dictionary containing the input tokens (input_ids, attention_mask).
padding_side
The side on which the input tokens are padded. Defaults to “right”.
Returns: dict[str, torch.Tensor]
The clipped tokens (input_ids, attention_mask).
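The clipping described above can be illustrated with a minimal sketch. It is not the library's implementation: the name `clip_tokens_sketch` is hypothetical, nested Python lists stand in for `torch.Tensor` so the example has no dependencies, and it assumes that "smallest size possible" means the largest number of attended tokens in any row of `attention_mask`.

```python
def clip_tokens_sketch(token_o, padding_side="right"):
    """Hypothetical stand-in for the documented function, using nested lists.

    Assumption: the target length is the largest count of 1s found in any
    row of attention_mask, and padding is trimmed from `padding_side`.
    """
    if padding_side not in ("left", "right"):
        raise ValueError("padding_side must be 'left' or 'right'")

    mask = token_o["attention_mask"]
    # Length of the longest real (non-padding) sequence in the batch.
    clip_len = max((sum(row) for row in mask), default=0)

    def clip(row):
        if padding_side == "right":
            # Padding sits at the end, so keep the first clip_len positions.
            return row[:clip_len]
        # Padding sits at the start, so keep the last clip_len positions.
        return row[len(row) - clip_len:]

    return {
        "input_ids": [clip(row) for row in token_o["input_ids"]],
        "attention_mask": [clip(row) for row in mask],
    }


# Right-padded batch: the longest sequence has 3 real tokens.
batch_right = {
    "input_ids": [[101, 7, 8, 0, 0, 0], [101, 9, 0, 0, 0, 0]],
    "attention_mask": [[1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0]],
}
clipped_right = clip_tokens_sketch(batch_right)

# Left-padded batch with the same content.
batch_left = {
    "input_ids": [[0, 0, 0, 101, 7, 8], [0, 0, 0, 0, 101, 9]],
    "attention_mask": [[0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 1, 1]],
}
clipped_left = clip_tokens_sketch(batch_left, padding_side="left")
```

Shorter rows keep some padding after clipping, because every row in the batch must end up the same length. With tensors, the equivalent operation would be a slice along the sequence dimension.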