stages.text.models.utils#

Module Contents#

Functions#

clip_tokens

Clip the tokens to the smallest size possible.

format_name_with_suffix

Format a model name with a suffix.

Data#

API#

stages.text.models.utils.ATTENTION_MASK_COLUMN#

‘attention_mask’

stages.text.models.utils.INPUT_ID_COLUMN#

‘input_ids’

stages.text.models.utils.SEQ_ORDER_COLUMN#

‘_curator_seq_order’

stages.text.models.utils.TOKEN_LENGTH_COLUMN#

‘_curator_token_length’
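
The constants above name the columns of a tokenized batch. The sketch below is illustrative only, assuming the conventional Hugging Face tokenizer layout for `input_ids`/`attention_mask`; the two `_curator_*` columns are taken to carry bookkeeping (original row order and per-row token counts) alongside the model inputs.

```python
import torch

# Column-name constants, as defined by this module.
ATTENTION_MASK_COLUMN = "attention_mask"
INPUT_ID_COLUMN = "input_ids"
SEQ_ORDER_COLUMN = "_curator_seq_order"
TOKEN_LENGTH_COLUMN = "_curator_token_length"

# A single-row tokenized batch keyed by the constants: three real
# tokens followed by one pad position (mask 0), with the curator
# columns recording the row's original position and token count.
batch = {
    INPUT_ID_COLUMN: torch.tensor([[101, 2054, 102, 0]]),
    ATTENTION_MASK_COLUMN: torch.tensor([[1, 1, 1, 0]]),
    SEQ_ORDER_COLUMN: torch.tensor([0]),
    TOKEN_LENGTH_COLUMN: torch.tensor([3]),
}
```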

stages.text.models.utils.clip_tokens(
token_o: dict,
padding_side: Literal['left', 'right'] = 'right',
) → dict[str, torch.Tensor]#

Clip the tokens to the smallest size possible.

Args:
token_o: The dictionary containing the input tokens (input_ids, attention_mask).
padding_side: The side on which the input tokens are padded. Defaults to "right".

Returns: The clipped tokens (input_ids, attention_mask).
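
A minimal sketch of the clipping described above, assuming the batch tensors follow the standard `input_ids`/`attention_mask` layout and that the attention mask marks real tokens with 1. The body is an illustration of the technique, not the module's actual implementation:

```python
import torch


def clip_tokens(
    token_o: dict, padding_side: str = "right"
) -> dict[str, torch.Tensor]:
    """Trim batch tensors to the longest real (non-padding) sequence."""
    # The per-row token count is the number of 1s in the attention mask;
    # the batch only needs to be as wide as the largest of these.
    mask = token_o["attention_mask"]
    max_len = int(mask.sum(dim=1).max().item())
    if padding_side == "right":
        # Padding sits at the end, so keep the leading positions.
        return {k: v[:, :max_len] for k, v in token_o.items()}
    # Padding sits at the start, so keep the trailing positions.
    return {k: v[:, -max_len:] for k, v in token_o.items()}
```

With right padding and a batch whose longest row has three real tokens, every tensor is clipped from width 4 to width 3.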

stages.text.models.utils.format_name_with_suffix(
model_identifier: str,
suffix: str = '_classifier',
) → str#
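
The signature suggests this helper derives a short name from a model identifier and appends a suffix. The behavior sketched below is an assumption (taking the final path component of a Hub-style `org/name` identifier and normalizing it), not the documented implementation:

```python
def format_name_with_suffix(
    model_identifier: str, suffix: str = "_classifier"
) -> str:
    """Build a suffixed name from a model identifier (illustrative)."""
    # Keep only the final path component of a Hub-style identifier,
    # then normalize it into a lowercase snake_case base name.
    base = model_identifier.split("/")[-1]
    return base.lower().replace("-", "_") + suffix
```

Under these assumptions, an identifier like `'org/My-Model'` would yield `'my_model_classifier'`.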