stages.text.models.utils#

Module Contents#

Functions#

clip_tokens

Clip the tokens to the smallest size possible.

format_name_with_suffix

Format a model name with a suffix.

Data#

API#

stages.text.models.utils.ATTENTION_MASK_COLUMN#

‘attention_mask’

stages.text.models.utils.INPUT_ID_COLUMN#

‘input_ids’

stages.text.models.utils.SEQ_ORDER_COLUMN#

‘_curator_seq_order’

stages.text.models.utils.TOKEN_LENGTH_COLUMN#

‘_curator_token_length’
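
The constants above name the columns of a tokenized batch. The sketch below is illustrative only, assuming the conventional Hugging Face tokenizer layout for `input_ids`/`attention_mask`; the two `_curator_*` columns are taken to carry bookkeeping (original row order and per-row token counts) alongside the model inputs.

```python
import torch

# Column-name constants, as defined by this module.
ATTENTION_MASK_COLUMN = "attention_mask"
INPUT_ID_COLUMN = "input_ids"
SEQ_ORDER_COLUMN = "_curator_seq_order"
TOKEN_LENGTH_COLUMN = "_curator_token_length"

# A single-row tokenized batch keyed by the constants: three real
# tokens followed by one pad position (mask 0), with the curator
# columns recording the row's original position and token count.
batch = {
    INPUT_ID_COLUMN: torch.tensor([[101, 2054, 102, 0]]),
    ATTENTION_MASK_COLUMN: torch.tensor([[1, 1, 1, 0]]),
    SEQ_ORDER_COLUMN: torch.tensor([0]),
    TOKEN_LENGTH_COLUMN: torch.tensor([3]),
}
```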

stages.text.models.utils.clip_tokens(
token_o: dict,
padding_side: Literal['left', 'right'] = 'right',
) → dict[str, torch.Tensor]#

Clip the tokens to the smallest size possible.

Args:
token_o: The dictionary containing the input tokens (input_ids, attention_mask).
padding_side: The side on which the input tokens are padded. Defaults to "right".

Returns: The clipped tokens (input_ids, attention_mask).
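
A minimal sketch of the clipping described above, assuming the batch tensors follow the standard `input_ids`/`attention_mask` layout and that the attention mask marks real tokens with 1. The body is an illustration of the technique, not the module's actual implementation:

```python
import torch


def clip_tokens(
    token_o: dict, padding_side: str = "right"
) -> dict[str, torch.Tensor]:
    """Trim batch tensors to the longest real (non-padding) sequence."""
    # The per-row token count is the number of 1s in the attention mask;
    # the batch only needs to be as wide as the largest of these.
    mask = token_o["attention_mask"]
    max_len = int(mask.sum(dim=1).max().item())
    if padding_side == "right":
        # Padding sits at the end, so keep the leading positions.
        return {k: v[:, :max_len] for k, v in token_o.items()}
    # Padding sits at the start, so keep the trailing positions.
    return {k: v[:, -max_len:] for k, v in token_o.items()}
```

With right padding and a batch whose longest row has three real tokens, every tensor is clipped from width 4 to width 3.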

stages.text.models.utils.format_name_with_suffix(
model_identifier: str,
suffix: str = '_classifier',
) → str#
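
The signature suggests this helper derives a short name from a model identifier and appends a suffix. The behavior sketched below is an assumption (taking the final path component of a Hub-style `org/name` identifier and normalizing it), not the documented implementation:

```python
def format_name_with_suffix(
    model_identifier: str, suffix: str = "_classifier"
) -> str:
    """Build a suffixed name from a model identifier (illustrative)."""
    # Keep only the final path component of a Hub-style identifier,
    # then normalize it into a lowercase snake_case base name.
    base = model_identifier.split("/")[-1]
    return base.lower().replace("-", "_") + suffix
```

Under these assumptions, an identifier like `'org/My-Model'` would yield `'my_model_classifier'`.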