nemo_automodel.components.datasets.llm.seq_cls
Module Contents
Classes
GLUE_MRPC: GLUE MRPC dataset (sentence pair classification).
API
class nemo_automodel.components.datasets.llm.seq_cls.GLUE_MRPC(
    tokenizer,
    *,
    split: str = 'train',
    num_samples_limit: Optional[int] = None,
    trust_remote_code: bool = True,
    max_length: Optional[int] = 256,
)
GLUE MRPC dataset (sentence pair classification).
Produces tokenized inputs with both sentence1 and sentence2 using the provided tokenizer.
Initialization
- __len__()
- __getitem__(idx)
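The page above does not show what an item looks like, so the following is a minimal, standard-library-only sketch of the general pattern a sentence pair dataset with this interface might follow. It is not the library's implementation. `ToyPairTokenizer`, `ToyPairDataset`, `TOY_ROWS`, and the output keys `input_ids`, `attention_mask`, and `labels` are all hypothetical names chosen for illustration, and the naive tail truncation stands in for whatever truncation strategy the real tokenizer applies.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical stand-in rows shaped like (sentence1, sentence2, label).
# These are made-up examples, not actual MRPC data.
TOY_ROWS: List[Tuple[str, str, int]] = [
    ("The cat sat on the mat .", "A cat was sitting on the mat .", 1),
    ("Stocks rose sharply on Monday .", "The weather was cold on Monday .", 0),
    ("He bought a new car .", "He purchased a new car .", 1),
]


class ToyPairTokenizer:
    """Hypothetical whitespace tokenizer that encodes two sentences as one sequence."""

    CLS, SEP = 1, 2  # reserved ids; regular tokens start at 3

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {}

    def _ids(self, text: str) -> List[int]:
        # Assign a fresh id the first time a token is seen.
        return [self.vocab.setdefault(tok, len(self.vocab) + 3)
                for tok in text.lower().split()]

    def encode_pair(self, first: str, second: str,
                    max_length: Optional[int]) -> List[int]:
        # Layout: [CLS] first [SEP] second [SEP], then naive tail truncation.
        ids = [self.CLS] + self._ids(first) + [self.SEP] + self._ids(second) + [self.SEP]
        return ids if max_length is None else ids[:max_length]


class ToyPairDataset:
    """Hypothetical map-style dataset mirroring the documented keyword arguments."""

    def __init__(
        self,
        tokenizer: ToyPairTokenizer,
        *,
        rows: Sequence[Tuple[str, str, int]] = TOY_ROWS,
        num_samples_limit: Optional[int] = None,
        max_length: Optional[int] = 256,
    ) -> None:
        # Optionally keep only the first num_samples_limit rows.
        selected = rows if num_samples_limit is None else rows[:num_samples_limit]
        self.examples = []
        for sentence1, sentence2, label in selected:
            input_ids = tokenizer.encode_pair(sentence1, sentence2, max_length)
            self.examples.append({
                "input_ids": input_ids,
                "attention_mask": [1] * len(input_ids),
                "labels": label,
            })

    def __len__(self) -> int:
        return len(self.examples)

    def __getitem__(self, idx: int) -> dict:
        return self.examples[idx]


dataset = ToyPairDataset(ToyPairTokenizer(), num_samples_limit=2, max_length=8)
first = dataset[0]
print(len(dataset), first["labels"], len(first["input_ids"]))
```

In this sketch both sentences end up in a single token sequence, `num_samples_limit` caps the number of rows before tokenization, and `max_length` bounds the length of every encoded item. Whether `GLUE_MRPC` returns the same keys or truncates in the same way is not stated in this page and should be checked against the library source.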