nemo_curator.stages.math.classifiers.finemath
Module Contents
Classes
Data
API
Bases: ProcessingStage[DocumentBatch, DocumentBatch]
Pre-tokenization stage that center-crops the text field to a fixed number of characters, keeping the middle of each document as context.
center_crop_chars
name
staticmethod
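To illustrate the center-cropping described above, here is a minimal sketch in plain Python. The helper name `center_crop` is hypothetical and is not part of this module; the handling of short texts and negative sizes is an assumption, since this page does not document those cases.

```python
def center_crop(text: str, center_crop_chars: int) -> str:
    """Return the middle `center_crop_chars` characters of `text`.

    Hypothetical helper for illustration only; the real stage may treat
    edge cases differently.
    """
    if center_crop_chars < 0:
        raise ValueError("center_crop_chars must be non-negative")
    # Assumption: texts already within the limit pass through unchanged.
    if len(text) <= center_crop_chars:
        return text
    # Drop an (almost) equal number of characters from both ends.
    start = (len(text) - center_crop_chars) // 2
    return text[start:start + center_crop_chars]
```

For example, `center_crop("abcdefghij", 4)` keeps `"defg"`, dropping three characters from each end.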
Dataclass
Bases: CompositeStage[DocumentBatch, DocumentBatch]
FineMath composite stage: runs TokenizerStage, then FineMathModelStage.
autocast
cache_dir
center_crop_chars
float_score_column
int_score_column
max_chars
max_seq_length
model_inference_batch_size
sort_by_length
text_field
Bases: ModelStage
Hugging Face sequence classification model stage for FineMath.
Output columns:
- finemath_scores (float list)
- finemath_int_scores (int list)
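The relationship between the two output columns is not spelled out on this page. The sketch below shows one plausible mapping, assuming the integer score is the float score clamped to the range 0 to 5 and then rounded, which is the convention used in the published FineMath classifier examples. The function names `to_int_score` and `score_columns` are hypothetical and do not come from this module.

```python
def to_int_score(score: float) -> int:
    """Clamp a raw score to [0, 5] and round to the nearest integer.

    Assumption: mirrors the clamp-then-round convention from the FineMath
    classifier examples; the stage itself may differ.
    """
    return int(round(max(0.0, min(score, 5.0))))


def score_columns(raw_scores: list[float]) -> dict[str, list]:
    """Build the two output columns from a list of raw model scores."""
    return {
        "finemath_scores": [float(s) for s in raw_scores],
        "finemath_int_scores": [to_int_score(s) for s in raw_scores],
    }
```

Under this assumption, raw scores below 0 map to 0 and raw scores above 5 map to 5, so the integer column always lies in the range 0 to 5.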
staticmethod