- Model Guide
- Training NeMo Framework Models
- Training with Predefined Configurations
- Using AutoConfigurator to Find the Optimal Configuration
- Training with Custom Configurations
- Bring Your Own Dataset
- Model Training
- Resuming Training with a Different Number of Nodes
- Checkpoint Conversion
- Generalized PEFT Framework
- Model Fine-Tuning
- Model Prompt Learning
- Model Adapter Learning and IA3 Learning
- Model Evaluation
- Exporting the NeMo Models to TensorRT-LLM
- NeMo Data Curator
- Coverage
- General Usage
- Module-specific documentation
- Downloading and extracting text
- Document filtering
- Text cleaning and language separation
- Exact and fuzzy deduplication
- Classifier and heuristic-based quality filtering
- 1 Downstream task decontamination (task deduplication)
- prepare_task_data
- 3 Find the matching task N-grams within the training documents
- 4 Remove matching N-grams above a user-defined threshold
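
The decontamination steps listed above (collect N-grams from the task data, find them in the training documents, then filter against a threshold) can be illustrated with a minimal sketch in plain Python. It is not the NeMo Data Curator implementation: every function name and parameter here (`word_ngrams`, `build_task_ngrams`, `find_matches`, `decontaminate`, `n`, `max_matches`) is hypothetical, tokenization is a simple whitespace split, and the threshold is assumed to mean the maximum number of task N-gram hits a document may contain before it is dropped. The actual tool may apply its threshold differently.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the list of lowercase word n-grams in `text` (whitespace tokens)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_task_ngrams(task_texts, n):
    """Collect the set of n-grams that occur in any downstream task example."""
    ngrams = set()
    for text in task_texts:
        ngrams.update(word_ngrams(text, n))
    return ngrams


def find_matches(document, task_ngrams, n):
    """Count how often each task n-gram occurs in one training document."""
    return Counter(g for g in word_ngrams(document, n) if g in task_ngrams)


def decontaminate(documents, task_texts, n=3, max_matches=0):
    """Keep documents with at most `max_matches` task n-gram hits.

    Assumption: the user-defined threshold is a per-document hit limit.
    """
    task_ngrams = build_task_ngrams(task_texts, n)
    kept = []
    for doc in documents:
        hits = sum(find_matches(doc, task_ngrams, n).values())
        if hits <= max_matches:
            kept.append(doc)
    return kept


if __name__ == "__main__":
    tasks = ["what is the capital of france"]
    docs = [
        "the weather is nice today",
        "quiz: what is the capital of france answer paris",
    ]
    # The second document shares several 3-grams with the task and is dropped.
    print(decontaminate(docs, tasks, n=3, max_matches=0))
```

Keeping the task N-grams in a set makes each lookup constant time, so the cost of scanning a document grows with its length rather than with the number of task examples.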