NeMo Curator Documentation
Welcome to the NeMo Curator documentation.
Introduction to Curator
Learn what NeMo Curator is, how it works at a high level, and what its key features are.
Overview of NeMo Curator and its capabilities.
Discover the main features of NeMo Curator for data curation.
Explore the core concepts for each modality in NeMo Curator.
Data Curation Workflows
Workflow Modalities
Explore how you can use NeMo Curator across different content modalities.
Curate and prepare high-quality text datasets for LLM training.
Curate image-text datasets with embedding, classification, and deduplication.
Quickstart Guides
Install and run NeMo Curator for specific modalities.
Quickly set up and run text curation workflows.
Quickly set up and run image curation workflows.
Tutorial Highlights
Work through these tutorials to get started with the NeMo Curator library.
Learn the basics of text data processing with NeMo Curator.
Learn the basics of image data processing with NeMo Curator.