Concepts

Learn about the core components and concepts introduced by NeMo Curator. The concepts are organized by modality.

Modality Concepts

Learn about working with specific modalities using NeMo Curator.

Image Curation Concepts

Explore key concepts for image data curation, including scalable loading, processing (embedding, classification, filtering, deduplication), and dataset export.

Text Curation Concepts

Learn about text data curation, covering data loading, processing (filtering, deduplication, classification), and synthetic data generation.
