About Getting Started

Before You Start

Welcome to NeMo Curator! This framework streamlines the curation and preprocessing of large-scale datasets for training generative AI models across the text, image, audio, and video modalities.

Who are these quickstarts for?

  • AI/ML engineers and researchers who want to quickly test NeMo Curator’s capabilities
  • Users looking to run an initial curation pipeline with minimal setup
  • Individuals exploring NeMo Curator prior to a full production deployment

What you’ll find here: Each quickstart gets you up and running with a specific data modality in less than 30 minutes. Each one provides basic installation steps, sample data, and a working example.

For production deployments, cluster configurations, or detailed system requirements, refer to the Setup & Deployment documentation.


Modality Quickstarts

Each of the following quickstarts lets you try NeMo Curator with a specific data modality.