References
NeMo Curator’s reference documentation covers technical details, API references, and integration information. Use these resources to understand how NeMo Curator works and to integrate it with other tools and systems.
API Quicklinks
Use these links to jump directly to the API documentation for each major NeMo Curator module.
Main orchestrator for executing sequences of processing stages
Base class for all data processing stages
High-level stages that decompose into multiple execution stages
CPU and GPU resource configuration for stages
Task types for text, image, video, and audio processing
Execution backends for running pipelines
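The relationship between these pieces (a pipeline that runs stages in order, and high-level stages that decompose into several execution stages) can be sketched in plain Python. This is only a conceptual illustration: every name below (`ToyTask`, `ToyStage`, `ToyCompositeStage`, `ToyPipeline`, `flatten`) is hypothetical and is not part of the NeMo Curator API. Consult the linked API references for the actual classes and signatures.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToyTask:
    """Hypothetical unit of work: a batch of text records."""
    data: list[str]


class ToyStage:
    """Hypothetical base class: one processing step over a task."""

    def process(self, task: ToyTask) -> ToyTask:
        raise NotImplementedError


class ToyCompositeStage(ToyStage):
    """Hypothetical high-level stage that expands into simpler stages."""

    def decompose(self) -> list[ToyStage]:
        raise NotImplementedError


class NormalizeText(ToyStage):
    def process(self, task: ToyTask) -> ToyTask:
        # Trim whitespace and lowercase every record.
        return ToyTask([s.strip().lower() for s in task.data])


class DropEmpty(ToyStage):
    def process(self, task: ToyTask) -> ToyTask:
        # Remove records that are empty after normalization.
        return ToyTask([s for s in task.data if s])


class CleanText(ToyCompositeStage):
    def decompose(self) -> list[ToyStage]:
        return [NormalizeText(), DropEmpty()]


def flatten(stages: list[ToyStage]) -> list[ToyStage]:
    """Recursively expand composite stages into a flat execution list."""
    flat: list[ToyStage] = []
    for stage in stages:
        if isinstance(stage, ToyCompositeStage):
            flat.extend(flatten(stage.decompose()))
        else:
            flat.append(stage)
    return flat


class ToyPipeline:
    """Hypothetical orchestrator: runs the flattened stages in sequence."""

    def __init__(self, stages: list[ToyStage]) -> None:
        self.stages = stages

    def run(self, task: ToyTask) -> ToyTask:
        for stage in flatten(self.stages):
            task = stage.process(task)
        return task


result = ToyPipeline([CleanText()]).run(ToyTask(["Hello ", "  ", "WORLD"]))
```

In this sketch the pipeline never calls `process` on a composite stage directly; it first expands it, so only the simple stages are executed.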
Infrastructure Components
Explore the infrastructure that underpins NeMo Curator. Learn how to scale, optimize, and manage large data workflows efficiently.
Optimize memory usage when processing large datasets
Leverage NVIDIA GPUs for faster data processing
Continue interrupted operations across large datasets
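As a rough illustration of what continuing an interrupted operation involves, the sketch below records each finished shard in a checkpoint file and skips recorded shards on the next run. It is a generic pattern written under stated assumptions, not NeMo Curator's implementation: the function names, the JSON checkpoint format, and the shard identifiers are all hypothetical.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Callable


def load_completed(checkpoint: Path) -> set[str]:
    """Return the shard IDs recorded as finished (empty if no checkpoint)."""
    if checkpoint.exists():
        return set(json.loads(checkpoint.read_text()))
    return set()


def process_resumable(
    shards: list[str],
    checkpoint: Path,
    handler: Callable[[str], None],
) -> list[str]:
    """Process shards not yet recorded; return those handled in this run."""
    done = load_completed(checkpoint)
    processed_now: list[str] = []
    for shard in shards:
        if shard in done:
            continue  # finished in an earlier run
        handler(shard)  # may raise; earlier progress is already saved
        done.add(shard)
        # Persist progress after every shard so a crash loses at most one.
        checkpoint.write_text(json.dumps(sorted(done)))
        processed_now.append(shard)
    return processed_now
```

If `handler` raises partway through, rerunning `process_resumable` with the same checkpoint path picks up at the first unfinished shard.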
Integration & Tools
Discover related tools and integrations in the NVIDIA AI ecosystem that complement NeMo Curator, enabling seamless workflows from data curation to model training and deployment.