References
NeMo Curator’s reference documentation covers technical details, API references, and integration information. Use these resources to understand how NeMo Curator works and to integrate it with other tools and systems.
API Quicklinks
Browse the generated Python API reference for NeMo Curator modules, or use the API Reference tab for curated overviews of pipelines, stages, tasks, and executors.
Infrastructure Components
Explore the foundational infrastructure that powers NeMo Curator. Learn how to scale, optimize, and manage large data workflows efficiently.
Optimize memory usage when processing large datasets (partitioning, batching, monitoring)
Leverage NVIDIA GPUs for faster data processing (CUDA, RMM, performance)
Continue interrupted operations across large datasets (checkpoints, recovery, batching)
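As a rough illustration of the batching and checkpoint-recovery ideas listed above, here is a minimal sketch in plain Python. It does not use the NeMo Curator API: the names `batches`, `process_with_checkpoint`, `process_batch`, and the JSON checkpoint layout are all hypothetical, chosen only to show the general pattern of processing data in bounded batches and skipping completed work after an interruption.

```python
import json
from pathlib import Path


def batches(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_with_checkpoint(items, batch_size, checkpoint_path, process_batch):
    """Process `items` in batches, recording progress so a rerun can resume.

    Hypothetical helper for illustration; not part of NeMo Curator.
    Returns only the results produced during this call.
    """
    path = Path(checkpoint_path)
    # Load the number of batches completed by an earlier run, if any.
    done = json.loads(path.read_text())["batches_done"] if path.exists() else 0
    results = []
    for index, batch in enumerate(batches(items, batch_size)):
        if index < done:
            continue  # this batch finished in an earlier run
        results.extend(process_batch(batch))
        # Persist progress only after the batch has succeeded.
        path.write_text(json.dumps({"batches_done": index + 1}))
    return results
```

Calling `process_with_checkpoint` again with the same checkpoint file skips every batch already recorded as done, so an interrupted run resumes at the first unfinished batch. Keeping `batch_size` small bounds how much data is held in memory at once, at the cost of more frequent checkpoint writes.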
Integration & Tools
Discover related tools and integrations in the NVIDIA AI ecosystem that complement NeMo Curator, enabling seamless workflows from data curation to model training and deployment.