API Reference
NeMo Curator’s API reference provides comprehensive technical documentation for all modules, classes, and functions. Use these references to understand the technical foundation of NeMo Curator and integrate it with your data curation workflows.
Ray-based execution backends
Adapters and executors for running pipelines at scale.
ray-data, xenna
Orchestrate end-to-end workflows
Build and run pipelines composed of processing stages.
Download, transform, and write data
Modular stages for download/extract, text models/classifiers, I/O, and utilities.
download, text, io, modules
Core data structures
Document batches, file groups, and related interfaces passed between stages.
Helper functions
File, performance, and operation utilities used across the pipeline.
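As a rough illustration of how the pieces listed on this page relate (batches of documents flowing through a pipeline of processing stages), here is a minimal, library-independent sketch. The names `ToyBatch`, `ToyStage`, `ToyPipeline`, `strip_whitespace`, and `drop_empty` are hypothetical and chosen for this example only; they are not NeMo Curator classes or functions, and the real interfaces are documented in the linked reference pages.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyBatch:
    """Hypothetical stand-in for a batch of documents passed between stages."""
    texts: List[str] = field(default_factory=list)


# In this sketch a stage is any callable that maps one batch to another.
ToyStage = Callable[[ToyBatch], ToyBatch]


def strip_whitespace(batch: ToyBatch) -> ToyBatch:
    # Transform stage: trim leading and trailing whitespace from each document.
    return ToyBatch([t.strip() for t in batch.texts])


def drop_empty(batch: ToyBatch) -> ToyBatch:
    # Filter stage: discard documents that are empty strings.
    return ToyBatch([t for t in batch.texts if t])


@dataclass
class ToyPipeline:
    """Hypothetical pipeline that applies its stages in the order they were added."""
    stages: List[ToyStage] = field(default_factory=list)

    def add_stage(self, stage: ToyStage) -> "ToyPipeline":
        self.stages.append(stage)
        return self  # returning self allows chained add_stage calls

    def run(self, batch: ToyBatch) -> ToyBatch:
        # Each stage consumes the previous stage's output.
        for stage in self.stages:
            batch = stage(batch)
        return batch


pipeline = ToyPipeline().add_stage(strip_whitespace).add_stage(drop_empty)
result = pipeline.run(ToyBatch(["  hello ", "   ", "world"]))
print(result.texts)  # ['hello', 'world']
```

The sketch runs everything sequentially in one process. An execution backend, such as the Ray-based ones listed above, would take over the scheduling that `run` does here.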