Deploy NeMo Curator on Slurm
Deploy NeMo Curator on Slurm-managed clusters to scale data curation workflows across multiple nodes with shared storage.
Slurm provides workload management and job scheduling for high-performance computing environments, enabling efficient resource allocation and queue management for large-scale data processing tasks.
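As a rough illustration of what a multi-node Slurm submission looks like, the sketch below renders a generic batch script from Python. The helper `render_sbatch_script`, the job name, and the `python my_pipeline.py` command are all hypothetical placeholders, not part of NeMo Curator. Only the `#SBATCH` directives (`--job-name`, `--nodes`, `--ntasks-per-node`, `--time`) and `srun` are standard Slurm. The guides listed below describe the actual NeMo Curator setup.

```python
def render_sbatch_script(job_name, nodes, time_limit, command):
    """Return the text of a generic Slurm batch script.

    Hypothetical helper for illustration only; it is not a NeMo Curator API.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        "#SBATCH --ntasks-per-node=1",
        f"#SBATCH --time={time_limit}",
        "",
        "# srun launches one task per node inside the allocation.",
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"


script = render_sbatch_script(
    job_name="curator-demo",          # assumed name
    nodes=2,                          # assumed node count
    time_limit="01:00:00",            # HH:MM:SS wall-clock limit
    command="python my_pipeline.py",  # placeholder, not a real entry point
)
print(script)
```

Saving the printed text to a file on shared storage and submitting it with `sbatch` would queue the job. Partition, account, and GPU options are site-specific and are left out here.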
All Modalities
How to set up and run NeMo Curator on Slurm for any modality.
Multi-Node Setup Guide
Advanced multi-node configurations for large-scale deployments, including performance optimization and troubleshooting.
Image Modality Deployment & Workflow
A step-by-step Slurm pipeline for image curation, including embedding generation and semantic deduplication.
Text Modality Deployment & Workflow
A step-by-step Slurm pipeline for text curation, including deduplication, classification, and PII redaction.