Deploy NeMo Curator on Slurm

Deploy NeMo Curator on Slurm-managed clusters to scale data curation workflows across multiple nodes with shared storage.

Slurm provides workload management and job scheduling for high-performance computing environments, enabling efficient resource allocation and queue management for large-scale data processing tasks.
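
As a rough illustration of the resource allocation described above, the following minimal sketch renders a multi-node batch script that could be submitted with `sbatch`. The `#SBATCH` directives are standard Slurm options; the helper `render_sbatch`, the job name, the resource values, and the `python curate.py` command are hypothetical placeholders and not part of NeMo Curator.

```python
def render_sbatch(job_name, nodes, gpus_per_node, time_limit, command):
    """Build the text of a Slurm batch script (illustrative helper, not a NeMo Curator API)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",                   # number of nodes to allocate
        f"#SBATCH --gpus-per-node={gpus_per_node}",   # GPUs requested on each node
        f"#SBATCH --time={time_limit}",               # wall-clock limit, HH:MM:SS
        "#SBATCH --output=%x-%j.out",                 # %x = job name, %j = job ID
        "",
        f"srun {command}",                            # launch the command inside the allocation
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values; adjust to the cluster's partitions and limits.
script = render_sbatch(
    job_name="curator-demo",
    nodes=2,
    gpus_per_node=8,
    time_limit="04:00:00",
    command="python curate.py",  # placeholder for the actual curation entry point
)
print(script)
```

Writing the output to a file and passing it to `sbatch` would queue the job; the guides listed below cover the actual scripts and settings for each modality.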

All Modalities

How to set up and run NeMo Curator on Slurm for any modality.

Deploy All Modalities on Slurm

Multi-Node Setup Guide

Advanced multi-node configurations for large-scale deployments with performance optimization and troubleshooting.

Multi-Node Slurm Setup Guide

Image Modality Deployment & Workflow

A step-by-step Slurm pipeline for image curation, including embedding generation and semantic deduplication.

Deploy Image Curation on Slurm

Text Modality Deployment & Workflow

A step-by-step Slurm pipeline for text curation, including deduplication, classification, and PII redaction.

Deploy Text Curation on Slurm