NeMo Curator provides filters for image curation, including aesthetic and NSFW filters. These models help you filter, score, and curate large image datasets for downstream tasks such as generative model training and dataset quality control.
Image filtering in NeMo Curator typically follows these steps:
1. Load images with FilePartitioningStage and ImageReaderStage.
2. Generate image embeddings with ImageEmbeddingStage.
3. Apply a filtering stage (ImageAestheticFilterStage or ImageNSFWFilterStage).

Filtering stages integrate seamlessly into NeMo Curator’s pipeline architecture.
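The ordering of these steps can be sketched in plain Python. This sketch does not use the NeMo Curator API: ToyImage, read_images, embed_images, score_and_filter, and run_pipeline are hypothetical stand-ins for the reader, embedding, and filter stages, and the embedding and score values are made up purely to show how data flows from one stage to the next.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToyImage:
    """Hypothetical stand-in for an image record (not NeMo Curator's ImageObject)."""
    path: str
    embedding: Optional[List[float]] = None
    score: Optional[float] = None


Stage = Callable[[List[ToyImage]], List[ToyImage]]


def read_images(paths: List[str]) -> List[ToyImage]:
    # Step 1 stand-in: turn file paths into image records.
    return [ToyImage(path=p) for p in paths]


def embed_images(images: List[ToyImage]) -> List[ToyImage]:
    # Step 2 stand-in: a real pipeline would run an embedding model here.
    # The fake embedding below is derived from the path length only.
    for img in images:
        img.embedding = [float(len(img.path)), 1.0]
    return images


def score_and_filter(images: List[ToyImage], threshold: float = 0.6) -> List[ToyImage]:
    # Step 3 stand-in: score each embedding, then keep images at or above the threshold.
    kept = []
    for img in images:
        if img.embedding is None:
            raise ValueError(f"{img.path} has no embedding; run the embedding step first")
        img.score = min(1.0, img.embedding[0] / 10.0)  # made-up score in [0, 1]
        if img.score >= threshold:
            kept.append(img)
    return kept


def run_pipeline(paths: List[str], stages: List[Stage]) -> List[ToyImage]:
    # Apply the stages in order, passing each stage's output to the next.
    batch = read_images(paths)
    for stage in stages:
        batch = stage(batch)
    return batch


result = run_pipeline(["a.jpg", "sunset.jpg"], [embed_images, score_and_filter])
```

The point of the sketch is the dependency between steps: the filter step reads what the embedding step wrote, so swapping their order fails fast instead of silently filtering on missing data.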
Before using filtering stages, ensure that:
- Images are loaded with ImageReaderStage.
- Embeddings are generated with ImageEmbeddingStage.
- Embeddings are available in the ImageObject.embedding field for each image.

Aesthetic filter (ImageAestheticFilterStage, aesthetic_score): Assesses the subjective quality of images using a model trained on human aesthetic preferences, and removes images that score below a configurable aesthetic score threshold (0.0 to 1.0).
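The thresholding rule for the aesthetic filter can be illustrated with a minimal sketch. It is not the ImageAestheticFilterStage implementation: ScoredImage and filter_by_aesthetics are hypothetical names, the record only mirrors the embedding and aesthetic_score names mentioned above, and keeping images whose score equals the threshold is an assumption based on the wording "below".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredImage:
    """Hypothetical record that mirrors only the field names used in the text."""
    image_id: str
    embedding: Optional[List[float]]
    aesthetic_score: float


def filter_by_aesthetics(images: List[ScoredImage], score_threshold: float) -> List[ScoredImage]:
    # The threshold is documented as a value between 0.0 and 1.0.
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0.0 and 1.0")

    # Prerequisite check: every image must already carry an embedding.
    missing = [img.image_id for img in images if img.embedding is None]
    if missing:
        raise ValueError(f"images without embeddings: {missing}")

    # Assumption: images scoring exactly at the threshold are kept.
    return [img for img in images if img.aesthetic_score >= score_threshold]
```

Raising the threshold trades dataset size for perceived quality, so it is usually worth inspecting the score distribution on a sample before fixing a value.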
NSFW filter (ImageNSFWFilterStage, nsfw_score): Detects not-safe-for-work (NSFW) content in images using a CLIP-based filter, and removes images whose NSFW probability is above a configurable threshold (0.0 to 1.0).
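The NSFW rule runs in the opposite direction from the aesthetic rule: high scores are removed rather than kept. The following sketch, which again does not call NeMo Curator, splits a mapping of image IDs to NSFW probabilities into kept and removed IDs so the removed set can be audited. The function name partition_by_nsfw and the sample scores are hypothetical, and keeping images exactly at the threshold is an assumption based on the wording "above".

```python
from typing import Dict, List, Tuple


def partition_by_nsfw(
    nsfw_scores: Dict[str, float], score_threshold: float
) -> Tuple[List[str], List[str]]:
    """Split image IDs into (kept, removed) by NSFW probability."""
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0.0 and 1.0")

    kept: List[str] = []
    removed: List[str] = []
    for image_id, score in nsfw_scores.items():
        # Assumption: only scores strictly above the threshold are removed.
        if score > score_threshold:
            removed.append(image_id)
        else:
            kept.append(image_id)
    return kept, removed


# Illustrative scores only; real values would come from the NSFW classifier.
kept_ids, removed_ids = partition_by_nsfw({"a": 0.05, "b": 0.5, "c": 0.93}, 0.5)
```

Returning the removed IDs alongside the kept ones makes it easier to spot-check whether the chosen threshold is too aggressive or too lenient.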