NSFW Filter | NeMo Curator

The NSFW (Not Safe For Work) Filter detects the likelihood that an image contains explicit or unsafe content. It outputs a probability score from 0 (safe) to 1 (NSFW), helping you filter or flag images in your datasets.

Model Details

Architecture: MLP trained on CLIP ViT-L/14 image embeddings
Source: CLIP-based NSFW Detector
Output Field: nsfw_score
Score Range: 0–1 (higher scores indicate a higher likelihood of NSFW content)
Embeddings: Requires CLIP ViT-L/14 (see Image embeddings)

How It Works

The filter takes pre-computed normalized image embeddings from a previous pipeline stage and predicts the probability of NSFW content. The lightweight model processes batches of embeddings efficiently on the GPU.
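To make the scoring step concrete, here is a minimal pure-Python sketch of an MLP head applied to a normalized embedding. It is an illustration only: the hidden size, the random weights, and the helper names (`l2_normalize`, `dense`, `nsfw_score`) are assumptions, not the actual detector architecture or the NeMo Curator API.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (returns a copy if the norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dense(vec, weights, bias):
    """Fully connected layer: one row of weights per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def relu(vec):
    return [max(0.0, x) for x in vec]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def nsfw_score(embedding, hidden_w, hidden_b, out_w, out_b):
    """Map one embedding to a probability-like score in [0, 1]."""
    x = l2_normalize(embedding)
    h = relu(dense(x, hidden_w, hidden_b))
    logit = dense(h, [out_w], [out_b])[0]
    return sigmoid(logit)


# Placeholder weights and a random 768-dimensional "embedding" (assumed values).
rng = random.Random(0)
dim, hidden = 768, 16
hidden_w = [[rng.gauss(0, 0.05) for _ in range(dim)] for _ in range(hidden)]
hidden_b = [0.0] * hidden
out_w = [rng.gauss(0, 0.5) for _ in range(hidden)]
out_b = 0.0
embedding = [rng.gauss(0, 1) for _ in range(dim)]

score = nsfw_score(embedding, hidden_w, hidden_b, out_w, out_b)
print(f"nsfw_score = {score:.3f}")
```

Because the head only sees a 768-dimensional vector per image, the cost per image is small compared with computing the embedding itself.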

Prerequisites

Before using the ImageNSFWFilterStage, make sure the model setup and input requirements described below are met.

Model Setup

The NSFW detector model weights are automatically downloaded from the LAION repository on first use. The stage will:

Download the CLIP-based NSFW detector model (~20MB) to the specified model_dir
Cache the model for subsequent runs
Load the model onto GPU (or CPU if GPU unavailable)
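The download-then-cache behavior listed above can be sketched as follows. This is not the stage's actual implementation: `ensure_model_cached`, the `fetch` callable, and the file name `nsfw_detector.bin` are hypothetical names used only to show the pattern of downloading once and reusing the cached file afterwards.

```python
import tempfile
from pathlib import Path


def ensure_model_cached(model_dir, filename, fetch):
    """Return (path, downloaded). Calls fetch(dest) only if the file is absent."""
    target = Path(model_dir) / filename
    if target.exists():
        return target, False  # cached from an earlier run
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download is not
    # mistaken for a complete file on the next run.
    tmp = target.with_name(target.name + ".part")
    fetch(tmp)
    tmp.replace(target)
    return target, True


calls = []


def fake_fetch(dest):
    """Stand-in for a real download; writes placeholder bytes."""
    calls.append(dest)
    Path(dest).write_bytes(b"placeholder weights")


with tempfile.TemporaryDirectory() as model_dir:
    path1, downloaded1 = ensure_model_cached(model_dir, "nsfw_detector.bin", fake_fetch)
    path2, downloaded2 = ensure_model_cached(model_dir, "nsfw_detector.bin", fake_fetch)

print(downloaded1, downloaded2, len(calls))
```

Pointing every worker at the same `model_dir` means only the first run pays the download cost.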

First-time setup: The initial model download is quick (under 1 minute on most connections). Subsequent runs will use the cached model.

Required Input

CLIP Embeddings: Images must have embeddings already generated by ImageEmbeddingStage
Embedding Format: CLIP ViT-L/14 768-dimensional vectors stored in ImageObject.embedding
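A quick sanity check on the input can catch mismatched embedders early. The sketch below assumes embeddings are available as plain sequences of floats; `check_embedding` and its tolerance are hypothetical and not part of NeMo Curator.

```python
import math

EXPECTED_DIM = 768  # CLIP ViT-L/14 embedding size stated above


def check_embedding(embedding, expected_dim=EXPECTED_DIM, tol=1e-3):
    """Return 'ok' or a short description of what is wrong with the embedding."""
    if embedding is None:
        return "missing"
    if len(embedding) != expected_dim:
        return f"wrong dimension: {len(embedding)}"
    norm = math.sqrt(sum(x * x for x in embedding))
    if abs(norm - 1.0) > tol:
        return "not normalized"
    return "ok"


unit_vector = [1.0] + [0.0] * (EXPECTED_DIM - 1)
print(check_embedding(unit_vector))          # unit-length, 768 values
print(check_embedding([1.0] * EXPECTED_DIM)) # right size, not unit-length
print(check_embedding([1.0] * 512))          # e.g. output of a different CLIP variant
print(check_embedding(None))                 # embedding stage was skipped
```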

Usage

Python

from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.file_partitioning import FilePartitioningStage
from nemo_curator.stages.image.io.image_reader import ImageReaderStage
from nemo_curator.stages.image.embedders.clip_embedder import ImageEmbeddingStage
from nemo_curator.stages.image.filters.nsfw_filter import ImageNSFWFilterStage

# Create pipeline
pipeline = Pipeline(name="nsfw_filtering", description="Filter NSFW content from images")

# Stage 1: Partition tar files
pipeline.add_stage(FilePartitioningStage(
    file_paths="/path/to/tar_dataset",
    files_per_partition=1,
    file_extensions=[".tar"],
))

# Stage 2: Read images
pipeline.add_stage(ImageReaderStage(
    batch_size=100,
    num_gpus_per_worker=0.25,
))

# Stage 3: Generate CLIP embeddings
pipeline.add_stage(ImageEmbeddingStage(
    model_dir="/path/to/models",
    model_inference_batch_size=32,
    num_gpus_per_worker=0.25,
))

# Stage 4: Apply NSFW filtering
pipeline.add_stage(ImageNSFWFilterStage(
    model_dir="/path/to/models",
    score_threshold=0.5,
    model_inference_batch_size=32,
    num_gpus_per_worker=0.25,
))

# Run the pipeline (uses XennaExecutor by default)
results = pipeline.run()

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `model_dir` | str | None | Path to directory containing model weights |
| `score_threshold` | float | 0.5 | NSFW score threshold for filtering (filters out images above this value) |
| `model_inference_batch_size` | int | 32 | Batch size for model inference |
| `num_gpus_per_worker` | float | 0.25 | GPU allocation per worker (0.25 = 1/4 GPU) |
| `verbose` | bool | False | Enable verbose logging for debugging |
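The effect of `score_threshold` can be illustrated with a small standalone sketch. It follows the description above (images scoring above the threshold are removed); how a score exactly equal to the threshold is treated is an assumption here, and `apply_threshold` is a hypothetical helper, not the stage's code.

```python
def apply_threshold(scored_images, score_threshold=0.5):
    """Split (image_id, nsfw_score) pairs into kept and removed image ids."""
    kept, removed = [], []
    for image_id, score in scored_images:
        # Assumption: a score exactly equal to the threshold is kept.
        if score > score_threshold:
            removed.append(image_id)
        else:
            kept.append(image_id)
    return kept, removed


# Made-up scores for illustration only.
scored = [("img_a", 0.03), ("img_b", 0.48), ("img_c", 0.51), ("img_d", 0.97)]
kept, removed = apply_threshold(scored, score_threshold=0.5)
print("kept:", kept)
print("removed:", removed)
```

Lowering the threshold removes more images at the cost of more false positives; raising it does the opposite.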

Performance Notes

The small model processes pre-computed embeddings efficiently on the GPU.
Increase batch size for faster throughput if memory allows.
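As a rough picture of what the batch size controls, the sketch below splits a list of embeddings into chunks and counts the resulting forward passes. `iter_batches` is a hypothetical helper; the real stage handles batching internally.

```python
import math


def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


embeddings = list(range(100))  # stand-ins for 100 embedding vectors
sizes = [len(batch) for batch in iter_batches(embeddings, 32)]
print(sizes)

# Fewer, larger batches mean fewer forward passes for the same data.
for batch_size in (32, 64, 128):
    print(batch_size, math.ceil(len(embeddings) / batch_size))
```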

Best Practices

Use CLIP ViT-L/14 embeddings generated by ImageEmbeddingStage for best results.
Run the NSFW filter after embedding generation in the same pipeline to avoid extra I/O.
The filter requires pre-computed embeddings and cannot extract embeddings from raw images.
Review a sample of scores to calibrate thresholds for your use case.
Adjust model_inference_batch_size based on available GPU memory.
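One way to calibrate the threshold is to score a sample of images and see what fraction each candidate threshold would remove. The sketch below uses made-up scores, and `removal_rates` is a hypothetical helper; in practice the scores would come from the nsfw_score field of a processed sample.

```python
def removal_rates(scores, thresholds):
    """Fraction of scores strictly above each candidate threshold."""
    total = len(scores)
    return {t: sum(s > t for s in scores) / total for t in thresholds}


# Made-up sample of scores for illustration only.
sample_scores = [0.02, 0.05, 0.10, 0.35, 0.48, 0.55, 0.72, 0.91]
rates = removal_rates(sample_scores, [0.3, 0.5, 0.7])

for threshold, rate in rates.items():
    print(f"threshold={threshold}: removes {rate:.1%} of the sample")
```

Pairing these rates with a manual review of images near each threshold helps pick a value that matches how strict your use case needs to be.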