NSFW Filter#

The NSFW (Not Safe For Work) Filter estimates the likelihood that an image contains explicit or unsafe content. It outputs a probability score from 0 (safe) to 1 (NSFW), which you can use to filter or flag images in your datasets.

Model Details#

  • Architecture: MLP trained on CLIP ViT-L/14 image embeddings

  • Source: CLIP-based NSFW Detector

  • Output Field: nsfw_score

  • Score Range: 0–1 (higher scores indicate a higher likelihood of NSFW content)

  • Embeddings: Requires CLIP ViT-L/14 (see Image Embedding)

How It Works#

The filter takes pre-computed normalized image embeddings from a previous pipeline stage and predicts the probability of NSFW content. The lightweight model processes batches of embeddings efficiently on the GPU.
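To make the scoring step concrete, here is a minimal sketch of an MLP that maps one normalized 768-dimensional embedding to a probability. It is not the actual detector: the layer sizes, the randomly initialized weights, and the helper names (l2_normalize, make_placeholder_mlp, nsfw_probability) are assumptions chosen for illustration, and the code runs on the CPU in plain Python rather than in batches on the GPU.

```python
import math
import random

EMBEDDING_DIM = 768  # CLIP ViT-L/14 image embedding size


def l2_normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def make_placeholder_mlp(hidden_dim=16, seed=0):
    """Build random weights; a real detector would load trained weights instead."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.gauss(0, 0.05) for _ in range(EMBEDDING_DIM)]
               for _ in range(hidden_dim)],
        "b1": [0.0] * hidden_dim,
        "w2": [rng.gauss(0, 0.05) for _ in range(hidden_dim)],
        "b2": 0.0,
    }


def nsfw_probability(embedding, params):
    """One hidden ReLU layer followed by a sigmoid, giving a score in (0, 1)."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, embedding)) + b)
        for row, b in zip(params["w1"], params["b1"])
    ]
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return 1.0 / (1.0 + math.exp(-logit))


# Stand-in for an embedding produced by an earlier pipeline stage.
rng = random.Random(1)
example_embedding = l2_normalize([rng.gauss(0, 1) for _ in range(EMBEDDING_DIM)])
example_score = nsfw_probability(example_embedding, make_placeholder_mlp())
```

Because the weights are random, example_score carries no meaning; the sketch only shows the shape of the computation, from embedding to a single probability.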

Prerequisites#

Before using the ImageNSFWFilterStage, make sure the model setup and input requirements below are met.

Model Setup#

The NSFW detector model weights are automatically downloaded from the LAION repository on first use. The stage will:

  1. Download the CLIP-based NSFW detector model (~20MB) to the specified model_dir

  2. Cache the model for subsequent runs

  3. Load the model onto GPU (or CPU if GPU unavailable)
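The download-then-cache behavior in the first two steps follows a common pattern, sketched below. This is an illustration only, not the stage's implementation: ensure_cached, the download callable, and the file name nsfw_detector.bin are hypothetical, and the fake downloader writes placeholder bytes instead of fetching real weights.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_cached(model_dir: str, filename: str,
                  download: Callable[[Path], None]) -> Path:
    """Return the local weights path, calling `download` only if the file is missing."""
    target = Path(model_dir) / filename
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        download(target)
    return target


download_calls = []


def fake_download(target: Path) -> None:
    """Hypothetical downloader: records the call and writes placeholder bytes."""
    download_calls.append(target.name)
    target.write_bytes(b"placeholder weights")


with tempfile.TemporaryDirectory() as model_dir:
    first = ensure_cached(model_dir, "nsfw_detector.bin", fake_download)
    second = ensure_cached(model_dir, "nsfw_detector.bin", fake_download)
    # The second call finds the cached file and skips the download.
    cached_ok = first == second and first.exists()
```

Pointing every run at the same model_dir is what lets later runs skip the download.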

First-time setup: The initial model download is quick (under 1 minute on most connections). Subsequent runs will use the cached model.

Required Input#

  • CLIP Embeddings: Images must have embeddings already generated by ImageEmbeddingStage

  • Embedding Format: CLIP ViT-L/14 768-dimensional vectors stored in ImageObject.embedding
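If you want to check these input requirements before the filter runs, a small validation helper along the following lines can catch missing or malformed embeddings early. The function name validate_embedding and the tolerance value are assumptions; the sketch works on a plain sequence of floats rather than on an ImageObject.

```python
import math

EXPECTED_DIM = 768  # CLIP ViT-L/14 embedding size


def validate_embedding(embedding, tolerance=1e-3):
    """Raise ValueError if the embedding is missing, mis-sized, or not unit length."""
    if embedding is None:
        raise ValueError("embedding is missing; run the embedding stage first")
    if len(embedding) != EXPECTED_DIM:
        raise ValueError(
            f"expected {EXPECTED_DIM} dimensions, got {len(embedding)}"
        )
    norm = math.sqrt(sum(x * x for x in embedding))
    if abs(norm - 1.0) > tolerance:
        raise ValueError(f"embedding is not normalized (L2 norm = {norm:.4f})")


# A trivially valid unit vector passes without raising.
unit_vector = [1.0] + [0.0] * (EXPECTED_DIM - 1)
validate_embedding(unit_vector)
```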

Usage#

from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.file_partitioning import FilePartitioningStage
from nemo_curator.stages.image.io.image_reader import ImageReaderStage
from nemo_curator.stages.image.embedders.clip_embedder import ImageEmbeddingStage
from nemo_curator.stages.image.filters.nsfw_filter import ImageNSFWFilterStage

# Create pipeline
pipeline = Pipeline(name="nsfw_filtering", description="Filter NSFW content from images")

# Stage 1: Partition tar files
pipeline.add_stage(FilePartitioningStage(
    file_paths="/path/to/tar_dataset",
    files_per_partition=1,
    file_extensions=[".tar"],
))

# Stage 2: Read images
pipeline.add_stage(ImageReaderStage(
    batch_size=100,
    num_gpus_per_worker=0.25,
))

# Stage 3: Generate CLIP embeddings
pipeline.add_stage(ImageEmbeddingStage(
    model_dir="/path/to/models",
    model_inference_batch_size=32,
    num_gpus_per_worker=0.25,
))

# Stage 4: Apply NSFW filtering
pipeline.add_stage(ImageNSFWFilterStage(
    model_dir="/path/to/models",
    score_threshold=0.5,
    model_inference_batch_size=32,
    num_gpus_per_worker=0.25,
))

# Run the pipeline (uses XennaExecutor by default)
results = pipeline.run()

Parameters#

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model_dir | str | None | Path to the directory containing model weights |
| score_threshold | float | 0.5 | NSFW score threshold for filtering (images scoring above this value are filtered out) |
| model_inference_batch_size | int | 32 | Batch size for model inference |
| num_gpus_per_worker | float | 0.25 | GPU allocation per worker (0.25 = 1/4 GPU) |
| verbose | bool | False | Enable verbose logging for debugging |
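The effect of score_threshold can be illustrated with a short sketch. ScoredImage is a hypothetical stand-in for the pipeline's image objects, and treating a score exactly equal to the threshold as kept is an assumption based on the description "above this value"; the stage's actual boundary handling may differ.

```python
from dataclasses import dataclass


@dataclass
class ScoredImage:
    """Hypothetical stand-in for an image that already carries an nsfw_score."""
    image_id: str
    nsfw_score: float


def filter_by_threshold(images, score_threshold=0.5):
    """Split images into (kept, removed); scores above the threshold are removed."""
    kept = [img for img in images if img.nsfw_score <= score_threshold]
    removed = [img for img in images if img.nsfw_score > score_threshold]
    return kept, removed


images = [
    ScoredImage("a", 0.03),
    ScoredImage("b", 0.50),
    ScoredImage("c", 0.72),
]
kept, removed = filter_by_threshold(images, score_threshold=0.5)
kept_ids = [img.image_id for img in kept]        # ["a", "b"]
removed_ids = [img.image_id for img in removed]  # ["c"]
```

Lowering the threshold removes more images at the cost of more false positives; raising it does the opposite.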

Performance Notes#

  • The small model processes pre-computed embeddings efficiently on the GPU.

  • Increase batch size for faster throughput if memory allows.
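As a rough illustration of how batch size affects the number of model calls, the sketch below splits a list of embeddings into fixed-size batches. The iter_batches helper is hypothetical and is not how the stage batches internally; it only shows that a larger batch size means fewer, larger inference calls.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


embeddings = list(range(70))  # placeholder for 70 embedding vectors

sizes_32 = [len(batch) for batch in iter_batches(embeddings, 32)]  # [32, 32, 6]
sizes_64 = [len(batch) for batch in iter_batches(embeddings, 64)]  # [64, 6]
```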

Best Practices#

  • Use CLIP ViT-L/14 embeddings generated by ImageEmbeddingStage for best results.

  • Run the NSFW filter after embedding generation in the same pipeline to avoid extra I/O.

  • The filter requires pre-computed embeddings and cannot extract embeddings from raw images.

  • Review a sample of scores to calibrate thresholds for your use case.

  • Adjust model_inference_batch_size based on available GPU memory.
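One simple way to review a sample of scores when calibrating a threshold is to compute how much of the sample each candidate threshold would remove. The sketch below uses made-up scores and a hypothetical removal_rates helper; in practice you would substitute nsfw_score values collected from your own data.

```python
def removal_rates(scores, thresholds):
    """Map each threshold to the fraction of scores strictly above it."""
    if not scores:
        raise ValueError("scores must not be empty")
    total = len(scores)
    return {t: sum(score > t for score in scores) / total for t in thresholds}


# Made-up sample of nsfw_score values for illustration only.
sample_scores = [0.02, 0.05, 0.10, 0.35, 0.48, 0.62, 0.91, 0.97]
rates = removal_rates(sample_scores, thresholds=[0.3, 0.5, 0.7])
# rates == {0.3: 0.625, 0.5: 0.375, 0.7: 0.25}
```

Pairing these rates with a manual look at images near each candidate threshold helps you pick a value that matches your tolerance for false positives.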

Resources#