---
description: >-
  NSFW filter for detecting inappropriate content in images using CLIP
  embeddings and MLP architecture
categories:
  - how-to-guides
tags:
  - nsfw
  - filtering
  - clip
  - safety
  - content-filtering
personas:
  - data-scientist-focused
  - mle-focused
difficulty: intermediate
content_type: how-to
modality: image-only
---

# NSFW Filter

The NSFW (Not Safe For Work) Filter detects the likelihood that an image contains explicit or unsafe content. It outputs a probability score from 0 (safe) to 1 (NSFW), helping you filter or flag images in your datasets.

## Model Details

* **Architecture:** MLP trained on CLIP ViT-L/14 image embeddings
* **Source:** [CLIP-based NSFW Detector](https://github.com/LAION-AI/CLIP-based-NSFW-Detector)
* **Output Field:** `nsfw_score`
* **Score Range:** 0–1 (higher scores indicate a higher likelihood of NSFW content)
* **Embeddings:** Requires CLIP ViT-L/14 (see [Image embeddings](/curate-images/process-data/embeddings))

## How It Works

The filter takes pre-computed normalized image embeddings from a previous pipeline stage and predicts the probability of NSFW content. The lightweight model processes batches of embeddings efficiently on the GPU.
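To make the flow concrete, here is a minimal sketch of scoring normalized embeddings with a small MLP. It is not the LAION detector: the layer sizes, the random weights, and the helper names (`l2_normalize`, `toy_mlp_score`) are assumptions chosen for illustration, and it uses only the Python standard library.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (the filter expects normalized embeddings)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def toy_mlp_score(embedding, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a sigmoid output in (0, 1)."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, embedding)) + b)
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return sigmoid(logit)


# Illustrative sizes and random weights only; the real detector is trained
# on 768-dimensional CLIP ViT-L/14 embeddings.
rng = random.Random(0)
dim, hidden_dim = 8, 4
w1 = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden_dim)]
b1 = [0.0] * hidden_dim
w2 = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
b2 = 0.0

batch = [l2_normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(3)]
scores = [toy_mlp_score(e, w1, b1, w2, b2) for e in batch]
print(scores)  # three values, each strictly between 0 and 1
```

The point of the sketch is the shape of the computation: a fixed-size embedding goes in, a single probability comes out, and no image decoding happens at this stage.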

## Prerequisites

Before using `ImageNSFWFilterStage`, make sure the model and input requirements below are met.

### Model Setup

The NSFW detector model weights are automatically downloaded from the LAION repository on first use. The stage will:

1. Download the CLIP-based NSFW detector model (~20 MB) to the specified `model_dir`
2. Cache the model for subsequent runs
3. Load the model onto the GPU (or the CPU if no GPU is available)

**First-time setup:** The initial model download is quick (under 1 minute on most connections). Subsequent runs will use the cached model.
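The download-then-cache behavior follows a common pattern, sketched below. This is not the stage's actual code: `ensure_weights`, `fake_fetch`, and the file name `nsfw_weights.bin` are hypothetical, and the real download is replaced by a stand-in that writes placeholder bytes.

```python
from pathlib import Path
import tempfile


def ensure_weights(model_dir, filename, fetch):
    """Return the local weights path, calling fetch(path) only if the file is missing."""
    target = Path(model_dir) / filename
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        fetch(target)
    return target


calls = []


def fake_fetch(path):
    # Stand-in for the real download; writes placeholder bytes.
    calls.append(path)
    path.write_bytes(b"placeholder weights")


with tempfile.TemporaryDirectory() as tmp:
    first = ensure_weights(tmp, "nsfw_weights.bin", fake_fetch)   # triggers fetch
    second = ensure_weights(tmp, "nsfw_weights.bin", fake_fetch)  # served from cache
    print(first == second, len(calls))  # True 1
```

In practice this means pointing `model_dir` at a persistent location avoids repeating the download on every run.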

### Required Input

* **CLIP Embeddings:** Images must have embeddings already generated by `ImageEmbeddingStage`
* **Embedding Format:** CLIP ViT-L/14 768-dimensional vectors stored in `ImageObject.embedding`
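A quick sanity check on the input can catch a missing or mismatched embedding stage early. The helper below is a hypothetical sketch, not part of NeMo Curator: it assumes the embedding is available as a plain sequence of numbers and tests for the 768 dimensions and unit L2 norm described in this guide.

```python
import math

EXPECTED_DIM = 768  # CLIP ViT-L/14 embedding size, per the requirements above


def check_embedding(embedding, tol=1e-3):
    """Return a list of problems found; an empty list means the vector looks usable."""
    if embedding is None:
        return ["embedding is missing (was ImageEmbeddingStage run first?)"]
    problems = []
    if len(embedding) != EXPECTED_DIM:
        problems.append(f"expected {EXPECTED_DIM} dimensions, got {len(embedding)}")
    norm = math.sqrt(sum(float(x) ** 2 for x in embedding))
    if abs(norm - 1.0) > tol:
        problems.append(f"expected unit L2 norm, got {norm:.4f}")
    return problems


unit_vector = [1.0 / math.sqrt(EXPECTED_DIM)] * EXPECTED_DIM
print(check_embedding(unit_vector))  # []
print(check_embedding([0.5] * 512))  # two problems: wrong size, wrong norm
print(check_embedding(None))         # one problem: embedding is missing
```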

## Usage

<Tabs>
  <Tab title="Python">
    ```python
    from nemo_curator.pipeline import Pipeline
    from nemo_curator.stages.file_partitioning import FilePartitioningStage
    from nemo_curator.stages.image.io.image_reader import ImageReaderStage
    from nemo_curator.stages.image.embedders.clip_embedder import ImageEmbeddingStage
    from nemo_curator.stages.image.filters.nsfw_filter import ImageNSFWFilterStage

    # Create pipeline
    pipeline = Pipeline(name="nsfw_filtering", description="Filter NSFW content from images")

    # Stage 1: Partition tar files
    pipeline.add_stage(FilePartitioningStage(
        file_paths="/path/to/tar_dataset",
        files_per_partition=1,
        file_extensions=[".tar"],
    ))

    # Stage 2: Read images
    pipeline.add_stage(ImageReaderStage(
        batch_size=100,
        num_gpus_per_worker=0.25,
    ))

    # Stage 3: Generate CLIP embeddings
    pipeline.add_stage(ImageEmbeddingStage(
        model_dir="/path/to/models",
        model_inference_batch_size=32,
        num_gpus_per_worker=0.25,
    ))

    # Stage 4: Apply NSFW filtering
    pipeline.add_stage(ImageNSFWFilterStage(
        model_dir="/path/to/models",
        score_threshold=0.5,
        model_inference_batch_size=32,
        num_gpus_per_worker=0.25,
    ))

    # Run the pipeline (uses XennaExecutor by default)
    results = pipeline.run()
    ```
  </Tab>
</Tabs>

## Parameters

| Parameter                    | Type  | Default | Description                                                              |
| ---------------------------- | ----- | ------- | ------------------------------------------------------------------------ |
| `model_dir`                  | str   | None    | Path to directory containing model weights                               |
| `score_threshold`            | float | 0.5     | NSFW score threshold for filtering (filters out images above this value) |
| `model_inference_batch_size` | int   | 32      | Batch size for model inference                                           |
| `num_gpus_per_worker`        | float | 0.25    | GPU allocation per worker (0.25 = 1/4 GPU)                               |
| `verbose`                    | bool  | False   | Enable verbose logging for debugging                                     |
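The effect of `score_threshold` can be sketched in a few lines. The function below is illustrative rather than the stage's implementation; `apply_threshold` and the sample file names are made up, and the treatment of a score exactly equal to the threshold is an assumption, so check the stage's source if that boundary matters to you.

```python
def apply_threshold(scored_images, score_threshold=0.5):
    """Split (image_id, nsfw_score) pairs into kept and removed lists."""
    kept, removed = [], []
    for image_id, score in scored_images:
        # Assumption: a score exactly equal to the threshold is kept.
        (removed if score > score_threshold else kept).append(image_id)
    return kept, removed


sample = [("a.jpg", 0.02), ("b.jpg", 0.47), ("c.jpg", 0.91)]
kept, removed = apply_threshold(sample, score_threshold=0.5)
print(kept)     # ['a.jpg', 'b.jpg']
print(removed)  # ['c.jpg']
```

A lower threshold removes more images (stricter filtering); a higher threshold keeps more.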

## Performance Notes

* The small model processes pre-computed embeddings efficiently on the GPU.
* Increase `model_inference_batch_size` for higher throughput if GPU memory allows.
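As a rough guide to why large batches are usually affordable here, the sketch below estimates the memory taken by one batch of embeddings and shows how a list splits into batches. It assumes float32 values and counts only the input embeddings, not model weights or activations; both helper names are hypothetical.

```python
def embedding_batch_bytes(batch_size, dim=768, bytes_per_value=4):
    """Memory for one batch of embeddings, assuming float32 values."""
    return batch_size * dim * bytes_per_value


def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


for size in (32, 256, 2048):
    print(size, embedding_batch_bytes(size) / 1024, "KiB")
# 32 -> 96.0 KiB, 256 -> 768.0 KiB, 2048 -> 6144.0 KiB

print([len(b) for b in iter_batches(list(range(70)), 32)])  # [32, 32, 6]
```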

## Best Practices

* Use CLIP ViT-L/14 embeddings generated by `ImageEmbeddingStage` for best results.
* Run the NSFW filter after embedding generation in the same pipeline to avoid extra I/O.
* Always place `ImageEmbeddingStage` before this stage: the filter requires pre-computed embeddings and cannot extract them from raw images.
* Review a sample of scores to calibrate thresholds for your use case.
* Adjust `model_inference_batch_size` based on available GPU memory.
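One way to calibrate a threshold is to score a sample and see how much of it each candidate value would remove. The sketch below uses made-up scores and a hypothetical helper, `removal_rates`; how you collect `nsfw_score` values from your own pipeline output depends on your setup.

```python
def removal_rates(scores, thresholds):
    """Fraction of scores strictly above each candidate threshold."""
    total = len(scores)
    return {t: sum(s > t for s in scores) / total for t in thresholds}


# Made-up scores for illustration; substitute a sample from your own data.
sample_scores = [0.01, 0.03, 0.10, 0.35, 0.48, 0.52, 0.77, 0.95]
rates = removal_rates(sample_scores, [0.3, 0.5, 0.8])
for threshold, rate in rates.items():
    print(f"threshold={threshold:.1f} removes {rate:.1%} of the sample")
# threshold=0.3 removes 62.5% of the sample
# threshold=0.5 removes 37.5% of the sample
# threshold=0.8 removes 12.5% of the sample
```

Pairing these rates with a manual look at images near each threshold gives a clearer picture than the numbers alone.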

## Resources

* [Image Curation Tutorial](https://github.com/NVIDIA-NeMo/Curator/blob/main/tutorials/image/getting-started/image_curation_example.py)
* [Image Deduplication Example](https://github.com/NVIDIA-NeMo/Curator/blob/main/tutorials/image/getting-started/image_dedup_example.py)