---
description: >-
  NSFW filter for detecting inappropriate content in images using CLIP embeddings and MLP architecture
categories:
  - how-to-guides
tags:
  - nsfw
  - filtering
  - clip
  - safety
  - content-filtering
personas:
  - data-scientist-focused
  - mle-focused
difficulty: intermediate
content_type: how-to
modality: image-only
---

# NSFW Filter

The NSFW (Not Safe For Work) Filter estimates the likelihood that an image contains explicit or unsafe content. It outputs a probability score from 0 (safe) to 1 (NSFW), helping you filter or flag images in your datasets.

## Model Details

* **Architecture:** MLP trained on CLIP ViT-L/14 image embeddings
* **Source:** [CLIP-based NSFW Detector](https://github.com/LAION-AI/CLIP-based-NSFW-Detector)
* **Output Field:** `nsfw_score`
* **Score Range:** 0–1 (higher scores indicate a higher likelihood of NSFW content)
* **Embeddings:** Requires CLIP ViT-L/14 (see [Image embeddings](/curate-images/process-data/embeddings))

## How It Works

The filter takes pre-computed normalized image embeddings from a previous pipeline stage and predicts the probability of NSFW content. The lightweight model processes batches of embeddings efficiently on the GPU.

## Prerequisites

Before using the `ImageNSFWFilterStage`, ensure the following are in place.

### Model Setup

The NSFW detector model weights are automatically downloaded from the LAION repository on first use. The stage will:

1. Download the CLIP-based NSFW detector model (~20 MB) to the specified `model_dir`
2. Cache the model for subsequent runs
3. Load the model onto GPU (or CPU if GPU unavailable)

**First-time setup:** The initial model download is quick (under 1 minute on most connections). Subsequent runs will use the cached model.

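
The download-then-cache behavior described under Model Setup can be pictured with a minimal sketch. This is not the stage's actual implementation: `ensure_weights`, the `fetch` callback, and the file name `nsfw_mlp.bin` are hypothetical names introduced only for illustration, and the real stage performs the download for you.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def ensure_weights(model_dir: str, filename: str, fetch: Callable[[Path], None]) -> Path:
    """Return the path to cached weights, calling `fetch` only if the file is missing.

    Hypothetical helper: `fetch` stands in for whatever download logic is used.
    """
    target = Path(model_dir) / filename
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        fetch(target)  # first run: place the weights in model_dir
    return target  # later runs: reuse the file already on disk


# Demonstration with a fake "download" that only writes placeholder bytes.
calls: List[Path] = []


def fake_fetch(path: Path) -> None:
    calls.append(path)
    path.write_bytes(b"placeholder weights")


with tempfile.TemporaryDirectory() as tmp:
    first = ensure_weights(tmp, "nsfw_mlp.bin", fake_fetch)   # triggers fake_fetch
    second = ensure_weights(tmp, "nsfw_mlp.bin", fake_fetch)  # cache hit, no fetch
    download_count = len(calls)
    same_path = first == second
```

Pointing `model_dir` at a persistent location (rather than a temporary directory as in this demo) is what lets later runs skip the download.
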
### Required Input

* **CLIP Embeddings:** Images must have embeddings already generated by `ImageEmbeddingStage`
* **Embedding Format:** CLIP ViT-L/14 768-dimensional vectors stored in `ImageObject.embedding`

## Usage

```python
from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.file_partitioning import FilePartitioningStage
from nemo_curator.stages.image.io.image_reader import ImageReaderStage
from nemo_curator.stages.image.embedders.clip_embedder import ImageEmbeddingStage
from nemo_curator.stages.image.filters.nsfw_filter import ImageNSFWFilterStage

# Create pipeline
pipeline = Pipeline(name="nsfw_filtering", description="Filter NSFW content from images")

# Stage 1: Partition tar files
pipeline.add_stage(FilePartitioningStage(
    file_paths="/path/to/tar_dataset",
    files_per_partition=1,
    file_extensions=[".tar"],
))

# Stage 2: Read images
pipeline.add_stage(ImageReaderStage(
    batch_size=100,
    num_gpus_per_worker=0.25,
))

# Stage 3: Generate CLIP embeddings
pipeline.add_stage(ImageEmbeddingStage(
    model_dir="/path/to/models",
    model_inference_batch_size=32,
    num_gpus_per_worker=0.25,
))

# Stage 4: Apply NSFW filtering
pipeline.add_stage(ImageNSFWFilterStage(
    model_dir="/path/to/models",
    score_threshold=0.5,
    model_inference_batch_size=32,
    num_gpus_per_worker=0.25,
))

# Run the pipeline (uses XennaExecutor by default)
results = pipeline.run()
```

## Parameters

| Parameter                    | Type  | Default | Description                                                              |
| ---------------------------- | ----- | ------- | ------------------------------------------------------------------------ |
| `model_dir`                  | str   | None    | Path to directory containing model weights                               |
| `score_threshold`            | float | 0.5     | NSFW score threshold for filtering (filters out images above this value) |
| `model_inference_batch_size` | int   | 32      | Batch size for model inference                                           |
| `num_gpus_per_worker`        | float | 0.25    | GPU allocation per worker (0.25 = 1/4 GPU)                               |
| `verbose`                    | bool  | False   | Enable verbose logging for debugging                                     |

## Performance Notes

* The small model processes pre-computed embeddings efficiently on the GPU.
* Increase batch size for faster throughput if memory allows.

## Best Practices

* Use CLIP ViT-L/14 embeddings generated by `ImageEmbeddingStage` for best results.
* Run the NSFW filter after embedding generation in the same pipeline to avoid extra I/O.
* The filter requires pre-computed embeddings and cannot extract embeddings from raw images.
* Review a sample of scores to calibrate thresholds for your use case.
* Adjust `model_inference_batch_size` based on available GPU memory.

## Resources

* [Image Curation Tutorial](https://github.com/NVIDIA-NeMo/Curator/blob/main/tutorials/image/getting-started/image_curation_example.py)
* [Image Deduplication Example](https://github.com/NVIDIA-NeMo/Curator/blob/main/tutorials/image/getting-started/image_dedup_example.py)
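
As a supplement to the Best Practices advice on calibrating thresholds, the sketch below shows one way to see how much of a sample each candidate `score_threshold` would retain. It assumes you have already collected `nsfw_score` values into a plain list (how you extract them from pipeline results is not shown here), and it assumes an image is kept when its score is at or below the threshold; the exact handling of scores equal to the threshold in the stage is not confirmed by this page. The function name `keep_fraction_by_threshold` and the sample scores are made up for illustration.

```python
from typing import Dict, Sequence


def keep_fraction_by_threshold(
    scores: Sequence[float], thresholds: Sequence[float]
) -> Dict[float, float]:
    """For each candidate threshold, return the fraction of images that would be kept.

    Assumption: an image is kept when score <= threshold, i.e. only images
    scoring above the threshold are filtered out.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    total = len(scores)
    return {t: sum(s <= t for s in scores) / total for t in thresholds}


# Made-up scores for illustration only; these are not real model outputs.
sample_scores = [0.02, 0.10, 0.35, 0.48, 0.51, 0.77, 0.93, 0.99]

report = keep_fraction_by_threshold(sample_scores, [0.3, 0.5, 0.8])
for threshold, kept in report.items():
    print(f"threshold={threshold:.1f} keeps {kept:.0%} of the sample")
```

A lower threshold filters more aggressively and keeps less of the dataset, so it is worth inspecting images near the chosen cutoff before committing to a value.
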