The Aesthetic Filter predicts the subjective visual quality of images using a model trained on human aesthetic preferences. It outputs an aesthetic score (higher values indicate more aesthetically pleasing images), making it useful for filtering or ranking images in generative pipelines and dataset curation.
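As a rough illustration of filtering and ranking by score, the sketch below uses a hypothetical `ScoredImage` record with made-up scores and an assumed cutoff of 0.5. None of these names or values come from the library; they only show the idea.

```python
from dataclasses import dataclass


@dataclass
class ScoredImage:
    # Hypothetical record for illustration; not the library's own image type.
    path: str
    aesthetic_score: float


def filter_by_score(images, threshold):
    """Keep only images whose score is at or above the threshold."""
    return [img for img in images if img.aesthetic_score >= threshold]


def rank_by_score(images):
    """Order images from highest to lowest aesthetic score."""
    return sorted(images, key=lambda img: img.aesthetic_score, reverse=True)


# Made-up scores, purely for demonstration.
images = [
    ScoredImage("a.jpg", 0.82),
    ScoredImage("b.jpg", 0.31),
    ScoredImage("c.jpg", 0.57),
]

kept = filter_by_score(images, threshold=0.5)  # assumed cutoff; tune per dataset
ranked = rank_by_score(images)

print([img.path for img in kept])
print([img.path for img in ranked])
```

Filtering suits dataset curation, where low-scoring images are dropped outright. Ranking suits generative pipelines, where the best few candidates are kept.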
The filter takes pre-computed CLIP ViT-L/14 image embeddings from a previous pipeline stage and predicts an aesthetic score for each image, reported as aesthetic_score. The lightweight model processes batches of embeddings efficiently on the GPU.
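The embedding-to-score step can be sketched in plain Python. The predictor's architecture is not described here, so this sketch assumes a simple linear head over L2-normalized embeddings. The weights, bias, and 4-dimensional vectors are toy values (real CLIP ViT-L/14 embeddings have 768 dimensions), and every function name is hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def score_batch(batch, weights, bias):
    """Apply an assumed linear head to each normalized embedding in a batch."""
    return [
        sum(w * x for w, x in zip(weights, l2_normalize(emb))) + bias
        for emb in batch
    ]


def score_embeddings(embeddings, weights, bias, batch_size):
    """Score all embeddings, processing them in chunks of batch_size."""
    scores = []
    for start in range(0, len(embeddings), batch_size):
        batch = embeddings[start:start + batch_size]
        scores.extend(score_batch(batch, weights, bias))
    return scores


# Toy 4-dimensional embeddings and head parameters, for illustration only.
embeddings = [
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
weights = [1.0, 0.0, -1.0, 0.0]
bias = 0.5

scores = score_embeddings(embeddings, weights, bias, batch_size=2)
print(scores)
```

Because the head works on embeddings rather than pixels, each batch is a small matrix operation. This is why the scoring step stays cheap compared with computing the embeddings themselves.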
Before using the ImageAestheticFilterStage, ensure you have CLIP ViT-L/14 image embeddings computed by an earlier pipeline stage.
The aesthetic predictor model weights are automatically downloaded from HuggingFace to model_dir on first use.
First-time setup: The initial model download is quick (under 1 minute on most connections). Subsequent runs use the cached model.
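The download-once, reuse-afterwards behavior can be illustrated with a minimal sketch. It is not the stage's actual implementation. The `ensure_weights` helper, the `aesthetic.bin` filename, and the `fake_download` stand-in are all hypothetical, and no network access is involved.

```python
import tempfile
from pathlib import Path


def ensure_weights(model_dir, filename, download):
    """Return the cached weights path, calling download() only if it is missing."""
    target = Path(model_dir) / filename
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(download())
    return target


calls = []


def fake_download():
    # Stand-in for a real network fetch; records each invocation.
    calls.append(1)
    return b"weights"


with tempfile.TemporaryDirectory() as model_dir:
    first = ensure_weights(model_dir, "aesthetic.bin", fake_download)
    second = ensure_weights(model_dir, "aesthetic.bin", fake_download)
    download_count = len(calls)   # only the first call should download
    same_path = first == second   # both calls resolve to the same cached file

print(download_count, same_path)
```

Pointing model_dir at a persistent location rather than a temporary one is what lets later runs skip the download.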
The stage reads embeddings from ImageObject.embedding, which ImageEmbeddingStage populates. Use ImageEmbeddingStage for best results. Adjust model_inference_batch_size based on available GPU memory.