Image Embedding
Generate image embeddings for large-scale datasets using NeMo Curator’s built-in embedders. Image embeddings enable downstream tasks such as classification, filtering, duplicate removal, and similarity search.
How It Works
Image embedding in NeMo Curator typically follows these steps:
- Load your dataset using FilePartitioningStage and ImageReaderStage
- Configure the ImageEmbeddingStage with CLIP model settings
- Apply the embedding stage to generate CLIP embeddings for each image
- Continue with downstream processing stages (filtering, classification, etc.)
The embedding stage runs as a regular stage in NeMo Curator’s pipeline architecture, so it can be combined with the other stages listed above.
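The flow of the steps above can be sketched in plain Python. This is only a toy illustration of the data flow and does not use the NeMo Curator API: `partition_files`, `read_images`, `embed_images`, `keep_unique`, and `run_pipeline` are hypothetical stand-ins for the partitioning, reading, embedding, and downstream stages, and the hash-based vector is a placeholder for a real CLIP embedding.

```python
import hashlib
import math
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """One image moving through the toy pipeline (hypothetical type)."""
    path: str
    pixels: bytes = b""
    embedding: list = field(default_factory=list)


def partition_files(paths, files_per_partition):
    """Stand-in for file partitioning: split paths into fixed-size groups."""
    return [
        paths[i:i + files_per_partition]
        for i in range(0, len(paths), files_per_partition)
    ]


def read_images(partition):
    """Stand-in for image reading: fake the pixel data from the path."""
    return [ImageRecord(path=p, pixels=p.encode("utf-8")) for p in partition]


def embed_images(records, dim=8):
    """Stand-in for CLIP embedding: a deterministic, unit-length vector
    derived from a SHA-256 digest. This is NOT a real image model."""
    for rec in records:
        digest = hashlib.sha256(rec.pixels).digest()
        raw = [b - 127.5 for b in digest[:dim]]  # never all zeros
        norm = math.sqrt(sum(x * x for x in raw))
        rec.embedding = [x / norm for x in raw]
    return records


def keep_unique(records, threshold=0.999):
    """Example downstream step: drop records whose cosine similarity to an
    already kept record reaches the threshold (embeddings are unit length,
    so the dot product is the cosine similarity)."""
    kept = []
    for rec in records:
        is_duplicate = any(
            sum(a * b for a, b in zip(rec.embedding, k.embedding)) >= threshold
            for k in kept
        )
        if not is_duplicate:
            kept.append(rec)
    return kept


def run_pipeline(paths, files_per_partition=2):
    """Chain the stand-in stages: partition -> read -> embed -> deduplicate."""
    embedded = []
    for partition in partition_files(paths, files_per_partition):
        embedded.extend(embed_images(read_images(partition)))
    return keep_unique(embedded)


if __name__ == "__main__":
    for rec in run_pipeline(["cat.jpg", "dog.jpg", "cat.jpg"]):
        print(rec.path, [round(x, 3) for x in rec.embedding])
```

In a real pipeline, the stand-in functions would be replaced by the FilePartitioningStage, ImageReaderStage, and ImageEmbeddingStage named above, configured as described in the NeMo Curator documentation.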