Get Started#
NeMo Curator provides tools for curating large-scale text-image pair datasets used to train generative image models.
Install NeMo Curator#
To install the image curation modules of NeMo Curator, ensure you meet the following requirements:
Python 3.10 or higher
  * packaging >= 22.0
Ubuntu 22.04/20.04
NVIDIA GPU
  * Volta™ or higher (compute capability 7.0+)
  * CUDA 12 (or above)
Note: While some of the text-based NeMo Curator modules do not require a GPU, all image curation modules require a GPU.
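Before installing, it can help to confirm that the machine meets the requirements listed above. The following is a minimal sketch, not part of NeMo Curator: the helper names (`parse_version`, `meets_minimum`, `gpu_compute_capabilities`, `check_environment`) are hypothetical. It assumes that `nvidia-smi` is on the PATH and that the installed driver supports the `compute_cap` query field. It checks the Python version, the `packaging` version, and the GPU compute capability. It does not check the Ubuntu release or the CUDA version.

```python
import shutil
import subprocess
import sys
from importlib import metadata
from itertools import takewhile


def parse_version(text):
    """Convert '22.0' or '3.10.4' into a tuple of ints, e.g. (22, 0).

    Parsing stops at the first non-numeric suffix, so '24.1rc1' gives (24, 1).
    """
    parts = []
    for piece in text.strip().split("."):
        digits = "".join(takewhile(lambda c: "0" <= c <= "9", piece))
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:
            break
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True if version string `found` is at least `minimum`."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '22' and '22.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def gpu_compute_capabilities():
    """Return one compute capability string per visible GPU, or None.

    Assumption: nvidia-smi is on PATH and supports the compute_cap field.
    """
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=30,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        return None
    return [line.strip() for line in out.splitlines() if line.strip()]


def check_environment():
    """Map each requirement to True, False, or None (could not be determined)."""
    report = {}
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    report["Python >= 3.10"] = meets_minimum(python_version, "3.10")
    try:
        found = metadata.version("packaging")
        report["packaging >= 22.0"] = meets_minimum(found, "22.0")
    except metadata.PackageNotFoundError:
        report["packaging >= 22.0"] = False
    caps = gpu_compute_capabilities()
    report["GPU compute capability >= 7.0"] = (
        None if not caps else all(meets_minimum(c, "7.0") for c in caps)
    )
    return report


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        status = {True: "ok", False: "NOT MET", None: "unknown"}[ok]
        print(f"{requirement}: {status}")
```

A result of "unknown" for the GPU line means the script could not query the driver. It does not necessarily mean that no suitable GPU is present.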
You can install NeMo Curator in three ways:
PyPI
Source
NeMo Framework Container
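PyPI#
NeMo Curator is published on PyPI as nemo-curator. Install it with the image extra (the quotes keep shells such as zsh from interpreting the square brackets):
pip install "nemo-curator[image]"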
Source#
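NeMo Curator’s source code is hosted on GitHub at https://github.com/NVIDIA/NeMo-Curator. Clone the repository, then install from the local checkout with the image extra:
git clone https://github.com/NVIDIA/NeMo-Curator.git
pip install "./NeMo-Curator[image]"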
NeMo Framework Container#
NeMo Curator comes preinstalled in the NeMo Framework container. You can find a list of all the NeMo Framework container tags here.
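Whichever route you choose, a quick check can confirm that the package is visible to the Python interpreter you plan to use. The sketch below uses only the standard library. The helper name `installed_version` is hypothetical, and it assumes the installed distribution is registered under the name `nemo-curator`, as in the pip command above. It reports only whether the distribution is installed, not whether the `image` extra's dependencies are present.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of `distribution`, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution name matches the PyPI name used by pip.
    version = installed_version("nemo-curator")
    if version is None:
        print("nemo-curator is not installed in this environment")
    else:
        print(f"nemo-curator {version} is installed")
```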
Use NeMo Curator#
NeMo Curator can be run locally or on a variety of compute platforms (Slurm, Kubernetes, and more).
To get started using the image modules in NeMo Curator, we recommend you check out the following resources: