Get Started#

NeMo Curator provides many tools for curating large scale text-image pair datasets for training generative image models.

Install NeMo Curator#

To install the image curation modules of NeMo Curator, ensure you meet the following requirements:

  • Python 3.10 or higher * packaging >= 22.0

  • Ubuntu 22.04/20.04

  • NVIDIA GPU * Volta™ or higher (compute capability 7.0+) * CUDA 12 (or above)

Note: While some of the text-based NeMo Curator modules do not require a GPU, all image curation modules require a GPU.

You can get NeMo Curator in 3 ways.

  1. PyPi

  2. Source

  3. NeMo Framework Container

PyPi#

NeMo Curator’s PyPi page can be found here.

pip install nemo-curator[image]

Source#

NeMo Curator’s GitHub can be found here.

git clone https://github.com/NVIDIA/NeMo-Curator.git
pip install ./NeMo-Curator[image]

NeMo Framework Container#

NeMo Curator comes preinstalled in the NeMo Framework container. You can find a list of all the NeMo Framework container tags here.

Use NeMo Curator#

NeMo Curator can be run locally, or on a variety of compute platforms (Slurm, k8s, and more).

To get started using the image modules in NeMo Curator, we recommend you check out the following resources: