Important

You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.

NeMo-Run documentation#

NeMo-Run is a tool that streamlines the configuration, execution, and management of machine learning experiments across computing environments. It has three core responsibilities:

  1. Configuration

  2. Execution

  3. Management

Follow each link to learn more. This is also the typical order in which NeMo Run users set up and launch experiments.

Installation#

To install the project, use the following command:

pip install git+https://github.com/NVIDIA/NeMo-Run.git

SkyPilot support is available through optional extras:

pip install git+https://github.com/NVIDIA/NeMo-Run.git[skypilot] installs SkyPilot with Kubernetes support.

pip install git+https://github.com/NVIDIA/NeMo-Run.git[skypilot-all] installs SkyPilot with support for all clouds.

You can also install SkyPilot manually by following the instructions at https://skypilot.readthedocs.io/en/latest/getting-started/installation.html

Make sure you have pip installed and configured properly.

Tutorials#

The hello_world tutorial series introduces NeMo Run through a simple example. The tutorials cover:

  • Configuring Python functions using Partial and Config classes.

  • Executing configured functions locally and on remote clusters.

  • Visualizing configurations with graphviz.

  • Creating and managing experiments using run.Experiment.
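The configure-then-execute idea behind the first two bullets can be illustrated without NeMo Run itself. The sketch below is only an analogy built on Python's standard functools.partial, not NeMo Run's API: it binds arguments to a function ahead of time, overrides one of them, and only then makes the call. The train function and its parameters are hypothetical; the hello_world tutorials show the actual Partial and Config classes.

```python
from functools import partial


def train(model_name: str, learning_rate: float, steps: int) -> str:
    """Hypothetical task: stands in for any function you want to configure."""
    return f"trained {model_name} for {steps} steps at lr={learning_rate}"


# Step 1: configure. Arguments are bound, but nothing runs yet.
configured = partial(train, model_name="tiny", learning_rate=1e-3, steps=10)

# The bound arguments can be inspected or overridden before execution.
overridden = partial(configured, steps=100)

# Step 2: execute. Only now is the function actually called.
result = overridden()
print(result)
```

Keeping configuration separate from execution means the same configured call can be inspected, adjusted, and handed to a different runner without touching the function itself.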

You can find the tutorial series below:

  1. Part 1

  2. Part 2

  3. Part 3