Connecting to Weights & Biases

Tutorial for sending training metrics to Weights & Biases

Weights & Biases (W&B) is a widely used tool for charting metrics from machine learning training jobs, such as loss curves, resource usage, and accuracy scores. These charts make it easy to see how a model learns over time and to compare multiple runs to find the model that best fits your goal.

W&B provides a simple Python API for sending training information to its servers. To use the API, you need to create an access token on W&B, install the Python package, and tell W&B which values to track.

Setup

First, create a W&B access token (API key). Navigate to https://wandb.ai and, if you do not already have an account, click Sign Up in the top right to create a free one. Once logged in, go to https://wandb.ai/settings and scroll to the bottom to create a new API key. Every job that uses W&B needs this API key.

Python Package Installation

The container used to run your job on DGX Cloud Lepton needs the W&B Python package installed. Some NGC images like the NeMo Framework container (nvcr.io/nvidia/nemo) already have the package installed, while others like the PyTorch image (nvcr.io/nvidia/pytorch) do not. If your container does not have W&B installed already, install it with pip (for example, pip install wandb) as part of an entrypoint on container start or in a running container.

You can check whether your container already has W&B installed by filtering the list of installed Python packages, for example with pip list | grep wandb. If the command returns nothing, W&B is not installed.
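The same check can also be done from Python, which is handy inside an entrypoint script. The following is a minimal sketch, not part of the W&B API: the helper names is_package_installed and pip_install_command are hypothetical, and the install command is only printed, not executed.

```python
import importlib.util
import sys


def is_package_installed(name):
    """Return True if the top-level package `name` is importable by this interpreter."""
    return importlib.util.find_spec(name) is not None


def pip_install_command(package):
    """Build a pip command that targets the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    if is_package_installed("wandb"):
        print("wandb is already installed")
    else:
        # Printed rather than executed so the check itself has no side effects.
        print("wandb is missing; install it with:", " ".join(pip_install_command("wandb")))
```

Using sys.executable keeps the install tied to the Python interpreter that will run the job, which matters in containers that ship more than one Python.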

Example W&B Job

The following is a trivial example of a job that sends metrics to W&B using the API. The key points are:

Import the wandb module
Initialize the wandb project with wandb.init() and specify hyperparameters and the project name
Tell W&B which values to send with wandb.log()
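A minimal sketch of these three steps might look like the following. The metric values are simulated rather than produced by real training, the hyperparameters in config are placeholders, and the simulated_metrics helper is hypothetical. The script only contacts W&B when WANDB_API_KEY is set, so it can be tried safely without credentials.

```python
import os
import random


def simulated_metrics(epochs, seed=0):
    """Yield fake loss/accuracy values that improve over time (no real training)."""
    rng = random.Random(seed)
    for epoch in range(1, epochs + 1):
        noise = rng.random() / (10 * epoch)
        loss = 2.0 ** -epoch + noise
        yield {"epoch": epoch, "loss": loss, "acc": 1.0 - loss}


def main():
    # 1. Import the wandb module (done here so the helper above works without it).
    import wandb

    # 2. Initialize the project and record hyperparameters (placeholder values).
    config = {"learning_rate": 0.02, "epochs": 10}
    wandb.init(project="my-awesome-project", config=config)

    # 3. Tell W&B which values to send, once per epoch.
    for metrics in simulated_metrics(config["epochs"]):
        wandb.log(metrics)

    # Mark the run as finished so all buffered data is uploaded.
    wandb.finish()


if __name__ == "__main__":
    if os.environ.get("WANDB_API_KEY"):
        main()
    else:
        print("WANDB_API_KEY is not set; skipping the W&B run.")
```

In a real job, the loop body would be the training step and the logged dictionary would hold the actual loss and accuracy values.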

To authenticate with W&B, set the WANDB_API_KEY environment variable to the API key you created earlier, for example with export WANDB_API_KEY=<your-api-key> in the shell that launches the job.

You can also set this environment variable directly in the platform when defining the job.
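Because a missing key typically only surfaces once the job tries to log in, it can help to fail fast at startup. The sketch below is one way to do that; get_wandb_api_key is a hypothetical helper, not a W&B function, and it only checks that the variable is present and non-empty.

```python
import os


def get_wandb_api_key(env=None):
    """Return the W&B API key from the environment, or raise a clear error."""
    # Accept an explicit mapping so the check is easy to test; default to os.environ.
    env = os.environ if env is None else env
    key = env.get("WANDB_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "WANDB_API_KEY is not set. Create an API key at "
            "https://wandb.ai/settings and set it for this job."
        )
    return key
```

Calling get_wandb_api_key() at the top of a training script stops the job before any compute is spent if the variable was not set in the job definition.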

After running the example code, you should see a new project called my-awesome-project in your W&B account.

For your own W&B experiments, setting the API key this way automates the login step, so your code connects to your account without any interactive prompt.

Integration with NVIDIA NeMo Framework

NVIDIA NeMo Framework supports W&B natively. To use W&B with NeMo Framework, set your W&B key as an environment variable named WANDB_API_KEY. Refer to the documentation on integrating W&B for your specific NeMo Framework job.
