Step #3: Set up Weights and Biases

Weights and Biases is an MLOps platform for visualizing and optimizing ML models through experiment tracking, dataset versioning, and model management. This step is optional but recommended, because the lab uses Weights and Biases to visualize model accuracy and training speed.

  1. Create a free Weights and Biases account at https://wandb.ai/

  2. Log in to your account

  3. Go to Settings and copy your Weights and Biases API key

  4. In the job console, run wandb login <api key> before the torchrun commands to connect the job to your Weights and Biases account
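
As an alternative to typing the key on the command line, the login can also be done from Python. The following is a minimal sketch, not part of the lab's scripts: it assumes the key has been exported in the WANDB_API_KEY environment variable, and the helper names resolve_api_key and login_from_env are hypothetical.

```python
import os


def resolve_api_key(env=None):
    """Return the API key from the environment, or None if it is not set."""
    env = os.environ if env is None else env
    key = env.get("WANDB_API_KEY", "").strip()
    return key or None


def login_from_env(env=None):
    """Log in with the key from WANDB_API_KEY; return False if no key is set."""
    key = resolve_api_key(env)
    if key is None:
        return False
    # Imported lazily so resolve_api_key works even where wandb is not installed.
    import wandb
    return bool(wandb.login(key=key))
```

Calling login_from_env() at the start of a training script has the same effect as step 4, provided the environment variable is set in the job console first.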

When you train your model, you can view the training logs and graphs at https://wandb.ai/, which makes it easy to compare different training runs. This lab uses Weights and Biases to compare training loss, time to training completion, and model accuracy. Weights and Biases can also visualize real-time GPU usage.
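
To illustrate how such values could reach the dashboard, here is a minimal sketch of a logging helper. It is an assumption about how a training script might be instrumented, not the lab's actual code: the metric names and the helpers build_metrics and log_step are hypothetical, and run stands for the object returned by wandb.init().

```python
import time


def build_metrics(loss, accuracy, start_time, now=None):
    """Collect the quantities the lab compares into one dictionary."""
    now = time.time() if now is None else now
    return {
        "train/loss": float(loss),         # hypothetical metric name
        "eval/accuracy": float(accuracy),  # hypothetical metric name
        "elapsed_seconds": now - start_time,
    }


def log_step(run, step, loss, accuracy, start_time):
    """Send one step's metrics to a run object (anything with a log method)."""
    metrics = build_metrics(loss, accuracy, start_time)
    run.log(metrics, step=step)
    return metrics
```

In a training script, run would typically come from run = wandb.init(project="my-lab-project"), where the project name is a placeholder, and log_step would be called once per logging interval.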

step-03-image-01.png

© Copyright 2022-2023, NVIDIA. Last updated on Jan 10, 2023.