TAO v5.5.0

Visualizing Training

Visualization is an important part of training a deep neural network (DNN). Training a DNN involves designing complex networks whose models have millions of parameters and iterating over large datasets. It is therefore important to understand how training progresses over time, to visualize the structure of the model graph, and to inspect the statistics of the model weights.

TAO 3.22.05 introduces integration of the following computer vision networks with TensorBoard.

  1. DetectNet-v2

  2. FasterRCNN

  3. Image Classification

  4. MultiTask Classification

  5. RetinaNet

  6. YOLOv4/YOLOv4-Tiny

  7. YOLOv3

  8. MaskRCNN

  9. UNet

  10. SSD

  11. DSSD

The supported networks can visualize the following:

  1. Scalar plots such as training loss, validation loss and learning rate

  2. Histograms for weights

  3. Images

To enable TensorBoard during training, add the following element to the training_config element of the configuration/experiment spec file.


visualizer{ enabled: true }

For details about the configurable elements of the visualizer, see the table for the training_config element in the "Creating an Experiment Spec File" section of the respective network's documentation.
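
Before a long training run, it can help to confirm that the spec file really contains the visualizer element. The sketch below is a crude textual check, not a parser for the spec format: `visualizer_enabled` is a hypothetical helper that is not part of TAO, and the sample spec is reduced to the one element described above, whereas a real training_config carries other network-specific fields.

```shell
# Hypothetical pre-flight check (not part of TAO): succeed if the spec file
# contains a visualizer block whose fields include "enabled: true".
visualizer_enabled() {
    # Strip all whitespace so formatting differences do not matter, then
    # look for "enabled:true" before the visualizer block's closing brace.
    tr -d '[:space:]' < "$1" | grep -q 'visualizer{[^}]*enabled:true'
}

# Sample spec fragment for illustration; a real training_config has more fields.
spec_file="$(mktemp)"
cat > "$spec_file" <<'EOF'
training_config {
  visualizer {
    enabled: true
  }
}
EOF

if visualizer_enabled "$spec_file"; then
    echo "visualizer enabled"
else
    echo "visualizer not enabled"
fi
rm -f "$spec_file"
```

Because the check is purely textual, comments or unusual formatting in a spec file could mislead it; treat a failure as a prompt to inspect the file by hand.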

Installing TensorBoard

  • Install the tensorboard package with pip.


    python -m pip install tensorboard


Invoking TensorBoard

  • Once tensorboard is installed in your Python environment, start a TensorBoard session by running the following command.


    tensorboard --logdir $RESULTS_DIR --host 0.0.0.0 --port 8080

    where $RESULTS_DIR is the directory in which the training experiment stored its events.out.tfevents.* files.
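
A quick way to confirm that a directory is a sensible value for $RESULTS_DIR is to count the event files under it before starting TensorBoard. The following is a minimal sketch: `count_event_files` is a hypothetical helper, not a TAO or TensorBoard command, and the temporary directory merely stands in for a real results directory.

```shell
# Hypothetical helper (not part of TAO or TensorBoard): print the number of
# events.out.tfevents.* files found anywhere under the given directory.
count_event_files() {
    # find prints one path per line; a missing directory yields no output.
    find "$1" -type f -name 'events.out.tfevents.*' 2>/dev/null | wc -l | tr -d '[:space:]'
}

# Example with a throwaway directory standing in for $RESULTS_DIR.
demo_dir="$(mktemp -d)"
touch "$demo_dir/events.out.tfevents.0000000000.demo"
echo "event files found: $(count_event_files "$demo_dir")"
rm -rf "$demo_dir"
```

A count of 0 usually means the path is wrong or training has not yet written any events, in which case TensorBoard would have nothing to display.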

Note

To compare results from multiple experiments side by side in a single TensorBoard session, pass several named directories with the --logdir_spec option, as shown in the command below.


tensorboard --logdir_spec experiment_name_1:${RESULTS_DIR_1},experiment_name_2:${RESULTS_DIR_2} --host 0.0.0.0 --port 8080
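
When more than two experiments are compared, assembling the comma-separated name:path list by hand becomes error-prone. The sketch below shows one possible way to build it: `build_logdir_spec` is a hypothetical helper, and the /results/... paths are placeholders rather than locations that TAO creates. The script prints the resulting command instead of running it.

```shell
# Hypothetical helper: join "name:path" arguments with commas, producing the
# value expected by --logdir_spec.
build_logdir_spec() {
    # In a subshell, set IFS so that "$*" joins the arguments with commas.
    ( IFS=,; printf '%s\n' "$*" )
}

# Placeholder paths for illustration only.
RESULTS_DIR_1="/results/experiment_1"
RESULTS_DIR_2="/results/experiment_2"
spec="$(build_logdir_spec "experiment_name_1:${RESULTS_DIR_1}" "experiment_name_2:${RESULTS_DIR_2}")"

# Print the command for review rather than launching TensorBoard.
echo "tensorboard --logdir_spec $spec --host 0.0.0.0 --port 8080"
```

Running the join in a subshell keeps the IFS change from leaking into the rest of the script.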

For more information about the TensorBoard client, refer to the official TensorBoard documentation, including its getting started guide and FAQ.

© Copyright 2024, NVIDIA. Last updated on Oct 15, 2024.