Deep Learning Profiler 20.02 Release Notes

Description

Deep Learning Profiler (DLProf) is a tool for profiling deep learning models. It helps data scientists understand and improve the performance of their models, either visually through TensorBoard or by analyzing text reports. It also helps them understand resource usage while models are trained.

Driver Requirements

Release 20.02 is based on NVIDIA CUDA 10.2.89, which requires NVIDIA Driver release 440.30.01. However, if you are running on Tesla (for example, T4 or any other Tesla board), you may use NVIDIA driver release 396, 384.111+, 410, 418.xx, or 440.30. The CUDA driver's compatibility package only supports particular drivers. For a complete list of supported drivers, see the CUDA Application Compatibility topic. For more information, see CUDA Compatibility and Upgrades.

New Features

The key features of DLProf v0.9.0 / r20.02 are:
  • Released in the TensorFlow 20.02 NGC container.
  • Latest DLProf build is based on TensorFlow 1.15.2, TensorBoard 1.15.0, and Nsight Systems 2020.1.1.
  • Added --delay and --duration options, which delay the start of profiling and stop the profiler after a set duration, respectively.
  • Partial support for custom NVTX ranges and domains.
    • Can profile TensorFlow models that use the NVTX Plugin.
    • Can select which domain(s) to use in generated report(s).
  • New group node report that shows the aggregated times for each group node.
  • Removed original TensorBoard GPU Summary Plugin.
  • Added pie charts and line series to the new TensorBoard DLProf Plugin.
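
As an illustration of the --delay and --duration options listed above, the following is a minimal sketch of how a profiling command line might be assembled. Only the option names come from these release notes. The --option=value syntax, the use of seconds as the unit, the helper name build_dlprof_command, and the training script train.py are all assumptions made for this example; consult the DLProf user guide for the exact syntax.

```python
import shlex


def build_dlprof_command(training_cmd, delay_s=None, duration_s=None):
    """Assemble an argv list for a DLProf run (hypothetical helper).

    Assumptions, not confirmed by these release notes: options use the
    --name=value form and both values are given in seconds.
    """
    argv = ["dlprof"]
    if delay_s is not None:
        # Wait this long before the profiler starts collecting.
        argv.append(f"--delay={delay_s}")
    if duration_s is not None:
        # Stop the profiler after this much collection time.
        argv.append(f"--duration={duration_s}")
    # The command being profiled follows the profiler options.
    argv.extend(shlex.split(training_cmd))
    return argv


# Example: skip an assumed 60 s of warm-up, then profile for 120 s.
cmd = build_dlprof_command("python train.py", delay_s=60, duration_s=120)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list rather than a single string keeps it suitable for passing to subprocess.run without shell quoting issues.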

Known Issues

  • XLA cluster mapping in the TensorBoard Graph plugin is not supported in the 20.02 TensorFlow container.
  • This software is only accessible in the NGC TensorFlow container.
  • This software is only supported by TensorFlow 1.15.

Resolved Issues

  • Iteration reports are now properly sorted.
  • --force now works correctly with auto-generated graphdef files.