Deep Learning Profiler 21.02 Release Notes
Release 21.02 is based on NVIDIA CUDA 11.2, which requires NVIDIA Driver release 455 or later. However, if you are running on a Tesla GPU (for example, T4 or any other Tesla board), you may use NVIDIA driver release 418.xx, 440.30, or 450.xx. The CUDA driver's compatibility package supports only particular drivers. For a complete list of supported drivers, see the CUDA Application Compatibility topic. For more information, see CUDA Compatibility and Upgrades.
- Released in the TensorFlow 1.x 21.02, TensorFlow 2.x 21.02, and PyTorch 21.02 NGC containers.
- Latest DLProf build is based on TensorFlow 1.15.4, TensorBoard 1.15.0, TensorFlow 2.3.1, TensorBoard 2.3.0, PyTorch 1.8.0, and Nsight Systems 2020.3.4.
- Added support for DALI.
- Improved PyTorch profiling.
- Iteration reports can print long kernel names.
- Iteration reports display IO bytes.
- The Expert Systems detector detects when emit_nvtx is not used.
- Capture all unassociated CUDA operations in all frameworks.
- This software is available in the NGC TensorFlow and PyTorch containers and as a separate PIP wheel.
- This software is only supported for TensorFlow 1.15, TensorFlow 2.3, PyTorch 1.8, TensorBoard 1.15, and TensorBoard 2.3.
- When launching TensorBoard in a TensorFlow 2.x container, the --bind_all argument must be passed on the command line. Example:
# tensorboard --bind_all --logdir /path/to/event_files
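
If TensorBoard is launched from a script rather than typed by hand, the same command can be assembled programmatically. The following is a minimal sketch, not part of DLProf: the helper name tensorboard_command and its tf2_container parameter are hypothetical, and leaving out --bind_all for non-TensorFlow 2.x containers is an assumption made only for illustration.

```python
import shlex


def tensorboard_command(logdir, tf2_container=True):
    """Build the TensorBoard argument list (hypothetical helper)."""
    cmd = ["tensorboard"]
    if tf2_container:
        # TensorFlow 2.x containers require --bind_all, per the note above.
        # Omitting it elsewhere is an assumption of this sketch.
        cmd.append("--bind_all")
    cmd += ["--logdir", logdir]
    return cmd


if __name__ == "__main__":
    # Print the command as a single shell-safe string.
    print(shlex.join(tensorboard_command("/path/to/event_files")))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems when the log directory contains spaces.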