Deep Learning Profiler v1.8.0 Release Notes
DLProf v1.8.0 is available in the 21.12 NVIDIA TensorFlow 1.x, TensorFlow 2.x, and PyTorch NGC containers, and as a Python wheel on the NVIDIA PY Index. This is the final release of DLProf. It will not be included in future NVIDIA NGC containers, but it will remain available for download from the NVIDIA PY Index.
Release 21.12 is based on NVIDIA CUDA 11.5.0, which requires NVIDIA Driver release 495 or later. However, if you are running on Tesla (for example, T4 or any other Tesla board), you may use NVIDIA driver release 418.40 (or later R418), 440.33 (or later R440), 450.51 (or later R450), or 460.27 (or later R460). The CUDA driver's compatibility package only supports particular drivers. For a complete list of supported drivers, see the CUDA Application Compatibility topic. For more information, see CUDA Compatibility and Upgrades.
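The driver rule above can be sketched as a small check. This is only an illustration of the versions named in this paragraph, not an NVIDIA tool: the function names and the `is_tesla` flag are hypothetical, and any driver branch not listed above is treated as unsupported, so consult the CUDA Application Compatibility topic for the authoritative list.

```python
# Minimum driver versions named in the paragraph above, keyed by release branch.
# Assumption: only these branches qualify for the Tesla exception.
TESLA_BRANCH_MINIMUMS = {
    418: (418, 40),
    440: (440, 33),
    450: (450, 51),
    460: (460, 27),
}

# CUDA 11.5.0 requires driver release 495 or later on all other hardware.
DEFAULT_MINIMUM_BRANCH = 495


def parse_driver_version(version):
    """Turn a version string such as '450.51.06' into (450, 51, 6)."""
    return tuple(int(part) for part in version.split("."))


def driver_is_listed(version, is_tesla=False):
    """Return True if the driver version satisfies the rule stated above."""
    parsed = parse_driver_version(version)
    branch = parsed[0]
    if branch >= DEFAULT_MINIMUM_BRANCH:
        return True
    if is_tesla and branch in TESLA_BRANCH_MINIMUMS:
        # Compare major.minor against the branch minimum, e.g. (450, 51).
        return parsed[:2] >= TESLA_BRANCH_MINIMUMS[branch]
    return False
```

For example, `driver_is_listed("450.51.06", is_tesla=True)` passes the check, while the same driver on a non-Tesla board does not.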
The key features of DLProf v1.8.0 are:
- Released in the TensorFlow 1.x 21.12, TensorFlow 2.x 21.12, and PyTorch 21.12 NGC containers.
- This software is only supported for TensorFlow 1.15.5, TensorFlow 2.6.2, PyTorch 1.11, and TensorRT 220.127.116.11.
- Collecting I/O data can cause issues in some cases, so it is off by default in this release. To turn it on, pass the following option to the DLProf CLI:
--nsys_opts="-t cuda,nvtx,osrt -s cpu"
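As a minimal sketch of how this option might be passed when launching DLProf from a script, the helper below builds the argument list. It assumes the usual `dlprof [options] <training command>` invocation; `build_dlprof_command` and `train.py` are hypothetical names used only for illustration.

```python
import shlex

# Option string taken from the release note above.
NSYS_OPTS = "-t cuda,nvtx,osrt -s cpu"


def build_dlprof_command(training_cmd, nsys_opts=NSYS_OPTS):
    """Return an argv list that runs training_cmd under DLProf
    with the given Nsight Systems options."""
    # The whole option is one argv element, so the spaces inside
    # nsys_opts need no extra shell quoting when passed to subprocess.
    return ["dlprof", f"--nsys_opts={nsys_opts}", *training_cmd]


# train.py is a hypothetical training script.
cmd = build_dlprof_command(["python", "train.py"])

# Print a shell-safe rendering of the command for copy and paste.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` inside a container where `dlprof` is installed.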