cuDNN Release 7.6.5

These are the release notes for cuDNN 7.6.5. This release includes fixes from the previous cuDNN v7.x.x releases as well as the following additional changes. These release notes apply to both cuDNN and JetPack users unless an item is specifically marked with (not applicable for Jetson platforms).

For previous cuDNN release notes, see the cuDNN Archived Documentation.

Key Features and Enhancements

The following features and enhancements have been added to this release:

  • Made performance improvements to several APIs, including cudnnAddTensor, cudnnOpTensor, cudnnActivationForward, and cudnnActivationBackward (a short usage sketch follows this list).

  • Separated the cuDNN datatype references and APIs from the cuDNN Developer Guide into a new cuDNN API Reference.

  • Published Best Practices For Using cuDNN 3D Convolutions.
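
As a brief illustration of two of the APIs named in the performance item above, the minimal sketch below accumulates one tensor into another with cudnnAddTensor and then computes an elementwise product with cudnnOpTensor. The shared tensor descriptor, scaling factors, and device buffers are caller-supplied placeholders, and error checking is omitted; this is a sketch, not part of the release itself.

    #include <cudnn.h>

    // Minimal sketch: accumulate a into c with cudnnAddTensor, then compute an
    // elementwise product with cudnnOpTensor. The descriptor and device buffers
    // are placeholders supplied by the caller; error checking is omitted.
    void add_then_multiply(cudnnHandle_t handle,
                           cudnnTensorDescriptor_t desc,  // same shape for a, b, c
                           const float* d_a, const float* d_b, float* d_c) {
        const float one = 1.0f, zero = 0.0f;

        // c = 1.0 * a + 1.0 * c
        cudnnAddTensor(handle, &one, desc, d_a, &one, desc, d_c);

        // Describe an elementwise multiply computed in FP32.
        cudnnOpTensorDescriptor_t opDesc;
        cudnnCreateOpTensorDescriptor(&opDesc);
        cudnnSetOpTensorDescriptor(opDesc, CUDNN_OP_TENSOR_MUL,
                                   CUDNN_DATA_FLOAT, CUDNN_PROPAGATE_NAN);

        // c = (1.0 * a) * (1.0 * b) + 0.0 * c
        cudnnOpTensor(handle, opDesc, &one, desc, d_a,
                      &one, desc, d_b, &zero, desc, d_c);

        cudnnDestroyOpTensorDescriptor(opDesc);
    }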

Compatibility

For the latest compatibility software versions of the OS, CUDA, the CUDA driver, and the NVIDIA hardware, see the cuDNN Support Matrix for v7.6.5.

Fixed Issues

The following issues have been fixed in this release:

  • Corrected the documentation for the cudnnBatchNormalization* API functions, clarifying which arguments are optional and when the user needs to pass them to the API (a batch-normalization arguments sketch appears after this list).

  • Fixed a lack-of-synchronization issue where cudnnRNNBackwardData() and cudnnRNNBackwardDataEx() launched a kernel that was not synchronized back to the application's stream. This issue only appears with bidirectional RNNs using the CUDNN_RNN_ALGO_STANDARD algorithm, and it affects cuDNN versions 5 through 7.6.4 (an RNN configuration sketch appears after this list).

  • Corrected supported tensor format tables for cudnnConvolutionForward().

  • Fixed an issue where cudnnConvolutionBackwardData() produced incorrect results when the kernel size was >= 30 in any dimension, the stride was 2 in that dimension, and the algorithm was set to CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING (an FFT_TILING workaround sketch appears after this list).

  • Fixed an issue where cudnnBatchNormalizationForwardInference() with the mode CUDNN_BATCHNORM_SPATIAL_PERSISTENT returned CUDNN_STATUS_NOT_SUPPORTED instead of falling back to CUDNN_BATCHNORM_SPATIAL mode. It now falls back correctly, matching the behavior of the other batch normalization APIs, including cudnnBatchNormalizationForwardTraining, cudnnBatchNormalizationForwardTrainingEx, cudnnBatchNormalizationBackward, and cudnnBatchNormalizationBackwardEx (an inference fallback sketch appears after this list).

  • Previously, if the format was NCHW, cuDNN would invoke conv2d_grouped_direct_kernel; however, the output would not be clipped. This issue has been fixed.
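
Batch-normalization arguments sketch: as an illustration of the optional arguments mentioned in the documentation correction above, the hedged sketch below calls cudnnBatchNormalizationForwardTraining and passes nullptr for the optional resultSaveMean and resultSaveInvVariance cache pointers, which may be omitted when the cached values are not needed for the backward pass. The mode, epsilon, and exponential-average factor are placeholder values, not recommendations.

    #include <cudnn.h>

    // Sketch: the saved mean / inverse-variance cache pointers are optional and
    // may be passed as nullptr when they are not needed later. Shapes, epsilon,
    // and the exponential average factor are illustrative placeholders.
    cudnnStatus_t bn_forward_training(cudnnHandle_t handle,
                                      cudnnTensorDescriptor_t xDesc, const float* x,
                                      cudnnTensorDescriptor_t yDesc, float* y,
                                      cudnnTensorDescriptor_t bnDesc,
                                      const float* scale, const float* bias,
                                      float* runningMean, float* runningVar) {
        const float one = 1.0f, zero = 0.0f;
        return cudnnBatchNormalizationForwardTraining(
            handle, CUDNN_BATCHNORM_SPATIAL,
            &one, &zero,
            xDesc, x, yDesc, y,
            bnDesc, scale, bias,
            /*exponentialAverageFactor=*/0.1,
            runningMean, runningVar,
            /*epsilon=*/1e-5,
            /*resultSaveMean=*/nullptr,          // optional cache
            /*resultSaveInvVariance=*/nullptr);  // optional cache
    }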
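
RNN configuration sketch: for the RNN synchronization fix above, the sketch below shows the affected configuration, a bidirectional RNN using CUDNN_RNN_ALGO_STANDARD. The LSTM cell type, hidden size, layer count, and the device-synchronization workaround mentioned in the final comment are assumptions added for illustration, not part of the fix itself.

    #include <cudnn.h>

    // Sketch of the configuration affected by the synchronization fix above:
    // a bidirectional LSTM using CUDNN_RNN_ALGO_STANDARD. The hidden size,
    // layer count, and dropout descriptor are illustrative placeholders.
    void configure_affected_rnn(cudnnHandle_t handle,
                                cudnnRNNDescriptor_t rnnDesc,
                                cudnnDropoutDescriptor_t dropoutDesc) {
        cudnnSetRNNDescriptor(handle, rnnDesc,
                              /*hiddenSize=*/512, /*numLayers=*/2,
                              dropoutDesc,
                              CUDNN_LINEAR_INPUT,
                              CUDNN_BIDIRECTIONAL,       // affected direction mode
                              CUDNN_LSTM,
                              CUDNN_RNN_ALGO_STANDARD,   // affected algorithm
                              CUDNN_DATA_FLOAT);
        // On cuDNN 5 through 7.6.4, a cudaDeviceSynchronize() after
        // cudnnRNNBackwardData()/cudnnRNNBackwardDataEx() is one conservative
        // application-side workaround (an assumption, not guidance from this note).
    }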
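
FFT_TILING workaround sketch: for the cudnnConvolutionBackwardData() fix above, the sketch below shows one way an application running a release older than 7.6.5 could have avoided the affected combination (a filter dimension >= 30 with stride 2 in that dimension) by selecting a different backward-data algorithm. The guard logic and the CUDNN_CONVOLUTION_BWD_DATA_ALGO_1 fallback are illustrative assumptions, not recommendations from this note.

    #include <cudnn.h>

    // Hedged guard for the FFT_TILING issue described above (pre-7.6.5 only):
    // avoid CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING when any filter dimension
    // is >= 30 and the stride in that dimension is 2. The fallback algorithm is
    // an illustrative choice.
    cudnnConvolutionBwdDataAlgo_t pick_bwd_data_algo(int filterH, int filterW,
                                                     int strideH, int strideW) {
        const bool affected = (filterH >= 30 && strideH == 2) ||
                              (filterW >= 30 && strideW == 2);
        if (affected && CUDNN_VERSION < 7605) {
            return CUDNN_CONVOLUTION_BWD_DATA_ALGO_1;  // illustrative fallback
        }
        return CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING;
    }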
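
Inference fallback sketch: for the cudnnBatchNormalizationForwardInference() fix above, the sketch below shows the application-side fallback that was needed before this release: attempt CUDNN_BATCHNORM_SPATIAL_PERSISTENT first and retry with CUDNN_BATCHNORM_SPATIAL on CUDNN_STATUS_NOT_SUPPORTED. With 7.6.5 the library performs this fallback itself; the descriptors and pointers in the sketch are placeholders.

    #include <cudnn.h>

    // Sketch of the fallback that applications needed before this fix: try the
    // persistent spatial mode first, then retry with CUDNN_BATCHNORM_SPATIAL if
    // cuDNN reports CUDNN_STATUS_NOT_SUPPORTED. Descriptors and pointers are
    // assumed to be set up elsewhere (placeholders).
    cudnnStatus_t bn_inference_with_fallback(
        cudnnHandle_t handle,
        const cudnnTensorDescriptor_t xDesc, const void* x,
        const cudnnTensorDescriptor_t yDesc, void* y,
        const cudnnTensorDescriptor_t bnDesc,
        const void* scale, const void* bias,
        const void* estMean, const void* estVar, double epsilon) {
        const float one = 1.0f, zero = 0.0f;

        cudnnStatus_t status = cudnnBatchNormalizationForwardInference(
            handle, CUDNN_BATCHNORM_SPATIAL_PERSISTENT, &one, &zero,
            xDesc, x, yDesc, y, bnDesc, scale, bias, estMean, estVar, epsilon);

        if (status == CUDNN_STATUS_NOT_SUPPORTED) {
            // cuDNN 7.6.5 performs this fallback internally; older releases did not.
            status = cudnnBatchNormalizationForwardInference(
                handle, CUDNN_BATCHNORM_SPATIAL, &one, &zero,
                xDesc, x, yDesc, y, bnDesc, scale, bias, estMean, estVar, epsilon);
        }
        return status;
    }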