cuDNN Release 7.6.5

These are the cuDNN 7.6.5 release notes. This release includes fixes from the previous cuDNN v7.x.x releases as well as the following additional changes. These release notes apply to both cuDNN and JetPack users unless an item is specifically marked (not applicable for Jetson platforms).

For previous cuDNN release notes, see the cuDNN Archived Documentation.

Key Features and Enhancements

The following features and enhancements have been added to this release:

  • Improved the performance of several APIs, including cudnnAddTensor, cudnnOpTensor, cudnnActivationForward, and cudnnActivationBackward.

  • Separated the cuDNN datatype references and APIs from the cuDNN Developer Guide into a new cuDNN API document.

  • Published Best Practices For Using cuDNN 3D Convolutions.


For the latest compatible versions of the OS, CUDA, the CUDA driver, and the NVIDIA hardware, see the cuDNN Support Matrix for v7.6.5.

Fixed Issues

The following issues have been fixed in this release:

  • Corrected the documentation for the cudnnBatchNormalization* API functions, clarifying which arguments are optional and when the user needs to pass them to the API.

  • Fixed a lack-of-synchronization issue where cudnnRNNBackwardData() and cudnnRNNBackwardDataEx() call a kernel that is not synchronized back to the application's stream. This issue appears only with bidirectional RNNs that use the CUDNN_RNN_ALGO_STANDARD algorithm. It affects cuDNN versions 5 through 7.6.4.

  • Corrected the supported tensor format tables for cudnnConvolutionForward().

  • Fixed an issue where cudnnConvolutionBackwardData returned incorrect results when the kernel size was 30 or greater in any dimension and the stride was 2 in that dimension, with the algorithm set to CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING.

  • Fixed an issue where cudnnBatchNormalizationForwardInference, when called with the CUDNN_BATCHNORM_SPATIAL_PERSISTENT mode, returned CUDNN_STATUS_NOT_SUPPORTED instead of falling back to the CUDNN_BATCHNORM_SPATIAL mode. It now falls back correctly, matching the behavior of the other batch normalization APIs, including cudnnBatchNormalizationForwardTraining, cudnnBatchNormalizationForwardTrainingEx, cudnnBatchNormalizationBackward, and cudnnBatchNormalizationBackwardEx.

  • Previously, when cuDNN invoked the convolve_common_engine_int8_NHWC kernel for the NHWC format, the output values were clipped to the range -128 to 127 irrespective of the output data precision. This has been fixed: output values are now clipped only for INT8 precision, and are not clipped when the output data is float precision.
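
For the cudnnRNNBackwardData() synchronization item, the affected version range (cuDNN 5 through 7.6.4) can be checked with a small helper. This is a minimal sketch in Python and is not part of cuDNN. The function name is hypothetical, and the code assumes the integer version encoding MAJOR*1000 + MINOR*100 + PATCHLEVEL used by the cuDNN 7.x headers (so 7.6.5 is 7605).

```python
def rnn_backward_data_sync_affected(cudnn_version):
    """Return True if the version falls in the affected range, 5.0.0 through 7.6.4.

    Hypothetical helper. cudnn_version is assumed to be an integer encoded as
    MAJOR*1000 + MINOR*100 + PATCHLEVEL, for example 7604 for cuDNN 7.6.4.
    """
    first_affected = 5 * 1000              # cuDNN 5.0.0
    first_fixed = 7 * 1000 + 6 * 100 + 5   # cuDNN 7.6.5
    return first_affected <= cudnn_version < first_fixed
```

The range check alone is not sufficient to conclude that an application was affected: according to the item above, the issue appears only for bidirectional RNNs using CUDNN_RNN_ALGO_STANDARD.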
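
For the CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING item, the triggering condition (kernel size of 30 or more and stride 2 in the same dimension) can be written as a predicate, which may help when auditing results produced with an earlier release. This is a minimal sketch in Python, not cuDNN code. The function name and the tuple-based arguments are hypothetical.

```python
def fft_tiling_bwd_data_affected(kernel_size, stride):
    """Return True if some dimension has kernel size >= 30 and stride == 2.

    Hypothetical helper. kernel_size and stride are per-dimension sequences
    of equal length, for example (height, width).
    """
    if len(kernel_size) != len(stride):
        raise ValueError("kernel_size and stride must have the same length")
    # Both conditions must hold in the same dimension.
    return any(k >= 30 and s == 2 for k, s in zip(kernel_size, stride))
```

A 31x3 kernel with stride (2, 1) matches the condition. The same kernel with stride (1, 2) does not, because the large kernel dimension and the stride of 2 fall in different dimensions.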
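
The corrected clipping behavior for convolve_common_engine_int8_NHWC can be illustrated with a short model of the arithmetic. This is a sketch in Python, not the kernel's implementation. The name store_output and the "int8" and "float" labels are hypothetical, and the use of Python's round() on the INT8 path is an assumption that the release note does not state.

```python
INT8_MIN, INT8_MAX = -128, 127

def store_output(value, output_precision):
    """Model the fixed behavior: clip only when the output precision is INT8.

    Hypothetical helper. output_precision is "int8" or "float".
    """
    if output_precision == "int8":
        # Saturate to the INT8 range. The rounding step is an assumption.
        return max(INT8_MIN, min(INT8_MAX, int(round(value))))
    if output_precision == "float":
        # Float output is passed through without clipping.
        return float(value)
    raise ValueError("unsupported output precision: " + str(output_precision))
```

With this model, a value of 300.0 is stored as 127 for INT8 output and as 300.0 for float output. Before the fix, the float case would also have been limited to the range -128 to 127.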