These are the cuDNN 7.6.5 release notes. This release includes fixes from the previous
cuDNN v7.x.x releases as well as the following additional changes. These release notes
apply to both cuDNN and JetPack users, except for items explicitly marked (not
applicable for Jetson platforms).
Key Features and Enhancements
The following features and enhancements have been added to this release:
- Made performance improvements to several APIs, including cudnnAddTensor,
  cudnnOpTensor, cudnnActivationForward, and cudnnActivationBackward.
- Separated the cuDNN datatype references and APIs from the cuDNN Developer
  Guide into a new cuDNN API reference document.
- Published Best Practices For Using cuDNN 3D Convolutions.
Compatibility
For the latest compatible software versions of the OS, CUDA, the CUDA driver, and
the NVIDIA hardware, see the cuDNN Support Matrix for v7.6.5.
Fixed Issues
The following issues have been fixed in this release:
- Corrected the documentation for the cudnnBatchNormalization* API functions,
  clarifying which arguments are optional and when the user needs to pass them
  to the API.
- Fixed a lack-of-synchronization issue where cudnnRNNBackwardData() and
  cudnnRNNBackwardDataEx() call a kernel that is not synchronized back to the
  application's stream. This issue only appears when using a bidirectional RNN
  with the CUDNN_RNN_ALGO_STANDARD algorithm. It affects cuDNN versions 5
  through 7.6.4.
- Corrected supported tensor format tables for cudnnConvolutionForward().
- Fixed an issue where cudnnConvolutionBackwardData returned incorrect results
  when the kernel size was >= 30 in any dimension and the stride was 2 in that
  same dimension, with the algorithm set to
  CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING.
- Fixed an issue where calling cudnnBatchNormalizationForwardInference with the
  mode CUDNN_BATCHNORM_SPATIAL_PERSISTENT returned CUDNN_STATUS_NOT_SUPPORTED
  instead of falling back to CUDNN_BATCHNORM_SPATIAL mode. It now falls back
  correctly, matching the behavior of the other batch normalization APIs,
  including cudnnBatchNormalizationForwardTraining,
  cudnnBatchNormalizationForwardTrainingEx, cudnnBatchNormalizationBackward,
  and cudnnBatchNormalizationBackwardEx.
- Previously, when the format was NCHW, cuDNN would invoke
  conv2d_grouped_direct_kernel, but the output would not be clipped. This
  issue has been fixed.
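
For applications that cannot move to 7.6.5 right away, the conditions described for the RNN synchronization issue and the FFT_TILING backward-data issue can be checked before a call is made. The following is a minimal sketch in plain C, not part of cuDNN. The helper names (cudnn_encode_version, rnn_backward_data_sync_affected, fft_tiling_bwd_data_config_affected) are hypothetical, and the version encoding assumes the major*1000 + minor*100 + patch scheme used by CUDNN_VERSION and cudnnGetVersion() in the 7.x series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: encode a version the way CUDNN_VERSION does in
 * cuDNN 7.x (assumed scheme: major*1000 + minor*100 + patch). */
static size_t cudnn_encode_version(int major, int minor, int patch)
{
    return (size_t)major * 1000u + (size_t)minor * 100u + (size_t)patch;
}

/* Returns true when a call to cudnnRNNBackwardData() or
 * cudnnRNNBackwardDataEx() matches the synchronization issue described
 * in these notes: a bidirectional RNN using CUDNN_RNN_ALGO_STANDARD on
 * cuDNN 5 through 7.6.4. */
static bool rnn_backward_data_sync_affected(size_t version,
                                            bool bidirectional,
                                            bool algo_standard)
{
    const size_t first_affected = 5000u; /* cuDNN 5.0.0 */
    const size_t first_fixed    = 7605u; /* cuDNN 7.6.5 */
    return bidirectional && algo_standard &&
           version >= first_affected && version < first_fixed;
}

/* Returns true when a filter/stride configuration matches the
 * CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING issue described in these
 * notes: kernel size >= 30 and stride == 2 in the same dimension. */
static bool fft_tiling_bwd_data_config_affected(const int *filter_dims,
                                                const int *strides,
                                                int n_spatial_dims)
{
    for (int i = 0; i < n_spatial_dims; ++i) {
        if (filter_dims[i] >= 30 && strides[i] == 2) {
            return true;
        }
    }
    return false;
}
```

In a real application, the version argument would typically come from cudnnGetVersion(). What to do when a check returns true is left to the caller. Adding an explicit stream synchronization after the RNN call, or selecting a different backward-data algorithm, are possible mitigations, but they are assumptions and are not prescribed by these notes.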