cuDNN Release Notes v7.4.1
Key Features and Enhancements
- API logging now supports a conversion specifier for the process ID, so the process ID can be included in the log file name. See API Logging.
- Performance of cudnnPoolingBackward() is enhanced for average pooling when using the NHWC data format, for both the CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING and CUDNN_POOLING_AVERAGE_COUNT_EXCLUDE_PADDING pooling modes.
- Performance of the strided convolution in cudnnConvolutionBackwardData() is enhanced when the filter is in NHWC format and the data type is TRUE_HALF_CONFIG, PSEUDO_HALF_CONFIG, or FLOAT_CONFIG. For strides u,v < r,s the performance is further enhanced.
- Significantly improved the performance of the cudnnConvolutionForward(), cudnnConvolutionBackwardData(), and cudnnConvolutionBackwardFilter() functions on RCNN models such as Fast RCNN, Faster RCNN, and Mask RCNN.
- The following setup gave a “Misaligned Address” error in cuDNN 7.3.x, which is fixed in cuDNN 7.4.1: the cudnnConvolutionForward() function with the CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM algorithm, in the PSEUDO_HALF_CONFIG data type configuration, when the input and output tensors are in NHWC, the filter is 1x1 and in NCHW, and Tensor Op is enabled.
- For a few convolution sizes for ALGO_0 and ALGO_1, the performance of the function cudnnConvolutionBackwardFilter() was degraded in cuDNN 7.3.1. This is now fixed.
- In cuDNN 7.3.1, the function cudnnAddTensor() computed incorrect results when run on GPUs with compute capability < 6.0 (prior to Pascal). This is now fixed.
- When calling the cudnnConvolutionBiasActivationForward() function with the algo parameter set to CUDNN_CONVOLUTION_FWD_ALGO_FFT and the activationDesc parameter set to CUDNN_ACTIVATION_RELU and sufficiently large inputs, the ReLU operation is not applied and negative values are passed through to the output. This issue is present in all previous cuDNN versions.
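
For the cudnnConvolutionBiasActivationForward() item above, where ReLU may be skipped with the FFT algorithm, a caller could check the output for negative values and reapply ReLU where needed. The following is a minimal host-side sketch in C, not a workaround documented in these notes. It assumes the output tensor has already been copied to host memory as a flat float array, and the helper names count_negative and relu_inplace are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: count entries that ReLU should have clamped.
 * y points to a host copy of the output tensor with n elements. */
static size_t count_negative(const float *y, size_t n)
{
    size_t negatives = 0;
    for (size_t i = 0; i < n; ++i) {
        if (y[i] < 0.0f) {
            ++negatives;
        }
    }
    return negatives;
}

/* Hypothetical helper: reapply ReLU in place, y[i] = max(y[i], 0). */
static void relu_inplace(float *y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (y[i] < 0.0f) {
            y[i] = 0.0f;
        }
    }
}
```

If count_negative returns a nonzero value after a call that requested CUDNN_ACTIVATION_RELU, the activation was likely not applied. In that case relu_inplace (or an equivalent device-side pass) restores the expected output.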
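
For the API logging item in the list above, the sketch below shows one way a process might turn on logging with a per-process file name. It rests on assumptions not stated in these notes: that logging is controlled by the CUDNN_LOGINFO_DBG and CUDNN_LOGDEST_DBG environment variables, and that the process ID specifier is written %i. Verify both against the API Logging section of the cuDNN Developer Guide. The helper names are hypothetical, and expand_pid_specifier only imitates the substitution for illustration; it is not cuDNN code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: request cuDNN API logging for this process.
 * Call it before the first cuDNN call. Variable names are assumptions
 * taken from the cuDNN Developer Guide, not from these release notes. */
static int enable_cudnn_api_logging(const char *dest_pattern)
{
    if (setenv("CUDNN_LOGINFO_DBG", "1", 1) != 0) {
        return -1;
    }
    return setenv("CUDNN_LOGDEST_DBG", dest_pattern, 1);
}

/* Illustration only: replace each "%i" in pattern with pid, writing a
 * NUL-terminated result of at most out_size bytes into out. */
static void expand_pid_specifier(const char *pattern, long pid,
                                 char *out, size_t out_size)
{
    size_t n = 0;
    const char *p = pattern;

    if (out_size == 0) {
        return;
    }
    while (*p != '\0' && n + 1 < out_size) {
        if (p[0] == '%' && p[1] == 'i') {
            int w = snprintf(out + n, out_size - n, "%ld", pid);
            if (w < 0 || (size_t)w >= out_size - n) {
                break;  /* pid does not fit; stop before it */
            }
            n += (size_t)w;
            p += 2;
        } else {
            out[n++] = *p++;
        }
    }
    out[n] = '\0';
}
```

Under these assumptions, calling enable_cudnn_api_logging("cudnn_log_%i.txt") gives each process its own log file, which keeps output from concurrent processes from being interleaved in a single file.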