cuDNN Release Notes v7.3.1
Key Features and Enhancements
- The FFT tiling algorithms for convolution have been enhanced to support strided convolution. Specifically, for the algorithms CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING and CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING, the convDesc's vertical and horizontal filter stride can be 2 when neither the filter width nor the filter height is 1.
- The CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD algorithm for cudnnConvolutionBackwardData() now gives superior performance on the Volta architecture. In addition, the mobile version of this algorithm in the same functions gives superior performance on the Maxwell and Pascal architectures.
- Dilated convolutions now give superior performance for cudnnConvolutionBackwardFilter() on the Volta architecture, in some cases.
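The stride rule for the FFT tiling algorithms can be expressed as a small pre-check before selecting one of them. The following is a minimal sketch, not cuDNN code: the function name `fft_tiling_stride_allowed` is hypothetical, and it assumes that a stride of 1 remains valid and that the vertical and horizontal strides are checked independently. It covers only the stride rule stated above, not any other restriction these algorithms may have.

```python
def fft_tiling_stride_allowed(filter_h, filter_w, stride_h, stride_w):
    """Return True if the strides satisfy the FFT tiling stride rule.

    Assumptions (not stated in the release notes): stride 1 is always
    acceptable, and each stride is evaluated on its own.
    """
    for stride in (stride_h, stride_w):
        if stride == 1:
            continue  # unit stride: assumed to be accepted unconditionally
        if stride == 2 and filter_h != 1 and filter_w != 1:
            continue  # stride 2 requires that neither filter dimension is 1
        return False
    return True


# A 3x3 filter may use stride 2; a 1x3 filter may not.
ok_3x3 = fft_tiling_stride_allowed(3, 3, 2, 2)
ok_1x3 = fft_tiling_stride_allowed(1, 3, 2, 2)
```

A caller could use such a check to fall back to another algorithm when it returns False.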
Known Issues and Limitations
- For cudnnConvolutionForward(), when using a 1x1 filter with input and output tensors in NHWC format and of type CUDNN_DATA_HALF (half precision), a filter in NCHW format, and a compute type of float, cuDNN will generate incorrect results.
- On Quadro P4000, when calling cudnnConvolutionForward() with the CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED algorithm, there is a small chance of intermittent inaccurate results.
- When using cudnnConvolutionBackwardFilter() with CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0 in mixed precision computation, with input/output in CUDNN_DATA_HALF (half precision) and a compute type of float, the results might include INF when the batch size (N) is larger than 1, due to an intermediate down-convert to half precision. In other words, if all intermediate values were accumulated in float (as in CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1), the result would be a finite half-precision value. This limitation also exists in all previous cuDNN versions.
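The INF behavior described for CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0 can be illustrated outside cuDNN with a few lines of arithmetic. The sketch below is an assumption-laden model, not cuDNN's actual order of operations: the per-batch partial values are made up, the helper names are hypothetical, and Python's double-precision float stands in for the float compute type. It shows only that down-converting an intermediate sum to half precision can produce INF even when the final sum fits in half precision.

```python
import math
import struct


def to_half(x):
    """Round a value to IEEE 754 half precision and return it as a float.

    struct raises OverflowError for magnitudes beyond the half range, so that
    case is mapped to a signed infinity, as a hardware down-convert would do.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def accumulate(partials, downconvert_each_step):
    """Sum the partials, optionally rounding the running sum to half each step."""
    total = 0.0
    for p in partials:
        total += p
        if downconvert_each_step:
            total = to_half(total)  # intermediate down-convert
    return to_half(total)  # the output tensor is half precision either way


# Hypothetical per-batch contributions to one filter gradient element (N = 3).
partials = [60000.0, 30000.0, -50000.0]

with_downconvert = accumulate(partials, downconvert_each_step=True)
wide_accumulation = accumulate(partials, downconvert_each_step=False)
print(with_downconvert, wide_accumulation)  # inf 40000.0
```

The running sum reaches 90000 after the second term, which exceeds the largest finite half value (65504), so the down-converted path stays at INF while the wide accumulation ends at a representable 40000.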
Fixed Issues
- Fixed a pointer arithmetic integer overflow issue in the RNN forward and backward functions that occurred when the sequence length and mini-batch size were sufficiently large.
- When tensor cores were enabled in cuDNN 7.3.0, the cudnnConvolutionBackwardFilter() calculations performed an illegal memory access when the K and C values were both non-integral multiples of 8. This issue is fixed.
- For the CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1 algorithm in cudnnConvolutionBackwardFilter(), on Volta, the tensor operations were occasionally failing when the filter spatial size (filter w) was greater than 64. This issue is fixed.
- While running cuDNN 7.3.0 on Turing with CUDA 10.0 and the r400 driver, the cudnnRNNForwardInference(Ex) functions errored out, returning CUDNN_STATUS_NOT_SUPPORTED. This issue is fixed.
- In cuDNN 7.3.0, when using CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1 with tensor data or filter data in NHWC format, the function might have resulted in a silent failure. This is now fixed.
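The RNN pointer arithmetic overflow mentioned above depends on how large an element offset becomes as sequence length and mini-batch size grow. The following sketch shows how such an offset can wrap around. Everything specific in it is an assumption for illustration: the notes do not state the integer width involved (a signed 32-bit index is assumed here), and the `[time, batch, hidden]` layout, the hidden size, and the helper names are hypothetical.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Reduce an integer to the value a signed 32-bit variable would hold."""
    return (value + 2**31) % 2**32 - 2**31


def step_offset(time_step, batch_size, hidden_size):
    """Element offset of one time step in a hypothetical [time, batch, hidden] buffer."""
    return time_step * batch_size * hidden_size


# Hypothetical sizes: the exact offset no longer fits in 32 bits.
exact = step_offset(time_step=1024, batch_size=1024, hidden_size=2048)
wrapped = wrap_int32(exact)

overflows = exact > INT32_MAX  # True: the exact offset is 2**31
```

With these sizes the exact offset is 2**31, one past the largest signed 32-bit value, so a 32-bit computation would yield a negative offset. Computing offsets in a 64-bit type avoids the wraparound.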