cuDNN Release 7.0.3

These are the cuDNN 7.0.3 release notes. This release includes fixes from the previous cuDNN v7.x.x releases as well as the following additional changes.

Key Features and Enhancements

Performance improvements for various cases:
  • Forward grouped convolutions where the number of input channels per group is 1, 2, or 4 and the hardware is Volta or Pascal.
  • cudnnTransformTensor() where the input and output tensors are both packed.
    Note: This is an improved fallback; improvements will not be seen in all cases.
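
As a rough illustration of the first case, the channels-per-group value is the input channel count divided by the convolution group count. The helper below is a hypothetical sketch (its name is not part of cuDNN) that checks whether a layer falls in the 1, 2, or 4 channels-per-group range mentioned above. It does not check the GPU architecture.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of cuDNN: returns true when the number of
 * input channels per group is 1, 2, or 4, the range these notes call out. */
static bool in_channels_per_group_is_1_2_or_4(int in_channels, int group_count)
{
    assert(in_channels > 0 && group_count > 0);
    if (in_channels % group_count != 0) {
        return false; /* channels must divide evenly across groups */
    }
    int per_group = in_channels / group_count;
    return per_group == 1 || per_group == 2 || per_group == 4;
}
```

For example, a depthwise convolution with 32 input channels and a group count of 32 has 1 input channel per group, so it falls in this range.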

Known Issues

The following are known issues in this release:

  • CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING may cause CUDA_ERROR_ILLEGAL_ADDRESS. This issue affects input images that are just 1 pixel wide, for certain n, c, k, h combinations.
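
One possible way to stay clear of this issue is to avoid the algorithm for every 1-pixel-wide input. The guard below is a hypothetical sketch, not part of cuDNN. It is deliberately conservative, because these notes do not list the affected n, c, k, h combinations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical guard, not part of cuDNN: conservatively reports FFT_TILING
 * as unsafe for any input that is 1 pixel wide, since the affected
 * n, c, k, h combinations are not listed in these notes. */
static bool fft_tiling_forward_is_safe(int input_width)
{
    assert(input_width > 0);
    return input_width != 1;
}
```

When the guard returns false, the caller would pick a different forward algorithm for that layer.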

Fixed Issues

The following issues have been fixed in this release:

  • AddTensor and TensorOp produce incorrect results for half and INT8 inputs for various use cases.
  • cudnnPoolingBackward() can produce incorrect values in rare cases of non-deterministic MAX pooling with window_width > 256. These rare cases occur when the maximum element in a window is duplicated horizontally (along the width) at a stride of 256*k for some k. The behavior is now fixed to accumulate derivatives for the left-most duplicate.
  • cudnnGetConvolutionForwardWorkspaceSize() produces an incorrect workspace size for the FFT_TILING algorithm for 1D convolutions. This only occurs for large convolutions where intermediate calculations produce values greater than 2^31 (2 to the power of 31).
  • CUDNN_STATUS_NOT_SUPPORTED is returned by cudnnPooling*() functions for a small x image (channels * height * width < 4).
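
The left-most-duplicate rule described in the cudnnPoolingBackward() fix above can be sketched with a small CPU reference. This is an illustrative 1D version without padding, not cuDNN's implementation. The function name and parameters are assumptions made for the example, and it uses a tiny window rather than one wider than 256.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative 1D MAX pooling backward pass (CPU only, no padding).
 * x:  input row of length width
 * dy: incoming gradients, one per output window
 * dx: output gradients of length width, overwritten by this function
 * Ties are resolved toward the left-most maximum in each window. */
static void max_pool_1d_backward(const float *x, const float *dy, float *dx,
                                 size_t width, size_t window, size_t stride)
{
    assert(window > 0 && stride > 0 && window <= width);
    for (size_t i = 0; i < width; ++i) {
        dx[i] = 0.0f;
    }
    size_t out_width = (width - window) / stride + 1;
    for (size_t o = 0; o < out_width; ++o) {
        size_t start = o * stride;
        size_t argmax = start;
        for (size_t j = start + 1; j < start + window; ++j) {
            if (x[j] > x[argmax]) { /* strict '>' keeps the left-most duplicate */
                argmax = j;
            }
        }
        dx[argmax] += dy[o]; /* the whole gradient goes to one element */
    }
}
```

With x = {1, 5, 2, 5}, a single window of width 4, and dy = {3}, the maximum 5 appears at indices 1 and 3, and the gradient is accumulated only at index 1.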