cuDNN Release 7.6.4

This is the cuDNN 7.6.4 release notes. This release includes fixes from the previous cuDNN v7.x.x releases as well as the following additional changes.

For previous cuDNN release notes, see the cuDNN Archived Documentation.

Key Features and Enhancements

The following features and enhancements have been added to this release:

  • Gained significant speed-up in multihead-attention forward training and inference.

Compatibility

For the latest compatibility software versions of the OS, CUDA, the CUDA driver, and the NVIDIA hardware, see the cuDNN Support Matrix for v7.6.4.

Limitations

  • When launching a CUDA graph constructed via a stream capture that includes a cudnnConvolutionForward operation, the subsequent synchronization point reports a cudaErrorLaunchFailure error. This error appears when cuDNN is set to use a non-default stream.

Fixed Issues

The following issues have been fixed in this release:

  • Earlier versions of cuDNN v7.6 contained symbols which would conflict with those of in TensorRT 5.1 and later. In some cases, these conflicts could lead to application crashes when applications linked against cuDNN and TensorRT. This issue is fixed in cuDNN 7.6.4.
  • Addressed the regressions that were introduced in the cudnnConvolutionBiasActivationForward function in cuDNN 7.6.3. Previously, if this API had different values in destination data buffer and zData buffer, then incorrect results were computed. This issue has been resolved and now the API will compute correct results even if users provide an arbitrary set of values to the destination data and zData.
  • Multi-head attention will now return CUDNN_STATUS_ARCH_MISMATCH for true-half configuration on devices with compute capability less than 5.3 (for example, most of Maxwell and all of Kepler, etc..), which do not have native hardware support for true half computation. Previously, an error like CUDNN_STATUS_EXECUTION_FAILED may be triggered or inaccurate results may be produced.