Key Features and Enhancements
The following enhancements have been added to this release:
- None.
Known Issues
The following are known issues in this release:
- The cuDNN library may trigger a CPU floating-point exception when FP exceptions are enabled by the user. This issue exists for all 7.0.x releases.
- Heavy use of RNN layers may hit a memory allocation issue in the CUDA driver when using cuDNN v7 with CUDA 8.0 and the R375 driver on pre-Pascal architectures (Kepler and Maxwell). In these cases, subsequent CUDA kernels may fail to launch with Error Code 30. To resolve the issue, use the latest R384 driver (from NVIDIA driver downloads) or ensure that the persistence daemon is started. This behavior is observed on all 7.0.x releases.
- When using TENSOR_OP_MATH mode with cudnnConvolutionBiasActivationForward, the pointer to the bias must be aligned to 16 bytes and the size of allocated memory must be a multiple of 256 elements. This behavior exists for all 7.0.x releases.
Fixed Issues
The following issues have been fixed in this release:
- Corrected the algorithm fallback behavior in RNN when the user sets CUDNN_TENSOR_OP_MATH on a compute card without HMMA. Instead of returning CUDNN_STATUS_NOT_SUPPORTED, the RNN algorithm now continues to run using CUDNN_DEFAULT_MATH, which is the correct behavior when Tensor Cores are not supported.
- On Volta hardware, BWD_FILTER_ALGO_1 and BWD_DATA_ALGO_1 convolutions using more than 512 filter elements were causing CUDA_ERROR_ILLEGAL_ADDRESS and CUDNN_STATUS_INTERNAL_ERROR errors. Logic was added to fall back to a generic kernel for these filter sizes.
- cuDNN v7 with CUDA 8.0 produced erroneous results on Volta for some common cases of Algo 1. Logic was added to fall back to a generic kernel when cuDNN v7 with CUDA 8.0 is used on Volta.