TensorFlow Release 17.09
The NVIDIA container image of TensorFlow, release 17.09, is available.
Contents of TensorFlow
This container image contains the complete source of the version of NVIDIA TensorFlow in /opt/tensorflow. It is pre-built and installed into the /usr/local/[bin,lib] directories in the container image.
To achieve optimum TensorFlow performance for image-based training, the container includes a sample script that demonstrates efficient training of convolutional neural networks (CNNs). You may need to modify the sample script to fit your application. The container also includes the following:
- Ubuntu 16.04
- NVIDIA CUDA® 9.0
- NVIDIA CUDA® Deep Neural Network library™ (cuDNN) 7.0.2
- NVIDIA® Collective Communications Library™ (NCCL) 2.0.5 (optimized for NVLink™)
Release 17.09 is based on CUDA 9, which requires NVIDIA Driver release 384.xx.
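The driver requirement can be checked before the container is started. The following is a minimal sketch, not an NVIDIA tool: the helper names `driver_branch` and `meets_requirement` are hypothetical, and it assumes that driver branches newer than 384 are also acceptable, which these notes do not state.

```python
import re

# Taken from the release notes: CUDA 9 requires driver release 384.xx.
REQUIRED_DRIVER_BRANCH = 384


def driver_branch(version: str) -> int:
    """Return the leading branch number of a version string such as '384.81'."""
    match = re.match(r"\s*(\d+)\.\d+", version)
    if match is None:
        raise ValueError(f"unrecognized driver version: {version!r}")
    return int(match.group(1))


def meets_requirement(version: str, required: int = REQUIRED_DRIVER_BRANCH) -> bool:
    """Check a driver version string against the required branch.

    Assumption: any branch at or above the required one is accepted.
    """
    return driver_branch(version) >= required


if __name__ == "__main__":
    for candidate in ("384.81", "375.66"):
        print(candidate, meets_requirement(candidate))
```

The version string itself can be read on the host with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed to `meets_requirement`.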
Key Features and Enhancements
- Tensor Core operation support in TensorFlow is enabled by default on Volta for FP16 convolutions and matrix multiplications, which should speed up FP16 models.
- Added experimental support for:
  - FP16 training in
  - FP16 input/output in the fused batch normalization operation (
  - Tensor Core operation in FP16 convolutions and matrix multiplications
    - Added the TF_ENABLE_TENSOR_OP_MATH parameter, which enables or disables Tensor Core operation (enabled by default).
  - Tensor Core operation in FP32 matrix multiplications
    - Added the TF_ENABLE_TENSOR_OP_MATH_FP32 parameter, which enables or disables Tensor Core operation for float32 matrix multiplications (disabled by default because it reduces precision).
- Increased the TF_AUTOTUNE_THRESHOLD parameter, which improves auto-tune stability.
- Increased the CUDA_DEVICE_MAX_CONNECTIONS parameter, which solves performance issues related to streams on Tesla K80 GPUs.
- Enhancements to
  - Fixed a bug where the final layer was wrong when running in evaluation mode.
  - Changed is_training to a constant instead of a placeholder for better performance and reduced memory use.
  - Merged gradients for all layers into a single NCCL call for better performance.
  - Disabled use of XLA by default for better performance.
  - zero_debias_moving_mean in batch normalization operation.
- Latest version of CUDA
- Latest version of cuDNN
- Latest version of NCCL
- Ubuntu 16.04 with August 2017 updates
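The Tensor Core parameters listed above can be set from a launcher script. The sketch below assumes that they are read as environment variables and that "1" enables and "0" disables a feature; these notes name the parameters but do not list their accepted values. The helper names `apply_toggles` and `report` are hypothetical.

```python
import os

# Assumption: "1" enables and "0" disables. The values below mirror the
# defaults described in the release notes.
TENSOR_OP_TOGGLES = {
    "TF_ENABLE_TENSOR_OP_MATH": "1",       # FP16 convolutions and matrix multiplications
    "TF_ENABLE_TENSOR_OP_MATH_FP32": "0",  # FP32 matrix multiplications (reduces precision)
}


def apply_toggles(toggles, environ=os.environ):
    """Write each toggle into the environment and return the values now in effect."""
    for name, value in toggles.items():
        environ[name] = value
    return {name: environ[name] for name in toggles}


def report(names, environ=os.environ):
    """Return the current value of each variable, or None if it is unset."""
    return {name: environ.get(name) for name in names}


if __name__ == "__main__":
    # Set the toggles before importing TensorFlow, so that they are already
    # in place whenever the library reads them.
    print(apply_toggles(TENSOR_OP_TOGGLES))
    # Inspect the two tuned parameters without changing the container values.
    print(report(["TF_AUTOTUNE_THRESHOLD", "CUDA_DEVICE_MAX_CONNECTIONS"]))
```

Passing a plain dictionary as `environ` lets the helpers be exercised without touching the real process environment.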
There are no known issues in this release.
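For readers unfamiliar with the "merged gradients for all layers into a single NCCL call" enhancement listed under Key Features and Enhancements, the idea can be illustrated in plain Python. This is only a sketch of the general technique, not the code shipped in the container: it does not call NCCL, `all_reduce_sum` is a hypothetical stand-in for one collective call, and gradients are represented as lists of floats.

```python
def flatten(grads):
    """Concatenate per-layer gradients into one buffer and record each layer's length."""
    sizes = [len(g) for g in grads]
    buffer = [x for g in grads for x in g]
    return buffer, sizes


def unflatten(buffer, sizes):
    """Split a flat buffer back into per-layer gradients."""
    layers, start = [], 0
    for n in sizes:
        layers.append(buffer[start:start + n])
        start += n
    return layers


def all_reduce_sum(buffers):
    """Stand-in for one collective call: element-wise sum across workers."""
    return [sum(values) for values in zip(*buffers)]


def fused_all_reduce(per_worker_grads):
    """Reduce all layers with a single call.

    per_worker_grads[w][l] is worker w's gradient for layer l.
    """
    flat = [flatten(grads) for grads in per_worker_grads]
    sizes = flat[0][1]  # every worker has the same layer shapes
    reduced = all_reduce_sum([buffer for buffer, _ in flat])  # one call, all layers
    return unflatten(reduced, sizes)


if __name__ == "__main__":
    worker0 = [[1.0, 2.0], [3.0]]
    worker1 = [[10.0, 20.0], [30.0]]
    print(fused_all_reduce([worker0, worker1]))  # [[11.0, 22.0], [33.0]]
```

Issuing one call over a single large buffer instead of one call per layer reduces the per-call launch and synchronization overhead, which is the usual motivation for this kind of fusion.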