TensorFlow Release 18.06
The NVIDIA container image of TensorFlow, release 18.06, is available.
Contents of TensorFlow
This container image contains the complete source of the version of NVIDIA TensorFlow in /opt/tensorflow. It is pre-built and installed as a system Python module.
To achieve optimum TensorFlow performance for image-based training, the container includes a sample script that demonstrates efficient training of convolutional neural networks (CNNs). The sample script may need to be modified to fit your application. The container also includes the following:
- Ubuntu 16.04
  Note: Container image 18.06-py2 contains Python 2.7; 18.06-py3 contains Python 3.5.
- NVIDIA CUDA 9.0.176 (see Errata section and 2.1) including CUDA® Basic Linear Algebra Subroutines library™ (cuBLAS) 9.0.333 (see section 2.3.1)
- NVIDIA CUDA® Deep Neural Network library™ (cuDNN) 7.1.4
- NCCL 2.2.13 (optimized for NVLink™)
- Horovod™ 0.12.1
- OpenMPI™ 3.0.0
- TensorBoard 1.8.0
- MLNX_OFED 3.4
- OpenSeq2Seq v0.2 at commit a4f627e
- TensorRT 4.0.1
Driver Requirements
Release 18.06 is based on CUDA 9, which requires NVIDIA Driver release 384.xx.
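As an illustration of the requirement above, the following minimal sketch compares a driver version string against the 384 branch. The helper names `driver_major` and `meets_cuda9_requirement` are hypothetical, and treating any driver at or above 384 as acceptable is an assumption; this document only states that 384.xx is required.

```python
import re


def driver_major(version):
    """Return the major component of a driver version string such as '384.81'."""
    match = re.match(r"\s*(\d+)(?:\.\d+)*\s*$", version)
    if match is None:
        raise ValueError("unrecognized driver version: %r" % (version,))
    return int(match.group(1))


def meets_cuda9_requirement(version, minimum_major=384):
    """Check a driver version against the 384.xx requirement.

    Assumption: drivers newer than the 384 branch are also accepted.
    """
    return driver_major(version) >= minimum_major


if __name__ == "__main__":
    for candidate in ("375.26", "384.81", "396.26"):
        print(candidate, meets_cuda9_requirement(candidate))
```

On a host with the driver installed, the version string can typically be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed to the helper.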
Key Features and Enhancements
This TensorFlow release includes the following key features and enhancements.
- TensorFlow container image version 18.06 is based on TensorFlow 1.8.0.
- Updated scripts and README in nvidia-examples/cnn/ to use a cleaner implementation with high-level TensorFlow APIs including Datasets, Layers, and Estimators. Multi-GPU support in these scripts is now provided exclusively using Horovod/MPI.
- Fixed incorrect network definition in resnet18 and resnet34 models in nvidia-examples/cnn/.
- Updated scripts and README in nvidia-examples/build_imagenet_data/ to improve usability and ensure that the dataset is correctly downloaded and resized.
- Added support for TensorRT 4 features to TensorFlow-TensorRT integration.
- Includes integration with TensorRT 4.0.1
- Optimized CPU bilinear image resize kernel to improve performance of input pipeline.
- Ubuntu 16.04 with May 2018 updates
Accelerating Inference In TensorFlow With TensorRT (TF-TRT)
For step-by-step instructions on how to use TF-TRT, see Accelerating Inference In TensorFlow With TensorRT User Guide.
- Key Features And Enhancements
  - Added TensorRT 4.0 API support with extended layer support. This support includes the FullyConnected layer and BatchedMatMul op.
  - Resource management added, where memory allocation is uniformly managed by TensorFlow.
  - Bug fixes and better error handling in conversion.
- Limitations
  - TensorRT conversion relies on static shape inference, where the frozen graph should provide explicit dimensions on all ranks other than the first batch dimension.
  - Batch size for converted TensorRT engines is fixed at conversion time. Inference can only run with a batch size smaller than the specified number.
  - Currently supported models are limited to CNNs. Object detection models and RNNs are not yet supported.
- Known Issues
  - Input tensors are required to have rank 4 for quantization mode (INT8 precision).
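The TF-TRT limitations and known issue listed above can be expressed as a small pre-conversion check. This is only a sketch in plain Python, not part of the TF-TRT API: the function names and the `int8` flag are hypothetical, shapes are given as tuples with `None` for an unknown dimension, and treating the batch size fixed at conversion as an inclusive upper bound is an assumption.

```python
def non_batch_dims_are_static(shape):
    """True when every dimension after the first (batch) one is a known positive int."""
    return all(isinstance(dim, int) and dim > 0 for dim in shape[1:])


def check_trt_input(shape, batch_size, max_batch_size, int8=False):
    """Return a list of problems that would conflict with the limitations above.

    shape          -- input tensor shape, None marks an unknown dimension
    batch_size     -- batch size intended for inference
    max_batch_size -- batch size fixed at conversion time
    int8           -- True when quantization mode (INT8 precision) is used
    """
    problems = []
    if not non_batch_dims_are_static(shape):
        problems.append("non-batch dimensions must be explicit")
    # Assumption: the size fixed at conversion is an inclusive upper bound.
    if batch_size > max_batch_size:
        problems.append("batch size exceeds the size fixed at conversion")
    if int8 and len(shape) != 4:
        problems.append("INT8 mode requires rank-4 input tensors")
    return problems


if __name__ == "__main__":
    print(check_trt_input((None, 224, 224, 3), 8, 16, int8=True))
    print(check_trt_input((None, None, 224, 3), 8, 16))
    print(check_trt_input((None, 1000), 32, 16, int8=True))
```

A check like this can run on the frozen graph's input shapes before conversion is attempted, so that shape or batch-size problems surface early rather than at inference time.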
Announcements
Starting with the next major CUDA release, we will no longer provide updated Python 2 containers and will only update Python 3 containers.
Known Issues
There are no known issues in this release.