Release 18.08
The container image of NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, release 18.08, is available.
Contents of the Optimized Deep Learning Framework container
This container image contains the complete source of the version of NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, in /opt/mxnet. It is pre-built and installed to the Python path.
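One quick way to confirm that the pre-built package is visible to Python is to look it up without importing it. The following is a minimal sketch rather than part of the release: the helper name describe_module is hypothetical, and it assumes a Python 3 interpreter (for example the 18.08-py3 image), since importlib.util is not available in Python 2.7.

```python
import importlib.util


def describe_module(name):
    """Return (found, location) for a top-level module without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return False, None
    return True, spec.origin


if __name__ == "__main__":
    found, location = describe_module("mxnet")
    if found:
        print("mxnet is on the Python path:", location)
    else:
        print("mxnet was not found on the Python path")
```

Inside the container the reported location should point at the installed package, not necessarily at the source tree in /opt/mxnet.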
The container also includes the following:
- Ubuntu 16.04
  Note: Container image 18.08-py2 contains Python 2.7; 18.08-py3 contains Python 3.5.
- NVIDIA CUDA 9.0.176 (see Errata section and 2.1) including CUDA® Basic Linear Algebra Subroutines library™ (cuBLAS) 9.0.425
- NVIDIA CUDA® Deep Neural Network library™ (cuDNN) 7.2.1
- NCCL 2.2.13 (optimized for NVLink™)
- ONNX exporter 0.1 for CNN classification models
  Note: The ONNX exporter is being continuously improved. You can try the latest changes by pulling from the main branch.
- Amazon Labs Sockeye sequence-to-sequence framework 1.18.28 (for machine translation)
- TensorRT 4.0.1
- DALI 0.1.2 Beta
Driver Requirements
Release 18.08 is based on CUDA 9, which requires NVIDIA Driver release 384.xx.
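The driver requirement can be checked on the host before starting the container. The sketch below is illustrative only. It assumes that "384.xx" means a driver whose major release number is 384 or later, which is an interpretation and not something the text states, and the helper names are hypothetical. It relies on the nvidia-smi query option --query-gpu=driver_version.

```python
import subprocess

# Assumption: "384.xx" is read as "major release 384 or later".
MIN_DRIVER_MAJOR = 384


def driver_major(version):
    """Extract the major release number from a string such as '384.81'."""
    return int(version.strip().split(".")[0])


def meets_requirement(version, minimum=MIN_DRIVER_MAJOR):
    """Return True if the driver's major release is at least `minimum`."""
    return driver_major(version) >= minimum


def query_driver_versions():
    """Ask nvidia-smi for the driver version reported for each GPU."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        universal_newlines=True,
    )
    return [line.strip() for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    try:
        versions = query_driver_versions()
    except (OSError, subprocess.CalledProcessError) as err:
        print("Could not query nvidia-smi:", err)
    else:
        for version in versions:
            status = "OK" if meets_requirement(version) else "older than 384"
            print(version, status)
```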
Key Features and Enhancements
This Optimized Deep Learning Framework release includes the following key features and enhancements.
- The NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, container image version 18.08 is based on Apache MXNet 1.2.0, with all upstream changes from the Apache MXNet main branch up to and including PR 11545.
- Latest version of cuDNN 7.2.1.
- Latest version of DALI 0.1.2 Beta.
- New demonstrator of increased mixed-precision ResNet-50 training speeds on Volta when processed end-to-end in the NHWC data layout. We are working to contribute the code improvements to upstream Apache MXNet through a pull request. To evaluate in the meantime, type /opt/mxnet/examples/image_classification/train_imagenet_runner --batch-size N. Substitute 256 for N on systems with GPUs having 32GB global memory (or 192 with 16GB GPUs) and prepare the ImageNet database as directed in nvidia-examples/imagenet_preparations. Training images should be pre-resized to 480px on the shorter side and validation images to 256px on the shorter side. The script expects RecordIO files to be present in the /data/imagenet/train-480-val-256-recordio/ directory.
- Ubuntu 16.04 with July 2018 updates
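The batch-size guidance for the NHWC demonstrator can be captured in a small launcher helper. This is a sketch under stated assumptions, not a script shipped in the container: the function names are hypothetical, and treating the two documented data points (256 for 32GB, 192 for 16GB) as "at least" thresholds is an assumption. Only the --batch-size option named in the text is used.

```python
import os

RUNNER = "/opt/mxnet/examples/image_classification/train_imagenet_runner"
RECORDIO_DIR = "/data/imagenet/train-480-val-256-recordio/"


def pick_batch_size(gpu_memory_gb):
    """Map GPU global memory (in GB) to the batch size suggested in the text."""
    if gpu_memory_gb >= 32:
        return 256
    if gpu_memory_gb >= 16:
        return 192
    # The release notes give no suggestion below 16 GB.
    raise ValueError("no batch size is suggested for GPUs under 16 GB")


def build_command(gpu_memory_gb):
    """Return the runner invocation as an argument list for subprocess."""
    return [RUNNER, "--batch-size", str(pick_batch_size(gpu_memory_gb))]


if __name__ == "__main__":
    if not os.path.isdir(RECORDIO_DIR):
        print("RecordIO directory not found:", RECORDIO_DIR)
    print(" ".join(build_command(16)))
```

The returned list can be passed directly to subprocess.call once the RecordIO files are in place.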
Announcements
Starting with the next major version of the CUDA release, we will no longer provide updated Python 2 containers and will only update Python 3 containers.
Known Issues
The multi-threaded nature of Apache MXNet model execution may result in a variable maximum usage of GPU global memory. Users who experience sporadic out-of-GPU-memory errors should experiment with setting the environment variable MXNET_GPU_WORKER_NTHREADS=1 as a possible remedy. We anticipate that the need for this experimentation will be removed in a subsequent release.
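The environment variable can also be set from within a Python training script. The following is a minimal sketch: the helper name limit_gpu_worker_threads is hypothetical, and it assumes the variable needs to be in place before MXNet is imported, on the expectation that it is read when the execution engine initializes.

```python
import os


def limit_gpu_worker_threads(env=os.environ, count=1):
    """Set MXNET_GPU_WORKER_NTHREADS in the given environment mapping."""
    env["MXNET_GPU_WORKER_NTHREADS"] = str(count)
    return env


# Assumption: set the variable before MXNet is imported.
limit_gpu_worker_threads()

# import mxnet as mx  # import MXNet only after the variable is set
```

Exporting the variable in the shell before launching the script achieves the same thing and avoids any dependence on import order.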