Release 18.08

The container image of NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, release 18.08, is available.

Contents of the Optimized Deep Learning Framework container

This container image contains the complete source of the NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, in /opt/mxnet. The framework is pre-built and installed to the Python path.

The container also includes the following:

Ubuntu 16.04
  Note: Container image 18.08-py2 contains Python 2.7; 18.08-py3 contains Python 3.5.
NVIDIA CUDA 9.0.176 (see Errata section and 2.1) including CUDA® Basic Linear Algebra Subroutines library™ (cuBLAS) 9.0.425
NVIDIA CUDA® Deep Neural Network library™ (cuDNN) 7.2.1
NCCL 2.2.13 (optimized for NVLink™)
ONNX exporter 0.1 for CNN classification models
  Note: The ONNX exporter is being continuously improved. You can try the latest changes by pulling from the main branch.
Amazon Labs Sockeye sequence-to-sequence framework 1.18.28 (for machine translation)
TensorRT 4.0.1
DALI 0.1.2 Beta

Driver Requirements

Release 18.08 is based on CUDA 9, which requires NVIDIA Driver release 384.xx.
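
As a rough illustration of the requirement above, the following Python sketch checks a driver version string against the 384 branch. The function name is hypothetical, and treating branches newer than 384 as acceptable is an assumption that these notes do not state. The version string could be obtained, for example, from `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.

```python
def driver_meets_requirement(version: str, minimum_branch: int = 384) -> bool:
    """Return True if the driver branch (the part before the first dot)
    is at least minimum_branch.

    Assumption: branches newer than 384 are treated as acceptable.
    """
    branch = version.strip().split(".")[0]
    if not branch.isdigit():
        raise ValueError(f"unrecognized driver version: {version!r}")
    return int(branch) >= minimum_branch


if __name__ == "__main__":
    # Example version strings chosen for illustration only.
    for candidate in ("384.81", "375.26"):
        print(candidate, driver_meets_requirement(candidate))
```

The check only looks at the branch number, because the notes give the requirement as 384.xx without naming a specific minor version.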

Key Features and Enhancements

This Optimized Deep Learning Framework release includes the following key features and enhancements.

The NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, container image version 18.08 is based on Apache MXNet 1.2.0, with all upstream changes from the Apache MXNet main branch up to and including PR 11545.
Latest version of cuDNN 7.2.1.
Latest version of DALI 0.1.2 Beta.
New demonstrator of increased mixed-precision ResNet-50 training speeds on Volta when processed end-to-end in the NHWC data layout. We are working to contribute the code improvements to upstream Apache MXNet through a PR. To evaluate in the meantime, run /opt/mxnet/examples/image_classification/train_imagenet_runner --batch-size N. Substitute 256 for N on systems with GPUs that have 32 GB of global memory, or 192 on systems with 16 GB GPUs, and prepare the ImageNet database as directed in nvidia-examples/imagenet_preparations. Training images should be pre-resized to 480 px on the shorter side, and validation images to 256 px on the shorter side. The script expects RecordIO files to be present in the /data/imagenet/train-480-val-256-recordio/ directory.
Ubuntu 16.04 with July 2018 updates
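
The batch-size guidance for the ResNet-50 demonstrator above can be sketched in Python as follows. This is a minimal illustration, not part of the container: the function and constant names are hypothetical, and only the two GPU memory sizes named in these notes are covered. Any other size raises an error, so no batch size is guessed.

```python
RUNNER = "/opt/mxnet/examples/image_classification/train_imagenet_runner"

# Batch sizes stated in the release notes, keyed by GPU global memory in GB.
BATCH_SIZE_BY_GPU_MEMORY_GB = {32: 256, 16: 192}


def build_runner_command(gpu_memory_gb: int) -> list:
    """Build the argument list for the demonstrator script.

    Hypothetical helper: it only maps the documented memory sizes to the
    documented batch sizes and does not launch anything.
    """
    if gpu_memory_gb not in BATCH_SIZE_BY_GPU_MEMORY_GB:
        raise ValueError(f"no documented batch size for {gpu_memory_gb} GB GPUs")
    batch_size = BATCH_SIZE_BY_GPU_MEMORY_GB[gpu_memory_gb]
    return [RUNNER, "--batch-size", str(batch_size)]


if __name__ == "__main__":
    # Print the command line for a 32 GB GPU instead of executing it.
    print(" ".join(build_runner_command(32)))
```

The returned list can be passed to `subprocess.run` inside the container once the RecordIO files are in place.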

Announcements

Starting with the next major CUDA release, we will no longer provide updated Python 2 containers and will only update Python 3 containers.

Known Issues

The multi-threaded nature of Apache MXNet model execution may result in a variable maximum usage of GPU global memory. Users who experience sporadic out-of-GPU-memory errors should experiment with setting the environment variable MXNET_GPU_WORKER_NTHREADS=1 as a possible remedy. We anticipate that a subsequent release will remove the need for this experimentation.
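
One way to apply the suggested setting from Python is sketched below. The helper name is hypothetical. The sketch assumes the variable has to be in the environment before the framework starts, so it is set before any `import mxnet`, or passed to a child process that runs the training script.

```python
import os


def with_single_gpu_worker_thread(env=None):
    """Return a copy of env with MXNET_GPU_WORKER_NTHREADS set to "1".

    If env is None, the current process environment is copied.
    """
    new_env = dict(os.environ if env is None else env)
    new_env["MXNET_GPU_WORKER_NTHREADS"] = "1"
    return new_env


if __name__ == "__main__":
    # Assumption: this runs before MXNet is imported in the same process.
    os.environ.update(with_single_gpu_worker_thread())
    print(os.environ["MXNET_GPU_WORKER_NTHREADS"])
```

The returned dictionary can also be passed as the `env` argument of `subprocess.run` to launch a training script with the setting applied, leaving the parent environment unchanged.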