Release 18.01
The container image of NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, release 18.01, is available.
The NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, container image version 18.01 is based on Apache MXNet 1.0.0.
Contents of the Optimized Deep Learning Framework container
This container image contains the complete source of the version of NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, in /opt/mxnet. It is pre-built and installed to the Python path.
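Because the framework is installed to the Python path, a quick import check inside the container can confirm which MXNet version is in use. The following is a minimal sketch, not an official verification tool: the helper names `installed_mxnet_version` and `matches_release` are hypothetical, and the assumption that the reported version string starts with `1.0.0` is taken from the base version stated above rather than from inspecting the image.

```python
import importlib


def installed_mxnet_version():
    """Return the MXNet version string, or None if MXNet cannot be imported."""
    try:
        mx = importlib.import_module("mxnet")
    except ImportError:
        return None
    return getattr(mx, "__version__", None)


def matches_release(version, expected_prefix="1.0.0"):
    """Check a version string against the base version named in these notes.

    The prefix comparison is an assumption: a vendor build may append a suffix.
    """
    return version is not None and version.startswith(expected_prefix)
```

Calling `matches_release(installed_mxnet_version())` inside the container returns `True` only if MXNet imports and reports a version beginning with `1.0.0`.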
The container also includes the following:
- Ubuntu 16.04
Note:
Container image 18.01-py2 contains Python 2.7; 18.01-py3 contains Python 3.5.
- NVIDIA CUDA 9.0.176 including CUDA® Basic Linear Algebra Subroutines library™ (cuBLAS) 9.0.282
- NVIDIA CUDA® Deep Neural Network library™ (cuDNN) 7.0.5
- NCCL 2.1.2 (optimized for NVLink™)
- ONNX exporter 0.1 for CNN classification models
Note:
The ONNX exporter is being continuously improved. You can try the latest changes by pulling from the main branch.
- Amazon Labs Sockeye sequence-to-sequence framework 1.16.2 (for machine translation)
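Since the two image tags listed above ship different Python versions, a script that must run in either image may want to confirm the interpreter it was started with. The sketch below assumes only the tag-to-version mapping given in the note above; the names `EXPECTED_PYTHON` and `python_matches_tag` are hypothetical.

```python
import sys

# Mapping taken from the note above: image tag -> (major, minor) Python version.
EXPECTED_PYTHON = {
    "18.01-py2": (2, 7),
    "18.01-py3": (3, 5),
}


def python_matches_tag(tag, version_info=None):
    """Return True if the interpreter version matches the one listed for `tag`.

    `version_info` defaults to the running interpreter; pass a tuple to test
    other values.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == EXPECTED_PYTHON[tag]
```

For example, `python_matches_tag("18.01-py3")` is expected to hold when run inside the `18.01-py3` image.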
Driver Requirements
Release 18.01 is based on CUDA 9, which requires NVIDIA Driver release 384.xx.
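The driver requirement can be checked on the host before starting the container. The sketch below (Python 3) queries the installed driver with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and compares the major version against 384. Treating 384 as a minimum, so that newer driver branches also pass, is an assumption made for this example; the function names are hypothetical.

```python
import shutil
import subprocess

MIN_DRIVER_MAJOR = 384  # from the driver requirement stated above


def driver_major(version_string):
    """Return the major component of a driver version such as '384.111'."""
    return int(version_string.strip().split(".")[0])


def meets_requirement(version_string, minimum=MIN_DRIVER_MAJOR):
    """Assumption: any driver with major version >= 384 is acceptable."""
    return driver_major(version_string) >= minimum


def query_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        output = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            universal_newlines=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    lines = output.strip().splitlines()
    # With several GPUs, nvidia-smi prints one line per device; use the first.
    return lines[0] if lines else None
```

A typical use is to call `query_driver_version()` and, if it returns a string, pass it to `meets_requirement` before launching the container.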
Key Features and Enhancements
This release of NVIDIA Optimized Deep Learning Framework, powered by Apache MXNet, includes the following key features and enhancements.
- Addition of Python 3 package
- Enhanced-performance cuDNN-based batched 1D convolutions (merged to upstream)
- Added MXNet-to-ONNX exporter for CNN classification models (tested with LeNet-5, ResNet-50, etc.).
- Added the Sockeye sequence-to-sequence framework, along with a German-to-English translation model, based on the WMT’15 dataset and translation task. This model's launch script should reproduce the OpenNMT reference model when trained until convergence.
- Latest version of cuBLAS
- Latest version of cuDNN
- Latest version of NCCL
- Ubuntu 16.04 with December 2017 updates
Known Issues
cuBLAS 9.0.282 regresses RNN seq2seq FP16 performance for a small subset of input sizes. As a workaround, revert to the 17.12 container.