Caffe2 Release 18.03

The NVIDIA container image of Caffe2, release 18.03, is available.

Contents of Caffe2

This container image contains the complete source of this version of Caffe2 in /opt/caffe2. Caffe2 is pre-built and installed into the /opt/caffe2/[binaries,lib] directories in the container image. The container also includes the following:

Ubuntu 16.04

Note: Container image 18.03-py2 contains Python 2.7; 18.03-py3 contains Python 3.5.
NVIDIA CUDA 9.0.176 (see Errata section and 2.1) including CUDA® Basic Linear Algebra Subroutines library™ (cuBLAS) 9.0.333 (see section 2.3.1)
NVIDIA CUDA® Deep Neural Network library™ (cuDNN) 7.1.1
NCCL 2.1.2 (optimized for NVLink™)
OpenMPI™ 1.10.3

Driver Requirements

Release 18.03 is based on CUDA 9, which requires NVIDIA Driver release 384.xx.

Key Features and Enhancements

This Caffe2 release includes the following key features and enhancements.

Caffe2 container image version 18.03 is based on Caffe2 0.8.1.
When using the ImageNet training scripts in nvidia-examples on multiple GPUs, the metrics printed in the log for weak scaling were wrong. The number of epochs the model was trained for was also wrong. Both issues are fixed in this release.
Gradient clipping used to be done by executing a series of small operators that compute a ratio by which the learning rate is scaled, which has the same effect as gradient clipping for SGD optimizers. However, that method is incorrect for optimizers that use momentum or history, such as AdaGrad and Adam. This release adds a new operator, ClipByGlobalNorm, that explicitly clips the gradient. This operator also supports mixed precision for inputs and outputs.
Caffe2 already supported cuDNN RNN, but that integration did not provide enough features and flexibility to use cuDNN RNN in seq2seq. This release improves the integration and enables cuDNN RNN in the seq2seq example in nvidia-examples.
Incorporated GitHub Caffe2 code as of February 16, 2018.
Latest version of cuBLAS 9.0.333
Latest version of cuDNN 7.1.1
Ubuntu 16.04 with February 2018 updates
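
The gradient clipping item above can be illustrated with a small sketch. This is not the implementation of the ClipByGlobalNorm operator and does not use the Caffe2 API. It is a pure-Python illustration, with hypothetical helper names (global_norm, clip_by_global_norm, adagrad_update) and made-up gradient values, of why scaling the learning rate matches explicit clipping for plain SGD but not for an optimizer that keeps a gradient history, assuming the standard AdaGrad update rule.

```python
import math


def global_norm(grads):
    """L2 norm of all gradient entries, treated as one flat vector."""
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))


def clip_by_global_norm(grads, max_norm):
    """Return gradients rescaled so their global norm is at most max_norm."""
    norm = global_norm(grads)
    if norm <= max_norm:
        return [list(tensor) for tensor in grads]
    ratio = max_norm / norm
    return [[g * ratio for g in tensor] for tensor in grads]


def adagrad_update(grad, history, lr, eps=1e-8):
    """One scalar AdaGrad step; returns (weight delta, new history)."""
    history = history + grad * grad
    return -lr * grad / (math.sqrt(history) + eps), history


# Illustrative values only: two "tensors" whose global norm is 13.0.
grads = [[3.0, 4.0], [12.0]]
max_norm = 6.5
lr = 0.1

ratio = min(1.0, max_norm / global_norm(grads))
clipped = clip_by_global_norm(grads, max_norm)

# Plain SGD: scaling the learning rate and clipping the gradient agree.
sgd_scaled_lr = -(lr * ratio) * grads[1][0]
sgd_clipped = -lr * clipped[1][0]

# AdaGrad: the history accumulates the gradient it is given, so feeding
# the unclipped gradient with a scaled learning rate gives a different
# step than feeding the clipped gradient.
ada_scaled_lr, _ = adagrad_update(grads[1][0], 0.0, lr * ratio)
ada_clipped, _ = adagrad_update(clipped[1][0], 0.0, lr)

print("SGD:    ", sgd_scaled_lr, sgd_clipped)
print("AdaGrad:", ada_scaled_lr, ada_clipped)
```

In this sketch the two SGD updates coincide, while the two AdaGrad updates differ by roughly the clipping ratio, because the unclipped gradient inflates the accumulated history.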

Announcements

Starting with the next major CUDA release, we will no longer provide Python 2 containers and will only maintain Python 3 containers.

Known Issues

In nvidia-examples/seq2seq, there is a bug that causes training to skip one epoch when a snapshot is loaded. This bug will be fixed in 18.04.