Abstract

This NVIDIA Docker Containers for Deep Learning Frameworks Quick Start Guide provides the minimal first-step instructions for pulling (downloading) and running a container.

1. Introduction

This guide covers the basic instructions needed to pull (download) a container and run the container.

If you have an NVIDIA® DGX™ Cloud Services account, you can log in and pull containers from the DGX™ Container Registry. You can then run neural networks, deploy deep learning models, and perform AI analytics in these containers on DGX-1™.

2. Key Concepts

Before issuing the pull and run commands, ensure that you are familiar with the following concepts.
container-name
The name of the container in the DGX™ Container Registry that you want to run.
nvcr.io
The name of the container registry, which for the DGX™ Container Registry is nvcr.io.
nvidia
The name of the space within the registry that contains the container. For containers provided by NVIDIA®, the registry space is nvidia.
repository
Repositories are collections of containers of the same name, but distinguished from each other by their tags. Think of it as the main container name.
tag
A label of the form yy.mm that indicates the year and month in which that version of the container was released.
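Putting these concepts together, a fully qualified image reference has the form nvcr.io/nvidia/<repository>:<tag>. The following sketch uses shell parameter expansion to split an example reference into those parts; the variable names are illustrative only:

```shell
# Split a full image reference of the form
# <registry>/<registry-space>/<repository>:<tag> into its components.
ref="nvcr.io/nvidia/caffe:17.03"

registry="${ref%%/*}"          # everything before the first "/": nvcr.io
rest="${ref#*/}"               # nvidia/caffe:17.03
space="${rest%%/*}"            # registry space: nvidia
repo_tag="${rest#*/}"          # caffe:17.03
repository="${repo_tag%%:*}"   # repository: caffe
tag="${repo_tag##*:}"          # release tag (yy.mm): 17.03

echo "$registry $space $repository $tag"
```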

3. Installing Docker and NVIDIA Docker

To enable portability in Docker images that leverage GPUs, NVIDIA® developed nvidia-docker, an open-source project that provides a command line tool to mount the user mode components of the NVIDIA driver and the GPUs into the Docker container at launch.

By default, Docker containers run with root privileges, so consult your IT department for assistance on how to properly set up Docker to conform to your organization's security policies.

These instructions describe command line entries made from the DGX-1™ Linux shell.

Security

The instructions below are provided as a convenient method for accessing Docker containers; however, the resulting docker group is equivalent to the root user, which may violate your organization's security policies. See the Docker Daemon Attack Surface for information on how this can impact security in your system. Always consult your IT department to make sure the installation is in accordance with the security policies of your data center.
Ensure your environment meets the prerequisites before installing Docker. For more information, see Getting Started with Docker.
  1. Install Docker.
    $ sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 \
      --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
    $ echo "deb https://apt.dockerproject.org/repo ubuntu-trusty main" \
      | sudo tee /etc/apt/sources.list.d/docker.list
    $ sudo apt-get update
    $ sudo apt-get -y install docker-engine=1.12.6-0~ubuntu-trusty
  2. Edit the /etc/default/docker file.
    To prevent IP address conflicts between Docker and the DGX-1.
    To ensure that the DGX-1 can access the network interfaces for nvidia-docker containers, the containers should be configured to use a subnet distinct from other network resources used by the DGX-1. By default, Docker uses the 172.17.0.0/16 subnet. If addresses within this range are already in use on the DGX-1 network, you can change the nvidia-docker network by modifying either the /etc/docker/daemon.json file or the /etc/systemd/system/docker.service.d/docker-override.conf file, specifying the DNS server, bridge IP address, and container address range to be used by nvidia-docker containers.
    For example, if your DNS server exists at IP address 10.10.254.254, and the 192.168.0.0/24 subnet is not otherwise needed by the DGX-1, you can add the following line:
    DOCKER_OPTS="--dns 10.10.254.254 --bip=192.168.0.1/24 --fixed-cidr=192.168.0.0/24"
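If you configure this through /etc/docker/daemon.json instead, the equivalent settings would look like the following sketch, using the same placeholder addresses as above:

```json
{
    "dns": ["10.10.254.254"],
    "bip": "192.168.0.1/24",
    "fixed-cidr": "192.168.0.0/24"
}
```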
    To use the Overlay2 storage driver.
    The Overlay2 storage driver is preferable to the default AUFS storage driver. Add the following option to the DOCKER_OPTS line described above.
    --storage-driver=overlay2
    If you are using the base OS, the dgx-docker-options package already sets the storage driver to Overlay2 by default. However, the following list shows the Docker options that NVIDIA recommends:
    • use the Overlay2 storage driver
    • disable the use of legacy registries
    • increase the stack size to 64 MB
    • unlimited locked memory size
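With those recommendations applied, the DOCKER_OPTS line might look like the following sketch. The --disable-legacy-registry and --default-ulimit flags shown here are one plausible way to express the last three items, and the DNS and subnet values are the placeholders from the earlier example:

```
DOCKER_OPTS="--dns 10.10.254.254 --bip=192.168.0.1/24 --fixed-cidr=192.168.0.0/24 \
  --storage-driver=overlay2 --disable-legacy-registry \
  --default-ulimit stack=67108864 --default-ulimit memlock=-1"
```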
    To use proxies to access external websites or repositories (if applicable).
    If your network requires use of a proxy, then edit the /etc/apt/apt.conf.d/proxy.conf file and make sure the following lines are present:
    Acquire::http::proxy "http://<username>:<password>@<host>:<port>/";
    Acquire::ftp::proxy "ftp://<username>:<password>@<host>:<port>/";
    Acquire::https::proxy "https://<username>:<password>@<host>:<port>/";

    If you will be using the DGX-1 in base OS mode, then after installing Docker on the system, refer to the information at Control and configure Docker with systemd. This is to ensure that Docker is able to access the DGX Container Registry through the proxy.
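For reference, a systemd proxy drop-in for the Docker daemon typically looks like the following sketch; the file name http-proxy.conf and the proxy address are placeholders, and the daemon must be reloaded and restarted afterwards:

```
# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://<host>:<port>/"
Environment="HTTPS_PROXY=http://<host>:<port>/"

# Then reload and restart the daemon:
#   $ sudo systemctl daemon-reload
#   $ sudo systemctl restart docker
```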

    Save and close the /etc/default/docker file when done.

  3. Restart Docker with the new configuration.
    $ sudo service docker restart
  4. Install NVIDIA Docker.
    1. Install nvidia-docker and nvidia-docker-plugin.
      $ wget -P /tmp \
        https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
      $ sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb
    2. Choose which users will have access to the docker group. This is required for users who want to be able to launch containers with docker and nvidia-docker. To add a user to the docker group, first see which groups the user already belongs to.
      $ groups <username>
      1. If the user is not part of the docker group, then they can be easily added using the following command.
        Note: This command requires sudo access, therefore, this step should be performed by a system administrator.
        $ sudo usermod -a -G docker <username>
    3. If there are no user accounts on the machine, then add a user by performing the following steps:
      1. Create a user account to associate with the docker group. In the following steps, replace <user1> with the actual user name.
      2. Add the user.
        $ sudo useradd <user1>
      3. Setup the password.
        $ sudo passwd <user1>
        Enter a password at the prompts:
        Enter new UNIX password:
        Retype new UNIX password:
        passwd: password updated successfully
      4. Add the user to the docker group.
        $ sudo usermod -a -G docker <user1>
      5. Switch to the new user.
        $ su <user1>

3.1. Getting Your NVIDIA DGX Cloud Services API Key

Your NVIDIA DGX Cloud Services API key authenticates your access to DGX Container Registry from the command line.

CAUTION:

You need to generate your NVIDIA DGX Cloud Services API key only once. Anyone with your API key can access all the services and resources to which you are entitled through your NVIDIA DGX Cloud Services account. Therefore, keep your API key secret and do not share it or store it where others can see or copy it.

  1. Use a web browser to log in to your NVIDIA DGX Cloud Services account on the DGX Cloud Services website.
  2. In the top right corner, click your user account icon and select API KEY.
  3. In the API Key page that opens, click GENERATE API KEY.
    Note: If you misplace your API key, you can get a new API key from the DGX Cloud Services website whenever you need it. When you get your API key, a new key is generated, which invalidates any keys you may have obtained previously.
  4. In response to the warning that your old API key will become invalid, click CONTINUE. Your NVIDIA DGX Cloud Services API key is displayed with examples of how to use it.
    Tip: You can copy your API key to the clipboard by clicking the Copy icon to the right of the API key.

3.2. Accessing DGX Container Registry

You can access the DGX™ Container Registry by running a Docker command from your client computer. You are not limited to using your NVIDIA DGX platform to access the DGX™ Container Registry. You can use any Linux computer with Internet access on which Docker is installed.
Before accessing DGX™ Container Registry, ensure that the following prerequisites are met:
  • Your NVIDIA® DGX™ Cloud Services account is activated.
  • You have an NVIDIA® DGX™ Cloud Services API key for authenticating your access to DGX™ Container Registry.
  • You are logged in to your client computer as an administrator user.

An alternate approach for enabling other users to run containers without giving them sudo privilege, and without having to type sudo before each Docker command, is to add each user to the docker group, with the command:

$ sudo usermod -aG docker $USER

While this approach is more convenient and commonly used, it is less secure, because any user who can send commands to the Docker engine can escalate privileges and run root-level operations. If you choose to use this method, add only users whom you would trust with root privileges to the docker group.

  1. Log in to the DGX™ Container Registry.
    $ docker login nvcr.io
  2. When prompted for your user name, enter the following text:
    $oauthtoken

    The $oauthtoken user name is a special user name that indicates that you will authenticate with an API key and not a user name and password.

  3. When prompted for your password, enter your NVIDIA® DGX™ Cloud Services API key as shown in the following example.
    Username: $oauthtoken
    Password: eK4buTUvM2x5vGsPnPv1elYGnpz9RmIpqtm67Qxx
    Tip: When you get your API key, copy it to the clipboard so that you can paste the API key into the command shell when you are prompted for your password.
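If you log in from a script rather than at the prompts, the same credentials can be supplied on the command line; a sketch, assuming the API key has been exported in an environment variable named NGC_API_KEY (the single quotes prevent the shell from expanding $oauthtoken):

```
$ docker login -u '$oauthtoken' -p "$NGC_API_KEY" nvcr.io
```

Note that passing the key as a command-line argument can expose it in the shell history, so interactive entry is preferable where possible.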

4. Pulling a Container

You can pull (download) an NVIDIA container that is already built, tested, tuned, and ready to run. Each NVIDIA deep learning container includes the code required to build the framework so that you can make changes to the internals. The containers do not contain sample datasets or sample model definitions unless they are included with the source for the framework.

NVIDIA provides a number of containers for download from the DGX™ Container Registry. If your organization has provided you with access to any custom containers, you can download them as well.

The framework source is located in /opt/<framework> in each container, where <framework> is the name of the framework.

You can use the docker pull command to pull images from the NVIDIA DGX Container Registry.

Before pulling an NVIDIA Docker container, ensure that the following prerequisites are met:
  • You have read access to the registry space that contains the container.
  • You are logged into DGX™ Container Registry as explained in Accessing DGX™ Container Registry.
  • You are a member of the docker group, which enables you to use docker commands.
Tip: To browse the available containers in the DGX™ Container Registry, use a web browser to log in to your NVIDIA® DGX™ Cloud Services account on the DGX Cloud Services website.

To pull a container from the registry, use the following procedure.

  1. Run the command to download the container that you want from the registry.
    $ docker pull nvcr.io/nvidia/<repository>:<tag>
    where nvcr.io is the name of the container registry, nvidia is the registry space, and <repository>:<tag> identifies the container and its release. For example, you could issue the following command.
    $ docker pull nvcr.io/nvidia/caffe:17.03
    In this case, the container is being pulled from the caffe repository and is version 17.03 (the tag is 17.03).
  2. To confirm that the container was downloaded, list the Docker images on your system.
    $ docker images

    After pulling a container, you can run jobs in the container to run neural networks, deploy deep learning models, and perform AI analytics.
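To check for one image without scanning the whole list, docker images accepts a repository name as a filter; for example:

```
$ docker images nvcr.io/nvidia/caffe
```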

4.1. Pulling a Container from NVIDIA Container Registry

A Docker registry is the service that stores Docker images. The service can be on the internet, on the company intranet, or on a local machine. For example, nvcr.io is the location of the NVIDIA DGX Container Registry for NVIDIA Docker images.

All nvcr.io Docker images use explicit version tags to avoid the ambiguous versioning that can result from using the latest tag. For example, a locally tagged latest version of an image may actually override a different latest version in the registry.

For more information pertaining to your specific container, refer to the /workspace/README.md file inside the container.

Before you can pull a container from the DGX Container Registry, you must have Docker installed. Ensure that you have installed Docker and NVIDIA Docker. For more information, see Installing Docker and NVIDIA Docker.

The following task assumes:
  1. You have a DGX-1 and it is connected to the network.
  2. Your DGX-1 has Docker installed.
  3. You have access to a browser to go to https://compute.nvidia.com and your DGX Cloud Services account is activated.
  4. You now want to pull a container onto your client machine.
  5. You want to push the container onto your private registry.
  6. You want to pull and run the container on your DGX-1. You will need to have a terminal window open to an SSH session on your DGX-1 to complete this step.
  1. Open a web browser and log onto DGX Cloud Services.
  2. Select the container that you want to pull from the left navigation. For example, click caffe.
  3. In the Tags section, locate the release that you want to run. For example, hover over release 17.03.
  4. In the Actions column, hover over the Download icon. Click the Download icon to display the docker pull command.
  5. Copy the docker pull command and click Close.
  6. Open a command prompt and paste the docker pull command that you copied.
    The container image download begins. Ensure that the pull completes successfully.
  7. After you have the Docker container file on your local system, load the container into your local Docker registry.
  8. Verify that the image is loaded into your local Docker registry.
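Steps 7 and 8 above can be sketched with docker save and docker load; the archive file name caffe-17.03.tar is a placeholder:

```
$ docker save -o caffe-17.03.tar nvcr.io/nvidia/caffe:17.03   # on the client machine
$ docker load -i caffe-17.03.tar                              # on the target system
$ docker images nvcr.io/nvidia/caffe                          # verify the image is present
```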

5. Running a Container

To run a container, you must issue the nvidia-docker run command, specifying the registry, repository, and tags.

Before you can run an NVIDIA Docker deep learning framework container, you must have nvidia-docker installed. For more information, see Installing Docker and NVIDIA Docker.
  1. As a user, run the container interactively.
    $ nvidia-docker run --rm -ti nvcr.io/nvidia/<framework>:<tag>

    The following example runs the December 2016 release (16.12) of the NVIDIA Caffe container in interactive mode. The container is automatically removed when the user exits the container.

    $ nvidia-docker run --rm -ti nvcr.io/nvidia/caffe:16.12
    
    ===========
    == Caffe ==
    ===========
    
    NVIDIA Release 16.12 (build 6217)
    
    Container image Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
    Copyright (c) 2014, 2015, The Regents of the University of California (Regents)
    All rights reserved.
    
    Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
    NVIDIA modifications are covered by the license terms that apply to the underlying project or file.
    root@df57eb8e0100:/workspace#
  2. From within the container, start the job that you want to run. The precise command to run depends on the deep learning framework in the container that you are running and the job that you want to run. For details see the /workspace/README.md file for the container.

    The following example runs the caffe time command on one GPU to measure the execution time of the bvlc_alexnet deploy.prototxt model.

    # caffe time -model models/bvlc_alexnet/deploy.prototxt -gpu 0
  3. Optional: Run the December 2016 release (16.12) of the same NVIDIA Caffe container, but in non-interactive mode.
    $ nvidia-docker run --rm nvcr.io/nvidia/caffe:16.12 caffe time \
        -model /workspace/models/bvlc_alexnet/deploy.prototxt -gpu 0
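Because a container started with --rm is deleted when it exits, datasets and results are usually kept on the host and mounted into the container with the -v option; a sketch, where /raid/datasets is a placeholder host path:

```
$ nvidia-docker run --rm -ti -v /raid/datasets:/datasets nvcr.io/nvidia/caffe:16.12
```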

Notices

Notice

THE INFORMATION IN THIS GUIDE AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION REFERENCED IN THIS GUIDE IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the product described in this guide shall be limited in accordance with the NVIDIA terms and conditions of sale for the product.

THE NVIDIA PRODUCT DESCRIBED IN THIS GUIDE IS NOT FAULT TOLERANT AND IS NOT DESIGNED, MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE, AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE (INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.

NVIDIA makes no representation or warranty that the product described in this guide will be suitable for any specified use without further testing or modification. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and fit for the application planned by customer and to do the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this guide. NVIDIA does not accept any liability related to any default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this guide, or (ii) customer product designs.

Other than the right for customer to use the information in this guide with the product, no other license, either expressed or implied, is hereby granted by NVIDIA under this guide. Reproduction of information in this guide is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without alteration, and is accompanied by all associated conditions, limitations, and notices.

Trademarks

NVIDIA, the NVIDIA logo, and cuBLAS, CUDA, cuDNN, cuFFT, cuSPARSE, DIGITS, DGX, DGX-1, Jetson, Kepler, NVIDIA Maxwell, NCCL, NVLink, Pascal, Tegra, TensorRT, and Tesla are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.