Examples of Running Containers

This chapter walks you through the process of logging in to the NGC container registry, pulling and running a container, and using file storage and data disks for storage.

Logging Into the NGC Container Registry

You need to log in to the NGC container registry only if you want to access locked containers from the registry. Most of the NGC containers are freely available (unlocked) and do not require an NGC account or NGC API key.

Note: You do not need to log in to the NGC container registry if you are using either the NVIDIA Deep Learning for PyTorch VMI or the NVIDIA Deep Learning for TensorFlow VMI and intend to use the containers already built into the image.

If necessary, log in to the NGC container registry manually by running the following script from the VMI.

ngc-login.sh <your-NGC-API-key>

From this point you can run Docker commands and access locked NGC containers from the VM instance.
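After a successful login, Docker stores the registry credentials in ~/.docker/config.json. As a quick sanity check, you can confirm that an entry for nvcr.io exists. The sketch below is illustrative only; check_ngc_login is a hypothetical helper, not a script shipped with the VMI.

```shell
# Hypothetical helper (not part of the VMI): verify that Docker has
# stored credentials for nvcr.io, which it records in
# ~/.docker/config.json after a successful registry login.
check_ngc_login() {
  local config="${1:-$HOME/.docker/config.json}"
  if grep -q '"nvcr.io"' "$config" 2>/dev/null; then
    echo "logged in"
  else
    echo "not logged in"
  fi
}
```

If the helper reports "not logged in", re-run ngc-login.sh with your NGC API key before pulling locked containers.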

Preparing to Run Containers

The VMI includes a mechanism for exposing GPUs to Docker containers so that containers obtain the best performance. The mechanism depends on the NVIDIA VMI version, as follows.
  • Native GPU support with Docker-CE

    Requires Docker-CE 19.03 or later (Included in NVIDIA VMIs 19.10 and later)

  • NVIDIA Container Runtime with Docker-CE

    Included in NVIDIA VMIs prior to 19.10

Using Native GPU Support with Docker-CE

Use this method with NVIDIA VMIs version 19.10 and later.

Use docker run --gpus to run GPU-enabled containers.

  • Example using all GPUs
    $ docker run --gpus all ... 
  • Example using two GPUs
    $ docker run --gpus 2 ...
  • Examples using specific GPUs
    $ docker run --gpus "device=1,2" ...
    $ docker run --gpus "device=UUID-ABCDEF,1" ...

Using the NVIDIA Container Runtime with Docker-CE

Use this method with NVIDIA VMIs prior to version 19.10.

Use docker run and specify --runtime=nvidia.

 $ docker run --runtime=nvidia ...  
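The choice between the two command forms can be scripted from the Docker-CE version. The sketch below is a hypothetical helper, not part of the VMI: gpu_flags takes a version string such as "19.03.5" and returns --gpus all for Docker-CE 19.03 and later, or --runtime=nvidia for older releases.

```shell
# Hypothetical helper: choose the GPU flag for `docker run` based on
# the Docker-CE version. Docker-CE 19.03+ supports --gpus; earlier
# versions need the NVIDIA runtime via --runtime=nvidia.
gpu_flags() {
  local ver="$1" major minor
  major="${ver%%.*}"              # e.g. "19" from "19.03.5"
  minor="${ver#*.}"; minor="${minor%%.*}"  # e.g. "03"
  if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "$minor" -ge 3 ]; }; then
    echo "--gpus all"
  else
    echo "--runtime=nvidia"
  fi
}
```

On a live system, the version string could be extracted from the output of docker --version before calling the helper.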

Running a Container

This section explains the basic process for running a container on the NVIDIA Deep Learning for TensorFlow VMI, the NVIDIA Deep Learning for PyTorch VMI, and the basic NVIDIA Deep Learning VMI.

Running the Built-in TensorFlow Container

To run the TensorFlow container in a VM created from the NVIDIA Deep Learning for TensorFlow VMI, refer to the release notes for the correct tag to use, then enter the appropriate command for your VMI version.

On NVIDIA VMIs version 19.10 and later

docker run --gpus all --rm -it nvcr.io/nvidia/tensorflow:<tag>  

On NVIDIA VMIs prior to version 19.10

docker run --runtime=nvidia --rm -it nvcr.io/nvidia/tensorflow:<tag>  

Running the Built-in PyTorch Container

To run the PyTorch container in a VM created from the NVIDIA Deep Learning for PyTorch VMI, refer to the release notes for the correct tag to use, then enter the appropriate command for your VMI version.

On NVIDIA VMIs version 19.10 and later

docker run --gpus all --rm -it nvcr.io/nvidia/pytorch:<tag>  

On NVIDIA VMIs prior to version 19.10

docker run --runtime=nvidia --rm -it nvcr.io/nvidia/pytorch:<tag>

Running a Container from the NGC Container Registry

To run containers from the NGC container registry:

  1. If necessary, log in to the NGC container registry as explained in the previous section.

  2. Enter the following commands.

docker pull nvcr.io/nvidia/<container-image>:<tag>

On NVIDIA VMIs version 19.10 and later

docker run --gpus all --rm -it nvcr.io/nvidia/<container-image>:<tag>  

On NVIDIA VMIs prior to version 19.10

docker run --runtime=nvidia --rm -it nvcr.io/nvidia/<container-image>:<tag>
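The pull-and-run sequence above follows a fixed pattern, so it can be generated for inspection before being executed. The sketch below is a hypothetical helper, not part of the VMI: ngc_run_cmds prints the pull and run commands for a given image and tag, taking "new" for VMIs 19.10 and later or "old" for earlier VMIs.

```shell
# Hypothetical helper: print (not execute) the pull/run commands for an
# NGC container image and tag. "new" selects --gpus all (VMIs 19.10+);
# "old" selects --runtime=nvidia (earlier VMIs).
ngc_run_cmds() {
  local image="$1" tag="$2" vmi="$3" flag
  case "$vmi" in
    new) flag="--gpus all" ;;
    old) flag="--runtime=nvidia" ;;
    *)   echo "usage: ngc_run_cmds <image> <tag> new|old" >&2; return 1 ;;
  esac
  echo "docker pull nvcr.io/nvidia/${image}:${tag}"
  echo "docker run ${flag} --rm -it nvcr.io/nvidia/${image}:${tag}"
}
```

For example, ngc_run_cmds tensorflow 18.08-py3 new prints the two commands used in the TensorFlow example later in this chapter.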

Example: MNIST Training Run Using PyTorch Container

Once logged in to the NVIDIA GPU Cloud Image instance, you can run the MNIST example under PyTorch.

Note that the PyTorch example will download the MNIST dataset from the web.

  1. Pull and run the PyTorch container:
    docker pull nvcr.io/nvidia/pytorch:18.02-py3

    On NVIDIA VMIs version 19.10 and later

    docker run --gpus all --rm -it nvcr.io/nvidia/pytorch:18.02-py3

    On NVIDIA VMIs prior to version 19.10

    docker run --runtime=nvidia --rm -it nvcr.io/nvidia/pytorch:18.02-py3
  2. Run the MNIST example:
    cd /opt/pytorch/examples/mnist
    python main.py

Example: MNIST Training Run Using TensorFlow Container

Once logged in to the NVIDIA GPU Cloud image, you can run the MNIST example under TensorFlow.

Note that the TensorFlow built-in example will pull the MNIST dataset from the web.

  1. Pull and run the TensorFlow container.
    docker pull nvcr.io/nvidia/tensorflow:18.08-py3

    On NVIDIA VMIs version 19.10 and later

    docker run --gpus all --rm -it nvcr.io/nvidia/tensorflow:18.08-py3  

    On NVIDIA VMIs prior to version 19.10

    docker run --runtime=nvidia --rm -it nvcr.io/nvidia/tensorflow:18.08-py3
  2. Run the MNIST_with_summaries example, which follows this tutorial: https://www.tensorflow.org/get_started/mnist/beginners.
    cd /opt/tensorflow/tensorflow/examples/tutorials/mnist
    python mnist_with_summaries.py