Ways To Run DIGITS

You can run DIGITS in the following ways:

  1. Running DIGITS
  2. Running DIGITS from Developer Zone
  3. Docker®. For more information, see DIGITS on GitHub.

Running DIGITS

Before you begin

Before you can run an NGC deep learning framework container, your Docker environment must support NVIDIA GPUs. To run a container, issue the appropriate command as explained in the Running A Container chapter in the NVIDIA Containers And Frameworks User Guide and specify the registry, repository, and tags.
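The procedure in this section uses a different launch command depending on whether Docker is at release 19.03 or later. As a rough sketch of that check, the helper below maps a Docker version string to the matching launcher. The function name `digits_launcher` is hypothetical and is not part of Docker or DIGITS; only the 19.03 threshold comes from this guide.

```shell
# Hypothetical helper: given a Docker version string such as "20.10.7",
# print the launcher prefix that matches the procedure in this section.
digits_launcher() {
    major="${1%%.*}"      # text before the first dot
    rest="${1#*.}"        # text after the first dot
    minor="${rest%%.*}"   # text before the next dot
    minor="${minor#0}"    # drop one leading zero so "03" is read as 3
    minor="${minor:-0}"   # fall back to 0 if nothing is left
    if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "$minor" -ge 3 ]; }; then
        echo "docker run --gpus all"
    else
        echo "nvidia-docker run"
    fi
}

digits_launcher 19.03.5
digits_launcher 18.09.2

# On a real host, the server version can be read from Docker itself:
#   digits_launcher "$(docker version --format '{{.Server.Version}}')"
```

The sketch only compares the major and minor fields, which is enough to separate 19.03 and later from older releases.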

About this task

On a system with GPU support for NGC containers, the following occurs when running a container:

  • The Docker engine loads the image into a container which runs the software.
  • You define the runtime resources of the container by including additional flags and settings that are used with the command. These flags and settings are described in Running A Container.
  • The GPUs are explicitly defined for the Docker container (the default is all GPUs, but specific GPUs can be selected with the NVIDIA_VISIBLE_DEVICES environment variable). Starting in Docker 19.03, follow the steps as outlined below. For more information, refer to the nvidia-docker documentation.

The method implemented in your system depends on the DGX OS version installed (for DGX systems), the specific NGC Cloud Image provided by a Cloud Service Provider, or the software that you have installed in preparation for running NGC containers on TITAN PCs, Quadro PCs, or vGPUs.
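With Docker 19.03 or later, GPUs can also be selected through the `--gpus` flag itself instead of the default of all GPUs. The sketch below prints a `docker run` command for either case so it can be reviewed before use. The function name `digits_gpu_cmd` is hypothetical, the `21.09-tensorflow` tag is taken from the pull command in this section, and the `device=` form is an assumption based on Docker's own `--gpus` syntax rather than on this guide.

```shell
# Hypothetical helper: print a paste-able `docker run` command for DIGITS.
# $1 is a comma-separated list of GPU indices; an empty or missing
# argument selects all GPUs.
digits_gpu_cmd() {
    image="nvcr.io/nvidia/digits:21.09-tensorflow"
    if [ -z "${1:-}" ]; then
        gpus="all"
    else
        # The inner double quotes must reach Docker so that the
        # comma-separated list stays inside a single device= field.
        gpus="'\"device=$1\"'"
    fi
    printf 'docker run --gpus %s -it --rm %s\n' "$gpus" "$image"
}

digits_gpu_cmd        # all GPUs
digits_gpu_cmd 0,1    # only GPUs 0 and 1
```

The helper only prints the command; nothing is launched until you paste the printed line into a shell.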

Procedure

  1. Issue the command for the applicable release of the container that you want. The following command pulls the 21.09 TensorFlow-based container; substitute the tag of the release you want.

    docker pull nvcr.io/nvidia/digits:21.09-tensorflow

  2. Open a command prompt and paste the pull command. The pulling of the container image begins. Ensure the pull completes successfully before proceeding to the next step.
  3. Run the application. If you have Docker 19.03 or later, a typical command to launch the container is:

    docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/digits:<xx.xx>-<framework>


    If you have a Docker release earlier than 19.03, a typical command to launch the container is:

    nvidia-docker run -it --rm -v local_dir:container_dir nvcr.io/nvidia/digits:<xx.xx>-<framework>


    1. To run the server as a daemon and expose port 5000 in the container to port 8888 on your host:

      docker run --gpus all --name digits -d -p 8888:5000 nvcr.io/nvidia/digits:<xx.xx>-<framework>

      Note: DIGITS 6.1.1 uses port 5000 by default.

    2. To mount one local directory containing your data (read-only), and another for writing your DIGITS jobs:

      docker run --gpus all --name digits -d -p 8888:5000 -v /home/username/data:/data:ro -v /home/username/digits-jobs:/workspace/jobs nvcr.io/nvidia/digits:<xx.xx>-<framework>

      Note: In order to share data between ranks, NVIDIA® Collective Communications Library™ (NCCL) may require shared system memory for IPC and pinned (page-locked) system memory resources. The operating system’s limits on these resources may need to be increased accordingly. Refer to your system’s documentation for details. In particular, Docker containers default to limited shared and pinned memory resources. When using NCCL inside a container, it is recommended that you increase these resources by issuing:

      --shm-size=1g --ulimit memlock=-1

      in the command line to:

      docker run --gpus all
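To show how the options from the procedure fit together, the sketch below prints one daemon-mode command that combines the port mapping, the two volume mounts, and the NCCL memory flags. The function name `digits_daemon_cmd` and its three arguments are hypothetical, and the paths and tag in the sample call are the example values used in this section. The `:ro` suffix is Docker's standard way to mount a volume read-only.

```shell
# Hypothetical helper: print the full daemon-mode command for review.
# Arguments: host data directory, host jobs directory, image tag.
# Paths containing spaces would need extra quoting before pasting.
digits_daemon_cmd() {
    data_dir="$1"
    jobs_dir="$2"
    tag="$3"
    # printf reuses the format for every argument, so each word is
    # printed followed by a single space.
    printf '%s ' \
        docker run --gpus all \
        --shm-size=1g --ulimit memlock=-1 \
        --name digits -d -p 8888:5000 \
        -v "${data_dir}:/data:ro" \
        -v "${jobs_dir}:/workspace/jobs"
    printf '%s\n' "nvcr.io/nvidia/digits:${tag}"
}

digits_daemon_cmd /home/username/data /home/username/digits-jobs 21.09-tensorflow
```

Because the helper only prints the command, you can check the mounts and tag before pasting it into a shell. With this port mapping, the DIGITS server listening on port 5000 in the container is published on port 8888 of the host.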


Running DIGITS from Developer Zone

About this task

For more information about downloading, running, and using DIGITS, see: NVIDIA DIGITS: Interactive Deep Learning GPU Training System.

Last updated on Sep 27, 2021.