Running PaddlePaddle


Before you can run an NGC deep learning framework container, your Docker environment must support NVIDIA GPUs. To run a container, issue the appropriate command as explained in Running A Container and specify the registry, repository, and tags.

About this task

On a system with GPU support for NGC containers, the following occurs when you run a container:

  • The Docker engine loads the image into a container that runs the software.
  • You define the container's runtime resources by including the additional flags and settings that are used with the command.

    These flags and settings are described in Running A Container.

  • The GPUs that are available to the Docker® container are defined explicitly. By default, all GPUs are visible; to expose only specific GPUs, set the NVIDIA_VISIBLE_DEVICES environment variable.

    For more information, refer to the nvidia-docker documentation.
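
As a rough illustration of how an NVIDIA_VISIBLE_DEVICES value might be interpreted, the following Python sketch classifies a value as all GPUs, no GPUs, or an explicit list. The helper name describe_visible_devices is hypothetical, and the treatment of "all", "none", "void", and comma-separated entries is an assumption based on the NVIDIA Container Toolkit documentation; refer to that documentation for the authoritative behavior.

```python
import os


def describe_visible_devices(value):
    """Classify an NVIDIA_VISIBLE_DEVICES value (hypothetical helper).

    Returns "all", "none", or a list of device identifiers
    (GPU indices or UUIDs, kept as strings).
    """
    # Assumption: an unset, empty, "void", or "none" value exposes no GPUs.
    if value is None:
        return "none"
    cleaned = value.strip()
    if cleaned in ("", "void", "none"):
        return "none"
    if cleaned == "all":
        return "all"
    # Anything else is treated as a comma-separated list of devices.
    return [item.strip() for item in cleaned.split(",") if item.strip()]


if __name__ == "__main__":
    # Inside a running container, this reports what the container was given.
    print(describe_visible_devices(os.environ.get("NVIDIA_VISIBLE_DEVICES")))
```

Running the script inside a container is one quick way to confirm which GPUs were requested for it.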


    Starting in Docker 19.03, complete the steps below.

The method implemented in your system depends on the DGX OS version that you installed (for DGX systems), the NGC Cloud Image that was provided by a Cloud Service Provider, or the software that you installed to prepare to run NGC containers on TITAN PCs, Quadro PCs, or NVIDIA Virtual GPUs (vGPUs).


  1. Issue the command for the applicable release of the container that you want.

    The following command assumes that you want to pull the latest container.


    docker pull

  2. Open a command prompt and paste the pull command.

    Ensure that the pull process successfully completes before you proceed to step 3.

  3. Run the container image.
    • If you have Docker 19.03 or later, a typical command to launch the container is:

      docker run --gpus all -it --rm

    • If you have Docker 19.02 or earlier, a typical command to launch the container is:

      nvidia-docker run -it --rm
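
The choice between the two launch commands above can be sketched in Python as follows. This is a minimal sketch, not an official tool: the function name launch_command is hypothetical, it assumes a version string of the form "major.minor.patch", and the image name is deliberately left for the caller to append.

```python
def launch_command(docker_version):
    """Return the launch command prefix for a Docker version such as "19.03.5".

    Hypothetical helper: the caller appends the image name and any other flags.
    """
    parts = docker_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    if (major, minor) >= (19, 3):
        # Docker 19.03 or later: GPU support through the --gpus flag.
        return ["docker", "run", "--gpus", "all", "-it", "--rm"]
    # Earlier releases: use the nvidia-docker wrapper.
    return ["nvidia-docker", "run", "-it", "--rm"]


if __name__ == "__main__":
    print(" ".join(launch_command("19.03.5")))
    print(" ".join(launch_command("18.09.1")))
```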

    To run PaddlePaddle, import it as a Python module:


    $ python -c 'import paddle; paddle.utils.run_check()'
    Running verify PaddlePaddle program …
    W0516 06:36:54.208734 442] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 11.7, Runtime API Version: 11.7
    W0516 06:36:54.212574 442] device: 0, cuDNN Version: 8.4.
    PaddlePaddle works well on 1 GPU.
    W0516 06:37:12.706600 442] Find all_reduce operators: 2. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 2.
    PaddlePaddle works well on 8 GPUs.
    PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
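
For a check that can be called from a script instead of the command line, a minimal Python sketch might look like the following. The function name count_paddle_gpus is hypothetical, and the sketch assumes a PaddlePaddle 2.x release that provides paddle.is_compiled_with_cuda() and paddle.device.cuda.device_count(); it returns None when PaddlePaddle cannot be imported.

```python
def count_paddle_gpus():
    """Return the number of GPUs PaddlePaddle can see (hypothetical helper).

    Returns None if the paddle module is not importable in this environment.
    """
    try:
        import paddle
    except ImportError:
        return None
    # A CPU-only build cannot use GPUs, so report zero.
    if not paddle.is_compiled_with_cuda():
        return 0
    return paddle.device.cuda.device_count()


if __name__ == "__main__":
    print("GPUs visible to PaddlePaddle:", count_paddle_gpus())
```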

    To pull data and model descriptions from locations outside the container for use by PaddlePaddle or save results to locations outside the container, mount one or more host directories as Docker data volumes.
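
As an illustration of mounting host directories, the following Python sketch builds the -v arguments that docker run accepts for bind mounts. The helper name volume_flags and the example paths are hypothetical. Host paths are converted to absolute paths because Docker treats a bare name in that position as a named volume, not as a host directory.

```python
import os


def volume_flags(mounts):
    """Turn {host_dir: container_dir} into a flat list of -v arguments.

    Hypothetical helper; the paths used below are examples only.
    """
    flags = []
    for host_dir, container_dir in mounts.items():
        # Bind mounts need an absolute host path.
        flags += ["-v", f"{os.path.abspath(host_dir)}:{container_dir}"]
    return flags


if __name__ == "__main__":
    example = {"/home/user/data": "/data", "/home/user/results": "/results"}
    # Append these arguments to the docker run command, before the image name.
    print(" ".join(volume_flags(example)))
```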

    To share data between GPUs, NVIDIA Collective Communications Library (NCCL) might require shared system memory for IPC and pinned (page-locked) system memory resources, so the operating system’s limits on these resources might need to be increased. Refer to your system’s documentation for more information.

    In particular, Docker containers default to limited shared and pinned memory resources. When using NCCL inside a container, we recommend that you increase these resources by adding:


    --shm-size=1g --ulimit memlock=-1

    to the command line of:


    docker run --gpus all
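
To confirm from inside a container that these limits took effect, a small Python sketch such as the following can report the locked-memory limit and the size of /dev/shm. The function name ipc_limits is hypothetical, and the sketch assumes a Linux container, because the resource module is only available on Unix-like systems.

```python
import os
import resource
import shutil


def ipc_limits():
    """Report locked-memory and shared-memory limits (hypothetical helper)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    # /dev/shm is where Docker mounts the container's shared memory.
    if os.path.isdir("/dev/shm"):
        shm_total = shutil.disk_usage("/dev/shm").total
    else:
        shm_total = None
    return {
        "memlock_unlimited": soft == resource.RLIM_INFINITY,
        "memlock_soft": soft,
        "shm_total_bytes": shm_total,
    }


if __name__ == "__main__":
    for name, value in ipc_limits().items():
        print(f"{name}: {value}")
```

With the recommended flags applied, memlock_unlimited would be expected to be True and shm_total_bytes to be about 1 GiB.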

    Similarly, on some Red Hat Enterprise Linux (RHEL) systems, Docker limits the number of simultaneous PIDs in the container to 4096, which might be too small, particularly for multi-GPU training tasks.

    To increase this limit, pass the following option to docker run.



© Copyright 2024, NVIDIA. Last updated on Jul 3, 2024.