Running JAX

Before you can run an NGC deep learning framework container, your Docker® environment must support NVIDIA GPUs. To run a container, issue the appropriate command as explained in Running A Container and specify the registry, repository, and tags.

About this task

On a system with GPU support for NGC containers, when you run a container, the following occurs:

  • The Docker engine loads the image into a container which runs the software.
  • You define the runtime resources of the container by including additional flags and settings that are used with the command.

    These flags and settings are described in Running A Container.

  • The GPUs are explicitly defined for the Docker container (defaults to all GPUs, but can be specified by using the NVIDIA_VISIBLE_DEVICES environment variable).
    Note:

    If you use Docker 19.03 or later, complete the steps below.

The method implemented in your system depends on the DGX OS version that you installed (for DGX systems), the NGC Cloud Image that was provided by a Cloud Service Provider, or the software that you installed to prepare to run NGC containers on TITAN PCs, Quadro PCs, or NVIDIA Virtual GPUs (vGPUs).
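
As a rough sketch of restricting the container to specific GPUs, the snippet below builds the value for the `--gpus` option of `docker run`, assuming Docker 19.03 or later. The helper name `gpu_arg` is hypothetical and is not part of Docker or NGC. The script only prints the resulting command so that you can inspect it before running it.

```shell
#!/bin/sh
# gpu_arg is a hypothetical helper for illustration only.
# It prints the value to pass to `docker run --gpus` for a device list.
gpu_arg() {
    if [ -z "$1" ] || [ "$1" = "all" ]; then
        # No selection given: expose every GPU (the default described above)
        printf '%s\n' 'all'
    else
        # The embedded double quotes keep the comma-separated list together
        printf '"device=%s"\n' "$1"
    fi
}

# Print (do not execute) a command that exposes only GPUs 0 and 1.
echo docker run --gpus "'$(gpu_arg 0,1)'" -it --rm nvcr.io/nvidia/jax:23.10-py3
```

On systems that use the NVIDIA runtime directly, the same selection is typically expressed by setting `NVIDIA_VISIBLE_DEVICES=0,1` in the container environment instead.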

Procedure

  1. Issue the command for the applicable release of the container that you want.

    The following commands assume that you want the 23.10 container version, which is the October 2023 release. To pull the JAX container:

    docker pull nvcr.io/nvidia/jax:23.10-py3

    To pull the latest Paxml container:

    docker pull nvcr.io/nvidia/jax:23.10-paxml-py3

    To pull the latest T5x container:

    docker pull nvcr.io/nvidia/jax:23.10-t5x-py3

  2. In the terminal, paste the command above. Ensure that the pull completes successfully before you proceed to step 3.
  3. Run the container image:
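    • Use the following command to run the container, where 23.10 is the container version, local_dir is a directory on the host, and container_dir is the path where it is mounted inside the container: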

      docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/jax:23.10-py3


      If you use multiprocessing for multi-threaded data loaders, the default shared memory segment size with which the container runs might not be enough. In that case, increase the shared memory size by adding one of the following flags to the docker run --gpus all command line:
      • --ipc=host
      • --shm-size=<requested memory size>

      To pull data and model descriptions from locations outside the container for use by JAX or save results to locations outside the container, mount one or more host directories as Docker® data volumes. This is done via the -v parameter in the example above.
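
The procedure above can be sketched as a small script. This is a minimal sketch under assumptions: the helper names `jax_image` and `run_cmd`, the host path `/data/jax`, the container path `/workspace/data`, and the `8g` shared memory size are illustrative placeholders, not values from the NVIDIA documentation. The script only prints the commands; remove the `echo` in front of `docker` to execute them.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Docker or NGC.

# Build the image reference for a release and an optional flavor
# (empty for plain JAX, or paxml / t5x).
jax_image() {
    release="$1"
    flavor="$2"
    if [ -n "$flavor" ]; then
        printf 'nvcr.io/nvidia/jax:%s-%s-py3\n' "$release" "$flavor"
    else
        printf 'nvcr.io/nvidia/jax:%s-py3\n' "$release"
    fi
}

# Print the docker run command for an image ($1), a host:container
# volume mapping ($2), and a shared memory size ($3).
run_cmd() {
    echo docker run --gpus all -it --rm --shm-size="$3" -v "$2" "$1"
}

# Steps 1 and 2: pull the Paxml flavor of the 23.10 release.
image="$(jax_image 23.10 paxml)"
echo docker pull "$image"

# Step 3: run it with a mounted data directory and larger shared memory.
run_cmd "$image" /data/jax:/workspace/data 8g
```

Once inside the container, a quick check such as `python -c "import jax; print(jax.devices())"` should list the GPUs that the container can see.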

© Copyright 2024, NVIDIA. Last updated on Apr 5, 2024.