Step #5: Modify Image with Dockerfile
After inspecting the PyTorch container, we will now create a modified version of the base image with custom applications and scripts included so we don’t need to install them manually every time we launch a container. This involves creating a Dockerfile, which lets us specify the commands and settings we want to invoke while building a new image.
To create a Dockerfile, open a new file named “Dockerfile” (make sure it uses a capital “D” and no extension) in the local directory using a text editor. For more information on the Dockerfile syntax and usage, reference the official documentation from Docker.
First, we need to specify the base image that will be used. The base image provides us a starting point for our custom image that we can build upon. Since we will use the PyTorch image we inspected in the previous section as our base, our custom image will look identical to the PyTorch image for the first step. To specify our base image in the Dockerfile, we will add the following on the first line:
FROM nvcr.io/nvidia/pytorch:22.03-py3
Note that Docker will first use a local copy of the listed image if one is available; otherwise, it will attempt to pull the image from the specified container registry. If we were to build the image at this point, without making any additional changes to the Dockerfile, the resulting image would be identical to the PyTorch image from NGC.
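To see whether the base image is already cached locally, or to pre-pull it before building, the standard Docker CLI can be used. This is an optional sketch, not a required step, and the output will vary by system:

```shell
# List any locally cached copies of the base image
$ docker images nvcr.io/nvidia/pytorch

# Pre-pull the exact tag we will build from (optional; "docker build"
# will pull it automatically if it is not available locally)
$ docker pull nvcr.io/nvidia/pytorch:22.03-py3
```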
To make the custom image more useful, let’s clone a repo that we can use to run some exciting deep learning examples. The repo we will use is the DeepLearningExamples repo found on NVIDIA’s GitHub. We will use this repo to run an example image classification application using PyTorch on GPUs.
The Dockerfile “RUN” instruction tells Docker to execute the indicated command at that step in the build. This allows us to run steps inside the container that might be necessary for later instructions or to install packages inside the image for the user. For example, if we wanted to install htop as we did when inspecting the container in the previous section, we would write the following in our Dockerfile. Note that both commands are chained in a single “RUN” instruction. Since Docker creates a new image layer for every instruction, we want to reduce the overall number of instructions in the Dockerfile to keep our image small in size.
RUN apt update && \
    apt install -y htop
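To illustrate the layering point, the same two commands written as separate instructions would produce two image layers instead of one. This is a hypothetical comparison sketch, not something to add to our Dockerfile:

```dockerfile
# Less efficient: each RUN instruction creates its own image layer,
# so the apt package index fetched by the first layer is baked into
# the final image even if a later layer cleans it up
RUN apt update
RUN apt install -y htop

# More efficient: both commands share one build step and one layer
RUN apt update && \
    apt install -y htop
```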
For our purposes, we don’t need the htop package and can skip the line above in our Dockerfile. Instead, we want to clone the DeepLearningExamples repository using git (which is already installed in the base image) at a specific commit hash. The commit hash shown below was found by navigating to https://github.com/nvidia/deeplearningexamples and selecting the latest commit at the time of writing. Since images can be built at any time and referenced code and packages may be updated at any point, it is recommended to pin specific versions, tags, or commits when possible to avoid unexpected breaking changes down the road.
To clone the repository in the Dockerfile, create an empty line after the “FROM” line above and on the next line, add the following command:
RUN git clone https://github.com/nvidia/deeplearningexamples && \
    cd deeplearningexamples && \
    git checkout f3dbf8a69522d69c63c4508769bd8137658786a1
Your Dockerfile should now look like this:
FROM nvcr.io/nvidia/pytorch:22.03-py3

RUN git clone https://github.com/nvidia/deeplearningexamples && \
    cd deeplearningexamples && \
    git checkout f3dbf8a69522d69c63c4508769bd8137658786a1
Another useful instruction is “WORKDIR”. This lets us specify the working directory for all subsequent instructions and for when the container is launched. Note that WORKDIR can be updated multiple times in a Dockerfile as necessary; the final listed WORKDIR will be the directory the container opens on launch.
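As a hypothetical sketch of this behavior (the directory names below are made up for illustration only):

```dockerfile
# Instructions after this line execute in /workspace/build
WORKDIR /workspace/build
RUN ./configure && make

# Later instructions switch to /workspace/app; since this is the
# final WORKDIR, a launched container will also start here
WORKDIR /workspace/app
RUN cp /workspace/build/output .
```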
We will use the WORKDIR instruction to make a specific directory in the DeepLearningExamples repository the default directory for future instructions as well as at container runtime. After adding another blank line after the “git checkout” line in the previous instruction, add the following to your Dockerfile:
WORKDIR /workspace/deeplearningexamples/PyTorch/Classification/ConvNets
Next, let’s install some packages necessary for running the example. Unlike when we ran the container previously, these packages we install during the build process will be included with our custom image and we will not need to install them again while running a container based on the custom image.
Add another empty newline after the WORKDIR instruction and create another RUN instruction to install the application’s dependencies using Python’s package manager, pip.
RUN pip install -r requirements.txt nvidia-imageinary==1.1.3
With the dependencies installed, we are now finished with our custom Dockerfile which should look like the following:
FROM nvcr.io/nvidia/pytorch:22.03-py3

RUN git clone https://github.com/nvidia/deeplearningexamples && \
    cd deeplearningexamples && \
    git checkout f3dbf8a69522d69c63c4508769bd8137658786a1

WORKDIR /workspace/deeplearningexamples/PyTorch/Classification/ConvNets

RUN pip install -r requirements.txt nvidia-imageinary==1.1.3
Now that our Dockerfile is complete, we can build the image, which runs through the steps specified in the file and saves a copy of the resulting image to your local workstation. To do so, run the following command, which builds a new image named “nvcr.io/nv-launchpad-orgname/sample-image” with the tag “1.0”. The full image reference is specified after the “-t” flag: everything before the colon is the image name and everything after it is the tag. Note that your organization name (“nv-launchpad-orgname” in this case) will likely be different and should be updated to reflect the org name accessible from your account. Otherwise, you are free to change the image name (“sample-image” in this case) and tag as desired. Don’t forget the “.” at the end of the command, which tells Docker to use the current directory as the build context.
$ docker build -t nvcr.io/nv-launchpad-orgname/sample-image:1.0 .
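Once the build finishes, the new image should appear in the local image list, and the WORKDIR setting can be spot-checked by running a one-off command in the container. This is an optional sketch; the NGC container entrypoint may print banner text before the command output:

```shell
# Confirm the image was built and tagged locally
$ docker images nvcr.io/nv-launchpad-orgname/sample-image

# Print the container's starting directory, which should be the
# ConvNets path set by WORKDIR in the Dockerfile
$ docker run --rm nvcr.io/nv-launchpad-orgname/sample-image:1.0 pwd
```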
If you used the same Dockerfile as shown above, this will generate text similar to the following:
Sending build context to Docker daemon  2.048kB
Step 1/4 : FROM nvcr.io/nvidia/pytorch:22.03-py3
 ---> 4730bc516b92
Step 2/4 : RUN git clone https://github.com/nvidia/deeplearningexamples &&     cd deeplearningexamples &&     git checkout f3dbf8a69522d69c63c4508769bd8137658786a1
 ---> Running in 326d5fb91a89
Cloning into 'deeplearningexamples'...
Note: switching to 'f3dbf8a69522d69c63c4508769bd8137658786a1'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at f3dbf8a6 [BERT/PyT] add JIT autocast
Removing intermediate container 326d5fb91a89
 ---> 08e1c8dd26f8
Step 3/4 : WORKDIR /workspace/deeplearningexamples/PyTorch/Classification/ConvNets
 ---> Running in 63a3c614dfb1
Removing intermediate container 63a3c614dfb1
 ---> a0cc6bb5202e
Step 4/4 : RUN pip install -r requirements.txt nvidia-imageinary==1.1.3
 ---> Running in 801287fbc5df
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting dllogger
  Cloning https://github.com/NVIDIA/dllogger (to revision v1.0.0) to /tmp/pip-install-17k4pxj6/dllogger_ae526c1927be46a8b397f50ec3a3fe85
  Running command git clone -q https://github.com/NVIDIA/dllogger /tmp/pip-install-17k4pxj6/dllogger_ae526c1927be46a8b397f50ec3a3fe85
  Resolved https://github.com/NVIDIA/dllogger to commit 89913fd227b720a3026550b904cdca0d49d82100
Collecting nvidia-imageinary==1.1.3
  Downloading https://developer.download.nvidia.com/compute/redist/nvidia-imageinary/nvidia_imageinary-1.1.3-py3-none-any.whl (13 kB)
Collecting pynvml==11.0.0
  Downloading pynvml-11.0.0-py3-none-any.whl (46 kB)
Requirement already satisfied: Pillow>=7.1.2 in /opt/conda/lib/python3.8/site-packages (from nvidia-imageinary==1.1.3) (9.0.0)
Requirement already satisfied: numpy>=1.18.0 in /opt/conda/lib/python3.8/site-packages (from nvidia-imageinary==1.1.3) (1.22.3)
Building wheels for collected packages: dllogger
  Building wheel for dllogger (setup.py): started
  Building wheel for dllogger (setup.py): finished with status 'done'
  Created wheel for dllogger: filename=DLLogger-1.0.0-py3-none-any.whl size=5670 sha256=1f358bd0e559e49885ac67146957c1f343e24c400f5142fde7bcffef824dfaaa
  Stored in directory: /tmp/pip-ephem-wheel-cache-0h6p75zr/wheels/32/ff/4a/1d61bdc575b373a327658f1de2513a0af81094c50c9c56fa8b
Successfully built dllogger
Installing collected packages: pynvml, nvidia-imageinary, dllogger
  Attempting uninstall: pynvml
    Found existing installation: pynvml 11.4.1
    Uninstalling pynvml-11.4.1:
      Successfully uninstalled pynvml-11.4.1
Successfully installed dllogger-1.0.0 nvidia-imageinary-1.1.3 pynvml-11.0.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Removing intermediate container 801287fbc5df
 ---> b98e9cad2b60
Successfully built b98e9cad2b60
Successfully tagged nvcr.io/nv-launchpad-orgname/sample-image:1.0
Now that we have a modified image built locally, we can push it to NGC so we can use it on other machines and collaborate with teammates. Assuming you are logged into NGC locally by following the steps above, run this command, updating the image name and tag as necessary:
$ docker push nvcr.io/nv-launchpad-orgname/sample-image:1.0
While the image is being pushed, you will see output similar to the following:
The push refers to repository [nvcr.io/nv-launchpad-orgname/sample-image]
c170ca729e6b: Pushed
62f753b7286f: Pushed
9c7a2e08fe4c: Mounted from nvidia/pytorch
fe48bfeac91d: Mounted from nvidia/pytorch
6d24369f9726: Mounted from nvidia/pytorch
0f010943c2be: Mounted from nvidia/pytorch
80df8233699e: Mounted from nvidia/pytorch
4995463ed504: Mounted from nvidia/pytorch
c68289e5466a: Mounted from nvidia/pytorch
b92c6f3cf8ba: Mounted from nvidia/pytorch
ea54ed1c9d39: Mounted from nvidia/pytorch
e94fa5c9c518: Mounted from nvidia/pytorch
8456d4967bfe: Mounted from nvidia/pytorch
40f364efa84f: Mounted from nvidia/pytorch
65c14c7eaf47: Mounted from nvidia/pytorch
14e6ddddf256: Mounted from nvidia/pytorch
7821737d952f: Mounted from nvidia/pytorch
77a776e8014b: Mounted from nvidia/pytorch
7a7051e759c4: Mounted from nvidia/pytorch
e1aa1f9ee97e: Mounted from nvidia/pytorch
3b720402b8ab: Mounted from nvidia/pytorch
3b1792efdad9: Mounted from nvidia/pytorch
5f70bf18a086: Mounted from nvidia/pytorch
6ba71d233b75: Mounted from nvidia/pytorch
5342e89df8e3: Mounted from nvidia/pytorch
fc3209a87194: Mounted from nvidia/pytorch
1ee80d85e1cf: Mounted from nvidia/pytorch
489f24d7d381: Mounted from nvidia/pytorch
f7655918bfe6: Mounted from nvidia/pytorch
5ec341fc8fe7: Mounted from nvidia/pytorch
8fb729c89bb4: Mounted from nvidia/pytorch
852255d743c1: Mounted from nvidia/pytorch
abf81ae6f4c8: Mounted from nvidia/pytorch
f89ef356505e: Mounted from nvidia/pytorch
6fb2a344ac89: Mounted from nvidia/pytorch
850236713495: Mounted from nvidia/pytorch
b9dfd77f5b0a: Mounted from nvidia/pytorch
6a1014d46250: Mounted from nvidia/pytorch
85f49f4e6923: Mounted from nvidia/pytorch
2f175b794573: Mounted from nvidia/pytorch
899455397741: Mounted from nvidia/pytorch
2df8c0a32afe: Mounted from nvidia/pytorch
a060c5cefec7: Mounted from nvidia/pytorch
83cdade3c9b5: Mounted from nvidia/pytorch
fec6965e7a6b: Mounted from nvidia/pytorch
2ff0ade8d3c9: Mounted from nvidia/pytorch
01e996931197: Mounted from nvidia/pytorch
867d0767a47c: Mounted from nvidia/pytorch
1.0: digest: sha256:64e3f9abb33ac2b7287d8626d79ef9bff2d5126eadce40699a2651c7aee72ec9 size: 10642
Once the image has been fully pushed, it will be available on NGC, where it can be used on Base Command or pulled down locally on other systems.
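On another machine, the image can then be retrieved with a matching pull after authenticating to the registry. This is an optional sketch; update the org name, image name, and tag to match what you pushed:

```shell
# Authenticate to NGC (username is "$oauthtoken"; the password is
# your NGC API key)
$ docker login nvcr.io

# Pull the custom image pushed above
$ docker pull nvcr.io/nv-launchpad-orgname/sample-image:1.0
```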