NVIDIA AI Enterprise 2.0 or later
NVIDIA AI Enterprise offers a collection of software for running AI/ML and data science workloads, packaged and delivered as containers. The container runtime used on Ubuntu is Docker, and the container runtime used on RHEL is Podman.
First you will need to set up the repository.
Update the apt package index with the command below:
$ sudo apt-get update
Install packages to allow apt to use a repository over HTTPS:
$ sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
Next you will need to add Docker’s official GPG key with the command below:
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
Verify that you now have the key with the fingerprint 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88, by searching for the last 8 characters of the fingerprint:
$ sudo apt-key fingerprint 0EBFCD88
pub rsa4096 2017-02-22 [SCEA]
9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88
uid [ unknown] Docker Release (CE deb) <docker@docker.com>
sub rsa4096 2017-02-22 [S]
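The visual comparison above can also be scripted. The following is a minimal sketch rather than part of the official procedure: `normalize_fingerprint` and `fingerprint_matches` are hypothetical helper names, and the sketch assumes the fingerprint appears on a line of its own, as in the sample output above.

```shell
# Expected fingerprint of Docker's GPG key, as given in the text above.
EXPECTED_FPR="9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88"

# Remove all whitespace and uppercase the result, so that differently
# spaced fingerprints compare equal.
normalize_fingerprint() {
    printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Read a key listing on stdin; return 0 if any line equals the expected fingerprint.
fingerprint_matches() {
    expected="$(normalize_fingerprint "$1")"
    while IFS= read -r line; do
        if [ "$(normalize_fingerprint "$line")" = "$expected" ]; then
            return 0
        fi
    done
    return 1
}

# Demonstration against the sample output shown above (no key store is queried).
sample_output='pub rsa4096 2017-02-22 [SCEA]
9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88
uid [ unknown] Docker Release (CE deb) <docker@docker.com>
sub rsa4096 2017-02-22 [S]'

if printf '%s\n' "$sample_output" | fingerprint_matches "$EXPECTED_FPR"; then
    echo "fingerprint OK"
else
    echo "fingerprint MISMATCH"
fi
```

On a real host, the sample text would be replaced by piping the output of `sudo apt-key fingerprint 0EBFCD88` into `fingerprint_matches`.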
Use the following command to set up the stable repository:
$ sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
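To see what the command above hands to APT, the repository line can be composed on its own. This is an illustrative sketch: `docker_repo_line` is a hypothetical helper, and the codename `focal` is only an example value standing in for the output of `lsb_release -cs`.

```shell
# Build the APT source line for Docker's stable repository.
# The codename is a parameter so the result can be inspected without lsb_release.
docker_repo_line() {
    codename="$1"
    printf 'deb [arch=amd64] https://download.docker.com/linux/ubuntu %s stable\n' "$codename"
}

# "focal" is an example; on a real host pass "$(lsb_release -cs)" instead.
docker_repo_line focal
```

The release codename and the word `stable` must be separated by a space; otherwise APT would look for a suite that does not exist.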
Install Docker Engine - Community
Update the apt package index:
$ sudo apt-get update
Install Docker Engine:
$ sudo apt-get install -y docker-ce docker-ce-cli containerd.io
Verify that Docker Engine - Community is installed correctly by running the hello-world image:
$ sudo docker run hello-world
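If the hello-world test fails, a quick first check is whether the `docker` client is on the PATH at all. A minimal sketch, assuming a POSIX shell; `have_command` is a hypothetical helper name, and `docker --version` reports only the client version without contacting the daemon.

```shell
# Return 0 if the named command can be found on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command docker; then
    # Prints the installed client version.
    docker --version
else
    echo "docker not found on PATH; re-check the installation steps above"
fi
```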
More information on how to install Docker can be found in the official Docker documentation.
For installing Podman, follow the official instructions for your supported Linux distribution. For convenience, the documentation below includes instructions on installing Podman on RHEL 8.
On RHEL 8, check if the container-tools module is available with the command below.
$ sudo dnf module list | grep container-tools
This should return output similar to the following.
container-tools rhel8 [d] common [d] Most recent (rolling) versions of podman, buildah, skopeo, runc, conmon, runc, conmon, CRIU, Udica, etc as well as dependencies such as container-selinux built and tested together, and updated as frequently as every 12 weeks.
container-tools 1.0 common [d] Stable versions of podman 1.0, buildah 1.5, skopeo 0.1, runc, conmon, CRIU, Udica, etc as well as dependencies such as container-selinux built and tested together, and supported for 24 months.
container-tools 2.0 common [d] Stable versions of podman 1.6, buildah 1.11, skopeo 0.1, runc, conmon, etc as well as dependencies such as container-selinux built and tested together, and supported as documented on the Application Stream lifecycle page.
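When only the available stream names matter, the listing can be reduced to its second column. The sketch below is illustrative: `list_module_streams` is a hypothetical helper, and the sample lines use shortened descriptions in place of the full output.

```shell
# Read `dnf module list` output on stdin and print the stream name
# (second whitespace-separated field) of every line for the given module.
list_module_streams() {
    awk -v module="$1" '$1 == module { print $2 }'
}

# Sample lines with shortened descriptions (the real output is longer).
sample_listing='container-tools rhel8 [d] common [d] Most recent (rolling) versions of podman, buildah, skopeo
container-tools 1.0 common [d] Stable versions of podman 1.0, buildah 1.5, skopeo 0.1
container-tools 2.0 common [d] Stable versions of podman 1.6, buildah 1.11, skopeo 0.1'

printf '%s\n' "$sample_listing" | list_module_streams container-tools
```

For the sample lines this prints `rhel8`, `1.0`, and `2.0`, one per line. On a real host the input would come from `sudo dnf module list`.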
Now, proceed to install the container-tools module, which will install Podman, with the command below.
$ sudo dnf module install -y container-tools
Once Podman is installed, check the version with the command below.
$ podman version
Version: 2.2.1
API Version: 2
Go Version: go1.14.7
Built: Mon Feb 8 21:19:06 2021
OS/Arch: linux/amd64
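In a script, the version number can be pulled out of this output instead of being read by eye. A minimal sketch, assuming the `Field: value` layout shown above (other Podman releases may format the output differently); `podman_version_field` is a hypothetical helper name.

```shell
# Read `podman version` output on stdin and print the value of the
# "Version" field, skipping "API Version" and "Go Version".
podman_version_field() {
    awk -F': *' '$1 == "Version" { print $2; exit }'
}

# Sample text copied from the output shown above.
sample_version_output='Version: 2.2.1
API Version: 2
Go Version: go1.14.7
Built: Mon Feb 8 21:19:06 2021
OS/Arch: linux/amd64'

printf '%s\n' "$sample_version_output" | podman_version_field
```

For the sample text this prints `2.2.1`. On a real host the input would be `podman version | podman_version_field`.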
Note: The rootless configuration change described below should not be made if the user running the containers is a privileged user (for example, root). In that case it will cause containers that use the NVIDIA Container Toolkit to fail.
To run rootless containers with Podman, apply the following configuration change to the NVIDIA container runtime with the command below.
$ sudo sed -i 's/^#no-cgroups = false/no-cgroups = true/;' /etc/nvidia-container-runtime/config.toml
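The effect of this expression can be previewed on sample text before touching the real file. The sketch below is illustrative: `apply_rootless_edit` is a hypothetical helper that runs the same substitution on stdin, and the two-line excerpt is an assumption about how the setting appears in the default config.toml.

```shell
# Apply the same substitution as the command above, but to stdin
# instead of editing /etc/nvidia-container-runtime/config.toml in place.
apply_rootless_edit() {
    sed 's/^#no-cgroups = false/no-cgroups = true/;'
}

# Assumed excerpt of the default file: the setting is present but commented out.
printf '[nvidia-container-cli]\n#no-cgroups = false\n' | apply_rootless_edit

# The pattern is anchored on the leading "#", so a line that is already
# uncommented passes through unchanged.
printf 'no-cgroups = false\n' | apply_rootless_edit
```

The first call prints the section header followed by `no-cgroups = true`; the second prints `no-cgroups = false` unchanged. If the setting was uncommented before the command was run, it therefore needs to be edited by hand. Running `grep no-cgroups /etc/nvidia-container-runtime/config.toml` afterwards shows which value is in effect.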