Running with Kubernetes#

Installing Docker#

First you will need to set up the repository.

Update the apt package index with the command below:

sudo apt-get update

Install packages to allow apt to use a repository over HTTPS:

sudo apt-get install -y \
  apt-transport-https \
  ca-certificates \
  curl \
  gnupg-agent \
  software-properties-common

Next you will need to add Docker’s official GPG key with the command below:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

Verify that you now have the key with the fingerprint 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88, by searching for the last 8 characters of the fingerprint:

sudo apt-key fingerprint 0EBFCD88

pub   rsa4096 2017-02-22 [SCEA]
    9DC8 5822 9FC7 DD38 854A  E2D8 8D81 803C 0EBF CD88
uid           [ unknown] Docker Release (CE deb) <docker@docker.com>
sub   rsa4096 2017-02-22 [S]
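
The visual comparison above can also be scripted. The following is a minimal sketch, not part of Docker's instructions: `EXPECTED_FPR` and `fingerprint_matches` are hypothetical names, and the function only compares text that is piped into it.

```shell
# Expected fingerprint from the text above, with the spaces removed.
EXPECTED_FPR="9DC858229FC7DD38854AE2D88D81803C0EBFCD88"

# Reads `apt-key fingerprint` output on stdin and succeeds only if some
# line, once spaces and tabs are stripped, equals the expected fingerprint.
fingerprint_matches() {
  tr -d ' \t' | grep -x "$EXPECTED_FPR" > /dev/null
}

# Usage (assumes the key was added as shown above):
#   sudo apt-key fingerprint 0EBFCD88 | fingerprint_matches && echo "key OK"
```

Matching the whole line rather than the last 8 characters avoids accepting a different key that happens to share the short ID.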

Use the following command to set up the stable repository:

sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"

Install Docker Engine - Community

Update the apt package index:

sudo apt-get update

Install Docker Engine:

sudo apt-get install -y docker-ce=5:19.03.12~3-0~ubuntu-bionic docker-ce-cli=5:19.03.12~3-0~ubuntu-bionic containerd.io

Verify that Docker Engine - Community is installed correctly by running the hello-world image:

sudo docker run hello-world

More information on how to install Docker can be found in the Docker installation documentation.

Installing Kubernetes#

Make sure Docker has been started and enabled before beginning installation:

sudo systemctl start docker && sudo systemctl enable docker

Execute the following to add apt keys:

sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo mkdir -p /etc/apt/sources.list.d/

Create kubernetes.list:

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF

Now execute the below to install kubelet, kubeadm and kubectl:

sudo apt-get update
sudo apt-get install -y -q kubelet=1.21.1-00 kubectl=1.21.1-00 kubeadm=1.21.1-00
sudo apt-mark hold kubelet kubeadm kubectl

Reload the system daemon:

sudo systemctl daemon-reload

Disable swap#

sudo swapoff -a
sudo nano /etc/fstab

Note

Comment out every line that starts with /swap by adding a # at its beginning. The result should look something like this:

UUID=e879fda9-4306-4b5b-8512-bba726093f1d / ext4 defaults 0 0
UUID=DCD4-535C /boot/efi vfat defaults 0 0
#/swap.img       none    swap    sw      0       0
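
The same edit can be made without an interactive editor. The sketch below assumes GNU sed (the default on Ubuntu); `comment_out_swap` is a hypothetical helper name, and it is shown working on a copy so the result can be reviewed before /etc/fstab is replaced.

```shell
# comment_out_swap FILE: prefixes '#' to every line of FILE that starts
# with /swap. Lines that are already commented are left untouched.
# Requires GNU sed for the -i (in-place) option.
comment_out_swap() {
  sed -i 's|^/swap|#/swap|' "$1"
}

# Usage sketch: edit a copy, then compare it with the original.
#   cp /etc/fstab /tmp/fstab.new
#   comment_out_swap /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new
```

Running the helper twice is harmless, because a commented line no longer starts with /swap.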

Initializing the Kubernetes cluster to run as a control-plane node#

Execute the following command:

sudo kubeadm init --pod-network-cidr=192.168.0.0/16

Output:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

    export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join <your-host-IP>:6443 --token 489oi5.sm34l9uh7dk4z6cm \
        --discovery-token-ca-cert-hash sha256:17165b6c4a4b95d73a3a2a83749a957a10161ae34d2dfd02cd730597579b4b34

Following the instructions in the output, execute the commands as shown below:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

The following command installs a pod-network add-on on the control plane node. This guide uses Calico as the pod-network add-on:

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

Run the following command to confirm that all pods are up and running:

kubectl get pods --all-namespaces

Output:

NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-65b8787765-bjc8h   1/1     Running   0          2m8s
kube-system   calico-node-c2tmk                          1/1     Running   0          2m8s
kube-system   coredns-5c98db65d4-d4kgh                   1/1     Running   0          9m8s
kube-system   coredns-5c98db65d4-h6x8m                   1/1     Running   0          9m8s
kube-system   etcd-#yourhost                             1/1     Running   0          8m25s
kube-system   kube-apiserver-#yourhost                   1/1     Running   0          8m7s
kube-system   kube-controller-manager-#yourhost          1/1     Running   0          8m3s
kube-system   kube-proxy-6sh42                           1/1     Running   0          9m7s
kube-system   kube-scheduler-#yourhost                   1/1     Running   0          8m26s
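
For a scripted check, the listing can be filtered instead of read by eye. This is a minimal sketch under the assumption that the output has the column layout shown above (STATUS in the fourth column); `all_pods_ready` is a hypothetical name, and it inspects only the STATUS column, not READY.

```shell
# Reads `kubectl get pods --all-namespaces` output on stdin. Prints
# "<namespace>/<name> <status>" for every pod whose STATUS is neither
# Running nor Completed, and returns non-zero if any such pod exists.
all_pods_ready() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" {
         print $1 "/" $2, $4
         bad = 1
       }
       END { exit bad + 0 }'
}

# Usage:
#   kubectl get pods --all-namespaces | all_pods_ready && echo "all pods ready"
```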

The get nodes command shows that the control-plane node is up and ready:

kubectl get nodes

Output:

NAME             STATUS   ROLES                  AGE   VERSION
#yourhost        Ready    control-plane,master   10m   v1.21.1

By default, the cluster does not schedule pods on the control plane node. Because this is a single-node Kubernetes cluster, remove the taint so that pods can be scheduled on the control plane node:

kubectl taint nodes --all node-role.kubernetes.io/master-

Refer to https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/ for more information.

Installing Helm#

Execute the following command to download and install Helm 3.5.4:

wget https://get.helm.sh/helm-v3.5.4-linux-amd64.tar.gz
tar -zxvf helm-v3.5.4-linux-amd64.tar.gz
sudo mv linux-amd64/helm /usr/local/bin/helm
rm -rf helm-v3.5.4-linux-amd64.tar.gz linux-amd64/
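
The steps above extract the archive without checking its integrity. As a hedged sketch, a downloaded file can be compared against a known SHA-256 checksum before it is unpacked. `verify_sha256` is a hypothetical helper, and the expected checksum is left as a placeholder because this guide does not state it.

```shell
# verify_sha256 FILE EXPECTED_HEX: succeeds only if the SHA-256 digest of
# FILE equals EXPECTED_HEX (lowercase hex, as printed by sha256sum).
verify_sha256() {
  actual=$(sha256sum "$1" | awk '{ print $1 }')
  [ "$actual" = "$2" ]
}

# Usage sketch: <EXPECTED_SHA256> is a placeholder for the checksum
# published with the Helm release you downloaded.
#   verify_sha256 helm-v3.5.4-linux-amd64.tar.gz <EXPECTED_SHA256> \
#     && tar -zxvf helm-v3.5.4-linux-amd64.tar.gz
```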

Refer to the helm/helm repository on GitHub and https://helm.sh/docs/using_helm/#installing-helm for more information.

NVIDIA Network Operator#

Prerequisites#

Note

If Mellanox NICs are not connected to your nodes, skip this section and proceed to the NVIDIA GPU Operator section.

The below instructions assume that Mellanox NICs are connected to your machines.

Execute the below command to verify Mellanox NICs are enabled on your machines:

lspci | grep -i "Mellanox"

Output:

0c:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
0c:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]

Run the command below to find out which Mellanox device is active:

Note

In later steps, use whichever device shows Link detected: yes. The command below works only if the NICs were added before the operating system was installed.

for device in `sudo lshw -class network -short | grep -i ConnectX | awk '{print $2}' | egrep -v 'Device|path' | sed '/^$/d'`;do echo -n $device; sudo ethtool $device | grep -i "Link detected"; done

Output:

ens160f0        Link detected: yes
ens160f1        Link detected: no
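
To reuse the active device name in later steps, the output can be filtered rather than copied by hand. The sketch below assumes the two-column output format shown above; `active_devices` is a hypothetical name.

```shell
# Reads "<device> Link detected: yes|no" lines on stdin and prints the
# names of the devices whose link is reported as detected.
active_devices() {
  awk '$NF == "yes" { print $1 }'
}

# Usage sketch: pipe the output of the loop above into the filter, e.g.
#   <link-detection loop> | active_devices
```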

Create the custom network operator values.yaml.

nano network-operator-values.yaml

Add the following content, setting devices to the active Mellanox device reported by the command above:

deployCR: true
ofedDriver:
  deploy: true
nvPeerDriver:
  deploy: true
rdmaSharedDevicePlugin:
  deploy: true
  resources:
    - name: rdma_shared_device_a
      vendors: [15b3]
      devices: [ens160f0]
For more information about the custom network operator values.yaml, refer to the Network Operator documentation.

Add the NVIDIA repo:

Note

Helm is required to install Network Operator.

helm repo add mellanox https://mellanox.github.io/network-operator

Update the Helm repo:

helm repo update

Install NVIDIA Network Operator#

Execute the commands below:

kubectl label nodes --all node-role.kubernetes.io/master- --overwrite
helm install -f ./network-operator-values.yaml -n network-operator --create-namespace --wait network-operator mellanox/network-operator

Validating the State of Network Operator#

Installing the Network Operator can take a couple of minutes, depending on your internet speed. Use the following command to check its state:

kubectl get pods --all-namespaces | egrep 'network-operator|nvidia-network-operator-resources'

NAMESPACE                           NAME                                                              READY   STATUS      RESTARTS   AGE
network-operator                    network-operator-547cb8d999-mn2h9                                 1/1     Running     0          17m
network-operator                    network-operator-node-feature-discovery-master-596fb8b7cb-qrmvv   1/1     Running     0          17m
network-operator                    network-operator-node-feature-discovery-worker-qt5xt              1/1     Running     0          17m
nvidia-network-operator-resources   cni-plugins-ds-dl5vl                                              1/1     Running     0          17m
nvidia-network-operator-resources   kube-multus-ds-w82rv                                              1/1     Running     0          17m
nvidia-network-operator-resources   mofed-ubuntu20.04-ds-xfpzl                                        1/1     Running     0          17m
nvidia-network-operator-resources   rdma-shared-dp-ds-2hgb6                                           1/1     Running     0          17m
nvidia-network-operator-resources   sriov-device-plugin-ch7bz                                         1/1     Running     0          10m
nvidia-network-operator-resources   whereabouts-56ngr                                                 1/1     Running     0          10m

Please refer to the Network Operator page for more information.

NVIDIA GPU Operator#

NVIDIA AI Enterprise customers have access to a pre-configured GPU Operator within the NVIDIA NGC Catalog. The GPU Operator is pre-configured to simplify the provisioning experience with NVIDIA AI Enterprise deployments.

The pre-configured GPU Operator differs from the GPU Operator in the public NGC catalog. The differences are:

It is configured to use a prebuilt vGPU driver image (Only available to NVIDIA AI Enterprise customers).
It is configured to use the NVIDIA License System (NLS).

Install GPU Operator#

Note

The GPU Operator with NVIDIA AI Enterprise requires some tasks to be completed prior to installation. Refer to the document NVIDIA AI Enterprise for instructions prior to running the below commands.

Tip

NVIDIA GPU Operator Install scripts are also available here.

Add the NVIDIA AI Enterprise Helm repository, where api-key is the NGC API key for accessing the NVIDIA Enterprise Collection that you generated:

helm repo add nvaie https://helm.ngc.nvidia.com/nvaie --username='$oauthtoken' --password=api-key && helm repo update

helm install --wait --generate-name nvaie/gpu-operator -n gpu-operator

License GPU Operator#

Copy the NLS license token into a file named client_configuration_token.tok.

Create an empty gridd.conf file:

touch gridd.conf

Create a ConfigMap for NLS licensing:

kubectl create configmap licensing-config -n gpu-operator --from-file=./gridd.conf --from-file=./client_configuration_token.tok

Create a Kubernetes secret to access the NGC registry:

kubectl create secret docker-registry ngc-secret --docker-server="nvcr.io/nvaie" --docker-username='$oauthtoken' --docker-password='<YOUR API KEY>' --docker-email='<YOUR EMAIL>' -n gpu-operator

GPU Operator with RDMA (Optional)#

Prerequisites#

Please install the Network Operator to ensure that the MOFED drivers are installed.

After the NVIDIA Network Operator installation has completed, run the command below to install the GPU Operator with RDMA enabled, which loads the nv_peer_mem modules:

helm install --wait gpu-operator nvaie/gpu-operator -n gpu-operator --set driver.rdma.enabled=true

Validating the Network Operator with GPUDirect RDMA#

Run the command below to list the Mellanox NICs and their status:

kubectl exec -it $(kubectl get pods -n nvidia-network-operator-resources | grep mofed | awk '{print $1}') -n nvidia-network-operator-resources -- ibdev2netdev

Output:

mlx5_0 port 1 ==> ens192f0 (Up)
mlx5_1 port 1 ==> ens192f1 (Down)

Edit the networkdefinition.yaml.

nano networkdefinition.yaml

Create the network definition for IPAM, replacing ens192f0 in the "master" field with the active Mellanox device:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: rdma/rdma_shared_device_a
  name: rdma-net-ipam
  namespace: default
spec:
  config: |-
    {
        "cniVersion": "0.3.1",
        "name": "rdma-net-ipam",
        "plugins": [
            {
                "ipam": {
                    "datastore": "kubernetes",
                    "kubernetes": {
                        "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
                    },
                    "log_file": "/tmp/whereabouts.log",
                    "log_level": "debug",
                    "range": "192.168.111.0/24",
                    "type": "whereabouts"
                },
                "type": "macvlan",
                "master": "ens192f0",
                "vlan": 111
            },
            {
                "mtu": 1500,
                "type": "tuning"
            }
        ]
    }

Note

If you do not have VLAN-based networking on the high-performance side, set "vlan": 0.
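
The two site-specific values in the definition, "master" and "vlan", can also be set with a small helper instead of an editor. This is a sketch assuming GNU sed and the exact key spelling used in the file above; `set_macvlan_options` is a hypothetical name.

```shell
# set_macvlan_options FILE DEVICE VLAN: rewrites the "master" and "vlan"
# values in FILE in place, leaving all other lines unchanged.
# Requires GNU sed for the -i (in-place) option.
set_macvlan_options() {
  sed -i \
    -e "s|\"master\": \"[^\"]*\"|\"master\": \"$2\"|" \
    -e "s|\"vlan\": [0-9][0-9]*|\"vlan\": $3|" \
    "$1"
}

# Usage: use device ens160f0 and disable VLAN tagging.
#   set_macvlan_options networkdefinition.yaml ens160f0 0
```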

Validate the state of the GPU Operator#

Installing the GPU Operator can take a couple of minutes, depending on your internet speed. Use the following command to check its state:

kubectl get pods --all-namespaces | grep -v kube-system

Results:

NAMESPACE                NAME                                                              READY   STATUS      RESTARTS   AGE
default                  gpu-operator-1622656274-node-feature-discovery-master-5cddq96gq   1/1     Running     0          2m39s
default                  gpu-operator-1622656274-node-feature-discovery-worker-wr88v       1/1     Running     0          2m39s
default                  gpu-operator-7db468cfdf-mdrdp                                     1/1     Running     0          2m39s
gpu-operator-resources   gpu-feature-discovery-g425f                                       1/1     Running     0          2m20s
gpu-operator-resources   nvidia-container-toolkit-daemonset-mcmxj                          1/1     Running     0          2m20s
gpu-operator-resources   nvidia-cuda-validator-s6x2p                                       0/1     Completed   0          48s
gpu-operator-resources   nvidia-dcgm-exporter-wtxnx                                        1/1     Running     0          2m20s
gpu-operator-resources   nvidia-dcgm-jbz94                                                 1/1     Running     0          2m20s
gpu-operator-resources   nvidia-device-plugin-daemonset-hzzdt                              1/1     Running     0          2m20s
gpu-operator-resources   nvidia-device-plugin-validator-9nkxq                              0/1     Completed   0          17s
gpu-operator-resources   nvidia-driver-daemonset-kt8g5                                     1/1     Running     0          2m20s
gpu-operator-resources   nvidia-operator-validator-cw4j5                                   1/1     Running     0          2m20s
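
Because the pods take a while to reach their final state, a check is usually repeated until it passes. The helper below is a generic polling sketch, not something shipped with the GPU Operator; `wait_until` and the `gpu_pods_listed` function in the usage comment are hypothetical names.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND...: reruns COMMAND until it
# succeeds. Returns 0 on the first success, or 1 once COMMAND has failed
# ATTEMPTS times. Sleeps DELAY_SECONDS between attempts.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while ! "$@"; do
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 0
}

# Usage sketch: poll every 10 seconds, up to 30 times, for a check of
# your choice (here, a hypothetical function wrapping kubectl).
#   gpu_pods_listed() { kubectl get pods -n gpu-operator-resources | grep -q validator; }
#   wait_until 30 10 gpu_pods_listed
```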

Please refer to the GPU Operator page on NGC for more information.