Prerequisites

The deployment procedure in this guide assumes that BCM with K8s is already installed on a DGX BasePOD configuration. If needed, see the NVIDIA DGX BasePOD Deployment Guide for information on how to perform the installation.

Ensure that the tenant name and application secret key have been provided by the Run:ai support team before starting the installation. The tenant name is the dedicated control plane URL for accessing the Run:ai Atlas platform and the application secret key is an API key required to securely communicate with the platform.

In addition, the Cluster URL, certificate (in .crt format), and private key (in a key file) must be generated by the customer's IT department before installation begins.

Note

Run:ai will not function properly with self-signed certificates.

The Cluster URL corresponds to a DNS A record that is created and maintained on the customer's enterprise DNS server. The hostname for the DNS A record should be unique and resolve to one of the nodes within the BCM K8s cluster.

The port that the NGINX ingress service is listening on must be appended to the Cluster URL. Run the following command in the CLI to determine the ingress-nginx port mapping:

root@basepod-head1:~# kubectl get svc -n ingress-nginx
NAME                                 TYPE        CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE
ingress-nginx-controller             NodePort    10.150.30.30    10.130.122.9   80:30080/TCP,443:30443/TCP   12d
ingress-nginx-controller-admission   ClusterIP   10.150.166.37   <none>         443/TCP                      12d

In this example, the port mapping for 443 is 30443.
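
The lookup above can also be scripted. The following is a minimal sketch, not part of the official procedure: the helper names nodeport_for and append_port and the hostname runai.example.com are hypothetical, and the PORT(S) string is taken from the example output. It pulls the NodePort mapped to 443 out of a PORT(S) value and appends it to a Cluster URL.

```shell
#!/bin/sh
# Sketch only: helper names and the hostname are illustrative assumptions.

# nodeport_for PORTS SERVICE_PORT
# Prints the NodePort mapped to SERVICE_PORT in a kubectl PORT(S) string,
# for example "443:30443/TCP" yields 30443.
nodeport_for() {
  printf '%s\n' "$1" | tr -d ' ' | tr ',' '\n' |
    awk -F'[:/]' -v want="$2" '$1 == want { print $2 }'
}

# append_port URL PORT
# Prints URL with ":PORT" appended.
append_port() {
  printf '%s:%s\n' "$1" "$2"
}

# PORT(S) value from the example output; the hostname is a placeholder.
ports='80:30080/TCP,443:30443/TCP'
port=$(nodeport_for "$ports" 443)
append_port 'https://runai.example.com' "$port"

# On a live cluster, the same value can likely be read directly (not run here):
#   kubectl get svc ingress-nginx-controller -n ingress-nginx \
#     -o jsonpath='{.spec.ports[?(@.port==443)].nodePort}'
```

Parsing the string keeps the sketch runnable without cluster access. On a real system, the commented kubectl query avoids depending on the column layout of the tabular output.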

Additionally, the certificate CN must match the DNS record of the Cluster URL to secure all inbound traffic to the cluster.
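
The certificate requirements can be checked before installation with a few openssl commands. The following is a minimal sketch under stated assumptions: the function names, file names, and hostname are hypothetical, and the self-signed test is only a heuristic (it compares subject and issuer) rather than a full chain validation.

```shell
#!/bin/sh
# Sketch only: function names, file names, and hostname are illustrative assumptions.

# cert_cn CERT_FILE
# Prints the subject CN of a PEM certificate.
cert_cn() {
  openssl x509 -in "$1" -noout -subject -nameopt RFC2253 |
    sed -n 's/^subject=//; s/.*CN=\([^,]*\).*/\1/p'
}

# is_self_signed CERT_FILE
# Heuristic: succeeds when the subject and issuer names are identical.
is_self_signed() {
  subject=$(openssl x509 -in "$1" -noout -subject -nameopt RFC2253 | sed 's/^subject= *//')
  issuer=$(openssl x509 -in "$1" -noout -issuer -nameopt RFC2253 | sed 's/^issuer= *//')
  [ "$subject" = "$issuer" ]
}

# key_matches_cert CERT_FILE KEY_FILE
# Succeeds when the private key and the certificate share the same public key.
key_matches_cert() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey)
  key_pub=$(openssl pkey -in "$2" -pubout)
  [ "$cert_pub" = "$key_pub" ]
}

# Example usage (placeholder file names and hostname, not run here):
#   [ "$(cert_cn runai.crt)" = "runai.example.com" ] || echo "CN does not match the Cluster URL hostname"
#   is_self_signed runai.crt && echo "certificate appears to be self-signed"
#   key_matches_cert runai.crt runai.key || echo "private key does not belong to the certificate"
```

These checks cover only the points named in this section: the CN, the self-signed restriction, and the certificate and key pairing.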

Other prerequisites for the Run:ai installation are listed at: https://docs.run.ai/admin/runai-setup/cluster-setup/cluster-prerequisites/