High Availability#

This documentation is part of NVIDIA DGX BasePOD: Deployment Guide Featuring NVIDIA DGX A100 Systems.

Warning

The # prompt indicates commands that you execute as the root user on a head node. The % prompt indicates commands that you execute within cmsh.

Configure High Availability #

This section covers how to configure high availability (HA) using the cmha-setup CLI wizard.

Ensure that both head nodes are licensed.

The MAC address of the secondary head node was provided when the cluster license was installed. For details, see Cluster Configuration Steps.
```
% main licenseinfo | grep ^MAC
MAC address / Cloud ID               04:3F:72:E7:67:07|14:02:EC:DA:AF:18
```
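
The license check above can also be scripted. The following is a minimal sketch, not part of the product: the helper name `count_license_macs` is hypothetical, and it assumes the MAC addresses appear as colon-separated hex pairs joined by `|`, as in the sample output.

```shell
# Count the MAC addresses on the "MAC address" line of `main licenseinfo`
# output read from stdin. Hypothetical helper, not a cmsh feature.
# Assumption: MACs are colon-separated hex pairs, as in the sample above.
count_license_macs() {
  awk '
    # gsub returns the number of matches; "&" puts each match back unchanged.
    /^MAC/ { n += gsub(/([0-9A-Fa-f][0-9A-Fa-f]:)+[0-9A-Fa-f][0-9A-Fa-f]/, "&") }
    END    { print n + 0 }
  '
}
```

For an HA pair, piping the license information through the helper, for example `cmsh -c "main licenseinfo" | count_license_macs`, would be expected to print 2.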

Configure the NFS shared storage.

Mounts configured in fsmounts will be automatically mounted by the CMDaemon.

% device
% use master
% fsmounts
% add /nfs/general
% set device 10.227.48.252:/var/nfs/general
% set filesystem nfs
% commit
% show
Parameter                        Value
---------------------------- ------------------------------------------------
Device                           10.227.48.252:/var/nfs/general
Revision
Filesystem                       nfs
Mountpoint                       /nfs/general
Dump                             no
RDMA                             no
Filesystem Check                 NONE
Mount options                    defaults
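
To check a committed setting without reading the table by eye, the `show` output can be parsed. This is a sketch under the assumption that parameter names and values are separated by at least two spaces, as in the table above; `param_value` is a hypothetical helper name.

```shell
# Print the value of one parameter from cmsh `show` output read from stdin.
# Hypothetical helper. Assumption: name and value are separated by two or
# more spaces, which also keeps "Filesystem" distinct from "Filesystem Check".
param_value() {
  awk -v name="$1" '
    index($0, name) == 1 {
      rest = substr($0, length(name) + 1)
      if (rest ~ /^  +/) {
        sub(/^ +/, "", rest)   # strip the column padding
        sub(/ +$/, "", rest)   # strip trailing spaces
        print rest
        exit
      }
    }'
}
```

With the table above saved to a file, `param_value Filesystem < show.txt` prints `nfs`, and a parameter with an empty value, such as Revision, prints nothing.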

Verify that the shared storage is mounted.

# mount | grep '/nfs/general'
10.227.48.252:/var/nfs/general on /nfs/general type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.227.48.254,local_lock=none,addr=10.227.48.252)
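
The mount check can be turned into a pass/fail test for use in a script. The sketch below assumes the usual Linux `mount` output layout (`source on mountpoint type fstype (options)`); `nfs_source` is a hypothetical helper name.

```shell
# Read `mount` output on stdin. If the given mount point is mounted with an
# NFS filesystem type, print its source and return 0; otherwise return 1.
# Hypothetical helper, assuming "SOURCE on MOUNTPOINT type FSTYPE (...)" lines.
nfs_source() {
  awk -v mp="$1" '
    $2 == "on" && $3 == mp && $4 == "type" && $5 ~ /^nfs/ {
      print $1
      found = 1
      exit
    }
    END { exit found ? 0 : 1 }
  '
}
```

For example, `mount | nfs_source /nfs/general` would be expected to print `10.227.48.252:/var/nfs/general` on a correctly configured head node, and to return a nonzero status if the share is not mounted.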

Verify that the head node has power control over the cluster nodes.

% device
% power -c dgx,k8s-master status
ipmi0 .................... [   ON    ] dgx01
ipmi0 .................... [   ON    ] dgx02
ipmi0 .................... [   ON    ] dgx03
ipmi0 .................... [   ON    ] dgx04
ipmi0 .................... [   ON    ] knode01
ipmi0 .................... [   ON    ] knode02
ipmi0 .................... [   ON    ] knode03

Power off the cluster nodes. The cluster nodes must be powered off before configuring HA.

% power -c k8s-master,dgx off
ipmi0 .................... [   OFF   ] knode01
ipmi0 .................... [   OFF   ] knode02
ipmi0 .................... [   OFF   ] knode03
ipmi0 .................... [   OFF   ] dgx01
ipmi0 .................... [   OFF   ] dgx02
ipmi0 .................... [   OFF   ] dgx03
ipmi0 .................... [   OFF   ] dgx04
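
Because HA configuration requires every cluster node to be off, it can help to list any node that is not in the expected power state. The following sketch parses power output of the form shown above; `nodes_not_in_state` is a hypothetical helper name, and the bracketed state format is assumed from the sample.

```shell
# Read cmsh power output on stdin and print the name of every node whose
# bracketed state differs from the expected one (for example ON or OFF).
# Hypothetical helper; the "[   STATE   ] node" layout is assumed from the
# sample output above.
nodes_not_in_state() {
  awk -v want="$1" '
    match($0, /\[ *[A-Z]+ *\]/) {
      state = substr($0, RSTART, RLENGTH)
      gsub(/[^A-Z]/, "", state)     # keep only the state word
      if (state != want) print $NF  # node name is the last field
    }'
}
```

Assuming that `cmsh -c "device; power -c k8s-master,dgx status"` produces the same output as the interactive command, piping it into `nodes_not_in_state OFF` prints nothing when all nodes are powered off.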

Start the cmha-setup CLI wizard as the root user on the primary head node.
```
# cmha-setup
```
Choose Setup and then select SELECT.
Choose Configure and then select NEXT.

Verify that the cluster license information found by the wizard is correct and then select CONTINUE.

Configure an external virtual IP address to be used by the active head node in the HA configuration and then select NEXT.

This is the IP address that should always be used to access the active head node.
Provide an internal virtual IP address that will be used by the active head node in the HA configuration and then select NEXT.
Provide the name of the secondary head node and then select NEXT.
DGX BasePOD uses the internal network as the failover network, so select SKIP.
Configure the IP addresses for the secondary head node and then select NEXT.
Review the summary of the configuration and then select NEXT.

This screen shows the VIPs that will be assigned to the internal and external interfaces.
Select Yes to proceed with the failover configuration.
Enter the MySQL root password and then select OK. This should be the same as the root password.

The wizard implements the first steps in the HA configuration. If all the steps show OK, press ENTER to continue. The progress is shown below:

Initializing failover setup on master.............. [  OK  ]
Updating shared internal interface................. [  OK  ]
Updating shared external interface................. [  OK  ]
Updating extra shared internal interfaces.......... [  OK  ]
Cloning head node.................................. [  OK  ]
Updating secondary master interfaces............... [  OK  ]
Updating Failover Object........................... [  OK  ]
Restarting cmdaemon................................ [  OK  ]
Press any key to continue

When the failover setup installation on the primary master is complete, select OK to exit the wizard.
PXE boot the secondary head node and then select RESCUE from the GRUB menu.

Because this is the initial boot of this node, it must be powered on outside of Base Command Manager (using the BMC or the physical power button).
After the secondary head node has booted into the rescue environment, run the /cm/cm-clone-install --failover command, then enter yes when prompted.

The secondary head node will be cloned from the primary.
When cloning is completed, enter y to reboot the secondary head node.

The secondary must boot from its hard drive. PXE boot should not be enabled.
Wait for the secondary head node to reboot and then continue the HA setup procedure on the primary head node.
Choose Finalize from the cmha-setup menu and then select NEXT.

This will clone the MySQL database from the primary to the secondary head node.
Select CONTINUE on the confirmation screen.
Enter the MySQL root password and then select OK. This should be the same as the root password.

The cmha-setup wizard continues. Press ENTER to continue when prompted.

The progress is shown below:

Updating secondary master mac address.............. [  OK  ]
Initializing failover setup on basepod-head2....... [  OK  ]
Stopping cmdaemon.................................. [  OK  ]
Cloning cmdaemon database.......................... [  OK  ]
Checking database consistency...................... [  OK  ]
Starting cmdaemon, chkconfig services.............. [  OK  ]
Cloning workload manager databases................. [  OK  ]
Cloning additional databases....................... [  OK  ]
Update DB permissions.............................. [  OK  ]
Checking for dedicated failover network............ [  OK  ]
Press any key to continue

Select REBOOT when the WARNING: REBOOT REQUIRED screen is shown.

Wait for the secondary head node to reboot before continuing.

The secondary head node is now UP.

% device list -f hostname:20,category:12,ip:20,status:15
hostname (key)       category     ip                   status
-------------------- ---------- -------------------- ---------------
basepod-head1                     10.227.48.254       [   UP   ]
basepod-head2                     10.227.48.253       [   UP   ]
knode01              k8s-master   10.227.48.9         [  DOWN  ]
knode02              k8s-master   10.227.48.10        [  DOWN  ]
knode03              k8s-master   10.227.48.11        [  DOWN  ]
dgx01                dgx          10.227.48.5         [  DOWN  ]
dgx02                dgx          10.227.48.6         [  DOWN  ]
dgx03                dgx          10.227.48.7         [  DOWN  ]
dgx04                dgx          10.227.48.8         [  DOWN  ]
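
Waiting for the secondary head node can be automated with a small polling loop. This is a sketch only: `list_devices`, `host_status`, and `wait_for_host_up` are hypothetical helper names, and the `cmsh -c` one-liner is assumed to behave like the interactive `device list` command above.

```shell
# Assumption: this one-liner mirrors the interactive `device list -f ...`
# command shown above.
list_devices() {
  cmsh -c "device; list -f hostname:20,category:12,ip:20,status:15"
}

# Print the bracketed status word (UP, DOWN, ...) for one host from
# `device list` output read on stdin.
host_status() {
  awk -v host="$1" '
    $1 == host && match($0, /\[ *[A-Z]+ *\]/) {
      state = substr($0, RSTART, RLENGTH)
      gsub(/[^A-Z]/, "", state)
      print state
      exit
    }'
}

# Poll until the host reports UP. Arguments: host, attempts, delay in seconds.
# Returns 0 once the host is UP, or 1 if all attempts are used up.
wait_for_host_up() {
  host="$1"; attempts="$2"; delay="$3"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(list_devices | host_status "$host")" = "UP" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_host_up basepod-head2 60 10` checks every 10 seconds for up to 10 minutes. The attempt count and delay are illustrative values, not recommendations from this guide.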

Choose Shared Storage from the cmha-setup menu and select SELECT.

In this final HA configuration step, cmha-setup copies the /cm/shared and /home directories to the shared storage and configures both head nodes and all cluster nodes to mount them from there.
Choose NAS and then select SELECT.
Choose /cm/shared and /home and then select NEXT.
Provide the IP address of the NAS host and the paths on the shared storage that the /cm/shared and /home directories should be copied to, and then select NEXT.

In this case, /var/nfs/general is exported, so the /cm/shared directory will be copied to 10.227.48.252:/var/nfs/general/cmshared, and it will be mounted over /cm/shared on the cluster nodes.
The wizard shows a summary of the information that it has collected. Press ENTER to continue.
Select YES when prompted to proceed with the setup.

The cmha-setup wizard proceeds with its work. When it completes, press ENTER to finish the HA setup.

The progress is shown below:

Copying NAS data................................... [  OK  ]
Mount NAS storage.................................. [  OK  ]
Remove old fsmounts................................ [  OK  ]
Add new fsmounts................................... [  OK  ]
Remove old fsexports............................... [  OK  ]
Write NAS mount/unmount scripts.................... [  OK  ]
Copy mount/unmount scripts......................... [  OK  ]
Press any key to continue

cmha-setup is now complete. Select EXIT to return to the shell prompt.

Verify the High Availability Setup #

Run the cmha status command to verify that the failover configuration is correct and working as expected.

Note that the command tests the configuration from both directions: from the primary head node to the secondary, and from the secondary to the primary. The active head node is indicated by an asterisk.
```
# cmha status
Node Status: running in active mode

basepod-head1* -> basepod-head2
mysql         [  OK  ]
ping          [  OK  ]
status        [  OK  ]

basepod-head2 -> basepod-head1*
mysql         [  OK  ]
ping          [  OK  ]
status        [  OK  ]
```
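
The status output can be checked programmatically, which is convenient before and after a failover. The sketch below assumes the output layout shown above, with three checks (mysql, ping, status) in each of the two directions; `active_head` and `ha_checks_ok` are hypothetical helper names.

```shell
# Print the name of the active head node (the one marked with an asterisk)
# from `cmha status` output read on stdin.
active_head() {
  awk '/->/ {
    for (i = 1; i <= NF; i++)
      if ($i ~ /\*$/) { sub(/\*$/, "", $i); print $i; exit }
  }'
}

# Return 0 only if all six checks (mysql, ping, status in both directions)
# report OK. Assumption: exactly two head nodes and three checks each.
ha_checks_ok() {
  awk '
    $1 == "mysql" || $1 == "ping" || $1 == "status" {
      total++
      if ($0 ~ /\[ *OK *\]/) ok++
    }
    END { exit (total == 6 && ok == 6) ? 0 : 1 }
  '
}
```

For example, `cmha status | active_head` prints the active head node name, and `cmha status | ha_checks_ok` returns a nonzero status if any check is missing or not OK.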

Verify that the /cm/shared and /home directories are being mounted from the NAS server.

# mount
. . . some output omitted . . .
10.227.48.252:/var/nfs/general/cmshared on /cm/shared type nfs4 (rw,relatime,vers=4.2,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.227.48.254,local_lock=none,addr=10.227.48.252)
10.227.48.252:/var/nfs/general/home on /home type nfs4 (rw,relatime,vers=4.2,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.227.48.254,local_lock=none,addr=10.227.48.252)

Log in to the head node to be made active and run cmha makeactive.

# ssh basepod-head2
# cmha makeactive
=========================================================================
This is the passive head node. Please confirm that this node should become
the active head node. After this operation is complete, the HA status of
the head nodes will be as follows:

basepod-head2 will become active head node (current state: passive)
basepod-head1 will become passive head node (current state: active)
=========================================================================

Continue(c)/Exit(e)? c

Initiating failover.............................. [  OK  ]

basepod-head2 is now active head node, makeactive successful

Run the cmha status command again to verify that the secondary head node has become the active head node.

# cmha status
Node Status: running in active mode

basepod-head2* -> basepod-head1
mysql         [  OK  ]
ping          [  OK  ]
status        [  OK  ]

basepod-head1 -> basepod-head2*
mysql         [  OK  ]
ping          [  OK  ]
status        [  OK  ]

Manually fail over back to the primary head node.

# ssh basepod-head1
# cmha makeactive

===========================================================================
This is the passive head node. Please confirm that this node should become
the active head node. After this operation is complete, the HA status of
the head nodes will be as follows:

basepod-head1 will become active head node (current state: passive)
basepod-head2 will become passive head node (current state: active)
===========================================================================

Continue(c)/Exit(e)? c

Initiating failover.............................. [  OK  ]

basepod-head1 is now active head node, makeactive successful

Run cmha status again to verify that the primary head node has become the active head node.

# cmha status
Node Status: running in active mode

basepod-head1* -> basepod-head2
mysql         [  OK  ]
ping          [  OK  ]
status        [  OK  ]

basepod-head2 -> basepod-head1*
mysql         [  OK  ]
ping          [  OK  ]
status        [  OK  ]

Power on the cluster nodes.

# cmsh -c "power -c k8s-master,dgx on"
ipmi0 .................... [   ON    ] knode01
ipmi0 .................... [   ON    ] knode02
ipmi0 .................... [   ON    ] knode03
ipmi0 .................... [   ON    ] dgx01
ipmi0 .................... [   ON    ] dgx02
ipmi0 .................... [   ON    ] dgx03
ipmi0 .................... [   ON    ] dgx04

(Optional) Configure Jupyter High Availability #

If Jupyter was deployed on the primary head node before HA was configured, configure the Jupyter service to run on the active head node.

% device
% use basepod-head1
% services
% use cm-jupyterhub
% show
Parameter                        Value
-------------------------------- --------------------------------------------
Revision
Service                          cm-jupyterhub
Run if                           ALWAYS
Monitored                        yes
Autostart                        yes
Timeout                          -1
Belongs to role                  yes
Sickness check script
Sickness check script timeout    10
Sickness check interval          60

Set the runif parameter to active.

% set runif active
% commit

% show
Parameter                        Value
-------------------------------- --------------------------------------------
Revision
Service                          cm-jupyterhub
Run if                           ACTIVE
Monitored                        yes
Autostart                        yes
Timeout                          -1
Belongs to role                  yes
Sickness check script
Sickness check script timeout    10
Sickness check interval          60

Configure the Jupyter service on the secondary head node.

% device
% use basepod-head2
% services
% use cm-jupyterhub
% set runif active
% commit
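
The same change has to be applied to both head nodes, so the cmsh command sequence can be generated instead of typed twice. This is a minimal sketch: `runif_commands` is a hypothetical helper, and it only prints the command strings (taken from the steps above, followed by a commit) so that they can be reviewed before being run.

```shell
# Print one semicolon-separated cmsh command sequence per head node that
# sets the runif parameter of a service to active and commits the change.
# Hypothetical helper. Arguments: service name, then one or more host names.
runif_commands() {
  service="$1"
  shift
  for host in "$@"; do
    printf 'device; use %s; services; use %s; set runif active; commit\n' \
      "$host" "$service"
  done
}
```

Running `runif_commands cm-jupyterhub basepod-head1 basepod-head2` prints two lines. After review, each line could be passed to `cmsh -c`, assuming the one-liner form behaves like the interactive session.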
