High Availability

This documentation is part of NVIDIA DGX BasePOD: Deployment Guide Featuring NVIDIA DGX A100 Systems.

Warning

The # prompt indicates commands that you execute as the root user on a head node. The % prompt indicates commands that you execute within cmsh.

Configure High Availability

This section covers how to configure high availability (HA) using the cmha-setup CLI wizard.

  1. Ensure that both head nodes are licensed.

    The MAC address of the secondary head node was provided when the cluster license was installed. For details, see Cluster Configuration Steps.

    % main licenseinfo | grep ^MAC
    MAC address / Cloud ID               04:3F:72:E7:67:07|14:02:EC:DA:AF:18
    
  2. Configure the NFS shared storage.

    Mounts configured in fsmounts are mounted automatically by CMDaemon.

    % device
    % use master
    % fsmounts
    % add /nfs/general
    % set device 10.227.48.252:/var/nfs/general
    % set filesystem nfs
    % commit
    % show
    Parameter                        Value
    -------------------------------- ------------------------------------------------
    Device                           10.227.48.252:/var/nfs/general
    Revision
    Filesystem                       nfs
    Mountpoint                       /nfs/general
    Dump                             no
    RDMA                             no
    Filesystem Check                 NONE
    Mount options                    defaults
    
  3. Verify that the shared storage is mounted.

    # mount | grep '/nfs/general'
    10.227.48.252:/var/nfs/general on /nfs/general type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=<head-node-ip>,local_lock=none,addr=10.227.48.252)
    
  4. Verify that the head node has power control over the cluster nodes.

    % device
    % power -c dgx,k8s-master status
    ipmi0 .................... [   ON    ] dgx01
    ipmi0 .................... [   ON    ] dgx02
    ipmi0 .................... [   ON    ] dgx03
    ipmi0 .................... [   ON    ] dgx04
    ipmi0 .................... [   ON    ] knode01
    ipmi0 .................... [   ON    ] knode02
    ipmi0 .................... [   ON    ] knode03
    
  5. Power off the cluster nodes. The cluster nodes must be powered off before configuring HA.

    % power -c k8s-master,dgx off
    ipmi0 .................... [   OFF   ] knode01
    ipmi0 .................... [   OFF   ] knode02
    ipmi0 .................... [   OFF   ] knode03
    ipmi0 .................... [   OFF   ] dgx01
    ipmi0 .................... [   OFF   ] dgx02
    ipmi0 .................... [   OFF   ] dgx03
    ipmi0 .................... [   OFF   ] dgx04
    
  6. Start the cmha-setup CLI wizard as the root user on the primary head node.

    # cmha-setup
    
  7. Choose Setup and then select SELECT.

  8. Choose Configure and then select NEXT.

  9. Verify that the cluster license information found by the wizard is correct and then select CONTINUE.

  10. Configure an external virtual IP address to be used by the active head node in the HA configuration and then select NEXT.

    This is the IP address that should always be used to access the active head node.

    _images/high-availability-10.png

  11. Provide an internal virtual IP address that will be used by the active head node in the HA configuration and then select NEXT.

    _images/high-availability-11.png

  12. Provide the name of the secondary head node and then select NEXT.

    _images/high-availability-12.png

  13. DGX BasePOD uses the internal network as the failover network, so select SKIP.

    _images/high-availability-13.png

  14. Configure the IP addresses for the secondary head node and then select NEXT.

    _images/high-availability-14.png

  15. Review the summary of the configuration and then select NEXT.

    This screen shows the virtual IP addresses (VIPs) that will be assigned to the internal and external interfaces.

    _images/high-availability-15.png

  16. Select Yes to proceed with the failover configuration.

    _images/high-availability-16.png

  17. Enter the MySQL root password and then select OK. This should be the same as the root password.

    _images/high-availability-17.png

  18. The wizard implements the first steps in the HA configuration. If all the steps show OK, press ENTER to continue. The progress is shown below:

    Initializing failover setup on master.............. [  OK  ]
    Updating shared internal interface................. [  OK  ]
    Updating shared external interface................. [  OK  ]
    Updating extra shared internal interfaces.......... [  OK  ]
    Cloning head node.................................. [  OK  ]
    Updating secondary master interfaces............... [  OK  ]
    Updating Failover Object........................... [  OK  ]
    Restarting cmdaemon................................ [  OK  ]
    Press any key to continue

  19. When the failover setup installation on the primary head node is complete, select OK to exit the wizard.

    _images/high-availability-19.png

  20. PXE boot the secondary head node and then select RESCUE from the GRUB menu.

    Since this is the initial boot of this node, it must be powered on outside of Base Command Manager, using either the BMC or the physical power button.

    _images/high-availability-20.png

  21. After the secondary head node has booted into the rescue environment, run the /cm/cm-clone-install --failover command, then enter yes when prompted.

    The secondary head node will be cloned from the primary.

    _images/high-availability-21.png

  22. When cloning is completed, enter y to reboot the secondary head node.

    The secondary must boot from its hard drive. PXE boot should not be enabled.

    _images/high-availability-22.png

  23. Wait for the secondary head node to reboot and then continue the HA setup procedure on the primary head node.

  24. Choose Finalize from the cmha-setup menu and then select NEXT.

    This will clone the MySQL database from the primary to the secondary head node.

    _images/high-availability-24.png

  25. Select CONTINUE on the confirmation screen.

    _images/high-availability-25.png

  26. Enter the MySQL root password and then select OK. This should be the same as the root password.

    _images/high-availability-26.png

  27. The cmha-setup wizard continues. Press ENTER to continue when prompted.

    _images/high-availability-27.png

    The progress is shown below:

    Updating secondary master mac address.............. [  OK  ]
    Initializing failover setup on basepod-head2....... [  OK  ]
    Stopping cmdaemon.................................. [  OK  ]
    Cloning cmdaemon database.......................... [  OK  ]
    Checking database consistency...................... [  OK  ]
    Starting cmdaemon, chkconfig services.............. [  OK  ]
    Cloning workload manager databases................. [  OK  ]
    Cloning additional databases....................... [  OK  ]
    Update DB permissions.............................. [  OK  ]
    Checking for dedicated failover network............ [  OK  ]
    Press any key to continue

  28. Select REBOOT when the WARNING: REBOOT REQUIRED screen is shown.

    Wait for the secondary head node to reboot before continuing.

    _images/high-availability-28.png

  29. Verify that the secondary head node is now UP.

    % device list -f hostname:20,category:12,ip:20,status:15
    hostname (key)       category     ip                   status
    -------------------- ------------ -------------------- ---------------
    basepod-head1                     10.227.48.254        [   UP   ]
    basepod-head2                     10.227.48.253        [   UP   ]
    knode01              k8s-master   10.227.48.9          [  DOWN  ]
    knode02              k8s-master   10.227.48.10         [  DOWN  ]
    knode03              k8s-master   10.227.48.11         [  DOWN  ]
    dgx01                dgx          10.227.48.5          [  DOWN  ]
    dgx02                dgx          10.227.48.6          [  DOWN  ]
    dgx03                dgx          10.227.48.7          [  DOWN  ]
    dgx04                dgx          10.227.48.8          [  DOWN  ]

  30. Choose Shared Storage from the cmha-setup menu and then select SELECT.

    In this final HA configuration step, cmha-setup copies the /cm/shared and /home directories to the shared storage and configures both head nodes and all cluster nodes to mount it.

    _images/high-availability-30.png

  31. Choose NAS and then select SELECT.

    _images/high-availability-31.png

  32. Choose /cm/shared and /home and then select NEXT.

    _images/high-availability-32.png

  33. Provide the IP address of the NAS host and the paths on the shared storage to which the /cm/shared and /home directories should be copied, and then select NEXT.

    In this case, /var/nfs/general is exported, so the /cm/shared directory will be copied to 10.227.48.252:/var/nfs/general/cmshared, and it will be mounted over /cm/shared on the cluster nodes.

    _images/high-availability-33.png

  34. The wizard shows a summary of the information that it has collected. Press ENTER to continue.

  35. Select YES when prompted to proceed with the setup.

    _images/high-availability-35.png

  36. The cmha-setup wizard proceeds with its work. When it completes, press ENTER to finish the HA setup.

    _images/high-availability-36.png

    The progress is shown below:

    Copying NAS data................................... [  OK  ]
    Mount NAS storage.................................. [  OK  ]
    Remove old fsmounts................................ [  OK  ]
    Add new fsmounts................................... [  OK  ]
    Remove old fsexports............................... [  OK  ]
    Write NAS mount/unmount scripts.................... [  OK  ]
    Copy mount/unmount scripts......................... [  OK  ]
    Press any key to continue

  37. cmha-setup is now complete. Select EXIT to return to the shell prompt.

    _images/high-availability-37.png
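
Step 5 requires every cluster node to be powered off before cmha-setup is started. As a rough sketch, and not part of the documented procedure, the shell function below reads power status output in the format shown in step 4 and succeeds only when every node reports OFF. The name all_nodes_off is hypothetical, and the line pattern is assumed from the sample output in this section, so other power states or output layouts may need a different pattern.

```shell
# all_nodes_off: hypothetical helper. Reads power status lines on stdin, e.g.
#   ipmi0 .................... [   OFF   ] dgx01
# Exits 0 only if at least one status line is seen and all of them say OFF.
all_nodes_off() {
    awk '
        /\.+ \[ *[A-Za-z]+ *\]/ {
            seen++
            if ($0 !~ /\[ *OFF *\]/) {
                not_off++
                print "not powered off: " $NF
            }
        }
        END {
            if (seen > 0 && not_off == 0) exit 0
            exit 1
        }
    '
}

# Example (as root on the head node; assumes cmsh accepts
# semicolon-separated commands in a single -c string):
#   cmsh -c "device; power -c k8s-master,dgx status" | all_nodes_off \
#       && echo "All nodes are off"
```

Treating empty input as a failure is deliberate: if the status command prints nothing, the check should not be read as "all nodes are off".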

Verify the High Availability Setup

  1. Run the cmha status command to verify that the failover configuration is correct and working as expected.

    Note that the command tests the configuration from both directions: from the primary head node to the secondary, and from the secondary to the primary. The active head node is indicated by an asterisk.

    # cmha status
    Node Status: running in active mode

    basepod-head1* -> basepod-head2
    mysql         [  OK  ]
    ping          [  OK  ]
    status        [  OK  ]

    basepod-head2 -> basepod-head1*
    mysql         [  OK  ]
    ping          [  OK  ]
    status        [  OK  ]
    
  2. Verify that the /cm/shared and /home directories are being mounted from the NAS server.

    # mount
    . . . some output omitted . . .
    10.227.48.252:/var/nfs/general/cmshared on /cm/shared type nfs4 (rw,relatime,vers=4.2,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=<head-node-ip>,local_lock=none,addr=10.227.48.252)
    10.227.48.252:/var/nfs/general/home on /home type nfs4 (rw,relatime,vers=4.2,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=<head-node-ip>,local_lock=none,addr=10.227.48.252)
    
  3. Log in to the head node that is to become active and run cmha makeactive.

    # ssh basepod-head2
    # cmha makeactive
    =========================================================================
    This is the passive head node. Please confirm that this node should become
    the active head node. After this operation is complete, the HA status of
    the head nodes will be as follows:

    basepod-head2 will become active head node (current state: passive)
    basepod-head1 will become passive head node (current state: active)
    =========================================================================

    Continue(c)/Exit(e)? c

    Initiating failover.............................. [  OK  ]

    basepod-head2 is now active head node, makeactive successful
    
  4. Run the cmha status command again to verify that the secondary head node has become the active head node.

    # cmha status
    Node Status: running in active mode

    basepod-head2* -> basepod-head1
    mysql         [  OK  ]
    ping          [  OK  ]
    status        [  OK  ]

    basepod-head1 -> basepod-head2*
    mysql         [  OK  ]
    ping          [  OK  ]
    status        [  OK  ]
    
  5. Manually fail over back to the primary head node.

    # ssh basepod-head1
    # cmha makeactive

    ===========================================================================
    This is the passive head node. Please confirm that this node should become
    the active head node. After this operation is complete, the HA status of
    the head nodes will be as follows:

    basepod-head1 will become active head node (current state: passive)
    basepod-head2 will become passive head node (current state: active)
    ===========================================================================

    Continue(c)/Exit(e)? c

    Initiating failover.............................. [  OK  ]

    basepod-head1 is now active head node, makeactive successful
    
  6. Run cmha status again to verify that the primary head node has become the active head node.

    # cmha status
    Node Status: running in active mode

    basepod-head1* -> basepod-head2
    mysql         [  OK  ]
    ping          [  OK  ]
    status        [  OK  ]

    basepod-head2 -> basepod-head1*
    mysql         [  OK  ]
    ping          [  OK  ]
    status        [  OK  ]
    
  7. Power on the cluster nodes.

    # cmsh -c "power -c k8s-master,dgx on"
    ipmi0 .................... [   ON    ] knode01
    ipmi0 .................... [   ON    ] knode02
    ipmi0 .................... [   ON    ] knode03
    ipmi0 .................... [   ON    ] dgx01
    ipmi0 .................... [   ON    ] dgx02
    ipmi0 .................... [   ON    ] dgx03
    ipmi0 .................... [   ON    ] dgx04
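
The cmha status checks in this section can also be evaluated by a script, for example after each cmha makeactive. The sketch below is one possible way to do that, assuming the output layout shown in the examples above: the active head node carries a trailing asterisk, and each check line ends in a bracketed result. The function names active_head and cmha_all_ok are hypothetical, and any bracketed result other than OK is treated as a failure because the failure wording is not shown in this guide.

```shell
# active_head: prints the hostname marked with "*" in the first "A -> B" line
# of cmha status output read from stdin.
active_head() {
    awk '/->/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /\*$/) {
                sub(/\*$/, "", $i)
                print $i
                exit
            }
        }
    }'
}

# cmha_all_ok: exits 0 only if at least one bracketed check result is present
# and every one of them is OK.
cmha_all_ok() {
    awk '
        /\[.*\]/ { checks++; if ($0 !~ /\[ *OK *\]/) failed++ }
        END { if (checks > 0 && failed == 0) exit 0; exit 1 }
    '
}

# Example (as root on a head node):
#   status=$(cmha status)
#   printf '%s\n' "$status" | cmha_all_ok && printf '%s\n' "$status" | active_head
```

Capturing the output once and feeding it to both functions keeps the two answers consistent with each other, since the HA state could change between two separate cmha status calls.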
    

(Optional) Configure Jupyter High Availability

  1. If Jupyter was deployed on the primary head node before HA was configured, configure the Jupyter service to run on the active head node.

    % device
    % use basepod-head1
    % services
    % use cm-jupyterhub
    % show
    Parameter                        Value
    -------------------------------- --------------------------------------------
    Revision
    Service                          cm-jupyterhub
    Run if                           ALWAYS
    Monitored                        yes
    Autostart                        yes
    Timeout                          -1
    Belongs to role                  yes
    Sickness check script
    Sickness check script timeout    10
    Sickness check interval          60
    
  2. Set the runif parameter to active.

    % set runif active
    % commit

    % show
    Parameter                        Value
    -------------------------------- --------------------------------------------
    Revision
    Service                          cm-jupyterhub
    Run if                           ACTIVE
    Monitored                        yes
    Autostart                        yes
    Timeout                          -1
    Belongs to role                  yes
    Sickness check script
    Sickness check script timeout    10
    Sickness check interval          60
    
  3. Configure the Jupyter service on the secondary head node.

    % device
    % use basepod-head2
    % services
    % use cm-jupyterhub
    % set runif active
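
The same Run if setting is applied to the cm-jupyterhub service on each head node. As a hedged sketch, the function below prints, but does not run, a one-line cmsh command per head node that mirrors the interactive sequence above. It assumes that cmsh accepts semicolon-separated commands in a single -c string and that a commit is needed on each node, as in step 2. The name runif_active_cmd is hypothetical.

```shell
# runif_active_cmd: prints a cmsh batch command that sets "Run if" to active
# for a service on one head node and commits the change.
#   $1 = head node hostname
#   $2 = service name (defaults to cm-jupyterhub)
runif_active_cmd() {
    node=$1
    service=${2:-cm-jupyterhub}
    printf 'cmsh -c "device; use %s; services; use %s; set runif active; commit"\n' \
        "$node" "$service"
}

# Dry run: print one command per head node for review before executing it.
for head in basepod-head1 basepod-head2; do
    runif_active_cmd "$head"
done
```

Printing the commands first makes it easy to review the node and service names before running anything against the cluster.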