NVIDIA UFM High-Availability User Guide v5.6.0

Installation and Configuration

The UFM HA package can be downloaded by running the following command:

wget https://www.mellanox.com/downloads/UFM/ufm_ha/5.6.0/ufm_ha_5.6.0-4.tgz

The UFM HA package must be installed on both machines (master and standby), together with the required UFM products. The installation order does not matter. To install the UFM-HA package:

  • Untar the ufm-ha package:

    tar xvzf ufm_ha_<version>.tgz

  • Go to the directory you extracted and run the installation script. For example:

    ./install.sh -l /opt/ufm/files/ -d /dev/sda5 -p enterprise

    For NFS support, run the installation script without the -d option. For example:

    ./install.sh -l /opt/ufm/files/ -p enterprise

    Option    Description
    -l        Sync files location. Must always be /opt/ufm/files/.
    -d        Disk name for DRBD, for example /dev/sda5. Not needed when using NFS.
    -p        Product name. Must be "enterprise" for UFM Enterprise.
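
The difference between the two installation variants above can be sketched as a small helper that prints the matching install.sh command. This is only an illustration: the function name ufm_ha_install_cmd is hypothetical and is not part of the UFM HA package, and the helper prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the UFM HA package): prints the install.sh
# command for the chosen sync mode instead of running it.
ufm_ha_install_cmd() {
    local mode="$1" disk="${2:-}"
    case "$mode" in
        drbd)
            # DRBD needs a dedicated disk or partition passed with -d.
            if [ -z "$disk" ]; then
                echo "error: DRBD mode needs a disk, for example /dev/sda5" >&2
                return 1
            fi
            echo "./install.sh -l /opt/ufm/files/ -d $disk -p enterprise"
            ;;
        nfs)
            # NFS mode omits the -d option.
            echo "./install.sh -l /opt/ufm/files/ -p enterprise"
            ;;
        *)
            echo "error: mode must be drbd or nfs" >&2
            return 1
            ;;
    esac
}

ufm_ha_install_cmd drbd /dev/sda5
ufm_ha_install_cmd nfs
```

Review the printed command before running it from the extracted package directory.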

Info

To upgrade a previous installation of ufm_ha to the newer version, run the following command:

./install.sh -u

Info

UFM HA scripts are installed under /usr/bin.

There are two methods to configure the HA cluster:

Configure HA with SSH Trust

  1. On the master server only, configure the HA nodes. To do so, from /tmp, run the configure_ha_nodes.sh command as shown in the example below:

    configure_ha_nodes.sh \
      --cluster-password 12345678 \
      --master-primary-ip 10.10.10.1 \
      --standby-primary-ip 10.10.10.2 \
      --master-secondary-ip 192.168.10.1 \
      --standby-secondary-ip 192.168.10.2 \
      --virtual-ip 10.10.10.5

    Note

    The script configure_ha_nodes.sh is located under /usr/bin/, therefore, by default, you do not need to use the full path to run it.

    Note

    The --cluster-password must be at least 8 characters long.

    Note

    For the HA sync interface to work with PCS version 0.9.X when using back-to-back ports with local IP addresses, add the relevant IP addresses and hostnames to the /etc/hosts file. This allows the HA configuration to resolve hostnames to the specific IP addresses in use.

    Note

    configure_ha_nodes.sh requires an SSH connection to the standby server. If SSH trust is not configured, you are prompted to enter the SSH password of the standby server while the configuration runs.

    Note

    While configuring UFM HA on Oracle Linux, make sure that SELinux is disabled. You can check the SELinux status with sestatus.

    If it is enabled, follow the steps below to disable it:

    • Run vi /etc/selinux/config

    • Set SELINUX=disabled

    • Reboot the machine

    • Verify that SELinux is disabled by running sestatus.

    Option                      Description
    --cluster-password          UFM HA cluster password for authentication by the pacemaker.
    --master-ip                 Master (main) server IP address.
    --standby-ip                Standby server IP address.
    --virtual-ip OR --no-vip    UFM HA cluster virtual IP, or configure HA without a virtual IP.

  2. Wait for the configuration process to complete and for the DRBD sync to finish. The duration depends on the size of your partition.
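
Two of the notes above (the minimum length of --cluster-password and the SELinux requirement on Oracle Linux) lend themselves to a quick pre-flight check before running configure_ha_nodes.sh. The following is a minimal sketch, not part of the UFM HA package: the function names check_cluster_password and selinux_is_disabled are hypothetical, and the SELinux check assumes the usual "SELinux status:" line printed by sestatus.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; they are not part of the UFM HA package.

# Succeeds only if the password has at least 8 characters.
check_cluster_password() {
    local pw="$1"
    if [ "${#pw}" -lt 8 ]; then
        echo "cluster password must be at least 8 characters long" >&2
        return 1
    fi
}

# Reads sestatus output on stdin and succeeds only if it reports "disabled".
selinux_is_disabled() {
    grep -Eq '^SELinux status:[[:space:]]+disabled'
}

# Example usage on a real node (commented out so the sketch has no side effects;
# CLUSTER_PASSWORD is a placeholder variable):
#   check_cluster_password "$CLUSTER_PASSWORD" && sestatus | selinux_is_disabled
```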

Configure HA without SSH Trust

If you cannot establish an SSH trust between your HA servers, you can use ufm_ha_cluster directly to configure HA. You can see all the options for configuring HA in the Help menu:

ufm_ha_cluster config -h

Usage:

ufm_ha_cluster config [<options>] 

Option                                   Description
-r, --role <node role>                   Node role (master or standby).
-e, --peer-primary-ip <ip address>       Peer node primary IP address (mandatory).
-l, --local-primary-ip <ip address>      Local node primary IP address (mandatory).
-E, --peer-secondary-ip <ip address>     Peer node secondary IP address (mandatory).
-L, --local-secondary-ip <ip address>    Local node secondary IP address (mandatory).
-i, --virtual-ip <virtual-ip>            Cluster virtual IP (v4).
    --virtual-ip6 <virtual-ip6>          Cluster virtual IP (v6).
-p, --hacluster-pwd <pwd>                HA cluster user password.
-h, --help                               Show this message.
-N, --no-vip                             Configure HA without virtual IP.
-M, --ignore-mgmt-failure                Ignore management interface status if VIP is configured.
                                         Will not fail over if the master node's secondary IP is down.

To configure HA, follow the below instructions:

Note

Please change the variables in the commands below based on your setup.

  1. [On Standby Server] Run the following command to configure Standby Server:

    ufm_ha_cluster config -r standby -e <peer primary ip address> -l <local primary ip address> -E <peer secondary ip address> -L <local secondary ip address> -p <cluster_password>

  2. [On Master Server] Run the following command to configure Master Server:

    ufm_ha_cluster config -r master -e <peer primary ip address> -l <local primary ip address> -E <peer secondary ip address> -L <local secondary ip address> -p <cluster_password> -i <virtual ip address>
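
Because "local" and "peer" swap depending on which server the command runs on, it can help to generate both command lines from one set of addresses. The sketch below assumes the example addresses used earlier in this guide (10.10.10.1 and 192.168.10.1 for the master, 10.10.10.2 and 192.168.10.2 for the standby, 10.10.10.5 as the virtual IP). The helper name ha_config_cmd is hypothetical; it only prints the commands and leaves the password as a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of UFM HA): prints the ufm_ha_cluster config
# command for one node. Arguments are given from that node's point of view:
#   role, local primary IP, peer primary IP, local secondary IP,
#   peer secondary IP, optional virtual IP.
ha_config_cmd() {
    local role="$1" local_pri="$2" peer_pri="$3" local_sec="$4" peer_sec="$5" vip="${6:-}"
    local cmd="ufm_ha_cluster config -r $role -e $peer_pri -l $local_pri -E $peer_sec -L $local_sec -p <cluster_password>"
    # The guide passes the virtual IP (-i) only in the master command.
    if [ -n "$vip" ]; then
        cmd="$cmd -i $vip"
    fi
    echo "$cmd"
}

# Step 1, run on the standby server:
ha_config_cmd standby 10.10.10.2 10.10.10.1 192.168.10.2 192.168.10.1
# Step 2, run on the master server:
ha_config_cmd master 10.10.10.1 10.10.10.2 192.168.10.1 192.168.10.2 10.10.10.5
```

Note how the values passed to -e and -l (and to -E and -L) are mirrored between the two printed commands.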

The NFS synchronization mechanism can be used instead of DRBD. Multi-node support is available only with the NFS synchronization mechanism, as described in the following section. To activate NFS synchronization, users must define the following parameters:

  • Mode: NFS

  • NFS Server

  • Shared Folder

Ensure that the NFS server supports NFSv4. It is recommended that the NFS server is not one of the UFM-HA nodes. Refer to the Configuration File section below for details on setting these parameters.

The UFM-HA cluster can comprise more than two nodes. Among these nodes, one serves as the master, while the others operate in standby mode.

To configure multiple nodes, users must populate the configuration file '/etc/ufm_ha/ha_nodes.cfg' on all nodes (ensuring that the file is identical across all nodes).

This file contains details about each participating node, including:

  • Role: Master/Standby

  • Primary IP address

  • Secondary IP address

Using File Configuration

The '/etc/ufm_ha/ha_nodes.cfg' file contains all the necessary information for HA configuration and can serve as a replacement for command-line configuration. For security reasons, the password is the only setting not saved in the file.

After populating the configuration file, run the following command on each node:

ufm_ha_cluster config -p <password>

Info

Configure the standby nodes first. The master node must be the last node to be configured.
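
The ordering rule above can be sketched as a small filter that takes the nodes of a cluster and prints them in the order they should be configured. This is an illustration only: the helper name config_order and its "<primary_ip> <role>" input format are assumptions made for this example, and the addresses are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of UFM HA): reads "<primary_ip> <role>" lines on
# stdin and prints the primary IPs in configuration order, that is, all standby
# nodes first and the master node last.
config_order() {
    awk '
        $2 == "standby" { print $1 }
        $2 == "master"  { master = $1 }
        END             { if (master != "") print master }
    '
}

printf '%s\n' "10.10.10.1 master" "10.10.10.2 standby" "10.10.10.3 standby" | config_order
```

On each node in the printed order, run the ufm_ha_cluster config command shown above.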


Configuration File

The sample configuration file includes up to three sections for nodes, but users can add additional sections as needed.

[General]
# Connection mode
# in case dual_link is true, each node must have primary and secondary IPs
dual_link = true

[Node.1]
# valid role options: master/standby
role = master
# Mandatory
primary_ip =
# Mandatory if dual_link = true
secondary_ip =

[Node.2]
role = standby
primary_ip =
secondary_ip =

[Node.3]
role = standby
primary_ip =
secondary_ip =

# Add other Node.x sections if needed.

[Virtual]
# If virtual IP should not be added, set `no_vip = true`
no_vip =

virtual_ip =

virtual_ip6 =

ignore_mgmt_failure = false
# when using BGP virtual IP, you must use the loopback interface, set `interface = lo`
# in other cases we let the pcs to decide on the relevant network interface.
interface =

[FileSync]
# valid options are: drbd/nfs
mode = nfs

[NFS]
# fill in case the FileSync.mode is nfs
nfs_server =
shared_folder =
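
As an illustration, the sample file might be filled in as follows for a three-node cluster that uses NFS synchronization. This is a sketch under stated assumptions: every IP address, the NFS server address, and the shared folder path are hypothetical placeholders, the helper name write_example_ha_nodes_cfg is invented for this example, and the expected format of shared_folder should be confirmed for your environment. The helper writes to a path of your choice so the result can be reviewed before it is copied to /etc/ufm_ha/ha_nodes.cfg on every node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes an example ha_nodes.cfg to the path given as $1.
# All addresses and the NFS export path below are placeholders.
write_example_ha_nodes_cfg() {
    cat > "$1" <<'EOF'
[General]
dual_link = true

[Node.1]
role = master
primary_ip = 10.10.10.1
secondary_ip = 192.168.10.1

[Node.2]
role = standby
primary_ip = 10.10.10.2
secondary_ip = 192.168.10.2

[Node.3]
role = standby
primary_ip = 10.10.10.3
secondary_ip = 192.168.10.3

[Virtual]
no_vip =
virtual_ip = 10.10.10.5
virtual_ip6 =
ignore_mgmt_failure = false
interface =

[FileSync]
mode = nfs

[NFS]
nfs_server = 10.10.10.100
shared_folder = /exports/ufm_ha
EOF
}

# Example: write to a scratch file for review.
#   write_example_ha_nodes_cfg /tmp/ha_nodes.cfg
```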


Show UFM HA version

Run the following command to show the UFM HA version:

ufm_ha_cluster version


Starting UFM HA Cluster

Note

Before starting the UFM cluster, ensure that the DRBD sync is completed.

To start the UFM HA cluster:

ufm_ha_cluster start
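
One way to confirm that the DRBD sync has completed before starting the cluster is to inspect the DRBD disk states. The sketch below assumes a DRBD 8.x kernel module, where /proc/drbd lists each resource with a ds: field and reports ds:UpToDate/UpToDate once both sides are in sync. On other DRBD versions the status is reported differently, so treat this as an illustration. The helper name drbd_sync_complete is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of UFM HA). Reads DRBD 8.x style /proc/drbd text
# on stdin and succeeds only if at least one resource line is present and every
# resource line reports ds:UpToDate/UpToDate.
drbd_sync_complete() {
    awk '
        /ds:/ { seen = 1; if ($0 !~ /ds:UpToDate\/UpToDate/) bad = 1 }
        END   { exit (seen && !bad) ? 0 : 1 }
    '
}

# Example usage on the master node (commented out so the sketch has no side effects):
#   drbd_sync_complete < /proc/drbd && ufm_ha_cluster start
```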


Checking UFM Cluster Status

To check the UFM HA cluster status:

ufm_ha_cluster status 


Stopping UFM HA Cluster

To stop the UFM HA cluster:

ufm_ha_cluster stop 


Takeover Services

Run the takeover command on the standby machine to make it the master.

ufm_ha_cluster takeover


Master Failover

Run the failover command on the master machine to make it the standby.

ufm_ha_cluster failover


Replacing the Standby Node

  • Install the HA package on the new standby node.

  • Disconnect the old standby node and run the following command on the master node:

    ufm_ha_cluster detach

  • Configure the new standby node. Refer to Configuration.

  • Connect the new standby node to the cluster by running the following command on the master node:

    ufm_ha_cluster attach -l <local primary ip address> -e <peer primary ip address> -E <peer secondary ip address> -p <cluster_password>

Uninstalling UFM HA

To uninstall UFM HA, first stop the cluster and then run the uninstallation command as follows:

/opt/ufm/ufm_ha/uninstall_ha.sh


© Copyright 2024, NVIDIA. Last updated on Aug 15, 2024.