NVIDIA UFM High-Availability User Guide v5.4.0
Date: 2024-04-03
Installation and Configuration

Installation

The UFM HA package can be downloaded by running the following command:

wget https://www.mellanox.com/downloads/UFM/ufm_ha_5.4.0-9.tgz

The UFM HA package should be installed on both machines (Master and Standby), in addition to the required UFM products. The installation order does not matter. To install the UFM-HA package:

  • Untar the ufm_ha package:

    tar xvzf ufm_ha_<version>.tgz

  • Go to the directory you extracted and run the installation script. For example:

    ./install.sh -l /opt/ufm/files/ -d /dev/sda5 -p enterprise

    For NFS support, run the installation script without the -d option. For example:

    ./install.sh -l /opt/ufm/files/ -p enterprise

    Option   Description
    -l       Sync files location. Must always be /opt/ufm/files/.
    -d       Disk name for DRBD, for example /dev/sda5. Not needed when using NFS.
    -p       Product name. Must be "enterprise" for UFM Enterprise.
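
The difference between the DRBD and NFS invocations above can be sketched as a small shell helper. This is only an illustration: build_install_cmd is a hypothetical function, not part of the UFM HA package, and it prints the command line instead of running it.

```shell
# Sketch: print the install.sh command line for DRBD or NFS file sync.
# build_install_cmd is a hypothetical helper, not shipped with UFM HA.
build_install_cmd() {
    mode="${1:-}"   # "drbd" or "nfs"
    disk="${2:-}"   # DRBD disk, e.g. /dev/sda5 (unused for NFS)
    case "$mode" in
        drbd)
            if [ -z "$disk" ]; then
                echo "error: DRBD mode requires a disk name (-d)" >&2
                return 1
            fi
            echo "./install.sh -l /opt/ufm/files/ -d $disk -p enterprise"
            ;;
        nfs)
            # The -d option is omitted when NFS is used.
            echo "./install.sh -l /opt/ufm/files/ -p enterprise"
            ;;
        *)
            echo "error: unknown mode '$mode' (expected drbd or nfs)" >&2
            return 1
            ;;
    esac
}
```

For example, build_install_cmd drbd /dev/sda5 prints the first command shown above, and build_install_cmd nfs prints the second.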

Note

If you have a previous installation of ufm_ha and want to upgrade to the newer version, run the following command:

./install.sh -u

Note

UFM HA scripts are installed under /usr/bin.

Configuration

There are two methods to configure the HA cluster:

Configure HA with SSH Trust

  1. On the master server only, configure the HA nodes. To do so, run the configure_ha_nodes.sh command from /tmp, as shown in the example below:

    configure_ha_nodes.sh --cluster-password 12345678 \
    --master-primary-ip 10.10.10.1 \
    --standby-primary-ip 10.10.10.2 \
    --master-secondary-ip 192.168.10.1 \
    --standby-secondary-ip 192.168.10.2 \
    --virtual-ip 10.10.10.5

    Warning

    The configure_ha_nodes.sh script is located under /usr/bin/. Therefore, by default, you do not need to use the full path to run it.

    Warning

    The --cluster-password must be at least 8 characters long.

    Warning

    For the HA sync interface to work with PCS version 0.9.X when using back-to-back ports with local IP addresses, add the relevant IP addresses and hostnames to the /etc/hosts file. This allows the HA configuration to resolve hostnames to the specific IP addresses in use.

    Warning

    configure_ha_nodes.sh requires an SSH connection to the standby server. If SSH trust is not configured, you are prompted to enter the SSH password of the standby server while the configuration runs.

    Option                     Description
    --cluster-password         UFM HA cluster password for authentication by Pacemaker.
    --master-ip                Master (main) server IP address.
    --standby-ip               Standby server IP address.
    --virtual-ip OR --no-vip   UFM HA cluster virtual IP, or configure HA without a virtual IP.

  2. Wait for the configuration process to complete and for the DRBD sync to finish. The time required depends on the size of your partition.
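
Two of the warnings above (the minimum password length and the /etc/hosts entries for back-to-back links) can be checked before running configure_ha_nodes.sh. The following is a minimal sketch; check_cluster_password and hosts_has_entry are hypothetical helpers, and any hostname passed to them is a placeholder for the names used in your setup.

```shell
# Sketch: preflight checks for configure_ha_nodes.sh (hypothetical helpers).

# check_cluster_password PASSWORD
# Succeeds only if the password is at least 8 characters long.
check_cluster_password() {
    [ "${#1}" -ge 8 ]
}

# hosts_has_entry FILE IP HOSTNAME
# Succeeds if FILE (in /etc/hosts format) maps IP to HOSTNAME.
hosts_has_entry() {
    awk -v ip="$2" -v host="$3" '
        { sub(/#.*/, "") }          # drop comments before matching
        $1 == ip {
            for (i = 2; i <= NF; i++)
                if ($i == host) found = 1
        }
        END { exit !found }         # exit 0 only when a match was seen
    ' "$1"
}
```

A call such as hosts_has_entry /etc/hosts 192.168.10.2 <standby hostname> returns success only when the secondary IP address of the standby resolves to the expected name.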

Configure HA without SSH Trust

If you cannot establish an SSH trust between your HA servers, you can use ufm_ha_cluster directly to configure HA. You can see all the options for configuring HA in the Help menu:

ufm_ha_cluster config -h

Usage:

ufm_ha_cluster config [<options>] 

Option                                   Description
-r, --role <node role>                   Node role (master or standby).
-e, --peer-primary-ip <ip address>       Peer node primary IP address (mandatory).
-l, --local-primary-ip <ip address>      Local node primary IP address (mandatory).
-E, --peer-secondary-ip <ip address>     Peer node secondary IP address (mandatory).
-L, --local-secondary-ip <ip address>    Local node secondary IP address (mandatory).
-i, --virtual-ip <virtual-ip>            Cluster virtual IP (should be used for master only).
-p, --hacluster-pwd <pwd>                HA cluster user password.
-h, --help                               Show this message.
-N, --no-vip                             Configure HA without virtual IP.

To configure HA, follow the below instructions:

Warning

Please change the variables in the commands below based on your setup.

  1. [On Standby Server] Run the following command to configure Standby Server:

    ufm_ha_cluster config -r standby -e <peer primary ip address> -l <local primary ip address> -E <peer secondary ip address> -L <local secondary ip address> -p <cluster_password>

  2. [On Master Server] Run the following command to configure Master Server:

    ufm_ha_cluster config -r master -e <peer primary ip address> -l <local primary ip address> -E <peer secondary ip address> -L <local secondary ip address> -p <cluster_password> -i <virtual ip address>
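
Because "peer" and "local" swap meaning between the two servers, it is easy to mix up the addresses. The sketch below prints both commands from a single set of addresses. print_ha_config_cmds is a hypothetical helper that only prints text; the password is left as a placeholder on purpose.

```shell
# Sketch: print the per-node config commands from one set of addresses.
# print_ha_config_cmds is a hypothetical helper; it runs nothing.
# Arguments: master primary IP, master secondary IP,
#            standby primary IP, standby secondary IP, virtual IP.
print_ha_config_cmds() {
    m_pri="$1"; m_sec="$2"; s_pri="$3"; s_sec="$4"; vip="$5"
    # Step 1, on the standby: the peer is the master.
    echo "[standby] ufm_ha_cluster config -r standby -e $m_pri -l $s_pri -E $m_sec -L $s_sec -p <cluster_password>"
    # Step 2, on the master: the peer is the standby; only the master gets -i.
    echo "[master] ufm_ha_cluster config -r master -e $s_pri -l $m_pri -E $s_sec -L $m_sec -p <cluster_password> -i $vip"
}
```

Called with the addresses from the SSH trust example (10.10.10.1, 192.168.10.1, 10.10.10.2, 192.168.10.2, 10.10.10.5), it prints one line per server, each prefixed with the server on which to run it.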

NFS File Sharing

The NFS synchronization mechanism can be used instead of DRBD. Multi-Nodes Support is available only with the NFS synchronization mechanism, as described in the following section. To activate NFS synchronization, users must define the following parameters:

  • Mode: NFS

  • NFS Server

  • Shared Folder

Ensure that the NFS server supports NFSv4. It is recommended that the NFS server not be one of the UFM-HA nodes. Refer to the Configuration File section below for details on setting these parameters.

Multi-Nodes Support

The UFM-HA cluster can comprise more than two nodes. One of these nodes serves as the master, while the others operate in standby mode.

To configure multiple nodes, users must populate the configuration file '/etc/ufm_ha/ha_nodes.cfg' on all nodes (ensuring that the file is identical across all nodes).

This file contains details about each participating node, including:

  • Role: Master/Standby

  • Primary IP address

  • Secondary IP address

Using File Configuration

The '/etc/ufm_ha/ha_nodes.cfg' file contains all the necessary information for HA configuration and can serve as a replacement for command-line configuration. For security reasons, the password is the only setting not saved in the file.

After filling in the configuration file, run the following command to apply the configuration:

ufm_ha_cluster config -p <password>

Note

The standby nodes must be configured first, and the master node must be configured last.
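
The ordering rule in the note above can be sketched as a small filter. config_order is a hypothetical helper: it reads "role address" pairs on standard input and prints the addresses in the order in which the config command should be run on them (all standby nodes first, then the master).

```shell
# Sketch: order nodes so that standby nodes come first and the master last.
# config_order is a hypothetical helper; input lines are "<role> <address>".
config_order() {
    awk '
        $1 == "standby" { print $2 }        # standby nodes, in input order
        $1 == "master"  { master = $2 }     # remember the master for the end
        END { if (master != "") print master }
    '
}
```

For example, feeding it "master 10.10.10.1" followed by "standby 10.10.10.2" prints 10.10.10.2 and then 10.10.10.1.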


Configuration File

The sample configuration file includes up to three sections for nodes, but users can add additional sections as needed.

[General]
# Connection mode
# in case dual_link is true, each node must have primary and secondary IPs
dual_link = true
 
[Node.1]
# valid role options: master/standby
role = master
# Mandatory
primary_ip =
# Mandatory if dual_link = true
secondary_ip =
 
[Node.2]
role = standby
primary_ip =
secondary_ip =
 
[Node.3]
role = standby
primary_ip =
secondary_ip =
 
# Add other Node.x sections if needed.
 
[Virtual]
# If virtual IP should not be added, set `virtual_ip = no-vip`
virtual_ip =
# when using BGP virtual IP, you must use the loopback interface, set `interface = lo`
# in other cases we let the pcs to decide on the relevant network interface.
interface =
 
[FileSync]
# valid options are: drbd/nfs
mode = nfs
 
[NFS]
# fill in case the FileSync.mode is nfs
nfs_server =
shared_folder =
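
Before applying a file like the one above, it can be useful to check it against the rules stated in this guide: exactly one master, a primary_ip for every node, and a secondary_ip for every node when dual_link is true. The following is a rough sketch, not an official tool; validate_ha_nodes_cfg is a hypothetical helper and it assumes simple "key = value" lines without inline comments.

```shell
# Sketch: check ha_nodes.cfg against the rules described in this guide.
# validate_ha_nodes_cfg is a hypothetical helper, not shipped with UFM HA.
validate_ha_nodes_cfg() {
    awk '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        /^[ \t]*#/ { next }                      # skip comment lines
        /^[ \t]*\[/ {                            # section header, e.g. [Node.1]
            s = trim($0)
            section = substr(s, 2, length(s) - 2)
            if (section ~ /^Node\./) nodes[section] = 1
            next
        }
        index($0, "=") > 0 {                     # key = value inside a section
            key = trim(substr($0, 1, index($0, "=") - 1))
            val = trim(substr($0, index($0, "=") + 1))
            cfg[section "." key] = val
        }
        END {
            dual = (cfg["General.dual_link"] == "true")
            for (n in nodes) {
                role = cfg[n ".role"]
                if (role == "master") masters++
                else if (role != "standby") { print n ": invalid role \"" role "\""; bad = 1 }
                if (cfg[n ".primary_ip"] == "") { print n ": primary_ip is mandatory"; bad = 1 }
                if (dual && cfg[n ".secondary_ip"] == "") { print n ": secondary_ip is mandatory when dual_link = true"; bad = 1 }
            }
            if (masters != 1) { print "expected exactly one master, found " (masters + 0); bad = 1 }
            exit bad + 0
        }
    ' "$1"
}
```

Running validate_ha_nodes_cfg /etc/ufm_ha/ha_nodes.cfg prints one line per problem and returns a non-zero status if any rule is violated. Since the file must be identical on all nodes, comparing the output of sha256sum /etc/ufm_ha/ha_nodes.cfg across the nodes is a simple additional check.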


UFM HA Cluster Operations

Show UFM HA version

Run the following command to show UFM HA version:

ufm_ha_cluster version


Starting UFM HA Cluster

Warning

Before starting the UFM cluster, ensure that the DRBD sync is completed.

To start UFM HA cluster:

 ufm_ha_cluster start
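
One possible way to confirm that the DRBD sync has completed before starting the cluster is to inspect the DRBD status text. The sketch below assumes the DRBD 8.x /proc/drbd format, in which each resource line carries a ds:<local>/<peer> disk-state field; this format is an assumption about your DRBD version, and drbd_sync_done is a hypothetical helper that reads the status text from standard input.

```shell
# Sketch: decide from DRBD 8.x style status text whether the sync is done.
# drbd_sync_done is a hypothetical helper; the /proc/drbd layout is assumed.
drbd_sync_done() {
    awk '
        /ds:/ {
            seen = 1                                  # found a resource line
            if ($0 !~ /ds:UpToDate\/UpToDate/) pending = 1
        }
        # Succeed only if a resource was seen and none is still syncing.
        END { exit !(seen && !pending) }
    '
}
```

Under that assumption, drbd_sync_done < /proc/drbd && ufm_ha_cluster start starts the cluster only when both sides report UpToDate.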


Checking UFM Cluster Status

To check UFM HA cluster status:

ufm_ha_cluster status 


Stopping UFM HA Cluster

To stop UFM HA cluster:

ufm_ha_cluster stop 


Takeover Services

Run the takeover command on the standby machine to make it the master:

ufm_ha_cluster takeover


Master Failover

Run the failover command on the master machine to make it the standby:

ufm_ha_cluster failover
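
Takeover and failover both switch roles; which one applies depends on the machine where the command is run. A minimal sketch of that choice follows. role_switch_cmd is a hypothetical helper that takes the role of the local machine and prints the matching command without running it.

```shell
# Sketch: pick the role-switch command for the machine you are logged in to.
# role_switch_cmd is a hypothetical helper; it only prints the command.
role_switch_cmd() {
    case "${1:-}" in
        standby) echo "ufm_ha_cluster takeover" ;;   # standby becomes master
        master)  echo "ufm_ha_cluster failover" ;;   # master becomes standby
        *)
            echo "error: role must be master or standby" >&2
            return 1
            ;;
    esac
}
```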


Replacing the Standby Node

  • Install the HA package on the new standby node.

  • Disconnect the old standby node and run the following command on the master node:

    ufm_ha_cluster detach

  • Configure the new standby node; refer to Configuration.

  • Connect the new standby to the cluster by running the following command on the master node:

    ufm_ha_cluster attach -l <local primary ip address> -e <peer primary ip address> -E <peer secondary ip address> -p <cluster_password>

Uninstalling UFM HA

To uninstall UFM HA, first stop the cluster and then run the uninstallation command as follows:

/opt/ufm/ufm_ha/uninstall_ha.sh


