All nodes in an SM HA subnet must be of the same CPU type (e.g. x86), and must run the same MLNX-OS version.

High availability (HA) refers to a system or component that is continuously operational for a desirably extended period of time.

Mellanox Subnet Manager (SM) HA reduces subnet downtime and disruption by keeping subnet management continuously operational, so that work continues even when one of the SMs fails. The SM database is synchronized across all the nodes participating in the InfiniBand subnet whenever a configuration change is made. The synchronization is done out-of-band using an Ethernet management network.

Mellanox SM HA allows the system administrator to enter and modify the InfiniBand SM configuration of all subnet managers from a single location. It creates an InfiniBand subnet ID and associates with it all the Mellanox management appliances that are attached to the same InfiniBand subnet. All subnet managers can be controlled, started, or stopped from this single location.

All the nodes that participate in Mellanox SM HA are joined to the InfiniBand subnet ID and, once joined, the synchronized SMs are launched. One of the nodes is elected as master and the others act as standby nodes (or are down). Mellanox SM HA uses a virtual IP address (VIP) that is always directed to the SM HA master to monitor the SM state and to verify that all configurations are executed.

Joining, Creating or Leaving an InfiniBand Subnet ID

When transitioning from standalone into a group or vice versa, a few seconds are required for the node state to stabilize. During that time, group feature commands (e.g. SM HA commands) should not be executed. To run group features, wait for the CLI prompt to turn into [standalone:master], [<group>:master] or [<group>:standby] instead of [standalone:*unknown*] or [<group>:*unknown*].
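The readiness rule above can be checked mechanically when driving the CLI from a script. The following is a minimal sketch, assuming the prompt text has already been captured as a string; the helper names `parse_prompt` and `is_group_ready` are hypothetical and not part of MLNX-OS.

```python
import re

# Matches the "[<subnet ID>:<mode>]" part of a prompt, with or without
# a space after the colon (both spellings appear in this document).
PROMPT_RE = re.compile(r"\[\s*(?P<subnet>[^:\]]+?)\s*:\s*(?P<mode>[^\]]+?)\s*\]")


def parse_prompt(prompt):
    """Return (subnet_id, mode) from a CLI prompt, or None if absent."""
    match = PROMPT_RE.search(prompt)
    if match is None:
        return None
    return match.group("subnet"), match.group("mode")


def is_group_ready(prompt):
    """True once the node state has stabilized to master or standby.

    A mode of "*unknown*" means the node is still transitioning, so
    group feature commands should not be sent yet.
    """
    parsed = parse_prompt(prompt)
    return parsed is not None and parsed[1] in {"master", "standby"}
```

A script would typically poll the prompt and only send SM HA commands once `is_group_ready` returns true.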

An InfiniBand subnet is formed by a network of InfiniBand nodes interconnected via InfiniBand switches. It includes all systems that can run an SM and are part of the SM HA domain. A switch that can potentially run an SM must be a member of an InfiniBand subnet ID to be associated with the Mellanox SM HA domain. An IB subnet is recognized by its ID, which the system uses to either join or leave the subnet.

Every system that is not associated with an existing IB subnet (it has never been part of an IB subnet or has left an existing one) or does not have an MLNX-OS license installed is by default associated with a subnet called “Standalone”.

In order to create, join or leave an InfiniBand subnet, one may use the following commands:

  • Create – “ib ha <IB_subnet_ID> ip <ip_addr> <netmask>”
  • Join – “ib ha <IB_subnet_ID>”
  • Leave – “no ib ha”

When leaving an SM HA cluster, SM configuration is not saved on the node leaving the cluster. After leaving, the configuration is reset to its default values.

For further information see section “Creating and Adding Systems to an InfiniBand Subnet ID”.

MLNX-OS Management Centralized Location

The MLNX-OS centralized management infrastructure enables the user to configure or modify an existing configuration and to monitor the subnet running status. The MLNX-OS centralized management IP (VIP) is defined when a new InfiniBand subnet ID is created by running the command “ib ha <IB_subnet_ID> ip <ip_addr> <netmask>”. The created VIP is used as an alias for the current subnet master and thus assumes the same role as the master.

The VIP always points to one of the systems that are part of the SM HA domain. It is always active even if one or more of the members are down. For example:

switch (config) # ib ha subnet2 ip 192.168.10.110 255.255.255.0

High Availability Node Roles

A node is an InfiniBand switch system. Every node member of an IB subnet ID has one of the following roles:

  • Master – the node that manages SM configurations and provides services to the Virtual IP (VIP) addresses
  • Standby – the node that replaces the Master node and takes over its responsibilities once the Master node is down
  • Offline – a node that has run an SM in the past and is currently offline, or that was created manually by the “ib smnode <node name> create” command. If the node has been removed from the environment, you can remove it from the list with the “no ib smnode <node name>” command.

To see the mode of the current node, look at the CLI prompt for the following format: 

<host name> [<subnet ID>:<mode>] (config) #

For example:

switch [ibstandalone: master] (config) # 

To see a list of the existing nodes and details about their running state, run the command “show ib smnodes [brief]”.

Configuring MLNX-OS SM HA Centralized Location

The VIP is used to configure or modify the existing configuration and to monitor the subnet running status. To configure the VIP, run the command “ib ha <IB_subnet_ID> ip <ip_addr> <netmask>”:

switch [standalone: master] (config) # ib ha subnet2 ip 192.168.10.110 255.255.255.0
switch [subnet2: master] (config) #

Creating and Adding Systems to an InfiniBand Subnet ID

To create and add systems to a subnet:

  1. Log into the system from which you intend to create the subnet.
  2. Enter config mode. Run:

    switch [standalone: master] >
    switch [standalone: master] > enable
    switch [standalone: master] # configure terminal
  3. Create a new subnet using the command “ib ha <IB_subnet_ID> ip <ip_addr> <netmask>”. Run: 

    switch [standalone: master] (config) # ib ha subnet2 ip 192.168.10.110 255.255.255.0
    switch [subnet2: master] (config) # 

    You must run the “ib ha <IB_subnet_ID> ip <ip_addr> <netmask>” command only once per subnet ID.

  4. Log into the system that you are going to join to the newly created subnet.
  5. Join the system to the subnet, using the “ib ha <IB_subnet_ID>” command. Run: 

    switch [standalone: master] (config) # ib ha subnet2
    switch [subnet2: standby] (config) # 
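When the procedure above is scripted, the command sequence for the creating node and for each joining node differs only in the “ib ha” line. The sketch below builds those sequences as plain strings from the commands shown in this section; the function name `subnet_setup_commands` is hypothetical, and how the strings are delivered to the switch (SSH, console, and so on) is left out on purpose.

```python
def subnet_setup_commands(subnet_id, vip=None, netmask=None):
    """Return the CLI commands to create or join an InfiniBand subnet ID.

    With a VIP and netmask the node creates the subnet (run this only
    once per subnet ID); without them the node joins an existing subnet.
    """
    if (vip is None) != (netmask is None):
        raise ValueError("vip and netmask must be given together")

    commands = ["enable", "configure terminal"]
    if vip is not None:
        commands.append(f"ib ha {subnet_id} ip {vip} {netmask}")
    else:
        commands.append(f"ib ha {subnet_id}")
    return commands


# First node creates the subnet, every other node joins it.
create_cmds = subnet_setup_commands("subnet2", "192.168.10.110", "255.255.255.0")
join_cmds = subnet_setup_commands("subnet2")
```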

Restoring Subnet Manager Configuration

If the SM configuration becomes corrupted or the subnet manager cannot raise any logical links, it is suggested that you restore the default SM configuration.

To restore subnet manager configuration:

  1. Enter config mode. Run: 

    *switch [subnet2: master] > enable
    *switch [subnet2: master] # configure terminal
    *switch [subnet2: master] (config) #
  2. Run the command “ib sm reset-config”. Run: 

    *switch [subnet2: master] (config) # ib sm reset-config

    The asterisk in the example above (*switch) indicates the local system from where the command is running.

To display the running state of a specific node, run one of the following commands with the required parameter:

  • show ib smnode <name> sm-running
  • show ib smnode <name> sm-state
  • show ib smnode <name> sm-priority
  • show ib smnode <name> active
  • show ib smnode <name> ha-state
  • show ib smnode <name> ha-role

Subnet Manager Configuration

To configure the subnet manager, log into the centralized management IP (VIP). Once the SM configuration is created, the SM database is duplicated to the other nodes.

The SM must be configured from the MLNX-OS centralized management IP (VIP). Any configuration that is not created or modified on the master node (using the VIP) is overridden by the master configuration.

The user can configure different SM parameters such as where to run the SM(s) or the SM priority by running the commands according to the desired action.

Mellanox High Availability and OpenSM Handover/Failover 

Mellanox Technologies products are fully compliant and interoperable with OpenSM.

Once an SM fails, the SM which takes over the subnet needs to reproduce the internal state of the failed master. Most of the information required is obtained by scanning the subnet and extracting the information from the devices. However, some information which is not stored directly in the network devices cannot be reproduced this way. InfiniBand management architecture limits such information to data exchanged between clients (either user-level programs or kernel modules) and the Subnet Administration (SA) service (attached to the SM). The SA keeps this set of client registrations in an internal data structure called SA-DB. The SA-DB information includes the multicast groups, the multicast group members, subscriptions for event forwarding and service records.

The new SM may retrieve the SA-DB by requesting the clients to re-register with the SA or by obtaining a copy of the previous master SM’s internal SA-DB via an SA-DB dump file. Client re-registration offers database correctness, while SA-DB dump file replication provides a lower setup time. Client re-registration is still required since the SA-DB dump may not be up to date with the registrations held by the master SM.

Furthermore, since the SM does not maintain SA-DB information for unknown nodes, it is possible that some of the SA-DB information relating to nodes momentarily disconnected from the master SM is purged. Therefore, these nodes must re-register with the new SM when they are reconnected (they receive a client re-register request from the SM). Relying only on client re-registration is also not optimal, as it takes some time to recreate the entire SA-DB and the network state.

Mellanox SM HA replicates the SA-DB dump file from the current master SM to all the standby SMs running on Mellanox switches. The SA-DB dump file replication provides further optimization to the standby SM that becomes master.

The standby SM loads the existing SA-DB file that the old master used. Using the existing SA-DB reduces the amount of processing needed on client re-registration, which shortens the time needed to complete setting up the network.

SM HA does not replace the InfiniBand specification requirement for client re-registration.

When running an SM HA cluster with more than 2 active OpenSM instances, IB multicast applications need to support client re-register or they may not work correctly after OpenSM failover.
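The trade-off between the replicated dump file and client re-registration can be illustrated with a toy model. This is only a sketch of the reasoning above, not how OpenSM or MLNX-OS implement failover: the SA-DB is modeled as a dictionary from a client name to its set of registrations, and the function `recover_sa_db` and all record strings are hypothetical.

```python
def recover_sa_db(dump, reregistrations):
    """Rebuild a toy SA-DB on the new master.

    dump:            client -> registrations loaded from the replicated
                     dump file (may be stale or missing clients)
    reregistrations: client -> registrations reported by the clients
                     after failover (treated as authoritative)

    Returns the recovered SA-DB and the number of clients whose entry
    had to be created or corrected during re-registration.
    """
    sa_db = {client: set(records) for client, records in dump.items()}
    updated = 0
    for client, records in reregistrations.items():
        records = set(records)
        if sa_db.get(client) != records:
            sa_db[client] = records
            updated += 1
    return sa_db, updated


# node-c registered after the dump was replicated, so the dump is stale.
dump = {"node-a": {"mcast-group-1"}, "node-b": {"service-x"}}
clients = {
    "node-a": {"mcast-group-1"},
    "node-b": {"service-x"},
    "node-c": {"mcast-group-1"},
}

with_dump, work_with_dump = recover_sa_db(dump, clients)
without_dump, work_without_dump = recover_sa_db({}, clients)
```

Both runs end with the same database, which reflects why re-registration is still needed for correctness, while the run that starts from the dump has fewer entries to create or fix.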

SM HA Commands

ib ha


ib ha <IB_subnet_ID> [ip <IP address> <subnet mask> [force]]
no ib ha 

Creates the InfiniBand subnet ID <IB_subnet_ID> with the specified IP, or joins the existing subnet <IB_subnet_ID> when no IP is given.
The no form of the command removes this node from an InfiniBand subnet ID.

Syntax Description

IB subnet ID

Simple group name for shared IB config

ip <IP address>

Assigns management IP address

netmask

Netmask (e.g. 255.255.255.0 or /24)

force

Joins if exists or creates if not

Default

N/A

Configuration Mode

config

History

3.1.0000

Example

switch (config) # ib ha my-subnet

Related Commands

show ib ha

Notes

A new subnet may be joined only after leaving the current one
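The Syntax Description above lists two netmask notations, dotted (255.255.255.0) and prefix length (/24). As a quick way to confirm that two spellings describe the same VIP before running “ib ha”, the following sketch uses Python's standard `ipaddress` module; the helper name `vip_interface` is hypothetical.

```python
import ipaddress


def vip_interface(ip_addr, netmask):
    """Combine an IP and a netmask in either notation into one object.

    Accepts "255.255.255.0", "/24" or "24" for the netmask.
    """
    mask = netmask.lstrip("/")
    return ipaddress.ip_interface(f"{ip_addr}/{mask}")


dotted = vip_interface("192.168.10.110", "255.255.255.0")
prefix = vip_interface("192.168.10.110", "/24")
```

Here `dotted` and `prefix` compare equal, and `dotted.network` gives the management subnet that contains the VIP.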



ib smnode


ib smnode <hostname> [create | disable | enable | sm-priority <priority>]
no ib smnode <hostname> [create | disable | enable | sm-priority] 

Manages HA SM.
The no form of the command removes HA SM node configuration.

Syntax Description

hostname

Specifies <hostname> SM configuration to modify.

create

Creates SM configuration for selected node.

disable

Makes SM inactive on selected node.

enable

Makes SM active on selected node.

sm-priority <priority>

Sets SM selected node priority (0=low, 15=high).

Default

N/A

Configuration Mode

config

History

3.1.0000

Example

switch (config) # ib smnode switch-1133ce create

Related Commands

show ib smnode
show ib smnodes

Notes




show ib smnode


show ib smnode <hostname> {active | ha-role | ha-state | ip | sm-priority | sm-running | sm-state}

Displays SM High availability information.

Syntax Description

hostname

Specifies <hostname> SM configuration to display

active

Displays whether <hostname> is currently active

ha-role

Displays the High Availability role of <hostname>. Possible return values are: offline, unknown, master, standby, or disabled

ha-state

Possible return values are: offline, init, searching, joining, online, creating, waiting, leaving, join-sync, failed, removed, or regroup

ip

Displays the local management IP address associated with the active node, <hostname>. If <hostname> is not active, the command displays “offline”

sm-priority

Displays the SM priority for SM running on <hostname>

sm-running

Displays if <hostname> has an SM running. The command will display “active” (that is, SM is running) only if <hostname> is currently active, has a license, is enabled as a potential SM, is active as SM, and if there is a maximum of 2 SMs in the fabric.

sm-state

Displays if SM is enabled to run on <hostname>

Default

N/A

Configuration Mode

config

History

3.1.0000

3.8.1000: Updated Syntax Description

Example

switch (config) # show ib smnode my-hostname sm-state
enabled

Related Commands

show ib smnodes

Notes
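The conditions listed for sm-running combine with a logical AND. The sketch below restates them as a predicate; the `SmNodeStatus` fields are hypothetical names for the five conditions in the description, not values exposed by MLNX-OS under these names.

```python
from dataclasses import dataclass


@dataclass
class SmNodeStatus:
    node_active: bool    # <hostname> is currently active
    has_license: bool    # a license is installed
    sm_enabled: bool     # enabled as a potential SM
    sm_active: bool      # active as SM
    sms_in_fabric: int   # number of SMs in the fabric


def would_report_sm_active(status):
    """True only when every condition for "active" holds at once."""
    return (
        status.node_active
        and status.has_license
        and status.sm_enabled
        and status.sm_active
        and status.sms_in_fabric <= 2
    )
```

If any single condition fails, for example a third SM appears in the fabric, the predicate is false.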



show ib smnodes


show ib smnodes [brief]

Displays SM High availability information.

Syntax Description

brief

Displays information on all HA nodes

Default

N/A

Configuration Mode

config

History

3.1.0000

3.8.1000: Updated example

Example
switch (config) # show ib smnodes

HA state of switch infiniband-default:
  IB Subnet HA name: Mantaray142
  HA IP address    : 10.7.145.141/24
  Active HA nodes  : 2

  HA node local information:
    Name       : Mantaray142 (active)  <--- (local node)
    SM-HA state: standby
    SM Running : stopped
    SM Enabled : enabled - master
    SM Priority: 0
    IP         : 10.7.144.142

  HA node local information:
    Name       : Mantaray141 (active) 
    SM-HA state: master
    SM Running : stopped
    SM Enabled : disabled
    SM Priority: 0
    IP         : 10.7.144.141
Related Commands
Notes
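For monitoring scripts, output in the layout of the example above can be split into a subnet summary and one record per node. This is a minimal sketch that assumes the “Key : Value” layout shown in the example stays stable across releases, which is not guaranteed; `parse_smnodes` is a hypothetical helper, and `SAMPLE` is a shortened copy of the example output.

```python
SAMPLE = """\
HA state of switch infiniband-default:
  IB Subnet HA name: Mantaray142
  HA IP address    : 10.7.145.141/24
  Active HA nodes  : 2

  HA node local information:
    Name       : Mantaray142 (active)  <--- (local node)
    SM-HA state: standby
    IP         : 10.7.144.142

  HA node local information:
    Name       : Mantaray141 (active)
    SM-HA state: master
    IP         : 10.7.144.141
"""


def parse_smnodes(text):
    """Return (summary, nodes) parsed from "show ib smnodes" style text."""
    summary, nodes, current = {}, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("HA node"):
            # Each "HA node ..." heading starts a new node record.
            current = {}
            nodes.append(current)
            continue
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not value:
            continue  # heading line without a value
        if current is None:
            summary[key] = value
        else:
            if key == "Name":
                # Keep the bare hostname and remember the local marker.
                current["local"] = "(local node)" in value
                value = value.split()[0]
            current[key] = value
    return summary, nodes


summary, nodes = parse_smnodes(SAMPLE)
```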



show ib ha


show ib ha [brief]

Displays information about all the systems that are active or might be able to run SM.

Syntax Description

brief

Displays brief HA information

Default

N/A

Configuration Mode

config

History

3.1.0000

Example

switch (config) # show ib ha
Global HA state
==================
IB Subnet HA name:subnet4
HA IP address: 192.168.10.43/24
Active HA nodes: 2

ID State Role IP SM Priority
--------------------------------------------------------------------
switch standalone 192.168.10.42 disabled
switch master 192.168.10.18 disabled

Related Commands

Notes