NVIDIA UFM Enterprise User Manual v6.17.2

UFM Communication Requirements

This chapter describes how the UFM server communicates with InfiniBand fabric components.

The UFM Server communicates with clients over IP. The UFM Server can belong to a separate IP network, which can also be behind a firewall.

UFM Server Communication with Clients

[Figure: UFM Server communication with clients]

UFM Server Communication with UFM Web UI Client

Communication between the UFM Server and the UFM web UI client is based on HTTP or HTTPS. The only requirement is that TCP port 80 (HTTP) or 443 (HTTPS) is not blocked.
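Whether these ports are reachable from a client machine can be checked with a short probe. The following is a minimal sketch, not a UFM tool: the function names are hypothetical, and a successful TCP connection only shows that the port is not blocked, not that the web UI is healthy.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_web_ui_ports(host: str, ports=(80, 443)) -> dict:
    """Probe each port and map it to its reachability result."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Calling `check_web_ui_ports` with the address of the UFM Server returns a dictionary such as `{80: True, 443: False}`, depending on which ports the firewall allows.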

UFM Server Communication with SNMP Trap Managers

The UFM Server can send SNMP traps to configured SNMP Trap Manager(s). By default, the traps are sent to the standard UDP port 162. However, the user can configure the destination port. If the specified port is blocked, UFM Server traps will not reach their destination.

Summary of UFM Server Communication with Clients

| Affected Service | Network | Address / Service / Port | Direction |
|---|---|---|---|
| Web UI Client | Out-of-band management* | HTTP / 80, HTTPS / 443 | Bi-directional |
| SNMP Trap Notification | Out-of-band management* | UDP / 162 (configurable) | UFM Server to SNMP Manager |

*If the client machine is connected to the IB fabric, IPoIB can also be used.

UFM Server Communication with InfiniBand Switches

[Figure: UFM Server communication with InfiniBand switches]

UFM Server InfiniBand Communication with Switch

The UFM Server must be connected directly to the InfiniBand fabric (via an InfiniBand switch). The UFM Server sends the standard InfiniBand Management Datagrams (MADs) to the switch and receives InfiniBand traps in response.

UFM Server Communication with Switch Management Software (Optional)

The UFM Server auto-negotiates with the switch management software on Mellanox Grid Director switches. The communication is bound to the switch Ethernet management port.

The UFM Server sends a multicast notification to multicast address 224.0.23.172, port 6306 (configurable). The switch management software replies to UFM (via port 6306) with a unicast message that contains the switch GUID and IP address. After auto-negotiation, the UFM Server uses the switch JSON API (HTTPS-based) to retrieve inventory data and to apply switch actions (software upgrade and reboot) on the managed switch.
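When switch IP discovery fails, it can help to confirm that the multicast notification actually reaches the management network. The following is a minimal diagnostic sketch, not part of UFM: it joins the multicast group from another host and waits for one datagram. It assumes the notification is carried over UDP, which the text does not state explicitly, and it does not interpret the payload. The function names are hypothetical.

```python
import socket
import struct

MCAST_GROUP = "224.0.23.172"  # multicast address named in the text
MCAST_PORT = 6306             # default port; configurable in UFM


def membership_request(group: str, local_ip: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure expected by IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(local_ip))


def wait_for_notification(timeout: float = 10.0):
    """Return (payload, sender) for the first datagram seen, or None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", MCAST_PORT))
        # Ask the kernel to deliver traffic addressed to the multicast group.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(MCAST_GROUP))
        sock.settimeout(timeout)
        try:
            return sock.recvfrom(4096)
        except socket.timeout:
            return None
    finally:
        sock.close()
```

If `wait_for_notification()` returns `None` on a host in the switch management network, the multicast traffic is probably being filtered or not routed to that segment.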

The following Device Management tasks are dependent on successful communication as described above:

  • Switch IP discovery

  • FRU Discovery (PSU, FAN, status, temperature)

  • Software and firmware upgrades

The UFM Server manages InfiniBand switch devices over HTTPS (default port 443, configurable) and/or SSH (default port 22, configurable).

UFM Server Communication with Externally Managed Switches (Optional)

The UFM Server uses the ibdiagnet tool to discover chassis information (PSU, fan, status, temperature) of externally managed switches.

By monitoring the chassis information, UFM can trigger selected events when a module failure occurs or a specific sensor value exceeds its threshold.
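The event logic described above can be sketched as follows. This is an illustration only: the `ChassisReading` record and `chassis_events` function are hypothetical and do not reflect the ibdiagnet output format or UFM's internal event model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChassisReading:
    switch: str                        # hypothetical switch label
    module: str                        # e.g. "PSU1", "FAN2", "temp"
    healthy: bool                      # False models a failed module
    value: Optional[float] = None      # sensor value, if the module reports one
    threshold: Optional[float] = None  # alert threshold for that value


def chassis_events(readings: List[ChassisReading]) -> List[str]:
    """Return one event string per failed module or over-threshold sensor."""
    events = []
    for r in readings:
        if not r.healthy:
            events.append(f"{r.switch}/{r.module}: module failure")
        elif r.value is not None and r.threshold is not None and r.value > r.threshold:
            events.append(
                f"{r.switch}/{r.module}: value {r.value} above threshold {r.threshold}"
            )
    return events
```

A reading is reported only when its value is strictly above the threshold, which matches the "above threshold" wording in the text.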

Summary of UFM Server Communication with InfiniBand Switches

| Affected Service | Network | Address / Service / Port | Direction |
|---|---|---|---|
| InfiniBand Management / Monitoring | InfiniBand | Management Datagrams | Bi-directional |
| Switch IP Address Discovery (auto-negotiation with switch management software) | Out-of-band management | Multicast 224.0.23.172, TCP / 6306 (configurable) | Multicast: UFM Server to switch; TCP: Bi-directional |
| Switch Chassis Management / Monitoring | Out-of-band management | TCP / UDP / 6306 (configurable); SNMP / 161 (configurable); SSH / 22 (configurable) | Bi-directional |


UFM Server Communication with InfiniBand Hosts

[Figure: UFM Server communication with InfiniBand hosts]

UFM Server InfiniBand Communication with HCAs

The UFM Server must be connected directly to the InfiniBand fabric. The UFM Server sends the standard InfiniBand Management Datagrams (MADs) to the Host Channel Adapters (HCAs) and receives InfiniBand traps.

Summary of UFM Server Communication with InfiniBand Hosts

| Affected Service | Network | Address / Service / Port | Direction |
|---|---|---|---|
| InfiniBand Management / Monitoring | InfiniBand | Management Datagrams | Bi-directional |


UFM Server HA Active-Standby Communication

[Figure: UFM Server HA active-standby communication]

UFM active-standby communication relies on two services: heartbeat and DRBD.

  • Heartbeat is used for auto-negotiation and keep-alive messaging between the active and standby servers. It uses UDP port 694.

  • DRBD is used for low-level data (disk) synchronization between the active and standby servers. It uses TCP port 8888.

| Affected Service | Network | Address / Service / Port | Direction |
|---|---|---|---|
| UFM HA heartbeat | Out-of-band management* | UDP / 694 | Bi-directional |
| UFM HA DRBD | Out-of-band management* | TCP / 8888 | Bi-directional |

*An IPoIB network can be used for HA, but this is not recommended, because any InfiniBand failure might cause a split-brain condition and a loss of synchronization between the active and standby servers.
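If a host firewall runs on the active and standby servers, the two HA ports must be allowed in both directions. The following is a minimal sketch that generates the corresponding commands, assuming the servers use firewalld. This manual does not prescribe a firewall tool, and the `HA_PORTS` list and `firewalld_commands` function are hypothetical helpers.

```python
# HA services and ports taken from the table above.
HA_PORTS = [
    ("heartbeat", 694, "udp"),
    ("DRBD", 8888, "tcp"),
]


def firewalld_commands(ports=HA_PORTS):
    """Build firewall-cmd invocations that open each port permanently."""
    commands = [
        f"firewall-cmd --permanent --add-port={port}/{proto}"
        for _name, port, proto in ports
    ]
    # Permanent rules take effect only after a reload.
    commands.append("firewall-cmd --reload")
    return commands
```

The function only returns the command strings, so they can be reviewed before being run on each server.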

© Copyright 2024, NVIDIA. Last updated on Aug 27, 2024.