NVIDIA UFM Enterprise User Manual v6.11.2

Running UFM Server Software

Before running the UFM server software:

  • Perform initial configuration.

  • Ensure that all ports used by the UFM server for internal and external communication are open and available. For the list of ports, see Used Ports.

You can run the UFM server software in the following modes:

  • Management

  • Monitoring

  • High Availability

  • High Availability with failover to an external SM

Note

In Management or High Availability mode, ensure that all Subnet Managers in the fabric are disabled before running UFM. Any remaining active Subnet Managers will prevent UFM from running.

After installing, run the UFM Server by invoking:

systemctl start ufm-enterprise.service

Note

The legacy /etc/init.d/ufmd script is still available for backward compatibility.

Log files are located under /opt/ufm/files/log (the links to log files are in /opt/ufm/log).
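The start-up step above can be wrapped in a small shell helper. This is a minimal sketch, not something shipped with UFM: the function name start_ufm_standalone is hypothetical, and only the service name and the log directory are taken from this section.

```shell
# Hypothetical helper (not shipped with UFM): start the service, confirm that
# systemd reports it as active, then print the log directory named above.
start_ufm_standalone() {
    local service="ufm-enterprise.service"
    local log_dir="/opt/ufm/files/log"

    systemctl start "$service" || return 1

    # "systemctl is-active" prints the unit state and exits non-zero
    # when the unit is not active.
    systemctl is-active "$service" || return 1

    echo "Log files: $log_dir"
}
```

Calling start_ufm_standalone as root should leave the service running or return a non-zero status, at which point the files under the log directory are the first place to look.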

In High Availability mode, run the UFM Server on the Master server by invoking:

ufm_ha_cluster start

You can specify additional command options for the ufm_ha_cluster command.

ufm_ha_cluster Command Options

Command     Description
--------    ---------------------------------------------------------------------------
start       Starts UFM HA cluster.
stop        Stops UFM HA cluster.
failover    Initiates failover (change mastership from local server to remote server).
takeover    Initiates takeover (change mastership from remote server to local server).
status      Shows current HA cluster status.
cleanup     Cleans the HA configurations on this node.
help        Displays help text.
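As an illustration of the options listed above, the following sketch wraps ufm_ha_cluster in a shell function that rejects anything other than the documented subcommands. The wrapper name ufm_ha is hypothetical and is not part of UFM; the subcommand names are the ones given in this section.

```shell
# Hypothetical wrapper (not shipped with UFM): accept only the subcommands
# documented above, then hand off to ufm_ha_cluster.
ufm_ha() {
    case "${1:-}" in
        start|stop|failover|takeover|status|cleanup|help)
            ufm_ha_cluster "$1"
            ;;
        *)
            echo "ufm_ha: unsupported subcommand '${1:-}'" >&2
            return 2
            ;;
    esac
}
```

For example, ufm_ha status shows the current HA cluster status, and ufm_ha failover, run on the master, moves mastership from the local server to the remote server.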

Run UFM in Monitoring mode while running concurrent instances of Subnet Manager on NVIDIA switches. Monitoring and event management capabilities are enabled in this mode. UFM non-monitoring features such as provisioning and performance optimization are disabled in this mode.

The following table describes whether features are enabled or disabled in Monitoring mode.

Features Enabled/Disabled in Monitoring Mode

Feature                                    Enabled/Disabled in Monitoring Mode
---------------------------------------    -----------------------------------
Fabric Discovery                           Enabled
Topology Map                               Enabled
Fabric Dashboard                           Enabled
Fabric Monitoring                          Enabled
Alerts and Thresholds (inc. SNMP traps)    Enabled
Fabric Logical Model                       Enabled
Subnet Manager and plugins                 Disabled
Subnet Manager Configuration               Disabled
Automatic Fabric Partitioning              Disabled
Central Device Management                  Disabled
Quality of Service                         Disabled
Failover (High Availability mode)          Disabled
Traffic Aware Routing Algorithm            Disabled
Device Management                          Disabled
Integration with Schedulers                Disabled
Unhealthy Ports                            Disabled

In Monitoring mode, UFM periodically discovers the fabric and updates the topology maps and database.

For Monitoring mode, connect UFM to the fabric using port ib0 only. The fabric must have a subnet manager (SM) running on it (on another UFM, HBSM, or switch SM).

Note

When UFM is running in Monitoring mode, the internal OpenSM is not sensitive to changes in OpenSM configuration (opensm.conf).

Note

When running in Monitoring mode, the following parameters are automatically overwritten in the /opt/ufm/files/conf/opensm/opensm_mon.conf file on startup:

  • event_plugin_name osmufmpi

  • event_plugin_options --vendinfo -m 0

Any other configuration is not valid for Monitoring mode.

To run UFM in Monitoring mode:

  1. In the /opt/ufm/conf/gv.cfg configuration file:

     • Set monitoring_mode to yes

     • If required, change mon_mode_discovery_period (the default is 60 seconds)

     • Set reset_mode to no_reset

       We recommend this setting when running multiple instances of UFM so that port counters are not reset by different UFM instances. For more information, see Resetting Physical Port Counters.

  2. Restart the UFM Server.

The Running mode is set to Monitoring, and the frequency of fabric discovery is updated according to the setting of mon_mode_discovery_period.
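The configuration edits described above can be scripted. The sketch below is one possible approach, not a UFM tool: set_cfg_key and enable_monitoring_mode are hypothetical names, and the helper deliberately rewrites an existing key = value line wherever it appears, so it makes no assumption about which section of gv.cfg holds each key.

```shell
# Hypothetical helper (not shipped with UFM): rewrite an existing
# "key = value" line in an INI-style file, whichever section it lives in.
set_cfg_key() {
    local file="$1" key="$2" value="$3"

    # Refuse to guess: the key must already be present and uncommented.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        echo "set_cfg_key: '$key' not found in $file" >&2
        return 1
    fi

    # Edit into a temporary file, then copy the contents back so the
    # original file keeps its ownership and permissions.
    sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file" > "${file}.tmp" &&
        cat "${file}.tmp" > "$file" &&
        rm -f "${file}.tmp"
}

# Apply the two mandatory Monitoring mode settings described above.
enable_monitoring_mode() {
    local cfg="${1:-/opt/ufm/conf/gv.cfg}"
    set_cfg_key "$cfg" monitoring_mode yes &&
        set_cfg_key "$cfg" reset_mode no_reset
}
```

Running enable_monitoring_mode with no argument targets /opt/ufm/conf/gv.cfg; keeping a backup copy of the file first is a sensible precaution. The discovery period can be changed the same way, for example set_cfg_key /opt/ufm/conf/gv.cfg mon_mode_discovery_period 120, where 120 is only an illustrative value. The UFM Server still has to be restarted afterwards.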

A monitor icon appears at the top of the navigation bar to indicate that Monitoring mode is enabled.

After installation, you can configure the web server to communicate over the secure HTTPS protocol. For further information, refer to the Launching a UFM Web UI Session section.

Port 8088 is an internal port used by the UFM server; the Apache web server does not expose it to the user. Apache listens on port 80 and forwards incoming traffic to local port 8088. Port 8088 is configurable; port 80 is not.

To use HTTPS instead of the default HTTP, add the following to the configuration file at /opt/ufm/conf/gv.cfg:

# WebServices Protocol (http/https) and Port
ws_port = 8088
ws_protocol = https
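After editing the file, it can be useful to confirm which values are actually in effect. The following sketch reads the two settings shown above back from the file. The function names get_cfg_value and show_ws_settings are hypothetical and not part of UFM, and the parsing assumes plain key = value lines.

```shell
# Hypothetical helper (not shipped with UFM): print the value of the first
# uncommented "key = value" line for the given key.
get_cfg_value() {
    awk -F= -v key="$2" '
        {
            name = $1
            gsub(/[ \t]/, "", name)               # strip whitespace around the key
            if (name == key) {
                value = $2
                gsub(/^[ \t]+|[ \t]+$/, "", value) # trim the value
                print value
                exit
            }
        }' "$1"
}

# Print the web service protocol and port on one line.
show_ws_settings() {
    local cfg="${1:-/opt/ufm/conf/gv.cfg}"
    echo "$(get_cfg_value "$cfg" ws_protocol) $(get_cfg_value "$cfg" ws_port)"
}
```

With the settings shown above in place, show_ws_settings would print the protocol followed by the internal port.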

UFM installation configures HTTPS protocol in the webserver as follows:

  • Configures listening on port 443

  • Configures default virtual host

  • Creates/uses local certificates

For instructions, please refer to the UFM Quick Start Guide.

UFM User Authentication is based on standard Apache User Authentication. Each Web Service client application must authenticate against the UFM server to gain access to the system.

The UFM software comes with one predefined user:

  • Username: admin

  • Password: 123456

You can add, delete, or update users via the User Management tab.

The UFM license is subscription-based, with the following subscription options:

  • 1-year subscription

  • 3-year subscription

  • 5-year subscription

  • Evaluation 30-day trial license

Note

UFM continues to support old license types, but they can no longer be obtained.

Two months before your subscription license expires, UFM warns you that the license will expire soon. After the subscription expires, UFM continues to work with the expired license for a further two months.

During this extra two-month period, UFM generates a critical alarm indicating that the UFM license has expired and that you need to renew your subscription. If you do not renew within that two-month period, UFM switches to Limited Mode, which blocks all REST APIs and access to the UFM web UI.

UFM enables functionality based on the license that was purchased and installed. This license determines the functionality and the maximum allowed number of nodes in the fabric.

To renew your UFM subscription, purchase a new license, download the license file to a temporary directory on the UFM master server, and then copy it to the /opt/ufm/files/licenses/ directory.

Note

UFM may not detect new license files if downloaded directly to /opt/ufm/files/licenses. If UFM does not detect the new license file, a UFM restart may be required.
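The two-step installation (download to a temporary directory, then copy) could be scripted along the following lines. This is a sketch under stated assumptions: install_ufm_license is a hypothetical function name, and the optional second argument exists only so the destination can be overridden for a trial run.

```shell
# Hypothetical helper (not shipped with UFM): copy a license file that was
# first downloaded to a temporary directory into the UFM licenses directory.
install_ufm_license() {
    local src="$1"
    local dest_dir="${2:-/opt/ufm/files/licenses}"

    if [ ! -f "$src" ]; then
        echo "install_ufm_license: no such file: $src" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "install_ufm_license: no such directory: $dest_dir" >&2
        return 1
    fi

    # Copy rather than download in place, as recommended above.
    cp "$src" "$dest_dir/" && ls -l "$dest_dir"
}
```

A call might look like install_ufm_license /tmp/ufm-license/example.lic, where both the temporary directory and the file name are placeholders for wherever the license was downloaded.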

If several licenses are installed on the server (more than one license file exists under /opt/ufm/files/licenses/), UFM uses only the strongest license and applies its expiration date and managed device limits, regardless of any other licenses that may exist on the server.

For instructions on how to view your license, please refer to the UFM Quick Start Guide.

A script named show_ufm_status.sh, located under /opt/ufm/scripts, allows the user to view the current status of UFM's main processes.

Running the script with the -e (extended_processes) option shows both the main processes and the sub-processes handled by UFM.
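As a convenience, the script can be wrapped in a shell function so that it can be called from any directory. The wrapper below is a sketch: the name ufm_status and the UFM_SCRIPTS_DIR override are hypothetical, and only the script path and the -e option come from this section.

```shell
# Hypothetical wrapper (not shipped with UFM). UFM_SCRIPTS_DIR is an
# illustrative override; the default is the directory named above.
ufm_status() {
    local scripts_dir="${UFM_SCRIPTS_DIR:-/opt/ufm/scripts}"
    "$scripts_dir/show_ufm_status.sh" "$@"
}
```

With this in place, ufm_status reports the main processes and ufm_status -e adds the sub-processes.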



© Copyright 2024, NVIDIA. Last updated on Jul 4, 2024.