NVIDIA UFM Enterprise User Manual v6.20.2

Changes and New Features

This section lists the new and changed features in this software version.

Note

For an archive of changes and features from previous releases, please refer to Changes and New Features History.

Feature

Description

Routing Engine

Asym3 routing engine fix.

Rev 6.20.1

Port Recovery Policy (OpenSM)

Updated the port recovery policy to be enabled by default.

Client Based Authentication

Added support for client-based authentication based on the client certificate common name (CN) associated with a UFM user. For more information, refer to Client Based Authentication.

Running SM and AM as Unprivileged User (For UFM Container Only)

Added the ability to run SM and AM with least privilege, without requiring root access, and added support for running the Docker container with a read-only root file system.

Low-Frequency (Secondary) Telemetry Enhancement

Added the host_name counter to the Low-Frequency (Secondary) Telemetry Fields.

mTLS Support in the gNMI Plugin

Added support for mTLS in the gNMI plugin. For more information, refer to Secure Server using mTLS and Certificate Subject Identifier.

mTLS Support for Authentication Proxy

Added support for mTLS authentication between UFM and the authentication proxy. For more information, refer to Proxy Authentication.

DOCA-Host Driver Installation

Installation of UFM is now supported on top of the DOCA-Host driver. For more information, refer to Installing UFM Server on Bare Metal Server.

Setting Node Description (Unmanaged Switches) Enhancement

Added the ability to set the node description of unmanaged switches when MKey per port is enabled. For more information, refer to Devices Action.

Exposing SM Link State Transition in UFM Events

Added the ability to trigger a new event when a link changes from the Active state to the INIT state (flapping link). For more information, refer to Appendix - Supported Port Counters and Events (added event #1321).

SQLite Health Test (UFM Health)

Added a new UFM health test to detect SQLite database corruption. For more information, refer to Supported Traps and Events (added event #1612) and UFM Server Health Monitoring (added test "CheckGvSqliteCorruptionTest").

UFM Events Streaming Enhancement

Added the ability to stream all UFM events to logs/syslog. For more information, refer to Configuring UFM Logging (added the stream_all_events field).

UFM Telemetry Secured Endpoint

Added the ability to change the telemetry endpoint protocol. For more information, refer to Changing Telemetry Endpoint Protocol.

Setting Static Topology for Large Scale Fabrics

Added the ability to implement and test TopoSpec on a large scale. For more information, refer to OpenSM Static Topology Configuration REST API.

Ports Cable Information (XDR support)

Added support for showing extended cable information for XDR aggregated ports. For more information, refer to Cable Information.

UFM HA Node Health Check

Added a standalone script that checks whether the master/standby nodes are properly configured. For more information, refer to Changing Maximum SSL Request Size when Using Client Certificate.

SSL Renegotiation Buffer Size (Apache setting enhancement)

Added the option to change the maximum request size, in bytes, when using a client certificate. The default value is 1572864 (1536 KB / 1.5 MB), and the parameter name is max_ssl_request_size.
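As a quick sanity check on the default value above, the byte count can be converted to larger units. This is only an arithmetic sketch: the constant and function names are illustrative and are not UFM identifiers, and it assumes binary units (1 KB = 1024 bytes), which is the interpretation under which the stated figures agree.

```python
from typing import Tuple

# Arithmetic check of the documented default for max_ssl_request_size.
# The names below are illustrative only; they are not UFM identifiers.
DEFAULT_MAX_SSL_REQUEST_SIZE = 1572864  # bytes, as stated in the release notes


def to_binary_units(size_bytes: int) -> Tuple[float, float]:
    """Return (kilobytes, megabytes), assuming 1 KB = 1024 bytes."""
    kb = size_bytes / 1024
    mb = kb / 1024
    return kb, mb


kb, mb = to_binary_units(DEFAULT_MAX_SSL_REQUEST_SIZE)
print(f"{DEFAULT_MAX_SSL_REQUEST_SIZE} bytes = {kb:g} KB = {mb:g} MB")
```

The same helper can be used to translate a different size into the byte value that the parameter expects, for example by multiplying a desired number of megabytes by 1024 twice.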

REST APIs

Links REST API and Ports REST API

Updated the HDR and NDR port name to display the port label (instead of the port number) in the UFM UI. For more information, refer to Links REST API and Ports REST API.

OpenSM Static Topology Configuration REST API

Enhanced the static topology REST API to show full topology difference in addition to showing difference for HCAs only. For more information, refer to OpenSM Static Topology Configuration REST API.

Plugin

Version

Changes and New Features

REST-RDMA Plugin

1.0.0-33

N/A

NDT Plugin

1.1.1-17

New Feature:

Added verification for duplicated node descriptions.

UFM Telemetry Fluentd Streaming (TFS) Plugin

1.1.0-0

N/A

UFM Events Fluent Streaming (EFS) Plugin

1.0.0-6

N/A

UFM Bright Cluster Integration Plugin

1.0.0-3

N/A

IB Link Resiliency Plugin

1.1.0-2

New Features:

  • Added a new alert module for detection of flapping links, with separate configurable thresholds for switch-switch and switch-host links.

  • Added the option to suppress duplicate UFM event messages for a configurable number of hours.

  • Added a new isolated-unrecoverable state, indicating that an isolated link could not be reinstated for a configurable number of hours as it continues to receive failure alerts.

  • Added new columns to the Port Level Status table in the Current State UI tab.

  • Updated the format of the UFM event messages reported by the plugin.

  • Removed the link down rule from the PDR alert module and moved the Symbol Errors rule to be enabled by default.

  • Changed the default value of the model_threshold configuration parameter.

ClusterMinder Plugin 

1.1.8

New Features:

  1. Added support for GB200 Compute Tray and Switch Tray

  2. Added support for XDR

  3. Added support for NVOS switches amBER data

  4. Added support for HOST DTS ad-hoc mode to collect data from NVIDIA-SMI

  5. Performance improvements and bug fixes

Sysinfo Plugin 

1.1.1

N/A

SNMP Plugin

1.0.0-3

N/A

Packet Level Monitoring Collector (PMC) Plugin

1.19.18

N/A

GNMI-Telemetry Plugin

1.3.1-0

New Features:

  • Added a new configuration flag to always return all the queried data in every heartbeat when subscribing in on-change mode. The default value is false.

  • Added support for reading the log file name and path from the configurations.

  • Sampling for events and inventory has been disabled. A new configuration flag, disable_events_inventory, has been added to the /opt/ufm/files/conf/gnmi_telemetry.ini configuration file.

  • Added the ability to configure telemetry sampling rates for individual endpoints via the /opt/ufm/files/conf/gnmi_telemetry.ini configuration file.
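For readers who want to check how the disable_events_inventory flag mentioned above is set, the following is a minimal sketch using Python's standard configparser. The release notes do not say which section of gnmi_telemetry.ini holds the flag, so the sample text, the section name "Hypothetical", and the helper find_flag are all assumptions made for illustration; the helper simply searches every section for the key.

```python
import configparser
from typing import Optional

# Sample content for illustration only. The section name "Hypothetical" and the
# value shown are assumptions; consult the real gnmi_telemetry.ini for its layout.
SAMPLE_INI = """
[Hypothetical]
disable_events_inventory = true
"""


def find_flag(ini_text: str, key: str) -> Optional[bool]:
    """Return the boolean value of `key` from the first section that defines it."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    for section in parser.sections():
        if parser.has_option(section, key):
            return parser.getboolean(section, key)
    return None  # key not present in any section


print(find_flag(SAMPLE_INI, "disable_events_inventory"))
```

To inspect an actual deployment, the text of /opt/ufm/files/conf/gnmi_telemetry.ini could be read from disk and passed to the same helper.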

UFM Telemetry Manager (UTM) Plugin

1.19.10

N/A

UFM Consumer Plugin

1.0.0-16

N/A

Fast-API Plugin

1.0.0-2

N/A

UFM Light Plugin

1.1.0-2

N/A

Key Performance Indexes (KPI) Plugin

1.0.7-2

N/A

UFM Events Grafana Dashboard Plugin

1.0.2-0

Limitations:

  • Not supported on UFM Gen 2.0

  • FluentD fails with RHEL9 OS.

Note

The items listed in the table below apply to all UFM license types.

Note

For bare metal installation of UFM, MLNX_OFED 5.X (or newer) must be installed before installing UFM.

Please make sure to use the UFM installation package that is compatible with your setup, as detailed in Bare Metal Deployment Requirements.

The following distributions are no longer supported in UFM:

  • RH7.0-RH7.7 / CentOS7.0-CentOS7.7

  • SLES12 / SLES 15

  • EulerOS2.2 / EulerOS2.3

  • Ubuntu18.04

Deprecated Features:

  • Mellanox Care (MCare) Integration

  • UFM on VM (UFM with remote fabric collector)

  • Logical server auditing

  • The UFM high availability script - /etc/init.d/ufmha - is no longer supported

  • The UFM Multi-site portal feature is no longer supported. The Multi-Subnet feature can be used instead

  • As of UFM Enterprise v6.19.0, the Autonomous Link Maintenance (ALM) and PDR Deterministic plugins are no longer supported.

  • The GRPC-Streamer plugin is deprecated.

  • As of UFM Enterprise v6.18.0, UFM Agent discovery will be disabled by default, and managed switches will be discovered in-band

  • As of UFM Enterprise v6.18.0, the ibdiagpath diagnostic utility is deprecated

  • As of UFM Enterprise version 6.14.0, UFM Monitoring Mode is deprecated and is no longer supported

  • As of UFM Enterprise v6.12.0, the Logical Elements tab is removed

  • Removed the following fabric validation tests: CheckPortCounters & CheckEffectiveBER

Note

To continue using the options previously provided by /etc/init.d/ufmha, pass the same options to the /etc/init.d/ufmd script.

For example:

Instead of using /etc/init.d/ufmha model_restart, use /etc/init.d/ufmd model_restart (on the primary UFM server).

Instead of using /etc/init.d/ufmha sharp_restart, use /etc/init.d/ufmd sharp_restart (on the primary UFM server).

The same applies to any other option that was supported by the /etc/init.d/ufmha script.
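For sites with existing scripts or runbooks that still call /etc/init.d/ufmha, the substitution described above can be sketched as a small string rewrite. The helper name migrate_command is hypothetical and not part of UFM; the sketch only rewrites the command text and does not execute anything.

```python
import shlex

OLD_SCRIPT = "/etc/init.d/ufmha"  # no longer supported
NEW_SCRIPT = "/etc/init.d/ufmd"   # accepts the same options


def migrate_command(command: str) -> str:
    """Rewrite a ufmha invocation to the equivalent ufmd invocation.

    Commands that do not start with the old script path are returned unchanged.
    """
    parts = shlex.split(command)
    if parts and parts[0] == OLD_SCRIPT:
        parts[0] = NEW_SCRIPT
    return shlex.join(parts)


print(migrate_command("/etc/init.d/ufmha model_restart"))
print(migrate_command("/etc/init.d/ufmha sharp_restart"))
```

As noted above, the rewritten commands are meant to be run on the primary UFM server.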

© Copyright 2025, NVIDIA. Last updated on Apr 2, 2025.