NVIDIA UFM Enterprise User Manual v6.21.0

Changes and New Features

This section lists the new and changed features in this software version.

Note
  • For an archive of changes and features from previous releases, please refer to Changes and New Features History.

  • For bare metal installation of UFM, it is required to install MLNX_OFED 5.X (or newer) before the UFM installation. Please make sure to use the UFM installation package that is compatible with your setup, as detailed in Bare Metal Deployment Requirements.

  • The items listed in the table below apply to all UFM license types.

Feature

Description

DDoS Protection

Added the UFM MAD Limiter security tool which mitigates Distributed Denial-of-Service (DDoS) attacks targeting the management node. For more information, refer to Appendix - MAD Limiter.

Fabric Validation Tests

Added two new fabric validation tests (Validate Nodes Firmware Version, Validate Switches CPLD Version) to check the CPLD and firmware versions for HCAs and switches. For more information, refer to Fabric Validation Tab.

OpenSM Traps

Added support for three new OpenSM traps (VPort P_key Violation, VPort Q_Key Violation, IB Mad Keys Periodic Update). For more information, refer to Threshold-Crossing Events Reference.

Pkey APIs

Enhanced the Pkey APIs to allow changing the partition name by accepting a pkey_name_prefix parameter. This prefix is written to the partitions.conf file. For more information, refer to PKey GUIDs Rest API -> Set/Update PKey GUIDs and Add GUID to PKey.

Bare Metal Cloud Mode

Defined and implemented the BareMetal Cloud mode in UFM for Cloud Service Providers (CSPs). For more information, refer to Default Bare-Metal Cloud Mode.

Improved Detection of Module Changes for Unmanaged Switches

Introduced an improved detection mechanism for PSU/fan removal events on unmanaged switches by generating a new fabric analysis report that runs every 2 minutes by default. The feature is disabled by default. For more information, refer to Faster Detection of Fan/PSU Removal on Unmanaged Switches.

Cable Temperature

Added support for displaying the cable temperature per cable transceiver of the selected port. For more information, refer to Devices Window.

Periodic Key Updates

Added support for periodic key updates for M_Keys, CC, VS, and N2N keys. For more information, refer to Appendix – Security Features.

Cable Info for XDR Fabrics

Added support in UFM to display information for all four plane ports on XDR when 'Cable Info' is selected from the menu of an aggregated port. For more information, refer to Cables Window.

UFM License

Extended the UFM license to support 3 MAC addresses (for UFM HA deployments). For more information, refer to Activating Software License.

UFM Rollback

Added the ability to revert to, or reinstall, a specific backed-up historical version of UFM, including its configuration. For more information, refer to UFM Rollback.

LDAP Authentication

Added support for UFM LDAP integration (allowing users to set LDAP for authenticating UFM users). For more information, refer to LDAP Authentication.

UFM/SM Dynamic Configurations

Added the ability to update UFM, OpenSM, and SHARP configurations without needing to restart UFM, as well as the option to modify parameters while UFM is running. This feature includes a list of OpenSM and gv.cfg configuration parameters that can be updated dynamically without requiring a UFM restart. For more information, refer to Setting UFM Configurations Without Requiring UFM Restart.

Proxy Elimination

As of UFM Enterprise v6.21.0, proxy elimination is enabled by default, resulting in improved performance when accessing UFM plugin APIs compared to previous versions.

UFM Infra

Added support for the UFM Infra feature, which introduces a structured architecture where services are divided into two categories: UFM Infra and UFM Enterprise, each deployed differently based on functionality.

The feature allows fast availability (up to 3 seconds) of the UFM REST API after failover.

For more information, refer to UFM Infra. For installation instructions, refer to Installing UFM Infra Using Rootless with Podman.

Deploying UFM Podman as Non-Root User: Added an automated script to seamlessly install/uninstall UFM as a non-privileged user.

Change

Description

Security

Added support for setting m_key_protect_bits to 1 when the M_Key per port feature is enabled.

Added support for periodic management key updates.

Added the ability to generate and configure new random management keys for subnet devices.

The supported management classes are: SM, VS, CC, and N2N.

Feature control parameters:

  • periodic_key_update - Key update interval in minutes.

    • 0 - Feature is disabled.

    • 10-71582 - Key update interval. The feature applies only to management classes on which the key protection feature is enabled.
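As a rough illustration of the ranges listed above, the following Python sketch classifies a candidate periodic_key_update value before it is written to a configuration file. The function name and the returned strings are hypothetical and are not part of UFM or OpenSM; only the value 0 and the 10-71582 range come from this section.

```python
# Hypothetical helper: checks a periodic_key_update value against the
# documented ranges (0 = disabled, 10-71582 = interval in minutes).
DISABLED = 0
MIN_INTERVAL_MINUTES = 10
MAX_INTERVAL_MINUTES = 71582


def check_periodic_key_update(minutes: int) -> str:
    """Return a short description of what the given value would mean."""
    if isinstance(minutes, bool) or not isinstance(minutes, int):
        raise TypeError("periodic_key_update must be an integer number of minutes")
    if minutes == DISABLED:
        return "disabled"
    if MIN_INTERVAL_MINUTES <= minutes <= MAX_INTERVAL_MINUTES:
        return f"enabled: keys updated every {minutes} minutes"
    raise ValueError(
        f"periodic_key_update must be 0 or between {MIN_INTERVAL_MINUTES} "
        f"and {MAX_INTERVAL_MINUTES}, got {minutes}"
    )


if __name__ == "__main__":
    for value in (0, 10, 1440):
        print(value, "->", check_periodic_key_update(value))
```

Values from 1 to 9, and values above 71582, fall outside both documented ranges, so the sketch rejects them instead of guessing how the Subnet Manager would treat them.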

Routing

Added support for specifying ranks for switches in the root GUIDs file.

When a rank is specified next to a switch GUID or switch port group, the Subnet Manager assigns that rank to the switch. If no rank is specified, the Subnet Manager marks the specified switches as roots (rank 0).

Added support for 4-level trees and unbalanced trees in the enhanced asymmetric routing algorithm.
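The rank defaulting rule for the root GUIDs file can be sketched in Python as follows. The file layout used here is an assumption made for illustration only (one switch GUID per line in hexadecimal, optionally followed by whitespace and an integer rank, with '#' starting a comment); the actual syntax, including switch port groups, should be taken from the OpenSM documentation. The function name is hypothetical.

```python
from typing import Dict


def parse_root_guids(text: str) -> Dict[int, int]:
    """Map each switch GUID to its rank, defaulting to rank 0 (root).

    Assumed line format (not confirmed by this document):
        <hex GUID> [rank]
    """
    ranks: Dict[int, int] = {}
    for raw_line in text.splitlines():
        # Drop comments and surrounding whitespace (assumed comment syntax).
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        guid = int(fields[0], 16)
        # No rank next to the GUID: treat the switch as a root (rank 0).
        rank = int(fields[1]) if len(fields) > 1 else 0
        ranks[guid] = rank
    return ranks


if __name__ == "__main__":
    sample = "0x0002c90000000001 1\n0x0002c90000000002\n"
    for guid, rank in parse_root_guids(sample).items():
        print(f"{guid:#018x} rank {rank}")
```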

Congestion Control/XDR

Added support for enabling Congestion Control per plane. Configurable via congestion control policy file.

The feature is enabled by default for supporting devices.

Logging

Added support for logging trap 1257 and trap 1258 details.

Added destination port GUID when logging trap repress for GMPs.

General

Added support for specifying GUID ranges in the partitions file.

Added the option to disable SA.

Bug Fixes

  • Fixed a crash issue when multiple switches fail to reply to MADs and ucast_cache is enabled

  • Fixed an issue with re-discovering device after timeouts when topology file is specified

  • Fixed a crash issue when enabling congestion control keys during runtime

  • Fixed clearing FLID of removed spines from router LID tables

  • Fixed an issue with configuring SL to VL mapping when using specific settings for switch-to-switch links

  • Fixed logging error when enhanced QoS file is empty

  • Fixed an issue with configuring trap P_Key for VS, CC, and N2N management classes

Change

Description

Q3200-RA NVIDIA Quantum-3 Switch Systems

SHARP XDR now supports an extended range of switch platforms, adding compatibility with Q3200-RA NVIDIA Quantum-3 Switch Systems alongside existing support for Q3400-RA NVIDIA Quantum-3 Switch Systems.

Bug Fix

[Ref #:4259313]

Description: Fixed an issue where the SHARP component was not upgraded correctly during a DOCA-Host upgrade to version 3.0.0 from earlier versions on RedHat and SLES systems.

Note: This fix applies to upgrades from version 3.0.0 onward. Upgrades to 3.0.0 from prior versions will still experience the issue.

Workaround for affected upgrades:

To ensure SHARP is updated when upgrading to version 3.0.0:

  • On SLES: run zypper up doca-ofed sharp

  • On RedHat: run dnf update doca-ofed sharp
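For automation scripts, the workaround above can be expressed as a small lookup. This is a minimal sketch: the function name and the distribution keys ("sles", "redhat") are hypothetical, and only the two command lines are taken from this section. The sketch builds the argument list and does not run anything.

```python
from typing import Dict, List

# Commands copied from the workaround above, split into argument lists.
SHARP_UPGRADE_COMMANDS: Dict[str, List[str]] = {
    "sles": ["zypper", "up", "doca-ofed", "sharp"],
    "redhat": ["dnf", "update", "doca-ofed", "sharp"],
}


def sharp_upgrade_command(distro: str) -> List[str]:
    """Return the package-manager command for the given distribution key."""
    key = distro.strip().lower()
    if key not in SHARP_UPGRADE_COMMANDS:
        raise ValueError(f"no documented workaround for distribution: {distro!r}")
    # Return a copy so callers cannot modify the table by accident.
    return list(SHARP_UPGRADE_COMMANDS[key])


if __name__ == "__main__":
    print(" ".join(sharp_upgrade_command("SLES")))
    print(" ".join(sharp_upgrade_command("RedHat")))
```

The returned list can be passed to subprocess.run by a caller that has the required privileges; that step is left out here on purpose.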

Change

Description

General

Added support for CPLD validation.

Added support for End Port Plane Filter Validation.

Added support for CC per plane.

Added support for new IBNL files CX-6/CX-7/CX-7 for HCA.

Added support for new IBNL files Q3400/Q3200 for Switches.

Removed duplicated warnings in CC validations.

Added support for skipping topo match validation when the topo file does not contain nodes.

Improved routing validation for XDR.

Improved SM HCA configuration validation.

PHY Plugin

Removed SLREG register.

Updated PPCNT register.

Updated PEMI register.

Updated PDDR register.

Added support for "Gen6" in the PCIe speed representation.

Bug Fixes

  • Fixed wrong port speed in case of timeouts in fabric.

  • Fixed cable RX/TX Power uW to mW conversions.
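For reference, the unit relationship behind the RX/TX power fix is 1 mW = 1000 uW. The sketch below shows that arithmetic only; it is not the tool's code, and the function names are hypothetical.

```python
MICROWATTS_PER_MILLIWATT = 1000.0


def microwatts_to_milliwatts(power_uw: float) -> float:
    """Convert optical power from microwatts (uW) to milliwatts (mW)."""
    return power_uw / MICROWATTS_PER_MILLIWATT


def milliwatts_to_microwatts(power_mw: float) -> float:
    """Convert optical power from milliwatts (mW) to microwatts (uW)."""
    return power_mw * MICROWATTS_PER_MILLIWATT


if __name__ == "__main__":
    print(microwatts_to_milliwatts(1500.0))
    print(milliwatts_to_microwatts(0.5))
```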

Plugin

Version

Changes and New Features

REST-RDMA Plugin

1.0.0-39

Added support for running the REST-RDMA client in an Arm environment.

NDT Plugin

1.1.1-21

Added the ability to run in SELinux and rootless environments.

UFM Telemetry Fluentd Streaming (TFS) Plugin

1.1.1-1

Added the ability to run in SELinux and rootless environments.

UFM Events Fluent Streaming (EFS) Plugin

1.0.0-6

N/A

UFM Bright Cluster Integration Plugin

1.0.0-3

N/A

IB Link Resiliency Plugin

1.1.0-2

New Feature:

Added the ability to suppress duplicated events in shadow mode.

Added the ability to run in SELinux and rootless environments.

Bug Fix:

Resolved third-party dependency vulnerabilities in the plugin – addressed CVE-2024-6345 (BDSA severity 7.9) (ref #4402643).

ClusterMinder Plugin

1.1.10

Added the ability to run in SELinux and rootless environments.

Sysinfo Plugin 

1.1.1

N/A

SNMP Plugin

1.0.0-3

N/A

Packet Level Monitoring Collector (PMC) Plugin

1.19.33

New Features:

  • Added support for m_key_per_port.

  • Added the ability to run in SELinux and rootless environments.

Bug Fixes:

  • Fixed incorrect threshold value for Fast Recovery events.

Limitation:

  • The PMC plugin does not support environments where the SM is configured with vs_key_enable, cc_key_enable, and/or n2n_key_enable settings.

GNMI-Telemetry Plugin

1.3.5-1

New Features:

  • Added the ability to fetch telemetry serially instead of in parallel.

  • Aligned the certificate management directory with the UFM container flag local-cert-dir.

  • Added the ability to subscribe to UFM health KPIs notifications.

  • Added the ability to subscribe to telemetry collection completion notification.

  • Added the ability to run in SELinux and rootless environments.

Bug Fixes:

  • Resolved an issue where unexpected values were received for down ports during link-down testing (ref #4320510).

  • Resolved the issue of session disconnection when UFM restarts (ref #4379789).

UFM Telemetry Manager (UTM) Plugin

1.21.3

Added the ability to run in SELinux and rootless environments.

UFM Consumer Plugin

1.0.0-16

N/A

Fast-API Plugin

1.0.4-2

  • Added support for UFM Infra.

  • Added the ability to run in SELinux and rootless environments.

UFM Light Plugin

1.1.0-2

N/A

Key Performance Indexes (KPI) Plugin

1.0.8-2

New Feature:

Added the ability to run in SELinux and rootless environments.

Bug Fix:

Resolved third-party dependency vulnerabilities in the plugin – addressed CVE-2024-6345 (BDSA severity 7.9) (ref #4402643).

UFM Events Grafana Dashboard Plugin

1.0.2-0

Limitations:

  • Not supported on UFM Gen 2.0

  • FluentD fails with RHEL9 OS.

Log Streamer Plugin

1.0.1-2

Introduced the Log Streamer plugin which enables users to stream external logs (such as OpenSM, SHARP, etc.) to a remote syslog server.

GNMI NVOS Events Plugin

1.0.1-1

Introduced the GNMI NVOS Events plugin, a standalone Docker container managed by UFM. Its main role is to collect GNMI events from NVOS switches and relay them to UFM as external events.

Unmanaged Switch Dump (USD) Plugin

1.0.1-0

Introduced the USD plugin which retrieves system collections from unmanaged NVIDIA switches, compiles the data into files, and transfers these files to the same remote destination used for data collection from managed switches.

The following distributions are no longer supported in UFM:

  • RH7.0-RH7.7 / CentOS7.0-CentOS7.7

  • SLES12 / SLES 15

  • EulerOS2.2 / EulerOS2.3

  • Ubuntu18.04

Deprecated Features:

  • Mellanox Care (MCare) Integration

  • UFM on VM (UFM with remote fabric collector)

  • Logical server auditing

  • The UFM high availability script - /etc/init.d/ufmha - is no longer supported

  • The UFM Multi-site portal feature is no longer supported. The Multi-Subnet feature can be used instead

  • As of UFM Enterprise v6.19.0, the Autonomous Link Maintenance (ALM) and PDR Deterministic plugins are no longer supported.

  • The GRPC-Streamer plugin is deprecated.

  • As of UFM Enterprise v6.18.0, UFM Agent discovery will be disabled by default, and managed switches will be discovered in-band

  • As of UFM Enterprise v6.18.0, the ibdiagpath diagnostic utility is deprecated

  • As of UFM Enterprise version 6.14.0, UFM Monitoring Mode is deprecated and is no longer supported

  • As of UFM Enterprise v6.12.0, the Logical Elements tab is removed

  • Removed the following fabric validation tests: CheckPortCounters & CheckEffectiveBER

Note

To continue using the options previously provided by /etc/init.d/ufmha, pass the same options to the /etc/init.d/ufmd script.

For example:

Instead of using /etc/init.d/ufmha model_restart, please use /etc/init.d/ufmd model_restart (on the primary UFM server)

Instead of using /etc/init.d/ufmha sharp_restart, please use /etc/init.d/ufmd sharp_restart (on the primary UFM server)

The same goes for any other option that was supported on the /etc/init.d/ufmha script
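For sites that still have the old script path in their tooling, the substitution described in this note can be sketched as a simple rewrite. The function name is hypothetical; the two script paths and the example options come from the note above. The sketch only rewrites the command string and does not execute it.

```python
OLD_SCRIPT = "/etc/init.d/ufmha"
NEW_SCRIPT = "/etc/init.d/ufmd"


def translate_ufmha_command(command: str) -> str:
    """Replace a leading /etc/init.d/ufmha with /etc/init.d/ufmd.

    Options such as model_restart or sharp_restart are passed through
    unchanged. Commands that do not start with the old script are
    returned as they are (apart from whitespace normalization).
    """
    parts = command.split()
    if parts and parts[0] == OLD_SCRIPT:
        parts[0] = NEW_SCRIPT
    return " ".join(parts)


if __name__ == "__main__":
    for old in ("/etc/init.d/ufmha model_restart",
                "/etc/init.d/ufmha sharp_restart"):
        print(old, "->", translate_ufmha_command(old))
```

As the note states, the resulting /etc/init.d/ufmd command is meant to be run on the primary UFM server.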

© Copyright 2025, NVIDIA. Last updated on Nov 20, 2025