NVIDIA UFM Enterprise User Manual v6.23.1

Changes and New Features

Notes:

  • For an archive of changes and features from previous releases, please refer to Changes and New Features History.

  • The items listed in the table below apply to all UFM license types.

  • For bare metal installation of UFM, DOCA_HOST must be installed before UFM. Make sure to use the UFM installation package that is compatible with your setup, as detailed in Bare Metal Deployment Requirements.

Feature

Description

UFM Versions Manager

Added the UFM Versions Manager command-line tool, which manages UFM system backups, configurations, and migrations. It provides comprehensive backup and restore capabilities for UFM systems running on root-less Podman containers, with full support for both Standalone (SA) and High Availability (HA) environments. For more information, refer to Appendix - UFM Versions Manager.

Upgraded Python Version

Upgraded the UFM runtime Python version from 3.9 to 3.12.

Fabric Health as Unified Tool

Added the ability to use the UFM fabric health report as a centralized, fabric-wide health monitoring tool. For more information, refer to Fabric Health Tab and the Reports REST API.

Supported Operating Systems

Added support for RedHat 10. For more information, refer to Installation Notes.

System Dump Script

Reduced the UFM system dump collected by default via the ufm_sysdump.sh script. For more information, refer to Collecting System Dump.

SHARP Reservations

Added support for automatic synchronization between the PKey database and SHARP reservations in multi-tenant clusters. Upon PKey modifications, the corresponding reservations are updated automatically. For more information, refer to UFM SHARP REST API.

SHARP

Enabled SHARP by default in UFM. For more information, refer to Enabling SHARP Aggregation Manager.

Cloud Profile Mode

Added support for Secured Cloud Profile mode for Cloud Service Providers (CSPs), which applies enhanced security settings when enabled. For more information, refer to Cloud Profile Mode.

Clustered Telemetry

Added the Clustered Telemetry feature which enables multiple telemetry data collection instances across multiple network adapters (HCAs) in the InfiniBand fabric. This feature provides improved performance and scalability for large-scale deployments through workload distribution. For more information, refer to Telemetry.

Telemetry Enhancements

Optimized the UFM telemetry restart mechanism so that telemetry is restarted only when new ports are added.

Visibility into UFM Services

Added the ability to monitor and track the initialization and progress of all UFM services in real-time during UFM startup. For more information, refer to UFM Startup Logging.

Syslog API

Updated the Syslog API validation to support hostnames in addition to IPv4 and IPv6 addresses for the destination parameter.
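The kind of check described above can be sketched as follows. This is only an illustration of accepting an IPv4 address, an IPv6 address, or a hostname; the function name classify_destination and the hostname rules (RFC 1123-style labels) are assumptions, not UFM's actual validation code.

```python
import ipaddress
import re

# Hypothetical helper, not part of UFM: one DNS label, 1-63 characters,
# letters/digits/hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def classify_destination(value: str) -> str:
    """Return 'ipv4', 'ipv6', or 'hostname'; raise ValueError otherwise."""
    try:
        addr = ipaddress.ip_address(value)
        return "ipv4" if addr.version == 4 else "ipv6"
    except ValueError:
        pass  # Not an IP literal; fall through to the hostname check.

    # Allow a single trailing dot (fully qualified form).
    name = value[:-1] if value.endswith(".") else value
    if not name or len(name) > 253:
        raise ValueError(f"invalid destination: {value!r}")

    labels = name.split(".")
    # Reject names whose last label is all digits, so malformed IPv4
    # strings such as "256.1.1.1" are not accepted as hostnames.
    if all(_LABEL.fullmatch(label) for label in labels) and not labels[-1].isdigit():
        return "hostname"
    raise ValueError(f"invalid destination: {value!r}")
```

For example, classify_destination("syslog.example.com") is classified as a hostname, while a string such as "bad_host!" raises ValueError.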

UFM Configuration Parameters

Aligned the UFM large-scale subnet configuration parameters. For more information, refer to Adjusting UFM Configuration Files Based on Fabric Size.

Azure Authentication

Added the ability to increase Azure AD username length. For more information, refer to Azure AD Authentication.

Tool

Version

Changes and New Features

SHARP

3.13.0

Automatic Synchronization of SHARP Reservations and PKeys

Added support for automatic synchronization between the PKey database and SHARP reservations (create, modify, and delete) in multi-tenant clusters. When this mode is enabled, there is no need to invoke the SHARP reservation REST API manually.
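As a rough illustration of what synchronizing create, modify, and delete operations involves, the sketch below computes the difference between two views of the fabric. The data model (a PKey mapped to a set of member GUIDs) and the names SyncPlan and plan_sync are hypothetical; the source does not describe how UFM or SHARP implement this internally.

```python
from typing import Dict, List, NamedTuple, Set


class SyncPlan(NamedTuple):
    """Hypothetical result type: reservation actions grouped by kind."""
    create: List[str]
    update: List[str]
    delete: List[str]


def plan_sync(pkeys: Dict[str, Set[str]],
              reservations: Dict[str, Set[str]]) -> SyncPlan:
    """Compare PKey membership with existing reservations.

    Both arguments map a PKey (as a string) to a set of member GUIDs.
    This mapping is an assumption made for illustration only.
    """
    # PKeys with no reservation yet need one created.
    create = sorted(k for k in pkeys if k not in reservations)
    # Reservations whose PKey no longer exists should be removed.
    delete = sorted(k for k in reservations if k not in pkeys)
    # Reservations whose membership differs from the PKey need an update.
    update = sorted(
        k for k in pkeys
        if k in reservations and pkeys[k] != reservations[k]
    )
    return SyncPlan(create, update, delete)
```

With automatic synchronization enabled, this kind of reconciliation is what no longer has to be driven manually through the SHARP reservation REST API.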

Stochastic Rounding Support (Beta)

Added beta-level support for Stochastic Rounding during SHARP mathematical calculations.
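For readers unfamiliar with the term, stochastic rounding rounds a value up or down at random, with the probability of rounding up equal to its distance from the lower neighbor, so that the result is correct on average. The sketch below shows the general technique on an integer grid; it is not SHARP's implementation, and the source does not state which data formats SHARP applies it to.

```python
import math
import random


def stochastic_round(x: float, rng: random.Random) -> int:
    """Round x to a neighboring integer at random.

    x is rounded up with probability equal to its fractional part,
    so the expected value of the result equals x.
    """
    lower = math.floor(x)
    frac = x - lower  # In [0, 1); zero when x is already an integer.
    return lower + (1 if rng.random() < frac else 0)


if __name__ == "__main__":
    rng = random.Random(0)  # Seeded only to make the demo repeatable.
    samples = [stochastic_round(2.25, rng) for _ in range(100_000)]
    # The mean approaches 2.25, whereas round-to-nearest always gives 2.
    print(sum(samples) / len(samples))
```

The benefit over round-to-nearest is that rounding errors do not accumulate in one direction when many values are summed.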

Default Client-Server Communication

New installations of SHARP_am now use UCX as the default communication method with libsharp instead of sockets.

During upgrades, SHARP_am retains the previously configured communication method. If the system was previously using sockets, customers are advised to switch to UCX for improved reliability.

Reduce-Scatter and All-Gather

Added Reduce-Scatter and All-Gather operations support on Quantum-3 systems.
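The semantics of these two collective operations can be illustrated with plain Python lists standing in for the per-rank buffers. This is a single-process sketch of what the operations compute, assuming a sum reduction; it says nothing about how SHARP executes them in the network.

```python
from typing import List, Sequence


def reduce_scatter(buffers: Sequence[Sequence[float]]) -> List[List[float]]:
    """Element-wise sum across ranks, then give each rank one chunk.

    buffers[r] is the input of rank r; all inputs must have the same
    length, divisible by the number of ranks.
    """
    if not buffers:
        raise ValueError("at least one rank is required")
    n = len(buffers)
    length = len(buffers[0])
    if any(len(b) != length for b in buffers) or length % n != 0:
        raise ValueError("buffers must be equal-length and divisible by rank count")
    chunk = length // n
    reduced = [sum(column) for column in zip(*buffers)]
    return [reduced[r * chunk:(r + 1) * chunk] for r in range(n)]


def all_gather(chunks: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate every rank's chunk and give each rank the full result."""
    gathered = [value for chunk in chunks for value in chunk]
    return [list(gathered) for _ in chunks]
```

Applying all_gather to the output of reduce_scatter leaves every rank with the full element-wise sum, which is the same result an all-reduce would produce.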

Default Configuration in UFM

sharp_enable is now set to true by default, eliminating the need to update the settings in the configuration file.

For more information, please refer to NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) Rev 3.13.0.

OpenSM

5.25.1

Routing

Improved routing calculation time on planarized networks.

Reduced routing table updates on planarized networks.

General

Added support for SHARP multicast trees.

Added support for writing port partition table to SMDB file.

Added support for specifying BW limitation to devices connected to XDR switches.

Added support for configuring SL2VL Mapping from device configuration file.

Added information to the generated configuration file indicating whether each parameter can be changed at runtime.

Logging

Added support for additional APort validity checks.

For more information, please refer to MLNXSM (InfiniBand Subnet Manager) Utility Release Notes v5.25.1.

IBUtils2 Utility

2.24.0

  • Added support for CPO.

  • Added support for Multi-Cast Private LFT.

  • Added support for GPU devices.

  • Updated routing analysis in XDR fabrics.

  • Added support for PPLM register.

  • Added support for ARHC and ARHR HCA registers.

  • Added support for new PEMI layouts.

  • Updated PEMI section numbering according to PRM.

  • Disabled PEMI SNR and PEMI OEDP by default.

  • Disabled the SLRG, SLRP, SLRIP, and SLSIR registers by default.

  • Disabled Diagnostic Data Pages; using Access Registers only.

  • Changed PDDR cable register timeout to long timeout.

  • Updated APort file.

  • Updated fetch mechanism for all MADs.

  • Reduced the number of recurring MAD fetch errors.

  • Added delta for Performance Histogram Port Data.

  • Added delta for Performance Histogram Buffer Data.

  • Added Template GUID to Physical Hierarchy section.

For more information, please refer to IBUtils2 Utility Documentation v2.24.0.

For plugin changes, new features, bug fixes or known issues, refer to plugin documentation under UFM Plugins.

The following distributions are no longer supported in UFM:

  • RH7.0-RH7.7 / CentOS7.0-CentOS7.7

  • SLES12 / SLES 15

  • EulerOS2.2 / EulerOS2.3

  • Ubuntu18.04

Deprecated Features:

  • Mellanox Care (MCare) Integration

  • UFM on VM (UFM with remote fabric collector)

  • Logical server auditing

  • The UFM high availability script - /etc/init.d/ufmha - is no longer supported

  • The UFM Multi-site portal feature is no longer supported. The Multi-Subnet feature can be used instead

  • As of UFM Enterprise v6.19.0, the Autonomous Link Maintenance (ALM) and PDR Deterministic plugins are no longer supported.

  • The GRPC-Streamer plugin is deprecated.

  • As of UFM Enterprise v6.18.0, UFM Agent discovery is disabled by default, and managed switches are discovered in-band

  • As of UFM Enterprise v6.18.0, the ibdiagpath diagnostic utility is deprecated

  • As of UFM Enterprise version 6.14.0, UFM Monitoring Mode is deprecated and is no longer supported

  • As of UFM Enterprise v6.12.0, the Logical Elements tab is removed

  • Removed the following fabric validation tests: CheckPortCounters & CheckEffectiveBER

Note

To continue using the options previously provided by /etc/init.d/ufmha, pass the same options to the /etc/init.d/ufmd script.

For example:

Instead of using /etc/init.d/ufmha model_restart, please use /etc/init.d/ufmd model_restart (on the primary UFM server)

Instead of using /etc/init.d/ufmha sharp_restart, please use /etc/init.d/ufmd sharp_restart (on the primary UFM server)

The same applies to any other option that was supported by the /etc/init.d/ufmha script.
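When existing automation still calls the old script, the substitution described in this note can be applied mechanically. The helper below is a minimal sketch; the name migrate_command is hypothetical, and it only rewrites the script path, leaving the option untouched as the note prescribes.

```python
import shlex

OLD_SCRIPT = "/etc/init.d/ufmha"
NEW_SCRIPT = "/etc/init.d/ufmd"


def migrate_command(command: str) -> str:
    """Replace the ufmha script path with ufmd, keeping all other tokens."""
    tokens = shlex.split(command)
    migrated = [NEW_SCRIPT if token == OLD_SCRIPT else token for token in tokens]
    return shlex.join(migrated)


if __name__ == "__main__":
    # The two examples from the note above.
    print(migrate_command("/etc/init.d/ufmha model_restart"))
    print(migrate_command("/etc/init.d/ufmha sharp_restart"))
```

As the note states, the resulting commands are intended to be run on the primary UFM server.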

© Copyright 2025, NVIDIA. Last updated on Nov 20, 2025