Changes and New Features History
The items listed in the table below apply to all UFM license types.
Feature |
Description |
Rev 6.14.0 |
|
UFM Upgrade |
Added support for in-service upgrade procedure for UFM HA. Refer to the following sections: |
User Authorization |
Added support for user-defined roles based on subsets of REST APIs. Refer to Rest Roles Access Control. |
User Authentication |
Added support for user authentication based on Azure Active Directory. Refer to Azure AD Authentication. |
Plugins Management |
Added support for loading UFM plugins on both the master and standby nodes in a UFM HA deployment. Refer to Plugin Management. |
Unhealthy Ports Policy Management |
Added support for unhealthy ports policy management via UFM Web UI. Refer to Health Policy Management. |
REST over RDMA Plugin |
Added support for remote ibdiagnet authentication. Refer to rest-rdma Plugin. |
SHARP Reservation |
Added support for synchronous SHARP reservation REST API (in addition to the existing asynchronous REST API). Refer to the NVIDIA SHARP REST API. |
Secondary Telemetry |
Added support for secondary telemetry running by default upon UFM startup, fetching NVIDIA Amber counters. Refer to Secondary Telemetry. |
Added support for down ports telemetry. Refer to Secondary Telemetry. |
|
PCI Analysis |
Added support for PCI analysis as part of UFM Fabric Analysis Report (added new events for degraded hosts PCI devices). Refer to Appendix - Supported Port Counters and Events. |
UFM System Dump |
Added human-readable time to the dmesg output as part of the UFM system dump. |
Factory Reset |
Added support for UFM Factory Reset. Refer to Appendix - UFM Factory Reset. |
Rev 6.13.0 |
|
Network Fast Recovery |
Added the ability to automatically isolate a malfunctioning switch port as detected by the switch. Refer to Enabling Network Fast Recovery |
Multi-Subnet UFM |
Added support for multiple UFM instances, wherein multiple instances are aggregated, managed and controlled by a centralized UFM instance. Refer to Multi-Subnet UFM. |
Switch ASIC Failure Detection |
Added support for a new indication (UFM event) that identifies a failure of a specific switch ASIC. Refer to Configuring Partial Switch ASIC Failure Events. |
UFM High-Availability Enhancements |
Added support for configuring high-availability with dual-link connections to improve the high-availability robustness. |
Automatic Switch Grouping |
Added support for enabling automatic grouping of 1U switches by UFM, as per a pre-defined user-configured mapping. Refer to Appendix - Switch Grouping. |
SHARP Trees APIs |
Incorporated support for a new UFM REST API that presents the current active SHARP trees. Refer to NVIDIA SHARP Resource Allocation REST API. |
SHARP Reservation APIs |
Added support for SHARP Reservation API enhancements. Refer to NVIDIA SHARP Resource Allocation REST API. |
Operating System Update support |
Implemented functionality to support the installation and upgrade of a standalone UFM after the upgrade of operating system packages (e.g., using yum update/apt upgrade). Furthermore, upgrading operating system packages will not impact a standalone UFM installation. |
Email Time-Zone Settings |
Added the ability to configure time-zone settings for UFM email notifications, ensuring that sent events or daily reports align with the configured time zone. Refer to Email. |
Switch Connectivity Failure Indication |
Incorporated support for a new UFM event indication that identifies failed communication with a specified managed switch. Refer to Appendix - Supported Port Counters and Events. |
Dynamic Telemetry |
Added APIs that enable the creation and management of UFM Telemetry instances, allowing users to select desired counters and ports as per their requirements. Refer to UFM Dynamic Telemetry Instances REST API. |
TFS (Telemetry Fluent Streaming) Plugin |
Added support for UFM telemetry data streaming from multiple endpoints to Fluent Bit. Refer to Telemetry to Fluent Streaming (TFS) Plugin REST API. |
Added support for enabling white/black counters lists within the TFS Plugin. Refer to Telemetry to Fluent Streaming (TFS) Plugin REST API. |
|
DTS (DPU Telemetry) Plugin |
Added support for displaying DPUs data within the UFM Web UI. Refer to DTS Plugin. |
Cyber-AI Plugin |
Added support for displaying Cyber-AI software within the UFM Web UI. Refer to UFM Cyber-AI Plugin. |
Packet Mirroring Collector (PMC) Plugin |
Added the Packet Mirroring Collector (PMC) plugin that allows users to catch and collect mirrored pFRN and congestion notifications from switches for enhanced real-time network visibility. Refer to Packet Mirroring Collector (PMC) Plugin. |
SNMP Traps Listener Plugin |
Added the capability to enable registration and monitoring of SNMP traps from managed switches, in addition to updating UFM with the relevant trap information. Refer to SNMP Plugin. |
Bright Cluster Integration Plugin |
Added support for integration of data from Bright Cluster Manager (BCM) into UFM, providing a more comprehensive network perspective. Refer to UFM Bright Cluster Integration Plugin. |
UFM System Dump |
UFM System Dump collection enhancement. Refer to UFM System Dump Tab. |
Expanding Non-Blocking Fabric (NDT Plugin extension) |
Added a feature that facilitates seamless expansion of the IB fabric, ensuring uninterrupted functionality and optimal performance throughout the fabric. Refer to NDT Format – Merger. |
PDR (Packet Drop Rate) Plugin |
Added a new functionality that enables automatic detection and isolation of port failures through monitoring of PDR (Packet Drop Rate), BER (Bit Error Rate), and high cable temperatures. Refer to PDR Deterministic Plugin. |
Rev 6.12.0 |
|
Managed Switches - Sysinfo Mechanism |
Added the ability to save switch inventory data into JSON files and present the latest fetched switch data upon UFM start-up. The saved switch data is also available in the UFM system dump. Refer to Appendix - Managed Switches Configuration Info Persistency |
REST over RDMA Plugin |
Introduced security improvements (allowed read-only options in remote ibdiagnet) and added support for Telemetry API. Refer to rest-rdma Plugin. |
Events and Notifications |
Added support for indicating potential switch ASIC failure by detecting a defined percentage of unhealthy switch ports. Refer to Additional Configuration (Optional) |
SHARP AM Multi-Port |
Added support for detecting IB fabric interface failure and automatic failover to an alternative active port in SHARP Aggregation Manager (AM). Refer to Multi-port SM |
UFM System Dump |
Added support for downloading the generated UFM system dump. Refer to UFM System Dump Tab |
UFM REST API |
Added support for adding or removing hosts to Partition key (PKey) assignments (when adding/removing hosts, all the related host GUIDs are assigned to/removed from the PKey). Refer to Add Host REST API |
UFM System Dump Improvements including Creating New System Dump API |
|
UFM SLURM Integration |
Enhanced UFM SLURM integration to allow flexible configuration of PKey and SHARP resource usage. Refer to Appendix - UFM SLURM Integration |
UFM HA |
Improved UFM HA configuration by setting UFM HA nodes using IP addresses only (removed the need for hostnames and sync interface names). Refer to Configuring UFM Docker in HA Mode and Installing UFM Server Software for High Availability |
Managed Switch Operations |
Added support for persistent enablement/disablement of managed switches ports. Refer to Ports Window |
UFM SDK |
Created a script to get TopX data by category. Refer to UFM Aggregation TopX README.md file |
Proxy Authentication |
Added option to delegate authentication to a proxy. Refer to Delegate Authentication to a Proxy |
UFM Initial Settings |
Removed the requirement to set the IPoIB address to the main IB interface used by UFM/SM (gv.cfg → fabric_interface) |
Port auto-isolation |
A symbol BER warning does not trigger port auto-isolation; only a symbol BER error does |
MFT Package |
Integrated with MFT version 4.23.0-104 |
Rev 6.11.0 |
|
UFM Discovery and Device Management |
|
CPU Affinity |
Added the ability for users to control the CPU affinity of UFM's major processes |
gRPC API |
Added support for streaming UFM REST API data over gRPC as part of new UFM plugin. Refer to GRPC-Streamer Plugin |
Telemetry |
|
EFS UFM Plugin |
Added support for streaming UFM events data to FluentD destination as part of a new UFM plugin. Refer to UFM Telemetry FluentD Streaming (TFS) Plugin |
General UI Enhancements |
• Displayed columns of all tables are persistent per user, with the option to restore defaults. Refer to Displayed Columns |
High Availability Deployment |
|
REST APIs |
Added support for PKey filtering for default session data. Refer to Get Default Monitoring Session Data by PKey Filtering. |
Added support for filtering session data by groups. Refer to Monitoring Sessions REST API. |
|
Added support for resetting all unhealthy ports at once. Refer to Mark All Unhealthy Ports as Healthy at Once |
|
Added support for presenting system uptime in UFM REST API. Refer to Systems REST API. |
|
Deployment Installation |
UFM installation is now based on Conda-4.12 (or newer) for deploying the Python 3.9 environment and third-party packages. |
NVIDIA SHARP Software |
Updated NVIDIA SHARP software version to v3.1.1. |
UFM Logical Elements |
UFM Logical Elements (Environments, Logical Servers, Networks) views are deprecated and will no longer be available starting from UFM v6.12.0 (January 2023 release) |
Rev 6.10.0 |
|
System health enhancements |
Add support for the periodic fabric health report, and reflect the ports' results in the UFM dashboard |
UFM Plugins Management |
Add support for plugin management via UFM web UI |
UFM Extended Status |
|
Failover to Other Ports |
Add support for SM and UFM Telemetry failover to other ports on the local machine |
UFM Appliance Upgrade |
Added a set of REST APIs for supporting the UFM Appliance upgrade |
Configuration Audit |
Add support for tracking changes made in major UFM configuration files (UFM, SM, SHARP, Telemetry) |
UFM Plugins |
Add support for new SDK plugins |
Telemetry |
Add support for statistics processing based on UFM telemetry csv format |
UFM High Availability Installation |
UFM high availability installation has changed. It is now based on an independent high availability package, which should be deployed in addition to the UFM Enterprise standalone package. For further details about the new UFM high availability installation, refer to Installing UFM Server Software for High Availability |
Rev 6.9.0 |
|
NDR Support |
Full E2E NDR support, including the ConnectX-7 HCA family (discovery and monitoring) |
Cable FW burn |
Add support for burning multiple FW images on multiple switches |
Events |
Add support for monitoring and alerting on cable transceiver temperatures over threshold |
Improve SM trap handling (offloading SM trap handling to a separate process) |
|
Add option for setting events persistency (keeping max last X events) for showing upon UFM startup |
|
Add option for consolidating similar events on the UFM Web UI Events Log View |
|
SHARP |
Add support for failover to secondary bond port in case of IB interface failure |
Add option to override SHARP smx_sock_interface based on UFM fabric_interface (gv.cfg) |
|
Add option to set SHARP AM ib_port_guid based on UFM fabric_interface (gv.cfg) |
|
SM |
Add support for tracking SM configuration changes (configuration history) |
Add support for pkey assignment validation (for user defined pkey assignment only) |
|
Client Certificate Authentication |
Add support for client certificate authentication |
Add option to push bootstrap certificate to the UFM via REST API |
|
Configuration Migration (backup / restore) |
Add option to migrate UFM configuration from bare metal UFM to a docker container based UFM |
MFT Integration Enhancement |
Add support for MFT based operation (FW burning, cable info) while m_key/vs_key are configured on SM |
Logging |
Add option to configure UFM log folder location |
UFM Health |
Add option for users to add customized health tests based on scripts (Python / Bash) |
Web UI Enhancements |
Add support for user defined modular UFM dashboard views (based on available list of pre-defined panels) |
Add support for UFM dashboard timeline (for viewing historical dashboard views) |
|
Enhance the dashboard inventory view for showing elements (HCAs, Switches, Cables, Gateways, Routers) by version |
|
Add support for user defined modular UFM telemetry persistent dashboard (Telemetry View) |
|
Add option for viewing Web client data based on local client time or UFM server time |
|
Add option to select UFM look and feel between dark mode and light mode (default is light mode) |
|
Add support for hierarchical view when presenting the network map elements. |
|
Add option for selecting the displayed columns for all data tables. |
|
Add option for exporting all table data into CSV (not only the current displayed page data) |
|
Improved view of the ports table (port name, speed and width) |
|
Add option to show disabled/down ports |
|
Add support for Web UI usage statistics collection |
|
Add option for sending test email |
|
Telemetry |
Add support for updating Telemetry package within installed UFM Enterprise. |
UFM Plugins |
Add support for running UFM plugins within UFM docker container |
Add support for AHX monitoring plugin |
|
Supported OSs |
Add support for installing UFM on Ubuntu18 (Standalone and High availability modes) |
Add support for installing UFM on CentOS 7.9/Red Hat 7.9 |
|
Add support for installing UFM on FAIR OS 22.03 |
|
Rev 6.8 |
|
UFM Telemetry |
Changed the Telemetry infrastructure from UFM Telemetry docker container to UFM Telemetry bare metal |
Performance improvements for supporting telemetry on large scale fabrics (up to 216,000 ports fabric) |
|
Live sessions enhancements – adding support for multiple telemetry sessions based on one UFM Telemetry instance |
|
Add support for collecting historical telemetry (all fabric ports counters) by default |
|
Unhealthy Ports |
Add a configurable option for automatically isolating ports detected with high BER |
Add option to present unhealthy port table by the connection type (switch-switch or switch-host). |
|
Add option to mark selected device as unhealthy |
|
UFM Plugins – REST over RDMA |
Add support for REST API over RDMA plugin (allowing execution of UFM REST API requests over the InfiniBand fabric) |
Add ability to run Linux command-line commands, including ibdiagnet, over RDMA |
|
UFM Plugins – NDT |
Add support for NDT (CSV formatted topology) comparison with UFM fabric detected topology |
Fabric Validation Tests |
Add context menu options for selected results of fabric validation tests based on UFM model objects (Devices and Ports). |
Add support for Socket-direct mode reporting (Inventory) |
|
Add support for SHARP Aggregation Manager health tests |
|
Add support for Tree Topology Analysis in UFM |
|
Events Policy |
Add new category for Events Policy – Security |
Add new UFM events indicating assignment of GUIDs to a PKey and removal of GUIDs from a PKey |
|
Add new UFM events which are triggered when duplicated node or port GUIDs are detected in the fabric |
|
Add new event for indicating switch down reported by SM |
|
UFM SDK |
Add option to get topology via UFM REST API and stream it out to an external destination |
Virtualization |
Add option to assign selected virtual ports to a specified PKEY (via UFM Web UI) |
Cable Information |
Show link grade in cable info |
Network Map |
Add support for network map topology persistency on server side. |
UFM Web UI |
Add option to copy and paste table content (GUIDs and LIDs) via UFM Web UI |
UFM Authentication |
Add support for token based authentication |
UFM Slurm Integration |
Add several UFM-SLURM Integration Improvements |
UFM Docker container |
Several Docker enhancements, mainly to improve the deployment procedure |
SM Configuration |
Set AR (Adaptive Routing) Up Down as the default routing configuration in UFM/SM (for new UFM installations) |
UFM REST API |
Add support for the CloudX API in UFM for OpenStack integration, allowing auto provisioning of the InfiniBand fabric |
NDR support |
Add support for discovering and monitoring NVIDIA NDR switches. |
Installation |
Updated UFM installation to run without docker dependencies (docker service is no longer required for the UFM installation) |
Supported OSs |
Add support for installing UFM on CentOS 8 stream, kernel 5.4 |
UFM High Availability |
Add support for an independent high availability package (based on Pacemaker and DRBD), which serves as the basis for high availability deployment of UFM containers |