MLNXSM (InfiniBand Subnet Manager) Utility Release Notes v5.25.1

Bug Fixes

Version | Description
5.25.1
  • Fixed an issue with line breaks in Virtualization DB dump file
  • Fixed an issue with reconfiguring PFRN on routers when changing configuration during runtime
  • Fixed an issue with redundant HCA configuration on topology changes
  • Fixed an issue with trailing newlines in syslog messages
  • Fixed a hang caused by a race condition when handling CC/VS/N2N KeyViolation MADs
  • Fixed an issue with failing DFP routing engine after failing to assign LID to CA
  • Fixed an issue with failing to configure routing tables on all switches
  • Fixed an issue with reporting APort attributes misalignment
  • Fixed an issue with multicast routing of SendOnly members in planarized networks
  • Fixed an issue with multicast routing of Aggregation Nodes
  • Fixed race condition when parsing the partition configuration from UFM plugin
5.24.0
  • Fixed an issue with routing calculation when using ar_tree_asymmetric_flow 3
  • Fixed an issue with configuring routing using dfp2 routing engine
  • Fixed an issue with routing calculation to isolated switches and single switch system on planarized topologies
  • Fixed an issue with handling switches not responding to ARInfo on planarized topologies
  • Fixed an issue with routing calculation time on planarized topologies
  • Fixed an issue with recovering after parsing malformed configuration files
  • Fixed an issue with handling faulty devices when ISSU is activated
  • Fixed an issue with running SM with issu_mode 2 (passive mode)
  • Fixed an issue with ignoring devices due to M_Key mismatch after lease period expires
  • Fixed an issue with updating M_Key for new ports
  • Fixed an issue with logging port recovery traps and ISSU traps
  • Fixed an issue with logging topology changes when using routing chains
  • Fixed an incorrect trap name when logging key violation traps
5.23.0
  • Fixed a crash issue when multiple switches fail to reply to MADs and ucast_cache is enabled
  • Fixed an issue with re-discovering device after timeouts when topology file is specified
  • Fixed a crash issue when enabling congestion control keys during runtime
  • Fixed clearing FLID of removed spines from router LID tables
  • Fixed an issue with configuring SL to VL mapping when using specific settings for switch-to-switch links
  • Fixed logging error when enhanced QoS file is empty
  • Fixed an issue with configuring trap P_Key for VS, CC, and N2N management classes
5.22.11
  • Fixed race condition while processing SA requests during heavy sweep
  • Fixed an issue with sending multiple optimized SL2VLMapping MADs to a single device
  • Fixed an issue with redundant router table updates on light sweeps
  • Fixed an issue with configuring pFRN on devices that do not support FRN
5.22.0
  • No bug fixes are introduced in this release
5.21.0
  • Fixed an issue with non-optimal routing in enhanced asymmetric routing algorithm
  • Fixed an issue with sending client reregister to switches when not required
  • Fixed an issue with redundant sweep when restarting SM
  • Fixed a potential crash when the SM's device capability changes
  • Fixed a binding issue when UMAD returns invalid IB devices
  • Fixed a memory leak issue when parsing hexadecimal numbers in QoS configuration file
5.20.0
  • Fixed an issue with setting incorrect ARGroupToRouterLIDTable block
  • Fixed an issue with setting incorrect value in RouterInfo.max_ar_group_id
  • Fixed an issue with configuring incorrect routes to switches in ar_minhop
  • Fixed an issue with configuring incorrect routes to routers after SM port reconnection
5.19.0
  • Fixed an issue in root detection to identify roots on topologies with two switches
  • Fixed an issue with setting transaction_retries to 0
  • Fixed an issue with not resending RouterLIDTable MADs after timeouts
  • Fixed a logging issue when handling invalid MC join requests
5.18.0
  • Fixed an issue with supporting DF+ topologies with more than 341 leaf switches in dfp2 routing engine
  • Fixed an issue with assigning empty AR groups between AR group 0
  • Fixed an issue in root detection algorithm related to selecting non-optimal roots
  • Fixed an issue of not resending routing tables to switches after switch restarts
  • Fixed an issue with incorrect port states in SMDB file of sub-topologies
5.17.0
  • Fixed the process of marking peers of isolated ports as flapping ports
  • Fixed root detection algorithm to support additional cases of missing links
  • Fixed the process of sending N2N MADs to isolated devices
  • Fixed the process of sending CC MADs to isolated hosts
  • Fixed an issue that caused repeated heavy sweep when isolating HCA with two ports with the same node GUID
  • Fixed NDR port labels
  • Fixed an issue related to SM not resending MEPI after timeout
5.16.0
  • Fixed a race condition when parsing configuration file that can result in SM crash
  • Fixed an issue with reconfiguring a switch after it is removed from the held-back list
  • Fixed a redundant heavy sweep when suspecting a port is unhealthy
  • Fixed the handling of switches with all ports marked as 'no_discover'
  • Fixed an issue with writing 'unknown vendor' to screen
5.15.0
  • Fixed the reporting of PortInfo validation failure for rebooted HCA ports
  • Fixed the writing of VS and CC MADs in debug verbosity
  • Fixed an issue related to sending SL2VL MADs for unhealthy ports
  • Fixed an issue related to initiating a heavy sweep when reporting a new unhealthy port
5.14.0
  • Fixed a crash that occurred when Incremental Multicast Routing was enabled
  • Fixed an issue related to enabling PLFT2 on DF+2 when "dive-ins" are not permitted
  • Marked ports that did not respond to the NI as unhealthy instead of the entire node
  • Fixed root detection algorithm in trees with missing links
  • Fixed root detection algorithm in DF+ with roots without leaves
5.13.0
  • Fixed an issue related to removing ServiceRecords when the port is disconnected.
  • Set threads affinity according to the scheduler affinity.
  • Enabled dumping SMDB file when ucast cache feature is enabled.
  • Fixed event reporting to untrusted subscribers.
  • Fixed an issue related to creating service records with P_Key 0.
  • Fixed the handling of routers marked as "unhealthy" when using fat-tree routing engine.
  • Fixed an issue related to creating a dump files directory when it did not exist on startup.
5.12.0
  • Fixed a crash that occurred when drop_subscr_on_report_fail was enabled.
  • Fixed a case that caused FRN to fail when there were isolated/heldback switches.
  • Fixed a memory leak when changing the list of routing engines during runtime.
  • Fixed an issue that prevented ports in INIT state from being directly activated.
  • Fixed an issue that prevented activating virtual ports on first time master sweep when running with --once.
  • Fixed a memory leak when parsing QoS policy file with errors.
  • Fixed an issue that prevented incrementing the count of outstanding AN2AN/VS/CC MADs when no response was expected.
  • Added support for routers in FTREE routing engine.
  • Fixed an issue related to AR LFT in trees that had entries with FREE state and empty group 0.
5.11.0
  • Fixed an unconditional jump on an uninitialized value in dfp2 when ar_sl_mask is set to 0.
  • Fixed a case of duplicated LIDs when persistent SM LID feature is enabled.
  • Fixed invalidating ucast cache when discovering faulty switch.
  • Fixed a crash when detecting two ports of the same node with different port GUID but the same port number.
  • Fixed the type of traps 1310 and 1311 (duplicate GUIDs) to 'security'.
  • Fixed reporting trap 1312 to UFM.
5.10.0
  • Fixed a crash that occurred during a race between the LFT record get query and routing configuration.
  • Fixed the non-generic notices statistics counters in the dump file.
  • Fixed the process of postponing isolation and reporting of noisy ports.
  • Fixed an issue related to selecting held-back/isolated switches as roots for multicast trees.
  • Fixed an issue that caused unresponsive links to remain in Active state.
  • Fixed an issue that affected the writing of invalid AN2AN links to SMDB dump file.
  • Fixed the IPoIB traffic loss after changing the subnet prefix and loading the MC groups from SADB upon SM restart.
  • Fixed the SM build on Debian with libibumad from rdma-core.
  • Fixed the way port capability changes are handled during runtime.
  • Fixed an incorrect endianness issue in error log message 0F29.
  • Fixed an incorrect log message when enabling SHARP on the device.
  • Fixed the statistics counters race condition with SM multi port.
  • Fixed rewriting of the statistics file when the existing file had a different header from the current one. If the previous header differs from the current one, a backup of the old file is created along with the updated statistics file.
5.9.1
  • Fixed a crash when isolating the switch using:
    • the "held_back_sw_guid" file while running SM with updn/ar_updn
    • a GUIDs order file with a port group that includes HCAs that are connected to a held-back switch
  • Fixed an issue that resulted in breaking routing for virtual port LIDs upon failover/restart
  • Fixed an issue that caused ar_ftree to create routing between IO nodes that was not credit-loop free
  • Fixed an issue that resulted in the discovery stage continuing during the subnet configuration stage
  • Fixed an issue in which MEPI was not retrieved after switch reset
  • Fixed multicast group leak when handling leave of SendOnlyFullMembers of multicast groups
  • Fixed a leak when spoofing notice 144 for virtual ports
5.8.1
  • Enabled SA requests with default subnet prefix in GRH on subnet with non-default subnet prefix
  • Fixed a crash when processing virtual ports after aborted heavy sweep
  • Fixed a wrong direct route for GeneralInfo MADs after coming out of standby
  • Fixed a crash in UPDN LID tracking that happened when multithreading was enabled
  • Fixed file descriptor leakage when running with crashd
  • Fixed an issue that resulted in setting default pkey at index 0 on invalid partitions.conf
  • Fixed an issue that prevented setting ar_sl_mask on hosts when running with armgr plugin
  • Freed alias GUIDs resources when deleting virtual port object
  • Fixed checking 2x link width capability
  • Enabled handling MCMemberRecord request with default subnet prefix on subnet with non-default subnet prefix
5.7.1
  • Fixed memory overflow upon virtual ports removal from the Subnet when using Adaptive Routing.
  • Fixed handling ‘;’ and ‘:’ in node names in port groups policy file parser.
  • Fixed missing routes-to-routers after recovery of the routing engine in Dragonfly+.
  • Fixed port_search_order usage when LMC is enabled.
  • Fixed SA LinkRecords and MultipathRecords LMC support.
  • Fixed partition checking for LinkRecord and PortInfoRecord queries.
  • Fixed dedicated groups calculation for switches with ANs when FRN is enabled.
  • Fixed router support in port groups.
  • Fixed an issue that prevented SADB dumping when updating service records.

© Copyright 2025, NVIDIA. Last updated on Mar 10, 2026