Changes and New Features
v5.19.0 |
|
FLID |
Added support for assigning a single FLID to multiple leaf switches. This is enabled via the router configuration file. |
PFRN |
Added support for PFRN over routers. Control parameter: pfrn_over_router_enabled - Enable/disable the feature. Supported values: 0 - SM does not change the PFRN over router related configuration; 1 - Disable the feature; 2 - Enable the feature (default). |
Routing Algorithm |
Added support for a new routing algorithm for asymmetric tree topologies. It supports asymmetric quasi fat-tree topologies and is enabled by setting ar_tree_asymmetric_flow to 3. |
General |
Added neighborhood ID to switch coordinates in SMDB when using ar_updn. |
General |
The hub switch is now selected according to the maximal number of reachable LIDs. |
General |
Added logging of links removed due to timeouts. |
General |
Added logging of switch rank changes. |
v5.18.0 |
|
Routing Engines |
Added support for asymmetric routing for QFT topologies, with an enhanced algorithm for routing asymmetric QFT topologies. It supports 2-level and 3-level QFT topologies without parallel links. Feature control parameters:
|
Routing Engines |
Added support for calculating missing routes between switches in the dfp2 routing engine. |
Debug Parameters |
Added warning notes about debug parameters in the generated configuration file. Note: Changing these debug parameters is not recommended. |
Logging Improvements |
Improved logging of trap 259 and of the discovery process. |
Adaptive Routing Options |
Added the option to enable and disable AR group copy optimization. |
v5.17.0 |
|
Tenant Managers |
This new capability prevents a tenant from using GUIDs that were assigned to another tenant. Feature control parameters:
|
Rank Adjustment |
Enabled rank adjustment for mid-level switch with missing links to roots. Feature control parameters:
|
PFRN |
Enabled the option to configure PFRN even when not all devices support it. |
‘Well-Known SM GID’ Support |
Added support for handling ‘Well-Known SM GID’ in PathRecords. |
v5.16.0 |
|
Congestion Control |
Added support for configuring CC buffer thresholds according to the switch capabilities. |
Performance |
Improved performance of routing calculation for routers. |
Unicast Route Rebalancing |
Added the option to avoid unicast route rebalancing when HBF is enabled on all SLs. |
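As an illustration of what "enabled on all SLs" means for an SL mask such as hbf_sl_mask, here is a minimal Python sketch. It assumes that bit i of the 16-bit mask corresponds to SL i, which these notes do not state explicitly. The helper names are hypothetical and are not part of OpenSM.

```python
# Sketch only: assumes bit i of a 16-bit SL mask selects SL i.
# The function names are illustrative and do not exist in OpenSM.

NUM_SLS = 16
ALL_SLS_MASK = (1 << NUM_SLS) - 1  # 0xFFFF


def sls_in_mask(mask: int) -> list[int]:
    """Return the SL numbers whose bits are set in the mask."""
    if not 0 <= mask <= ALL_SLS_MASK:
        raise ValueError(f"SL mask out of range: {mask:#x}")
    return [sl for sl in range(NUM_SLS) if mask & (1 << sl)]


def covers_all_sls(mask: int) -> bool:
    """True when every one of the 16 SL bits is set."""
    return mask & ALL_SLS_MASK == ALL_SLS_MASK
```

Under this assumption, a mask of 0xffff covers every SL, while 0x0000 selects none.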
Port GUID |
Added the option to write the destination port GUID when logging direct routed SMPs. |
Systemd |
Added support for Systemd. |
v5.15.0 |
|
Programmable Congestion Control |
Extended IBCC to support programmable congestion control. Feature control parameters:
|
Fast Recovery |
Added Fast Recovery support to configure a policy on switches for reporting ports as unhealthy, and to support isolating the unhealthy ports. Feature control parameters:
|
General |
Added support for running SM with a topology specification file. Feature control parameters:
|
General |
Added support for generating an SM performance report. The report file is created in the SM logs directory with the name "opensm-perflog.json". Feature control parameters:
|
General |
Added port label for supporting NDR switches and CAs. |
General |
Added support for additional predefined port groups:
|
General |
Limited the number of simultaneous SL2VL and VLArb MADs sent per device. |
General |
Enabled dumping FLID ranges to the opensm-router.dump file. |
Routing |
Added support for running an SM root detection algorithm when the root GUIDs file is invalid. |
Routing |
Improved FLID routing calculation time. |
v5.14.0 |
|
General |
Updated the subnet configuration flow as follows:
|
General |
Added support for multithreaded SL2VL calculation. |
Routers |
Added support for AR over routers (FLIDs). |
DFP Routing Engine |
Added support for persistent AR groups for DF+ 1 (dfp) routing engine. |
UPDN and AR UPDN Routing Engine |
Improved UPDN minhop tables calculation times. |
v5.13.0 |
|
SL-to-VL Mapping Table |
Added support for port masked optimized SLtoVLMappingTable programming. |
DFP2 Routing Engine |
Added support for PFRN on DFP2 routing engine. |
PKey Validation Traps |
Added support for suppressing multicast PKey validation traps. |
General |
Updated the default value of drop_event_subscriptions to TRUE. |
General |
Updated the default value of drop_subscr_on_report_fail to TRUE. |
General |
Enabled sending LFT/ARLFT entry for SMLID last. |
MTU/Rate |
Enabled MTU and Rate calculation for router PathRecords according to route. |
v5.12.0 |
|
Self Healing Network with PFRN |
Self healing network with PFRN is now at GA level for AR_UPDN routing engine. |
Hash Based Forwarding (HBF) |
Hash Based Forwarding (HBF) is now at GA level. |
Adaptive Routing Engine |
Improved routing calculation time for AR_UPDN routing engine. |
Adaptive Routing |
A dedicated AR group ID per leaf is now assigned also when SHIELD is disabled. |
General |
Added the option to avoid initializing links marked for port resets. |
General |
Updated root_guid_file parameter description. |
General |
Removed vPorts with LID required set to 0 from guid2lid. |
DragonFly+ Topologies |
Improved root detection algorithm for DragonFly+ topologies to support leaf switches without hosts. |
v5.11.0 |
|
Self healing network with PFRN |
[Beta] PFRN is used for fast link fault recovery. If a link fails or disconnects, switches send messages to their peer switches to update the routing tables. This feature is supported only with the ar_updn and ar_ftree routing engines, and only if all fabric switches support PFRN (NVIDIA Quantum and NVIDIA Quantum-2 switch systems only). Feature control parameters:
To disable PFRN, set shield_mode to 2. |
Multiport high availability |
Allows the SM to fail over to another port in case of an SM link failure. It requires configuring more than one port GUID in the guid parameter. Feature control parameter:
|
Hash Based Forwarding (HBF) |
Allows selection of the switch outgoing port for statically routed packets based on the packet's parameters (ECMP-like). With the dfp2 routing engine, non-minhop routes are used for static routing as well as for Adaptive Routing. Feature control parameters:
|
SA response time |
Improved SA response time for multicast join requests during routing calculation. |
Persistent mapping |
Added support for persistent mapping between AR group ID and the destination switch GUID. |
Switch SMA response MADs |
Switch SMA response MADs are now routed using PLFT0 to overcome a firmware limitation in dfp2. |
SM ports table |
Added SM ports table to SMDB. |
Log message verbosity |
Changed verbosity of log message when toggling ports to INFO. |
Port state report |
Added the option to report to the log when failing to update port state from ARM to ACTIVE. |
Virtualization traps |
Added details for virtualization traps to the log file. |
Statistics dump file per SM |
Enabled statistics dump file per SM port by default. |
Asymmetric flow algorithm for trees |
Enabled asymmetric flow algorithm for trees (ar_updn and ar_ftree) by default. |
v5.10.0 |
|
Adaptive Timeout SL Mask |
Added support for Adaptive Timeout SL mask. Feature control parameter:
|
IB Router QoS |
Extended the QoS policy file to support subnet prefixes and port GIDs for inter subnet QoS. This improvement enables the definition of SL/rate/MTU/packet-life for cross-subnet paths. For further information, refer to the doc/QoS_management_in_OpenSM.txt document. |
Routing Engine |
Added a new root detection algorithm in the UPDN and ar_updn routing engines. Feature control parameters:
|
Routing Engine |
Changed the default routing engine to be ar_updn instead of minhop. |
Vendor Specific (VS) Key |
Added support for Vendor Specific (VS) key. The following are the parameters related to the feature:
|
SA Response Time |
Improved SA response time during routing calculation. Feature control parameter:
|
Report Duplicated GUIDs |
Added support for reporting duplicated GUIDs to UFM. |
Switch Reboot |
Added the option to report switch reboots to UFM. |
Long Transaction Timeout |
Enabled the option to use long transaction timeout for PI for port 0 MADs. |
SMDB Dump File |
Added subnet prefix to SMDB dump file. |
SM Binding Port Information |
Added SM binding port information to the MAD details in the timeout message, dumped to the SM log file. |
OpenSM Start Time |
Added SM start time to the SMDB dump file. |
Dump MAD Statistics per SM Port |
Added the option to enable dumping MAD statistics per SM port. Feature control parameter:
|
Adaptive Routing (AR) Group IDs |
Made the process of selecting AR (Adaptive Routing) group IDs deterministic in each run of the SM on the same fabric. |
v5.9.1 |
|
Link Speed |
Added support for NDR InfiniBand link speed in SM. |
Configuration File Validation |
Added a new command line option "--validate_conf_files" that makes SM validate the configuration files and then exit. Note: This version of the tool supports validation of the partition file only. |
Persistent Multicast (MC) Trees |
This capability enables reading MulticastForwardingTables upon SM startup/fail-over to ensure that the new MASTER SM does not break multicast routing. To enable/disable it, use the "get_mft_tables" parameter (default TRUE). |
DragonFly+ Topologies |
Added SHIELD support for dfp2 routing engine for DragonFly+ topologies. |
SM Allowed GUIDs List |
This new capability enables the user to specify the list of GUIDs allowed to run SM in the fabric. When the list is provided, the master SM avoids handover to ports that are not specified in the list. To enable this feature, use the "allowed_sm_guids" parameter. When it is set to "(null)", the feature is disabled. |
Limiting the Number of VLs for Long Distance Links |
This new capability enables the user to set the maximum operational VL per port through a new file specified by the "device_configuration_file" parameter in the OpenSM configuration file. For more information, see doc/device_configuration.md. |
Send ClientReregister after Subnet Configuration |
This new capability enables the user to send ClientReregister after subnet configuration to prevent the hosts from sending SA requests to the SM before the SM is ready to respond to them. This feature can be controlled using the following parameter: client_rereg_mode - Controls the mode of sending ClientReregister. Supported values:
The new parameter replaces the deprecated "no_clients_rereg" parameter. |
kDOR Generalized Hypercube Engine |
Added kDOR Generalized Hypercube engine. |
General |
|
v5.8.1 |
|
Asymmetric trees |
The feature is applicable to ar_updn and ar_ftree routing engines. It reduces congestion in asymmetric tree topologies with missing uplinks on leaf switches. To enable/disable the feature, use the ar_tree_asymmetric_flow parameter. The supported values are:
|
Selecting LID for Master SM |
This feature prevents SM LID changes upon fail-over. To set the LID for master SM, use the master_sm_lid parameter. The supported values are:
|
Root GUIDs file for Dragonfly+ Routing Engines |
This feature enables root GUIDs file for Dragonfly+ topology Routing Engines (dfp and dfp2). To set the file with GUIDs of root switches of the topology use the root_guid_file parameter. |
Dragonfly+ Routing Engine |
Added a new routing engine (dfp2) for Dragonfly+ topologies. This engine supports Dragonfly+ topologies with any kind of tree topology islands. If the topology contains an island with more than 2 tree levels, a root GUIDs file that includes the root switches of all Dragonfly+ islands should be provided. To select the new dfp2 routing engine, use the routing_engine parameter. |
Maximum Operational VLs for CAs, Routers and Switches |
This feature enables the user to configure different max_op_vls for CAs, Routers and Switches. To set the maximum operational VLs per device type, use the following parameters:
|
“VL packing” for Dragonfly+ and KDOR Routing Engines. |
Added support for “VL packing” for Dragonfly+ and KDOR routing engines. This feature reduces the maximum operational VLs for CAs to half of subnet max_op_vls when using dfp/dfp2/kdor_hc routing engines. To enable/disable the feature, use the enable_vl_packing parameter. The following is an example of “VL packing”:
|
Support SRP target on HCAs with Socket-Direct architecture/Virtual Machines. |
Enabled returning PortInfoRecord and NodeRecord for virtual ports and reporting virtual port capability changes. To enable/disable the feature (default TRUE), use the enable_virt_rec_ext parameter. |
General |
|
v5.7.2 |
|
General |
|
v5.7.1 |
|
Multiple ports |
Allows MLNXSM to use multiple ports for sending Subnet configuration MADs. Feature control parameters:
|
Extend router selection algorithm |
Supports specifying hash function, seed and additional hash function arguments for router selection during path records calculation. Feature control parameters:
|
LMC for routers and number of LIDs allowed per router for inter-subnet path records |
Feature control parameters:
|
Congestion Control |
Feature control parameters:
For additional information, please review congestion_control.md file provided with MLNXSM. |
LIDs range in Routing Chains |
Replaces the path-bit qualifier in the routing chain configuration with the min-path-bit and max-path-bit qualifiers (path-bit is still supported for backward compatibility). Example of usage:
|
Controlling maximum number of MADs on wire per destination |
Feature control parameters:
|
Configuring service keys to service name |
Enables configuring service keys for service names. Feature control parameters: service_name2key_map_file - Path to the service name to service key map file. The file contains a mapping from service name to service key, where the key is specified in IPv6 format. For example, to map service name <SERVICE NAME> to service key 0::1, add the following line to the file: <SERVICE NAME> 0::1 |
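The map file format described above (one service name followed by a key in IPv6 format per line) can be sketched as a small Python parser. This is a hedged illustration, not the SM's own parser: the handling of blank lines and of names containing spaces is an assumption, and `parse_service_map` is a hypothetical helper name.

```python
import ipaddress

# Sketch only: parses lines of the form "<SERVICE NAME> <key in IPv6 format>".
# Assumptions (not stated in the release notes): blank lines are skipped and
# the key is the last whitespace-separated token on the line.


def parse_service_map(text: str) -> dict[str, ipaddress.IPv6Address]:
    mapping: dict[str, ipaddress.IPv6Address] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        parts = line.rsplit(None, 1)  # split the key off from the right
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected '<name> <key>'")
        name, key_text = parts
        # IPv6Address raises a ValueError subclass on a malformed key.
        mapping[name] = ipaddress.IPv6Address(key_text)
    return mapping


example = "my_service 0::1\n"
service_keys = parse_service_map(example)
```

Here `my_service` is a placeholder name. The key `0::1` parses to the 128-bit value 1.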
General |
|
Default Configuration Changes |
|
Parameter Name |
Status |
Type |
Description |
5.18.0 |
|||
ar_tree_asymmetric_flow |
Update |
Numeric |
Support enabling new asymmetric routing algorithm for QFT topologies |
max_op_vls |
Update |
Numeric |
Changed default from 3 (4 data VLs) to 2 (2 data VLs) |
enable_ar_group_copy |
New |
Boolean |
Debug option to allow controlling AR group copy optimization. (default TRUE) |
5.17.0 |
|||
tenants_policy_enabled |
New |
Boolean |
Enable tenants manager feature (default is FALSE) |
tenant_policy_file |
New |
Path |
Path to tenants manager policy file (default is /etc/opensm.tenants-policy.conf) |
routing_flags |
New |
Mask |
Control various routing engine options (default is 0) |
5.15.0 |
|||
max_wire_smps |
Update |
Numeric |
Change default value to 32 |
max_wire_smps2 |
Update |
Numeric |
Change default value to 32 |
hbf_sl_mask |
Update |
Numeric |
Change default value to 0xffff |
topo_config_enabled |
New |
Boolean |
Enable running SM with topology spec file (default is FALSE) |
topo_config_file |
New |
Path |
Path to topology specification file (default is "(null)") |
fast_recovery_enabled |
New |
Numeric |
Enable fast recovery feature (default is 0) |
fast_recovery_conf_file |
New |
Path |
Path to fast recovery file (default is "(null)") |
ppcc_algo_dir |
New |
Path |
Path to directory with PPCC algorithm profile files (default is /etc/opensm/ppcc_algo_dir) |
enable_performance_logging |
New |
Boolean |
Enable generating SM performance report (default is TRUE) |
osm_perflog_dump_limit |
New |
Numeric |
Limit performance log file size in megabytes (default is 20) |
max_seq_redisc |
Update |
Numeric |
Change default value to 4 |
qos_policy_file |
Update |
Path |
Change default value to "(null)" |
routing_engine |
Update |
String |
Update description that dfp2 is not experimental |
5.14.0 |
|||
qos |
Update |
Boolean |
Changed default to TRUE |
use_optimized_slvl |
Update |
Boolean |
Changed default to TRUE |
use_optimized_port_mask_slvl |
Update |
Boolean |
Changed default to TRUE |
long_transaction_timeout |
Update |
Numeric |
Changed default from 500 milliseconds to 1000 milliseconds |
5.13.0 |
|||
root_guid_file |
Update |
Path |
Updated description to include all supported routing engines |
suppress_mc_pkey_traps |
New |
Boolean |
Suppress Multicast PKey violation traps (default is TRUE) |
drop_subscr_on_report_fail |
Update |
Boolean |
Changed default to TRUE |
drop_event_subscriptions |
Update |
Boolean |
Changed default to TRUE |
use_optimized_slvl |
Update |
Boolean |
Updated description to state that the parameter controls wildcarded optimization |
use_optimized_port_mask_slvl |
New |
Boolean |
Enable port masked optimized SLtoVLMappingTable programming (default is FALSE) |
rtr_pr_mtu |
Update |
Numeric |
Change default to 255 (Calculate PathRecord MTU according to route) |
rtr_pr_rate |
Update |
Numeric |
Change default to 255 (Calculate PathRecord Rate according to route) |
5.12.0 |
|||
root_guid_file |
Update |
Path |
Updated description to include all supported routing engines. |
5.11.0 |
|||
pfrn_sl |
New |
Number |
SL for PFRN messages. Default 0 |
pfrn_mask_clear_timeout |
New |
Number |
Time since last PFRN for an AR group to clear unused port masks. Default 180 |
pfrn_mask_force_clear_timeout |
New |
Number |
Time since last mask clear, after which unused port mask is cleared by the switch. Default 720 |
n2n_key_enable |
New |
Number |
Enable Node-to-Node Key (management class 0xC) configuration. Default 0 (Ignore) |
n2n_key_protect_bit |
New |
Number |
Protection level for class 0xC. Default 1 |
n2n_key_lease_period |
New |
Number |
Lease period for class 0xC key. Default 60 |
n2n_max_outstanding_mads |
New |
Number |
Maximum number of N2N MADs in the network at once. Default 500 |
enable_sm_port_failover |
New |
Boolean |
Enable SM fail over to another port in case of link failure. Default is False |
hbf_sl_mask |
New |
Number |
SL mask for HBF. Default 0x0000 (Disabled) |
hbf_hash_type |
New |
Number |
HBF hash type. Default 0 (CRC) |
hbf_seed_type |
New |
Number |
HBF seed type. Default 0 (User defined seed) |
hbf_seed |
New |
Number |
Seed for HBF. Default 0xFFFFFFFF (Use switch GUID) |
hbf_hash_fields |
New |
Number |
Fields of packet for hash calculation. Default 0x40F00C0F |
hbf_weights |
New |
String |
Weights ratio between ports of different groups. Default 'auto' (Routing algorithm decision) |
cache_ar_group_id |
New |
Boolean |
Load GUID to AR group ID cache file on startup. Default TRUE |
ar_tree_asymmetric_flow |
Update |
Number |
Changed default to 1. |
sm_stats_dump_per_sm_port |
Update |
Boolean |
Changed default to TRUE. |
5.10.0 |
|||
adaptive_timeout_sl_mask |
New |
Number |
Defines the adaptive timeout SL mask of the port. Default 0xFFFF |
routing_engine |
Update |
String |
Changed default value from (null) to ar_updn |
find_roots_color_algorithm |
New |
Boolean |
Find root using coloring algorithm for tree based topologies. Default is TRUE. |
max_cas_on_spine |
New |
Number |
The maximum number of CAs on a switch to allow considering it as a spine instead of a leaf by the routing algorithm. |
hm_num_traps |
Update |
Number |
Changed default value from 250 to 60. |
hm_num_traps_period_secs |
Update |
Number |
Changed default value from 60 to 90 seconds. |
5.9.1 |
|||
allowed_sm_guids |
New |
String |
Define list of allowed SM port GUIDs |
device_configuration_file |
New |
String |
Path to device configuration file |
client_rereg_mode |
New |
Number |
Control sending ClientReregister to devices |
max_rate_enum |
New |
Number |
Define maximal supported rate in SA records |
gmp_traps_threads_num |
New |
Number |
Number of threads for processing GMP traps |
get_mft_tables |
New |
Boolean |
Enable/Disable reading MFT tables on first master sweep |
routing_engine |
Update |
String |
Support kdor-ghc for Generalized Hypercube routing engine |
mepi_cache_enabled |
Update |
Boolean |
Changed default from FALSE to TRUE |
no_clients_rereg |
Update |
Boolean |
Deprecated by client_rereg_mode |
use_original_extended_sa_rates_only |
Update |
Boolean |
Deprecated by max_rate_enum |
dfp_down_up_turns_mode |
Update |
Number |
Changed default from 0 to 2 (disable down/up turns) |
routing_threads_num |
Update |
Number |
Changed default value from 1 to 0 |
force_link_speed_ext |
Update |
Number |
Support NDR speeds |
5.8.1 |
|||
max_wire_smps |
Update |
Number |
Change default from 4 to 16 |
max_wire_smps2 |
Update |
Number |
Change default from 4 to 16 |
max_smps_timeout |
Update |
Number |
Change default from 600000 to 300000 milliseconds |
max_msg_fifo_timeout |
Update |
Number |
Change default from 10000 to 5000 milliseconds |
transaction_timeout |
Update |
Number |
Change default from 200 to 100 milliseconds |
enable_crashd |
Update |
Boolean |
Change default from FALSE to TRUE |
routing_engine |
Update |
Text |
Support dfp2 routing engine |
master_sm_lid |
New |
LID |
LID for local SM when in MASTER state |
enable_virt_rec_ext |
New |
Boolean |
Enable PortInfoRecord/NodeRecord for virtual ports/nodes |
ar_tree_asymmetric_flow |
New |
Number |
AR Asymmetric trees max flow algorithm |
max_op_vls_ca |
New |
Number |
max_op_vl for CAs |
max_op_vls_sw |
New |
Number |
max_op_vl for switches |
max_op_vls_rtr |
New |
Number |
max_op_vl for routers |
enable_vl_packing |
New |
Boolean |
Enable VL packing |
5.7.2 |
|||
ar_sl_mask |
Existing |
Number |
Modified behavior: Parameter controls AR SL mask both in switches and HCAs |
5.7.1 |
|||
enable_lst_file |
New |
Boolean |
Control dumping subnet LST file of the topology |
lids_per_rtr |
New |
Number |
Control number of LIDs per router of inter-subnet path record |
max_wire_smps_per_device |
New |
Number |
Control maximum number of MADs on wire per device |
service_name2key_map_file |
New |
Path |
Path to service name to service key map file |
rtr_selection_function |
New |
String |
Hash function to be used by router selection algorithm |
rtr_selection_seed |
New |
Number |
Seed for router selection algorithm |
rtr_selection_algo_parameters |
New |
String |
Comma separated list of parameters for router selection algorithm |
updn_lid_tracking_prefer_total_routes |
New |
Boolean |
Control UPDN LID tracking exit port selection criteria |
mlnx_congestion_control |
New |
Number |
Control Mellanox Congestion Control enablement |
congestion_control_policy_file |
New |
Path |
Path to Congestion Control policy file |
guid |
Update |
List |
Changed the type from a single GUID to a comma-separated list of GUIDs |
scatter_ports |
Update |
Number |
Changed default value from 0 (disabled) to 8 |
log_flash |
Update |
Boolean |
Changed default value from FALSE to TRUE |
max_topologies_per_sw |
Update |
Number |
Changed default value from 1 to 4 |
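The max_op_vls values quoted in the 5.18.0 entry above ("3 (4 data VLs)" and "2 (2 data VLs)") are consistent with the InfiniBand PortInfo OperationalVLs encoding. The following Python sketch decodes such a value. It assumes that max_op_vls uses that encoding, and `data_vls` is a hypothetical helper name, not an OpenSM function.

```python
# Sketch only: assumes max_op_vls follows the InfiniBand OperationalVLs
# encoding (1 = VL0, 2 = VL0-1, 3 = VL0-3, 4 = VL0-7, 5 = VL0-14).

DATA_VLS_BY_CODE = {1: 1, 2: 2, 3: 4, 4: 8, 5: 15}


def data_vls(max_op_vls: int) -> int:
    """Return the number of data VLs selected by an encoded max_op_vls value."""
    try:
        return DATA_VLS_BY_CODE[max_op_vls]
    except KeyError:
        raise ValueError(f"unsupported max_op_vls value: {max_op_vls}") from None


old_default = data_vls(3)  # previous default
new_default = data_vls(2)  # default as of 5.18.0
```

With this mapping, the old default 3 decodes to 4 data VLs and the new default 2 decodes to 2 data VLs, matching the table entry.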