NVIDIA UFM Enterprise User Manual v6.19.5

Appendix – UFM Subnet Manager Default Properties

The following table provides a comprehensive list of UFM SM default properties.

Category

Property

Config File Attribute

Default

Mode/Field

Description

Generic

Subnet Prefix

subnet_prefix

0xfe80000000000000

RW

Subnet prefix used on the subnet 0xfe80000000000000

LMC

lmc

0

RW

The LMC value used on the subnet: 0-7

Changes to the LMC parameter require a UFM restart.

SM LID

master_sm_lid

0

Force specific LID for local SM when in MASTER state

Selected LID must match configured LMC

0 disables the feature

Security Keys

M_Key

m_key

0x0000000000000000

RW

M_Key value sent to all ports, used to qualify Set(PortInfo)

Not recommended

m_key_per_port

FALSE

RW

When enabled, OpenSM generates a unique M_Key for each HCA/RTR port and switch port 0.

m_key_lease_period

60

RW

The lease period used for the M_Key on the subnet in [sec]

m_key_lookup

TRUE

RW

If FALSE, SM will not try to determine the m_key of unknown ports.

SM_Key

sm_key

0x0000000000000001

RO

SM_Key value of the SM used for SM authentication

SA_Key

sa_key

0x0000000000000001

RO

SM_Key value to qualify incoming SA queries as 'trusted'

Key manager

key_mgr_seed

0x0000000000000000

RW

Parameter used by key manager for CC/VS key configuration.

If 0, uses mkey as a key for CC and VS classes. Otherwise, uses this parameter as a seed for CC/VS key generation.

Congestion control security keys

cc_key_lease_period

60

RW

The lease period used for CC Keys in [sec]

cc_key_protect_bit

1

RW

The protection level used for CC Keys. Supported values:

0: Protection is provided. However, CC managers are allowed to read the key by KeyInfo GET.

1: Protect subnet ports with CC key.

Vendor specific MADs security keys

vs_key_enable

0

RW

Enable Vendor Specific Key Configuration. If enabled, VS keys are configured using a seed indicated by key_mgr_seed.

Supported values:

0: Ignore VSKey

1: Disable VSKey

2: Enable VSKey

vs_key_lease_period

60

RW

The lease period used for VS keys in [sec]

vs_key_ci_protect_bits

1

RW

The protection level used for VS Keys. A mode defined by protect bit and protection scope. Supported values:

0: Protection is provided. However, VS managers are allowed to read the key by KeyInfo GET.

1: Protect subnet ports with VS key for both Informational and Configurational MADs

Partition enforcement

part_enforce

both (default: outbound and inbound enforcement enabled)

RO

Partition enforcement type (for switch ports)

Limits

Packet Life Time

packet_life_time

0x12

RW

The maximum lifetime of a packet in a switch.

The actual time is 4.096usec * 2^<packet_life_time>

The value 0x14 disables the mechanism

VL Stall Count

vl_stall_count

0x07

RO

The number of sequential packets dropped that cause the port to enter the VL Stalled state. The result of setting the count to zero is undefined.

Leaf VL Stall Count

leaf_vl_stall_count

0x07

RO

The number of sequential packets dropped that causes the port to enter the leaf VL Stalled state. The count is for switch ports driving a CA or gateway port. The result of setting the count to zero is undefined.

Head Of Queue Life time

head_of_queue_lifetime

0x12

RW

The maximum time a packet can wait at the head of the transmission queue. The actual time is 4.096usec * 2^<head_of_queue_lifetime>

The value 0x14 disables the mechanism

Leaf Head Of Queue Life time

leaf_head_of_queue_lifetime

0x10

RW

The maximum time a packet can wait at the head of queue on a switch port connected to a CA or gateway port.

Local PHY Error Threshold

local_phy_errors_threshold

0x08

RW

Threshold of local phy errors for sending Trap 129

Overrun Errors Threshold

overrun_errors_threshold

0x08

RW

Threshold of credit overrun errors for sending Trap 130

Subnet Timeout

subnet_timeout

18 (1 second)

RW

The InfiniBand subnet_timeout that will be set for all the ports.

The actual timeout is 4.096usec * 2^<subnet_timeout>

VS MADs on the wire

vs_max_outstanding_mads

500

RW

Maximum number of vendor-specific MADs in the network at once

Link Speed

Force Link Speed

force_link_speed

15

(Do NOT change)

RW

Force PortInfo: LinkSpeedEnabled on switch ports.

If 0, do not modify.

Values are:

1: 2.5 Gbps

3: 2.5 or 5.0 Gbps

5: 2.5 or 10.0 Gbps

7: 2.5 or 5.0 or 10.0 Gbps

2,4,6,8-14 Reserved

15: set to PortInfo: LinkSpeedSupported

Force Link Speed

force_link_speed_ext

31

(Do NOT change)

RW

1: 14.0625 Gbps

2: 25.78125 Gbps

3: 14.0625 Gbps or 25.78125 Gbps

4: 53.125 Gbps

5: 14.0625 Gbps or 53.125 Gbps

6: 25.78125 Gbps or 53.125 Gbps

7: 14.0625 Gbps, 25.78125 Gbps or 53.125 Gbps

8: 106.25 Gbps

9: 14.0625 Gbps or 106.25 Gbps

10: 25.78125 Gbps or 106.25 Gbps

11: 14.0625 Gbps or 25.78125 Gbps or 106.25 Gbps

12: 53.125 Gbps or 106.25 Gbps

13: 14.0625 Gbps or 53.125 Gbps or 106.25 Gbps

14: 25.78125 Gbps or 53.125 Gbps or 106.25 Gbps

15: 14.0625 Gbps, 25.78125 Gbps or 53.125 Gbps or 106.25 Gbps

30: Disable extended link speeds

Default 31: set to PortInfo:LinkSpeedExtSupported

Force Link Speed

force_link_speed_ext2

7

(Do NOT change)

RW

1: 212.50 Gbps

Default 7: set to PortInfo:LinkSpeedExtSupported2

SM Threading

SMP MAD/Trap processing

smp_threads

0

RW

Number of threads to be used for processing SMPs. 0 stands for all available cores.

SA/GMP MAD processing

gmp_threads

0

RW

Number of threads to be used for processing GMPs. 0 stands for all available cores.

SA/GMP Trap processing

gmp_traps_threads_num

1

RW

Number of threads to be used for processing key violation trap MADs by VS and CC managers

Routing Threads

routing_threads_num

0

RW

Number of threads to be used for parallel minhop/updn calculations.

If 0, number of threads will be equal to number of processors.

Routing Threads Per Core

max_threads_per_core

0

RW

Max number of threads that are allowed to run on the same processor during parallel computing.

If 0, threads assignment per processor is up to operating system initial assignment.

Sweep

Sweep Interval

sweep_interval

10

RW

The time in seconds between subnet sweeps (Disabled if 0)

Reassign Lids

reassign_lids

FALSE (disabled)

RW

If TRUE (enabled), all LIDs are reassigned

For debug only.

Force Heavy Sweep

force_heavy_sweep_window

-1

RW

Forces heavy sweep after number of light sweeps

(-1 disables this option and 0 will cause every sweep to be heavy)

Sweep On trap

sweep_on_trap

TRUE (enabled)

RW

If TRUE every trap 128 and 144 will cause a heavy sweep

Alternative Route Calculation

max_alt_dr_path_retries

0

RW

Maximum number of attempts to find an alternative direct route towards unresponsive ports

Fabric Rediscovery

max_seq_redisc

2

RW

Max Failed Sequential Discovery Loops

Offsweep Rebalancing Enable

offsweep_balancing_enabled

FALSE

RW

Enable/Disable idle time routing rebalancing

(deprecated)

Offsweep Rebalancing Window

offsweep_balancing_window

180

RW

Set the time window in seconds after sweep to start rebalancing

(deprecated)

Handover

SM Priority

sm_priority

15

RO

The priority used for deciding which SM is the master. Range is 0 (lowest priority) to 15 (highest priority)

Ignore Other SMs

ignore_other_sm

FALSE (disabled)

RO

If TRUE other SMs on the subnet should be ignored

Polling Timeout

sminfo_polling_timeout

10

RO

Timeout in seconds between two active master SM polls

Polling Retries

polling_retry_number

4

RO

Number of failing remote SM polls that declares it non-operational

Honor GUID-to-LID File

honor_guid2lid_file

FALSE

(disabled)

RO

If TRUE, honor the guid2lid file when coming out of standby state, if the guid2lid file exists and is valid (not applicable to UFM SM)

Allowed SM GUID list

allowed_sm_guids

(null)

(disabled)

RW

Comma-separated list of host GUIDs on which an SM is allowed to run, when specified. OpenSM ignores any SM running on a port that is not in this list.

If 0, does not allow any other SM.

If null, the feature is disabled.

MAD handling

Max Wire SMPs

max_wire_smps

32

RW

Maximum number of SMPs sent in parallel

max_wire_smps2

32

RW

Maximum number of timeout-based SMPs allowed to be outstanding

A value less than or equal to max_wire_smps disables this mechanism

max_wire_smps_per_device

2

RW

Maximum number of SMPs sent in parallel to the same port.

Currently, the supported MADs are:

portInfo/Extended portInfo, LFTs, AR LFTs, AR group table, AR copy group table, RN subgroup direction, SLVL table, VL Arbitration

Transaction Timeout

transaction_timeout

200

RO

The maximum time in [msec] allowed for a SMP Get/Set MAD sending transaction to complete

Transaction Timeout

long_transaction_timeout

1000

RO

The maximum time in [msec] allowed for a "long" transaction to complete.

Currently, long transactions are only used for optimized SL2VLMappingTable and PortInfo for port 0 MADs

Transaction Retries

transaction_retries

3

RO

The maximum number of retries allowed for a SMP Get/Set MAD sending transaction to complete

Max Message FIFO Timeout

max_msg_fifo_timeout

10000

RW

Maximum time in [msec] a message can stay in the incoming message queue

Max Message FIFO Length

max_msg_fifo_len

20000

RW

Maximum number of messages that can reside in the incoming message queue before dropping SubnAdmGet/SubnAdmGetTbl requests.

Logging

Log File

log_file

/opt/ufm/files/log/opensm.log

RO

Path of Log file to be used

Log Flags

log_flags

Error and Info

0x03

RW

The log flags, or debug level being used.

Force Log Flush

force_log_flush

FALSE

(disabled)

RO

Force flush of the log file after each log message

Log Max Size

log_max_size

4096

RW

Limit the size of the log file in MB. If overrun, log is restarted

Accumulate Log File

accum_log_file

TRUE

(enabled)

RO

If TRUE, will accumulate the log over multiple OpenSM sessions

Dump Files Directory

dump_files_dir

/opt/ufm/files/log

RO

The directory that holds the SM dump files (for example, multicast forwarding table dumps). These files are used to collect information.

Syslog log

syslog_log

0x0

RW

Sets a verbosity of messages to be printed in syslog

Dump tables

dump_ar

FALSE

Enable adaptive routing data dump to file.

Misc

Node Names Map File

node_name_map_name

(null)

RW

Node name map file for mapping nodes to more descriptive node descriptions

SA database File

sa_db_file

(null)

RO

SA database file name

Client Reregistration

no_clients_rereg

FALSE

(disabled)

RO

If TRUE, disables client reregistration

(Deprecated)

client_rereg_mode

2

RO

Control parameter for sending client reregistration options.

0 - Sending client reregistration disabled.

This option is kept for backward compatibility. Not recommended for use.

1 - Send client reregistration during LID assignment.

This option is kept for backward compatibility. Not recommended for use.

2 - Send Client reregistration during link activation.

Exit On Fatal Event

exit_on_fatal

TRUE

(enabled)

RO

If TRUE (enabled), the SM exits for fatal initialization issues

Enable NVIDIA SHARP support

sharp_enabled

2

RW

SHArP support

0: Ignore SHArP - No SHArP support

1: Disable SHArP - Disable SHArP on all supporting switches

2: Enable SHArP - Enable SHArP on all supporting switches

Multicast

Disable Multicast

disable_multicast

FALSE

(disabled)

RO

If TRUE, OpenSM should disable multicast support and no multicast routing is performed

Multicast Group Parameters

default_mcg_mtu

0

RW

Default MC group MTU for dynamic group creation. 0 disables this feature, otherwise, the value is a valid IB encoded MTU

Multicast Group Parameters

default_mcg_rate

0

RW

Default MC group rate for dynamic group creation. 0 disables this feature, otherwise, the value is a valid IB encoded rate

MC root file

mc_roots_file

(null)

RW

Specify predefined MC groups root guids

Incremental Multicast Routing (IMR)

enable_inc_mc_routing

TRUE

RW

If TRUE, MC nodes will be added to the MC tree incrementally. When set to FALSE, the tree will be recalculated on each change.

QoS

Settings

qos

TRUE

RW

If TRUE (enabled), SM will apply QoS settings

Settings

# QoS default options

qos_max_vls 4

qos_high_limit 0

qos_vlarb_high 0:0

qos_vlarb_low 0:160,1:112

qos_sl2vl 0,1,1,1,1,1,1,1,15,15,15,15,15,15,15,15

# QoS CA options

qos_ca_max_vls 0

qos_ca_high_limit -1

qos_ca_vlarb_high (null)

qos_ca_vlarb_low (null)

qos_ca_sl2vl (null)

# QoS Switch Port 0 options

qos_sw0_max_vls 0

qos_sw0_high_limit -1

qos_sw0_vlarb_high (null)

qos_sw0_vlarb_low (null)

qos_sw0_sl2vl (null)

# QoS Switch external ports options

qos_swe_max_vls 0

qos_swe_high_limit -1

qos_swe_vlarb_high (null)

qos_swe_vlarb_low (null)

qos_swe_sl2vl (null)

# QoS Switch-to-switch external port options

qos_sw2sw_max_vls 0

qos_sw2sw_high_limit -1

qos_sw2sw_vlarb_high (null)

qos_sw2sw_vlarb_low (null)

qos_sw2sw_sl2vl (null)

# QoS Router ports options

qos_rtr_max_vls 0

qos_rtr_high_limit -1

qos_rtr_vlarb_high (null)

qos_rtr_vlarb_low (null)

qos_rtr_sl2vl (null)

RW

Recommended SL2VL and VL arbitration settings when qos flag is set to TRUE

No UFM restart is needed upon update

Maximal Operational VL

max_op_vls

2

RW

Limit of the maximum operational VLs

Note, SM will flap all fabric links to deploy the configuration upon parameter change.

QoS Policy

qos_policy_file

/opt/ufm/files/conf/opensm/qos-policy.conf

RW

QoS policy file to be used

Unhealthy Ports

Enabling Unhealthy Ports

hm_unhealthy_ports_checks

TRUE

RW

Enables Unhealthy Ports configuration

Configuration file

hm_ports_health_policy_file

(null)

RW

Specifies configuration file for health policy

Unhealthy actions

hm_sw_manual_action

no_discover

RW

Specifies what to do with switch ports which were manually added to health policy file

MADs validation

validate_smp

TRUE

RW

If set to TRUE, opensm will ignore nodes sending non-spec compliant MADs. When set to FALSE, opensm will log a warning about the non-compliant node in the opensm log file

Routing

Unicast Routing

Engine

routing_engine

ar_updn

RW

By default, ar_updn routing engine is used by the SM.

Supported routing engines are minhop, updn, dnup, ftree, dor, torus-2QoS, kdor-hc, kdor-ghc, dfp, dfp2, ar_updn, ar_ftree and ar_dor.

Root GUIDs file

root_guid_file

/opt/ufm/files/conf/opensm/root_guid.conf

RW

The file holds the root node GUIDs of the topology.

Single GUID or port group in each line.

Feature supported by updn, ar_updn, ftree, ar_ftree and dfp2 routing engines.

Unicast Routing Caching

use_ucast_cache

TRUE

RW

Use unicast routing cache for routing computation time improvement

Adaptive routing

ar_sl_mask

0xFFFF

RW

AR SL mask - 16-bit bitmask indicating which SLs should be configured for AR

enable_ar_by_device_cap

TRUE

RO

Enable adaptive routing only to devices that support packet reordering.

When enabled, state in ARLFT entries for devices which do not support packet reordering is set to static.

When disabled, ARLFT entries remain as determined by the routing engine.

Changing the default value is not recommended.

enable_ar_group_copy

TRUE

RO

Enable adaptive routing group copy optimization.

Changing the default value is not recommended.

ar_mode

3

RO

Adaptive routing mode

Supported values:

0 - Adaptive routing disabled.

1 - Enable local adaptive routing (switches select exit port based on local buffer utilization).

2 - Enable adaptive routing with notifications (deprecated).

3 - Auto mode in which adaptive routing is determined by the routing engine. (default)

Changing the default value is not recommended.

ar_transport_mask

0x000A

RW

AR Transport mask - indicates which transport types are enabled for AR

Bit 0 = UD, Bit 1 = RC, Bit 2 = UC, Bit 3 = DCT, Bits 4-7 are reserved.

cache_ar_group_id

TRUE

RW

Load GUID to AR group ID from cache file. When enabled, it can reduce AR group configuration changes after restart.

Adaptive Routing in Asymmetric Tree topologies

ar_tree_asymmetric_flow

1

RW

AR Asymmetric trees max flow algorithm

Supported values:

0 - Disable the algorithm.

1 - Enable with 1 subgroup support.

2 - Enable with 2 subgroups on leaf switches.

3 - Enable asymmetric tree algorithm

ar_tree_asymmetric_flow_threshold

0

RW

In use if ar_tree_asymmetric_flow is set to 3.

Threshold (percent) of BW drop between spine and core links before excluding spines.

If a spine's spine-to-core BW percent drops below the threshold due to link failures, that spine becomes eligible for removal from the AR group.

Example: ar_tree_asymmetric_flow_threshold 34

If a spine has 32 links towards the core, it would need to have 11 links removed before the switch is excluded (34% of 32 = 10.88).

ar_tree_asymmetric_flow_threshold_limit

2

RW

Threshold limit of nodes to exclude. If the number of spines eligible for exclusion (as determined by the ar_tree_asymmetric_flow_threshold parameter) is higher than this limit, no spines will be removed.

Changing the default value is not recommended.

routing_flags

0x0

RO

Bit mask of flags to control various options and behavior of routing engines.

Supported values:

0x1 - Enable switch rank adjustments for tree based routing engines.

SHIELD/PFRN

shield_mode

3

RO

Advanced routing - Fast link fault recovery feature

The feature is required for traffic resiliency upon switch-to-switch link failures.

Supported values:

0 - Fast link fault recovery disabled.

1 - Enable local fast link fault recovery only.

2 - Enable legacy fast link fault recovery with notifications (deprecated)

3 - Auto mode in which enhanced fast link fault recovery support is determined by the routing engine. (default)

ar_updn, ar_ftree and dfp2 will enable enhanced fast link fault recovery in the switches.

Otherwise, local fast link fault recovery only will be enabled in the switches.

Changing the default value is not recommended.

pfrn_sl

4

RW

SL for pFRN communication between switches.

Make sure pfrn_sl is properly mapped in sl2vl qos settings

pfrn_mask_clear_timeout

180

RW

Time (in seconds) since last pFRN for a specific subgroup was received, after which the entire mask must be cleared

Multiple of 60 seconds

pfrn_mask_force_clear_timeout

720

RW

Maximal time (in seconds) since last mask clear, after which mask must be cleared.

Multiple of 240 seconds

pfrn_over_router_enabled

2

RW

Enable pFRN over routers

0: Ignore - Do not change pFRN configuration on routers

1: Disable - Disable pFRN over routers

2: Enable - Enable pFRN over routers

Held back switches

held_back_sw_file

(null)

RW

The file holding the list of node GUIDs to hold back from routing

GUID Ordering During Routing

guid_routing_order_file

(null)

RW

The file holding guid routing order of particular guids (for MinHop, Up/Down)

Torus Routing

torus_config

/opt/ufm/files/conf/opensm/torus-2QoS.con

RW

Torus-2QoS configuration file name

Routing Chains

pgrp_policy_file

(null)

RW

The file holding the port groups policy

topo_policy_file

(null)

RW

The file holding the topology policy

rch_policy_file

(null)

RW

The file holding the routing chains policy

max_topologies_per_sw

4

RO

Defines maximal number of topologies to which a single switch may be assigned during routing engine chain configuration.

Hash based Forwarding

hbf_sl_mask

0xFFFF

RW

HBF (Hash-Based Forwarding) SL mask - 16-bit bitmask indicating which SLs should be configured for HBF

hbf_hash_type

0

RW

HBF (Hash-Based Forwarding) hash type: 0 - CRC, 1 - XOR

Changing the value is not recommended.

hbf_seed_typ

0

RW

HBF (Hash-Based Forwarding) hash seed type: 0 - Configurable, 1 - Random

Changing the value is not recommended.

hbf_seed

0xFFFFFFFF

RW

HBF (Hash-Based Forwarding) hash seed. Values 0 - 0xFFFFFFFF

0xFFFFFFFF stands for taking the 32 LSB of the node GUID as HBF hash seed.

Changing the value is not recommended.

hbf_hash_fields

0x0000000040F00C0F

RW

HBF (Hash-Based Forwarding) hash fields - 64-bit bitmask indicating the fields that affect hash calculation.

Bit 1: DLID

Bit 2: SL

Bit 3: VL

Bit 4: LRH LNH

GRH fields:

Bit 10: SGID

Bit 11: DGID

Layer4 fields:

Bit 20: BTH destination QP

Bit 21: DETH source QP

Bit 22: DCETH V1 source QP

Bit 23: DCETH V1 ISID

Bit 30: Input port

Changing the value is not recommended.

hbf_weights

auto

RW

Weighted Hash-Based Forwarding (WHBF) configures the ratio between three subgroups of the Adaptive Routing (AR) Group.

The format is a tuple of three integers in the range 0 to 0xffff:

<subgroup 0 weight>,<subgroup 1 weight>,<subgroup 2 weight>

Changing the value is not recommended.

Randomization

scatter_ports

8

RW

Assigns ports in a random order instead of round-robin. If 0, the feature is disabled, otherwise use the value as a random seed.

Applicable to the AR_MINHOP/AR_UPDN routing algorithms

guid_routing_order_no_scatter

TRUE

RO

Do not use scatter for ports defined in guid_routing_order file

use_scatter_for_switch_lid

FALSE

RW

Use scatter when routing to the switch’s LIDs

updn lid tracking mode

updn_lid_tracking_mode

FALSE

RW

Controls whether SM will use LID tracking or not when updn or ar_updn routing engine is used

Events

Event Subscription Handling

drop_subscr_on_report_fail

TRUE

RW

Drop subscription on report failure (o13-17.2.1)

drop_event_subscriptions

TRUE

RW

Drop event subscriptions (InformInfo and ServiceRecords) on port removal and SM coming out of STANDBY

drop_unreachable_event_subscriptions

TRUE

RW

Drop event subscriptions (InformInfo and ServiceRecord) on ports that have no routing to SM

Virtualization

Virtualization enabled

virt_enabled

2

RW

Virtualization support

0: Ignore Virtualization - No virtualization support

1: Disable Virtualization - Disable virtualization

2: Enable Virtualization - Enable virtualization

Maximum ports in virtualization process

virt_max_ports_in_process

64

RW

Sets a number of ports to be handled on each virtualization process cycle

Router

Router aguid enable

rtr_aguid_enable

0 (Disabled)

RW

Defines whether the SM should create alias GUIDs required for router support for each HCA port

Router path record flow label

rtr_pr_flow_label

0

RW

Defines flow label value to use in multi-subnet path query responses

Router path record tclass

rtr_pr_tclass

0

RW

Defines tclass value to use in multi-subnet path query responses.

Router path record sl

rtr_pr_sl

0

RW

Defines sl value to use in multi-subnet path query responses

Router path record MTU

rtr_pr_mtu

4 (IB_MTU_LEN_2048)

RW

Define MTU value to use in multi-subnet path query responses

Router path record rate

rtr_pr_rate

16 (IB_PATH_RECORD_RATE_100_GBS)

RW

Defines rate value to use in multi-subnet path query responses

SA Security

SA Enhanced Trust Model (SAETM)

sa_enhanced_trust_model

FALSE

RW

Controls whether SAETM is enabled.

Untrusted GuidInfo records

sa_etm_allow_untrusted_guidinfo_rec

FALSE

RW

Controls whether to allow Untrusted Guidinfo record requests in SAETM.

Guidinfo record requests by VF

sa_etm_allow_guidinfo_rec_by_vf

FALSE

RW

Controls whether to allow Guidinfo record requests by VF in SAETM.

Untrusted proxy requests

sa_etm_allow_untrusted_proxy_requests

FALSE

RW

Controls whether to allow untrusted proxy requests in SAETM.

Max number of multicast groups

sa_etm_max_num_mcgs

128

RW

Max number of multicast groups per port/vport that can be registered.

Max number of service records

sa_etm_max_num_srvcs

32

RW

Max number of service records per port/vport that can be registered.

Max number of event subscriptions

sa_etm_max_num_event_subs

32

RW

Max number of event subscriptions (InformInfo) per port/vport that can be registered.

SGID spoofing

sa_check_sgid_spoofing

TRUE

RW

If enabled, the SA checks for SGID spoofing in every request with GRH included, unless the SLID is from a router port at that request.

Topo config

topo_config_file

(null)

RW

The file holding the topo configuration.

topo_config_enabled

FALSE

RW

If set to true, the SM will adjust its operational mode to take into account the topo_config file (predefined topology file).

Tenant Manager

tenants_policy_enabled

FALSE

RW

If set to true, the SM will enable the tenants manager

tenants_policy_file

/opt/ufm/files/conf/opensm/tenants-policy.conf

RW

The file holding the tenant configuration.

Application performance

Fabric profile

fabric_mode_profile

none

RW

Fabric Mode feature

Indicates which profile from fabric mode policy file to use for better user application performance.

If set to none, the feature is disabled and the SM will not change the current device configuration

fabric_mode_policy_file

/opt/ufm/files/conf/opensm/fabric-mode-policy.conf

RW

The file holding the Fabric Mode policy

Congestion Control

Congestion Control

mlnx_congestion_control

0

RW

Enable Congestion Control Configuration

0: Ignore congestion control

1: Disable congestion control

2: Enable congestion control

congestion_control_policy_file

/opt/ufm/files/conf/opensm/cc-policy.conf

RW

The file holding the congestion control policy

ppcc_algo_dir

/opt/ufm/files/conf/opensm/ppcc_algo_dir

RW

The directory holding the PPCC algorithm profiles

cc_max_outstanding_mads

500

RW

Congestion Control Max outstanding MAD

Statistics/Performance Logging

SM Statistics

osm_stats_interval

60

RW

Time interval [in min] between statistics dumps.

The value 0 implies no statistic dump

Max value is 71,582

osm_stats_dump_limit

20

RW

Max size [in MB] of statistic dump file.

The value 0 implies no size limitation.

Max value is 4095 (4GB).

osm_stats_dump_per_sm_port

TRUE

RW

Indication whether to dump MADs statistics per SM port

SM Performance Logging

osm_perflog_dump_limit

20

RW

Max size [in MB] of perflog dump file.

The value 0 implies no size limitation.

Max value is 4095 (4GB).

enable_performance_logging

TRUE

RW

Enable performance logging.

When enabled, the SM dumps the time it took for specific stages.
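
Two encodings recur in the table above: lifetimes and timeouts stored as an exponent of a 4.096 usec base unit (packet_life_time, head_of_queue_lifetime, subnet_timeout), and force_link_speed_ext values 1-15, which combine one bit per speed. The following Python sketch decodes both. The function and constant names are illustrative and are not part of UFM or OpenSM, and the bit-to-speed mapping is inferred from the single-speed rows (1, 2, 4 and 8) of the table.

```python
# Illustrative helpers only; none of these names exist in UFM or OpenSM.

BASE_USEC = 4.096  # base unit stated in the table: 4.096 microseconds

# Inferred from the single-speed rows of force_link_speed_ext (1, 2, 4, 8).
EXT_SPEEDS_GBPS = {1: 14.0625, 2: 25.78125, 4: 53.125, 8: 106.25}


def encoded_time_seconds(exponent):
    """Return 4.096 usec * 2**exponent, converted to seconds.

    Note: for packet_life_time and head_of_queue_lifetime the table says
    that 0x14 disables the mechanism, so that value is not a real duration.
    """
    return BASE_USEC * (2 ** exponent) / 1_000_000


def decode_link_speed_ext(mask):
    """List the speeds (in Gbps) enabled by a force_link_speed_ext value 1-15."""
    if not 1 <= mask <= 15:
        raise ValueError("only values 1-15 are plain bit combinations")
    return [speed for bit, speed in EXT_SPEEDS_GBPS.items() if mask & bit]


if __name__ == "__main__":
    print(round(encoded_time_seconds(0x12), 3))  # default subnet_timeout
    print(decode_link_speed_ext(13))
```

With the default exponent of 0x12 (18), the computed time is roughly 1.07 seconds, in line with the "18 (1 second)" default listed for subnet_timeout.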

Note

Modifying the following parameters does not require a Subnet Manager (SM) restart and will not toggle fabric ports:

  • qos_high_limit (for HCA only)

  • qos_vlarb_high (for HCA only)

  • qos_vlarb_low

  • qos_sl2vl

However, incorrect VLARB or SL2VL settings can impact application performance or lead to traffic failure.
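
Because a malformed VLARB or SL2VL string can disrupt traffic, it may help to sanity-check the values before applying them. The sketch below parses the qos_sl2vl and qos_vlarb_* formats shown in the table. The helper names are hypothetical, and the range checks are assumptions based on InfiniBand field widths (16 SLs, 4-bit VL numbers, data VLs 0-14 in arbitration tables, 8-bit weights); they are not the SM's own validation logic.

```python
# Hypothetical pre-flight checks; not part of UFM or OpenSM.

def parse_sl2vl(value):
    """Parse a qos_sl2vl string into a list of 16 VL numbers (one per SL)."""
    vls = [int(token) for token in value.split(",")]
    if len(vls) != 16:
        raise ValueError(f"expected 16 SL entries, got {len(vls)}")
    for sl, vl in enumerate(vls):
        if not 0 <= vl <= 15:  # assumption: VL is a 4-bit field
            raise ValueError(f"SL {sl} maps to invalid VL {vl}")
    return vls


def parse_vlarb(value):
    """Parse a qos_vlarb_high/low string into (vl, weight) pairs."""
    entries = []
    for token in value.split(","):
        vl_text, weight_text = token.split(":")
        vl, weight = int(vl_text), int(weight_text)
        if not 0 <= vl <= 14:  # assumption: only data VLs are arbitrated
            raise ValueError(f"invalid VL {vl} in entry {token!r}")
        if not 0 <= weight <= 255:  # assumption: weight is an 8-bit field
            raise ValueError(f"invalid weight {weight} in entry {token!r}")
        entries.append((vl, weight))
    return entries


if __name__ == "__main__":
    # Default values taken from the table above.
    print(parse_sl2vl("0,1,1,1,1,1,1,1,15,15,15,15,15,15,15,15"))
    print(parse_vlarb("0:160,1:112"))
```

A check like this only catches syntax and range mistakes; whether a given mapping suits the fabric still has to be judged against the QoS design.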

Single-root I/O virtualization (SR-IOV) enables a PCI Express (PCIe) device to appear to be multiple separate physical PCIe devices.

UFM is ready to work with SR-IOV devices by default. You can fine-tune the configuration using the SM configuration.

The following arguments are available for ConnectX-5 and later devices:

| Argument | Value | Description |
|---|---|---|
| virt_enabled | 0 – no virtualization support; 1 – disable virtualization on all virtualization supporting ports; 2 – enable virtualization on all virtualization supporting ports (default) | Virtualization support |
| virt_max_ports_in_process | Possible values: 0-65535, where 0 processes all pending ports. Default: 64 | Maximum number of ports to be processed simultaneously by the virtualization manager |
| virt_default_hop_limit | Possible values: 0-255. Default: 2 | Default value for hop limit to be returned in path records where either the source or destination are virtual ports |
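
The value ranges listed for the virtualization arguments can be checked mechanically before editing the SM configuration. The following Python sketch is one way to do that; the dictionary and function names are illustrative, and the ranges are copied from the argument list above.

```python
# Illustrative range check; the names below are not part of UFM or OpenSM.

VIRT_RANGES = {
    "virt_enabled": (0, 2),
    "virt_max_ports_in_process": (0, 65535),
    "virt_default_hop_limit": (0, 255),
}


def check_virt_settings(settings):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for name, value in settings.items():
        if name not in VIRT_RANGES:
            problems.append(f"{name}: not one of the virtualization arguments")
            continue
        low, high = VIRT_RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}: {value} is outside {low}-{high}")
    return problems


if __name__ == "__main__":
    defaults = {
        "virt_enabled": 2,
        "virt_max_ports_in_process": 64,
        "virt_default_hop_limit": 2,
    }
    print(check_virt_settings(defaults))
```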

UFM can isolate particular switches from routing in order to perform maintenance of the switches with minimal interruption to the existing traffic in the fabric.

Isolating a switch from routing is done via UFM Subnet Manager as follows:

  1. Create a file that includes either the node GUIDs or system GUID of the switches under maintenance. For example:


    0x1234566 0x1234567

  2. Set the held_back_sw_file parameter in the /conf/opensm.conf file to the name of the file created in Step 1.

  3. Run:

    kill -s HUP $(pidof opensm)

Once SM completes rerouting, the traffic does not go through the ports of isolated switches.

To return the switch to routing:

  1. Remove the GUID of the switch from the list of isolated switches defined in Step 1 of the isolation process.

  2. Run:

    kill -s HUP $(pidof opensm)

Once SM completes rerouting, traffic will go through the switch.
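
Both procedures end by sending SIGHUP to the running opensm process. When this step is scripted, the same effect can be had from Python; the following is a minimal sketch, assuming the `pidof` utility is available on the UFM host. The function names are hypothetical, and the `send` parameter exists only so the logic can be exercised without signalling a real process.

```python
import os
import signal
import subprocess
from typing import Callable, List


def parse_pids(pidof_output: str) -> List[int]:
    """Parse the whitespace-separated PID list printed by `pidof`."""
    return [int(token) for token in pidof_output.split()]


def reload_opensm(send: Callable[[int, int], None] = os.kill) -> List[int]:
    """Send SIGHUP to every running opensm process and return the PIDs signalled."""
    # pidof prints nothing and exits non-zero when no process matches,
    # so an empty result simply yields an empty PID list.
    result = subprocess.run(["pidof", "opensm"], capture_output=True, text=True)
    pids = parse_pids(result.stdout)
    for pid in pids:
        send(pid, signal.SIGHUP)
    return pids


if __name__ == "__main__":
    signalled = reload_opensm()
    print(f"sent SIGHUP to {len(signalled)} opensm process(es)")
```

Returning the list of signalled PIDs makes it easy for a calling script to detect the case where no opensm process was found.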

© Copyright 2024, NVIDIA. Last updated on Jan 7, 2025.