Congestion Control

The following ibdiagnet options dump the Mellanox/NVIDIA Congestion Control configuration of HCAs and switches, as well as the Congestion Control counters.

Parameter

Description

--congestion_control

Dumps Congestion Control configuration to the ibdiagnet2.db_csv file.

--congestion_counters

Dumps Mellanox/NVIDIA Congestion Control counters to the ibdiagnet2.db_csv file. This option also activates the congestion_control option.
Congestion counters are collected if both of the following are set in the ibdiagnet configuration file:

  • congestion_counters is set to TRUE

  • congestion_control is set to FALSE

--clear_congestion_counters

Dumps Congestion Counters to the ibdiagnet2.db_csv file and clears them. This option also activates the congestion_control option.

--ppcc <filename|path|pattern>

Enables fetching PPCC (Port Programmable Congestion Control) counters.
Possible values:

1. File path - ibdiagnet loads the PPCC algorithms from the file.

2. Folder path - ibdiagnet loads all files in the directory.

3. Wildcard - ibdiagnet loads the files that match the wildcard pattern (Note: in this case, the pattern must be enclosed in quotation marks).

For more information on the supported wildcard syntax, refer to the glob manual page by typing 'man 7 glob'.
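The wildcard form can be previewed before running ibdiagnet. The following is a minimal sketch, not part of ibdiagnet: `preview_ppcc_matches` is a hypothetical helper, the candidate file names other than /tmp/file2.algo are made up for illustration, and the matching only approximates glob(7) by applying the pattern to the final path component. The actual expansion performed by ibdiagnet may differ. The quotation marks in the command line presumably keep the shell from expanding the pattern before ibdiagnet receives it.

```python
import fnmatch
import posixpath


def preview_ppcc_matches(pattern, candidate_paths):
    """Return the candidate paths selected by a wildcard such as '/tmp/*.algo'.

    Assumption: wildcards appear only in the last path component, so the
    directory part must match exactly and only the file name is globbed.
    """
    pattern_dir, pattern_name = posixpath.split(pattern)
    matches = []
    for path in candidate_paths:
        directory, name = posixpath.split(path)
        if directory == pattern_dir and fnmatch.fnmatchcase(name, pattern_name):
            matches.append(path)
    return sorted(matches)


# Hypothetical directory listing used only for this illustration.
candidates = [
    "/tmp/file1.algo",
    "/tmp/file2.algo",
    "/tmp/notes.txt",
    "/tmp/sub/file3.algo",
]
print(preview_ppcc_matches("/tmp/*.algo", candidates))
# prints ['/tmp/file1.algo', '/tmp/file2.algo']
```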

Example:

ibdiagnet --congestion_control
ibdiagnet --congestion_counters
ibdiagnet --clear_congestion_counters
ibdiagnet --congestion_counters --ppcc /tmp/file2.algo
ibdiagnet --congestion_control --ppcc '/tmp/*.algo'

Output Congestion Control:

START_CC_ENHANCED_INFO
NodeGUID,ver0Supported,CC_Capability_Mask
0x0002c9000000001d,1,0x0000000000000002
0x0002c9000000004f,1,0x0000000000000002
0x0002c90000000011,1,0x0000000000000002
END_CC_ENHANCED_INFO

START_CC_SWITCH_GENERAL_SETTINGS
NodeGUID,aqs_time,aqs_weight,en,cap_total_buffer_size
0x0002c9000000004f,0,0,0,0
0x0002c90000000041,0,0,0,0
0x0002c90000000043,0,0,0,0
END_CC_SWITCH_GENERAL_SETTINGS

START_CC_PORT_PROFILE_SETTINGS
NodeGUID,portNum,vl,mode,profile1_min,profile1_max,profile1_percent,profile2_min,profile2_max,profile2_percent,profile3_min,profile3_max,profile3_percent
0x0002c9000000004f,1,0,0,0,0,0,0,0,0,0,0,0
0x0002c9000000004f,1,1,0,0,0,0,0,0,0,0,0,0
0x0002c9000000004f,1,2,0,0,0,0,0,0,0,0,0,0
END_CC_PORT_PROFILE_SETTINGS

START_CC_SL_MAPPING_SETTINGS
NodeGUID,portNum,sl_profile_0,sl_profile_1,sl_profile_2,sl_profile_3,sl_profile_4,sl_profile_5,sl_profile_6,sl_profile_7,sl_profile_8,sl_profile_9,sl_profile_10,sl_profile_11,sl_profile_12,sl_profile_13,sl_profile_14,sl_profile_15
0x0002c9000000004f,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0x0002c9000000004f,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0x0002c9000000004f,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
END_CC_SL_MAPPING_SETTINGS

START_CC_HCA_GENERAL_SETTINGS
NodeGUID,PortGUID,portNum,en_react,en_notify
0x0002c9000000001d,0x0002c9000000001e,1,0,0
0x0002c90000000011,0x0002c90000000012,1,0,0
0x0002c90000000015,0x0002c90000000016,1,0,0
END_CC_HCA_GENERAL_SETTINGS

START_CC_HCA_RP_PARAMETERS
NodeGUID,PortGUID,portNum,clamp_tgt_rate_after_time_inc,clamp_tgt_rate,rpg_time_reset,rpg_byte_reset,rpg_threshold,rpg_max_rate,rpg_ai_rate,rpg_hai_rate,rpg_gd,rpg_min_dec_fac,rpg_min_rate,rate_to_set_on_first_cnp,dce_tcp_g,dce_tcp_rtt,rate_reduce_mionitor_period,initial_alpha_value
0x0002c9000000001d,0x0002c9000000001e,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0x0002c90000000011,0x0002c90000000012,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0x0002c90000000015,0x0002c90000000016,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
END_CC_HCA_RP_PARAMETERS

START_CC_HCA_NP_PARAMETERS
NodeGUID,PortGUID,portNum,min_time_between_cnps,cnp_sl,cnp_sl_mode
0x0002c9000000001d,0x0002c9000000001e,1,0,0,0
0x0002c90000000011,0x0002c90000000012,1,0,0,0
0x0002c90000000015,0x0002c90000000016,1,0,0,0
END_CC_HCA_NP_PARAMETERS

START_CC_HCA_STATISTICS_QUERY
NodeGUID,PortGUID,portNum,clear,cnp_ignored,cnp_handled,marked_packets,cnp_sent,timestamp,accumulators_period
0x0002c9000000001d,0x0002c9000000001e,1,1,8438294795498567466,11946806576396300733,10660184510038152731,17344637759085672224,2146452334753643592,1088665365
0x0002c90000000011,0x0002c90000000012,1,1,13850774289226306924,6716987295250300780,3875360350926614344,13023957060305061195,764498337964851634,1436934366
0x0002c90000000015,0x0002c90000000016,1,1,13520874084649801659,3138568427236183055,4818259338400718972,18218947454021603546,17720325260696739839,2265373316
END_CC_HCA_STATISTICS_QUERY
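Each section of the dump is delimited by START_<name> and END_<name> markers, with a CSV header row followed by data rows. The following is a minimal sketch of reading such sections in Python, assuming the file contains one marker or one CSV row per line. `parse_db_csv_sections` is a hypothetical helper, not an ibdiagnet API, and the full ibdiagnet2.db_csv file may contain other content that this sketch does not handle.

```python
import csv


def parse_db_csv_sections(text):
    """Map each section name to a list of row dictionaries.

    Assumption: every section starts with a line 'START_<name>', ends with
    'END_<name>', and its first inner line is the CSV header.
    """
    sections = {}
    current = None
    body = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("START_"):
            current = line[len("START_"):]
            body = []
        elif current is not None and line == "END_" + current:
            # csv.DictReader takes the first collected line as the header.
            sections[current] = list(csv.DictReader(body))
            current = None
        elif current is not None:
            body.append(line)
    return sections


sample = """\
START_CC_HCA_GENERAL_SETTINGS
NodeGUID,PortGUID,portNum,en_react,en_notify
0x0002c9000000001d,0x0002c9000000001e,1,0,0
0x0002c90000000011,0x0002c90000000012,1,0,0
END_CC_HCA_GENERAL_SETTINGS
"""

sections = parse_db_csv_sections(sample)
for row in sections["CC_HCA_GENERAL_SETTINGS"]:
    print(row["PortGUID"], row["en_react"], row["en_notify"])
```

All values are returned as strings, so counters such as cnp_sent need an explicit int() conversion before arithmetic.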

Output Port Programmable Congestion Control:

START_CC_HCA_ALGO_CONFIG_SUPPORT
NodeGUID,PortGUID,algo_en,algo_status,trace_en,counter_en,sl_bitmask,encap_len,encap_type,algo_id_0,algo_major_version_0,algo_minor_version_0,...,algo_id_15,algo_major_version_15,algo_minor_version_15
0x0002c9000000002d,0x0002c9000000002e,0,0,0,0,0x0186,8,15,32934,238,242,...,NA,NA,NA
0x0002c90000000031,0x0002c90000000032,0,1,0,0,0x8bb4,15,1,10469,170,215,...,NA,NA,NA
END_CC_HCA_ALGO_CONFIG_SUPPORT

START_CC_HCA_ALGO_CONFIG
NodeGUID,PortGUID,algo_slot,algo_en,algo_status,trace_en,counter_en,sl_bitmask,encap_len,encap_type,algo_info_text
0x0002c9000000002d,0x0002c9000000002e,0,1,0,1,0,0xe96f,12,0,"Pi9MrmDmzY"
0x0002c9000000002d,0x0002c9000000002e,1,1,1,1,1,0xdd9f,8,13,"hERqomdF"
END_CC_HCA_ALGO_CONFIG

START_CC_HCA_ALGO_CONFIG_PARAMS
NodeGUID,PortGUID,algo_slot,sl_bitmask,encap_len,encap_type,congestion_param_0,...,congestion_param_43
0x0002c9000000002d,0x0002c9000000002e,0,0x78e1,8,0,2670514607,...,NA
0x0002c9000000002d,0x0002c9000000002e,1,0x6fdb,15,8,852343172,...,NA
END_CC_HCA_ALGO_CONFIG_PARAMS

START_CC_HCA_ALGO_COUNTERS
NodeGUID,PortGUID,algo_slot,clear,sl_bitmask,encap_len,encap_type,congestion_counter_0,...,congestion_counter_43
0x0002c9000000002d,0x0002c9000000002e,1,0,0xf1dd,13,8,939773111,...,NA
0x0002c9000000002d,0x0002c9000000002e,2,0,0xb725,7,3,2936704535,...,NA
END_CC_HCA_ALGO_COUNTERS
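The sl_bitmask column is printed as a 16-bit hexadecimal value. The sketch below shows one way to expand such a value into a list of bit positions. It rests on an assumption that this page does not confirm: that bit i of the mask corresponds to service level (SL) i. `decode_sl_bitmask` is a hypothetical helper name, and the result should be treated as illustrative until checked against the PPCC documentation.

```python
def decode_sl_bitmask(mask_text, width=16):
    """Return the positions of the set bits in a hex mask such as '0x0186'.

    Assumption (not confirmed by the source): bit i set means SL i is
    selected, with bit 0 as the least significant bit.
    """
    mask = int(mask_text, 16)  # int() accepts the '0x' prefix when base=16
    return [sl for sl in range(width) if mask & (1 << sl)]


# Value taken from the first CC_HCA_ALGO_CONFIG_SUPPORT row of the sample output.
print(decode_sl_bitmask("0x0186"))
# prints [1, 2, 7, 8]
```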

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.