Changes and New Features

IBUtils2 Utility Release Notes v2.17

v2.17.0

NVLink5 MADs

Added support for NVLink5 MADs.

FLID

Added support for compression ratio for FLIDs.

FLID

Added pFRN support for FLIDs.

Cables

Enabled PRTL registers for calculating cable length.

PHY Plugin

Added support for MPCNT register.

PHY Plugin

Allowed MRCS register for HCAs.

ibdiagpath

Deprecated and removed ibdiagpath.

ibdiagnet

Added support for automatically building the scope from a hosts list (--host_file and --node_descr_filter).

v2.16.0

PHY Plugin

Added support for MRCS register.

PHY Plugin

Added support for PRTL register.

PHY Plugin

Updated PDDR register.

Cables

Reduced false-positive errors about an invalid port in the cable plugin for split switches.

Cables

Changed cable info fetch priority from cable plugin to PHY plugin.

v2.15.0

Counters

Added support for PPCNT InfiniBand general counters.

DGX-H100

Added support for DGX-H100 systems in Rail Validation.

db_csv File

Added the option to report to the db_csv file if no BER threshold is found for a port.

SLRG Register

Excluded the SLRG register from the default PHY list.

MSPS Register

Updated MSPS register (added the power_consumption field to the dump).

v2.14.0

PCIe Connectivity Health Report

Introduced a new report on PCIe connectivity health based on a comparison of enabled/active speed and width.
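
The comparison behind this report can be pictured with a minimal Python sketch. The field names and the rule that a link is flagged when its active value is below its enabled value are assumptions made for illustration; they are not ibdiagnet's actual data model or report format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PcieLink:
    # Hypothetical fields; the real report's column names may differ.
    name: str
    enabled_speed_gts: float  # highest speed the link is allowed to use, in GT/s
    active_speed_gts: float   # speed the link actually negotiated, in GT/s
    enabled_width: int        # number of lanes enabled
    active_width: int         # number of lanes actually active


def pcie_health_issues(link: PcieLink) -> List[str]:
    """Return findings for a link that runs below what is enabled."""
    issues = []
    if link.active_speed_gts < link.enabled_speed_gts:
        issues.append(
            f"{link.name}: active speed {link.active_speed_gts} GT/s "
            f"is below enabled speed {link.enabled_speed_gts} GT/s"
        )
    if link.active_width < link.enabled_width:
        issues.append(
            f"{link.name}: active width x{link.active_width} "
            f"is below enabled width x{link.enabled_width}"
        )
    return issues
```

A link whose active speed and width both match the enabled values yields an empty list; a link that trained at a lower speed or with fewer lanes yields one finding per mismatch.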

Cables

Removed validation of transceiver firmware versions on the same cable.

v2.13.0

Counters

Added support for "Fast Link Recovery" counters.

Counters

Added support for On-Demand-Paging (ODP) counters.

Reports

Added reports for mismatch in cable firmware versions.

Reports

Added reports for cable length.

v2.12.0

Programmable Port Congestion Control

Added support for fetching Programmable Port Congestion Control (PPCC) counters at GA level.

FLID

Added support for FLID at GA level.

SHARP

Added support for “ad-hoc” trees for SHARP.

iblinkinfo

Added a new dump file in the format of iblinkinfo.

v2.11.0

Programmable Port Congestion Control

Added support for fetching Programmable Port Congestion Control (PPCC) counters.

This feature is at Technical Preview level.

FLID

Added support for FLID.

This feature is at Technical Preview level.

GMP MADs

Added support for sending GMP MADs from user-space (Kernel bypass).

This feature is at Alpha level.

In-band auto-discovery

Added in-band auto-discovery for switch management IP.

This feature is available only in the UFM package.

PHY plugin

Added support for IB cable/transceiver diagnostics: SNR and EOM (PEMI).

ibdiagnet's runtime statistics

Added ibdiagnet's runtime statistics (CPU utilization, Stage duration, MAD statistics).

v2.10.0

Cable Information

Added fetching of cable info from Common Management Interface Specification (CMIS) compatible cables connected to an HCA.

smparquery Utility

The smparquery utility has been moved from the deprecated "ar_info" package to the "ibutils2" package. The binary is now located at "/usr/bin/smparquery" instead of "/usr/sbin/smparquery".

topodiff Tool

The topodiff tool now supports more than 4 HCA systems.

Routers stage

"Routers stage" will be launched by default.

Auto-detection of Outgoing IB Port

Improved the auto-detection of outgoing IB port.

v2.9.0

Proactive FRN (pFRN)

Added support for Proactive FRN (pFRN) configuration and counters.

HashBasedForwarding

Added support for HashBasedForwarding (HBF).

PortVLXmitWait

Added support for the vendor-specific 64-bit PortVLXmitWait counter.

Hierarchy Info

Added support for Hierarchy Info (A15 InfiniBand spec):

  • Port Hierarchy Info

  • Physical Hierarchy Info

PHY plugin

Added support for up to 16 fans (START_FANS_SPEED section).

SHARP trees validation

Added new validations for SHARP trees in fat-tree topology with "parallel" links between switches.

VL Arbitration

Added support for VL Arbitration.

Cables

Modified the format for temperature thresholds. It now uses a human-readable format.

PMPortSamplesControl

Added PMPortSamplesControl to db_csv.

FDBS file

Enabled FDBS file creation by default.

DFP dump file

Improved DFP dump file.

Dump file

Improved the performance of dump file creation.

v2.8.0

Cables

Added support for CMIS cable.

PortVLXmit

Added support for PortVLXmitFlowCtlUpdateErrors.

PortVLXmit

Added support for PortVLXmitWait.

PHY plugin

Updated SLRG_16 and SLTP_16 registers.

PHY plugin

Added support for PCI Diagnostic Data Pages.

Adapter Cards: Socket-Direct

Added a list of Socket-Direct HCAs to DB_CSV.

Adapter Cards: Socket-Direct

Socket-Direct HCAs are excluded from Rail-Optimized Topology validation.

PM Stage Reports Overflow & Threshold

PM Stage reports overflow & threshold for 3 counters only:

  • SyncHeaderErrorCounter

  • UnknownBlockCounter

  • port_fec_uncorrectable_block_counter

Fat Tree Validation

Improved "Connectivity group" detection.

ibnetdiscover

The ibnetdiscover file will not be generated when a scope file is used.

Fabric Summary

Fabric Summary now includes the number of Socket-Direct HCAs.

Error/Warning Reporting

Error/warning reporting to the screen and log files is limited to 5 of each type; DB_CSV includes all errors.
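
The effect of this limit can be sketched in a few lines of Python. The function name, the (type, message) input shape, and the returned lists are hypothetical; only the limit of 5 per type and the fact that the full record keeps everything come from the note above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

MAX_SHOWN_PER_TYPE = 5  # limit stated in the release note


def split_reports(errors: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Return (lines for screen and log, lines for the complete record).

    `errors` yields (error_type, message) pairs in the order they were found.
    """
    shown_count: Dict[str, int] = defaultdict(int)
    screen: List[str] = []
    full: List[str] = []
    for err_type, message in errors:
        line = f"{err_type}: {message}"
        full.append(line)  # the complete record keeps every entry
        if shown_count[err_type] < MAX_SHOWN_PER_TYPE:
            screen.append(line)  # only the first few of each type are shown
            shown_count[err_type] += 1
    return screen, full
```

In practice this means the screen and log output are a quick summary, and the db_csv file is the place to look when a full list of findings is needed.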

v2.7.0

Link Speed

Added support for NDR InfiniBand link speed.

Fat Tree Topology

Enabled a new Fat Tree Topology validation tool.

Virtualization Stage

Redesigned the Virtualization stage to asynchronous mode.

AGUID

AGUID stage is disabled by default.

To enable it, use the '--aguid' parameter.

db_csv

db_csv will now contain information about connected ports only.

v2.6.1

ibdiagnet

  • Enabled Adaptive Routing validation

  • Performance improvement of routing checking

  • Added new counters and diagnostic information:

    • SHARP: “SAT” (“Streaming Aggregation”) counters

    • PHY: Maximum PLR per second field in PPCNT

    • PHY: SLLM register

    • PHY: New SymbolBER thresholds

    • Adaptive Routing: PortARTrails counter

  • Added support for SHARP security (AMKEY)

  • Enabled reporting port counter differences when using "--pm_pause_time" (PM_DELTA section in db_csv)

  • Added Dragonfly+ Topology Validation (--dfp, --dfp_opt [<max_cas>])

  • Added the option to report the version on the screen and in log file
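
The counter-difference reporting in the list above amounts to sampling port counters twice, separated by a pause, and subtracting. The following Python sketch illustrates that idea only: `read_counters` is a hypothetical callable standing in for whatever fetches the counters, and the sketch ignores counter wraparound. It is not ibdiagnet's implementation.

```python
import time
from typing import Callable, Dict


def counter_deltas(
    read_counters: Callable[[], Dict[str, int]],
    pause_seconds: float,
) -> Dict[str, int]:
    """Sample counters, wait, sample again, and return per-counter differences."""
    before = read_counters()
    time.sleep(pause_seconds)
    after = read_counters()
    # Only counters present in both samples are compared.
    return {name: after[name] - before[name] for name in before if name in after}
```

A counter that did not change during the pause shows a difference of zero, which makes actively incrementing error counters easy to tell apart from historical totals.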

ibtopodiff

Added support for “stable” names of RDMA devices based on PCI/slot/function location.

ibnetdiscover dump file (created by ibdiagnet)

The dump file now includes Virtual Port info.

v2.5.1

Rail Optimized Topology Validation

Checks links between compute nodes and leaf switches to validate a rail-optimized topology (--rail_validation, --rail_validation_opt [<regex>]).

Service Level

Added SL customization for GMP MADs in ibdiagnet (--sl).

General

  • Added output similar to the results of "ibnetdiscover"

  • ibdiagnet: Disabled the output by default for AR/FDBS files in routing stage

  • Moved SHARP performance counters to db_csv

  • FEC_MODE section is now dumped to db_csv by default

  • Improved AR connectivity check

v2.4.0

General

  • Flexible output control options (--enable_output, --disable_output, --path)

  • Discovery only mode (--discovery_only)

  • Support for MLNX Congestion Control counters

  • “Fabric Summary” is enabled by default.

  • Dates and versions in “nodes_info” file are printed in human-readable format.

  • Dump files include ibdiagnet version and command line parameters.

  • Added split mode to IBNL for “InfiniBand Smart Director Switches” (CS8500)

Performance Improvements

The performance of the following steps in ibdiagnet has been improved:

  • Routing validation

  • Network discovery

  • Virtualization stage

  • Dump creation

| Version | Tool | Parameter Name | Status | Description |
|---|---|---|---|---|
| 2.16.0 | ibdiagpath | --lids_list | New | Uses the lids utility to create a scope file. |
| 2.15.0 | ibdiagnet | --skip | Deprecated | vs_cap_smp & vs_cap_gmp values will be ignored. SMP & GMP capabilities will be retrieved every time. |
| 2.15.0 | ibdiagpath | --adaptive_routing | New | Uses adaptive routing tables to look up possible paths. |
| 2.12.0 | ibdiagnet | --sharp_opt | Changed | Added a new option "[ad_hoc]", which indicates ad-hoc tree support in SHARP and prevents warnings about tree_id duplication in the fabric. Value: <[csc][dsc][dscp][ad_hoc]> |
| 2.11.0 | ibdiagnet | --ppcc | New | Enables fetching PPCC (Programmable Port Congestion Control) counters. Possible values: file path (ibdiagnet loads the PPCC algorithms from the file); folder path (ibdiagnet loads all files from the directory); wildcard (ibdiagnet loads files according to wildcard matching). For more information on the supported wildcard syntax, refer to the manual page by typing `man 7 glob`. |
| 2.11.0 | ibdiagnet | --verbs | Experimental | Sends and receives GMPs via ibverbs instead of the ibumad library. |
| 2.9.0 | ibdiagnet | --r_opt | Deprecated | Value: skip_far. The same functionality is supported by the --disable_output option. |
| 2.9.0 | ibdiagnet | --r_opt | Removed | Values: vs, far, rn, drnc, crnc |
| 2.9.0 | ibdiagnet | --pm_get_all | New | Gets all PM counters by activating the following flags: --per_slvl_cntrs --sc --extended_speeds pm_per_lane |
| 2.9.0 | ibdiagnet | --pm_clear_all | New | Clears all PM counters by activating the following flags: --scr --pc |
| 2.9.0 | ibdiagnet | --ft_roots_regex_opt | New | Regular expression to select topology root nodes. To be applied to switch descriptions. Value: <regular expression> |
| 2.8.0 | | --r_opt dump_only_skip_routing_tables | Added | Skips retrieval of routing tables (LFTs). |
| 2.8.0 | | --r_opt rn | Deprecated | Dumps routing notification data to file (enabled by default). |
| 2.8.0 | | --r_opt drnc | Deprecated | Dumps routing notification port counters to file (enabled by default). |
| 2.7.0 | ibdiagnet | --ft | New | Enables Fat Tree Topology Validation (default: disabled). |
| 2.7.0 | ibdiagnet | --aguid | New | Collects AGUIDs. |
| 2.7.0 | ibdiagnet | --enable_spst | Removed | SPST mode is enabled by default. The option was deprecated in 2.6.1. |
| 2.6.1 | ibdiagnet | --smp_window | Upper limit and default changed | Max: 256; default: 16 |
| 2.6.1 | ibdiagnet | --gmp_window | Upper limit and default changed | Max: 16384; default: 256 |
| 2.6.1 | ibdiagnet | --am_key | New | Specifies the default AMKEY for the fabric. |
| 2.6.1 | ibdiagnet | --am_key_file | New | Specifies the path to the AMKEY file (AMKEY per GUID). |
| 2.6.1 | ibdiagnet | --smdb | New | Specifies the path to the OpenSM SMDB file (required for Adaptive Routing & Dragonfly+ Topology validation). |
| 2.6.1 | ibdiagnet | --ber_threshold_table | New | Specifies the path to the BER thresholds table file (BER per FEC). |
| 2.6.1 | ibdiagnet | --create_ber_threshold_table | New | Creates a template file of the BER threshold table. |
| 2.6.1 | ibdiagnet | --enable_spst | Deprecated | SPST mode is enabled by default. |
| 2.6.1 | ibdiagnet | --dfp | New | Enables DFP Topology Validation (default: disabled). |
| 2.6.1 | ibdiagnet | --dfp_opt | New | Specifies comma-separated DFP Topology Validation options. |
| 2.6.1 | ibdiagnet | --dfp_opt <max_cas> | - | Specifies the maximum number of CAs for a “Root” switch in a Dragonfly+ island (default: 1). This parameter is mutually exclusive with --smdb. |
| 2.6.1 | ibdmchk | --FAR | New | Adds support for an input FAR file. |
| 2.5.1 | ibdiagnet | --sl | New | Specifies the SL to be used (default=0). |
| 2.5.1 | ibdiagnet | --rail_validation | New | Enables Rail Optimized Topology Validation (default: disabled). |
| 2.5.1 | ibdiagnet | --rail_validation_opt | New | Specifies comma-separated Rail Optimized Topology Validation options. |
| 2.5.1 | ibdiagnet | --clear_congestion_counters | New | Dumps Congestion Counters and clears them. |
| 2.5.1 | ibdiagnet | --fec_mode | Deprecated | The FEC_MODE section will be dumped to "db_csv" by default. |
| 2.5.1 | ibdiagnet | --rail_validation_opt <regex> | - | Specifies a regular expression to filter HCA nodes from reports. To be applied to HCA node descriptions. |
| 2.4.0 | ibdiagnet | --enable_output | New | Enables creation of a specific dump file. |
| 2.4.0 | ibdiagnet | --disable_output | New | Disables a specific dump file. |
| 2.4.0 | ibdiagnet | --path | New | Sets a custom path for a specific dump file. |
| 2.4.0 | ibdiagnet | --discovery_only | New | Discovers the IB fabric, saves topology information into the “db_csv” file, and exits. |
| 2.4.0 | ibdiagnet | --smp_window | Upper limit changed | New max is 128 |
| 2.4.0 | ibdiagnet | --gmp_window | Upper limit changed | New max is 8192 |
| 2.4.0 | ibtopodiff | --ibnl_dir | New | Sets the path for IBNL files. |
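
The three value forms listed for --ppcc (file path, folder path, wildcard) can be illustrated with a short Python sketch built on the standard `glob` module, whose patterns are similar in spirit to those described in glob(7). The function name and the order of the checks are assumptions made for illustration; these notes do not describe how ibdiagnet itself resolves the value.

```python
import glob
import os
from typing import List


def resolve_algorithm_files(value: str) -> List[str]:
    """Expand a file path, folder path, or wildcard into a sorted list of files."""
    if os.path.isdir(value):
        # Folder path: every regular file directly inside the directory.
        candidates = [os.path.join(value, name) for name in os.listdir(value)]
    elif os.path.isfile(value):
        # File path: exactly that file.
        candidates = [value]
    else:
        # Anything else is treated as a wildcard pattern.
        candidates = glob.glob(value)
    return sorted(path for path in candidates if os.path.isfile(path))
```

When passing a wildcard on a real command line, quoting the pattern keeps the shell from expanding it before the tool sees it.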

© Copyright 2024, NVIDIA. Last updated on May 6, 2024.