ibdiagnet InfiniBand Fabric Diagnostic Tool User Manual v2.13.0

NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) Support

The following ibdiagnet options can be used to dump SHARP Trees configuration and SHARP traffic counters.

Parameter

Description

--sharp

Collects the SHARP trees configuration and dumps it to ibdiagnet2.sharp and ibdiagnet2.sharp_an_info.

  • ibdiagnet2.sharp contains the SHARP distribution trees and tree QP structures.

  • ibdiagnet2.sharp_an_info contains node information.

--sharp_control_version <0|1|2>

Checks and dumps only SHARP nodes with the specified version (default 0):

  • 0—all versions

  • 1—version 1 only

  • 2—version 2 only

--sharp_opt <[csc][dsc][dscp][ad_hoc]>

Comma-separated SHARP options, available once the "--sharp" option is selected:

  • csc: Clears SHARP counters.

  • dsc: Dumps SHARP performance counters to the ibdiagnet2.db_csv file.

  • dscp: Dumps SHARP SAT performance counters per port to the ibdiagnet2.db_csv file.

  • ad_hoc: Indicates that SHARP supports ad-hoc trees; avoids warnings for tree_id duplication in the fabric.
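The options in the table above can be combined on one command line. The following Python sketch assembles such a command as an argument list and rejects values the table does not list. The helper name `build_sharp_command` is hypothetical, and the sketch assumes that option values are passed as a separate, space-delimited argument (for example `--sharp_opt dsc,dscp`); it only builds the list and does not run ibdiagnet.

```python
from typing import List, Optional, Sequence

# Values taken from the parameter table above.
SHARP_OPTS = ("csc", "dsc", "dscp", "ad_hoc")
CONTROL_VERSIONS = (0, 1, 2)


def build_sharp_command(opts: Sequence[str] = (),
                        control_version: Optional[int] = None) -> List[str]:
    """Return an argv list for an ibdiagnet SHARP run (hypothetical helper)."""
    argv = ["ibdiagnet", "--sharp"]

    if control_version is not None:
        if control_version not in CONTROL_VERSIONS:
            raise ValueError(f"unsupported control version: {control_version}")
        argv += ["--sharp_control_version", str(control_version)]

    unknown = [opt for opt in opts if opt not in SHARP_OPTS]
    if unknown:
        raise ValueError(f"unknown --sharp_opt value(s): {unknown}")
    if opts:
        # Assumption: the values are joined with commas into one argument.
        argv += ["--sharp_opt", ",".join(opts)]

    return argv


if __name__ == "__main__":
    # Dump SHARP counters, including per-port SAT counters, for version 2 nodes.
    print(" ".join(build_sharp_command(["dsc", "dscp"], control_version=2)))
```

The resulting list could be handed to `subprocess.run` on a host where ibdiagnet is installed; keeping the command as a list avoids shell quoting issues.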

Examples:

  • This example shows a 3-level SHARP tree whose root (rank 0) is Aggregation Node 0xec0d9a0300246f38.


    TreeID:0, Max Radix:2
    (0), AN:Mellanox Technologies Aggregation Node, lid:48, port guid:0xec0d9a0300246f38, Child index:0, parent QPn:0x00000000, remote parent QPn:0x00000000, radix:2
        (1), AN:Mellanox Technologies Aggregation Node, lid:52, port guid:0xec0d9a0300246f58, Child index:0, parent QPn:0x008de801, remote parent QPn:0x008de801, radix:2
            (2), AN:Mellanox Technologies Aggregation Node, lid:80, port guid:0xec0d9a030027dbb8, Child index:0, parent QPn:0x00fb6801, remote parent QPn:0x008de802, radix:0
            (2), AN:Mellanox Technologies Aggregation Node, lid:32, port guid:0xec0d9a0300090168, Child index:1, parent QPn:0x00202801, remote parent QPn:0x008de803, radix:0
        (1), AN:Mellanox Technologies Aggregation Node, lid:512, port guid:0xec0d9a0300246e38, Child index:1, parent QPn:0x008dc801, remote parent QPn:0x008de802, radix:2
            (2), AN:Mellanox Technologies Aggregation Node, lid:108, port guid:0xec0d9a030027dbd8, Child index:0, parent QPn:0x00fb6801, remote parent QPn:0x008dc802, radix:0
            (2), AN:Mellanox Technologies Aggregation Node, lid:144, port guid:0xec0d9a03000b6bf8, Child index:1, parent QPn:0x006d6801, remote parent QPn:0x008dc803, radix:0

  • This example shows SHARP port counters:


    -------------------------------------------------------
    AggNodeDesc=Mellanox Technologies Aggregation Node Lid=32 GUID=0xec0d9a0300090168
    -------------------------------------------------------
    packet_sent=0x0000000000000000
    ack_packet_sent=0x0000000000000000
    retry_packet_sent=0x0000000000000000
    rnr_event=0x0000000000000000
    timeout_event=0x0000000000000000
    oos_nack_rcv=0x0000000000000000
    rnr_nack_rcv=0x0000000000000000
    packet_discard_transport=0x0000000000000000
    packet_discard_sharp=0x0000000000000000
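For post-processing, both example dumps can be read with regular expressions. The Python sketch below is an illustration only: the function names `parse_tree` and `parse_counters` are hypothetical, the patterns are derived solely from the two examples above rather than from a format specification, and the parent/child links are inferred from the rank number in parentheses (an entry is attached to the most recent entry of the next lower rank).

```python
import re
from typing import Dict, List

# Pattern inferred from the tree example above; not an official grammar.
ENTRY_RE = re.compile(
    r"\((?P<rank>\d+)\), AN:(?P<desc>.*?), lid:(?P<lid>\d+), "
    r"port guid:(?P<guid>0x[0-9a-fA-F]+), Child index:(?P<child>\d+), "
    r"parent QPn:(?P<parent_qpn>0x[0-9a-fA-F]+), "
    r"remote parent QPn:(?P<remote_qpn>0x[0-9a-fA-F]+), radix:(?P<radix>\d+)"
)
COUNTER_RE = re.compile(r"(\w+)=(0x[0-9a-fA-F]+)")


def parse_tree(text: str) -> List[dict]:
    """Return tree entries in dump order, each with the GUIDs of its children."""
    nodes: List[dict] = []
    stack: List[dict] = []  # stack[r] is the latest node seen at rank r
    for m in ENTRY_RE.finditer(text):
        node = {"rank": int(m["rank"]), "lid": int(m["lid"]), "guid": m["guid"],
                "radix": int(m["radix"]), "children": []}
        del stack[node["rank"]:]          # drop deeper or equal ranks
        if stack:                         # the remaining top entry is the parent
            stack[-1]["children"].append(node["guid"])
        stack.append(node)
        nodes.append(node)
    return nodes


def parse_counters(text: str) -> Dict[str, int]:
    """Return counter name -> integer value, skipping the GUID header field."""
    return {k: int(v, 16) for k, v in COUNTER_RE.findall(text) if k != "GUID"}


# Shortened samples copied from the examples above.
AN = "AN:Mellanox Technologies Aggregation Node"
SAMPLE_TREE = (
    "TreeID:0, Max Radix:2 "
    f"(0), {AN}, lid:48, port guid:0xec0d9a0300246f38, Child index:0, "
    "parent QPn:0x00000000, remote parent QPn:0x00000000, radix:2 "
    f"(1), {AN}, lid:52, port guid:0xec0d9a0300246f58, Child index:0, "
    "parent QPn:0x008de801, remote parent QPn:0x008de801, radix:2 "
    f"(2), {AN}, lid:80, port guid:0xec0d9a030027dbb8, Child index:0, "
    "parent QPn:0x00fb6801, remote parent QPn:0x008de802, radix:0"
)
SAMPLE_COUNTERS = (
    "AggNodeDesc=Mellanox Technologies Aggregation Node Lid=32 "
    "GUID=0xec0d9a0300090168 packet_sent=0x0000000000000000 "
    "retry_packet_sent=0x0000000000000000"
)

if __name__ == "__main__":
    for node in parse_tree(SAMPLE_TREE):
        print("  " * node["rank"], node["guid"], "children:", node["children"])
    print(parse_counters(SAMPLE_COUNTERS))
```

Because the patterns match on field names rather than line breaks, the same functions work whether the dump is read as one line or as one entry per line.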

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.