ibdiagnet InfiniBand Fabric Diagnostic Tool User Manual v2.11.0
IBUtils2 Utility Release Notes v2.15

Fat-Tree Topology Validation

This section specifies the options for Fat-Tree topology validation. Topology validation checks that the provided topology is a properly connected Fat-Tree topology. It detects the tree structure, its “connectivity groups” and neighborhoods, and their link issues. It also reports the network's theoretical “bisectional” bandwidth.

A newly generated output file, ibdiagnet2.fat_tree, contains details about switch uplink/downlink issues and the tree structure by level.
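As a rough illustration of what a downlink-count issue looks like, the sketch below groups the spine switches of one connectivity group by their number of downlinks and lists those that fall short. It is not ibdiagnet's implementation: the function name `downlink_mismatches`, the input dictionary, and the choice of taking the largest observed count as the expected value are all assumptions made for this example.

```python
from collections import defaultdict


def downlink_mismatches(downlinks_by_switch):
    """Group switches by downlink count, keeping only those that differ from the expected count.

    Assumption: the expected count is the largest count seen in the group.
    """
    expected = max(downlinks_by_switch.values())
    by_count = defaultdict(list)
    for guid, count in downlinks_by_switch.items():
        if count != expected:
            by_count[count].append(guid)
    return expected, dict(by_count)


# Hypothetical spine switches of one connectivity group and their downlink counts.
spines = {
    "0x0002c900000012ec": 39,
    "0x0002c900000012f4": 40,
    "0x0002c9000000130c": 38,
    "0x0002c9000000132c": 36,
}

expected, mismatches = downlink_mismatches(spines)
print(f"expected {expected} downlinks")
for count in sorted(mismatches, reverse=True):
    print(f"{count} downlinks: {' '.join(mismatches[count])}")
```

The same grouping applies to uplinks on the line switches; only the input counts change.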

| Parameter | Description |
|---|---|
| --ft | Provides a report of the fabric Fat-Tree analysis. Data will be dumped to the ibdiagnet2.fat_tree file. Warnings and errors will be dumped to the ibdiagnet2.log and ibdiagnet2.fat_tree files. |
| --ft_roots_regex_opt | The regular expression used to select Fat-Tree root nodes. Only nodes matching the regular expression are taken as roots. |
| --smdb <file> | Loads Fat-Tree roots from the subnet manager SMDB file. The routing engine reported in the opensm-smdb.dump file should be one of the following: “Fat-Tree”, “Adaptive Routing Fat-Tree”, “UPDN”, “Adaptive Routing UPDN”. |
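Before passing a pattern to the root-selection option, it can help to check which switches it would match. The following is a minimal sketch using Python's `re` module. It assumes the pattern is matched against the switch names that appear in the report (for example `ibsw1105-s15/U1`) and that the regular-expression dialect behaves like Python's for simple patterns; neither is confirmed by this section. The node list and the pattern are hypothetical.

```python
import re

# Hypothetical switch names, modeled on the sample output in this section.
node_names = [
    "ibsw1105-s15/U1",
    "ibsw1105-s17/U1",
    "ibsw0905-s37/U1",
    "ibsw0905-s40/U1",
]

# Assumed pattern: treat every switch whose name starts with "ibsw1105-" as a root.
roots_regex = re.compile(r"^ibsw1105-")

# Keep only the names the pattern matches.
roots = [name for name in node_names if roots_regex.search(name)]
print(roots)
```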

Example:

ibdiagnet --ft

  • ibdiagnet's Output:

    Fat-Tree Topology Validation
    -I- Fat-Tree topology detection finished successfully
    -I- 3 level Fat-Tree was discovered:
            rank: 0(Roots) #switches: 40
            rank: 1 #switches: 40
            rank: 2 #switches: 29
    -E- Fat-Tree topology validation finished with errors
    -E- Invalid link between connectivity group 7 (GUID: 0x0002c90000000024 port:  16) and group 6 (GUID: 0x0002c900000001a0 port: 21)
    -E- Invalid link between connectivity group 7 (GUID: 0x0002c90000000024 port: 17) and group 6 (GUID: 0x0002c900000001a0 port: 22)
    -E- Invalid link between connectivity group 7 (GUID: 0x0002c90000000024 port: 18) and group 6 (GUID: 0x0002c900000001a0 port: 23)
    -E- Invalid link between connectivity group 7 (GUID: 0x0002c90000000024 port: 19) and group 6 (GUID: 0x0002c900000001a0 port: 24)
    -E- Invalid link between connectivity group 7 (GUID: 0x0002c90000000024 port: 20) and group 6 (GUID: 0x0002c900000001a0 port: 25)
    -E- Connectivity group 6: missing link between switches (GUID: 0x0002c90000000028) and (GUID: 0x0002c900000001a0)
    -E- Invalid link between connectivity group 6 (GUID: 0x0002c90000000028 port: 16) and group 7 (GUID: 0x0002c9000000019c port: 21)
    -E- Invalid link between connectivity group 6 (GUID: 0x0002c90000000028 port: 17) and group 7 (GUID: 0x0002c9000000019c port: 22)
    -E- Invalid link between connectivity group 6 (GUID: 0x0002c90000000028 port: 18) and group 7 (GUID: 0x0002c9000000019c port: 23)
    -E- Invalid link between connectivity group 6 (GUID: 0x0002c90000000028 port: 19) and group 7 (GUID: 0x0002c9000000019c port: 24)
    -E- Invalid link between connectivity group 6 (GUID: 0x0002c90000000028 port: 20) and group 7 (GUID: 0x0002c9000000019c port: 25)
    -E- Connectivity group 7: missing link between switches (GUID: 0x0002c90000000024) and (GUID: 0x0002c9000000019c)
    -E- For more errors see the dump file: ibdiagnet2.fat_tree

    -I- Calculated Fat-Tree bisectional bandwidth: 1 Gbps

  • fat_tree file details on “connectivity groups”/neighborhoods and their switch uplink/downlink issues:

    -E- Connectivity group 0: spines with different number of downlinks (expected  40 downlinks)
            39 downlinks: 0x0002c900000012ec 0x0002c900000012fc 0x0002c90000001304 0x0002c90000001314 0x0002c90000001324 0x0002c9000000131c 0x0002c90000001334 0x0002c9000000133c 0x0002c90000001344 0x0002c9000000134c 0x0002c90000001354
            38 downlinks: 0x0002c9000000130c
            36 downlinks: 0x0002c9000000132c

    -E- Connectivity group 0: lines with different number of uplinks (expected  15 uplinks)
            14 uplinks: 0x0002c900000011cc 0x0002c900000011dc 0x0002c900000011e4 0x0002c900000011ec 0x0002c90000001234 0x0002c900000011a4 0x0002c9000000123c 0x0002c9000000124c
                  0x0002c90000001254 0x0002c900000011c4 0x0002c9000000125c 0x0002c90000001264 0x0002c9000000126c 0x0002c90000001274 0x0002c90000001284 0x0002c9000000110c
                  0x0002c9000000128c 0x0002c90000001114 0x0002c90000001294 0x0002c9000000111c 0x0002c900000012a4 0x0002c9000000117c 0x0002c90000001184 0x0002c9000000114c
                  0x0002c9000000119c 0x0002c90000001154
            13 uplinks: 0x0002c900000011d4 0x0002c90000001244 0x0002c900000011b4 0x0002c900000011bc 0x0002c9000000127c 0x0002c90000001124 0x0002c9000000112c 0x0002c9000000113c
                  0x0002c9000000118c
            12 uplinks: 0x0002c900000011ac 0x0002c90000001134 0x0002c90000001194 0x0002c9000000129c

    -E- Connectivity group 0: lines with different number of downlinks (expected  14 downlinks)
            13 downlinks: 0x0002c900000011cc 0x0002c90000001234 0x0002c9000000123c 0x0002c900000011b4 0x0002c900000011c4 0x0002c90000001274 0x0002c9000000111c 0x0002c9000000113c
                  0x0002c9000000129c
            12 downlinks: 0x0002c9000000112c 0x0002c9000000117c 0x0002c9000000119c

  • fat_tree file tree structure - switches by rank:

    rank: 0 (Roots)size: 28
         0x0002c900000012ec -- ibsw1105-s15/U1
         0x0002c900000012f4 -- ibsw1105-s17/U1
         0x0002c900000012fc -- ibsw1105-s20/U1
         0x0002c90000001304 -- ibsw1105-s22/U1
         0x0002c9000000130c -- ibsw1105-s25/U1
         0x0002c90000001314 -- ibsw1105-s27/U1
         0x0002c90000001324 -- ibsw1105-s32/U1
         0x0002c9000000131c -- ibsw1105-s30/U1
         0x0002c9000000132c -- ibsw1105-s35/U1
         0x0002c90000001334 -- ibsw1105-s37/U1
         0x0002c9000000133c -- ibsw1105-s40/U1
         0x0002c90000001344 -- ibsw1105-s42/U1

  • fat_tree file tree structure - “connectivity groups”/neighborhoods by rank:

    on ranks (0, 1) -- connectivity groups: 2
         connectivity group: 0
               spines: 14 switches
                      0x0002c900000012ec -- ibsw1105-s15/U1
                      0x0002c900000012f4 -- ibsw1105-s17/U1
                      0x0002c900000012fc -- ibsw1105-s20/U1
                      0x0002c90000001304 -- ibsw1105-s22/U1
                      0x0002c9000000130c -- ibsw1105-s25/U1
                      0x0002c90000001314 -- ibsw1105-s27/U1
                      0x0002c90000001324 -- ibsw1105-s32/U1
                      0x0002c9000000131c -- ibsw1105-s30/U1
                      0x0002c9000000132c -- ibsw1105-s35/U1
                      0x0002c90000001334 -- ibsw1105-s37/U1
                      0x0002c9000000133c -- ibsw1105-s40/U1
                      0x0002c90000001344 -- ibsw1105-s42/U1
                      0x0002c9000000134c -- ibsw1105-s45/U1
                      0x0002c90000001354 -- ibsw1105-s47/U1
               lines: 40 switches
                      0x0002c900000011cc -- ibsw0905-s37/U1
                      0x0002c900000011d4 -- ibsw0905-s40/U1
                      0x0002c900000011dc -- ibsw0905-s42/U1
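The `-E- Invalid link` lines in the console output of this example can be summarized with a short script, which is useful when many ports between the same two switches are miswired. The following is a minimal sketch, assuming the line format stays exactly as in the sample above; `invalid_links` and `links_per_switch_pair` are hypothetical helper names and are not part of ibdiagnet.

```python
import re
from collections import Counter

# Pattern for the "-E- Invalid link ..." lines as they appear in the sample output.
INVALID_LINK = re.compile(
    r"-E- Invalid link between connectivity group (\d+) "
    r"\(GUID: (0x[0-9a-fA-F]+) port:\s*(\d+)\) and group (\d+) "
    r"\(GUID: (0x[0-9a-fA-F]+) port:\s*(\d+)\)"
)


def invalid_links(text):
    """Return one (group_a, guid_a, port_a, group_b, guid_b, port_b) tuple per invalid link."""
    return [
        (int(ga), guid_a, int(pa), int(gb), guid_b, int(pb))
        for ga, guid_a, pa, gb, guid_b, pb in INVALID_LINK.findall(text)
    ]


def links_per_switch_pair(links):
    """Count invalid links for each (guid_a, guid_b) switch pair."""
    return Counter((guid_a, guid_b) for _, guid_a, _, _, guid_b, _ in links)


# Three lines copied from the sample output; only the first two are invalid-link errors.
sample = """\
-E- Invalid link between connectivity group 7 (GUID: 0x0002c90000000024 port:  16) and group 6 (GUID: 0x0002c900000001a0 port: 21)
-E- Invalid link between connectivity group 7 (GUID: 0x0002c90000000024 port: 17) and group 6 (GUID: 0x0002c900000001a0 port: 22)
-E- Connectivity group 6: missing link between switches (GUID: 0x0002c90000000028) and (GUID: 0x0002c900000001a0)
"""

links = invalid_links(sample)
for pair, count in links_per_switch_pair(links).items():
    print(pair, count)
```

In practice the text would be read from ibdiagnet2.log or ibdiagnet2.fat_tree instead of the inline `sample` string.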

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.