Creating a Topology File

InfiniBand Cluster Bring-up Procedure

The topology file describes the connections between the different cluster elements. It provides standard documentation of the topology plan, lets you verify that the implementation matches the plan (during the deployment and maintenance phases), and helps map and visualize the topology when troubleshooting.

To generate the topo file from the Point-To-Point (PTP) xls file, download the Python script from here.

At the top of the script's source code there is a dictionary named 'hcaPortMapping'. It maps each HCA port enumeration (from the previous section) to the real HCA interface name in mlx format, for example, mlx5_1/P1.

The dictionary has the following structure:


hcaPortMapping = {
    'U1/P<HCA-enumeration1>': '<real-HCA-interface1>/P1',
    'U1/P<HCA-enumeration2>': '<real-HCA-interface2>/P1',
    ...
}

For example:


hcaPortMapping = {
    'U1/P1': 'mlx5_3/P1',
    'U1/P2': 'mlx5_1/P1'
}

This means that:

  • In the PTP file, a connection specified for a host's enumerated HCA port 1 is attached to interface mlx5_3 of that host.

  • In the PTP file, a connection specified for a host's enumerated HCA port 2 is attached to interface mlx5_1 of that host.

Before running the parser script, make sure that 'hcaPortMapping' correctly maps every HCA port enumeration in your PTP file to the proper HCA interface, as described above.
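The coverage check described above can be sketched in a few lines of Python. This is only an illustration: the helper name find_unmapped_ports and the list ports_in_ptp are hypothetical and are not part of parse_ptp_file.py. The sketch assumes you have already collected the enumerated HCA ports used in your PTP file by some other means.

```python
# Hypothetical pre-flight check; not part of parse_ptp_file.py.
# The mapping below repeats the example from this section.
hcaPortMapping = {
    'U1/P1': 'mlx5_3/P1',
    'U1/P2': 'mlx5_1/P1',
}


def find_unmapped_ports(ptp_ports, mapping):
    """Return, sorted, the enumerated HCA ports that have no entry in mapping."""
    return sorted(set(ptp_ports) - set(mapping))


# Enumerated ports as they might appear in a PTP file (illustrative values).
ports_in_ptp = ['U1/P1', 'U1/P2', 'U1/P3']

missing = find_unmapped_ports(ports_in_ptp, hcaPortMapping)
if missing:
    print("Add these ports to hcaPortMapping:", ", ".join(missing))
```

With the illustrative values above, 'U1/P3' has no entry in the dictionary, so it is reported as missing.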

To create the topology file, use the parse_ptp_file.py script:


python parse_ptp_file.py parse -f ptp-data.xls -of -mhp

The script writes the topo file to "output.topo". Give the file a meaningful name (name, date, etc.) and save it in an accessible location for future use (the file is used later for validations, etc.).
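One possible way to give the output a dated name is sketched below. The naming scheme (<name>_<YYYY-MM-DD>.topo), the helper names, and the destination directory are assumptions made for illustration; use whatever convention suits your site.

```python
import shutil
from datetime import date
from pathlib import Path


def archived_topo_name(cluster_name, day):
    """Build a file name such as 'cluster01_2024-05-28.topo'."""
    return f"{cluster_name}_{day.isoformat()}.topo"


def archive_topo(src, dest_dir, cluster_name, day=None):
    """Move the generated topo file into dest_dir under a dated name."""
    day = day or date.today()
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / archived_topo_name(cluster_name, day)
    shutil.move(str(src), str(target))
    return target


# Example call (hypothetical names and paths):
# archive_topo("output.topo", "topo-archive", "cluster01")
```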

Example of a topology file:


MQM9700 cl02s0lclf01
    CFG : main=4x
    P1 -4x-100G-> HCA_mlx5_8 cl020ldgx01 U1/P1
    P2 -4x-100G-> HCA_mlx5_8 cl02s0ldgx02 U1/P1
    P3 -4x-100G-> HCA_mlx5_8 cl02s0ldgx03 U1/P1
    P4 -4x-100G-> HCA_mlx5_8 cl020ldgx04 U1/P1
    P5 -4x-100G-> HCA_mlx5_8 cl02s0ldgx05 U1/P1
    P6 -4x-100G-> HCA_mlx5_8 cl02s0ldgx06 U1/P1
    P7 -4x-100G-> HCA_mlx5_8 cl020ldgx07 U1/P1
    P8 -4x-100G-> HCA_mlx5_8 cl02s0ldgx08 U1/P1
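To inspect the link entries of a topo file programmatically, a small parser along the following lines may help. It is a sketch that assumes each link entry has the layout shown in the example above (local port, width, speed, peer type, peer name, peer port). The function name, field names, and sample host names are placeholders chosen for illustration, not a definition of the file format.

```python
import re

# Illustrative sample in the same layout as the example above;
# the switch and host names are placeholders.
SAMPLE_TOPO = """\
MQM9700 switch01
    CFG : main=4x
    P1 -4x-100G-> HCA_mlx5_8 host01 U1/P1
    P2 -4x-100G-> HCA_mlx5_8 host02 U1/P1
"""

# One link entry: "<port> -<width>-<speed>-> <peer type> <peer name> <peer port>"
LINK_RE = re.compile(
    r"(?P<port>P\d+)\s+-(?P<width>\d+x)-(?P<speed>[^-\s]+)->\s+"
    r"(?P<peer_type>\S+)\s+(?P<peer_name>\S+)\s+(?P<peer_port>\S+)"
)


def parse_links(topo_text):
    """Return one dict per link entry found in the topo text."""
    return [m.groupdict() for m in LINK_RE.finditer(topo_text)]


for link in parse_links(SAMPLE_TOPO):
    print(link["port"], "->", link["peer_name"], link["peer_port"])
```

Because the pattern relies only on whitespace between fields, it should also cope with entries that have been joined onto a single line.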

© Copyright 2024, NVIDIA. Last updated on May 28, 2024.