NVIDIA UFM Cable Validation Tool v1.5.0

P2P File

The P2P file is an Excel file that details the physical link connections within the fabric. It may consist of multiple sheets, each containing the following columns:

  • A-Node Name: Specify the name of the node on the "A" side of the connection.

  • A-Type: Indicate the role of the "A" side node, either "Host" or "Switch" (applicable for Ethernet).

  • A-Port: Provide the port name for the "A" side node.

  • Z-Node Name: Specify the name of the node on the "Z" side of the connection.

  • Z-Type: Indicate the role of the "Z" side node, either "Host" or "Switch" (applicable for Ethernet).

  • Z-Port: Provide the port name for the "Z" side node.


InfiniBand Example:

A sample sheet for InfiniBand connections in a P2P file:

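Rack | U  | Name         | HCA/Port | Rack | Name | Name            | Port
---- | -- | ------------ | -------- | ---- | ---- | --------------- | ----
PXH  | 28 | swx-proton03 | 1        | PXX  | 30   | sw-hdr-proton01 | 1/3
316  | 24 | swx-proton04 | 3        | PXX  | 27   | sw-hdr-proton01 | 1/4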

  • The designated port can be a single number or a split port (e.g., 1/2).

  • Mapping for HCA ports:

    • 1 → mlx5_0 P1

    • 2 → mlx5_1 P1

    • And so on.

The HCA mapping can be customized by the user. For more details, see HCA Mapping File.
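The port notation and the default HCA mapping described above can be sketched in a few lines of Python. This is only an illustration, not CVT's own code: the function names `parse_ib_port` and `default_hca_name` are hypothetical, and the sketch assumes that "and so on" continues the pattern shown, where port N maps to `mlx5_<N-1> P1`.

```python
import re
from typing import Optional, Tuple

# Accepts a single number ("3") or a split port ("1/2").
_IB_PORT = re.compile(r"(\d+)(?:/(\d+))?")


def parse_ib_port(port: str) -> Tuple[int, Optional[int]]:
    """Return (port_number, split_number); split_number is None if absent."""
    match = _IB_PORT.fullmatch(port.strip())
    if match is None:
        raise ValueError(f"unrecognized port value: {port!r}")
    split = int(match.group(2)) if match.group(2) is not None else None
    return int(match.group(1)), split


def default_hca_name(hca_port: int) -> str:
    """Assumed default mapping: 1 -> 'mlx5_0 P1', 2 -> 'mlx5_1 P1', ..."""
    if hca_port < 1:
        raise ValueError("HCA port numbers start at 1")
    return f"mlx5_{hca_port - 1} P1"
```

For example, `parse_ib_port("1/3")` yields `(1, 3)` and `default_hca_name(2)` yields `"mlx5_1 P1"`. A customized HCA mapping would replace `default_hca_name` with a lookup.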

XDR Example:

A sample sheet for NVOS connections in a P2P file:

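Rack | U  | Name        | HCA/Port | Rack | Name | Name      | Port
---- | -- | ----------- | -------- | ---- | ---- | --------- | -----
316  | 22 | clx-abc-073 | 1        | R113 | 22   | bm-abc-t4 | sw1p1
316  | 24 | clx-abc-074 | 1        | R113 | 22   | bm-abc-t5 | sw1p2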

  • Designated ports are represented as sw<port_number>p<split_number> for switches.

  • Mapping for HCA ports: Same as InfiniBand (see above).
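As a rough illustration of the switch port notation above, the following sketch splits a name such as sw1p2 into its port and split numbers. The function name `parse_nvos_port` is hypothetical, and the sketch assumes both numbers are plain decimal integers.

```python
import re
from typing import Tuple

# sw<port_number>p<split_number>, e.g. "sw1p2"
_NVOS_PORT = re.compile(r"sw(\d+)p(\d+)")


def parse_nvos_port(name: str) -> Tuple[int, int]:
    """Return (port_number, split_number) for a switch port name."""
    match = _NVOS_PORT.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not in sw<port_number>p<split_number> form: {name!r}")
    return int(match.group(1)), int(match.group(2))
```

With the sample sheet above, `parse_nvos_port("sw1p1")` gives `(1, 1)` and `parse_nvos_port("sw1p2")` gives `(1, 2)`.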

Ethernet Example:

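A-Rack | A-U | A-Node Name     | A-Type | A-Port | Z-Rack | Z-RU | Z-Node Name              | Z-Type | Z-Port
------ | --- | --------------- | ------ | ------ | ------ | ---- | ------------------------ | ------ | ------
ASN    | 2   | memx-asm-01-sr1 | Host   | rail5  | ASM    | 11   | mem1-roc-f2-b2-r5-t1-d01 | Switch | swp1s0
ASN    | 2   | memx-asm-01-sr1 | Host   | rail6  | ASM    | 13   | mem1-roc-f2-b2-r6-t1-d01 | Switch | swp1s1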

Designated ports are represented as swp<port_number>s<split_number> for switches and rail<port_number> for hosts.
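The two Ethernet port formats can be told apart by the A-Type/Z-Type value of the row. The sketch below is one possible way to parse them; `parse_ethernet_port` is a hypothetical helper, and it treats the split suffix as required for switches because that is the only form described here.

```python
import re
from typing import Optional, Tuple

_SWITCH_PORT = re.compile(r"swp(\d+)s(\d+)")  # swp<port_number>s<split_number>
_HOST_PORT = re.compile(r"rail(\d+)")         # rail<port_number>


def parse_ethernet_port(name: str, node_type: str) -> Tuple[int, Optional[int]]:
    """Return (port_number, split_number); hosts have no split number."""
    if node_type == "Switch":
        match = _SWITCH_PORT.fullmatch(name)
        if match:
            return int(match.group(1)), int(match.group(2))
    elif node_type == "Host":
        match = _HOST_PORT.fullmatch(name)
        if match:
            return int(match.group(1)), None
    raise ValueError(f"port {name!r} does not fit node type {node_type!r}")
```

For the sample rows, `parse_ethernet_port("rail5", "Host")` returns `(5, None)` and `parse_ethernet_port("swp1s0", "Switch")` returns `(1, 0)`.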

Note that for all fabric types, the tool relies on the header names to extract the information it needs, so the headers must appear exactly as shown above. If a header name is misspelled or altered, the tool fails to process the file.
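Because the headers must match exactly, it can help to check a sheet's header row before handing the file to the tool. The following is a minimal sketch for an Ethernet sheet, not part of CVT itself: `missing_headers` is a hypothetical helper, and the header row is assumed to have already been read into a list of cell values (for example with an Excel library).

```python
from typing import Iterable, List, Sequence

# Column names listed at the top of this section.
REQUIRED_HEADERS = (
    "A-Node Name", "A-Type", "A-Port",
    "Z-Node Name", "Z-Type", "Z-Port",
)


def missing_headers(header_row: Iterable[object],
                    required: Sequence[str] = REQUIRED_HEADERS) -> List[str]:
    """Return the required headers that are absent, using exact comparison."""
    present = {str(cell) for cell in header_row if cell is not None}
    return [name for name in required if name not in present]
```

An empty result means every required header is present. A header such as "a-port" or "A-Port " (trailing space) is reported as missing, since the comparison is deliberately strict.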

The P2P file must include a "Legend" sheet, which contains vital details about switch and host name patterns. For example:

Name       | Model   | Switch/HCA | Speed   | Rate
---------- | ------- | ---------- | ------- | ----
c-csi-mqm* | MQM9700 | Switch     | 4x 100G | NDR
c-csi-0*   | HCA_2   | HCA        | 4x 100G | NDR
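To illustrate how name patterns in a Legend sheet could classify nodes, the sketch below matches a node name against each pattern in order. It assumes the trailing `*` behaves like a shell-style wildcard, which the text does not state explicitly; `classify_node` and the node names in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

# The two Legend rows from the example above.
LEGEND: List[Dict[str, str]] = [
    {"Name": "c-csi-mqm*", "Model": "MQM9700", "Switch/HCA": "Switch",
     "Speed": "4x 100G", "Rate": "NDR"},
    {"Name": "c-csi-0*", "Model": "HCA_2", "Switch/HCA": "HCA",
     "Speed": "4x 100G", "Rate": "NDR"},
]


def classify_node(node_name: str,
                  legend: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Return the first Legend row whose Name pattern matches, else None."""
    for entry in legend:
        if fnmatchcase(node_name, entry["Name"]):
            return entry
    return None
```

Under this assumption, a node named `c-csi-mqm-01` would be classified as a Switch, `c-csi-0123` as an HCA, and a name matching neither pattern would return `None`.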

CVT 1.5 introduces a new Unified Topology file format. This format aims to manage various cluster types, including InfiniBand (HDR/NDR/XDR), Ethernet, and NVLink, using an Excel-based structure for organized data representation. The format is a single Excel file with multiple sheets that captures the topology of a modern data center in one place.

The Unified Topology file format consists of 4 sheets:

  1. Nodes

  2. Links

  3. DC Floor Layout (optional)

  4. HCA Mapping (optional)

Nodes

The Nodes sheet will focus on providing comprehensive information about the nodes within the cluster. The primary data points will include:

FabricID | Rack | Unit | TrayIndex | NodeName | NodeType | NodeOS | NodeModel

Column      | Value Type    | Description                                                                            | Mandatory
----------- | ------------- | -------------------------------------------------------------------------------------- | ---------
NodeName    | String        | The identifier for the node within the network.                                        | Yes
NodeType    | [Switch/Host] | The classification of the node (e.g., host, switch).                                   | Yes
NodeModel   | String        | The specific model of the node.                                                        | Yes
NodeOS      | String        | The OS running on the node. Supported OS are: mlnx-os, cumulus, nvos, and linux for hosts. | Yes
Rack        | String        | The physical rack where the node is located.                                           | No
Unit        | Integer       | The specific unit within the rack.                                                     | No
TrayIndex   | Integer       | The tray index within the rack. Relevant and mandatory for GB200.                      | No
CoolingType | [Air/Liquid]  | The node cooling type (e.g., Air, Liquid). For future use (not supported now).         | No
FabricID    | String        | The site/cluster/fabric name or ID, used when the topology includes multiple sites.    | No

Notes:

  • If Rack/Unit information is missing, some features—specifically the Rack View and filtering capabilities—will not function in the CVT interface.

  • Column names are case-insensitive, and the order of columns does not affect functionality.

  • Columns that are not marked as mandatory in the table above can be omitted.
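The rules for the Nodes sheet can be expressed as a small row check. This is a sketch under stated assumptions rather than CVT's actual validation: `validate_node` is a hypothetical helper, each row is assumed to be a dict keyed by column name, and the NodeType and NodeOS values are compared case-insensitively, which the text only guarantees for column names.

```python
from typing import Dict, List

MANDATORY_COLUMNS = ("nodename", "nodetype", "nodemodel", "nodeos")
NODE_TYPES = {"switch", "host"}
SUPPORTED_OS = {"mlnx-os", "cumulus", "nvos", "linux"}


def validate_node(row: Dict[str, object]) -> List[str]:
    """Return a list of problems found in one Nodes row (empty if none)."""
    # Column names are case-insensitive, so normalize the keys first.
    normalized = {str(key).strip().lower(): value for key, value in row.items()}
    errors = []
    for column in MANDATORY_COLUMNS:
        if normalized.get(column) in (None, ""):
            errors.append(f"missing mandatory column {column}")
    node_type = normalized.get("nodetype")
    if node_type not in (None, "") and str(node_type).lower() not in NODE_TYPES:
        errors.append(f"unsupported NodeType {node_type!r}")
    node_os = normalized.get("nodeos")
    if node_os not in (None, "") and str(node_os).lower() not in SUPPORTED_OS:
        errors.append(f"unsupported NodeOS {node_os!r}")
    return errors
```

Optional columns such as Rack or Unit are simply ignored by this check, so a row containing only the four mandatory columns passes.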

Links

The Links sheet will detail the connectivity information across the nodes within the cluster. It will consist of the following columns:

A-Node | A-Port | Z-Node | Z-Port | Protocol | Speed

Column   | Value Type          | Description                                                       | Mandatory
-------- | ------------------- | ----------------------------------------------------------------- | ---------
A-Node   | String              | The source node for the link. Must exist in the Nodes sheet.      | Yes
A-Port   | String/Integer      | The port on the source node.                                      | Yes
Z-Node   | String              | The destination node for the link. Must exist in the Nodes sheet. | Yes
Z-Port   | String              | The port on the destination node.                                 | Yes
Protocol | (IB/Ethernet)       | The protocol used for the connection.                             | Yes
Speed    | XDR/NDR/HDR/Numeric | The active speed of the link.                                     | No

Notes:

  • Column names are case-insensitive, and the order of columns does not affect functionality.

  • Columns that are not marked as mandatory in the table above can be omitted.
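The cross-sheet requirement that A-Node and Z-Node exist in the Nodes sheet can be checked before loading a file. The following sketch assumes each Links row is a dict keyed by column name and that Protocol values are spelled exactly "IB" or "Ethernet"; `validate_links` is a hypothetical helper, not a CVT function.

```python
from typing import Dict, Iterable, List, Tuple

MANDATORY_COLUMNS = ("a-node", "a-port", "z-node", "z-port", "protocol")
PROTOCOLS = {"IB", "Ethernet"}


def validate_links(links: Iterable[Dict[str, object]],
                   node_names: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (row_number, problem) pairs; rows are numbered from 1."""
    known = set(node_names)
    errors = []
    for row_number, row in enumerate(links, start=1):
        # Column names are case-insensitive, so normalize the keys first.
        normalized = {str(k).strip().lower(): v for k, v in row.items()}
        for column in MANDATORY_COLUMNS:
            if normalized.get(column) in (None, ""):
                errors.append((row_number, f"missing {column}"))
        for column in ("a-node", "z-node"):
            name = normalized.get(column)
            if name not in (None, "") and name not in known:
                errors.append((row_number, f"{column} {name!r} is not in the Nodes sheet"))
        protocol = normalized.get("protocol")
        if protocol not in (None, "") and protocol not in PROTOCOLS:
            errors.append((row_number, f"unsupported protocol {protocol!r}"))
    return errors
```

The optional Speed column is not inspected here, and internal NVLink connections would not appear in the input at all.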

Internal Links:

Internal links (NVLink within the GB200 rack) will not be part of the Links sheet. CVT shall detect any internal links based on the NV-Switch model and shall use the predefined JSON representation of these links to build the links. This approach ensures seamless integration and efficient configuration within the GB200 racks.

DC Floor Layout

The DC Floor Layout sheet is optional and describes the physical layout of the data center. It includes information about:

  • Data Halls: The various halls within the data center.

  • Scalable Units: Units designed to be scalable for future expansions.

  • Racks: Detailed information about the racks within the data center. This information is mandatory for GB200 racks.

    • Rack: Rack Name

    • Rack Type: Type of the rack. Can be GB200_72x1 or GB200_36x2 for GB200, or any other rack type, such as ServersRack or NetworkRack.

    • Rack Group: Used for grouping the racks, especially in GB200 racks. It is a unique number for each GB200 rack system. A GB200 72x1 system would have a unique group number for itself. In a GB200 36x2 system, the two racks will belong to the same group_number. (Group number is just a made-up integer to facilitate the identification of rack systems correctly).

DataHall | ScalableUnit | Rack | RackType | Rack Group
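The Rack Group rule (one rack per group for GB200_72x1, two racks sharing a group for GB200_36x2) lends itself to a consistency check. The sketch below is an assumption-laden illustration: `check_rack_groups` is a hypothetical helper, rows are assumed to be dicts keyed by the column names shown above, and non-GB200 rack types are ignored.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Number of racks expected to share one Rack Group value.
RACKS_PER_GROUP = {"GB200_72x1": 1, "GB200_36x2": 2}


def check_rack_groups(rows: Iterable[Dict[str, object]]) -> List[str]:
    """Return a list of problems with GB200 rack grouping (empty if none)."""
    groups = defaultdict(list)
    for row in rows:
        if row["RackType"] in RACKS_PER_GROUP:
            groups[row["Rack Group"]].append(row)
    problems = []
    for group, members in sorted(groups.items(), key=lambda item: str(item[0])):
        rack_types = {member["RackType"] for member in members}
        if len(rack_types) != 1:
            problems.append(f"group {group}: mixed rack types {sorted(rack_types)}")
            continue
        expected = RACKS_PER_GROUP[rack_types.pop()]
        if len(members) != expected:
            problems.append(
                f"group {group}: expected {expected} rack(s), found {len(members)}")
    return problems
```

A layout with two GB200_36x2 racks in group 1 and one GB200_72x1 rack in group 2 passes; a lone GB200_36x2 rack in its own group is reported.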

HCA Mapping

The HCA Mapping sheet is specific to IB fabric environments. It maps port numbers to their corresponding HCA names. Refer to the HCA Mapping File for more information. The HCA Mapping sheet includes the following information:

Fabric ID | Server Model | OSFP | HCAName

Column      | Value Type | Description                        | Mandatory
----------- | ---------- | ---------------------------------- | ---------
FabricID    | String     | For future use (not supported now) | No
ServerModel | String     | For future use (not supported now) | No
OSFP        | String     | The OSFP port/slot                 | Yes
HCAName     | String     | Name of the HCA                    | Yes
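As a rough sketch of how the two mandatory columns could be consumed, the snippet below turns HCA Mapping rows into an OSFP-to-HCAName lookup. `build_hca_map` is a hypothetical helper, rows are assumed to be dicts keyed by the exact column names OSFP and HCAName, and rejecting duplicate OSFP entries is a design choice made for this sketch, not documented tool behavior.

```python
from typing import Dict, Iterable


def build_hca_map(rows: Iterable[Dict[str, object]]) -> Dict[str, str]:
    """Map each OSFP port/slot to its HCA name, rejecting duplicates."""
    mapping: Dict[str, str] = {}
    for row in rows:
        osfp = str(row["OSFP"])
        hca_name = str(row["HCAName"])
        if osfp in mapping:
            raise ValueError(f"OSFP {osfp!r} is mapped more than once")
        mapping[osfp] = hca_name
    return mapping
```

The FabricID and ServerModel columns are ignored here because they are reserved for future use.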

NOTE:

  • DC Floor layout and HCA Mapping in Unified Topology are provided in the same Excel workbook and not passed as optional arguments.

© Copyright 2025, NVIDIA. Last updated on May 5, 2025.