NVIDIA UFM Cable Validation Tool v1.6.0

P2P File

The P2P file is an Excel file that details the physical link connections within the fabric. It may consist of multiple sheets, each containing the following columns:

  • A-Node Name: Specify the name of the node on the "A" side of the connection.

  • A-Type: Indicate the role of the "A" side node, either "Host" or "Switch" (applicable for Ethernet).

  • A-Port: Provide the port name for the "A" side node.

  • Z-Node Name: Specify the name of the node on the "Z" side of the connection.

  • Z-Type: Indicate the role of the "Z" side node, either "Host" or "Switch" (applicable for Ethernet).

  • Z-Port: Provide the port name for the "Z" side node.


InfiniBand Example:

A sample sheet for InfiniBand connections in a P2P file:

| Rack | U | Name | HCA/Port | Rack | Name | Name | Port |
|---|---|---|---|---|---|---|---|
| PXH | 28 | swx-proton03 | 1 | PXX | 30 | sw-hdr-proton01 | 1/3 |
| 316 | 24 | swx-proton04 | 3 | PXX | 27 | sw-hdr-proton01 | 1/4 |

  • The designated port can be a single number or a split port (e.g., 1/2).

  • Mapping for HCA ports:

    • 1 → mlx5_0 P1

    • 2 → mlx5_1 P1

    • And so on.

The user can customize the HCA mapping. For more details, see HCA Mapping File.
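The default mapping listed above can be illustrated with a short Python sketch. It is not the tool's implementation: the function name `default_hca_port` is hypothetical, and the sketch assumes that "and so on" means index n maps to `mlx5_<n-1> P1`.

```python
def default_hca_port(index: int) -> str:
    """Map a 1-based HCA index from the P2P sheet to a device/port label.

    Assumption: the pattern 1 -> mlx5_0 P1, 2 -> mlx5_1 P1 continues linearly.
    """
    if index < 1:
        raise ValueError("HCA index must be 1 or greater")
    return f"mlx5_{index - 1} P1"


print(default_hca_port(1))  # mlx5_0 P1
print(default_hca_port(2))  # mlx5_1 P1
```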

XDR Example:

A sample sheet for NVOS connections in a P2P file:

| Rack | U | Name | HCA/Port | Rack | Name | Name | Port |
|---|---|---|---|---|---|---|---|
| 316 | 22 | clx-abc-073 | 1 | R113 | 22 | bm-abc-t4 | sw1p1 |
| 316 | 24 | clx-abc-074 | 1 | R113 | 22 | bm-abc-t5 | sw1p2 |

  • Designated ports are represented as sw<port_number>p<split_number> for switches.

  • Mapping for HCA ports: Same as InfiniBand (see above).

Ethernet Example:

| A-Rack | A-U | A-Node Name | A-Type | A-Port | Z-Rack | Z-RU | Z-Node Name | Z-Type | Z-Port |
|---|---|---|---|---|---|---|---|---|---|
| ASN | 2 | memx-asm-01-sr1 | Host | rail5 | ASM | 11 | mem1-roc-f2-b2-r5-t1-d01 | Switch | swp1s0 |
| ASN | 2 | memx-asm-01-sr1 | Host | rail6 | ASM | 13 | mem1-roc-f2-b2-r6-t1-d01 | Switch | swp1s1 |

Designated ports are represented as swp<port_number>s<split_number> for switches.
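The three switch port notations used in the examples (a plain or split number for InfiniBand, sw<port_number>p<split_number> for XDR, and swp<port_number>s<split_number> for Ethernet) can be checked before a file is submitted. The following Python sketch is an illustration only: the function name, the fabric keys, and the strictness of the patterns are assumptions derived from the examples, not the tool's own parser.

```python
import re

# Patterns derived from the examples in this section (assumed, not official).
PORT_PATTERNS = {
    "ib": re.compile(r"^(?P<port>\d+)(?:/(?P<split>\d+))?$"),      # 1 or 1/2
    "xdr": re.compile(r"^sw(?P<port>\d+)p(?P<split>\d+)$"),        # sw1p1
    "ethernet": re.compile(r"^swp(?P<port>\d+)s(?P<split>\d+)$"),  # swp1s0
}


def parse_switch_port(fabric: str, name: str):
    """Return (port_number, split_number or None) for a switch port name."""
    match = PORT_PATTERNS[fabric].match(name.strip())
    if match is None:
        raise ValueError(f"{name!r} is not a valid {fabric} switch port name")
    split = match.group("split")
    return int(match.group("port")), (int(split) if split is not None else None)


print(parse_switch_port("ib", "1/3"))
print(parse_switch_port("xdr", "sw1p2"))
print(parse_switch_port("ethernet", "swp1s1"))
```

Names that do not fit the pattern for the chosen fabric raise a `ValueError`, which makes typos visible before the file is handed to the tool.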

Note that for all fabric types, the tool relies on the header names to extract the information it needs, so the headers must appear exactly as shown above. Any deviation in a header name causes the tool to fail.
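Because the header names must match exactly, it can help to check them before running the tool. The sketch below is a hypothetical pre-check in Python, not part of CVT. It uses the Ethernet column names listed at the top of this section and assumes that "exactly" means a case-sensitive, character-for-character match.

```python
# Required columns for an Ethernet P2P sheet, as listed in this section.
REQUIRED_ETHERNET_HEADERS = (
    "A-Node Name", "A-Type", "A-Port",
    "Z-Node Name", "Z-Type", "Z-Port",
)


def missing_headers(sheet_headers, required=REQUIRED_ETHERNET_HEADERS):
    """Return the required header names that are absent from the sheet.

    Assumption: the comparison is exact (case- and whitespace-sensitive).
    """
    present = set(sheet_headers)
    return [name for name in required if name not in present]


# Example header row with a casing slip in "Z-Node name".
headers = ["A-Rack", "A-U", "A-Node Name", "A-Type", "A-Port",
           "Z-Rack", "Z-RU", "Z-Node name", "Z-Type", "Z-Port"]
print(missing_headers(headers))
```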

The P2P file must include a "Legend" sheet, which contains details about the switch and host patterns. For example:

| Name | Model | Switch/HCA | Speed | Rate |
|---|---|---|---|---|
| c-csi-mqm* | MQM9700 | Switch | 4x 100G | NDR |
| c-csi-0* | HCA_2 | HCA | 4x 100G | NDR |
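The Name column of the Legend sheet holds patterns rather than full node names. The following Python sketch shows one way such patterns could be applied, assuming that `*` behaves like a shell-style wildcard. That interpretation, the function name `classify`, and the sample node names are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# The two Legend rows from the example above.
LEGEND = [
    {"Name": "c-csi-mqm*", "Model": "MQM9700", "Switch/HCA": "Switch",
     "Speed": "4x 100G", "Rate": "NDR"},
    {"Name": "c-csi-0*", "Model": "HCA_2", "Switch/HCA": "HCA",
     "Speed": "4x 100G", "Rate": "NDR"},
]


def classify(node_name, legend=LEGEND):
    """Return the first Legend row whose pattern matches, or None."""
    for row in legend:
        if fnmatchcase(node_name, row["Name"]):
            return row
    return None


# Hypothetical node names, used only to exercise the patterns.
print(classify("c-csi-mqm-01")["Switch/HCA"])
print(classify("c-csi-0042")["Model"])
print(classify("unrelated-node"))
```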

CVT 1.5 introduced a new Unified Topology file format, which CVT 1.6 enhances further. The format is designed to describe various cluster types, including InfiniBand (HDR/NDR/XDR), Ethernet, and NVLink, in a single Excel workbook. Its multiple sheets together capture the topology of a modern data center in one place.

The Unified Topology file format consists of 4 sheets:

  1. Nodes

  2. Links

  3. DC Floor Layout (optional sheet if not managing GB200/300 racks)

  4. Server Profile (optional sheet if not managing Servers/Hosts)
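The list above implies that the two optional sheets become necessary once GB200/300 racks or hosts are managed. A minimal Python sketch of that rule follows. The function name and the exact sheet names as they would appear in the workbook are assumptions.

```python
def required_sheets(manages_gb_racks: bool, manages_hosts: bool):
    """Return the sheets a Unified Topology workbook is expected to contain.

    Assumption: sheet names match the list in this section verbatim.
    """
    sheets = ["Nodes", "Links"]
    if manages_gb_racks:
        sheets.append("DC Floor Layout")  # needed for GB200/300 racks
    if manages_hosts:
        sheets.append("Server Profile")   # needed when servers/hosts are managed
    return sheets


print(required_sheets(manages_gb_racks=False, manages_hosts=True))
```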

Nodes

The Nodes sheet provides information about the nodes within the cluster. It includes the following columns:

| FabricID | Rack | Unit | TrayIndex | NodeName | NodeType | NodeOS | NodeModel | ServerProfile | Managed |
|---|---|---|---|---|---|---|---|---|---|

| Column | Value Type | Description | Mandatory |
|---|---|---|---|
| NodeName | String | The identifier for the node within the network. | Yes |
| NodeType | Switch or Host | The classification of the node (e.g., host, switch). | Yes |
| NodeModel | String | The specific model of the node. | Yes |
| NodeOS | String | The OS running on the node. Supported values: mlnx-os, cumulus, nvos, and linux (for hosts). | Yes |
| Rack | String | The physical rack where the node is located. | No |
| Unit | Integer | The specific unit within the rack. | No |
| TrayIndex | Integer | The tray index within the rack. Relevant and mandatory for GB200. | No |
| CoolingType | Air or Liquid | The node cooling type (e.g., Air, Liquid). For future use (not supported now). | No |
| FabricID | String | The site, cluster, or fabric name/ID, used when the topology includes multiple sites. | No |
| ServerProfile | String | Hosts are assigned server profiles to provide a mapping between different interface naming conventions. This is a custom name of your choice, which is referenced in the Server Profile sheet. | No |
| Managed | yes or no | Controls whether the agent should be installed on this node. Set to 'yes' by default. | Yes |

Notes:

  • If Rack/Unit information is missing, some features—specifically the Rack View and filtering capabilities—will not function in the CVT interface.

  • IB hosts must be added to the Nodes sheet so that their links to switches can be detected. However, agent installation on them is not supported, so their 'Managed' column value should be 'no'.

  • Column names are case-insensitive, and the order of columns does not affect functionality.

  • Optional columns, as indicated in the Mandatory column, can be omitted.
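The rules for the Nodes sheet can be sketched as a row-level check in Python. This is an illustration under stated assumptions, not CVT's validation logic: rows are assumed to be already loaded as dictionaries, cell values are compared case-insensitively, and the function name `validate_node` is hypothetical.

```python
# Mandatory columns and allowed values, taken from the table in this section.
MANDATORY = ("nodename", "nodetype", "nodemodel", "nodeos", "managed")
ALLOWED = {
    "nodetype": {"switch", "host"},
    "nodeos": {"mlnx-os", "cumulus", "nvos", "linux"},
    "managed": {"yes", "no"},
}


def validate_node(row):
    """Return a list of problems found in one Nodes-sheet row (a dict)."""
    # Column names are case-insensitive, so normalize the keys first.
    norm = {str(k).strip().lower(): str(v).strip()
            for k, v in row.items() if v is not None}
    problems = []
    for col in MANDATORY:
        if not norm.get(col):
            problems.append(f"missing mandatory column: {col}")
    for col, allowed in ALLOWED.items():
        value = norm.get(col)
        # Assumption: cell values are matched case-insensitively.
        if value and value.lower() not in allowed:
            problems.append(f"unsupported {col} value: {value}")
    return problems


good = {"NodeName": "sw1", "NODETYPE": "Switch", "NodeModel": "MQM9700",
        "NodeOS": "nvos", "Managed": "yes"}
bad = {"NodeName": "h1", "NodeType": "Router", "NodeModel": "x",
       "NodeOS": "linux", "Managed": "no"}
print(validate_node(good))
print(validate_node(bad))
```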

Links

The Links sheet details the connectivity between the nodes within the cluster. It consists of the following columns:

| A-Node | A-Port | Z-Node | Z-Port | Protocol | Speed |
|---|---|---|---|---|---|

| Column | Value Type | Description | Mandatory |
|---|---|---|---|
| A-Node | String | The source node for the link. Must exist in the Nodes sheet. | Yes |
| A-Port | String/Integer | The port on the source node. | Yes |
| Z-Node | String | The destination node for the link. Must exist in the Nodes sheet. | Yes |
| Z-Port | String | The port on the destination node. | Yes |
| Protocol | ib/ethernet/nvlink | The protocol used for the connection. | Yes |
| Speed | XDR/NDR/HDR/Numeric | The active speed of the link. | No |

For host/server ports, the A-Port or Z-Port value is the custom NIC name if one is available, or the default NIC name otherwise. This value can be found in the output of the `ip address show` command.

For IB hosts, a server profile is mandatory for this interface so that a mapping to the corresponding RDMA name of the port is available. The RDMA name can be found in the output of the command `sudo mst status -v`.

Notes:

  • Column names are case-insensitive, and the order of columns does not affect functionality.

  • Optional columns, as indicated in the Mandatory column, can be omitted.
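The Links rules, in particular the requirement that A-Node and Z-Node exist in the Nodes sheet, can be sketched as a cross-check in Python. The function name, the dictionary-based row representation, and the sample node names are assumptions made for illustration. Node names are assumed to be compared exactly.

```python
ALLOWED_PROTOCOLS = {"ib", "ethernet", "nvlink"}


def validate_links(links, node_names):
    """Return (row_number, message) pairs for links that break the rules."""
    known = set(node_names)
    problems = []
    for i, link in enumerate(links, start=1):
        # Both endpoints must be declared in the Nodes sheet.
        for side in ("A-Node", "Z-Node"):
            if link.get(side) not in known:
                problems.append(
                    (i, f"{side} {link.get(side)!r} is not in the Nodes sheet"))
        # Port columns are mandatory.
        for col in ("A-Port", "Z-Port"):
            if link.get(col) in (None, ""):
                problems.append((i, f"{col} is empty"))
        if str(link.get("Protocol", "")).lower() not in ALLOWED_PROTOCOLS:
            problems.append((i, f"unsupported Protocol {link.get('Protocol')!r}"))
    return problems


# Hypothetical rows: the second link references a node missing from Nodes.
links = [
    {"A-Node": "h1", "A-Port": "rail5", "Z-Node": "sw1",
     "Z-Port": "swp1s0", "Protocol": "ethernet"},
    {"A-Node": "h2", "A-Port": "1", "Z-Node": "sw1",
     "Z-Port": "sw1p1", "Protocol": "ib"},
]
print(validate_links(links, ["h1", "sw1"]))
```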

Internal Links:

Internal links (NVLink within the GB200/300 racks) will not be part of the Links sheet. CVT shall detect any internal links based on the RackType and shall use the predefined JSON representation of these links to build the links. This approach ensures seamless integration and efficient configuration within the GB200/300 racks.

DC Floor Layout

The DC Floor Layout sheet is optional and describes the layout of the data center. It includes information about:

  • Data Halls: The various halls within the data center.

  • Scalable Units: Units designed to be scalable for future expansions. Add a default SU name for all if you are not using scalable unit concepts.

  • Racks: Detailed information about the racks within the data center. This information is mandatory for GB200/300 racks.

    • Rack: Rack Name

    • Rack Type: Type of the rack. Can be GB200_72x1, GB200_36x2, or GB200_36x1 for GB200 racks (a similar convention is used for GB300), or a general rack type such as ServerRack or NetworkRack for other racks.

    • Rack Group: Used for grouping the racks, especially GB200 racks. It is a unique number for each GB200 rack system. A GB200 72x1 system has a group number of its own. In a GB200 36x2 system, the two racks belong to the same group number. The group number is an arbitrary integer used only to identify rack systems correctly.

| DataHall | ScalableUnit | Rack | RackType | Rack Group |
|---|---|---|---|---|
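The Rack Group rule can be sketched as a consistency check in Python. The sketch assumes that the number after the "x" in the rack type (for example, the 2 in GB200_36x2) is the number of racks that share one group, which matches the 72x1 and 36x2 cases described above but is not stated for 36x1. The function name and sample rack names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: GB200/GB300 rack types follow the form GB<2|3>00_<n>x<racks>.
GB_RACK_TYPE = re.compile(r"GB[23]00_\d+x(\d+)")


def check_rack_groups(rows):
    """Return messages for GB rack groups whose size does not fit the type."""
    groups = defaultdict(list)
    for row in rows:
        # General racks (ServerRack, NetworkRack, ...) have no group rule.
        if GB_RACK_TYPE.fullmatch(str(row["RackType"])):
            groups[row["Rack Group"]].append(row)
    problems = []
    for group, members in groups.items():
        types = sorted({m["RackType"] for m in members})
        if len(types) > 1:
            problems.append(f"group {group}: mixed rack types {types}")
            continue
        expected = int(GB_RACK_TYPE.fullmatch(types[0]).group(1))
        if len(members) != expected:
            problems.append(
                f"group {group}: {types[0]} expects {expected} rack(s), "
                f"found {len(members)}")
    return problems


rows = [
    {"Rack": "R1", "RackType": "GB200_72x1", "Rack Group": 1},
    {"Rack": "R2", "RackType": "GB200_36x2", "Rack Group": 2},
    {"Rack": "R3", "RackType": "GB200_36x2", "Rack Group": 2},
    {"Rack": "R4", "RackType": "GB200_36x2", "Rack Group": 3},
    {"Rack": "R5", "RackType": "ServerRack", "Rack Group": None},
]
print(check_rack_groups(rows))
```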

Server Profile

The Server Profile sheet specifies the interface configurations on hosts. It has no significance for switches. Each ServerProfile name mentioned in the Nodes sheet should have a corresponding entry here. The Server Profile sheet includes the following information:

| Fabric ID | CustomNICName | NICOSName | PhysicalPort | RDMAName | PCIAddress |
|---|---|---|---|---|---|

| Column | Value Type | Description | Mandatory |
|---|---|---|---|
| FabricID | String | For future use (not supported now). | No |
| CustomNICName | String | Custom name given to the NIC if it has been renamed from its default value. This value can be found in the output of the command `ip address show`. | Yes |
| NICOSName | String | Default name of the interface. It can be found in the output of the command `ip address show`. If the NIC has been renamed with a custom name, the default name can be found under the altname tag. | Yes |
| PhysicalPort | String | The OSFP port/slot in IB networks. | No |
| RDMAName | String | RDMA name of IB ports, such as mlx5_0. Mandatory for IB ports. Can be found in the output of the command `sudo mst status -v`. | Yes |
| PCIAddress | String | For future use (not supported now). | No |
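The purpose of the Server Profile sheet is to translate the NIC name used in the Links sheet into an RDMA name. A minimal Python sketch of that lookup follows. It assumes the rows of one profile are already loaded as dictionaries. The function name and all sample values (`rail5`, `ibp24s0`, `mlx5_0`) are hypothetical.

```python
def rdma_name_for(port_name, profile_rows):
    """Find the RDMAName for a NIC name used as A-Port/Z-Port in Links.

    The port may be given as the custom NIC name or as the default OS name,
    so both columns are checked.
    """
    if not port_name:
        return None
    for row in profile_rows:
        if port_name in (row.get("CustomNICName"), row.get("NICOSName")):
            return row.get("RDMAName")
    return None


# Hypothetical rows of a single server profile.
profile = [
    {"CustomNICName": "rail5", "NICOSName": "ibp24s0", "RDMAName": "mlx5_0"},
    {"CustomNICName": "rail6", "NICOSName": "ibp64s0", "RDMAName": "mlx5_1"},
]
print(rdma_name_for("rail6", profile))
print(rdma_name_for("ibp24s0", profile))
```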

Note

DC Floor layout and Server Profile in Unified Topology are provided in the same Excel workbook and not passed as optional arguments.

© Copyright 2025, NVIDIA. Last updated on Aug 7, 2025.