Appendix#

Section 1.1: siteinfo.yaml#

The siteinfo.yaml file is a key configuration file used during the north-south network deployment process. It defines essential site-specific parameters such as DGX system type, network prefixes (OOB, data, storage), time servers, BGP ASNs for switches, and rack mapping information. This file is referenced by automation tools and scripts to generate network configurations, allocate IP addresses, and ensure consistent deployment across the environment. Properly populating siteinfo.yaml is critical for accurate and successful network provisioning.

The following is an example of what the siteinfo.yaml file should look like:

dgx_type: gb200

# The timeservers to be used on the Ethernet switches.
time_servers:
   - 0.cumulusnetworks.pool.ntp.org

networking:
   # root_prefix: 10.0.0.0/20
   oob_prefix: "7.241.0.0/21"
   data_prefix: "7.241.16.0/21"

   # The prefix for storage /31s.
   storage_prefix: "100.127.0.0/16"
   bms_prefix: "7.241.8.0/22"

   # The ASNs used for the BTOR switches. Provided by the customer.
   bgp_btor_asns:
      - 4260037003
      - 4260037004

   # The ASNs used for the FTOR switches. Provided by the customer.
   bgp_ftor_asns:
      - 4260037001
      - 4260037002

# Mapping customer rack IDs (as used in the P2P file) to rack serial numbers (as
# provided by the factory). This is used to determine MAC addresses/serial
# numbers of devices in GB200 racks.
rack_mapping:
   A08: '1830625000808'

# EOF
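
Before handing the file to automation, the networking block can be sanity-checked with a short script. The following is a minimal sketch, not part of any NVIDIA tooling: the function name `check_networking` is hypothetical, the values are copied from the example above into a plain dictionary (so no YAML parser is needed), and the rules it applies (prefixes must be valid and must not overlap, ASNs must be unique 32-bit values) are assumptions about what a consistent file looks like.

```python
import ipaddress
from itertools import combinations

# Values copied from the example siteinfo.yaml above.
networking = {
    "oob_prefix": "7.241.0.0/21",
    "data_prefix": "7.241.16.0/21",
    "storage_prefix": "100.127.0.0/16",
    "bms_prefix": "7.241.8.0/22",
    "bgp_btor_asns": [4260037003, 4260037004],
    "bgp_ftor_asns": [4260037001, 4260037002],
}


def check_networking(cfg):
    """Return a list of human-readable problems found in the networking block.

    Assumed rules (not taken from the deployment tooling): every *_prefix
    parses as a network, no two prefixes overlap, and ASNs are unique
    values that fit in 32 bits.
    """
    problems = []
    prefixes = {}
    for key, value in cfg.items():
        if not key.endswith("_prefix"):
            continue
        try:
            # strict=True rejects prefixes with host bits set, e.g. 7.241.1.0/21.
            prefixes[key] = ipaddress.ip_network(value, strict=True)
        except ValueError as exc:
            problems.append(f"{key}: {exc}")

    # Compare every pair of prefixes once.
    for (a, net_a), (b, net_b) in combinations(prefixes.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} {net_a} overlaps {b} {net_b}")

    asns = cfg.get("bgp_btor_asns", []) + cfg.get("bgp_ftor_asns", [])
    if len(asns) != len(set(asns)):
        problems.append("duplicate BGP ASN")
    for asn in asns:
        if not 0 < asn < 2**32:
            problems.append(f"ASN {asn} does not fit in 32 bits")
    return problems


problems = check_networking(networking)
print(problems or "networking block looks consistent")
```

With the example values, the four prefixes are disjoint and the four ASNs are distinct, so the script reports no problems.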

Section 1.2: Standard Point-to-Point (P2P) Column Header#

This section describes the standard column headers used in the P2P connectivity file. The columns are divided into two logical groups: Source (the originating device/port) and Destination (the target device/port). For clarity and ease of use, the table below presents both groups side by side, as they would appear in a typical P2P CSV or spreadsheet.

Table 1 Standard P2P Column Header Example#
#	BUNDLE_ID	SEQ	SRC_RACKROLE	SRC_RACK	SRC_U	SRC_NAME	SRC_HCA_PORT	SRC_TRANSCEIVER	DST_RACKROLE	DST_RACK	DST_U	DST_NAME	DST_PORT	DST_TRANSCEIVER	CABLE_LENGTH	CABLE_TYPE	CABLE_TRAY
1	B1	1	TOR	A01	10	A01-TOR-01	1	QSFP56	DGX	A02	20	A02-DGX-01	2	QSFP56	3m	DAC	TRAY-1

Column Descriptions:

Table 2 Standard P2P Column Header Descriptions#
Column	Description
#	Row number or unique identifier.
BUNDLE_ID	Logical bundle or group identifier for the connection.
SEQ	Sequence number within the bundle.
SRC_RACKROLE	Role of the source rack (e.g., TOR, DGX).
SRC_RACK	Source rack identifier.
SRC_U	Source rack unit (U position).
SRC_NAME	Source device name.
SRC_HCA_PORT	Source HCA port.
SRC_TRANSCEIVER	Source transceiver type.
DST_RACKROLE	Role of the destination rack.
DST_RACK	Destination rack identifier.
DST_U	Destination rack unit (U position).
DST_NAME	Destination device name.
DST_PORT	Destination port.
DST_TRANSCEIVER	Destination transceiver type.
CABLE_LENGTH	Length of the cable.
CABLE_TYPE	Type of cable used.
CABLE_TRAY	Cable tray or pathway identifier.

Note

The P2P file should include all columns above, with each row representing a single point-to-point connection. Keeping the source and destination columns grouped together in a single table improves readability and makes the file easier to work with for both humans and automation tools.
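
The requirement that every column be present can be checked mechanically. The sketch below assumes the P2P file has been exported as comma-separated text; the helper name `missing_columns` is hypothetical, and the sample row is the one shown in Table 1.

```python
import csv
import io

# Column names taken from Table 1; "#" is the row-number column.
REQUIRED_COLUMNS = [
    "#", "BUNDLE_ID", "SEQ",
    "SRC_RACKROLE", "SRC_RACK", "SRC_U", "SRC_NAME",
    "SRC_HCA_PORT", "SRC_TRANSCEIVER",
    "DST_RACKROLE", "DST_RACK", "DST_U", "DST_NAME",
    "DST_PORT", "DST_TRANSCEIVER",
    "CABLE_LENGTH", "CABLE_TYPE", "CABLE_TRAY",
]


def missing_columns(csv_text):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader, [])]
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Header and data row from Table 1, written as CSV.
sample = (
    "#,BUNDLE_ID,SEQ,SRC_RACKROLE,SRC_RACK,SRC_U,SRC_NAME,SRC_HCA_PORT,"
    "SRC_TRANSCEIVER,DST_RACKROLE,DST_RACK,DST_U,DST_NAME,DST_PORT,"
    "DST_TRANSCEIVER,CABLE_LENGTH,CABLE_TYPE,CABLE_TRAY\n"
    "1,B1,1,TOR,A01,10,A01-TOR-01,1,QSFP56,DGX,A02,20,A02-DGX-01,2,"
    "QSFP56,3m,DAC,TRAY-1\n"
)

print(missing_columns(sample))  # an empty list means the header is complete
```

The check compares names only; it does not enforce column order or validate the values in each row.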

Section 1.3: Standard Worksheet Naming#

This section provides an example of the standard worksheet naming.

[TYPE] = (ETH) for Ethernet or (IB) for InfiniBand
[Pod/SU]<Sequence#> = Logical grouping by Pod or scalable unit (SU). For instance, P1, P2, … , PN and S1, S2, … , SN.
<Flow> = The traffic or connection type. See the table in Section 1.4: Connection Type for more details.

Combining these elements yields the following naming string:

<(TYPE)>-[<POD/SU>+<SEQ-NUM>]-<FLOW>

# Some examples of the above naming convention:
(ETH)-P1-DGX-DATA
(IB)-S1-DGX-OOB
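
The naming string can be expressed as a regular expression, which is handy when checking a workbook's tab names in bulk. This is a minimal sketch: the pattern and the `parse_tab_name` helper are illustrative assumptions derived from the convention above, not the parser used by any deployment tool.

```python
import re

# (TYPE)-<P|S><sequence>-<FLOW>, e.g. (ETH)-P1-DGX-DATA
TAB_NAME = re.compile(
    r"^\((?P<type>ETH|IB)\)"                 # (ETH) or (IB)
    r"-(?P<group>[PS])(?P<seq>\d+)"          # P1, P2, ... or S1, S2, ...
    r"-(?P<flow>[A-Z0-9]+(?:-[A-Z0-9]+)*)$"  # DGX-DATA, SW-UPLINK, ...
)


def parse_tab_name(name):
    """Split a worksheet name into its parts, or return None if it does not match."""
    m = TAB_NAME.match(name)
    if m is None:
        return None
    return {
        "type": m["type"],
        "group": m["group"],
        "seq": int(m["seq"]),
        "flow": m["flow"],
    }


for tab in ["(ETH)-P1-DGX-DATA", "(IB)-S1-DGX-OOB", "Validate_Columns"]:
    print(tab, "->", parse_tab_name(tab))
```

Special-purpose tabs such as `Validate_Columns`, `NAME_MAPPING`, and `(TEMPLATE)-DGX-OOB` do not follow the pattern and return `None` here, so a real check would need to allow them explicitly.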

Sample tab names:

Table 3 Standard Worksheet Naming Examples#
Tab Name	Description
(ETH)-P1-DGX-DATA, (ETH)-P1-DGX-OOB	Ethernet, P[1-N] or S[1-N]: P2P connections between DGX and TOR, including out-of-band management.
(ETH)-P1-SW-UPLINK, (ETH)-P1-SW-EDGE	Ethernet: switch-to-spine connections and connections to edge devices.
(ETH)-P1-NODE-OOB, (ETH)-P1-NODE-DATA, (ETH)-P1-MGMT-OOB	Ethernet: all OOB connections from nodes, including SW-to-OOB, Node-to-OOB, and DGX-to-OOB.
(IB)-P1-DGX-IB, (IB)-P1-CLEAF-CSPINE	InfiniBand: DGX to compute IB, and compute leaf to spine uplinks.
(TEMPLATE)-DGX-OOB	Used only for GB200, as the racks are pre-cabled from the factory.
Validate_Columns	Provides the reference column format against which the other tabs are compared. This tab is required.
NAME_MAPPING	Combines customer naming with the default naming to provide a complete name mapping.

Section 1.4: Connection Type#

This section lists the connection types (flows) used in worksheet names and in the FLOW column of the P2P file.

Note

The term “Flow” is used in this context to refer to the type or direction of network connection between devices or components in the system. A more precise term is “Connection Type,” as it describes the nature and endpoints of each network link (e.g., NODE-OOB, DGX-DATA). For a formal definition, see Flow in the Glossary of Terms.

Table 4 Table showing Flow, or Connection types#
FLOW Name	Meaning
NODE-OOB	Connection from compute node to OOB switch (out-of-band management).
NODE-DATA	Compute nodes to data (IB or Ethernet) fabric.
DGX-OOB	DGX system to out-of-band switch.
DGX-DATA	DGX system to data switch or network fabric.
NODE-NODE	Direct connection between compute nodes.
SW-OOB	Out-of-band cabling between switches.
SW-UPLINK	Uplink from switch to aggregation or spine switch.
STORAGE-DATA	Storage system (HSS or NFS) connected to a data switch or host.
STORAGE-OOB	Storage system to out-of-band switch.
UFM-OOB	UFM system (fabric manager) out-of-band connection.
UFM-DATA	UFM system connected to a data network.
EDGE-SW	Edge switch connections (e.g., border leaf or service leaf).
INRACKDGX-OOB	In-rack cabling from DGX to OOB switch.
INRACKDGX-DATA	In-rack cabling from DGX to leaf/data switch.
INRACKNVSW-OOB	In-rack NVLink Switch to OOB cabling.
PWR-OOB	PWR and PDU to OOB cabling.
ACCESS-OOB	The first OOB switch is provisioned with a different IP address, used only for switch provisioning.

Section 1.5: Standard Naming Conventions for Network Components#

This section provides the standard naming conventions used for various network components in the DGX SuperPOD Ethernet North-South Network. These conventions ensure consistency and clarity when identifying devices, racks, and network elements across documentation, configuration files, and operational procedures.

Purpose of These Tables: The tables below define the naming patterns for different types of devices and racks. Using these conventions helps teams quickly identify the role, location, and function of each component in the network.

Static vs. Incremental Naming:

Static naming is used for components where the number of instances does not change as the system scales (e.g., control plane head nodes). These names remain fixed regardless of cluster size.

Incremental naming is used for components that scale with the size of the deployment (e.g., GPU nodes, storage appliances, switches). The names include incrementing numbers or identifiers to distinguish between multiple instances.

Control Plane (Static Naming)

Naming Pattern	Description
`<RACK>-<RU>-P[1-16]-BCM-0[1-2]`	BCM Head Nodes, per POD; number of head nodes does not increase with scale.
`<RACK>-<RU>-P[1-16]-MGMT-0[1-2]`	SLogin nodes; number of management nodes is fixed.
`<RACK>-<RU>-P[1-16]-K8[ADMIN|USER]-0[1-3]`	Kubernetes Admin or User nodes.

GPU Rack (Incremental Naming)

Naming Pattern	Description
`<RACK>-<RU>-P[1-16]-<ROLE>-0[1-8]-C0[1-18]` (only GB200)	RACKNAME, POD#, ROLE: DGX, C#: ComputeTray. Example: `A01-P1-DGX-01-C01` .. `A01-P1-DGX-01-C18`, `B09-P1-DGX-08-C01` .. `B09-P1-DGX-08-C18`
`<RACK>-<RU>-SU[1-16]-<ROLE>-0[1-n]`	Example: `A01-SU1-DGX-01` .. `D01-SU1-DGX-127`

Storage Rack (Incremental Naming)

Naming Pattern	Description
`<RACK>-<RU>-P[1-16]-<storage_vendor>-0[1-n]`	Storage Appliance (StorageLeaf) SLEAF

Ethernet Switches (Incremental Naming)

Naming Pattern	Description
`<RACK>-<RU>-P[1-16]-<switch_role>-0[1-n]`	Pod#: equivalent to scalable units. Switch_role: TOR, IPMI, LEAF, SPINE, SSPINE, CORE
`<RACK>-<RU>-P[1-16]-BTOR-0[1-2]`	Must have Edge connection (converged leaf)
`<RACK>-<RU>-P[1-16]-TOR-0[1-2]`	ComputeTray, DGX
`<RACK>-<RU>-P[1-16]-FTOR-0[1-2]`	Fabric Manager, InBand using SN2201 (UFM, NMX servers)
`<RACK>-<RU>-P[1-16]-STOR-0[1-2]`	Storage HSS Leaf
`<RACK>-<RU>-P[1-16]-OOB-0[1-n]`	OOB Switch (SN2201)

NVLink Switch (Incremental Naming)

Naming Pattern	Description
`<RACK>-P[1-16]-<switch_role>-0[1-9]`	Pod#, SwitchRole: nvsw, Rack# [1-8] (within pod, there are 8 racks), NVLink Switch incremental [1-9] Example: `A01-P1-NVSW-01` .. `A01-P1-NVSW-09`
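
As an illustration of how these patterns can be applied, the sketch below parses the GB200 compute tray pattern from the GPU Rack table. It rests on several assumptions that are not stated in this document: rack identifiers are taken to be one uppercase letter followed by digits (as in `A01` or `B09`), the `<RU>` field is treated as optional because the examples in the table omit it, and the helper name `parse_compute_tray` is hypothetical.

```python
import re

# <RACK>[-<RU>]-P<pod>-<ROLE>-<rack seq>-C<tray>, e.g. A01-P1-DGX-01-C18
COMPUTE_TRAY = re.compile(
    r"^(?P<rack>[A-Z]\d+)"     # assumed rack ID format: letter + digits
    r"(?:-(?P<ru>\d+))?"       # rack unit, optional (omitted in the examples)
    r"-P(?P<pod>\d+)"          # pod number
    r"-(?P<role>[A-Z]+)"       # role, e.g. DGX
    r"-(?P<rack_seq>\d{2})"    # rack sequence within the pod
    r"-C(?P<tray>\d{2})$"      # compute tray number
)


def parse_compute_tray(name):
    """Return the name's fields, or None if it does not fit the pattern."""
    m = COMPUTE_TRAY.match(name)
    if m is None:
        return None
    pod, rack_seq, tray = int(m["pod"]), int(m["rack_seq"]), int(m["tray"])
    # Ranges from the pattern: P[1-16], 0[1-8], C0[1-18].
    if not (1 <= pod <= 16 and 1 <= rack_seq <= 8 and 1 <= tray <= 18):
        return None
    return {
        "rack": m["rack"],
        "ru": m["ru"],
        "pod": pod,
        "role": m["role"],
        "rack_seq": rack_seq,
        "tray": tray,
    }


print(parse_compute_tray("A01-P1-DGX-01-C18"))
print(parse_compute_tray("A01-P1-DGX-01-C19"))  # tray out of range
```

The same approach extends to the other tables by swapping the trailing fields, for example dropping the tray suffix for switch names.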

Section 1.6: Example Point-to-Point (P2P) format#

This section is an appendix to the How to Format Point-to-Point (P2P) guide and provides examples of how to manually convert the Excel file to P2P format. This is necessary because the netautogen tool requires the data to be in a specific format.

Example P2P in raw CSV format:

FLOW,FROM_RACK,FROM_RACKUNIT,CUSTOMER_SRC_NAME,FROM_NODE,FROM_PHYSICAL_PORT,FROM_PORT,FROM_BREAKOUT,TO_RACK,TO_RACKUNIT,CUSTOMER_DEST_NAME,TO_NODE,TO_PHYSICAL_PORT,TO_PORT,TO_BREAKOUT
NODE-DATA,A4,2,A4-P1-BCM-01,A4-P1-BCM-01,M1,M1,-,A3,8,A3-P1-BTOR-01,A3-P1-BTOR-01,1/1/1,1s0,4x
NODE-DATA,A4,5,A4-P1-BCM-02,A4-P1-BCM-02,M1,M1,-,A3,8,A3-P1-BTOR-01,A3-P1-BTOR-01,1/1/2,1s1,-
NODE-DATA,A4,8,A4-P1-MGMT-03,A4-P1-MGMT-03,M1,M1,-,A3,8,A3-P1-BTOR-01,A3-P1-BTOR-01,1/2/1,1s2,-
STORAGE-DATA,A5,20,A5-P1-HSS-05,A5-P1-HSS-05,S1,S1,-,A3,8,A3-P1-BTOR-01,A3-P1-BTOR-01,9/1/1,9s0,4x
IBSW-OOB,A3,43,A3-P1-IBLEAF-01,A3-P1-IBLEAF-01,bmc,bmc,-,A3,45,A3-P1-OOB-01,A3-P1-OOB-01,1,1,-
SW-OOB,A3,27,A3-P1-SPINE-01,A3-P1-SPINE-01,mgmt,mgmt,-,A3,45,A3-P1-OOB-01,A3-P1-OOB-01,9,9,-
UFM-OOB,A5,44,A5-P1-CUFM-01,A5-P1-CUFM-01,LOM3,LOM3,-,A3,45,A3-P1-OOB-01,A3-P1-OOB-01,23,23,-
NODE-OOB,A4,2,A4-P1-BCM-01,A4-P1-BCM-01,LOM2,LOM2,-,A3,45,A3-P1-OOB-01,A3-P1-OOB-01,30,30,-
PWR-OOB,A1,6,A1-P1-PWR-01,A1-P1-PWR-01,mgmt,mgmt,-,A3,46,A3-P1-OOB-02,A3-P1-OOB-02,1,1,-
UFM-DATA,A5,44,A5-P1-CUFM-01,A5-P1-CUFM-01,LOM1,LOM1,-,A4,45,A4-P1-FTOR-01,A4-P1-FTOR-01,1,1,-
STORAGE-OOB,A5,11,A5-P1-HSS-01,A5-P1-HSS-01,mgmt,mgmt,-,A5,41,A5-P1-OOB-01,A5-P1-OOB-01,1,1,-
EDGE-BTOR,-,-,EQX-EDGE-01,EQX-EDGE-01,-,-,-,A3,8,A3-P1-BTOR-01,A3-P1-BTOR-01,49/1/1,49s0,8x
SW-UPLINK,A3,14,A3-P1-TOR-01,A3-P1-TOR-01,53/1/1,53s0,2x,A4,42,A4-P1-SPINE-01,A4-P1-SPINE-01,1/1/1,1s0,2x
SW-UPLINK,A3,14,A3-P1-TOR-01,A3-P1-TOR-01,53/2/1,53s1,-,A4,42,A4-P1-SPINE-01,A4-P1-SPINE-01,1/2/1,1s1,-
INRACKDGX-DATA,A1,11,A1-P1-DGX-01-C01,A1-P1-DGX-01-C01,M1,M1,-,A3,14,A3-P1-TOR-01,A3-P1-TOR-01,1/1/1,1s0,4x
INRACKDGX-OOB,A2,12,A2-P1-DGX-02-C02,A2-P1-DGX-02-C02,BF1BMC,BF1BMC,-,A2,44,-,A2-P1-OOB-01,2,2,-
INRACKDGX-OOB,A1,12,A1-P1-DGX-01-C02,A1-P1-DGX-01-C02,BF1BMC,BF1BMC,-,A1,44,-,A1-P1-OOB-01,2,2,-
INRACKNVSW-OOB,A1,19,A1-P1-NVSW-01,A1-P1-NVSW-01,BMC,BMC,-,A1,45,-,A1-P1-OOB-02,9,9,-

The same CSV data shown as a table:

Table 5 Example P2P CSV in an easy to read table.#
FLOW	FROM_RACK	FROM_RACKUNIT	CUSTOMER_SRC_NAME	FROM_NODE	FROM_PHYSICAL_PORT	FROM_PORT	FROM_BREAKOUT	TO_RACK	TO_RACKUNIT	CUSTOMER_DEST_NAME	TO_NODE	TO_PHYSICAL_PORT	TO_PORT	TO_BREAKOUT
NODE-DATA	A4	2	A4-P1-BCM-01	A4-P1-BCM-01	M1	M1		A3	8	A3-P1-BTOR-01	A3-P1-BTOR-01	1/1/1	1s0	4x
NODE-DATA	A4	5	A4-P1-BCM-02	A4-P1-BCM-02	M1	M1		A3	8	A3-P1-BTOR-01	A3-P1-BTOR-01	1/1/2	1s1
NODE-DATA	A4	8	A4-P1-MGMT-03	A4-P1-MGMT-03	M1	M1		A3	8	A3-P1-BTOR-01	A3-P1-BTOR-01	1/2/1	1s2
STORAGE-DATA	A5	20	A5-P1-HSS-05	A5-P1-HSS-05	S1	S1		A3	8	A3-P1-BTOR-01	A3-P1-BTOR-01	9/1/1	9s0	4x
IBSW-OOB	A3	43	A3-P1-IBLEAF-01	A3-P1-IBLEAF-01	bmc	bmc		A3	45	A3-P1-OOB-01	A3-P1-OOB-01	1	1
SW-OOB	A3	27	A3-P1-SPINE-01	A3-P1-SPINE-01	mgmt	mgmt		A3	45	A3-P1-OOB-01	A3-P1-OOB-01	9	9
UFM-OOB	A5	44	A5-P1-CUFM-01	A5-P1-CUFM-01	LOM3	LOM3		A3	45	A3-P1-OOB-01	A3-P1-OOB-01	23	23
NODE-OOB	A4	2	A4-P1-BCM-01	A4-P1-BCM-01	LOM2	LOM2		A3	45	A3-P1-OOB-01	A3-P1-OOB-01	30	30
PWR-OOB	A1	6	A1-P1-PWR-01	A1-P1-PWR-01	mgmt	mgmt		A3	46	A3-P1-OOB-02	A3-P1-OOB-02	1	1
UFM-DATA	A5	44	A5-P1-CUFM-01	A5-P1-CUFM-01	LOM1	LOM1		A4	45	A4-P1-FTOR-01	A4-P1-FTOR-01	1	1
STORAGE-OOB	A5	11	A5-P1-HSS-01	A5-P1-HSS-01	mgmt	mgmt		A5	41	A5-P1-OOB-01	A5-P1-OOB-01	1	1
EDGE-BTOR			EQX-EDGE-01	EQX-EDGE-01				A3	8	A3-P1-BTOR-01	A3-P1-BTOR-01	49/1/1	49s0	8x
SW-UPLINK	A3	14	A3-P1-TOR-01	A3-P1-TOR-01	53/1/1	53s0	2x	A4	42	A4-P1-SPINE-01	A4-P1-SPINE-01	1/1/1	1s0	2x
SW-UPLINK	A3	14	A3-P1-TOR-01	A3-P1-TOR-01	53/2/1	53s1		A4	42	A4-P1-SPINE-01	A4-P1-SPINE-01	1/2/1	1s1
INRACKDGX-DATA	A1	11	A1-P1-DGX-01-C01	A1-P1-DGX-01-C01	M1	M1		A3	14	A3-P1-TOR-01	A3-P1-TOR-01	1/1/1	1s0	4x
INRACKDGX-OOB	A2	12	A2-P1-DGX-02-C02	A2-P1-DGX-02-C02	BF1BMC	BF1BMC		A2	44		A2-P1-OOB-01	2	2
INRACKDGX-OOB	A1	12	A1-P1-DGX-01-C02	A1-P1-DGX-01-C02	BF1BMC	BF1BMC		A1	44		A1-P1-OOB-01	2	2
INRACKNVSW-OOB	A1	19	A1-P1-NVSW-01	A1-P1-NVSW-01	BMC	BMC		A1	45		A1-P1-OOB-02	9	9
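
A quick consistency check on a P2P CSV is to confirm that every row has the same number of fields as the header and to compare the FLOW values against Table 4. The sketch below is illustrative only: `summarize_flows` is a hypothetical helper, and the embedded data is the header plus four rows copied from the example above. How netautogen itself treats FLOW values that are not listed in Table 4 is not covered here.

```python
import csv
import io
from collections import Counter

# Flow names listed in Table 4.
KNOWN_FLOWS = {
    "NODE-OOB", "NODE-DATA", "DGX-OOB", "DGX-DATA", "NODE-NODE",
    "SW-OOB", "SW-UPLINK", "STORAGE-DATA", "STORAGE-OOB",
    "UFM-OOB", "UFM-DATA", "EDGE-SW", "INRACKDGX-OOB",
    "INRACKDGX-DATA", "INRACKNVSW-OOB", "PWR-OOB", "ACCESS-OOB",
}

# Header and four rows copied from the example P2P CSV.
P2P_CSV = """\
FLOW,FROM_RACK,FROM_RACKUNIT,CUSTOMER_SRC_NAME,FROM_NODE,FROM_PHYSICAL_PORT,FROM_PORT,FROM_BREAKOUT,TO_RACK,TO_RACKUNIT,CUSTOMER_DEST_NAME,TO_NODE,TO_PHYSICAL_PORT,TO_PORT,TO_BREAKOUT
NODE-DATA,A4,2,A4-P1-BCM-01,A4-P1-BCM-01,M1,M1,-,A3,8,A3-P1-BTOR-01,A3-P1-BTOR-01,1/1/1,1s0,4x
IBSW-OOB,A3,43,A3-P1-IBLEAF-01,A3-P1-IBLEAF-01,bmc,bmc,-,A3,45,A3-P1-OOB-01,A3-P1-OOB-01,1,1,-
EDGE-BTOR,-,-,EQX-EDGE-01,EQX-EDGE-01,-,-,-,A3,8,A3-P1-BTOR-01,A3-P1-BTOR-01,49/1/1,49s0,8x
SW-UPLINK,A3,14,A3-P1-TOR-01,A3-P1-TOR-01,53/1/1,53s0,2x,A4,42,A4-P1-SPINE-01,A4-P1-SPINE-01,1/1/1,1s0,2x
"""


def summarize_flows(csv_text):
    """Count rows per FLOW and list FLOW values that are not in KNOWN_FLOWS."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    flow_index = header.index("FLOW")
    counts = Counter()
    for line_no, row in enumerate(reader, start=2):
        # Every row must have exactly as many fields as the header.
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        counts[row[flow_index]] += 1
    unknown = sorted(set(counts) - KNOWN_FLOWS)
    return counts, unknown


counts, unknown = summarize_flows(P2P_CSV)
print(dict(counts))
print("not in Table 4:", unknown)
```

For the rows used here, `IBSW-OOB` and `EDGE-BTOR` are reported as not in Table 4. That may simply mean Table 4 is not exhaustive, so treat the result as a prompt to confirm the flow name rather than as an error.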

Section 2.1: GB200 Rack Inventory#

The following is an example of the column header in a CSV file exported from the Splunk database. The CSV file should contain the following column header:

Note

The following CSV information consists entirely of column headers; there is no data content provided.

"COMP_PN","COMP_SN","COMP_SN_DIRECT_NVPN","COMP_SN_DIRECT_NVSN","COMP_TYPE",
DATECODE,LOCATION,NVPN,NVSN,"SCOMP_PN","START_TIME",VENDOR,"comp_pn","comp_sn",
"comp_type","date_hour","date_mday","date_minute","date_month","date_second",
"date_wday","date_year","date_zone",eventtype,filename,host,index,linecount,
location,nvpn,nvsn,punct,"scomp_pn",source,sourcetype,"splunk_server",
"splunk_server_group","start_time",starttime,tag,"tag::eventtype",
"tag::sourcetype",vendor,"_raw","_time"