BCM Networking Setup#

This chapter describes the manual method for configuring the networks of a reference DGX SuperPOD. Customers who do not follow the reference architecture will need to deviate from these instructions. The automated method for networking setup uses the bcm-netautogen tool, which creates the necessary networks based on the DGX SuperPOD Ethernet RA.

Manual BCM Networking Setup#

For OEMs, the networks must be added to BCM manually. internalnet is defined during BCM head node installation but can be modified here. globalnet is also a predefined network; it is where the global network type is defined. For a more in-depth explanation of adding and configuring networks within BCM 11, consult Section 3.2, Network Settings, of the BCM 11 Administrator Manual. In general, the following network definitions must be added for a GB200 NVL72 cluster:

  • internalnet

  • dgxnet (only if a separate network subnet is being used to provision GB200 compute nodes)

  • ipminet

  • computenet

  • storagenet

  • failovernet (only used if a dedicated heartbeat RJ45 cable is connected between the head nodes)

For internalnet, dgxnet, and ipminet, set node booting to yes and management allowed to yes.

internalnet#

internalnet is used to provision the control plane nodes in a reference SuperPOD design. By default, both node booting and management allowed are set to yes, which means the network will hand out DHCP leases and can be assigned to a category. This should have been set up during the BCM software installation.

Reference: internalnet settings#

[a03-p1-head-01->network[internalnet]]% show

Parameter                   Value
--------------------------  ---------------------
Name                        internalnet
Private Cloud
Revision
Domain Name                 eth.cluster
Type                        Internal
MTU                         9000
Allow autosign              Automatic
Write DNS zone              both
Node booting                yes
Lock down dhcpd             no
Management allowed          yes
Search domain index         0
Exclude from search domain  no
Disable automatic exports   no
Base address                7.241.16.0
Broadcast address           7.241.16.255
Dynamic range start         7.241.16.249
Dynamic range end           7.241.16.254
Netmask bits                24
Gateway                     7.241.16.1
Gateway metric
Cloud Subnet ID
EC2AvailabilityZone
Layer3                      no
Notes                       <0B>
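The relationship between the base address, DHCP dynamic range, and gateway in the table above can be sanity-checked with a short script. This is purely an illustration using Python's ipaddress module, not part of BCM; the values are copied from the reference settings.

```python
import ipaddress

# internalnet values from the reference settings above
subnet = ipaddress.ip_network("7.241.16.0/24")
gateway = ipaddress.ip_address("7.241.16.1")
pool_start = ipaddress.ip_address("7.241.16.249")
pool_end = ipaddress.ip_address("7.241.16.254")

# The gateway and the DHCP pool must fall inside the subnet,
# and the pool must not include the broadcast address.
assert gateway in subnet
assert pool_start in subnet and pool_end in subnet
assert pool_end < subnet.broadcast_address

# Number of addresses the dynamic pool can hand out at once
pool_size = int(pool_end) - int(pool_start) + 1
print(pool_size)  # 6
```

A small pool suffices here because internalnet only provisions control plane nodes; each node is moved to its static IP once BCM identifies it.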

dgxnet#

dgxnet is a separate subnet that provisions DGX nodes. The DHCP pool defined here supplies the initial IPs handed out by the node installer until a node is identified by BCM and its defined configuration is applied. For a 1SU (8x GB200 rack) DGX SuperPOD, there should be two dgxnet subnets.

Example: dgxnet#

cmsh
network
add dgxnet
set mtu 9000
set domainname dgxnet.cluster
set nodebooting yes
set managementallowed yes
set baseaddress <dgxnet subnet>
set dynamicrangestart <dynamic range ip start value>
set dynamicrangeend <dynamic range ip end value>
set netmaskbits 24
set gateway <gateway ip value>
commit

Reference: dgxnet settings#

[a03-p1-head-01->network[dgxnet1]]% show

Parameter                   Value
--------------------------  ---------------------
Name                        dgxnet1
Private Cloud
Revision
Domain Name                 dgxnet1.cluster
Type                        Internal
MTU                         9000
Allow autosign              Automatic
Write DNS zone              both
Node booting                yes
Lock down dhcpd             no
Management allowed          yes
Search domain index         0
Exclude from search domain  no
Disable automatic exports   no
Base address                7.241.18.0
Broadcast address           7.241.18.127
Dynamic range start         7.241.18.100
Dynamic range end           7.241.18.126
Netmask bits                25
Gateway                     7.241.18.1
Gateway metric              0
Cloud Subnet ID
EC2AvailabilityZone
Layer3                      no
Notes                       <0B>
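Because dgxnet1 is a /25 rather than a /24, the broadcast address is .127 and the node-installer pool stops at .126. A short check of the reference values, using Python's ipaddress module purely as an illustration:

```python
import ipaddress

# dgxnet1 values from the reference settings above
dgxnet = ipaddress.ip_network("7.241.18.0/25")

print(dgxnet.broadcast_address)   # 7.241.18.127
print(dgxnet.num_addresses - 2)   # 126 usable host addresses

# Node-installer DHCP pool .100-.126, used before BCM identifies a node
pool = [ip for ip in dgxnet.hosts()
        if 100 <= int(ip) - int(dgxnet.network_address) <= 126]
print(len(pool))  # 27
```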

ipminet#

For DGX SuperPOD, the RA defines several ipminet networks, which control OOB access to the control plane nodes, the PDUs, the InfiniBand switches, and the GB200 racks.

OOB subnet overview:

/16 - /21: The subnet will be divided into blocks of /24s (default).

  • The first /24 will be allocated for Control Plane nodes.

  • The second /24 will be allocated for PDU, PWR, and InfiniBand switches.

  • The remaining /23 blocks will be distributed across four GB200 rack groups, which include:

      • 4 x GB200 compute racks (DGX compute nodes).
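The default carve-up described above can be sketched with Python's ipaddress module. A /21 parent block is assumed here for illustration (the ipminet0 base in the reference table is 7.241.0.0/24, so a 7.241.0.0/21 parent is a plausible but assumed allocation); larger parents simply yield more /23 rack-group blocks.

```python
import ipaddress

# Assumed OOB parent allocation; the same logic applies to any /16-/21 block
oob = ipaddress.ip_network("7.241.0.0/21")
blocks = list(oob.subnets(new_prefix=24))  # eight /24 blocks for a /21

control_plane = blocks[0]   # first /24: control plane nodes
pdu_ib = blocks[1]          # second /24: PDU, PWR, and InfiniBand switches

# Remaining space is grouped into /23 blocks, one per GB200 rack group
remaining = blocks[2:]
rack_groups = [ipaddress.ip_network(f"{remaining[i].network_address}/23")
               for i in range(0, len(remaining), 2)]

print(control_plane)                  # 7.241.0.0/24
print(pdu_ib)                         # 7.241.1.0/24
print([str(g) for g in rack_groups])  # /23 blocks for the rack groups
```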

Repeat the following example for each IPMI network.

Example: ipminet0#

cmsh
network
add ipminet0
set mtu 9000
set domainname ipminet0.cluster
set nodebooting yes
set managementallowed yes
set baseaddress <ipminet0 base address>
set dynamicrangestart <ipminet0 dynamic range start>
set dynamicrangeend <ipminet0 dynamic range end>
set netmaskbits 24
set gateway <ipminet0 gateway>
commit

Reference: ipminet0 settings#

[a03-p1-head-01->network[ipminet0]]% show
Parameter                Value
-----------------------  ----------------------
Name                     ipminet0
Private Cloud
Revision
Domain Name              ipminet0.cluster
Type                     Internal
MTU                      9000
Allow autosign           Automatic
Write DNS zone           both
Node booting             yes
Lock down dhcpd          no
Management allowed       yes
Search domain index      0
Exclude from search domain  no
Disable automatic exports   no
Base address             7.241.0.0
Broadcast address        7.241.0.255
Dynamic range start      7.241.0.150
Dynamic range end        7.241.0.254
Netmask bits             24
Gateway                  7.241.0.1
Gateway metric           10
Cloud Subnet ID
EC2AvailabilityZone
Layer3                   no
Notes                    <0B>

computenet#

computenet is a non-routable subnet used for the East-West (InfiniBand) compute fabric configuration.

Example: computenet#

cmsh
network
add computenet
set mtu 4096
set domainname computenet.cluster
set nodebooting no
set lockdowndhcpd no
set managementallowed no
set baseaddress 100.126.0.0
set dynamicrangestart 0.0.0.0
set dynamicrangeend 0.0.0.0
set netmaskbits 16
set gateway 0.0.0.0
commit

Reference: computenet settings#

[a03-p1-head-01->network[computenet]]% show

Parameter                   Value
--------------------------  ---------------------
Name                        computenet
Private Cloud
Revision
Domain Name                 computenet.cluster
Type                        Internal
MTU                         4096
Allow autosign              Automatic
Write DNS zone              both
Node booting                no
Lock down dhcpd             no
Management allowed          no
Search domain index         0
Exclude from search domain  no
Disable automatic exports   no
Base address                100.126.0.0
Broadcast address           100.126.255.255
Dynamic range start         0.0.0.0
Dynamic range end           0.0.0.0
Netmask bits                16
Gateway                     0.0.0.0
Gateway metric              0
Cloud Subnet ID
EC2AvailabilityZone
Layer3                      no
Notes                       <0B>
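"Non-routable" here reflects that 100.126.0.0/16 sits inside the 100.64.0.0/10 shared address space (RFC 6598), which is never routed on the public internet. A quick illustrative check with Python's ipaddress module:

```python
import ipaddress

computenet = ipaddress.ip_network("100.126.0.0/16")
shared = ipaddress.ip_network("100.64.0.0/10")  # RFC 6598 shared address space

print(computenet.subnet_of(shared))   # True: inside the non-routable block
print(computenet.num_addresses - 2)   # 65534 usable host addresses
```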

storagenet#

storagenet is a subnet used for the converged Ethernet fabric through the second port on each BlueField-3 (BF3) NIC in every GB200 compute tray.

Example: storagenet#

cmsh
network
add storagenet
set mtu 9000
set domainname storagenet.cluster
set nodebooting no
set lockdowndhcpd no
set managementallowed no
set baseaddress 100.127.0.0
set dynamicrangestart 0.0.0.0
set dynamicrangeend 0.0.0.0
set netmaskbits 16
set gateway 0.0.0.0
commit

Reference: storagenet settings#

[a03-p1-head-01->network[storagenet]]% show
Parameter                   Value
--------------------------  ---------------------
Name                        storagenet
Private Cloud
Revision
Domain Name                 storagenet.cluster
Type                        Internal
MTU                         9000
Allow autosign              Automatic
Write DNS zone              both
Node booting                no
Lock down dhcpd             no
Management allowed          no
Search domain index         0
Exclude from search domain  no
Disable automatic exports   no
Base address                100.127.0.0
Broadcast address           100.127.255.255
Dynamic range start         0.0.0.0
Dynamic range end           0.0.0.0
Netmask bits                16
Gateway                     0.0.0.0
Gateway metric              0
Cloud Subnet ID
EC2AvailabilityZone
Layer3                      yes
Layer3 route                none
Layer3 ecmp                 no
Layer3 split static route   no
Notes                       <0B>

failovernet#

failovernet is a generic network set up for high availability (HA), with a direct connection between the head nodes. It is a simple network and is configured during the HA setup. Do not add this network manually; its details are listed here for reference.

Example: Headnode failovernet IPs for HA setup#

#headnode 1
physical enP2s2f0 10.151.0.1 failovernet always

#headnode 2
physical enP2s2f0 10.151.0.2 failovernet always
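As a quick sanity check, the two head node addresses above should be distinct and both fall inside the failovernet subnet; the script below is purely illustrative, using the values from the HA example.

```python
import ipaddress

# failovernet values from the HA example above
failovernet = ipaddress.ip_network("10.151.0.0/16")
head1 = ipaddress.ip_address("10.151.0.1")
head2 = ipaddress.ip_address("10.151.0.2")

# Both head nodes must sit in failovernet and use distinct addresses
assert head1 in failovernet and head2 in failovernet
assert head1 != head2
print("failovernet addressing OK")
```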

Reference: failovernet settings#

[a03-p1-head-01->network[failovernet]]% show
Parameter                   Value
--------------------------  ---------------------
Name                        failovernet
Private Cloud
Revision
Domain Name                 failover.cluster
Type                        Internal
MTU                         1500
Allow autosign              Automatic
Write DNS zone              both
Node booting                no
Lock down dhcpd             no
Management allowed          no
Search domain index         0
Exclude from search domain  no
Disable automatic exports   no
Base address                10.151.0.0
Broadcast address           10.151.255.255
Dynamic range start         0.0.0.0
Dynamic range end           0.0.0.0
Netmask bits                16
Gateway                     0.0.0.0
Gateway metric              0
Cloud Subnet ID
EC2AvailabilityZone
Layer3                      no
Notes                       <0B>

globalnet#

globalnet is an automatic network and is present by default; no configuration is required. The administrator can change the network type (type1, type2, or type3) in the globalnet settings.

Reference: globalnet settings#

[a03-p1-head-01->network[globalnet]]% show
Parameter                   Value
--------------------------  ---------------------
Name                        globalnet
Private Cloud
Revision                    type3
Domain Name                 cm.cluster
Type                        Global
MTU                         1500
Allow autosign              Automatic
Write DNS zone              both
Node booting                no
Lock down dhcpd             no
Management allowed          no
Search domain index         0
Exclude from search domain  no
Disable automatic exports   no
Base address                0.0.0.0
Broadcast address           255.255.255.255
Dynamic range start         0.0.0.0
Dynamic range end           0.0.0.0
Netmask bits                0
Gateway                     0.0.0.0
Gateway metric              0
Cloud Subnet ID
EC2AvailabilityZone
Layer3                      no
Notes                       <0B>

bcm-netautogen#

Note

For DGX SuperPOD, the bcm-netautogen tool has been re-architected for the GB200 generation and beyond. It generates the configuration for the TOR switches:

  • 200G in-band network (SN5600).

  • Control plane rack OOB switches as well as per-rack OOB switches (SN2201).

bcm-netautogen also imports the following into BCM:

  • The networks that are used, along with their CIDR information.

  • The DGX GB200 node configurations within BCM, including all network interface cards (NICs) with the appropriate assigned networks and IPs.

  • NVLink Switch configuration and setup.

Required information:

The three input files for bcm-netautogen are:

  • p2p_ethernet.csv, which contains installation-site point-to-point information for the Ethernet fabric.

  • GB200 rack inventory files, in .csv format, containing MAC addresses for all NICs on every device in the GB200 rack.

  • Site information pertaining to the prefixes and BGP ASNs (required for generating the IP plan), placed into a site-info.yaml file.
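Before feeding an inventory file to bcm-netautogen, it can be useful to verify that every MAC address in it is well formed. The column names below are hypothetical placeholders, not the actual schema of the NVIDIA-provided inventory files; only the MAC-validation pattern carries over.

```python
import csv
import io
import re

# Hypothetical rack-inventory rows; real column names come from the
# NVIDIA-provided inventory files, these are placeholders for illustration.
inventory_csv = io.StringIO(
    "device,interface,mac\n"
    "compute-tray-01,bf3-p0,94:6d:ae:00:00:01\n"
    "compute-tray-01,bf3-p1,94:6d:ae:00:00:02\n"
)

# Colon-separated lowercase MAC address, e.g. 94:6d:ae:00:00:01
MAC_RE = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$")

rows = list(csv.DictReader(inventory_csv))
bad = [r for r in rows if not MAC_RE.match(r["mac"].lower())]
print(len(rows), len(bad))  # 2 0
```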

For DGX SuperPOD configurations, the project manager will receive the serial numbers for each DGX GB200 rack. Once a rack arrives on site, the deployment engineer will confirm the serial number on the rack and assign it to a rack location based on the rack elevation.