NVIDIA DGX SuperPOD and BasePOD with DGX B200 Systems Deployment Guide with NVIDIA Mission Control 2.0#

Introduction#

This document provides the steps for deploying NVIDIA DGX SuperPOD and BasePOD with DGX B200 systems and NVIDIA Mission Control.

NVIDIA Mission Control 2.0 for DGX B200 includes Base Command Manager (BCM) 11.25.08 and NVIDIA Run:ai functionality as part of an integrated software delivery for configuration, validation, and cluster operations. This release introduces integrated B200 DGXOS and automated B200 setup through the bcm-pod-setup utility for faster provisioning. Run:ai and Slurm are deployed via the BCM wizard, enabling topology-aware, HPC-style GPU workload scheduling across SuperPOD and BasePOD environments.

Note

Direct upgrade from NMC 1.1 to 2.0 is NOT supported for DGX B200 systems. A full redeployment is required for any upgrade.

Hardware Overview#

The following reference architectures detail the hardware and software requirements for the SuperPOD and BasePOD.

DGX B200 Ports and connectivity options#

This section covers the DGX B200 system network ports and provides an overview of the networks used in this deployment guide.

Figure 1 shows the physical layout of the back of the DGX B200 system.

_images/image1.png

Figure 1 Physical layout of the back of the DGX B200 system#

Figure 2 shows how the DGX B200 network ports are used in this deployment guide.

_images/image3.png

Figure 2 DGX B200 network ports used in this deployment guide#


The following ports are selected for DGX BasePOD networking:

  • Eight ports in four OSFP connections are used for the InfiniBand compute fabric

  • Each pair of dual-port NVIDIA BlueField-3 HCAs (NIC mode) provides parallel pathways to the storage and management fabrics.

  • Optionally, one port of the dual-port BlueField-3 HCAs (IB mode) provides access to the InfiniBand storage fabric.

  • BMC network access is provided through the out-of-band network

  • The networking ports and their mapping are described in detail in the Network Ports section of the NVIDIA DGX B200 System User Guide.

Ethernet Management (North-South) Network Configuration#

The configuration of the Ethernet Management or North-South network can be performed either manually or automatically, depending on the deployment’s scale. The subsequent sections provide instructions for both approaches.

North/South Ethernet Network - Automated Deployment#

BCM can automate network switch provisioning and fabric configuration using bcm-netautogen. For large-scale deployments, contact your NVIDIA PTAM for NVIS engagement and assistance.

Manual Network Configuration#

Manual deployment is suggested for smaller deployments like BasePOD.

SN4600C – North-South Management Ethernet Switches#

The SN4600C fabric provides connectivity for in-band management and provisioning of the nodes. The key configuration requirements are:

  • MLAG between the two SN4600C switches

  • L3 SVIs with VRRP for all POD Ethernet networks.

  • Each headnode, K8s node, and DGX is dual-homed to the SN4600C switches via a bond interface.

  • External connectivity to the customer network, using customer-specified routing arrangements such as BGP, static routes, or other dynamic routing protocols.

  • A link to the IPMI network so that BCM can access node BMCs, either directly or indirectly via the customer network.

SN4600C-1 reference configuration#

# Basic management configuration
nv set system hostname 4600C-1
#
# Create SVIs for Internal/Management  Network with VRRP as FHRP
nv set bridge domain br_default vlan 102
nv set interface vlan102 type svi
nv set interface vlan102 ip vrr mac-address 00:00:5E:00:01:01
nv set interface vlan102 ip vrr address 10.150.124.1/24
nv set interface vlan102 ip address 10.150.124.2/24
nv set interface vlan102 ip vrr state up
# Repeat the same for other SVI interfaces
# Configure MLAG
# Define inter-chassis peerlink etherchannel/bond
nv set interface peerlink bond member swp63,swp64
nv set interface peerlink type peerlink
#
# Loopback for BGP/MLAG backup routing
nv set interface lo ip address 10.160.254.22/32
#
# Configure Peerlink L3 parameters
nv set interface peerlink.4094 base-interface peerlink
nv set interface peerlink.4094 type sub
nv set interface peerlink.4094 vlan 4094
nv set mlag backup 10.160.254.23
nv set mlag enable on
nv set mlag mac-address 44:38:39:ff:00:02
nv set mlag peer-ip linklocal
# MLAG Primary
nv set mlag priority 2048
# Example port configuration for head nodes (BCM, Kube)
# BCM Head Nodes
nv set interface bond1 bond member swp1
nv set interface bond1 description "BCM Headnode 1"
nv set interface bond1 bond mlag id 1
nv set interface bond1 bridge domain br_default access 102
nv set interface bond1 bond mlag enable on
nv set interface bond1 bond lacp-bypass on
# Repeat for other management/workloads/compute nodes
#
# Uplink to the customer network.
# Example configuration with BGP unnumbered
nv set router bgp autonomous-system 4200004001
nv set router bgp enable on
nv set router bgp router-id 10.160.254.22
nv set vrf default router bgp address-family ipv4-unicast enable on
nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
nv set vrf default router bgp enable on
# Uplinks via swp50
nv set vrf default router bgp neighbor swp50 type unnumbered
# Peering to MLAG peer switch
nv set vrf default router bgp neighbor peerlink.4094 remote-as internal
nv set vrf default router bgp neighbor peerlink.4094 type unnumbered

SN4600C-2 reference configuration - Same as SN4600C-1, with the following changes#

# Basic management configuration
nv set system hostname 4600C-2
#
# Create SVIs for Internal/Management Network with VRRP as FHRP
nv set bridge domain br_default vlan 102
nv set interface vlan102 type svi
nv set interface vlan102 ip vrr mac-address 00:00:5E:00:01:01
nv set interface vlan102 ip vrr address 10.150.124.1/24
nv set interface vlan102 ip address 10.150.124.3/24
nv set interface vlan102 ip vrr state up
# follow the same for other SVIs
#
# Configure MLAG
# Define inter-chassis peerlink etherchannel/bond
#
# Loopback for BGP/MLAG backup routing
nv set interface lo ip address 10.160.254.23/32
#
# Configure Peerlink L3 parameters
nv set mlag backup 10.160.254.22
nv set mlag mac-address 44:38:39:ff:00:02
# MLAG Secondary
nv set mlag priority 4096
#
# Example port configuration for head nodes (BCM, Kube)
# same as 4600-1
#
# Uplink to the customer network.
# Same as 4600-1

You can verify the MLAG status using the following command:

root@mgmt-net-leaf-1:mgmt:/home/cumulus# clagctl
The peer is alive
    Our Priority, ID, and Role: 2048 9c:05:91:dd:cc:28 primary
    Peer Priority, ID, and Role: 2048 9c:05:91:f1:73:28 secondary
        Peer Interface and IP: peerlink.4094 fe80::9e05:91ff:fef1:7328 (linklocal)
                    Backup IP: 10.160.254.23 vrf mgmt (inactive)
                    System MAC: 44:38:39:ff:0a:00

CLAG Interfaces
Our Interface      Peer Interface     CLAG Id   Conflicts              Proto-Down Reason
----------------   ----------------   -------   --------------------   -----------------
        bond1    -                       1        -                          -
        bond10   -                      10        -                          -
        bond11   -                      11        -                          -
        bond12   -                      12        -                          -
        bond13   -                      13        -                          -
        bond14   -                      14        -                          -

For troubleshooting, you can use the consistency check command. Here is an example output from a working MLAG pair.

cumulus@mgmt-net-leaf-2:mgmt:~$ nv show mlag consistency-checker global
Parameter               LocalValue                 PeerValue                  Conflict  Summary
----------------------  -------------------------  -------------------------  --------  -------
anycast-ip              -                          -                          -
bridge-priority         32768                      32768                      -
bridge-stp-mode         rstp                       rstp                       -
bridge-stp-state        on                         on                         -
bridge-type             vlan-aware                 vlan-aware                 -
clag-pkg-version        1.6.0-cl5.11.0u2           1.6.0-cl5.11.0u2           -
clag-protocol-version   1.7.0                      1.7.0                      -
peer-ip                 fe80::9e05:91ff:fedd:cc28  fe80::9e05:91ff:fedd:cc28  -
peerlink-bridge-member  Yes                        Yes                        -
peerlink-mtu            9216                       9216                       -
peerlink-native-vlan    1                          1                          -
peerlink-vlans          1, 100->102                1, 100->102                -
redirect2-enable        yes                        yes                        -
system-mac              44:38:39:ff:0a:00          44:38:39:ff:0a:00          -

SN2201 – IPMI Switch for Out-of-Band Management#

All BMCs are in the same subnet; configure all switch ports connected to the BMCs under the same VLAN. This IPMI network must be accessible from the management Ethernet network so that the BCM headnodes can control the BMCs. In this example, the IPMI network is routed via the management Ethernet SN4600C switches. It is recommended to add an additional uplink to the customer's OOB network.

Example Configuration for the SN2201 switch.

nv set system hostname IPMI-SW
#<Basic management configuration>
#
# Set the VLAN for BMC ports. Adjust according to the customer specification
nv set bridge domain br_default vlan 101
#
# Enable the BMC Ports to the Access VLAN
#
nv set interface swp1-48 bridge domain br_default
nv set bridge domain br_default untagged 1
nv set interface swp1-48
nv set interface swp1-48 link state up
nv set interface swp1-48 description "BMC Ports"
nv set interface swp1-48 bridge domain br_default access 101
#
# Uplink to customer OOB/IPMI Network
# In this example the uplink is a layer 2 trunk with etherchannel/bond.
# Adjust according to the customer specification for Uplink
nv set interface swp49-50 link state up
nv set interface bond1 bond member swp49,swp50
nv set interface bond1 bridge domain br_default untagged 1
nv set interface bond1 bridge domain br_default vlan all

Refer to the appendix for complete switch configuration.

Once the SN2201 switches have been successfully configured, verify that all devices' out-of-band management interfaces are reachable from the network (that is, make sure you can access the BMC/iLO/iDRAC of all nodes).
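
A minimal reachability check is sketched below, run from any host that can reach the IPMI network; the 10.150.123.0/24 range and the credentials are placeholders based on this guide's examples and must be adjusted to the site survey.

# Sweep the example IPMI subnet for reachable BMCs (adjust the range to the site survey)
for ip in 10.150.123.{11..28}; do
    ping -c 1 -W 1 ${ip} >/dev/null && echo "${ip} reachable" || echo "${ip} UNREACHABLE"
done
# Optionally confirm IPMI access to one BMC (requires ipmitool and valid BMC credentials)
ipmitool -I lanplus -H 10.150.123.25 -U <BMC User> -P <BMC Pass> chassis status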

InfiniBand Compute Fabric#

The InfiniBand compute fabric offers two deployment options: a standalone configuration in which the Subnet Manager runs on the switches, or a centrally managed Subnet Manager using UFM, which is the recommended option for scaled deployments. The following sections provide instructions for both deployment choices.

Manual InfiniBand Compute Fabric Configuration#

To initially configure the QM9700 switches in the Compute or Storage switch stacks, console serial port connectivity is essential. This can be established either remotely through a serial concentrator or by a physical connection to the switch’s serial port. After confirming connectivity, power on all Compute and Storage switches.

QM9700 – OSFP IB Switches#

We recommend configuring the InfiniBand switches with subnet manager HA enabled. Example configuration:

QM-9700-1

ib sm
ib sm virt enable
ib smnode 9700-1 create
ib smnode 9700-1 enable
ib smnode 9700-1 sm-priority 15
ib ha infiniband-default ip <HA VIP> <mask>

QM-9700-2

ib sm virt enable
ib smnode 9700-1 create
ib smnode 9700-1 enable
ib smnode 9700-1 sm-priority 15

Verify the IB SM HA status using the following command:

QM9700-1[infiniband-default: master] # show ib smnodes

HA state of switch infiniband-default:
IB Subnet HA name: infiniband-default
HA IP address    : 10.185.230.247/22
Active HA nodes  : 2

HA node local information:
    Name       : 9700-2 (active)
    SM-HA state: standby
    SM Running : stopped
    SM Enabled : disabled
    SM Priority: 0
    IP         : 10.185.230.243

HA node local information:
    Name       : 9700-1 (active)  <--- (local node)
    SM-HA state: master
    SM Running : running
    SM Enabled : enabled - master
    SM Priority: 15
    IP         : 10.185.231.43

InfiniBand/Ethernet Storage Fabric Specific Configurations#

A DGX SuperPOD/BasePOD typically also includes dedicated storage, but the configuration is outside the scope of this document. Contact the vendor of the storage solution being used for instructions on configuring the high-performance storage portions of a SuperPOD/DGX BasePOD.

InfiniBand UFM Deployment#

Large-scale deployments of the IB compute fabric with UFM require NVIS engagement; contact your NVIDIA PTAM for further assistance.

Base Command Manager Headnodes Installation#

Download the Base Command Manager (BCM) ISO#

Download the latest BCM 11.x ISO image from the BCM website with the following options.

_images/image4.png

DGX B200 BIOS Config and Network Interface Boot Order#

The DGX B200 appliance BIOS typically ships from the factory configured to PXE boot using the two primary in-band interfaces. For detailed steps on modifying the boot order to utilize these interfaces for PXE booting, consult the DGX B200 System User Guide.

Control Nodes BIOS, OOB/BMC configuration#

Before installing BCM, configure the BIOS boot sequence and out-of-band management interface (such as BMC, IPMI) for all cluster nodes. Once configured, verify that all out-of-band management interfaces are reachable within the cluster network and that the BMC/iLO/iDRAC UI can be accessed.

On the control nodes, configure the interfaces connected to the management network to operate in Ethernet Mode (instead of IB). Additionally, enable PXE boot for all control nodes except the primary and secondary head nodes.

Example: Configuring BIOS and Network Interface Boot order on Dell appliance with iDRAC9:#

Power up the appliance, interrupt the boot cycle to enter the Boot Manager, and select “Launch System Setup”.

_images/image5.png

Select “Device Settings”.

_images/image6.png

Select the card that needs the mode changed from InfiniBand (IB Mode) to Ethernet (ETH Mode).

_images/image7.png

Change the “Network Link Type” from “Infiniband” to “Ethernet” and select “Finish”.

_images/image8.png

On the confirmation message, click OK.

_images/image9.png

After confirming the CX card ports are in the correct mode, proceed to enable PXE boot.

Return to the “System Setup” screen and select “System BIOS”.

_images/image10.png

Select “Network Settings”.

_images/image11.png

Enable PXE Boot for the two primary inband network interfaces.

_images/image12.png _images/image13.png

To confirm the changes, click “Back” to return to the “System BIOS” screen. A “Warning - Save Changes” prompt will appear; select “Yes.” Then, click “Finish” to go back to the System Setup Main Menu.

_images/image14.png

Select “Finish”, and on the “Warning - Confirm Exit” prompt select “Yes” to confirm the appliance reboot.

_images/image15.png

Ensure the PXE-enabled NICs are configured as the primary boot device.

Other Branded Appliances#

These settings are usually configured through the system’s BIOS or the appliance’s out-of-band management interface; refer to the vendor’s BMC/BIOS manual for specific instructions. Alternatively, you can adjust the port mode (IB to Ethernet) by temporarily booting the appliance into a Linux environment and utilizing the NVIDIA Firmware Tools.

Note

When using NVIDIA Firmware Tools, the device path in the example commands must be updated to reflect the correct interface. Do not execute the provided examples “as is” on a production system. Refer to the NVIDIA Firmware Tools documentation for detailed instructions on proper usage.

mstconfig -d 04:00.0 set LINK_TYPE_P1=2
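
A hedged sketch of the typical firmware-tools workflow is shown below: query the current link type first, then set both ports to Ethernet (2 = ETH, 1 = IB). The PCI address is only a placeholder, and the appliance must be rebooted for the change to take effect.

# Query the current port configuration of the target device (placeholder PCI address)
mstconfig -d 04:00.0 query | grep LINK_TYPE
# Set both ports to Ethernet mode, then reboot the appliance
mstconfig -d 04:00.0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2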

RAID/Storage Configuration#

If available, configure the hardware RAID controller and disks to minimum RAID level 1 using the appliance’s BMC or BIOS. The procedure varies depending on the appliance vendor and RAID controller. Refer to the specific vendor documentation for the configuration procedure.

BCM Installation#

Headnode preparation#

Boot the headnode from the BCM installer ISO, which can be connected via USB, CD-ROM, or Virtual KVM. The specific steps for this process will differ based on the headnode appliance vendor.

Example BCM install with Dell appliances/iDRAC9#

Boot the appliance into the Lifecycle Controller and select “OS Deployment” on the left side of the screen and then click “Deploy OS”.

_images/image16.png

Select “Go Directly to OS Deployment” then click “Next”.

_images/image17.png

Set the Boot Mode to UEFI. Ensure Secure Boot is Disabled, the Secure Boot Policy is Standard, and “Any Other Operating System” is selected for the Available Operating System. Then, click “Next” to continue.

_images/image18.png

Select the option for “Manual Install” and click “Next”.

_images/image19.png

Proceed to choose the appropriate media containing the BCM installation ISO and then select “Next”.

_images/image20.png

Confirm the selected options and select “Finish”.

_images/image21.png

The appliance will proceed to boot as normal.

Other Appliances#

Attach the BCM installation media to the designated headnode appliance. Power on the appliance and proceed to boot from the BCM installation media.

Booting the Base Command Manager Graphical Installer#

The BCM installer GUI will launch once the installation media has successfully loaded.

_images/image22.png

On the Base Command Manager Installer Splash Screen select “Start Installation”.

_images/image23.png

If you agree to the governing terms, select “I Agree” and then “Next.”

_images/image24.png

If you agree to the software licensing agreement, select “I Agree” and then “Next”.

_images/image25.png

On the “Kernel Modules” page select “Next”.

_images/image26.png

First, verify that the interface names of the two CX-7 OSFP interfaces configured in Ethernet mode (e.g., enp138s0np0 and enp47s0np0) match the site survey. These names are needed in subsequent installation and cluster bring-up steps. Once documented, proceed by clicking “Next”.

_images/image27.png

Select the appropriate install media source:

_images/image28.png

Populate the “Cluster settings” page with the requested details from the site survey.

_images/image29.png

Select “None” as the HPC workload manager, then click “Next”.

_images/image30.png

Select “Type2” as the Network Topology, then click “Next”.

_images/image31.png

In this deployment example, we are using a Type 2 network. More information on the different network types can be found in the BCM Installation Manual. On the Head Node Settings page, populate all requested values.

_images/image32.png

On the Compute Nodes page, accept the default values and click “Next.”

_images/image33.png

Under BMC configuration select “No” for both Head Node & Compute Nodes, then select “Next”.

_images/image34.png

Proceed to populate the Networks page as per site survey.

Note

Only populate internalnet; the other networks used in the DGX cluster will be defined during cluster bring-up.

_images/image35.png

Next, populate the headnode interface IPs.

_images/image36.png

Set the network offsets to 0.0.0.4 and select “Next”.

_images/image37.png

Choose the installation drive. It is recommended to install BCM on a RAID1 compliant drive. For Dell appliances, select the “Dell BOSS-N1” drive. In other cases where only software RAID is available, select a minimum of two drives, then click “Next”.

RAID controllers can be configured using the appliance’s BMC or vendor-provided software.

_images/image38.png

On the Disk Layout Settings page, select “One Big Partition” for the head node and compute node disk layouts.

_images/image39.png

Leave CUDA unchecked.

_images/image40.png

Do not select the CUDA option; the CUDA packages from the DGX OS image will be used. Once you have reviewed the configuration, you can either select “Back” to make corrections on the association configuration page or click “Start” to proceed with the installation.

_images/image41.png

Monitor the progress of the install; a typical installation takes approximately 15-30 minutes.

_images/image42.png

On successful installation, the node will briefly show the screen below before automatically rebooting (if selected).

_images/image43.png

First time booting into BCM#

After the headnode finishes rebooting from the installation environment, SSH to it and update it to the newest available packages.

root@clio-headnode-01:~# apt update && apt upgrade -y

Select the option to keep the local version in the dialog box shown below.

_images/image44.png

On successful update, reboot the headnode.

Activate the BCM Cluster License - Online#

SSH to the headnode and activate the cluster license with the request-license command:

root@clio-headnode-01:~# request-license
Product Key (XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX):000000-...-000000
Country Name (2 letter code): US
State or Province Name (full name): California
Locality Name (e.g. city): Santa Clara
Organization Name (e.g. company): NVIDIA
Organizational Unit Name (e.g. department): DGX Technical Marketing
Cluster Name: Clio
Private key data saved to /cm/local/apps/cmd/etc/cluster.key.new

Warning: Permanently added 'clio-headnode-01' (ED25519) to the list of known hosts.
MAC Address of primary head node (clio-headnode-01) for ens3f1np1 [08:C0:00:00:00:00]: 00:00:00:00:00:0F
Will this cluster use a high-availability setup with 2 head nodes? [y/N] y
MAC Address of secondary head node for eth0 [XX:XX:XX:XX:XX:XX]: 00:00:00:00:00:FF

Certificate request data saved to /cm/local/apps/cmd/etc/cluster.csr.new
Submit certificate request to http://licensing.brightcomputing.com/licensing/index.cgi ? [Y/n] Y

Contacting http://licensing.brightcomputing.com/licensing/index.cgi...

License granted.
License data was saved to /cm/local/apps/cmd/etc/cluster.pem.new
Install license? [Y/n] Y
========= Certificate Information ========
Version:                    10
Edition:                    Advanced
OEM:                        NVIDIA
Common name:                CLIO Cluster
Organization:               NVIDIA
Organizational unit:        DGX Technical Marketing
Locality:                   Santa Clara
State:                      California
Country:                    US
Serial:                     2981953
Starting date:              07/Apr/2024
Expiration date:            31/Mar/2030
MAC address / Cloud ID:     16:49:F4:74:C0:54|0E:54:0A:9F:3A:10
Licensed tokens:            512
Accounting & Reporting:     Yes
Allow edge sites:           Yes
License type:               Commercial
==========================================


Is the license information correct ? [Y/n] Y
Backup directory of old license: /var/spool/cmd/backup/certificates/2025-03-31_15.31.58
Installed new license
Revoke all existing cmd certificates

Waiting for CMDaemon to stop: OK

Installing admin certificates

Waiting for CMDaemon to start: OK
mysql: [Warning] Using a password on the command line interface can be insecure.
Copy cluster certificate to 3 images / node-installers
Copy cluster certificate to /cm/images/default-image//cm/local/apps/cmd/etc/cluster.pem
Copy cluster certificate to /cm/node-installer//cm/local/apps/cmd/etc/cluster.pem


Regenerating certificates for users
New license was installed. In order to allow compute nodes to obtain a new
node certificate, all compute nodes must be rebooted.

Please issue the following command to reboot all compute nodes:
    pdsh -g computenode reboot

Activate the BCM Cluster License - Offline/Airgapped installations#

Install the license using the following steps for air-gapped environments.

root@clio-headnode-01:~# request-license
Product Key (XXXXXX-XXXXXX-XXXXXX-XXXXXX-XXXXXX):000000-...-000000
Country Name (2 letter code): US
State or Province Name (full name): CA
Locality Name (e.g. city): Santa Clara
Organization Name (e.g. company): NVIDIA
Organizational Unit Name (e.g. department): NSV Mission Control
Cluster Name: Viking
Private key data saved to /cm/local/apps/cmd/etc/cluster.key.new

MAC Address of primary head node (clio-headnode-01) for ens1np0 [<MAC:MAC>]: <MAC:MAC>
Will this cluster use a high-availability setup with 2 head nodes? [y/N] y
MAC Address of secondary head node for eth0 [XX:XX:XX:XX:XX:XX]: <MAC:MAC>

Certificate request data saved to /cm/local/apps/cmd/etc/cluster.csr.new
Submit certificate request to http://licensing.brightcomputing.com/licensing/index.cgi ? [Y/n] N

Please use:
    http://licensing.brightcomputing.com/licensing/index.cgi
to obtain a license, or submit a support request through the Bright Computing website.

Contents of /cm/local/apps/cmd/etc/cluster.csr.new:

-----BEGIN CERTIFICATE REQUEST-----
…
…
…
-----END CERTIFICATE REQUEST-----

Go to the licensing portal to generate the license file using the CSR generated above.

_images/image45.png

Copy the contents of the generated license to a file and install it with the install-license <file name> command:

root@clio-headnode-01:~# vi license-file
# Paste the generated license contents #
root@clio-headnode-01:~#   install-license license-file
mysql: [Warning] Using a password on the command line interface can be insecure.
========= Certificate Information ========
Version:                    10
Edition:                    NVIDIA Mission Control
OEM:                        NVIDIA
Common name:                Viking
Organization:               NVIDIA
Organizational unit:        NSV Mission Control
Locality:                   Santa Clara
State:                      CA
Country:                    US
Serial:                     <>
Starting date:              08/Feb/2025
Expiration date:            23/Oct/2025
MAC address / Cloud ID:     88:E9:A4:20:18:9C|88:E9:A4:20:18:1C
Licensed tokens:            8192
Accounting & Reporting:     Yes
Allow edge sites:           Yes
License type:               Free
==========================================

Is the license information correct ? [Y/n] Y
Backup directory of old license: /var/spool/cmd/backup/certificates/2025-07-23_15.34.30
Installed new license
Revoke all existing cmd certificates

Waiting for CMDaemon to stop: OK

Installing admin certificates

Waiting for CMDaemon to start: OK
mysql: [Warning] Using a password on the command line interface can be insecure.
Copy cluster certificate to 3 images / node-installers
Copy cluster certificate to /cm/node-installer//cm/local/apps/cmd/etc/cluster.pem
Copy cluster certificate to /cm/images/dgx-image//cm/local/apps/cmd/etc/cluster.pem
Copy cluster certificate to /cm/images/default-image//cm/local/apps/cmd/etc/cluster.pem
mysql: [Warning] Using a password on the command line interface can be insecure.

Regenerating certificates for users

New license was installed. In order to allow compute nodes to obtain a new
node certificate, all compute nodes must be rebooted.

Please issue the following command to reboot all compute nodes:
        pdsh -g computenode reboot
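
Whether the license was installed online or offline, it can be reviewed afterwards from cmsh; a minimal check, assuming the licenseinfo command in cmsh main mode, is shown below. Confirm the edition, MAC addresses, and licensed token count match the values granted above.

root@clio-headnode-01:~# cmsh -c "main; licenseinfo"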

Enable DeviceResolveAnyMAC#

The DeviceResolveAnyMAC setting allows a node to be identified by any of its configured MAC addresses, which enables provisioning over bonded interfaces on downstream appliances/nodes and provides failover PXE booting for bonded interfaces.

Edit /cm/local/apps/cmd/etc/cmd.conf and add the following line:

AdvancedConfig = { "DeviceResolveAnyMAC=1" } # modified value

Example:

nano /cm/local/apps/cmd/etc/cmd.conf

GNU nano 6.2
# Set one or more advanced config parameters, only do this when needed
# AdvancedConfig = { "param=value", "param=value" }
AdvancedConfig = { "DeviceResolveAnyMAC=1" } # modified value

Once the above parameter has been saved, restart CMDaemon:

root@bcm10-headnode:~# systemctl restart cmd
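
A quick sanity check after the restart confirms the parameter was saved and CMDaemon came back up:

root@bcm10-headnode:~# grep DeviceResolveAnyMAC /cm/local/apps/cmd/etc/cmd.conf
AdvancedConfig = { "DeviceResolveAnyMAC=1" } # modified value
root@bcm10-headnode:~# systemctl is-active cmd
active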

BCM Networking Configuration#

BCM networking configuration can be fully automated or manually configured. Refer to the following sections for more information.

Define Cluster Networks - Automated#

BCM can automatically create the networks and node categories using bcm-netautogen.

Define Cluster Networks - Manual#

Nodes and networks can be manually defined within the BCM cluster using the steps outlined below.

Log in to cmsh and define the additional networks.

root@clio-headnode-01:~# cmsh
[clio-headnode-01]% network
[clio-headnode-01->network]% ls
Name (key)      Type            Netmask bits   Base address     Domain name             IPv6
------------------ -------------- -------------- ---------------- -------------------- ----
globalnet       Global          0               0.0.0.0         cm.cluster
internalnet     Internal        24              10.150.124.0    eth.cluster

Add ipminet. This is the Out Of Band (OOB) Management network.

[clio-headnode-01->network]% add ipminet
[clio-headnode-01->network*[ipminet*]]% set netmaskbits 24
[clio-headnode-01->network*[ipminet*]]% set baseaddress 10.150.123.0
[clio-headnode-01->network*[ipminet*]]% set excludefromsearchdomain yes
[clio-headnode-01->network*[ipminet*]]% set disableautomaticexports yes
[clio-headnode-01->network*[ipminet*]]% set gateway 10.150.123.1
[clio-headnode-01->network*[ipminet*]]% commit

Check that all the network settings are correct by running the show command for each network configured.

[clio-headnode-01->network[ipminet]]% show
Parameter                       Value
-------------------------------- ------------------------------------------------
Name                            ipminet
Private Cloud
Revision
Domain Name
Type                            Internal
MTU                             1500
Allow autosign                  Automatic
Write DNS zone                  both
Node booting                    no
Lock down dhcpd                 yes
Management allowed              yes
Search domain index             0
Exclude from search domain      yes
Disable automatic exports       yes
Base address                    10.150.123.0
Broadcast address               10.150.123.255
Dynamic range start             0.0.0.0
Dynamic range end               0.0.0.0
Netmask bits                    24
Gateway                         10.150.123.1
Cloud Subnet ID
EC2AvailabilityZone
Notes                           <0B>

Optional: create dgxnet(s) by cloning internalnet. For scaled deployments, multiple dgxnets can be used to isolate DGX nodes from the control nodes’ default management network. For small-scale deployments, DGX nodes can be provisioned in internalnet.

Note

Ensure the network is configured to forward DHCP requests from dgxnet to the BCM headnode IP.

[clio-headnode-01->network[ipminet]]% clone internalnet dgxnet
[clio-headnode-01->network*[dgxnet*]]% set domainname cm.dgx
[clio-headnode-01->network*[dgxnet*]]% set baseaddress 10.150.125.0
[clio-headnode-01->network*[dgxnet*]]% set gateway 10.150.125.1
[clio-headnode-01->network*[dgxnet*]]% commit
[clio-headnode-01->network[dgxnet]]% show
Parameter                       Value
-------------------------------- ------------------------------------------------
Name                            dgxnet
Private Cloud
Revision
Domain Name                     cm.dgx
Type                            Internal
MTU                             1500
Allow autosign                  Automatic
Write DNS zone                  both
Node booting                    yes
Lock down dhcpd                 no
Management allowed              yes
Search domain index             0
Exclude from search domain      no
Disable automatic exports       no
Base address                    10.150.125.0
Broadcast address               10.150.125.255
Dynamic range start             10.150.125.160
Dynamic range end               10.150.125.223
Netmask bits                    24
Gateway                         10.150.125.1
Cloud Subnet ID
EC2AvailabilityZone
Notes                           <0B>
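
As noted above, DHCP/PXE requests originating in dgxnet must be forwarded to the BCM headnode. When dgxnet is terminated on the SN4600C switches, one way to do this is a DHCP relay in Cumulus NVUE; the sketch below is only an example in which vlan103 stands in for the dgxnet SVI, and the server addresses must be replaced with the headnode IPs from the site survey.

# Relay DHCP requests arriving on the dgxnet SVI to the BCM headnodes (adjust VLAN and IPs)
nv set service dhcp-relay default interface vlan103
nv set service dhcp-relay default server <BCM headnode 1 IP>
nv set service dhcp-relay default server <BCM headnode 2 IP>
nv config apply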

Add computenet

[clio-headnode-01->network[dgxnet]]% add computenet
[clio-headnode-01->network*[computenet*]]% set domainname ib.compute
[clio-headnode-01->network*[computenet*]]% set baseaddress 100.126.0.0
[clio-headnode-01->network*[computenet*]]% set mtu 4000
[clio-headnode-01->network*[computenet*]]% commit
[clio-headnode-01->network[computenet]]% show
Parameter                           Value
-------------------------------- ------------------------------------------------
Name                                computenet
Private Cloud
Revision
Domain Name                         ib.compute
Type                                Internal
MTU                                 4000
Allow autosign                      Automatic
Write DNS zone                      both
Node booting                        no
Lock down dhcpd                     no
Management allowed                  no
Search domain index                 0
Exclude from search domain          no
Disable automatic exports           no
Base address                        100.126.0.0
Broadcast address                   100.126.255.255
Dynamic range start                 0.0.0.0
Dynamic range end                   0.0.0.0
Netmask bits                        16
Gateway                             0.0.0.0
Cloud Subnet ID
EC2AvailabilityZone
Notes                               <0B>

Add storagenet

[clio-headnode-01->network[computenet]]% clone computenet storagenet
[clio-headnode-01->network*[storagenet*]]% set domainname ib.storage
[clio-headnode-01->network*[storagenet*]]% set baseaddress 100.127.0.0
[clio-headnode-01->network*[storagenet*]]% commit
[clio-headnode-01->network[storagenet]]% show
Parameter                       Value
-------------------------------- ------------------------------------------------
Name                            storagenet
Private Cloud
Revision
Domain Name                     ib.storage
Type                            Internal
MTU                             4000
Allow autosign                  Automatic
Write DNS zone                  both
Node booting                    no
Lock down dhcpd                 no
Management allowed              no
Search domain index             0
Exclude from search domain      no
Disable automatic exports       no
Base address                    100.127.0.0
Broadcast address               100.127.255.255
Dynamic range start             0.0.0.0
Dynamic range end               0.0.0.0
Netmask bits                    16
Gateway                         0.0.0.0
Cloud Subnet ID
EC2AvailabilityZone
Notes                           <0B>

Before moving forward, be sure to verify the defined networks.

[clio-headnode-01->network[storagenet]]% ..
[clio-headnode-01->network]% ls
Name (key)      Type            Netmask bits   Base address     Domain name             IPv6
------------------ -------------- -------------- ---------------- -------------------- ----
computenet      Internal        16              100.126.0.0     ib.compute
dgxnet          Internal        24              10.150.125.0    cm.dgx
globalnet       Global          0               0.0.0.0         cm.cluster
internalnet     Internal        24              10.150.124.0    eth.cluster
ipminet         Internal        24              10.150.123.0
storagenet      Internal        16              100.127.0.0     ib.storage

Enable Bonding on the Headnode#

Note

Always use a remote or physical KVM for this action, not SSH. Before you begin, confirm that the headnode’s out-of-band management, BMC interface, or remote/physical KVM is reachable and working.

In this step, we’ll clear the IP address assigned to the primary interface during installation and assign it to a newly created bond interface that contains both physical network interfaces. Refer to the site survey for the network interface names and MAC addresses.

Log in to the headnode and run the Cluster Manager Shell (cmsh).

root@clio-headnode-01:~# cmsh
[clio-headnode-01]% device
[clio-headnode-01->device]% use clio-headnode-01
[clio-headnode-01->device[clio-headnode-01]]% interfaces
[clio-headnode-01->device[clio-headnode-01]->interfaces]% ls
Type            Network device name  IP                 Network         Start if
------------ -------------------- ---------------- ---------------- --------
physical        enp138s0np0 [prov]   10.150.124.12      internalnet     always
[clio-headnode-01->device[clio-headnode-01]->interfaces]% add bmc ipmi0 10.150.123.11 ipminet
[clio-headnode-01->device*[clio-headnode-01*]->interfaces*[enp138s0np0*]]% add physical enp47s0np0
[clio-headnode-01->device*[clio-headnode-01*]->interfaces*[enp47s0np0*]]% add bond bond0 10.150.124.12 internalnet
[clio-headnode-01->device*[clio-headnode-01*]->interfaces*[bond0*]]% set mode 4
[clio-headnode-01->device*[clio-headnode-01*]->interfaces*[bond0*]]% set interfaces enp138s0np0 enp47s0np0
[clio-headnode-01->device*[clio-headnode-01*]->interfaces*[bond0*]]% use enp138s0np0
[clio-headnode-01->device*[clio-headnode-01*]->interfaces*[enp138s0np0]]% clear network
[clio-headnode-01->device*[clio-headnode-01*]->interfaces*[enp138s0np0*]]% clear ip
[clio-headnode-01->device*[clio-headnode-01*]->interfaces*[enp138s0np0*]]% ..
[clio-headnode-01->device*[clio-headnode-01*]->interfaces*]% ..
[clio-headnode-01->device*[clio-headnode-01*]]% set provisioninginterface bond0
[clio-headnode-01->device*[clio-headnode-01*]]% commit
[clio-headnode-01->device[clio-headnode-01]->interfaces]% ls
Type         Network device name  IP               Network          Start if
------------ -------------------- ---------------- ---------------- --------
bmc          ipmi0                10.150.123.12    ipminet          always
bond         bond0 [prov]         10.150.124.12    internalnet      always
physical     enp138s0np0 (bond0)  0.0.0.0                           always
physical     enp47s0np0 (bond0)   0.0.0.0                           always
[clio-headnode-01->device[clio-headnode-01]->interfaces]% use bond0
[clio-headnode-01->device[clio-headnode-01]->interfaces[bond0]]% show
Parameter                        Value
-------------------------------- ------------------------------------------------
Revision
Type                             bond
Network device name              bond0 [prov]
Network                          internalnet
IP                               10.150.124.12
DHCP                             no
Alternative Hostname
Additional Hostnames
Switch ports
Start if                         always
BringUpDuringInstall             no
On network priority              70
Mode                             4 (802.3ad)
Options
Interfaces                       enp138s0np0,enp47s0np0

Verify the IP connectivity to the BCM headnode using ping/ssh before proceeding.

Should the head node’s network connection fail to update after committing changes, access the appliance console via BMC or physical KVM and restart the networking service to apply the changes.

[clio-headnode-01->device[clio-headnode-01]->interfaces]% !systemctl restart networking.service

or
root@clio-headnode-01:~# systemctl restart networking.service

Add fsexports for Additional Networks (e.g., dgxnet) in BCM#

This step applies only if the DGX nodes use a separate network; otherwise, skip it. If multiple dgxnets are in use, repeat these steps for each network where nodes are provisioned.

Enable shared filesystem access from additional networks to facilitate appliance provisioning.

root@clio-headnode-01:~# cmsh
[clio-headnode-01]% device
[clio-headnode-01->device]% use clio-headnode-01
[clio-headnode-01->device[clio-headnode-01]]% fsexports
[clio-headnode-01->device[clio-headnode-01]->fsexports]% ls
Name (key)                                   Path                             Network                  Hosts        Write  Disabled
-------------------------------------------- -------------------------------- ------------------------ ------------ ------ --------
/cm/node-installer@internalnet               /cm/node-installer               internalnet                           no     no
/cm/node-installer/certificates@internalnet  /cm/node-installer/certificates  internalnet                           yes    no
/var/spool/burn@internalnet                  /var/spool/burn                  internalnet                           yes    no
/home@internalnet                            /home                            internalnet                           yes    no
/cm/shared@internalnet                       /cm/shared                       internalnet                           yes    no
[clio-headnode-01->device[clio-headnode-01]->fsexports]% add /cm/node-installer dgxnet
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*[/cm/node-installer@dgxnet*]]% ..
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*]% add /cm/node-installer/certificates dgxnet
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*[/cm/node-installer/certificates@dgxnet*]]% set write yes
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*[/cm/node-installer/certificates@dgxnet*]]% ..
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*]% add /var/spool/burn dgxnet
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*[/var/spool/burn@dgxnet*]]% set write yes
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*[/var/spool/burn@dgxnet*]]% ..
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*]% add /home dgxnet
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*[/home@dgxnet*]]% set write yes
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*[/home@dgxnet*]]% ..
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*]% add /cm/shared dgxnet
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*[/cm/shared@dgxnet*]]% set write yes
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*[/cm/shared@dgxnet*]]% ..
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*]% ls
Name (key)                                   Path                             Network                  Hosts        Write  Disabled
-------------------------------------------- -------------------------------- ------------------------ ------------ ------ --------
/cm/node-installer@internalnet               /cm/node-installer               internalnet                           no     no
/cm/node-installer/certificates@internalnet  /cm/node-installer/certificates  internalnet                           yes    no
/var/spool/burn@internalnet                  /var/spool/burn                  internalnet                           yes    no
/home@internalnet                            /home                            internalnet                           yes    no
/cm/shared@internalnet                       /cm/shared                       internalnet                           yes    no
/cm/node-installer@dgxnet                    /cm/node-installer               dgxnet                                no     no
/cm/node-installer/certificates@dgxnet       /cm/node-installer/certificates  dgxnet                                yes    no
/var/spool/burn@dgxnet                       /var/spool/burn                  dgxnet                                yes    no
/home@dgxnet                                 /home                            dgxnet                                yes    no
/cm/shared@dgxnet                            /cm/shared                       dgxnet                                yes    no
[clio-headnode-01->device*[clio-headnode-01*]->fsexports*]% commit
[clio-headnode-01->device[clio-headnode-01]->fsexports]%
Fri Jan 17 18:05:16 2025 [notice] clio-headnode-01: Service nfs was reloaded
[clio-headnode-01->device[clio-headnode-01]->fsexports]%

Node Provisioning#

The configuration of the control and DGX nodes can be performed either manually or automatically, depending on the deployment’s scale.

The below sections outline the steps for manual provisioning of the nodes.

Manual Provisioning - Create Control Node Definitions#

Note

In this example, we are defining the Slurm login node slogin-01. Repeat the same procedure for the other control nodes, such as the Kubernetes/Run:ai nodes.

Clone the default image.

[clio-headnode-01->softwareimage]% clone default-image slogin-image
[clio-headnode-01->softwareimage*[slogin-image*]]% commit

Clone default category to create the slogin category and set the software image to slogin-image.

[clio-headnode-01->category]% clone default slogin
[clio-headnode-01->category*[slogin*]]% set softwareimage slogin-image
[clio-headnode-01->category*[slogin*]]% commit

Add a new physical node for the slogin-01 node and set its IP address.

[clio-headnode-01]% device
[clio-headnode-01->device]% add physicalnode slogin-01 10.150.124.17 bond0
[clio-headnode-01->device*[slogin*]]% set category slogin

Set the interfaces and MAC addresses for the specified SLOGIN node.

[clio-headnode-01->device*[SLOGIN-01*]% interfaces
[clio-headnode-01->device*[SLOGIN-01*]->interfaces]% remove bootif
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% add bmc ipmi0 10.150.123.17 ipminet
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% set bond0 ip 10.150.124.17
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% add physical enp138s0np0
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% set enp138s0np0 mac A0:88:C2:34:44:DC
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% add physical enp47s0np0
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% set enp47s0np0 mac A0:88:C2:34:44:D8
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% set bond0 interfaces enp138s0np0 enp47s0np0
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% ..
[clio-headnode-01->device*[SLOGIN-01*]]% set provisioninginterface bond0
[clio-headnode-01->device*[SLOGIN-01*]]% set mac A0:88:C2:34:44:DC
[clio-headnode-01->device*[SLOGIN-01*]]% interfaces
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% commit
[clio-headnode-01->device*[SLOGIN-01*]->interfaces*]% ..
[clio-headnode-01->device*[SLOGIN-01*]]% commit

Repeat the same steps for other control nodes.

Set the BMC settings in cmsh:

[clio-headnode-01->device]% category
[clio-headnode-01->category]% use slogin
[clio-headnode-01->category[slogin]]% bmcsettings
[clio-headnode-01->category[slogin]->bmcsettings]% set username <BMC User>
[clio-headnode-01->category*[slogin*]->bmcsettings*]% set password  <BMC Pass>
[clio-headnode-01->category*[slogin*]->bmcsettings*]% commit

Provision Nodes into the Cluster#

Power on all the nodes. They should boot into their assigned roles automatically.

[clio-headnode-01->device]% power on -c slogin
ipmi0 .................... [  ON        ] slogin-01

Manual Provisioning - Create DGX Node Definitions#

Create boot image and category#

Using cmsh, verify that the built-in DGX OS image is present.

root@clio-headnode-01:~# cmsh
[clio-headnode-01]% softwareimage
[clio-headnode-01->softwareimage]% ls
Name (key)           Path (key)                               Kernel version    Nodes
-------------------- ---------------------------------------- ----------------- --------
default-image        /cm/images/default-image                 6.8.0-51-generic  1
dgx-image            /cm/images/dgx-image                     6.8.0-60-generic  0

Clone the built-in image with a suitable name; for example, for Slurm nodes, use dgx-b200-slurm-image.

[clio-headnode-01->softwareimage]% clone dgx-image dgx-b200-slurm-image
[clio-headnode-01->softwareimage*[dgx-b200-slurm-image*]]% commit

Thu Aug  7 09:14:33 2025 [notice] clio-headnode-01: Started to copy:

[clio-headnode-01->softwareimage[dgx-b200-slurm-image]]% ls
Name (key)           Path (key)                               Kernel version    Nodes
-------------------- ---------------------------------------- ----------------- --------
default-image        /cm/images/default-image                 6.8.0-51-generic  1
dgx-b200-slurm-image /cm/images/dgx-b200-slurm-image          6.8.0-60-generic  0
dgx-image            /cm/images/dgx-image                     6.8.0-60-generic  0

Create a category for the DGX nodes and assign the newly created image to it. In this example, we are creating a category for the B200 Slurm nodes.

[clio-headnode-01->category]%
[clio-headnode-01->category]% ls
Name (key)               Software image           Nodes
------------------------ ------------------------ --------
default                  default-image            1
dgx                      dgx-image                0
[clio-headnode-01->category]% clone dgx dgx-b200-slurm
[clio-headnode-01->category*[dgx-b200-slurm*]]% commit
[clio-headnode-01->category]% use dgx-b200-slurm
[clio-headnode-01->category[dgx-b200-slurm]]% set softwareimage dgx-b200-slurm-image
[clio-headnode-01->category*[dgx-b200-slurm*]]% commit

If deploying Run:ai, you might need to create additional node categories or images. Consult the Run:ai deployment guide for the suggested category names.
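
For example, a Kubernetes-oriented image and category matching the names shown in the category listing later in this section could be cloned in the same way; the names below are illustrative, so confirm them against the Run:ai deployment guide before committing.

[clio-headnode-01->softwareimage]% clone dgx-image dgx-b200-k8s-image
[clio-headnode-01->softwareimage*[dgx-b200-k8s-image*]]% commit
[clio-headnode-01->softwareimage[dgx-b200-k8s-image]]% category
[clio-headnode-01->category]% clone dgx dgx-b200-k8s
[clio-headnode-01->category*[dgx-b200-k8s*]]% set softwareimage dgx-b200-k8s-image
[clio-headnode-01->category*[dgx-b200-k8s*]]% commit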

Configure DGX B200 node#

Quit cmsh and run bcm-pod-setup from the headnode’s bash shell.

Command syntax:
module load bcm-post-install
bcm-pod-setup --dgx-type b200 --dgx-category <category name> --dgx-image <image name>

Repeat the same for all categories of DGX B200 nodes.

root@clio-headnode-01:~# module load bcm-post-install
root@clio-headnode-01:~# bcm-pod-setup --dgx-type b200 --dgx-category dgx-b200-slurm --dgx-image dgx-b200-slurm-image
===================
Starting POD setup.
===================
Running BasePOD configuration
Cleaning up unused A100 software image and category
- Category dgx-a100 has already been removed
- dgx-os-7.1-a100-image image has already been removed
Updating sysctl config in dgx-b200-slurm-image
Updating distribution kernel parameters for dgx-b200-slurm-image
Updating additional kernel parameters for dgx-b200-slurm-image
Updating kernel modules in default-image
Updating image sshd configuration and removing machine-id file
- /cm/images/default-image
- /cm/images/dgx-b200-slurm-image
Adding DGX node disk layout
- Updating category configuration
Adding UDEV persistent rules
Cleaning up environment.
===============
Deploy Complete
===============

Define the DGX B200 node identity and assign it to the category created above. Refer to the site survey for the node identity details.

Note

In this example, the DGX nodes are created in the dgxnet network. They can be added to internalnet for small-scale deployments.

[clio-headnode-01->device]% add physicalnode dgx-01  10.150.125.25 bond0
[clio-headnode-01->device*[dgx-01*]]% set category dgx-b200-slurm
[clio-headnode-01->device*[dgx-01*]]% set mac 58:A2:E1:E7:1D:7F

Set the interfaces and MAC addresses of the inband management interfaces for the specified DGX.

[clio-headnode-01->device*[dgx-01*]]% interfaces
[clio-headnode-01->device*[dgx-01*]->interfaces]% remove bootif
[clio-headnode-01->device*[dgx-01*]->interfaces*]% add bmc ipmi0 10.150.123.25
Switched power control for this node to: ipmi0
[clio-headnode-01->device*[dgx-01*]->interfaces*[ipmi0*]]% add physical enp170s0f1np1
[clio-headnode-01->device*[dgx-01*]->interfaces*[enp170s0f1np1*]]% set mac 58:A2:E1:E7:1D:7F
[clio-headnode-01->device*[dgx-01*]->interfaces*[enp170s0f1np1*]]% ..
[clio-headnode-01->device*[dgx-01*]->interfaces*]% add physical enp41s0f1np1
[clio-headnode-01->device*[dgx-01*]->interfaces*[enp41s0f1np1*]]% set mac C4:70:BD:D2:05:D5
[clio-headnode-01->device*[dgx-01*]->interfaces*[enp41s0f1np1*]]% ..
[clio-headnode-01->device*[dgx-01*]->interfaces*]% use bond0
[clio-headnode-01->device*[dgx-01*]->interfaces*[bond0]]% set mode 4
[clio-headnode-01->device*[dgx-01*]->interfaces*[bond0*]]% set interfaces enp170s0f1np1 enp41s0f1np1
[clio-headnode-01->device*[dgx-01*]->interfaces*[bond0*]]% commit
[clio-headnode-01->device[dgx-01]->interfaces*[bond0]]% ..
[clio-headnode-01->device[dgx-01]->interfaces*]% commit

Define the IB interfaces for the DGX B200.

[clio-headnode-01->device[dgx-01]->interfaces]% add physical ibp154s0 100.126.1.1 computenet
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp154s0*]]%
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp154s0*]]%
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp154s0*]]% add physical ibp192s0 100.126.2.1 computenet
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp192s0*]]% add physical ibp206s0 100.126.3.1 computenet
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp206s0*]]% add physical ibp220s0 100.126.4.1 computenet
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp220s0*]]% add physical ibp24s0 100.126.5.1 computenet
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp24s0*]]% add physical ibp64s0 100.126.6.1 computenet
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp64s0*]]% add physical ibp79s0 100.126.7.1 computenet
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp79s0*]]% add physical ibp94s0 100.126.8.1 computenet
[clio-headnode-01->device*[dgx-01*]->interfaces*[ibp94s0*]]% commit

Verify the interfaces mac/IPs and networks.

[clio-headnode-01->device[dgx-01]->interfaces]% ls
Type         Network device name    IP               Network          Start if
------------ ---------------------- ---------------- ---------------- --------
bmc          ipmi0                  10.150.123.25    ipminet          always
bond         bond0 [prov]           10.150.125.25    dgxnet           always
physical     enp170s0f1np1 (bond0)  0.0.0.0                           always
physical     enp41s0f1np1 (bond0)   0.0.0.0                           always
physical     ibp154s0               100.126.1.1      computenet       always
physical     ibp192s0               100.126.2.1      computenet       always
physical     ibp206s0               100.126.3.1      computenet       always
physical     ibp220s0               100.126.4.1      computenet       always
physical     ibp24s0                100.126.5.1      computenet       always
physical     ibp64s0                100.126.6.1      computenet       always
physical     ibp79s0                100.126.7.1      computenet       always
physical     ibp94s0                100.126.8.1      computenet       always

To simplify computenet IP routing, all computenet/RDMA interfaces are configured in the same IP subnet (100.126.0.0/16). This setup does not affect the performance of training or inference workloads because NCCL/GPUDirect RDMA operates over InfiniBand, not IP transport.
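
Once the DGX nodes have been provisioned and booted (later in this section), this flat addressing can be spot-checked over IPoIB. The sketch below assumes that dgx-02’s first computenet interface received 100.126.1.2 from the cloning step that follows.

# From dgx-01, confirm IPoIB reachability to dgx-02 on computenet (addresses assume --next-ip cloning)
root@dgx-01:~# ping -c 3 100.126.1.2
# Confirm the InfiniBand HCAs report their links as up
root@dgx-01:~# ibdev2netdev | grep -i up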

Clone DGX-01 to create the rest of the DGX nodes.

[clio-headnode-01->device]% foreach -o dgx-01 -n dgx-02..dgx-04 () --next-ip
[clio-headnode-01->device*]% commit
Successfully committed 3 Devices

Adjust the management interface MAC addresses for each of the new nodes. Refer to the site survey for the details.

home;device
use dgx-02
set mac C4:70:BD:D2:0B:79
interfaces
use enp170s0f1np1
set mac C4:70:BD:D2:0B:79
..
use  enp41s0f1np1
set mac C4:70:BD:D2:11:B5
commit

Repeat the same step for all the remaining nodes.

Set BMC settings in cmsh for the dgx-b200-slurm category.

[clio-headnode-01->device]% category
[clio-headnode-01->category]% use dgx-b200-slurm
[clio-headnode-01->category[dgx-b200-slurm]]% bmcsettings
[clio-headnode-01->category[dgx-b200-slurm]->bmcsettings]% set username <BMC User>
[clio-headnode-01->category*[dgx-b200-slurm*]->bmcsettings*]% set password <BMC Pass>
[clio-headnode-01->category*[dgx-b200-slurm*]->bmcsettings*]% commit

Provision Nodes into the Cluster#

Power on all the nodes. They should boot into their assigned roles automatically.

[clio-headnode-01->device]% power on -c dgx-b200-slurm
ipmi0 .................... [  ON        ] DGX-01
ipmi0 .................... [  ON        ] DGX-02
ipmi0 .................... [  ON        ] DGX-03
ipmi0 .................... [  ON        ] DGX-04

After the nodes are booted, verify the node status:

[clio-headnode-01->device]% ls
Type             Hostname (key)    MAC                Category         IP               Network          Status
---------------- ----------------- ------------------ ---------------- ---------------- ---------------- --------------------------------
HeadNode         clio-headnode-01  00:00:00:73:FE:80+                  10.150.124.11    internalnet      [   UP   ]
HeadNode         clio-headnode-02  C2:50:D9:5D:72:DB                   10.150.124.12    internalnet      [   UP   ]
PhysicalNode     dgx-01            3E:F7:55:83:3C:66  dgx-b200-slurm   10.150.125.25    dgxnet           [   UP   ], health check failed
PhysicalNode     dgx-02            DA:1D:45:16:E2:0B  dgx-b200-slurm   10.150.125.26    dgxnet           [   UP   ], health check failed+
PhysicalNode     dgx-03            FA:AD:15:EF:CB:64  dgx-b200-slurm   10.150.125.27    dgxnet           [   UP   ], health check failed+
PhysicalNode     dgx-04            2E:72:CE:3B:C6:1A  dgx-b200-slurm   10.150.125.28    dgxnet           [   UP   ], health check failed+
PhysicalNode     slogin-01         C2:1E:70:7B:39:98  slogin           10.150.124.17    internalnet      [   UP   ]
PhysicalNode     slogin-02         B2:B0:5A:5C:3F:41  slogin           10.150.124.18    internalnet      [   UP   ]
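
Nodes may initially report "health check failed" while services settle after first boot. To see which health checks are failing on a node, the latest health data can be inspected from cmsh, for example:

[clio-headnode-01->device]% use dgx-01
[clio-headnode-01->device[dgx-01]]% latesthealthdata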

Example node/category configuration:

[clio-headnode-01->category]% ls
Name (key)               Software image           Nodes
------------------------ ------------------------ --------
default                  default-image            1
dgx                      dgx-image                0
dgx-b200-k8s             dgx-b200-k8s-image       0
dgx-b200-slurm           dgx-b200-slurm-image     4
k8s-system               dgx-b200-k8s-image       0
slogin                   slogin-image             2
[clio-headnode-01->category]% device;ls
Type             Hostname (key)    MAC                Category         IP               Network          Status
---------------- ----------------- ------------------ ---------------- ---------------- ---------------- --------------------------------
HeadNode         clio-headnode-01  00:00:00:73:FE:80+                  10.150.124.11    internalnet      [   UP   ]
HeadNode         clio-headnode-02  C2:50:D9:5D:72:DB                   10.150.124.12    internalnet      [   UP   ]
PhysicalNode     dgx-01            3E:F7:55:83:3C:66  dgx-b200-slurm   10.150.125.25    dgxnet           [   UP   ]
PhysicalNode     dgx-02            DA:1D:45:16:E2:0B  dgx-b200-slurm   10.150.125.26    dgxnet           [   UP   ]
PhysicalNode     dgx-03            FA:AD:15:EF:CB:64  dgx-b200-slurm   10.150.125.27    dgxnet           [   UP   ]
PhysicalNode     dgx-04            2E:72:CE:3B:C6:1A  dgx-b200-slurm   10.150.125.28    dgxnet           [   UP   ]
PhysicalNode     slogin-01         C2:1E:70:7B:39:98  slogin           10.150.124.17    internalnet      [   UP   ]
PhysicalNode     slogin-02         B2:B0:5A:5C:3F:41  slogin           10.150.124.18    internalnet      [   UP   ]

Note

Due to a known software defect, some versions of the built-in DGX-OS image may fail to start the nvidia-fabricmanager service. To check, SSH into one of the DGX nodes and verify the status of the following services.

root@dgx-b200-01:~# systemctl status openibd.service
root@dgx-b200-01:~# systemctl status nvidia-fabricmanager.service

If they are not enabled or running, start them manually and confirm that they come up.

systemctl start openibd.service
systemctl start nvidia-fabricmanager.service

Modify the DGX-OS image in BCM with the following steps to enable the required services for all nodes.

cm-chroot-sw-img  /cm/images/<DGX IMAGE>
systemctl enable openibd.service
systemctl enable nvidia-fabricmanager.service

The following example modifies the dgx-b200-slurm-image used by all DGX nodes in this guide.

root@clio-headnode-01:~# cm-chroot-sw-img /cm/images/dgx-b200-slurm-image/
root@dgx-b200-slurm-image:/# systemctl enable openibd.service
root@dgx-b200-slurm-image:/# systemctl enable nvidia-fabricmanager.service

Exit the chroot environment, then reboot the nodes and verify the service status.
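
To confirm that both services are active on every DGX node after the reboot, a quick check from the head node might look like the following (assuming the root SSH access to the nodes that BCM configures by default):

root@clio-headnode-01:~# for n in dgx-01 dgx-02 dgx-03 dgx-04; do echo "== ${n} =="; ssh ${n} systemctl is-active openibd nvidia-fabricmanager; done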

BCM High Availability Configuration#

BCM HA Configuration ensures BCM head node redundancy. This two-step procedure involves configuring BCM HA first, then adding NFS for BCM shared folders to achieve full redundancy.

BCM HA Configuration#

Before configuring HA, power off all the provisioned compute nodes and workload manager control nodes (Kubernetes/Slurm).

clio-headnode-01:~# cmsh
[clio-headnode-01]% device
[clio-headnode-01->device]% shutdown -c dgx-b200-slurm,slogin

Run the cmha-setup script.

clio-headnode-01:~# cmha-setup

Choose Setup.

_images/image46.png

Choose Configure.

_images/image47.png

Verify the primary and stand-by BCM head-node MAC addresses and select CONTINUE.

If the MAC addresses are incorrect, exit the script. Then rerun the request-license command and select the option to keep the current license in the submenu; this allows you to modify only the MAC addresses for the head nodes.

_images/image48.png

Note

To prevent license issues due to a NIC failure/replacement, it is advisable to use the MAC address of the MLOM interface, if available.

Populate the HA interface IP for bond0:ha and select NEXT.

This is the virtual IP address (VIP) for the BCM head-node HA Pair.

_images/image49.png

Set the name of the secondary head node as per the site survey.

_images/image50.png

Choose SKIP to skip the creation of the failovernet; the existing management network will be used for HA keepalives.

_images/image51.png

Refer to the BCM 11 Administrator Manual for more details on HA configuration options.

Set the in-band management and BMC IP addresses for the secondary head node.

_images/image52.png

Check that all settings in the summary are correct and select EXIT.

_images/image53.png

Select Yes to proceed with the failover config.

_images/image54.png

Enter the MySQL password (the same as the BCM head-node root password) to continue.

_images/image55.png

Once the below steps finish, press ENTER to continue.

Initializing failover setup on master.............. [  OK  ]
Updating shared internal interface................. [  OK  ]
Updating extra shared internal interfaces.......... [  OK  ]
Cloning head node.................................. [  OK  ]
Updating secondary master interfaces............... [  OK  ]
Updating Failover Object........................... [  OK  ]
Restarting cmdaemon................................ [  OK  ]
Press ENTER to continue

Boot the secondary BCM head-node and use its BMC KVM to select a one-time PXE boot from one of the in-band interfaces. The primary BCM head-node will then PXE boot the secondary node. During the boot process, interrupt at the screen shown below and choose the RESCUE environment.

_images/image56.png

Once in the rescue environment, enter the /cm/cm-clone-install --failover command.

Then, provide the interface name of the primary in-band network, which will be used for the rest of the headnode installation prompts.

Note

If you are unsure of the interface name, check the primary head node's interfaces, as the naming will be the same on the secondary head node. If the head-node interfaces are bonded, use the name of one of the child interfaces.
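
For example, the primary head node's interfaces can be listed non-interactively with cmsh (a quick check using the head-node name from this guide):

root@clio-headnode-01:~# cmsh -c "device; use clio-headnode-01; interfaces; list"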

root@ClusterManager:~# /cm/cm-clone-install --failover
Network interface to use [default: enp0]: enp138s0np0
.
.
.
root@master's password: <Root Cluster Password>
.
.
.
[v - view, e - edit, c - continue]: c
.
.
.
Do you want to continue [yes/no]? yes
.
.
.
Do you want to reboot [y/n]: y
_images/image57.png

Wait until the secondary node completes the reboot. Then on the primary headnode, go back to the previous screen and select Finalize.

_images/image58.png

Select CONTINUE.

_images/image59.png

Enter the MySQL password (the same as the BCM head-node root password) to continue.

_images/image60.png

Once the below steps are complete, press ENTER.

Updating secondary master mac address.............. [  OK  ]
Initializing failover setup on HEAD-02............. [  OK  ]
Stopping cmdaemon.................................. [  OK  ]
Cloning cmdaemon database.......................... [  OK  ]
Checking database consistency...................... [  OK  ]
Starting cmdaemon, chkconfig services.............. [  OK  ]
Cloning workload manager databases................. [  OK  ]
Cloning additional databases....................... [  OK  ]
Update DB permissions.............................. [  OK  ]
Checking for dedicated failover network............ [  OK  ]
Press ENTER to continue

Choose to reboot the secondary headnode. Wait for the secondary headnode to come back up before continuing.

_images/image61.png

After the reboot, you can verify the secondary node's status from the primary head node using cmsh.

[clio-headnode-01]% device list -f hostname:20,category:12,ip:20,status:15
hostname (key)       category     ip                   status
-------------------- ------------ -------------------- ---------------
clio-headnode-01                  10.150.124.11        [   UP   ]
clio-headnode-02                  10.150.124.12        [   UP   ]
[clio-headnode-01]%

Configuring NFS Shared Storage#

On the primary head node, continue in the HA setup screen (or start it again by running cmha-setup), select the Shared Storage option from the cmha-setup menu, and then choose SELECT. In this final HA configuration step, cmha-setup copies the /cm/shared and /home directories to the shared storage and configures both head nodes and all cluster nodes to mount it.

_images/image62.png

Select Shared Storage.

_images/image63.png

Select NAS.

_images/image64.png

Select both /cm/shared and /home to be shared via NAS.

_images/image65.png

Set the NAS parameters.

Note

Ensure that the full mount paths (e.g., /data/nas/cmshared) already exist on the NFS server.

_images/image66.png
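
For reference, on a Linux-based NFS server the corresponding exports might look like the sketch below. The paths match the values used elsewhere in this guide, while the allowed client subnet is only an example and must be adapted to your site:

# /etc/exports on the NFS server (example only; adjust the client ranges)
/data/nas/cmshared  10.150.124.0/24(rw,sync,no_root_squash,no_subtree_check)
/data/nas/home      10.150.124.0/24(rw,sync,no_root_squash,no_subtree_check)
# Re-export after editing
exportfs -ra

The no_root_squash option is typically needed because the head nodes and cluster nodes mount and populate /cm/shared and /home as root.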

Proceed with the setup.

_images/image67.png

Once the below steps are complete, press ENTER.

Copying NAS data................................... [  OK  ]
Mount NAS storage.................................. [  OK  ]
Remove old fsmounts................................ [  OK  ]
Add new fsmounts................................... [  OK  ]
Disable old automatic fsexports.................... [  OK  ]
Press ENTER to continue

Exit the cmha-setup script.

Run cmha status to ensure HA is working.

clio-headnode-01:~# cmha status
Node Status: running in active mode

clio-headnode-01* -> clio-headnode-02
mysql           [  OK  ]
ping            [  OK  ]
status          [  OK  ]

clio-headnode-02 -> clio-headnode-01*
mysql           [  OK  ]
ping            [  OK  ]
status          [  OK  ]
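
You can also confirm that the shared (virtual) IP address configured during HA setup is present on the active head node, for example (the VIP value itself is site-specific):

root@clio-headnode-01:~# ip -brief addr show dev bond0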

Check the NFS mounts for /cm/shared and /home on both head nodes and confirm that they point to the NFS shared volume.

Head-node Primary

root@clio-headnode-01:~# mount | grep nfs
nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
10.150.124.9:/data/nas/cmshared on /cm/shared type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.150.124.9,mountvers=3,mountport=44516,mountproto=udp,local_lock=none,addr=10.150.124.9)
10.150.124.9:/data/nas/home on /home type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.150.124.9,mountvers=3,mountport=44516,mountproto=udp,local_lock=none,addr=10.150.124.9)
root@clio-headnode-01:~#

Head-node Secondary

root@clio-headnode-02:~# mount | grep nfs
nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
10.150.124.9:/data/nas/cmshared on /cm/shared type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.150.124.9,mountvers=3,mountport=44516,mountproto=udp,local_lock=none,addr=10.150.124.9)
10.150.124.9:/data/nas/home on /home type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.150.124.9,mountvers=3,mountport=44516,mountproto=udp,local_lock=none,addr=10.150.124.9)
root@clio-headnode-02:~#

Deploying Slurm#

Run the cm-wlm-setup command from a root shell on the primary head node.

root@clio-headnode-01:~# cm-wlm-setup

Select Setup (Step By Step).

_images/image68.png

Select Slurm for the workload manager.

_images/image69.png

Leave the cluster name as the default.

_images/image70.png

Select the two BCM head nodes for the server role.

_images/image71.png

Leave the name and priority for the server overlay as defaults.

_images/image73.png

Do not select any categories for the client role.

_images/image74.png

Do not select any nodes for the client role.

_images/image75.png

Leave the name and priority for the client overlay as defaults.

_images/image76.png

Select the following healthcheck producers to be run as prejob checks.

  • cm-chroot-sw-img

  • cuda-dcgm

  • diskspace

  • dmesg

  • failedprejob

  • gpuhealth_quick

  • mysql

  • oomkiller

  • rogueprocess

  • schedulers

_images/image77.png

Select yes to configure GPUs.

_images/image78.png

Leave the name of the GPU overlay as the default.

_images/image79.png

Select the dgx-b200-slurm category for the workload manager client role with GPU.

_images/image80.png

Do not select any nodes here.

_images/image81.png

Leave the priority of the gpu client role as the default.

_images/image82.png

Leave the number of slots empty.

_images/image83.png

Select the slogin category for the submit role.

_images/image84.png

Do not select any additional nodes for the submit role.

_images/image85.png

Leave the name and priority for the submit overlay as defaults.

_images/image86.png

Leave the name and priority for the accounting overlay as defaults.

_images/image87.png

Select the 2 slogin nodes for the accounting role.

_images/image88.png

Select slogin-01 as the primary accounting node.

_images/image89.png

Select Use accounting node for the storage server type.

_images/image90.png

Select No for scontrol takeover on BCM failover.

_images/image91.png

Select no for the Slurm power saving features.

_images/image92.png

Select Slurm auto-detects NVIDIA GPU details.

_images/image93.png

Set the number of GPUs to 8 and type to nvidia.

_images/image94.png

Select yes to configure the Pyxis plugin.

_images/image94a.png

For the Enroot settings, keep the defaults; do not share data among the nodes or change the paths.

_images/image95.png

Select None for the topology source.

_images/image96.png

Select Constrain Devices for Cgroups resource constraints.

_images/image97.png

Select “no” on Install NVIDIA GPU packages.

_images/image98.png

Leave the queue names as the default.

_images/image99.png

Choose Save config & deploy.

_images/image100.png

On the save screen you can proceed with the default values, which place a cm-wlm-setup.conf file in the /root directory of the primary head node, or you can specify a custom file name and directory. This file can be used in the future to redeploy Slurm with the same configuration values.

_images/image101.png
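
If Slurm ever needs to be redeployed with the same answers, the saved configuration file can be passed back to cm-wlm-setup; the -c option is assumed here, so verify the exact flag with cm-wlm-setup --help on your BCM version:

root@clio-headnode-01:~# cm-wlm-setup -c /root/cm-wlm-setup.conf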

Wait for the installation to complete.

----- output omitted for brevity --------
## Progress: 73
#### stage: wlm_slurm: Create Job Queues
## Progress: 76
#### stage: wlm_slurm: Assign Server Role
Assigning SlurmServerRole role
## Progress: 77
#### stage: wlm: Set Primary Server
## Progress: 89
#### stage: wlm_slurm: Assign Client Role
Assigning SlurmClientRole role
## Progress: 90
#### stage: wlm_slurm: Assign SlurmClientRole to configuration overlays with gpu
GPU devices will automatically be added to gres.conf.
Please configure consumable resource allocation and GPU accounting tracking plugins in slurm.conf manually if needed.
More details can be found in the Bright Computing administrator manual, section "GPU Configuration For HPC Workload Managers".
## Progress: 91
#### stage: wlm_slurm: Assign Submit Role
Assigning SlurmSubmitRole role
#### stage: wlm_slurm: Assign Headnode Submit Role
Assigning SlurmSubmitRole role
## Progress: 92
#### stage: wlm_slurm: Render SlurmSubmit role intersection with other installations
## Progress: 93
#### stage: wlm_slurm: Assign Accounting Role
Assigning SlurmAccountingRole role
## Progress: 94
#### stage: wlm: Update Prejob Healthchecks
## Progress: 100

Took:     01:13 min.
Progress: 100/100
################### Finished execution for 'Workload manager setup', status: completed

Workload manager setup finished!

Remove the slurm-client overlay, since there will not be any Slurm CPU-only workers.

root@clio-headnode-01:~# cmsh
[clio-headnode-01]% configurationoverlay
[clio-headnode-01->configurationoverlay]% remove slurm-client
[clio-headnode-01->configurationoverlay*]% commit
Successfully removed 1 ConfigurationOverlays
Successfully committed 0 ConfigurationOverlays

Rename the slurm-client-gpu overlay to slurm-client.

[clio-headnode-01->configurationoverlay]% set slurm-client-gpu name slurm-client
[clio-headnode-01->configurationoverlay*]% commit
Successfully committed 1 ConfigurationOverlays

Validate Slurm#

Log in to slogin-01 and run sinfo to verify that all the nodes are up and ready.

root@slogin-01:~# module load slurm
root@slogin-01:~# sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*        up   infinite      4   idle dgx-[01-04]
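
To confirm that the GPUs were registered as Slurm GRES resources on a node, you can also check, for example:

root@slogin-01:~# scontrol show node dgx-01 | grep -i gres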

Basic Slurm job/GPU test:

root@slogin-01:~# srun --gres=gpu:1 -n1 nvidia-smi
Wed Aug 13 13:55:33 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.172.08             Driver Version: 570.172.08     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA B200                    On  |   00000000:1B:00.0 Off |                    0 |
| N/A   33C    P0            168W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
root@slogin-01:~# srun --gres=gpu:8 -n1 nvidia-smi
Wed Aug 13 13:56:14 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.172.08             Driver Version: 570.172.08     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA B200                    On  |   00000000:1B:00.0 Off |                    0 |
| N/A   32C    P0            168W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA B200                    On  |   00000000:43:00.0 Off |                    0 |
| N/A   34C    P0            164W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA B200                    On  |   00000000:52:00.0 Off |                    0 |
| N/A   37C    P0            166W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA B200                    On  |   00000000:61:00.0 Off |                    0 |
| N/A   36C    P0            166W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA B200                    On  |   00000000:9D:00.0 Off |                    0 |
| N/A   31C    P0            164W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA B200                    On  |   00000000:C3:00.0 Off |                    0 |
| N/A   31C    P0            165W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA B200                    On  |   00000000:D1:00.0 Off |                    0 |
| N/A   34C    P0            163W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA B200                    On  |   00000000:DF:00.0 Off |                    0 |
| N/A   35C    P0            168W / 1000W |       0MiB / 183359MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Container validation test#

root@slogin-01:~# srun --container-image=ubuntu  grep PRETTY /etc/os-release
pyxis: importing docker image: ubuntu
pyxis: imported docker image: ubuntu
PRETTY_NAME="Ubuntu 24.04.2 LTS"

Single-node NCCL test#

root@slogin-01:~# srun --export="NCCL_SOCKET_IFNAME=bond0" -N1 --exclusive --gpus-per-node=8 --mpi=pmix --container-name=nccl-test --container-image="docker://brightcomputing/nccl-test:25.02-py3-x86" all_reduce_perf_mpi -b 1G -e 16G -f 2 -g 8
# nThread 1 nGpus 8 minBytes 1073741824 maxBytes 17179869184 step: 2(factor) warmup iters: 5 iters: 20 agg iters: 1 validation: 1 graph: 0
#
# Using devices
#  Rank  0 Group  0 Pid  67104 on     dgx-02 device  0 [0x1b] NVIDIA B200
#  Rank  1 Group  0 Pid  67104 on     dgx-02 device  1 [0x43] NVIDIA B200
#  Rank  2 Group  0 Pid  67104 on     dgx-02 device  2 [0x52] NVIDIA B200
#  Rank  3 Group  0 Pid  67104 on     dgx-02 device  3 [0x61] NVIDIA B200
#  Rank  4 Group  0 Pid  67104 on     dgx-02 device  4 [0x9d] NVIDIA B200
#  Rank  5 Group  0 Pid  67104 on     dgx-02 device  5 [0xc3] NVIDIA B200
#  Rank  6 Group  0 Pid  67104 on     dgx-02 device  6 [0xd1] NVIDIA B200
#  Rank  7 Group  0 Pid  67104 on     dgx-02 device  7 [0xdf] NVIDIA B200
#
#                                                              out-of-place                       in-place
#       size         count      type   redop    root     time   algbw   busbw #wrong     time   algbw   busbw #wrong
#        (B)    (elements)                               (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)
1073741824     268435456     float     sum      -1   2587.7  414.94  726.15      0   2582.9  415.71  727.50      0
2147483648     536870912     float     sum      -1   4628.4  463.98  811.96      0   4634.3  463.39  810.93      0
4294967296    1073741824     float     sum      -1   9114.2  471.24  824.67      0   9114.2  471.24  824.67      0
8589934592    2147483648     float     sum      -1    18081  475.07  831.38      0    18092  474.80  830.90      0
17179869184    4294967296     float     sum      -1    36048  476.58  834.02      0    36024  476.91  834.58      0
# Out of bounds values : 0 OK
# Avg bus bandwidth    : 805.674
#

Multi-node NCCL test (2 and 3 nodes)#

root@slogin-01:~#  srun --export="UCX_NET_DEVICES=bond0,OMPI_MCA_pml=ucx,HCOLL_ENABLE_MCAST_ALL=0,OMPI_MCA_coll=^hcoll,NCCL_SOCKET_IFNAME=bond0,UCX_TLS=tcp,UCX_NET_DEVICES=bond0,MELLANOX_VISIBLE_DEVICES=all,OMPI_MCA_coll_hcoll_enable=0,OMPI_MCA_coll_ucc_enable=0" -N2 --exclusive --gpus-per-node=8 --mpi=pmix --container-name=nccl-test --container-image="docker://brightcomputing/nccl-test:25.02-py3-x86" all_reduce_perf_mpi -b 1G -e 16G -f 2 -g 8
[dgx-04:74971] PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c at line 168
[dgx-02:75241] PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c at line 168
# nThread 1 nGpus 8 minBytes 1073741824 maxBytes 17179869184 step: 2(factor) warmup iters: 5 iters: 20 agg iters: 1 validation: 1 graph: 0
#
# Using devices
#  Rank  0 Group  0 Pid  75241 on     dgx-02 device  0 [0x1b] NVIDIA B200
#  Rank  1 Group  0 Pid  75241 on     dgx-02 device  1 [0x43] NVIDIA B200
#  Rank  2 Group  0 Pid  75241 on     dgx-02 device  2 [0x52] NVIDIA B200
#  Rank  3 Group  0 Pid  75241 on     dgx-02 device  3 [0x61] NVIDIA B200
#  Rank  4 Group  0 Pid  75241 on     dgx-02 device  4 [0x9d] NVIDIA B200
#  Rank  5 Group  0 Pid  75241 on     dgx-02 device  5 [0xc3] NVIDIA B200
#  Rank  6 Group  0 Pid  75241 on     dgx-02 device  6 [0xd1] NVIDIA B200
#  Rank  7 Group  0 Pid  75241 on     dgx-02 device  7 [0xdf] NVIDIA B200
#  Rank  8 Group  0 Pid  74971 on     dgx-04 device  0 [0x1b] NVIDIA B200
#  Rank  9 Group  0 Pid  74971 on     dgx-04 device  1 [0x43] NVIDIA B200
#  Rank 10 Group  0 Pid  74971 on     dgx-04 device  2 [0x52] NVIDIA B200
#  Rank 11 Group  0 Pid  74971 on     dgx-04 device  3 [0x61] NVIDIA B200
#  Rank 12 Group  0 Pid  74971 on     dgx-04 device  4 [0x9d] NVIDIA B200
#  Rank 13 Group  0 Pid  74971 on     dgx-04 device  5 [0xc3] NVIDIA B200
#  Rank 14 Group  0 Pid  74971 on     dgx-04 device  6 [0xd1] NVIDIA B200
#  Rank 15 Group  0 Pid  74971 on     dgx-04 device  7 [0xdf] NVIDIA B200
#
#                                                              out-of-place                       in-place
#       size         count      type   redop    root     time   algbw   busbw #wrong     time   algbw   busbw #wrong
#        (B)    (elements)                               (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)
1073741824     268435456     float     sum      -1   2972.6  361.22  677.28      0   2975.6  360.84  676.58      0
2147483648     536870912     float     sum      -1   5710.7  376.04  705.08      0   5712.2  375.95  704.90      0
4294967296    1073741824     float     sum      -1    11185  384.00  720.00      0    11186  383.96  719.93      0
8589934592    2147483648     float     sum      -1    22124  388.26  727.98      0    22123  388.28  728.02      0
17179869184    4294967296     float     sum      -1    44009  390.37  731.95      0    44000  390.45  732.09      0
# Out of bounds values : 0 OK
# Avg bus bandwidth    : 712.381
#

3-node NCCL test:

root@slogin-01:~# srun --export="UCX_NET_DEVICES=bond0,OMPI_MCA_pml=ucx,HCOLL_ENABLE_MCAST_ALL=0,OMPI_MCA_coll=^hcoll,NCCL_SOCKET_IFNAME=bond0,UCX_TLS=tcp,UCX_NET_DEVICES=bond0,MELLANOX_VISIBLE_DEVICES=all,OMPI_MCA_coll_hcoll_enable=0,OMPI_MCA_coll_ucc_enable=0" -N3 --exclusive --gpus-per-node=8 --mpi=pmix --container-name=nccl-test --container-image="docker://brightcomputing/nccl-test:25.02-py3-x86" all_reduce_perf_mpi -b 1G -e 16G -f 2 -g 8
[dgx-04:90178] PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c at line 168
[dgx-02:90762] PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c at line 168
pyxis: imported docker image: docker://brightcomputing/nccl-test:25.02-py3-x86
[dgx-01:80425] PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c at line 168
# nThread 1 nGpus 8 minBytes 1073741824 maxBytes 17179869184 step: 2(factor) warmup iters: 5 iters: 20 agg iters: 1 validation: 1 graph: 0
#
# Using devices
#  Rank  0 Group  0 Pid  80425 on     dgx-01 device  0 [0x1b] NVIDIA B200
#  Rank  1 Group  0 Pid  80425 on     dgx-01 device  1 [0x43] NVIDIA B200
#  Rank  2 Group  0 Pid  80425 on     dgx-01 device  2 [0x52] NVIDIA B200
#  Rank  3 Group  0 Pid  80425 on     dgx-01 device  3 [0x61] NVIDIA B200
#  Rank  4 Group  0 Pid  80425 on     dgx-01 device  4 [0x9d] NVIDIA B200
#  Rank  5 Group  0 Pid  80425 on     dgx-01 device  5 [0xc3] NVIDIA B200
#  Rank  6 Group  0 Pid  80425 on     dgx-01 device  6 [0xd1] NVIDIA B200
#  Rank  7 Group  0 Pid  80425 on     dgx-01 device  7 [0xdf] NVIDIA B200
#  Rank  8 Group  0 Pid  90762 on     dgx-02 device  0 [0x1b] NVIDIA B200
#  Rank  9 Group  0 Pid  90762 on     dgx-02 device  1 [0x43] NVIDIA B200
#  Rank 10 Group  0 Pid  90762 on     dgx-02 device  2 [0x52] NVIDIA B200
#  Rank 11 Group  0 Pid  90762 on     dgx-02 device  3 [0x61] NVIDIA B200
#  Rank 12 Group  0 Pid  90762 on     dgx-02 device  4 [0x9d] NVIDIA B200
#  Rank 13 Group  0 Pid  90762 on     dgx-02 device  5 [0xc3] NVIDIA B200
#  Rank 14 Group  0 Pid  90762 on     dgx-02 device  6 [0xd1] NVIDIA B200
#  Rank 15 Group  0 Pid  90762 on     dgx-02 device  7 [0xdf] NVIDIA B200
#  Rank 16 Group  0 Pid  90178 on     dgx-04 device  0 [0x1b] NVIDIA B200
#  Rank 17 Group  0 Pid  90178 on     dgx-04 device  1 [0x43] NVIDIA B200
#  Rank 18 Group  0 Pid  90178 on     dgx-04 device  2 [0x52] NVIDIA B200
#  Rank 19 Group  0 Pid  90178 on     dgx-04 device  3 [0x61] NVIDIA B200
#  Rank 20 Group  0 Pid  90178 on     dgx-04 device  4 [0x9d] NVIDIA B200
#  Rank 21 Group  0 Pid  90178 on     dgx-04 device  5 [0xc3] NVIDIA B200
#  Rank 22 Group  0 Pid  90178 on     dgx-04 device  6 [0xd1] NVIDIA B200
#  Rank 23 Group  0 Pid  90178 on     dgx-04 device  7 [0xdf] NVIDIA B200
#
#                                                              out-of-place                       in-place
#       size         count      type   redop    root     time   algbw   busbw #wrong     time   algbw   busbw #wrong
#        (B)    (elements)                               (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)
1073741824     268435456     float     sum      -1   5322.5  201.74  386.66      0   5326.0  201.60  386.41      0
2147483648     536870912     float     sum      -1    10562  203.31  389.68      0    10560  203.36  389.77      0
4294967296    1073741824     float     sum      -1    20999  204.53  392.02      0    21009  204.44  391.84      0
8589934592    2147483648     float     sum      -1    42028  204.38  391.74      0    42038  204.34  391.65      0
17179869184    4294967296     float     sum      -1    83647  205.39  393.66      0    83650  205.38  393.64      0
# Out of bounds values : 0 OK
# Avg bus bandwidth    : 390.707
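
The same NCCL test can also be submitted as a batch job. The sketch below wraps the two-node srun invocation shown above in a minimal sbatch script; the script name and output file are hypothetical:

root@slogin-01:~# cat nccl-allreduce.sbatch
#!/bin/bash
#SBATCH -N 2
#SBATCH --exclusive
#SBATCH --gpus-per-node=8
#SBATCH -o nccl-allreduce-%j.out
# Same environment settings and container as the interactive run above
srun --export="UCX_NET_DEVICES=bond0,OMPI_MCA_pml=ucx,HCOLL_ENABLE_MCAST_ALL=0,OMPI_MCA_coll=^hcoll,NCCL_SOCKET_IFNAME=bond0,UCX_TLS=tcp,MELLANOX_VISIBLE_DEVICES=all,OMPI_MCA_coll_hcoll_enable=0,OMPI_MCA_coll_ucc_enable=0" \
     --mpi=pmix --container-name=nccl-test \
     --container-image="docker://brightcomputing/nccl-test:25.02-py3-x86" \
     all_reduce_perf_mpi -b 1G -e 16G -f 2 -g 8
root@slogin-01:~# sbatch nccl-allreduce.sbatch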

Deploy Run:ai#

Refer to the official Run:ai on BCM installation documentation for step-by-step deployment guides based on the BCM K8s deployment wizard.

Cluster Name#

To help distinguish the Run:ai cluster from other K8s clusters that may be deployed onto the SuperPOD, set the cluster name to k8s-user in the Kubernetes Wizard.

Node Categories#

In NVIDIA Base Command Manager (BCM), a node category is a way to group nodes that share the same hardware profile and intended role. Defining node categories allows the system to assign the appropriate software image and configuration to each group during provisioning.

Before installing NVIDIA Run:ai, make sure BCM node categories are created for:

  • Kubernetes system nodes (k8s-system-user)

  • NVIDIA Run:ai GPU worker nodes (dgx-b200-k8s)

These categories will be used when setting up Run:ai for the first time via the BCM setup assistant. More details and full instructions are available in the Run:ai BCM Install Getting Started Guide.
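
If any of these categories do not yet exist, they can be created in cmsh by cloning an existing category and assigning the matching software image. A minimal sketch using the dgx-b200-k8s category and the image names shown earlier in this guide:

[clio-headnode-01]% category
[clio-headnode-01->category]% clone dgx-b200-slurm dgx-b200-k8s
[clio-headnode-01->category*]% use dgx-b200-k8s
[clio-headnode-01->category*[dgx-b200-k8s*]]% set softwareimage dgx-b200-k8s-image
[clio-headnode-01->category*[dgx-b200-k8s*]]% commit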

Validate Run:ai#

To validate Run:ai, refer to the Run:ai usage guides for deploying single-GPU training jobs, multi-node training jobs, single-GPU inference jobs, and multi-GPU inference jobs.

Appendix A - Site survey#

Sample Site Survey#

_images/image102.png _images/image103.png _images/image104.png

Appendix B - Switch Configuration#

SN4600C-1/TOR1 Configuration#

nv set bridge domain br_default vlan 100-103
nv set interface bond1 bond member swp1
nv set interface bond1 bond mlag id 1
nv set interface bond1-11,13-48,51 bond lacp-bypass on
nv set interface bond1-32 bridge domain br_default access 102
nv set interface bond33-48 bridge domain br_default access 103
nv set interface bond1-48,51 bond mlag enable on
nv set interface bond1-48,51 type bond
nv set interface bond2 bond member swp2
nv set interface bond2 bond mlag id 2
nv set interface bond3 bond member swp3
nv set interface bond3 bond mlag id 3
nv set interface bond4 bond member swp4
nv set interface bond4 bond mlag id 4
nv set interface bond5 bond member swp5
nv set interface bond5 bond mlag id 5
nv set interface bond6 bond member swp6
nv set interface bond6 bond mlag id 6
nv set interface bond7 bond member swp7
nv set interface bond7 bond mlag id 7
nv set interface bond8 bond member swp8
nv set interface bond8 bond mlag id 8
nv set interface bond9 bond member swp9
nv set interface bond9 bond mlag id 9
nv set interface bond10 bond member swp10
nv set interface bond10 bond mlag id 10
nv set interface bond11 bond member swp11
nv set interface bond11 bond mlag id 11
nv set interface bond12 bond member swp12
nv set interface bond12 bond mlag id 12
nv set interface bond13 bond member swp13
nv set interface bond13 bond mlag id 13
nv set interface bond14 bond member swp14
nv set interface bond14 bond mlag id 14
nv set interface bond15 bond member swp15
nv set interface bond15 bond mlag id 15
nv set interface bond16 bond member swp16
nv set interface bond16 bond mlag id 16
nv set interface bond17 bond member swp17
nv set interface bond17 bond mlag id 17
nv set interface bond18 bond member swp18
nv set interface bond18 bond mlag id 18
nv set interface bond19 bond member swp19
nv set interface bond19 bond mlag id 19
nv set interface bond20 bond member swp20
nv set interface bond20 bond mlag id 20
nv set interface bond21 bond member swp21
nv set interface bond21 bond mlag id 21
nv set interface bond22 bond member swp22
nv set interface bond22 bond mlag id 22
nv set interface bond23 bond member swp23
nv set interface bond23 bond mlag id 23
nv set interface bond24 bond member swp24
nv set interface bond24 bond mlag id 24
nv set interface bond25 bond member swp25
nv set interface bond25 bond mlag id 25
nv set interface bond26 bond member swp26
nv set interface bond26 bond mlag id 26
nv set interface bond27 bond member swp27
nv set interface bond27 bond mlag id 27
nv set interface bond28 bond member swp28
nv set interface bond28 bond mlag id 28
nv set interface bond29 bond member swp29
nv set interface bond29 bond mlag id 29
nv set interface bond30 bond member swp30
nv set interface bond30 bond mlag id 30
nv set interface bond31 bond member swp31
nv set interface bond31 bond mlag id 31
nv set interface bond32 bond member swp32
nv set interface bond32 bond mlag id 32
nv set interface bond33 bond member swp33
nv set interface bond33 bond mlag id 33
nv set interface bond34 bond member swp34
nv set interface bond34 bond mlag id 34
nv set interface bond35 bond member swp35
nv set interface bond35 bond mlag id 35
nv set interface bond36 bond member swp36
nv set interface bond36 bond mlag id 36
nv set interface bond37 bond member swp37
nv set interface bond37 bond mlag id 37
nv set interface bond38 bond member swp38
nv set interface bond38 bond mlag id 38
nv set interface bond39 bond member swp39
nv set interface bond39 bond mlag id 39
nv set interface bond40 bond member swp40
nv set interface bond40 bond mlag id 40
nv set interface bond41 bond member swp41
nv set interface bond41 bond mlag id 41
nv set interface bond42 bond member swp42
nv set interface bond42 bond mlag id 42
nv set interface bond43 bond member swp43
nv set interface bond43 bond mlag id 43
nv set interface bond44 bond member swp44
nv set interface bond44 bond mlag id 44
nv set interface bond45 bond member swp45
nv set interface bond45 bond mlag id 45
nv set interface bond46 bond member swp46
nv set interface bond46 bond mlag id 46
nv set interface bond47 bond member swp47
nv set interface bond47 bond mlag id 47
nv set interface bond48 bond member swp48
nv set interface bond48 bond mlag id 48
nv set interface bond51 bond member swp51
nv set interface bond51 bond mlag id 51
nv set interface bond51 bridge domain br_default untagged 1
nv set interface bond51 bridge domain br_default vlan all
nv set interface eth0 ip address dhcp
nv set interface eth0 ip vrf mgmt
nv set interface eth0 type eth
nv set interface lo ip address 10.160.254.22/32
nv set interface lo type loopback
nv set interface peerlink bond member swp63
nv set interface peerlink bond member swp64
nv set interface peerlink type peerlink
nv set interface peerlink.4094 base-interface peerlink
nv set interface peerlink.4094 type sub
nv set interface peerlink.4094 vlan 4094
nv set interface swp49-50 type swp
nv set interface vlan101-103 ip vrr enable on
nv set interface vlan101-103 ip vrr mac-address 00:1c:73:aa:bb:04
nv set interface vlan101-103 ip vrr state up
nv set interface vlan101-103 type svi
nv set interface vlan101 ip address 10.150.123.2/24
nv set interface vlan101 ip vrr address 10.150.123.1/24
nv set interface vlan101 vlan 101
nv set interface vlan102 ip address 10.150.124.2/24
nv set interface vlan102 ip vrr address 10.150.124.1/24
nv set interface vlan102 vlan 102
nv set interface vlan103 ip address 10.150.125.2/24
nv set interface vlan103 ip vrr address 10.150.125.1/24
nv set interface vlan103 vlan 103
nv set service dhcp-relay default interface vlan102
nv set service dhcp-relay default interface vlan103
nv set service dhcp-relay default server 10.150.124.11
nv set service dhcp-relay default server 10.150.124.12
nv set mlag backup 10.160.254.23
nv set mlag enable on
nv set mlag mac-address 44:38:39:FF:0A:00
nv set mlag peer-ip linklocal
nv set mlag priority 2048
nv set router bgp autonomous-system 4200120327
nv set router bgp enable on
nv set router bgp router-id 10.160.254.22
nv set router vrr enable on
nv set service ntp mgmt server 0.cumulusnetworks.pool.ntp.org
nv set service ntp mgmt server 1.cumulusnetworks.pool.ntp.org
nv set service ntp mgmt server 2.cumulusnetworks.pool.ntp.org
nv set service ntp mgmt server 3.cumulusnetworks.pool.ntp.org
nv set system aaa class nvapply action allow
nv set system aaa class nvapply command-path / permission all
nv set system aaa class nvshow action allow
nv set system aaa class nvshow command-path / permission ro
nv set system aaa class sudo action allow
nv set system aaa class sudo command-path / permission all
nv set system aaa role nvue-admin class nvapply
nv set system aaa role nvue-monitor class nvshow
nv set system aaa role system-admin class nvapply
nv set system aaa role system-admin class sudo
nv set system aaa user cumulus full-name cumulus,,,
nv set system aaa user cumulus hashed-password '*'
nv set system aaa user cumulus role system-admin
nv set system api state enabled
nv set system config auto-save state enabled
nv set system control-plane acl acl-default-dos inbound
nv set system control-plane acl acl-default-whitelist inbound
nv set system hostname SN4600C-1
nv set system reboot mode cold
nv set system ssh-server permit-root-login enabled
nv set system ssh-server state enabled
nv set system ssh-server vrf mgmt
nv set system timezone America/Los_Angeles
nv set system wjh channel forwarding trigger l2
nv set system wjh channel forwarding trigger l3
nv set system wjh channel forwarding trigger tunnel
nv set system wjh enable on
nv set vrf default router bgp address-family ipv4-unicast enable on
nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
nv set vrf default router bgp enable on
nv set vrf default router bgp neighbor peerlink.4094 remote-as internal
nv set vrf default router bgp neighbor peerlink.4094 timers connection-retry 10
nv set vrf default router bgp neighbor peerlink.4094 timers hold 10
nv set vrf default router bgp neighbor peerlink.4094 timers keepalive 3
nv set vrf default router bgp neighbor peerlink.4094 timers route-advertisement auto
nv set vrf default router bgp neighbor peerlink.4094 type unnumbered
nv set vrf default router bgp neighbor swp49 remote-as external
nv set vrf default router bgp neighbor swp49 timers connection-retry 10
nv set vrf default router bgp neighbor swp49 timers hold 10
nv set vrf default router bgp neighbor swp49 timers keepalive 3
nv set vrf default router bgp neighbor swp49 timers route-advertisement auto
nv set vrf default router bgp neighbor swp49 type unnumbered
nv set vrf default router bgp neighbor swp50 remote-as external
nv set vrf default router bgp neighbor swp50 timers connection-retry 10
nv set vrf default router bgp neighbor swp50 timers hold 10
nv set vrf default router bgp neighbor swp50 timers keepalive 3
nv set vrf default router bgp neighbor swp50 timers route-advertisement auto
nv set vrf default router bgp neighbor swp50 type unnumbered

SN4600C-2/TOR2 Configuration#

nv set bridge domain br_default vlan 100-103
nv set interface bond1 bond member swp1
nv set interface bond1 bond mlag id 1
nv set interface bond1-11,13-48,51 bond lacp-bypass on
nv set interface bond1-32 bridge domain br_default access 102
nv set interface bond33-48 bridge domain br_default access 103
nv set interface bond1-48,51 bond mlag enable on
nv set interface bond1-48,51 type bond
nv set interface bond2 bond member swp2
nv set interface bond2 bond mlag id 2
nv set interface bond3 bond member swp3
nv set interface bond3 bond mlag id 3
nv set interface bond4 bond member swp4
nv set interface bond4 bond mlag id 4
nv set interface bond5 bond member swp5
nv set interface bond5 bond mlag id 5
nv set interface bond6 bond member swp6
nv set interface bond6 bond mlag id 6
nv set interface bond7 bond member swp7
nv set interface bond7 bond mlag id 7
nv set interface bond8 bond member swp8
nv set interface bond8 bond mlag id 8
nv set interface bond9 bond member swp9
nv set interface bond9 bond mlag id 9
nv set interface bond10 bond member swp10
nv set interface bond10 bond mlag id 10
nv set interface bond11 bond member swp11
nv set interface bond11 bond mlag id 11
nv set interface bond12 bond member swp12
nv set interface bond12 bond mlag id 12
nv set interface bond13 bond member swp13
nv set interface bond13 bond mlag id 13
nv set interface bond14 bond member swp14
nv set interface bond14 bond mlag id 14
nv set interface bond15 bond member swp15
nv set interface bond15 bond mlag id 15
nv set interface bond16 bond member swp16
nv set interface bond16 bond mlag id 16
nv set interface bond17 bond member swp17
nv set interface bond17 bond mlag id 17
nv set interface bond18 bond member swp18
nv set interface bond18 bond mlag id 18
nv set interface bond19 bond member swp19
nv set interface bond19 bond mlag id 19
nv set interface bond20 bond member swp20
nv set interface bond20 bond mlag id 20
nv set interface bond21 bond member swp21
nv set interface bond21 bond mlag id 21
nv set interface bond22 bond member swp22
nv set interface bond22 bond mlag id 22
nv set interface bond23 bond member swp23
nv set interface bond23 bond mlag id 23
nv set interface bond24 bond member swp24
nv set interface bond24 bond mlag id 24
nv set interface bond25 bond member swp25
nv set interface bond25 bond mlag id 25
nv set interface bond26 bond member swp26
nv set interface bond26 bond mlag id 26
nv set interface bond27 bond member swp27
nv set interface bond27 bond mlag id 27
nv set interface bond28 bond member swp28
nv set interface bond28 bond mlag id 28
nv set interface bond29 bond member swp29
nv set interface bond29 bond mlag id 29
nv set interface bond30 bond member swp30
nv set interface bond30 bond mlag id 30
nv set interface bond31 bond member swp31
nv set interface bond31 bond mlag id 31
nv set interface bond32 bond member swp32
nv set interface bond32 bond mlag id 32
nv set interface bond33 bond member swp33
nv set interface bond33 bond mlag id 33
nv set interface bond34 bond member swp34
nv set interface bond34 bond mlag id 34
nv set interface bond35 bond member swp35
nv set interface bond35 bond mlag id 35
nv set interface bond36 bond member swp36
nv set interface bond36 bond mlag id 36
nv set interface bond37 bond member swp37
nv set interface bond37 bond mlag id 37
nv set interface bond38 bond member swp38
nv set interface bond38 bond mlag id 38
nv set interface bond39 bond member swp39
nv set interface bond39 bond mlag id 39
nv set interface bond40 bond member swp40
nv set interface bond40 bond mlag id 40
nv set interface bond41 bond member swp41
nv set interface bond41 bond mlag id 41
nv set interface bond42 bond member swp42
nv set interface bond42 bond mlag id 42
nv set interface bond43 bond member swp43
nv set interface bond43 bond mlag id 43
nv set interface bond44 bond member swp44
nv set interface bond44 bond mlag id 44
nv set interface bond45 bond member swp45
nv set interface bond45 bond mlag id 45
nv set interface bond46 bond member swp46
nv set interface bond46 bond mlag id 46
nv set interface bond47 bond member swp47
nv set interface bond47 bond mlag id 47
nv set interface bond48 bond member swp48
nv set interface bond48 bond mlag id 48
nv set interface bond51 bond member swp51
nv set interface bond51 bond mlag id 51
nv set interface bond51 bridge domain br_default untagged 1
nv set interface bond51 bridge domain br_default vlan all
nv set interface eth0 ip address dhcp
nv set interface eth0 ip vrf mgmt
nv set interface eth0 type eth
nv set interface lo ip address 10.160.254.23/32
nv set interface lo type loopback
nv set interface peerlink bond member swp63
nv set interface peerlink bond member swp64
nv set interface peerlink type peerlink
nv set interface peerlink.4094 base-interface peerlink
nv set interface peerlink.4094 type sub
nv set interface peerlink.4094 vlan 4094
nv set interface swp49-50 type swp
nv set interface vlan101-103 ip vrr enable on
nv set interface vlan101-103 ip vrr mac-address 00:1c:73:aa:bb:04
nv set interface vlan101-103 ip vrr state up
nv set interface vlan101-103 type svi
nv set interface vlan101 ip address 10.150.123.3/24
nv set interface vlan101 ip vrr address 10.150.123.1/24
nv set interface vlan101 vlan 101
nv set interface vlan102 ip address 10.150.124.3/24
nv set interface vlan102 ip vrr address 10.150.124.1/24
nv set interface vlan102 vlan 102
nv set interface vlan103 ip address 10.150.125.3/24
nv set interface vlan103 ip vrr address 10.150.125.1/24
nv set interface vlan103 vlan 103
nv set service dhcp-relay default interface vlan102
nv set service dhcp-relay default interface vlan103
nv set service dhcp-relay default server 10.150.124.11
nv set service dhcp-relay default server 10.150.124.12

nv set mlag backup 10.160.254.22
nv set mlag enable on
nv set mlag mac-address 44:38:39:FF:0A:00
nv set mlag peer-ip linklocal
nv set mlag priority 2048
nv set router bgp autonomous-system 4200120327
nv set router bgp enable on
nv set router bgp router-id 10.160.254.23
nv set router vrr enable on
nv set service ntp mgmt server 0.cumulusnetworks.pool.ntp.org
nv set service ntp mgmt server 1.cumulusnetworks.pool.ntp.org
nv set service ntp mgmt server 2.cumulusnetworks.pool.ntp.org
nv set service ntp mgmt server 3.cumulusnetworks.pool.ntp.org
nv set system aaa class nvapply action allow
nv set system aaa class nvapply command-path / permission all
nv set system aaa class nvshow action allow
nv set system aaa class nvshow command-path / permission ro
nv set system aaa class sudo action allow
nv set system aaa class sudo command-path / permission all
nv set system aaa role nvue-admin class nvapply
nv set system aaa role nvue-monitor class nvshow
nv set system aaa role system-admin class nvapply
nv set system aaa role system-admin class sudo
nv set system aaa user cumulus full-name cumulus,,,
nv set system aaa user cumulus hashed-password '*'
nv set system aaa user cumulus role system-admin
nv set system api state enabled
nv set system config auto-save state enabled
nv set system control-plane acl acl-default-dos inbound
nv set system control-plane acl acl-default-whitelist inbound
nv set system hostname SN4600C-2
nv set system reboot mode cold
nv set system ssh-server permit-root-login enabled
nv set system ssh-server state enabled
nv set system ssh-server vrf mgmt
nv set system timezone America/Los_Angeles
nv set system wjh channel forwarding trigger l2
nv set system wjh channel forwarding trigger l3
nv set system wjh channel forwarding trigger tunnel
nv set system wjh enable on
nv set vrf default router bgp address-family ipv4-unicast enable on
nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
nv set vrf default router bgp enable on
nv set vrf default router bgp neighbor peerlink.4094 remote-as internal
nv set vrf default router bgp neighbor peerlink.4094 timers connection-retry 10
nv set vrf default router bgp neighbor peerlink.4094 timers hold 10
nv set vrf default router bgp neighbor peerlink.4094 timers keepalive 3
nv set vrf default router bgp neighbor peerlink.4094 timers route-advertisement auto
nv set vrf default router bgp neighbor peerlink.4094 type unnumbered
nv set vrf default router bgp neighbor swp49 remote-as external
nv set vrf default router bgp neighbor swp49 timers connection-retry 10
nv set vrf default router bgp neighbor swp49 timers hold 10
nv set vrf default router bgp neighbor swp49 timers keepalive 3
nv set vrf default router bgp neighbor swp49 timers route-advertisement auto
nv set vrf default router bgp neighbor swp49 type unnumbered
nv set vrf default router bgp neighbor swp50 remote-as external
nv set vrf default router bgp neighbor swp50 timers connection-retry 10
nv set vrf default router bgp neighbor swp50 timers hold 10
nv set vrf default router bgp neighbor swp50 timers keepalive 3
nv set vrf default router bgp neighbor swp50 timers route-advertisement auto
nv set vrf default router bgp neighbor swp50 type unnumbered

SN2201 Switch Configuration-IPMI#

nv set bridge domain br_default vlan 101
nv set interface bond1 bond member swp49
nv set interface bond1 bond member swp50
nv set interface bond1 bridge domain br_default untagged 1
nv set interface bond1 bridge domain br_default vlan all
nv set interface bond1 type bond
nv set interface eth0 ip address dhcp
nv set interface eth0 ip vrf mgmt
nv set interface eth0 type eth
nv set interface swp1-48 bridge domain br_default access 101
nv set interface swp1-48 description 'BMC Ports'
nv set interface swp1-50 link state up
nv set interface swp1-50 type swp
nv set service ntp mgmt server 0.cumulusnetworks.pool.ntp.org
nv set service ntp mgmt server 1.cumulusnetworks.pool.ntp.org
nv set service ntp mgmt server 2.cumulusnetworks.pool.ntp.org
nv set service ntp mgmt server 3.cumulusnetworks.pool.ntp.org
nv set system aaa class nvapply action allow
nv set system aaa class nvapply command-path / permission all
nv set system aaa class nvshow action allow
nv set system aaa class nvshow command-path / permission ro
nv set system aaa class sudo action allow
nv set system aaa class sudo command-path / permission all
nv set system aaa role nvue-admin class nvapply
nv set system aaa role nvue-monitor class nvshow
nv set system aaa role system-admin class nvapply
nv set system aaa role system-admin class sudo
nv set system aaa user cumulus full-name cumulus,,,
nv set system aaa user cumulus hashed-password '*'
nv set system aaa user cumulus role system-admin
nv set system api state enabled
nv set system config auto-save state enabled
nv set system control-plane acl acl-default-dos inbound
nv set system control-plane acl acl-default-whitelist inbound
nv set system hostname IPMI-Basepod-01
nv set system reboot mode cold
nv set system ssh-server permit-root-login enabled
nv set system ssh-server state enabled
nv set system ssh-server vrf mgmt
nv set system timezone America/Los_Angeles
nv set system wjh channel forwarding trigger l2
nv set system wjh channel forwarding trigger l3
nv set system wjh channel forwarding trigger tunnel
nv set system wjh enable on

QM9700-1 Full Configuration#

#
# Running database "initial"
# Generated at 2025/02/15 06:42:32 +0000
# Hostname: QM9700-1
# Product release: 3.12.1002
#

#
# Running-config temporary prefix mode setting
#


#
# IB Partition configuration
#


#
# Subnet Manager configuration
#


#
# IB ports configuration
#





#
# Network interface configuration
#




#
# Other IP configuration
#





#
# Other IPv6 configuration
#


#
# Local user account configuration
#



#
# AAA remote server configuration
#
ldap bind-password ********
radius-server key ********
tacacs-server key ********

#
# Password restriction configuration
#


#
# Network management configuration
#
web proxy auth basic password ********

#
# X.509 certificates configuration
#

Certificate name system-self-signed, ID 9f639fcad62931e3996712b59066cdda047fb176
(public-cert config omitted since private-key config is hidden)


#
# IB nodename to GUID mapping
#




#
# Persistent prefix mode setting
#

QM9700-2 Full Configuration#

#
# Running database "initial"
# Generated at 2025/02/15 06:41:54 +0000
# Hostname: QM9700-2
# Product release: 3.12.1002
#

#
# Running-config temporary prefix mode setting
#


#
# IB Partition configuration
#


#
# Subnet Manager configuration
#


#
# Other IP configuration
#


#
# Local user account configuration
#



#
# AAA remote server configuration
#
ldap bind-password ********
radius-server key ********
tacacs-server key ********

#
# Password restriction configuration
#


#
# Network management configuration
#
web proxy auth basic password ********

#
# X.509 certificates configuration
#

Certificate name system-self-signed, ID 146da5394146409cf2e60c4b7debbedd1e2e6ac4
(public-cert config omitted since private-key config is hidden)


#
# IB nodename to GUID mapping
#




#
# Persistent prefix mode setting
#