Manual Addition of GB200/GB300 Rack Entries
If the GB200/GB300 rack components need to be added manually in BCM (through cmsh), the explicit steps and requirements are documented here. Between GB200 and GB300 there are several differences that must be taken into account. Critically, a GB300 compute tray has only a single two-port BlueField-3 network card, so no bond connections are configured as they are on GB200. In addition, some PCIe Bus, Device, and Function (BDF) numbers have changed.
GB200/GB300 Compute Tray Golden Node
Add the rack entry into cmsh.
Add rack entry commands
cmsh -c "rack; add <rack number>; set x-coordinate 1; set y-coordinate 1; commit"

Add the node entry. Follow the nomenclature described in the rack inventory section.
Tip
To skip steps 2-9, jump directly to the one-shot command summary instead of following the interactive steps.
Add node entry commands
cmsh -c "device; add physicalnode <compute-node-01>; commit"
Set category (many attributes will be inherited based on what was set up for the GB200/GB300 category).
Set category commands
cmsh -c "device; use <compute-node-01>; set category <GB200/GB300 category>; commit"
Note
For this golden node entry, which will be cloned into the other node entries for a rack, the category can be changed later if needed, depending on whether the node entry will be managed under SLURM or Kubernetes. A separate category and software image is created for each workload manager in a different step.
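As a sketch of that later step, the per-workload-manager categories can be derived by cloning the base category in cmsh; the clone names below are hypothetical placeholders:

```shell
# Hypothetical example: clone the base category once per workload manager,
# then point each clone at its own software image later (names are placeholders).
cmsh -c "category; clone <GB200/GB300 category> gb200-slurm; commit"
cmsh -c "category; clone <GB200/GB300 category> gb200-k8s; commit"
```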
Add the BMC connection (rf0 for Redfish-based control, ipmi0 for ipmitool; the default is rf0). This sets the power control of the node to this interface.

Add BMC connection commands
cmsh -c "device; use <compute-node-01>; interfaces; add bmc rf0; set network <name of ipminet where ipmi is configured>; set ip <bmc IP>; set mac <bmc mac>; commit"
Note
If the BMC MAC address is not available, omit the set mac command.
Add BlueField-3 and ConnectX-7/ConnectX-8 interfaces.
GB200 Compute Tray BlueField-3 and CX-7 interfaces:
The individual netnames for GB200 compute tray management ports 1 and 2 and storage ports 1 and 2 are:
M1—enP6p3s0f0np0
M2—enP22p3s0f0np0
S1—enP6p3s0f1np1
S2—enP22p3s0f1np1
Add GB200 BlueField-3 and CX-7 interface commands
cmsh -c "device; use <compute-node-01>; interfaces; add physical enP6p3s0f0np0; set mac <M1 MAC>; add physical enP22p3s0f0np0; set mac <M2 MAC>; add physical enP6p3s0f1np1; set network <storage network name>; set mac <S1 MAC>; add physical enP22p3s0f1np1; set network <storage network name>; set mac <S2 MAC>; commit"
Note
If the MACs are not available, they can be set after cloning the golden node entry; omit the set mac commands in that case.

M1 and M2 will be configured as bond0, so the IP address and network are set up during the bond configuration. No IP is assigned to these interfaces directly.
S1 and S2 are configured as individual interfaces with their own IP addresses on the storage network.
If the 1G LOM port is to be used for provisioning, it can be added to the interface configuration as interface enP5p9s0.
GB300 Compute Tray BlueField-3 and CX-8 interfaces:
The individual netnames for GB300 compute tray management port 1 and storage port 1 are:
M1—enP22p3s0f0np0
S1—enP22p3s0f1np1
Add GB300 BlueField-3 and CX-8 interface commands
cmsh -c "device; use <compute-node-01>; interfaces; add physical enP22p3s0f0np0; set network <internalnet or whatever network it is being provisioned on>; set mac <M1 MAC>; set ip <M1 IP>; add physical enP22p3s0f1np1; set network <storage network name>; set mac <S1 MAC>; commit"
Note
If the MACs are not available, they can be set after cloning the golden node entry; omit the set mac commands in that case.

M1 (enP22p3s0f0np0) is configured with a network and IP address for provisioning.
S1 is configured as an individual interface with its own IP address on the storage network.
There is no bond connection configuration for GB300 compute trays.
If the 1G LOM port is to be used for provisioning, it can be added to the interface configuration as interface enP5p9s0.
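A minimal sketch of adding the LOM port, assuming it sits on the provisioning network (the network name and IP below are placeholders to adapt):

```shell
# Hypothetical example: add the 1G LOM port as an additional interface
# (network name and IP are placeholders, not values from this guide).
cmsh -c "device; use <compute-node-01>; interfaces; add physical enP5p9s0; set network <provisioning network>; set ip <LOM IP>; commit"
```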
Configure bonds (GB200 compute trays only). Add and configure bond0 (assuming LACP bonding is enabled).
GB200 Compute Tray:
Configure bond0 commands
cmsh -c "device; use <compute-node-01>; interfaces; add bond bond0; set interfaces enP6p3s0f0np0 enP22p3s0f0np0; set mode 4; set options miimon=100; set network <internalnet or whatever network it is being provisioned on>; set ip <BOND_IP>; commit"
Set the provisioning interface.
Set provisioning interface command for GB200 Compute Tray
cmsh -c "device; use <gb200-compute-node-01>; set provisioninginterface bond0; commit"
Set provisioning interface command for GB300 Compute Tray
cmsh -c "device; use <gb300-compute-node-01>; set provisioninginterface enP22p3s0f0np0; commit"
Note
If the 1G LOM port is used for provisioning, set that interface (e.g., enP5p9s0) as the provisioning interface here instead of bond0 (for GB200) or enP22p3s0f0np0 (for GB300).

Add InfiniBand interfaces.
Add InfiniBand interface commands
cmsh -c "device; use <compute-node-01>; interfaces; add physical ibp3s0; set network computenet; set ip <IP>; commit"
cmsh -c "device; use <compute-node-01>; interfaces; add physical ibP2p3s0; set network computenet; set ip <IP>; commit"
cmsh -c "device; use <compute-node-01>; interfaces; add physical ibP16p3s0; set network computenet; set ip <IP>; commit"
cmsh -c "device; use <compute-node-01>; interfaces; add physical ibP18p3s0; set network computenet; set ip <IP>; commit"

Omit the set ip command if the IP is not yet available.
Note
The InfiniBand interfaces listed above are the same for both GB200 and GB300 compute trays.
Set system MAC for initial boot.
For GB200, choose the M1 or M2 MAC and set it on the device, unless console or remote BMC/KVM access to the node is available to select the boot interface manually.
For GB300, use only the M1 MAC, that is, the MAC of interface enP22p3s0f0np0.
Set system MAC commands
# For GB200:
cmsh -c "device; use <gb200-compute-node-01>; set mac <M1 or M2 MAC>; commit"
# For GB300:
cmsh -c "device; use <gb300-compute-node-01>; set mac <M1 MAC or enP22p3s0f0np0 MAC>; commit"
Clone 18 entries.
Clone 18 entries for GB200 compute trays
cmsh -c "device; foreach -o <goldennode-gb200> -n <rack number>-<pod number>-gb200-c01..<rack number>-<pod number>-gb200-c18 --next-ip (); commit"
Clone 18 entries for GB300 compute trays
cmsh -c "device; foreach -o <goldennode-gb300> -n <rack number>-<pod number>-gb300-c01..<rack number>-<pod number>-gb300-c18 --next-ip (); commit"

After cloning, update the system MAC of each entry.
Update system MAC command for GB200 entries
cmsh -c "device; use <GB200-NODE-01>; set mac <M1_OR_M2_MAC>; commit"
Update system MAC command for GB300 entries
cmsh -c "device; use <GB300-NODE-01>; set mac <M1_MAC_OR_ENP22P3S0F0NP0_MAC>; commit"
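The clone range expands to 18 hostnames; they can be previewed before committing. A sketch, using hypothetical rack, pod, and tray-type values:

```shell
#!/bin/bash
# Sketch: preview the 18 hostnames that the foreach clone range expands to.
# "A05", "p1", and "gb200" are hypothetical placeholders, not values from this guide.
rack="A05"; pod="p1"; tray="gb200"   # set tray="gb300" for GB300 racks

names=""
for i in $(seq -w 1 18); do
    names="${names}${rack}-${pod}-${tray}-c${i}
"
done
printf '%s' "$names"
```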
One-shot non-interactive command (summary)
The following one-liners consolidate Steps 2-9 into a single cmsh -c invocation. Edit the placeholders to match the environment and run the appropriate variant for the tray type.

GB200: One-shot cmsh -c command for Steps 2-9
cmsh -c "device; add physicalnode <gb200-compute-node-01>; use <gb200-compute-node-01>; set category <GB200 category>; interfaces; add bmc rf0; set network <name of ipminet where ipmi is configured>; set ip <bmc IP>; set mac <bmc mac>; add physical enP6p3s0f0np0; set mac <M1 MAC>; add physical enP22p3s0f0np0; set mac <M2 MAC>; add physical enP6p3s0f1np1; set network <storage network name>; set mac <S1 MAC>; set ip <S1 IP>; add physical enP22p3s0f1np1; set network <storage network name>; set mac <S2 MAC>; set ip <S2 IP>; add bond bond0; set interfaces enP6p3s0f0np0 enP22p3s0f0np0; set mode 4; set options miimon=100; set network <internalnet or whatever network it is being provisioned on>; set ip <bond ip>; commit; device; use <gb200-compute-node-01>; set provisioninginterface bond0; commit; interfaces; add physical ibp3s0; set network computenet; set ip <ibp3s0 IP>; commit; add physical ibP2p3s0; set network computenet; set ip <ibP2p3s0 IP>; commit; add physical ibP16p3s0; set network computenet; set ip <ibP16p3s0 IP>; commit; add physical ibP18p3s0; set network computenet; set ip <ibP18p3s0 IP>; commit; device; use <gb200-compute-node-01>; set mac <M1 or M2 MAC>; commit"

GB300: One-shot cmsh -c command for Steps 2-9
cmsh -c "device; add physicalnode <gb300-compute-node-01>; use <gb300-compute-node-01>; set category <GB300 category>; interfaces; add bmc rf0; set network <name of ipminet where ipmi is configured>; set ip <bmc IP>; set mac <bmc mac>; add physical enP22p3s0f0np0; set network <internalnet or whatever network it is being provisioned on>; set mac <M1 MAC>; set ip <M1 IP>; add physical enP22p3s0f1np1; set network <storage network name>; set mac <S1 MAC>; set ip <S1 IP>; commit; device; use <gb300-compute-node-01>; set provisioninginterface enP22p3s0f0np0; commit; interfaces; add physical ibp3s0; set network computenet; set ip <ibp3s0 IP>; commit; add physical ibP2p3s0; set network computenet; set ip <ibP2p3s0 IP>; commit; add physical ibP16p3s0; set network computenet; set ip <ibP16p3s0 IP>; commit; add physical ibP18p3s0; set network computenet; set ip <ibP18p3s0 IP>; commit; device; use <gb300-compute-node-01>; set mac <M1 MAC>; commit"
IPs and MACs will have to be updated manually for each entry.
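One way to make those per-entry updates less error-prone (a sketch, assuming a hypothetical whitespace-separated "hostname mac" list) is to generate the cmsh commands first and review them before running anything:

```shell
#!/bin/bash
# Sketch: emit one cmsh "set mac" command per cloned entry from a hypothetical
# "hostname mac" list; review the output before piping it to a shell.
macs="A05-p1-gb200-c01 aa:bb:cc:00:00:01
A05-p1-gb200-c02 aa:bb:cc:00:00:02"

cmds=$(echo "$macs" | while read -r host mac; do
    echo "cmsh -c \"device; use ${host}; set mac ${mac}; commit\""
done)
echo "$cmds"
```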
Set the rack name and position for each compute tray, NVLink Switch, and Power Shelf.
The following script prompts the user for the hostname nomenclature of the rack to be added. This is needed for each rack to display properly in the rack submenu, a new feature of BCM 11. The script handles all rack components with proper physical positioning and a consistent RU-based hostname mapping throughout the entire 48U rack space.
Manual rack position update script
#!/bin/bash

# --- Get rack type (GB200 or GB300) from the user ---
while true; do
    read -p "Enter the rack type (GB200 or GB300): " RACK_TYPE
    RACK_TYPE_UPPER=$(echo "$RACK_TYPE" | tr '[:lower:]' '[:upper:]')
    if [[ "$RACK_TYPE_UPPER" == "GB200" || "$RACK_TYPE_UPPER" == "GB300" ]]; then
        break
    else
        echo "Invalid input. Please enter 'GB200' or 'GB300'."
    fi
done

# Set the compute tray name based on rack type
if [[ "$RACK_TYPE_UPPER" == "GB200" ]]; then
    COMPUTE_TRAY_NAME="gb200"
else
    COMPUTE_TRAY_NAME="gb300"
fi

# --- Get hostname components from the user ---
echo "Please define the hostname components."
read -p "Enter the 'rack number' for the hostname (this will also be used for 'set rack <rack_number>'): " HOSTNAME_RACK_ID
read -p "Enter the 'pod number' for the hostname (e.g., p1): " HOSTNAME_POD_ID

# Validate inputs (basic check that they are not empty)
if [ -z "$HOSTNAME_RACK_ID" ] || [ -z "$HOSTNAME_POD_ID" ]; then
    echo "Error: Hostname rack number and pod number cannot be empty."
    exit 1
fi

echo
echo "Using Hostname Rack ID (and for 'set rack' command): $HOSTNAME_RACK_ID"
echo "Using Hostname Pod ID: $HOSTNAME_POD_ID"
echo "Rack type: $RACK_TYPE_UPPER"
echo "Hostname formats will be:"
echo "  Compute trays:   ${HOSTNAME_RACK_ID}-<RU>-${HOSTNAME_POD_ID}-${COMPUTE_TRAY_NAME}-n<node_number>"
echo "  NVLink switches: ${HOSTNAME_RACK_ID}-<RU>-${HOSTNAME_POD_ID}-nvsw-n<switch_number>"
echo "  Power shelves:   ${HOSTNAME_RACK_ID}-<RU>-${HOSTNAME_POD_ID}-pwr-n<shelf_number>"
echo

# --- First range of nodes (01 to 08), RU 11-18 ---
k_rack_position=11
echo "Processing nodes 01 to 08..."
for i in {01..08}; do
    host_RU=$k_rack_position   # the hostname RU matches the rack position
    device_hostname="${HOSTNAME_RACK_ID}-${host_RU}-${HOSTNAME_POD_ID}-${COMPUTE_TRAY_NAME}-n${i}"
    echo "Target Device: $device_hostname, Setting Rack Position $k_rack_position in Rack ${HOSTNAME_RACK_ID}"
    cmsh -c "device; use ${device_hostname}; set rack ${HOSTNAME_RACK_ID} $k_rack_position; commit"
    k_rack_position=$((k_rack_position + 1))
done
echo "Finished processing nodes 01 to 08."
echo

# --- Second range of nodes (09 to 18), RU 28-37 ---
k_rack_position=28
echo "Processing nodes 09 to 18..."
for i in {09..18}; do
    host_RU=$k_rack_position
    device_hostname="${HOSTNAME_RACK_ID}-${host_RU}-${HOSTNAME_POD_ID}-${COMPUTE_TRAY_NAME}-n${i}"
    echo "Target Device: $device_hostname, Setting Rack Position $k_rack_position in Rack ${HOSTNAME_RACK_ID}"
    cmsh -c "device; use ${device_hostname}; set rack ${HOSTNAME_RACK_ID} $k_rack_position; commit"
    k_rack_position=$((k_rack_position + 1))
done
echo "Finished processing nodes 09 to 18."
echo

# --- Third range - NVLink switches (01 to 09), RU 19-27 ---
k_rack_position=19
echo "Processing NVLink switches 01 to 09..."
for i in {01..09}; do
    host_RU=$k_rack_position
    switch_hostname="${HOSTNAME_RACK_ID}-${host_RU}-${HOSTNAME_POD_ID}-nvsw-n${i}"
    echo "Target NVLink Switch: $switch_hostname, Setting Rack Position $k_rack_position in Rack ${HOSTNAME_RACK_ID}"
    cmsh -c "device; use ${switch_hostname}; set rack ${HOSTNAME_RACK_ID} $k_rack_position; commit"
    k_rack_position=$((k_rack_position + 1))
done
echo "Finished processing NVLink switches 01 to 09."
echo

# --- Fourth range - Power shelves (01 to 04), RU 6-9, below the first compute tray group ---
k_rack_position=6
echo "Processing Power shelves 01 to 04..."
for i in {01..04}; do
    host_RU=$k_rack_position
    pwr_hostname="${HOSTNAME_RACK_ID}-${host_RU}-${HOSTNAME_POD_ID}-pwr-n${i}"
    echo "Target Power Shelf: $pwr_hostname, Setting Rack Position $k_rack_position in Rack ${HOSTNAME_RACK_ID}"
    cmsh -c "device; use ${pwr_hostname}; set rack ${HOSTNAME_RACK_ID} $k_rack_position; commit"
    k_rack_position=$((k_rack_position + 1))
done
echo "Finished processing Power shelves 01 to 04."
echo

# --- Fifth range - Power shelves (05 to 08), RU 39-42, above the top compute tray group ---
k_rack_position=39
echo "Processing Power shelves 05 to 08..."
for i in {05..08}; do
    host_RU=$k_rack_position
    pwr_hostname="${HOSTNAME_RACK_ID}-${host_RU}-${HOSTNAME_POD_ID}-pwr-n${i}"
    echo "Target Power Shelf: $pwr_hostname, Setting Rack Position $k_rack_position in Rack ${HOSTNAME_RACK_ID}"
    cmsh -c "device; use ${pwr_hostname}; set rack ${HOSTNAME_RACK_ID} $k_rack_position; commit"
    k_rack_position=$((k_rack_position + 1))
done
echo "Finished processing Power shelves 05 to 08."

echo "All nodes, switches, and power shelves processed. Confirm in cmsh with: rack; display <rack id>"
After the nodes have been added, check if the rack positions look correct.
cmsh -c "rack; display <RACK_NUMBER>"
or
cmsh -c "rack; list"   # verify that the nodes are in the expected positions
Rack position reference for a GB200/GB300 rack
Table 3: GB200/GB300 Rack Position Layout (48U Rack, Top-Down View)

Rack Position   Device Type      Device #   Example Hostname
48-46           Infrastructure   N/A        (Empty/Other infrastructure)
45-44           SN2201 TOR/OOB   N/A        (TOR/OOB switch)
43              Infrastructure   N/A        (Empty/Other infrastructure)
42              Power Shelf      08         A05-42-P1-pwr-n08
41              Power Shelf      07         A05-41-P1-pwr-n07
40              Power Shelf      06         A05-40-P1-pwr-n06
39              Power Shelf      05         A05-39-P1-pwr-n05
38              Infrastructure   N/A        (Empty)
37              Compute Tray     18         A05-37-P1-gb200-n18
36              Compute Tray     17         A05-36-P1-gb200-n17
35              Compute Tray     16         A05-35-P1-gb200-n16
34              Compute Tray     15         A05-34-P1-gb200-n15
33              Compute Tray     14         A05-33-P1-gb200-n14
32              Compute Tray     13         A05-32-P1-gb200-n13
31              Compute Tray     12         A05-31-P1-gb200-n12
30              Compute Tray     11         A05-30-P1-gb200-n11
29              Compute Tray     10         A05-29-P1-gb200-n10
28              Compute Tray     09         A05-28-P1-gb200-n09
27              NVLink Switch    09         A05-27-P1-nvsw-n09
26              NVLink Switch    08         A05-26-P1-nvsw-n08
25              NVLink Switch    07         A05-25-P1-nvsw-n07
24              NVLink Switch    06         A05-24-P1-nvsw-n06
23              NVLink Switch    05         A05-23-P1-nvsw-n05
22              NVLink Switch    04         A05-22-P1-nvsw-n04
21              NVLink Switch    03         A05-21-P1-nvsw-n03
20              NVLink Switch    02         A05-20-P1-nvsw-n02
19              NVLink Switch    01         A05-19-P1-nvsw-n01
18              Compute Tray     08         A05-18-P1-gb200-n08
17              Compute Tray     07         A05-17-P1-gb200-n07
16              Compute Tray     06         A05-16-P1-gb200-n06
15              Compute Tray     05         A05-15-P1-gb200-n05
14              Compute Tray     04         A05-14-P1-gb200-n04
13              Compute Tray     03         A05-13-P1-gb200-n03
12              Compute Tray     02         A05-12-P1-gb200-n02
11              Compute Tray     01         A05-11-P1-gb200-n01
10              Infrastructure   N/A        (Empty)
9               Power Shelf      04         A05-9-P1-pwr-n04
8               Power Shelf      03         A05-8-P1-pwr-n03
7               Power Shelf      02         A05-7-P1-pwr-n02
6               Power Shelf      01         A05-6-P1-pwr-n01
5-1             Infrastructure   N/A        (Empty/Other infrastructure)
The power shelves are positioned to distribute power to the compute tray groups above and below them.
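The layout above follows a fixed arithmetic mapping from device number to rack unit, which the positioning script relies on. A small sketch makes that mapping explicit (the function name is illustrative, not part of BCM):

```shell
#!/bin/bash
# Illustrative helper: derive the RU from the table's layout rules.
#   Compute trays 01-08 -> RU 11-18, trays 09-18 -> RU 28-37
#   NVLink switches 01-09 -> RU 19-27
#   Power shelves 01-04 -> RU 6-9, shelves 05-08 -> RU 39-42
ru_for() {
    kind=$1
    n=${2#0}            # strip a leading zero ("09" -> "9")
    case $kind in
        compute) if [ "$n" -le 8 ]; then echo $((10 + n)); else echo $((19 + n)); fi ;;
        nvsw)    echo $((18 + n)) ;;
        pwr)     if [ "$n" -le 4 ]; then echo $((5 + n)); else echo $((34 + n)); fi ;;
    esac
}

ru_for compute 09   # prints 28, matching A05-28-P1-gb200-n09 in the table
```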