Control Plane Configuration Verification Checklist#
This checklist helps verify that all control plane node configurations have been completed correctly after following the steps in Control Plane Node Entries. Use this systematic verification process before proceeding to provisioning or high availability setup.
General Prerequisites#
[ ] Categories, software images, and networks have been defined
[ ] P2P connection documentation is available for MAC addresses
[ ] IP allocation plan is documented and followed
[ ] BMC credentials are documented (asset tags, default usernames)
Head Nodes Verification#
Primary Head Node Configuration#
BMC Interface (rf0/ipmi0)
[ ] BMC interface added (rf0 for DGX SuperPOD, ipmi0 for OEM if rf0 fails)
[ ] IP address assigned on correct IPMI network
[ ] BMC MAC address configured
[ ] Power control set to rf0/ipmi0
Provisioning Bond (bond0)
[ ] Physical interfaces added: enP4s4np0 (M1) and enP6s6np0 (M2)
[ ] Correct MAC addresses assigned to each interface
[ ] Bond configured with mode 4 (LACP)
[ ] IP assigned on internalnet
[ ] Bond set as provisioning interface
[ ] miimon=100 option configured
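Once a node is running (the primary head node, for example), the bond settings above can also be cross-checked against the Linux bonding driver's status file. The following is a minimal sketch, assuming the standard `/proc/net/bonding/<bond>` text format; the function name `check_bond_status` is hypothetical and not part of any NVIDIA tooling.

```python
import re


def check_bond_status(text, expected_members):
    """Return a list of problems found in /proc/net/bonding/<bond> content."""
    problems = []

    # Mode 4 is reported by the bonding driver as "IEEE 802.3ad ...".
    mode = re.search(r"^Bonding Mode:\s*(.+)$", text, re.MULTILINE)
    if not mode or "802.3ad" not in mode.group(1):
        problems.append("bond is not in mode 4 (802.3ad/LACP)")

    # miimon appears as the MII polling interval in milliseconds.
    miimon = re.search(
        r"^MII Polling Interval \(ms\):\s*(\d+)\s*$", text, re.MULTILINE
    )
    if not miimon or int(miimon.group(1)) != 100:
        problems.append("miimon is not 100")

    # Each member interface is listed on a "Slave Interface:" line.
    members = set(re.findall(r"^Slave Interface:\s*(\S+)", text, re.MULTILINE))
    missing = set(expected_members) - members
    if missing:
        problems.append("missing members: " + ", ".join(sorted(missing)))

    return problems
```

To use it, read the contents of `/proc/net/bonding/bond0` on the node and pass them in with the expected members, such as `["enP4s4np0", "enP6s6np0"]`. An empty list means the three checks passed.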
IPMI Network Bond (bond1)
[ ] Physical interfaces added: enP18s18np0 (M3) and enP22s22np0 (M4)
[ ] Correct MAC addresses assigned
[ ] Bond configured with mode 4
[ ] IP assigned on ipminet0
BMC Credentials
[ ] User ID set (typically 2)
[ ] Username configured (OEM default)
[ ] Password set (from asset tag)
Optional Heartbeat Interface
[ ] Extra 1G RJ45 LOM port configured if available for HA
SLURM Login Nodes (slogin) Verification#
Golden Node Configuration#
Node Creation
[ ] Physical node created with category slogin
[ ] Device-level MAC set (M1/M2)
[ ] ARM/C2 architecture confirmed
BMC Configuration
[ ] BMC interface rf0 added
[ ] IP assigned on IPMI network
[ ] BMC MAC address configured
Provisioning Bond (bond0)
[ ] Physical interfaces: enP4s4np0 (M1) and enP6s6np0 (M2)
[ ] Bond mode 4 configured
[ ] IP assigned on internalnet
[ ] Set as provisioning interface
[ ] miimon=100 option configured
Fast Storage Network
[ ] Two physical ports configured: enP18s18np0 (M3) and enP22s22np0 (M4)
[ ] Each port has /31 IP on storagenet
[ ] Correct MAC addresses assigned
BMC Credentials (if unique per node)
[ ] User ID, username, and password configured
Node Cloning#
Second slogin Node
[ ] Cloned using foreach command with --next-ip
[ ] Hostname follows naming convention: <rack>-<ru>-p<podnumber>-slogin-02
[ ] IPs incremented correctly
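The "IPs incremented correctly" item can be checked mechanically once the addresses of the golden node and its clones are collected. This is a minimal sketch, assuming that "incremented" means each clone receives the address immediately after the previous node's; the function name and the sample addresses are hypothetical.

```python
import ipaddress


def ips_increment_by_one(ips):
    """True if every address is exactly one higher than the one before it."""
    addrs = [ipaddress.ip_address(ip) for ip in ips]
    return all(int(b) - int(a) == 1 for a, b in zip(addrs, addrs[1:]))


# Hypothetical internalnet addresses for slogin-01 and slogin-02.
print(ips_increment_by_one(["10.1.0.11", "10.1.0.12"]))
```

Run the check once per network (internalnet, the IPMI network, and each storagenet port), listing the golden node's address first.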
K8s-Admin Nodes (k8a) Verification#
Golden Node Configuration#
Node Creation
[ ] Physical node created with category k8s-admin
[ ] x86 architecture confirmed (required for NMX-M software)
[ ] Device-level MAC set (M1/M2)
BMC Configuration
[ ] BMC interface rf0 added
[ ] IP assigned on IPMI network
[ ] BMC MAC address configured
Physical Interfaces
[ ] Four interfaces configured:
  [ ] ens1np0 (M1) with correct MAC
  [ ] enp42s0np0 (M2) with correct MAC
  [ ] enp171s0np0 (M3) with correct MAC
  [ ] enp189s0np0 (M4) with correct MAC
Bond Configuration
[ ] Bond0 (Management/Provisioning)
  [ ] Interfaces: ens1np0 and enp42s0np0
  [ ] Mode 4 (LACP)
  [ ] IP on internalnet
  [ ] Set as provisioning interface
[ ] Bond1 (NVLink COMe Network)
  [ ] Interfaces: enp171s0np0 and enp189s0np0
  [ ] Mode 4 (LACP)
  [ ] Connected to OOB network for NMX-M communication
Node Cloning#
Additional k8s-admin Nodes
[ ] Total of 3 nodes (odd number for quorum)
[ ] Nodes 02 and 03 cloned with foreach command
[ ] Hostnames follow convention: <rack>-<RU>-p<podnumber>-k8a-<arch>-02/03
[ ] IPs incremented correctly
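The reason for an odd node count can be seen from simple majority arithmetic. This is a minimal sketch, assuming the cluster uses majority-based quorum (as etcd-backed Kubernetes control planes do); the function names are illustrative only.

```python
def quorum_size(members):
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can fail while a majority remains."""
    return members - quorum_size(members)


for n in (2, 3, 4, 5):
    print(n, "nodes: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

Three nodes tolerate one failure, and a fourth node raises the quorum size to three without tolerating any more failures, which is why an even count adds cost but no resilience.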
K8s-User Nodes (k8u) Verification#
Golden Node Configuration#
Node Creation
[ ] Physical node created with category k8s-user
[ ] Architecture documented (ARM or x86 supported)
[ ] Device-level MAC set (M1/M2)
BMC Configuration
[ ] BMC interface rf0 added
[ ] IP assigned on IPMI network
[ ] BMC MAC address configured
Provisioning Bond (bond0)
[ ] Physical interfaces: enP4s4np0 (M1) and enP6s6np0 (M2)
[ ] Bond mode 4 configured
[ ] IP assigned on internalnet
[ ] Set as provisioning interface
Fast Storage Network
[ ] Two physical ports: enP18s18np0 (M3) and enP22s22np0 (M4)
[ ] Each port has /31 IP on storagenet
[ ] Each IP on different border TOR
[ ] Correct MAC addresses assigned
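The two storage addressing items can be checked together. This is a minimal sketch, assuming each /31 is a point-to-point link to one border TOR, so that two ports on different TORs must fall in different /31 subnets; the function name and the addresses in the usage note are hypothetical.

```python
import ipaddress


def check_storage_ports(cidr_m3, cidr_m4):
    """Return a list of problems with the two storagenet port addresses."""
    m3 = ipaddress.ip_interface(cidr_m3)
    m4 = ipaddress.ip_interface(cidr_m4)
    problems = []

    # Each port must carry a /31 (two-address, point-to-point) prefix.
    for label, iface in (("M3", m3), ("M4", m4)):
        if iface.network.prefixlen != 31:
            problems.append(f"{label} has /{iface.network.prefixlen}, expected /31")

    # Assumption: one /31 per TOR link, so the two subnets must differ.
    if m3.network == m4.network:
        problems.append("M3 and M4 are in the same subnet")

    return problems
```

For example, `check_storage_ports("100.64.0.1/31", "100.64.0.3/31")` returns an empty list, while passing two addresses from the same /31 reports that both ports share a subnet.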
BMC Credentials (if unique per node)
[ ] User ID, username, and password configured
Node Cloning#
Additional k8s-user Nodes
[ ] Total of 3 nodes (odd number for quorum)
[ ] Nodes 02 and 03 cloned with foreach command
[ ] Hostnames follow convention: <rack>-<ru>-p<podnumber>-k8u-<arch>-02/03
[ ] IPs incremented correctly
Final Verification Steps#
Network Connectivity#
Interface Status Check
[ ] All interfaces show “always” in the Start if column
[ ] Bond interfaces show correct member interfaces
[ ] IP addresses match planning documentation
Network Assignment Verification
[ ] internalnet assignments for all provisioning bonds
[ ] ipminet0 assignments for IPMI/OOB interfaces
[ ] storagenet assignments for storage interfaces
Documentation and Naming#
Hostname Conventions
[ ] All nodes follow <RACK>-<RU>-P[1-16]-<TYPE>-0[1-X] pattern
[ ] HEAD nodes: P[1-16]-HEAD-0[1-2]
[ ] SLOGIN nodes: P[1-16]-SLOGIN-0[1-2]
[ ] K8U nodes: P[1-16]-K8U-0[1-3]
[ ] K8A nodes: P[1-16]-K8A-0[1-3]
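The hostname patterns above lend themselves to a regular-expression check. This is a minimal sketch with several assumptions that are not stated in this checklist: rack and RU are treated as plain alphanumeric tokens, matching is case-insensitive (the cloning sections use lowercase names), and an optional `<arch>` segment is accepted because the k8a and k8u cloning conventions include one. The names `HOSTNAME_RE`, `MAX_INDEX`, and `is_valid_hostname` are hypothetical.

```python
import re

# Highest node index allowed per node type, taken from the checklist.
MAX_INDEX = {"HEAD": 2, "SLOGIN": 2, "K8U": 3, "K8A": 3}

HOSTNAME_RE = re.compile(
    r"^(?P<rack>[A-Za-z0-9]+)-(?P<ru>[A-Za-z0-9]+)"  # assumed token format
    r"-P(?P<pod>1[0-6]|[1-9])"                       # pod number 1 to 16
    r"-(?P<type>HEAD|SLOGIN|K8U|K8A)"
    r"(?:-(?P<arch>[A-Za-z0-9]+))?"                  # optional, assumed
    r"-0(?P<index>[1-9])$",
    re.IGNORECASE,
)


def is_valid_hostname(name):
    """True if the name fits the pattern and the per-type index limit."""
    match = HOSTNAME_RE.match(name)
    if match is None:
        return False
    limit = MAX_INDEX[match.group("type").upper()]
    return int(match.group("index")) <= limit
```

Under these assumptions, a name with pod number 17 or a third HEAD node is rejected. Adjust the rack and RU token patterns to match the site's actual labeling scheme.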
Ready for Next Steps#
High Availability Setup
[ ] Primary head node configuration complete
[ ] Ready for HA setup process (secondary head node will be auto-configured)
Provisioning Readiness
[ ] All control nodes defined and configured
[ ] Ready to begin provisioning process
[ ] Network infrastructure validated
Commands for Quick Verification#
Use these commands to quickly verify configurations:
Check Interface Configuration
cmsh -c "device use <node-name>;interfaces;list"
Verify BMC Settings
cmsh -c "device use <node-name>;bmcsettings;show"
Check Bond Configuration
cmsh -c "device use <node-name>;interfaces;use bond0;show"
List All Nodes by Category
cmsh -c "device;list -c slogin"
cmsh -c "device;list -c k8s-admin"
cmsh -c "device;list -c k8s-user"
Verify Network Assignments
cmsh -c "networks;list"
cmsh -c "device use <node-name>;interfaces;list"
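When many nodes need checking, the per-node commands above can be generated rather than typed. This is a minimal sketch that only builds and prints the command lines listed in this section and does not run them; the function name and the node names in the loop are hypothetical placeholders.

```python
import shlex


def verification_commands(node_name):
    """Build the per-node cmsh checks from this section as argument lists."""
    return [
        ["cmsh", "-c", f"device use {node_name};interfaces;list"],
        ["cmsh", "-c", f"device use {node_name};bmcsettings;show"],
        ["cmsh", "-c", f"device use {node_name};interfaces;use bond0;show"],
    ]


# Hypothetical node names; substitute the real hostnames.
for node in ("node-01", "node-02"):
    for argv in verification_commands(node):
        print(shlex.join(argv))
```

Each argument list can be passed to `subprocess.run` on a head node where `cmsh` is available, which avoids shell quoting issues with the semicolon-separated command string.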
Next Steps#
Once all items in this checklist are verified:
1. Proceed to Finalize Headnode Setup
2. Configure High Availability if required
Note
This checklist should be completed before proceeding to the provisioning phase. Any missing configurations should be addressed by returning to the appropriate sections in Control Plane Node Entries.