Category Creation
Category settings are configured per node type. This usually assumes that every node in a category has the same hardware configuration (in other words, all nodes of a particular type should have the same make, model, and configuration). While mixing different types of hardware in a single category is possible, it is much simpler not to do so.
Each major device type in the control plane is given a category. Category-level settings apply to all nodes within that category. Each category is also assigned a software image, which is used to provision and boot all the nodes in that category.
The categories that need to be defined are:
slogin
k8s-admin
k8s-user
dgx-gb200
Note
The dgx-gb200 category is created by the bcm-post-install module. If that module is not being used, the category must be defined manually (OEMs).
For each category the following tasks need to be completed:
Add <category name>.
cmsh -c "category; add <category name>; commit"
Set the software image.
cmsh -c "category; use <category name>; set softwareimage <category name>-image; commit"
Set the management network.
This is typically the network that the nodes in this category are provisioned from.
cmsh -c "category; use <category name>; set managementnetwork internalnet; commit"
Note
For both the control plane categories and the dgx-gb200 category, the management network is set to internalnet by default.
If bcm-netautogen is used, or if a separate dgxnet is created, the management network should be set to that network (dgxnet) if it is the network that provisions the category.
Ensure that the management network is cleared at the node level so that the nodes inherit this property from the category.
Add BMC login credentials to the category. This works when all nodes in the category have their BMC username and password set to the same values. If not, specify the credentials at the node level for the control plane nodes.
cmsh -c "category use <category name>; bmcsettings; set username <bmc username>; set userid <bmc user id>; set password <bmc password>; commit"
Create and assign a disksetup.xml.
cmsh; category use <category>; set disksetup <double tab to see options>; commit
Note
Press Enter to input the XML manually (or paste it), or use set disksetup <disksetup file name> if the file has already been created.
The disk setup is unique to each control plane node type because each type has different requirements. This is covered in the next section.
For any category that provisions aarch64/ARM architecture nodes, the boot loader must be changed from syslinux to GRUB.
cmsh -c "category use <category name>; set bootloader grub; commit"
or
cmsh; category; use <aarch64/ARM category>; set bootloader grub; commit
For the GB200 category, ensure that the BMC settings are defined so that OOB power control can be established via BCM 11 itself. The firmware management mode also needs to be set for the firmware update process to work properly via BCM.
cmsh -c "category use <gb200 category>; bmcsettings; set firmwaremanagemode GB200; set password 0penBmc; set privilege ADMINISTRATOR; set userid 0; set username root; commit"
or
cmsh; category use <gb200 category>; bmcsettings;
set firmwaremanagemode GB200
set password 0penBmc
set privilege ADMINISTRATOR
set userid 0
set username root
commit
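The per-category steps above lend themselves to scripting. The following is a minimal Python sketch, not part of BCM, that only builds the `cmsh -c` command strings shown in this section so they can be reviewed before being run. The function name `category_commands` and the `aarch64` flag are hypothetical, and the sketch assumes each software image is named `<category name>-image` and that `internalnet` is the provisioning network.

```python
def category_commands(name, network="internalnet", aarch64=False):
    """Return the cmsh one-liners from this section for one category.

    Assumption: the software image is called "<name>-image".
    """
    cmds = [
        f'cmsh -c "category; add {name}; commit"',
        f'cmsh -c "category; use {name}; set softwareimage {name}-image; commit"',
        f'cmsh -c "category; use {name}; set managementnetwork {network}; commit"',
    ]
    if aarch64:
        # aarch64/ARM categories must boot with GRUB instead of syslinux.
        cmds.append(f'cmsh -c "category use {name}; set bootloader grub; commit"')
    return cmds


# Categories listed in this section. dgx-gb200 normally already exists
# when the bcm-post-install module has been used.
CATEGORIES = ["slogin", "k8s-admin", "k8s-user", "dgx-gb200"]

if __name__ == "__main__":
    for category in CATEGORIES:
        for command in category_commands(category):
            print(command)
```

The sketch prints the commands rather than executing them, so the output can be checked against the site's actual image and network names first.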
Control Plane Disk Setup
Each control plane category can have a specific disk setup depending on the server’s hardware model. It is assumed that all the servers in a category are of the same make and model. Because the control nodes have varying hardware topologies, some information about PCIe addressing and topology needs to be gathered first. This is covered in the Hardware Information Gathering section of the Appendix. The disksetup configurations provided below for each category assume that the reference architecture models are used.
Note
If a non-reference server is being used, edit the example(s) below to reflect the drive count and PCI Express addresses of the drives. Take care to preserve the partitioning, because correct partitioning is crucial to the installation of NVIDIA Mission Control Software.
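When adapting the examples to a non-reference server, the main change is the `blockdev` path, which in the reference files follows the pattern `/dev/disk/by-path/pci-<PCI address>-nvme-1`. The following is a minimal Python sketch that builds such a path from a gathered PCI address. The helper name `nvme_by_path` is hypothetical, and the `-nvme-1` suffix is simply copied from the reference files, so the result should be confirmed against `/dev/disk/by-path` on the actual node.

```python
import re

# PCI address in the form used by the examples: domain:bus:device.function
PCI_ADDR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def nvme_by_path(pci_address):
    """Return the by-path device name for an NVMe drive at a PCI address."""
    addr = pci_address.strip().lower()
    if not PCI_ADDR.match(addr):
        raise ValueError(f"not a full PCI address: {pci_address!r}")
    return f"/dev/disk/by-path/pci-{addr}-nvme-1"


if __name__ == "__main__":
    # Addresses taken from the slogin reference file.
    for address in ["0014:01:00.0", "0015:01:00.0"]:
        print(nvme_by_path(address))
```

Rejecting addresses without the four-digit domain is deliberate: the ARM reference servers use non-zero domains such as 0014 and 0015, so a short form would be ambiguous.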
slogin disksetup file
Create and add a slogin disk setup file in /cm/local/apps/cmd/etc/htdocs/disk-setup/slogin-disksetup.xml.
Reference: Disk Setup for slogin nodes (based on Supermicro ARS-221GL-FNB-NC24B-DC Model).
<?xml version="1.0" encoding="UTF-8"?>
<diskSetup>
<device>
<blockdev>/dev/disk/by-path/pci-0014:01:00.0-nvme-1</blockdev>
<partition id="boot1" partitiontype="esp">
<size>512M</size>
<type>linux</type>
<filesystem>fat</filesystem>
<mountPoint>/boot/efi</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</partition>
<partition id="slash1">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-path/pci-0015:01:00.0-nvme-1</blockdev>
<partition id="boot2" partitiontype="esp">
<size>512M</size>
<type>linux</type>
<filesystem>fat</filesystem>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</partition>
<partition id="slash2">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-path/pci-0000:01:00.0-nvme-1</blockdev>
<partition id="var1">
<size>1500G</size>
<type>linux raid</type>
</partition>
<partition id="tmp1">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-path/pci-0001:01:00.0-nvme-1</blockdev>
<partition id="var2">
<size>1500G</size>
<type>linux raid</type>
</partition>
<partition id="tmp2">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<raid id="slashraid">
<member>slash1</member>
<member>slash2</member>
<level>1</level>
<filesystem>ext4</filesystem>
<mountPoint>/</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>
<raid id="varraid">
<member>var1</member>
<member>var2</member>
<level>1</level>
<filesystem>ext4</filesystem>
<mountPoint>/var</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>
<raid id="tmpraid">
<member>tmp1</member>
<member>tmp2</member>
<level>0</level>
<filesystem>ext4</filesystem>
<mountPoint>/tmp</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>
</diskSetup>
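A hand-edited disk setup file can be sanity-checked before it is assigned to a category. The following is a minimal Python sketch, not a BCM tool, that checks two things a typo easily breaks: partition ids are unique, and every RAID `member` names a partition defined in the file. The function name `check_disksetup` and the shortened sample document are illustrative, and the check does not validate the file against the BCM schema.

```python
import xml.etree.ElementTree as ET


def check_disksetup(xml_text):
    """Return a list of problems found in a diskSetup document."""
    root = ET.fromstring(xml_text.strip())
    problems = []
    partition_ids = [p.get("id") for p in root.iter("partition")]
    # Partition ids must be unique so RAID members are unambiguous.
    for pid in sorted(set(partition_ids)):
        if partition_ids.count(pid) > 1:
            problems.append(f"duplicate partition id: {pid}")
    # Every RAID member must refer to a partition defined in the file.
    for raid in root.iter("raid"):
        for member in raid.iter("member"):
            if member.text not in partition_ids:
                problems.append(
                    f"raid {raid.get('id')}: unknown member {member.text}"
                )
    return problems


# Shortened sample in the same shape as the reference file.
SAMPLE = """
<diskSetup>
  <device>
    <blockdev>/dev/disk/by-path/pci-0014:01:00.0-nvme-1</blockdev>
    <partition id="slash1"><size>max</size><type>linux raid</type></partition>
  </device>
  <device>
    <blockdev>/dev/disk/by-path/pci-0015:01:00.0-nvme-1</blockdev>
    <partition id="slash2"><size>max</size><type>linux raid</type></partition>
  </device>
  <raid id="slashraid">
    <member>slash1</member>
    <member>slash2</member>
    <level>1</level>
  </raid>
</diskSetup>
"""

if __name__ == "__main__":
    print(check_disksetup(SAMPLE) or "no problems found")
```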
Reference: slogin disk layout after provisioning
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme0n1 259:0 0 7T 0 disk
├─nvme0n1p1 259:1 0 1.5T 0 part
│ └─md1 9:1 0 1.5T 0 raid1 /var
└─nvme0n1p2 259:2 0 5.5T 0 part
└─md2 9:2 0 11T 0 raid0 /tmp
nvme1n1 259:3 0 7T 0 disk
├─nvme1n1p1 259:4 0 1.5T 0 part
│ └─md1 9:1 0 1.5T 0 raid1 /var
└─nvme1n1p2 259:5 0 5.5T 0 part
└─md2 9:2 0 11T 0 raid0 /tmp
nvme3n1 259:6 0 894.3G 0 disk
├─nvme3n1p1 259:14 0 512M 0 part
└─nvme3n1p2 259:15 0 893.7G 0 part
└─md0 9:0 0 893.7G 0 raid1 /
nvme2n1 259:7 0 894.3G 0 disk
├─nvme2n1p1 259:12 0 512M 0 part /boot/efi
└─nvme2n1p2 259:13 0 893.7G 0 part
└─md0 9:0 0 893.7G 0 raid1 /
root@a03-p1-aps-arm-01:~# df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 240G 62M 240G 1% /run
/dev/md0 879G 7.1G 827G 1% /
none 240G 0 240G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
efivarfs 384K 21K 364K 6% /sys/firmware/efi/efivars
/dev/nvme2n1p1 511M 4.0K 511M 1% /boot/efi
/dev/md1 1.5T 4.2G 1.4T 1% /var
/dev/md2 11T 1.9M 11T 1% /tmp
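The array sizes in the `lsblk` output follow from the RAID levels in the disk setup file: a RAID 0 array adds its members together, while a RAID 1 array mirrors them and is limited to the smallest member. The following Python sketch reproduces that arithmetic as a rough check. The helper name `raid_capacity` is hypothetical, sizes are in the units `lsblk` prints, and md metadata overhead is ignored.

```python
def raid_capacity(level, member_sizes):
    """Approximate usable size of an md array from its member sizes."""
    if level == 0:
        # Striping: capacity is the sum of all members.
        return sum(member_sizes)
    if level == 1:
        # Mirroring: capacity is that of the smallest member.
        return min(member_sizes)
    raise ValueError(f"unsupported RAID level in this sketch: {level}")


if __name__ == "__main__":
    print("md0 (/)   :", raid_capacity(1, [893.7, 893.7]), "G")
    print("md1 (/var):", raid_capacity(1, [1.5, 1.5]), "T")
    print("md2 (/tmp):", raid_capacity(0, [5.5, 5.5]), "T")
```

This matches the listing: two 5.5T partitions striped give the 11T /tmp array, and the mirrored /var and / arrays are the size of a single member.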
Set the disksetup file in the category.
cmsh; category use slogin; set disksetup slogin-disksetup.xml; commit
k8s-admin disksetup file
Create and add a k8s-admin disk setup file in /cm/local/apps/cmd/etc/htdocs/disk-setup/k8s-admin-disksetup.xml.
Reference: Disk setup for k8s-admin nodes (based on Supermicro SYS-221GE-FNB-NC24B-DC model)
<?xml version="1.0" encoding="UTF-8"?>
<diskSetup>
<device>
<blockdev>/dev/disk/by-path/pci-0000:03:00.0-nvme-1</blockdev>
<partition id="boot1" partitiontype="esp">
<size>512M</size>
<type>linux</type>
<filesystem>fat</filesystem>
<mountPoint>/boot/efi</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</partition>
<partition id="slash1">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-path/pci-0000:04:00.0-nvme-1</blockdev>
<partition id="boot2" partitiontype="esp">
<size>512M</size>
<type>linux</type>
<filesystem>fat</filesystem>
<mountPoint>/boot/efi</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</partition>
<partition id="slash2">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-path/pci-0000:3d:00.0-nvme-1</blockdev>
<partition id="shoreline1">
<size>1500G</size>
<type>linux raid</type>
</partition>
<partition id="raid1">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-path/pci-0000:3e:00.0-nvme-1</blockdev>
<partition id="shoreline2">
<size>1500G</size>
<type>linux raid</type>
</partition>
<partition id="raid2">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<raid id="slashraid">
<member>slash1</member>
<member>slash2</member>
<level>1</level>
<filesystem>ext4</filesystem>
<mountPoint>/</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>
<raid id="shorelineraid">
<member>shoreline1</member>
<member>shoreline2</member>
<level>1</level>
</raid>
<raid id="localraid">
<member>raid1</member>
<member>raid2</member>
<level>0</level>
<filesystem>ext4</filesystem>
<mountPoint>/local</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>
</diskSetup>
Reference: k8s-admin disk layout after provisioning
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 1.5T 0 loop
nvme0n1 259:0 0 7T 0 disk
├─nvme0n1p1 259:1 0 1.5T 0 part
│ └─md1 9:1 0 1.5T 0 raid1
└─nvme0n1p2 259:2 0 5.5T 0 part
└─md2 9:2 0 11T 0 raid0 /local
nvme1n1 259:3 0 7T 0 disk
├─nvme1n1p1 259:4 0 1.5T 0 part
│ └─md1 9:1 0 1.5T 0 raid1
└─nvme1n1p2 259:5 0 5.5T 0 part
└─md2 9:2 0 11T 0 raid0 /local
nvme3n1 259:6 0 894.3G 0 disk
├─nvme3n1p1 259:8 0 512M 0 part
└─nvme3n1p2 259:9 0 893.7G 0 part
└─md0 9:0 0 893.7G 0 raid1 /
nvme2n1 259:7 0 894.3G 0 disk
├─nvme2n1p1 259:10 0 512M 0 part /boot/efi
└─nvme2n1p2 259:11 0 893.7G 0 part
└─md0 9:0 0 893.7G 0 raid1 /
a03-p1-nmxm-x86-01# df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 240G 114M 240G 1% /run
/dev/md0 879G 26G 809G 4% /
tmpfs 240G 0 240G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
efivarfs 384K 21K 364K 6% /sys/firmware/efi/efivars
/dev/nvme2n1p1 511M 4.0K 511M 1% /boot/efi
/dev/md2 11T 28K 11T 1% /local
Note
/dev/md1 is an unformatted RAID device used by NMC Autonomous Hardware Recovery (AHR).
Set the disksetup file in the category.
cmsh; category use k8s-admin; set disksetup k8s-admin-disksetup.xml; commit
k8s-user disksetup file
Create and add a k8s-user disk setup file in /cm/local/apps/cmd/etc/htdocs/disk-setup/k8s-user-disksetup.xml.
Reference: Disk setup for k8s-user nodes (based on Supermicro ARS-221GL-FNB-NC24B-DC Model)
<?xml version="1.0" encoding="UTF-8"?>
<diskSetup>
<device>
<blockdev>/dev/disk/by-path/pci-0014:01:00.0-nvme-1</blockdev>
<partition id="boot1" partitiontype="esp">
<size>512M</size>
<type>linux</type>
<filesystem>fat</filesystem>
<mountPoint>/boot/efi</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</partition>
<partition id="slash1">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-path/pci-0015:01:00.0-nvme-1</blockdev>
<partition id="boot2" partitiontype="esp">
<size>512M</size>
<type>linux</type>
<filesystem>fat</filesystem>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</partition>
<partition id="slash2">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-path/pci-0000:01:00.0-nvme-1</blockdev>
<partition id="var1">
<size>1500G</size>
<type>linux raid</type>
</partition>
<partition id="tmp1">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-path/pci-0001:01:00.0-nvme-1</blockdev>
<partition id="var2">
<size>1500G</size>
<type>linux raid</type>
</partition>
<partition id="tmp2">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<raid id="slashraid">
<member>slash1</member>
<member>slash2</member>
<level>1</level>
<filesystem>ext4</filesystem>
<mountPoint>/</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>
<raid id="varraid">
<member>var1</member>
<member>var2</member>
<level>1</level>
<filesystem>ext4</filesystem>
<mountPoint>/var</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>
<raid id="tmpraid">
<member>tmp1</member>
<member>tmp2</member>
<level>0</level>
<filesystem>ext4</filesystem>
<mountPoint>/tmp</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>
</diskSetup>
Reference: k8s-user disk layout after provisioning
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme0n1 259:0 0 7T 0 disk
├─nvme0n1p1 259:1 0 1.5T 0 part
│ └─md1 9:1 0 1.5T 0 raid1 /var
└─nvme0n1p2 259:2 0 5.5T 0 part
└─md2 9:2 0 11T 0 raid0 /tmp
nvme1n1 259:3 0 7T 0 disk
├─nvme1n1p1 259:4 0 1.5T 0 part
│ └─md1 9:1 0 1.5T 0 raid1 /var
└─nvme1n1p2 259:5 0 5.5T 0 part
└─md2 9:2 0 11T 0 raid0 /tmp
nvme3n1 259:6 0 894.3G 0 disk
├─nvme3n1p1 259:14 0 512M 0 part
└─nvme3n1p2 259:15 0 893.7G 0 part
└─md0 9:0 0 893.7G 0 raid1 /
nvme2n1 259:7 0 894.3G 0 disk
├─nvme2n1p1 259:12 0 512M 0 part /boot/efi
└─nvme2n1p2 259:13 0 893.7G 0 part
└─md0 9:0 0 893.7G 0 raid1 /
root@a03-p1-scheduler-01:~# df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 240G 62M 240G 1% /run
/dev/md0 879G 7.1G 827G 1% /
none 240G 0 240G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
efivarfs 384K 21K 364K 6% /sys/firmware/efi/efivars
/dev/nvme2n1p1 511M 4.0K 511M 1% /boot/efi
/dev/md1 1.5T 4.2G 1.4T 1% /var
/dev/md2 11T 1.9M 11T 1% /tmp
Set the disksetup file in the category.
cmsh; category use k8s-user; set disksetup k8s-user-disksetup.xml; commit
DGX GB200 Disk Setup
The following post-install process ensures consistent naming of NVMe/disk drive devices.
Note
For systems that have two M.2 NVMe drives, NVMe multipath is disabled in these instructions. For DGX GB200 systems specifically, this does not have to be done because the compute tray has only a single M.2 OS drive. See the UDEV Rules KB article.
Create the rules file 60-persistent-storage-gb200.rules.
Reference: GB200 rules - 60-persistent-storage-gb200.rules:
########## persistent nvme rules by HW address ##########
KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0015:01:00.0", SYMLINK+="disk/by-id/osdisk-1"
KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0006:07:00.0", SYMLINK+="disk/by-id/raiddisk-1"
KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0006:09:00.0", SYMLINK+="disk/by-id/raiddisk-2"
KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0016:07:00.0", SYMLINK+="disk/by-id/raiddisk-3"
KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0016:09:00.0", SYMLINK+="disk/by-id/raiddisk-4"
########## persistent nvme rules by HW address ##########
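If the PCI addresses differ from the reference system, the rules file can be generated from a small table instead of being edited by hand. The following is a minimal Python sketch that emits one rule per drive, with the match keys and the `SYMLINK` assignment kept together on a single line. The names `GB200_DRIVES`, `udev_rule`, and `render_rules` are hypothetical, and the address table is copied from the reference rules.

```python
# PCI address -> symlink name, copied from the reference rules.
GB200_DRIVES = {
    "0015:01:00.0": "osdisk-1",
    "0006:07:00.0": "raiddisk-1",
    "0006:09:00.0": "raiddisk-2",
    "0016:07:00.0": "raiddisk-3",
    "0016:09:00.0": "raiddisk-4",
}

BANNER = "########## persistent nvme rules by HW address ##########"


def udev_rule(pci_address, name):
    """Return one udev rule, match keys and SYMLINK on the same line."""
    return (
        f'KERNEL=="nvme[0-9]n[0-9]", ATTRS{{address}}=="{pci_address}", '
        f'SYMLINK+="disk/by-id/{name}"'
    )


def render_rules(drives):
    """Return the full rules file text for a mapping of address to name."""
    lines = [BANNER]
    lines += [udev_rule(addr, name) for addr, name in drives.items()]
    lines.append(BANNER)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_rules(GB200_DRIVES), end="")
```

Redirecting the output to 60-persistent-storage-gb200.rules produces a file in the same shape as the reference.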
Add the rules file to /usr/lib/udev/rules.d/60-persistent-storage-gb200.rules on the node-installer images /cm/node-installer and /cm/node-installer-ubuntu2404-aarch64.
cp 60-persistent-storage-gb200.rules /cm/node-installer/usr/lib/udev/rules.d/60-persistent-storage-gb200.rules
cp 60-persistent-storage-gb200.rules /cm/node-installer-ubuntu2404-aarch64/usr/lib/udev/rules.d/60-persistent-storage-gb200.rules
Note
If the head node is C2/ARM, then copying to /cm/node-installer is sufficient.
Copy the rules file to /cm/images/<dgxos image>/usr/lib/udev/rules.d/60-persistent-storage-gb200.rules.
cp 60-persistent-storage-gb200.rules /cm/images/baseos7.1-image-aarch64/usr/lib/udev/rules.d/60-persistent-storage-gb200.rules
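The same file has to land in the node-installer tree(s) and in the software image, so it is easy to miss one. The following is a minimal Python sketch of the copy step that fails loudly if a target tree lacks the udev rules directory. The function name `install_rules` is hypothetical, and the roots to pass in (for example /cm/node-installer and /cm/images/baseos7.1-image-aarch64) depend on the site and must be adjusted.

```python
import shutil
from pathlib import Path

RULES_NAME = "60-persistent-storage-gb200.rules"
RULES_DIR = "usr/lib/udev/rules.d"  # relative to each image root


def install_rules(src, roots):
    """Copy the rules file under each root; return the destination paths."""
    written = []
    for root in roots:
        dest_dir = Path(root) / RULES_DIR
        if not dest_dir.is_dir():
            # A missing directory usually means the wrong image root was given.
            raise FileNotFoundError(f"missing directory: {dest_dir}")
        dest = dest_dir / RULES_NAME
        shutil.copyfile(src, dest)
        written.append(dest)
    return written
```

A call such as `install_rules(RULES_NAME, ["/cm/node-installer", "/cm/images/baseos7.1-image-aarch64"])` would mirror the `cp` commands above.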
Create the disk setup file (gb200-disksetup.xml) and add it to the directory /cm/local/apps/cmd/etc/htdocs/disk-setup.
Reference: DGX GB200 compute tray disk setup:
<?xml version="1.0" encoding="UTF-8"?>
<diskSetup>
<device>
<blockdev>/dev/disk/by-id/osdisk-1</blockdev>
<partition id="efi" partitiontype="esp">
<size>100M</size>
<type>linux</type>
<filesystem>fat</filesystem>
<mountPoint>/boot/efi</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</partition>
<partition id="boot1">
<size>4G</size>
<type>linux</type>
<filesystem>ext2</filesystem>
<mountPoint>/boot</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</partition>
<partition id="slash1">
<size>max</size>
<type>linux</type>
<filesystem>ext4</filesystem>
<mountPoint>/</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-id/raiddisk-1</blockdev>
<partition id="raid1">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-id/raiddisk-2</blockdev>
<partition id="raid2">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-id/raiddisk-3</blockdev>
<partition id="raid3">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<device>
<blockdev>/dev/disk/by-id/raiddisk-4</blockdev>
<partition id="raid4">
<size>max</size>
<type>linux raid</type>
</partition>
</device>
<raid id="scratch_local">
<member>raid1</member>
<member>raid2</member>
<member>raid3</member>
<member>raid4</member>
<level>0</level>
<filesystem>ext4</filesystem>
<mountPoint>/scratch_local</mountPoint>
<mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>
</diskSetup>
Note
DGX GB200 systems use the disk-by-id method for disk setup, which is consistent with the KB article.
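Because the disk setup refers to the drives only through the symlinks created by the udev rules, a name that appears in one file but not the other leaves the node without a matching device. The following is a minimal Python sketch that cross-checks the two files. The function names and the shortened sample inputs are illustrative, and the check only compares names; it does not confirm that the PCI addresses are correct for the hardware.

```python
import re
import xml.etree.ElementTree as ET


def symlinks_in_rules(rules_text):
    """Return the device paths created by SYMLINK+= assignments."""
    names = re.findall(r'SYMLINK\+="([^"]+)"', rules_text)
    return {"/dev/" + name for name in names}


def missing_symlinks(rules_text, disksetup_xml):
    """Return blockdev entries that no udev rule creates."""
    defined = symlinks_in_rules(rules_text)
    root = ET.fromstring(disksetup_xml.strip())
    wanted = [b.text.strip() for b in root.iter("blockdev")]
    return [dev for dev in wanted if dev not in defined]


# Shortened samples in the same shape as the reference files.
RULES = (
    'KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0015:01:00.0", '
    'SYMLINK+="disk/by-id/osdisk-1"\n'
    'KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0006:07:00.0", '
    'SYMLINK+="disk/by-id/raiddisk-1"\n'
)
DISKSETUP = """
<diskSetup>
  <device><blockdev>/dev/disk/by-id/osdisk-1</blockdev></device>
  <device><blockdev>/dev/disk/by-id/raiddisk-1</blockdev></device>
</diskSetup>
"""

if __name__ == "__main__":
    print(missing_symlinks(RULES, DISKSETUP) or "all blockdev names are defined")
```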
Set the disksetup file in the dgx-gb200 category.
cmsh; category use dgx-gb200; set disksetup gb200-disksetup.xml; commit