Category Creation

Category settings are configured per node type and apply to that type of node. This assumes that all nodes of a given type have the same hardware (the same make, model, and configuration). While mixing different types of hardware in a single category is possible, it is much simpler not to do so.

Each major device type in the control plane is given a category. Category-level settings apply to all nodes within that category. Each category is also assigned a software image, which is used to provision and boot all nodes in that category.

The categories that need to be defined are:

  • slogin

  • k8s-admin

  • k8s-user

  • dgx-gb200

Note

The dgx-gb200 category is created by the bcm-post-install module. If that module is not being used, the category must be defined manually (OEMs).

For each category the following tasks need to be completed:

  1. Add <category name>.

cmsh -c "category; add <category name>; commit"
  1. Set the software image.

cmsh -c "category; use <category name>; set softwareimage <category name>-image; commit"
  1. Set the management network.

This is typically the network that the nodes in this category are provisioned from.

cmsh -c "category; use <category name>; set managementnetwork internalnet; commit"

Note

  • For both the control plane categories and the dgx-gb200 category, the management network is set to internalnet by default.

  • If bcm-netautogen is used, or if a separate dgxnet is created, set the management network to match (dgxnet) if that is the network that provisions the category.

  • Ensure the management network is cleared at the node level so that nodes inherit this property from the category.

  1. Add BMC login credentials to the category. This works if all nodes in the category have their BMC username and password set to the same values. If not, specify the credentials at the node level for the control plane nodes.

cmsh -c "category use <category name>; bmcsettings; set username <bmc username>; set userid <bmc user id>; set password <bmc password>; commit"
  1. Create and assign a disksetup.xml.

cmsh; category use <category>; set disksetup <double tab to see options>; commit

Note

  • Press Enter to input the XML manually (or copy and paste it), or use set disksetup <disksetup file name> if the file has already been created.

The disk setup is unique to each control plane node type, because each type has different requirements. This is covered in the next section.

  1. For any category that provisions aarch64/ARM architecture nodes, the boot loader must be changed from syslinux to GRUB.

cmsh -c "category use <category name>; set bootloader grub; commit"

or

cmsh; category; use <aarch64/ARM category>; set bootloader grub; commit
  1. For the GB200 category, ensure that the BMC settings are defined so that OOB power control can be established via BCM 11 itself. The firmware management mode also needs to be set for the firmware update process to work properly via BCM.

cmsh -c "category use <gb200 category>; bmcsettings; set firmwaremanagemode GB200; set password 0penBmc; set privilege ADMINISTRATOR; set userid 0; set username root; commit"

or

cmsh; category use <gb200 category>; bmcsettings;
set firmwaremanagemode GB200
set password 0penBmc
set privilege ADMINISTRATOR
set userid 0
set username root
commit
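
The first three per-category steps above (add the category, set the software image, set the management network) can be collected into a small dry-run helper. This is a minimal sketch that only prints the cmsh commands so they can be reviewed before being run by hand. The function name print_category_cmds is hypothetical, the <category name>-image naming follows the software image step above, and internalnet is assumed as the default management network as described in the note.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) the cmsh commands for a category.
# print_category_cmds is a hypothetical helper name, not part of BCM.

print_category_cmds() {
    local category="$1"
    local network="${2:-internalnet}"   # assumed default management network
    printf 'cmsh -c "category; add %s; commit"\n' "$category"
    printf 'cmsh -c "category; use %s; set softwareimage %s-image; commit"\n' "$category" "$category"
    printf 'cmsh -c "category; use %s; set managementnetwork %s; commit"\n' "$category" "$network"
}

# dgx-gb200 is left out here because bcm-post-install normally creates it.
for category in slogin k8s-admin k8s-user; do
    print_category_cmds "$category"
done
```

Passing a second argument, for example print_category_cmds dgx-gb200 dgxnet, prints the same commands with a different management network. BMC credentials, disk setup, and boot loader settings are deliberately left to the manual steps because they differ per category.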

Control Plane Disk Setup

Each control plane category can have a specific disk setup depending on the server’s hardware model. It is assumed that all servers in a category are of the same make and model. Because control plane nodes vary in hardware topology, PCIe addressing and topology information must be gathered first. This is covered in the Hardware Information Gathering section of the Appendix. The disksetup configurations below are provided for each category, assuming the reference architecture models are used.

Note

If a non-reference server is being used, edit the example(s) below to reflect the drive count and PCI Express addresses of the drives. The partitioning itself must remain correct, because it is crucial to the installation of NVIDIA Mission Control software.
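
When adapting the examples to non-reference hardware, the PCI Express address of a drive is the part of its /dev/disk/by-path name between pci- and -nvme-. The following is a minimal sketch for listing those addresses on a running node. The helper name pci_addr_from_bypath is hypothetical, and the loop assumes that NVMe drives appear as pci-<address>-nvme-<n> links, as they do in the reference files in this section.

```shell
#!/usr/bin/env bash
# Sketch: extract the PCI address from a /dev/disk/by-path NVMe link name.
# pci_addr_from_bypath is a hypothetical helper, not a BCM or udev tool.

pci_addr_from_bypath() {
    local name="${1##*/}"              # strip any leading directory
    name="${name#pci-}"                # drop the "pci-" prefix
    printf '%s\n' "${name%%-nvme-*}"   # drop "-nvme-<n>" and anything after it
}

# List whole-disk NVMe links; partition links (ending in -partN) are skipped.
for link in /dev/disk/by-path/pci-*-nvme-*; do
    [ -e "$link" ] || continue
    case "$link" in *-part*) continue ;; esac
    printf '%s -> %s (PCI %s)\n' "$link" "$(readlink -f "$link")" "$(pci_addr_from_bypath "$link")"
done
```

The printed addresses are the values to substitute into the blockdev entries of the disk setup files.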

slogin disksetup file

  1. Create and add a slogin disk setup file in /cm/local/apps/cmd/etc/htdocs/disk-setup/slogin-disksetup.xml.

Reference: Disk Setup for slogin nodes (based on Supermicro ARS-221GL-FNB-NC24B-DC Model).

<?xml version="1.0" encoding="UTF-8"?>

<diskSetup>

<device>
  <blockdev>/dev/disk/by-path/pci-0014:01:00.0-nvme-1</blockdev>
  <partition id="boot1" partitiontype="esp">
    <size>512M</size>
    <type>linux</type>
    <filesystem>fat</filesystem>
    <mountPoint>/boot/efi</mountPoint>
    <mountOptions>defaults,noatime,nodiratime</mountOptions>
  </partition>
  <partition id="slash1">
    <size>max</size>
    <type>linux raid</type>
  </partition>
</device>

<device>
  <blockdev>/dev/disk/by-path/pci-0015:01:00.0-nvme-1</blockdev>
  <partition id="boot2" partitiontype="esp">
    <size>512M</size>
    <type>linux</type>
    <filesystem>fat</filesystem>
    <mountOptions>defaults,noatime,nodiratime</mountOptions>
  </partition>
  <partition id="slash2">
    <size>max</size>
    <type>linux raid</type>
  </partition>
</device>

<device>
  <blockdev>/dev/disk/by-path/pci-0000:01:00.0-nvme-1</blockdev>
  <partition id="var1">
    <size>1500G</size>
    <type>linux raid</type>
  </partition>
  <partition id="tmp1">
    <size>max</size>
    <type>linux raid</type>
  </partition>
</device>

<device>
  <blockdev>/dev/disk/by-path/pci-0001:01:00.0-nvme-1</blockdev>
  <partition id="var2">
    <size>1500G</size>
    <type>linux raid</type>
  </partition>
  <partition id="tmp2">
    <size>max</size>
    <type>linux raid</type>
  </partition>
</device>

<raid id="slashraid">
  <member>slash1</member>
  <member>slash2</member>
  <level>1</level>
  <filesystem>ext4</filesystem>
  <mountPoint>/</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>

<raid id="varraid">
  <member>var1</member>
  <member>var2</member>
  <level>1</level>
  <filesystem>ext4</filesystem>
  <mountPoint>/var</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>

<raid id="tmpraid">
  <member>tmp1</member>
  <member>tmp2</member>
  <level>0</level>
  <filesystem>ext4</filesystem>
  <mountPoint>/tmp</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>

</diskSetup>

Reference: slogin disk layout after provisioning

lsblk

NAME         MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1      259:0    0     7T  0 disk
├─nvme0n1p1  259:1    0   1.5T  0 part
│ └─md1         9:1    0   1.5T  0 raid1 /var
└─nvme0n1p2  259:2    0   5.5T  0 part
  └─md2         9:2    0    11T  0 raid0 /tmp
nvme1n1      259:3    0     7T  0 disk
├─nvme1n1p1  259:4    0   1.5T  0 part
│ └─md1         9:1    0   1.5T  0 raid1 /var
└─nvme1n1p2  259:5    0   5.5T  0 part
  └─md2         9:2    0    11T  0 raid0 /tmp
nvme3n1      259:6    0 894.3G  0 disk
├─nvme3n1p1 259:14    0   512M  0 part
└─nvme3n1p2 259:15    0 893.7G  0 part
  └─md0         9:0    0 893.7G  0 raid1 /
nvme2n1      259:7    0 894.3G  0 disk
├─nvme2n1p1 259:12    0   512M  0 part /boot/efi
└─nvme2n1p2 259:13    0 893.7G  0 part
  └─md0         9:0    0 893.7G  0 raid1 /

root@a03-p1-aps-arm-01:~# df -h

Filesystem      Size  Used Avail Use% Mounted on
tmpfs           240G   62M  240G   1% /run
/dev/md0        879G  7.1G  827G   1% /
none            240G     0  240G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
efivarfs        384K   21K  364K   6% /sys/firmware/efi/efivars
/dev/nvme2n1p1  511M  4.0K  511M   1% /boot/efi
/dev/md1        1.5T  4.2G  1.4T   1% /var
/dev/md2         11T  1.9M   11T   1% /tmp
  1. Set the disksetup file in the category.

cmsh; category use slogin; set disksetup slogin-disksetup.xml; commit
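
Before assigning a disk setup file, it can help to confirm that every RAID member names a partition id that is actually defined, since a mistyped id is easy to miss in a hand-edited file. The following is a minimal sketch of such a check. The helper name check_raid_members is hypothetical, and it is a plain-text check that assumes one <member> element per line, as in the reference files. It is not an XML validator.

```shell
#!/usr/bin/env bash
# Sketch: confirm every <member> in a disk setup file names a defined partition id.
# check_raid_members is a hypothetical helper; it assumes one <member> per line.

check_raid_members() {
    local file="$1" member status=0
    for member in $(sed -n 's:.*<member>\(.*\)</member>.*:\1:p' "$file"); do
        # Look for a matching <partition id="..."> opening tag.
        if ! grep -qF "<partition id=\"$member\"" "$file"; then
            printf 'undefined RAID member: %s\n' "$member"
            status=1
        fi
    done
    return "$status"
}

# Example usage against a file in the disk setup directory:
# check_raid_members /cm/local/apps/cmd/etc/htdocs/disk-setup/<disksetup file name>
```

The function prints one line per unmatched member and returns a non-zero status if any are found, so it can be used as a guard before running set disksetup.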

k8s-admin disksetup file

  1. Create and add a k8s-admin disk setup file in /cm/local/apps/cmd/etc/htdocs/disk-setup/k8s-admin-disksetup.xml.

Reference: Disk setup for k8s-admin nodes (based on Supermicro SYS-221GE-FNB-NC24B-DC model)

 <?xml version="1.0" encoding="UTF-8"?>

 <diskSetup>

  <device>
  <blockdev>/dev/disk/by-path/pci-0000:03:00.0-nvme-1</blockdev>
  <partition id="boot1" partitiontype="esp">
   <size>512M</size>
   <type>linux</type>
   <filesystem>fat</filesystem>
   <mountPoint>/boot/efi</mountPoint>
   <mountOptions>defaults,noatime,nodiratime</mountOptions>
  </partition>
  <partition id="slash1">
   <size>max</size>
   <type>linux raid</type>
  </partition>
  </device>

  <device>
  <blockdev>/dev/disk/by-path/pci-0000:04:00.0-nvme-1</blockdev>
  <partition id="boot2" partitiontype="esp">
   <size>512M</size>
   <type>linux</type>
   <filesystem>fat</filesystem>
   <mountPoint>/boot/efi</mountPoint>
   <mountOptions>defaults,noatime,nodiratime</mountOptions>
  </partition>
  <partition id="slash2">
   <size>max</size>
   <type>linux raid</type>
  </partition>
  </device>

  <device>
  <blockdev>/dev/disk/by-path/pci-0000:3d:00.0-nvme-1</blockdev>
  <partition id="shoreline1">
   <size>1500G</size>
   <type>linux raid</type>
  </partition>
  <partition id="raid1">
   <size>max</size>
   <type>linux raid</type>
  </partition>
  </device>

  <device>
  <blockdev>/dev/disk/by-path/pci-0000:3e:00.0-nvme-1</blockdev>
  <partition id="shoreline2">
   <size>1500G</size>
   <type>linux raid</type>
  </partition>
  <partition id="raid2">
   <size>max</size>
   <type>linux raid</type>
  </partition>
  </device>

  <raid id="slashraid">
  <member>slash1</member>
  <member>slash2</member>
  <level>1</level>
  <filesystem>ext4</filesystem>
  <mountPoint>/</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
  </raid>

  <raid id="shorelineraid">
  <member>shoreline1</member>
  <member>shoreline2</member>
  <level>1</level>
  </raid>

  <raid id="localraid">
  <member>raid1</member>
  <member>raid2</member>
  <level>0</level>
  <filesystem>ext4</filesystem>
  <mountPoint>/local</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
  </raid>

</diskSetup>

Reference: k8s-admin disk layout after provisioning

lsblk

NAME         MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
loop0          7:0    0   1.5T  0 loop
nvme0n1      259:0    0     7T  0 disk
├─nvme0n1p1  259:1    0   1.5T  0 part
│ └─md1        9:1    0   1.5T  0 raid1
└─nvme0n1p2  259:2    0   5.5T  0 part
  └─md2        9:2    0    11T  0 raid0 /local
nvme1n1      259:3    0     7T  0 disk
├─nvme1n1p1  259:4    0   1.5T  0 part
│ └─md1        9:1    0   1.5T  0 raid1
└─nvme1n1p2  259:5    0   5.5T  0 part
  └─md2        9:2    0    11T  0 raid0 /local
nvme3n1      259:6    0 894.3G  0 disk
├─nvme3n1p1  259:8    0   512M  0 part
└─nvme3n1p2  259:9    0 893.7G  0 part
  └─md0        9:0    0 893.7G  0 raid1 /
nvme2n1      259:7    0 894.3G  0 disk
├─nvme2n1p1  259:10   0   512M  0 part /boot/efi
└─nvme2n1p2  259:11   0 893.7G  0 part
  └─md0        9:0    0 893.7G  0 raid1 /

a03-p1-nmxm-x86-01# df -h

Filesystem      Size  Used Avail Use% Mounted on
tmpfs           240G  114M  240G   1% /run
/dev/md0        879G   26G  809G   4% /
tmpfs           240G     0  240G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
efivarfs        384K   21K  364K   6% /sys/firmware/efi/efivars
/dev/nvme2n1p1  511M  4.0K  511M   1% /boot/efi
/dev/md2         11T   28K   11T   1% /local

Note

/dev/md1 is an unformatted RAID 1 device used by NMC Autonomous Hardware Recovery (AHR).

  1. Set the disksetup file in the category.

cmsh; category use k8s-admin; set disksetup k8s-admin-disksetup.xml; commit

k8s-user disksetup file

  1. Create and add a k8s-user disk setup file in /cm/local/apps/cmd/etc/htdocs/disk-setup/k8s-user-disksetup.xml.

Reference: Disk setup for k8s-user nodes (based on Supermicro ARS-221GL-FNB-NC24B-DC Model)

<?xml version="1.0" encoding="UTF-8"?>

<diskSetup>

<device>
  <blockdev>/dev/disk/by-path/pci-0014:01:00.0-nvme-1</blockdev>
  <partition id="boot1" partitiontype="esp">
    <size>512M</size>
    <type>linux</type>
    <filesystem>fat</filesystem>
    <mountPoint>/boot/efi</mountPoint>
    <mountOptions>defaults,noatime,nodiratime</mountOptions>
  </partition>
  <partition id="slash1">
    <size>max</size>
    <type>linux raid</type>
  </partition>
</device>

<device>
  <blockdev>/dev/disk/by-path/pci-0015:01:00.0-nvme-1</blockdev>
  <partition id="boot2" partitiontype="esp">
    <size>512M</size>
    <type>linux</type>
    <filesystem>fat</filesystem>
    <mountOptions>defaults,noatime,nodiratime</mountOptions>
  </partition>
  <partition id="slash2">
    <size>max</size>
    <type>linux raid</type>
  </partition>
</device>

<device>
  <blockdev>/dev/disk/by-path/pci-0000:01:00.0-nvme-1</blockdev>
  <partition id="var1">
    <size>1500G</size>
    <type>linux raid</type>
  </partition>
  <partition id="tmp1">
    <size>max</size>
    <type>linux raid</type>
  </partition>
</device>

<device>
  <blockdev>/dev/disk/by-path/pci-0001:01:00.0-nvme-1</blockdev>
  <partition id="var2">
    <size>1500G</size>
    <type>linux raid</type>
  </partition>
  <partition id="tmp2">
    <size>max</size>
    <type>linux raid</type>
  </partition>
</device>

<raid id="slashraid">
  <member>slash1</member>
  <member>slash2</member>
  <level>1</level>
  <filesystem>ext4</filesystem>
  <mountPoint>/</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>

<raid id="varraid">
  <member>var1</member>
  <member>var2</member>
  <level>1</level>
  <filesystem>ext4</filesystem>
  <mountPoint>/var</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>

<raid id="tmpraid">
  <member>tmp1</member>
  <member>tmp2</member>
  <level>0</level>
  <filesystem>ext4</filesystem>
  <mountPoint>/tmp</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>

</diskSetup>

Reference: k8s-user disk layout after provisioning

lsblk

NAME         MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1      259:0    0     7T  0 disk
├─nvme0n1p1  259:1    0   1.5T  0 part
│ └─md1         9:1    0   1.5T  0 raid1 /var
└─nvme0n1p2  259:2    0   5.5T  0 part
  └─md2         9:2    0    11T  0 raid0 /tmp
nvme1n1      259:3    0     7T  0 disk
├─nvme1n1p1  259:4    0   1.5T  0 part
│ └─md1         9:1    0   1.5T  0 raid1 /var
└─nvme1n1p2  259:5    0   5.5T  0 part
  └─md2         9:2    0    11T  0 raid0 /tmp
nvme3n1      259:6    0 894.3G  0 disk
├─nvme3n1p1 259:14    0   512M  0 part
└─nvme3n1p2 259:15    0 893.7G  0 part
  └─md0         9:0    0 893.7G  0 raid1 /
nvme2n1      259:7    0 894.3G  0 disk
├─nvme2n1p1 259:12    0   512M  0 part /boot/efi
└─nvme2n1p2 259:13    0 893.7G  0 part
  └─md0         9:0    0 893.7G  0 raid1 /

root@a03-p1-scheduler-01:~# df -h

Filesystem      Size  Used Avail Use% Mounted on
tmpfs           240G   62M  240G   1% /run
/dev/md0        879G  7.1G  827G   1% /
none            240G     0  240G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
efivarfs        384K   21K  364K   6% /sys/firmware/efi/efivars
/dev/nvme2n1p1  511M  4.0K  511M   1% /boot/efi
/dev/md1        1.5T  4.2G  1.4T   1% /var
/dev/md2         11T  1.9M   11T   1% /tmp
  1. Set the disksetup file in the category.

cmsh; category use k8s-user; set disksetup k8s-user-disksetup.xml; commit

DGX GB200 Disk Setup

The following post-install process ensures consistent naming of NVMe and other disk drive devices.

Note

For systems that have two M.2 NVMe drives, NVMe multipath is disabled in these instructions. For DGX GB200 systems specifically, this is not required because the compute tray has only a single M.2 OS drive. See the UDEV Rules KB article.

  1. Create the rules file 60-persistent-storage-gb200.rules.

    Reference: GB200 rules - 60-persistent-storage-gb200.rules:

########## persistent nvme rules by HW address ##########

KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0015:01:00.0", SYMLINK+="disk/by-id/osdisk-1"

KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0006:07:00.0", SYMLINK+="disk/by-id/raiddisk-1"

KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0006:09:00.0", SYMLINK+="disk/by-id/raiddisk-2"

KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0016:07:00.0", SYMLINK+="disk/by-id/raiddisk-3"

KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0016:09:00.0", SYMLINK+="disk/by-id/raiddisk-4"

########## persistent nvme rules by HW address ##########
  1. Add the rules file to /usr/lib/udev/rules.d/60-persistent-storage-gb200.rules in the node-installer images /cm/node-installer and /cm/node-installer-ubuntu2404-aarch64.

cp 60-persistent-storage-gb200.rules /cm/node-installer/usr/lib/udev/rules.d/60-persistent-storage-gb200.rules
cp 60-persistent-storage-gb200.rules /cm/node-installer-ubuntu2404-aarch64/usr/lib/udev/rules.d/60-persistent-storage-gb200.rules

Note

If the head node is C2/ARM, copying to /cm/node-installer is sufficient.

  1. Copy the rules file to /cm/images/<dgxos image>/usr/lib/udev/rules.d/60-persistent-storage-gb200.rules.

cp 60-persistent-storage-gb200.rules /cm/images/baseos7.1-image-aarch64/usr/lib/udev/rules.d/60-persistent-storage-gb200.rules
  1. Create and add the disk setup file (gb200-disksetup.xml) to the directory /cm/local/apps/cmd/etc/htdocs/disk-setup.

Reference: DGX GB200 compute tray disk setup:

<?xml version="1.0" encoding="UTF-8"?>

<diskSetup>

<device>
 <blockdev>/dev/disk/by-id/osdisk-1</blockdev>
 <partition id="efi" partitiontype="esp">
  <size>100M</size>
  <type>linux</type>
  <filesystem>fat</filesystem>
  <mountPoint>/boot/efi</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
 </partition>
 <partition id="boot1">
  <size>4G</size>
  <type>linux</type>
  <filesystem>ext2</filesystem>
  <mountPoint>/boot</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
 </partition>
 <partition id="slash1">
  <size>max</size>
  <type>linux</type>
  <filesystem>ext4</filesystem>
  <mountPoint>/</mountPoint>
  <mountOptions>defaults,noatime,nodiratime</mountOptions>
 </partition>
</device>

<device>
 <blockdev>/dev/disk/by-id/raiddisk-1</blockdev>
 <partition id="raid1">
  <size>max</size>
  <type>linux raid</type>
 </partition>
</device>

<device>
 <blockdev>/dev/disk/by-id/raiddisk-2</blockdev>
 <partition id="raid2">
  <size>max</size>
  <type>linux raid</type>
 </partition>
</device>

<device>
 <blockdev>/dev/disk/by-id/raiddisk-3</blockdev>
 <partition id="raid3">
  <size>max</size>
  <type>linux raid</type>
 </partition>
</device>

<device>
 <blockdev>/dev/disk/by-id/raiddisk-4</blockdev>
 <partition id="raid4">
  <size>max</size>
  <type>linux raid</type>
 </partition>
</device>

<raid id="scratch_local">
 <member>raid1</member>
 <member>raid2</member>
 <member>raid3</member>
 <member>raid4</member>
 <level>0</level>
 <filesystem>ext4</filesystem>
 <mountPoint>/scratch_local</mountPoint>
 <mountOptions>defaults,noatime,nodiratime</mountOptions>
</raid>

</diskSetup>

Note

DGX GB200 systems use the /dev/disk/by-id method for disk setup, consistent with the KB article.

  1. Set the disksetup file in the dgx-gb200 category.

cmsh; category use dgx-gb200; set disksetup gb200-disksetup.xml; commit
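
Because udev reads each line of a rules file as a separate rule (unless a line is continued with a backslash), the match keys and the SYMLINK assignment for a drive must stay on one line. The following is a minimal sketch that generates the rules in that form. The helper name gb200_rule is hypothetical, and the addresses are those from the reference rules file above. Substitute the addresses gathered for the actual hardware if they differ.

```shell
#!/usr/bin/env bash
# Sketch: emit one udev rule per drive, each on a single line.
# gb200_rule is a hypothetical helper; addresses come from the reference rules file.

gb200_rule() {
    local address="$1" name="$2"
    printf 'KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="%s", SYMLINK+="disk/by-id/%s"\n' "$address" "$name"
}

gb200_rule 0015:01:00.0 osdisk-1
gb200_rule 0006:07:00.0 raiddisk-1
gb200_rule 0006:09:00.0 raiddisk-2
gb200_rule 0016:07:00.0 raiddisk-3
gb200_rule 0016:09:00.0 raiddisk-4
```

Redirecting the output to 60-persistent-storage-gb200.rules produces the rule lines of the file used in the steps above. After a node is provisioned, listing /dev/disk/by-id/osdisk-1 and /dev/disk/by-id/raiddisk-* with ls -l should show symlinks to the corresponding NVMe devices if the rules were applied.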