Hardware Prerequisites

This page covers the hardware requirements for both the NICo site controller and the compute systems it manages.

Site Controller

The site controller runs the NICo control plane on a Kubernetes cluster. A minimum of 3 nodes is required for high availability; 5 nodes are recommended for large GB200-class sites.

| Component | Requirement |
| --- | --- |
| Server class | Any major OEM Gen5 server (e.g. Dell R760-class) |
| CPU | 2x modern x86_64 sockets (Intel Xeon / AMD EPYC), 24+ cores per socket |
| Memory | 256 GiB minimum, 512 GiB recommended |
| OS storage | 200-500 GiB NVMe SSD space (UEFI + Secure Boot) |
| K8s data storage | 1+ TiB NVMe SSD space dedicated to containerd, Kubelet, and logs |
| Networking | 1-2x 25/100 GbE ports (single-homed or dual-homed) |
| Out-of-band | BMC/iDRAC/iLO/XClarity (DHCP or statically addressed) |
| TPM | TPM 2.0 module present and enabled in BIOS/UEFI |
| Secure Erase | All local storage drives must support Secure Erase |

Operating system: Ubuntu 24.04 LTS, kernel 6.8+. Swap disabled (or minimal), NUMA enabled, virtualization/IOMMU enabled. Time sync: chrony or equivalent, synced to enterprise NTP.
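As a sanity check, the kernel version, swap state, and time-sync status can be verified from the OS before the control plane is installed. The sketch below is illustrative only (it is not part of NICo) and assumes a standard Ubuntu layout with chrony installed:

```python
#!/usr/bin/env python3
"""Illustrative pre-flight checks for a site controller node (not part of NICo)."""
import platform
import subprocess
from pathlib import Path

def kernel_at_least(minimum=(6, 8)):
    # platform.release() returns e.g. "6.8.0-45-generic"
    major, minor = (int(x) for x in platform.release().split(".")[:2])
    return (major, minor) >= minimum

def swap_disabled():
    # /proc/swaps has only a header line when no swap devices are active
    return len(Path("/proc/swaps").read_text().splitlines()) <= 1

def time_synced():
    # Assumes chrony is installed; "Leap status : Normal" indicates a synced clock
    result = subprocess.run(["chronyc", "tracking"], capture_output=True, text=True)
    return result.returncode == 0 and "Normal" in result.stdout

if __name__ == "__main__":
    checks = [
        ("kernel 6.8+", kernel_at_least()),
        ("swap disabled", swap_disabled()),
        ("time synced via chrony", time_synced()),
    ]
    for name, ok in checks:
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
```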

Storage layout: Total local NVMe capacity should be 4 TiB or greater. Provision 1.7 TiB for / (root) on the NVMe OS disk (ext4 or xfs); typical usage is 200–500 GiB. Mount /var/lib/containerd and /var/lib/kubelet on a separate NVMe data disk (1+ TiB, ext4/xfs, noatime). Consider a dedicated /var/log filesystem if logging is heavy. Persistent application storage (SAN/NAS, Rook-Ceph) is not required for NICo itself.
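
One way to confirm that the containerd and kubelet paths ended up on their own filesystem rather than on the root disk is to compare device IDs. This is an illustrative sketch, not a NICo tool:

```python
#!/usr/bin/env python3
"""Illustrative check that the K8s data paths have their own filesystem (not a NICo tool)."""
import os

DATA_PATHS = ["/var/lib/containerd", "/var/lib/kubelet"]  # per the layout above

root_dev = os.stat("/").st_dev
for path in DATA_PATHS:
    if not os.path.isdir(path):
        print(f"MISSING  {path}")
    elif os.stat(path).st_dev == root_dev:
        # Same device ID as / means the path has no dedicated filesystem
        print(f"SHARED   {path} sits on the root filesystem")
    else:
        print(f"OK       {path} is mounted on a separate filesystem")
```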

DPUs on Site Controller (Optional)

DPUs on site controller nodes are optional and site-owned. If DPUs are installed, ensure the following requirements are met:

  • Verify the correct DPU power cable has been ordered from the server vendor.
  • For BF3 DPUs, verify link speed and optics: BF3 runs at 200 Gb, so match ports to 200 Gb-capable optics, fiber, or DACs (a link-speed check is sketched after this list).
  • A basic onboard NIC for management is sufficient—no extra ConnectX NICs are needed.
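
If BF3 DPUs are installed, the negotiated link speed can be read from sysfs once the ports are cabled. The interface names below are hypothetical examples; substitute the port names on your host:

```python
#!/usr/bin/env python3
"""Illustrative BF3 link-speed check; interface names are hypothetical examples."""
from pathlib import Path

EXPECTED_MBPS = 200_000                  # BF3 ports should negotiate 200 Gb/s
PORTS = ["enp3s0f0np0", "enp3s0f1np1"]   # replace with the port names on your host

for port in PORTS:
    speed_file = Path(f"/sys/class/net/{port}/speed")
    try:
        speed = int(speed_file.read_text().strip())  # kernel reports link speed in Mb/s
    except (OSError, ValueError):
        print(f"{port}: no link or interface not found")
        continue
    if speed == EXPECTED_MBPS:
        print(f"{port}: OK ({speed} Mb/s)")
    else:
        print(f"{port}: unexpected speed {speed} Mb/s")
```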

Compute Systems (Managed Hosts)

Each managed host is a server paired with one or more NVIDIA BlueField DPUs. The DPU provides the primary data-plane connectivity and acts as the enforcement boundary for NICo.

| Component | Requirement |
| --- | --- |
| Server class | An NVIDIA-certified system with a data center classification |
| GPU | Refer to the Hardware Compatibility List for supported GPUs and systems. |
| DPU | One or more BlueField DPUs with 2x 200 Gb network interfaces and a 1 Gb BMC interface |
| Local storage | NVMe drives must support Secure Erase; firmware must be updated only via signed images; rollback must not be possible. |
| TPM | TPM 2.0 with Secure Boot support |
| UEFI | Must support preventing in-band host control. |
| Chassis BMC | Must support the following Redfish operations: power control, boot order, UEFI Secure Boot toggle, IPv6, firmware update, Serial-over-LAN |
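
The Redfish requirement in the table above can be smoke-tested against the chassis BMC before enrollment. The sketch below issues a standard DMTF ComputerSystem.Reset action; the BMC address, credentials, and system ID are placeholders that vary by vendor, and this is not a NICo interface:

```python
#!/usr/bin/env python3
"""Illustrative Redfish power-control call; the BMC address, credentials, and
system ID are placeholders that vary by vendor (this is not a NICo interface)."""
import requests

BMC = "https://bmc.example.internal"    # placeholder BMC address
AUTH = ("admin", "change-me")           # placeholder credentials
SYSTEM = f"{BMC}/redfish/v1/Systems/1"  # system ID differs per vendor (e.g. System.Embedded.1 on iDRAC)

def reset(reset_type: str = "GracefulShutdown") -> None:
    # Standard DMTF Redfish ComputerSystem.Reset action
    resp = requests.post(
        f"{SYSTEM}/Actions/ComputerSystem.Reset",
        json={"ResetType": reset_type},
        auth=AUTH,
        verify=False,  # many BMCs ship self-signed certificates; handle trust properly in production
        timeout=30,
    )
    resp.raise_for_status()

if __name__ == "__main__":
    reset("On")  # other common values: "ForceOff", "GracefulShutdown", "ForceRestart"
```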

The BlueField-3 B3220 P-Series DPU (200GbE/NDR200 dual-port QSFP112 Network Adaptor, 900-9D3B6-00CV-AA0) meets these requirements. Other NICs on the host are automatically disabled during NICo installation.

NICo does not require any cabling or communication between the DPU and the host.

Supported Hardware

For a list of tested host machines, DPUs, and validated firmware versions, refer to the Hardware Compatibility List page.

BIOS/UEFI Settings

The following settings should be enabled in the BIOS/UEFI of both site controller and compute system nodes (a verification sketch follows the list):

  • UEFI + Secure Boot (with signed kernel/modules)
  • VT-x / AMD-V + IOMMU
  • SR-IOV (if using NIC VFs; otherwise leave off)
  • NTP (locked to enterprise sources; clock drift alarms enabled)
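
Once a node is booted, these firmware settings can be spot-checked from Linux. The following sketch is illustrative only and assumes an EFI-booted system with efivars mounted:

```python
#!/usr/bin/env python3
"""Illustrative firmware-setting checks from a booted Linux host (not a NICo tool)."""
from pathlib import Path

def secure_boot_enabled():
    # The SecureBoot EFI variable ends in 0x01 when Secure Boot is active
    var = Path("/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c")
    return var.exists() and var.read_bytes()[-1] == 1

def virtualization_enabled():
    # vmx = Intel VT-x, svm = AMD-V
    flags = Path("/proc/cpuinfo").read_text().split()
    return "vmx" in flags or "svm" in flags

def iommu_active():
    # Populated /sys/class/iommu indicates VT-d / AMD-Vi is enabled and in use
    iommu = Path("/sys/class/iommu")
    return iommu.exists() and any(iommu.iterdir())

if __name__ == "__main__":
    for name, ok in [("UEFI Secure Boot", secure_boot_enabled()),
                     ("VT-x / AMD-V", virtualization_enabled()),
                     ("IOMMU", iommu_active())]:
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
```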