Overview
These pages are intended for network administrators who are responsible for configuring and managing BMC-based platforms. The instructions in this guide presuppose a moderate understanding of Linux, encompassing skills such as text editing, comprehending Unix file permissions, and process monitoring.
In the GB200 NVLink switch tray systems, the BMC (Baseboard Management Controller) keeps the management plane separate from the control plane, a requirement of some Cloud Service Providers (CSPs). It is an essential server element that provides hardware monitoring and security capabilities. Within the GB200 switch tray, the BMC operates alongside NVOS, the operating system that runs on the host CPU and manages the system; the two connect via Ethernet over USB. The BMC manages most system peripherals, and customers interact with it through a dedicated 1GbE RJ-45 interface using the Redfish protocol. For those forgoing a BMC network connection, its functionality is also accessible via the NVOS UI.

High-Level Feature List
Platform attestation (for all ERoTs)
Update firmware using PLDM T5 via ERoT, including the following firmware components:
BMC software/firmware
FPGA
CPU BIOS (UEFI and NVOS is updated from CPU SSD)
NVSwitch firmware (launched ONLY from NVOS as part of the NVOS upgrade)
For the full list of supported commands and events, see the System Features section below.
BMC-to-CPU Communication
The CPU communicates with the BMC using Redfish over Ethernet over USB. The BMC reads information from the CPU using MCTP over I2C.
Every NVIDIA platform that supports local communication between the BMC and the CPU is configured with static private IPv4 addresses:
BMC side 10.0.1.1 (netmask 255.255.252.0)
CPU side 10.0.1.2 (netmask 255.255.252.0)
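The addressing above can be sanity-checked with a short sketch. It uses only Python's standard `ipaddress` module and the two addresses and netmask listed above; nothing here is taken from BMC or NVOS tooling.

```python
import ipaddress

# Addresses and netmask exactly as listed above.
NETMASK = "255.255.252.0"
bmc = ipaddress.ip_interface(f"10.0.1.1/{NETMASK}")
cpu = ipaddress.ip_interface(f"10.0.1.2/{NETMASK}")

# Both ends must sit in the same subnet to talk without a gateway.
same_subnet = bmc.network == cpu.network

print(bmc.network)             # 10.0.0.0/22
print(same_subnet)             # True
print(bmc.network.is_private)  # True
```

The netmask corresponds to a /22 prefix, so the link subnet spans 10.0.0.0 through 10.0.3.255.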
The Redfish client on the CPU is bound to the USB interface, which ensures that internal traffic cannot "leak" to the management network. A dedicated nvos_user SHALL be added to the BMC image. When NVOS connects to the BMC, it SHALL change the password to a unique password based on a SEED from the CPU TPM. If the CPU fails to connect, it reverts to the default password. Each entity MUST be capable of operating without the other, meaning the following:
BMC SHALL operate normally while CPU is undergoing reboot; BMC must detect the connectivity state towards the CPU.
CPU SHALL operate normally while BMC is undergoing reboot; CPU must detect the connectivity state towards the BMC.
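One way the CPU side might detect the connectivity state towards the BMC is sketched below. This is an illustration under stated assumptions, not the NVOS implementation: the port (443), the `probe_bmc` and `PeerMonitor` names, and the three-failure threshold are all hypothetical. Binding the socket to the CPU-side source address is a simplification of binding to the USB interface itself.

```python
import socket

BMC_ADDR = "10.0.1.1"  # BMC side of the USB link (from the text above)
CPU_ADDR = "10.0.1.2"  # CPU side of the USB link (from the text above)
REDFISH_PORT = 443     # assumption: Redfish is served over HTTPS


def probe_bmc(timeout=2.0):
    """Return True if a TCP connection to the BMC succeeds.

    The connection originates from the CPU-side link address so the
    probe does not leave through the management network.
    """
    try:
        with socket.create_connection(
            (BMC_ADDR, REDFISH_PORT),
            timeout=timeout,
            source_address=(CPU_ADDR, 0),
        ):
            return True
    except OSError:
        return False


class PeerMonitor:
    """Tracks peer state; reports 'down' only after repeated failures."""

    def __init__(self, probe, fail_threshold=3):
        self.probe = probe                    # callable returning bool
        self.fail_threshold = fail_threshold  # hypothetical debounce value
        self.failures = 0
        self.state = "unknown"

    def poll(self):
        if self.probe():
            self.failures = 0
            self.state = "up"
        else:
            self.failures += 1
            if self.failures >= self.fail_threshold:
                self.state = "down"
        return self.state
```

A caller would construct `PeerMonitor(probe_bmc)` and call `poll()` periodically. The threshold keeps a single dropped probe during a peer reboot from flipping the state immediately.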
Firmware Upgrade
The BMC upgrades the following components via the relevant ERoT: BMC firmware, FPGA, CPU BIOS, ERoT firmware, and NVSwitch firmware. The CPLD is upgraded directly (from the CPU, not from the BMC).
The flow uses PLDM Type 5 and writes to staging flash. The ERoT then validates the integrity of the image and continues the flow.
A firmware update can be triggered either through external Redfish or through NVOS.
NVOS SHALL utilize MFT for the actual process through the BMC.
NVSwitch firmware upgrade MUST be done via NVOS to ensure compatibility with the driver and SDK version.
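The rules above can be summarized as a small lookup, sketched here. The `update_plan` function, the `UpdatePlan` tuple, and the string labels are hypothetical names chosen for illustration. The CPLD trigger is an assumption, since the text says only that the CPLD is upgraded from the CPU.

```python
from collections import namedtuple

# path: how the image reaches the component; triggers: who may start it.
UpdatePlan = namedtuple("UpdatePlan", ["path", "triggers"])

# Components the text says the BMC upgrades via the relevant ERoT.
EROT_COMPONENTS = {
    "BMC firmware",
    "FPGA",
    "CPU BIOS",
    "ERoT firmware",
    "NVSwitch firmware",
}


def update_plan(component):
    """Return the update path and permitted triggers for a component."""
    if component == "CPLD":
        # Upgraded directly from the CPU, not from the BMC.
        # Assumption: NVOS (running on the CPU) is the trigger.
        return UpdatePlan("direct from CPU", frozenset({"nvos"}))
    if component not in EROT_COMPONENTS:
        raise ValueError(f"unknown component: {component}")
    if component == "NVSwitch firmware":
        # MUST go through NVOS for driver and SDK compatibility.
        triggers = frozenset({"nvos"})
    else:
        triggers = frozenset({"redfish", "nvos"})
    return UpdatePlan("PLDM Type 5 via ERoT", triggers)
```

For example, `update_plan("FPGA").triggers` contains both `"redfish"` and `"nvos"`, while `update_plan("NVSwitch firmware").triggers` contains only `"nvos"`.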
Term | Description |
ASIC | An application-specific integrated circuit (ASIC) is an integrated circuit (IC) chip customized for a particular use, rather than intended for general-purpose use. |
Attestation | Authenticated, machine-readable metadata about one or more software artifacts. An attestation MUST contain at least: Envelope. |
BMC | Baseboard Management Controller |
DMTF | Founded in 1992, DMTF (formerly known as the Distributed Management Task Force) is an industry standards organization working to simplify the manageability of network-accessible technologies through open and collaborative efforts by leading technology companies. |
EEPROM | A type of non-volatile ROM that enables individual bytes of data to be erased and reprogrammed. |
ERoT | External Root of Trust |
FPGA | Field Programmable Gate Arrays (FPGAs) are integrated circuits often sold off-the-shelf. They're referred to as 'field programmable' because they provide customers the ability to reconfigure the hardware to meet specific use case requirements after the manufacturing process. |
Host | A computer platform executing an Operating System which may control one or more network adapters. A host in networking is a device that is connected to a network and is able to communicate with other hosts on the network. |
Network Interfaces | A network interface is the point of interconnection between a computer and a private or public network. A network interface is generally a network interface card (NIC), but does not have to have a physical form. Instead, the network interface can be implemented in software. |
RF | DMTF's Redfish® is a standard designed to deliver simple and secure management for converged, hybrid IT and the Software Defined Data Center (SDDC). |
SPDM | DMTF's Security Protocol and Data Model (SPDM) Specification defines messages, data objects, and sequences for performing message exchanges between devices over a variety of transport and physical media, to support authentication of components, firmware measurement, and protection of data in flight. |
VPD | Vital product data |
Feature | Detail |
Firmware Management | |
User Management | |
Attestation | |
Power Management | |
Network Interfaces | |
EEPROM | |
Temperature Sensor | |
Debug Information | |
Debug Token | |
Leakage Sensor | |
Rsyslog | |