NVIDIA Switch BMC User Manual v88.0002.0931

Overview

These pages are intended for network administrators who are responsible for configuring and managing BMC-based platforms. The instructions in this guide assume a working knowledge of Linux, including text editing, Unix file permissions, and process monitoring.

In the GB200 NVLink switch tray systems, the BMC (Baseboard Management Controller) keeps the management plane separate from the control plane, a requirement of some Cloud Service Providers (CSPs). It is an essential server element offering hardware monitoring and security capabilities. Within the GB200 switch tray, the BMC operates alongside NVOS, the operating system on the host CPU that manages the system; the two connect via Ethernet over USB. The BMC manages most system peripherals, and the customer interacts with it through a dedicated 1GbE RJ-45 interface using the Redfish protocol. For deployments that forgo a BMC network connection, BMC functionality is also accessible through the NVOS UI.
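As a rough illustration of what talking to the BMC over Redfish involves, the sketch below builds (but does not send) an authenticated HTTPS GET and extracts member links from a collection payload. The path /redfish/v1/ is the standard DMTF service root. The address, credentials, sample payload, and the resource name BMC_0 are placeholders chosen for illustration; they are not values taken from this manual.

```python
import base64
import json
import urllib.request


def build_redfish_get(host, path, user, password):
    """Build (without sending) an HTTPS GET request that uses HTTP Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"https://{host}{path}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


def member_links(collection_json):
    """Return the @odata.id of every entry in a Redfish collection payload."""
    doc = json.loads(collection_json)
    return [member["@odata.id"] for member in doc.get("Members", [])]


# Placeholder address and credentials (assumptions for illustration only).
req = build_redfish_get("192.0.2.10", "/redfish/v1/", "admin", "example-password")

# Hypothetical payload in the general shape Redfish collections use.
sample = '{"Members": [{"@odata.id": "/redfish/v1/Managers/BMC_0"}]}'
links = member_links(sample)
```

To actually send the request, it could be passed to urllib.request.urlopen together with an SSL context appropriate for the certificate installed on the BMC.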


High-Level Feature List

  • Platform attestation (for all ERoTs)

  • Update firmware using PLDM T5 via ERoT, including the following firmware components:

    • BMC software/firmware

    • FPGA

    • CPU BIOS (UEFI and NVOS are updated from the CPU SSD)

    • NVSwitch firmware (launched ONLY from NVOS as part of the NVOS upgrade)

For the full list of supported commands and events, see the System Features section below.

BMC-to-CPU Communication

The CPU communicates with the BMC using Redfish over Ethernet over USB. The BMC reads information from the CPU using MCTP over I2C.

In every NVIDIA platform that facilitates local communication between the BMC and CPU, static private IPv4 addresses are configured.

  • BMC side 10.0.1.1 (netmask 255.255.252.0)

  • CPU side 10.0.1.2 (netmask 255.255.252.0)
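The addressing above can be sanity-checked with Python's standard ipaddress module. The sketch below only confirms that the two documented addresses share one private subnet under the documented netmask; the variable names are illustrative.

```python
import ipaddress

# Addresses and netmask as listed above.
bmc_if = ipaddress.ip_interface("10.0.1.1/255.255.252.0")
cpu_if = ipaddress.ip_interface("10.0.1.2/255.255.252.0")

# Both ends must resolve to the same network to reach each other directly.
same_subnet = bmc_if.network == cpu_if.network

print(bmc_if.network)                # 10.0.0.0/22
print(bmc_if.network.num_addresses)  # 1024
print(same_subnet, bmc_if.ip.is_private)
```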

The Redfish client on the CPU will be bound to the USB interface, ensuring that internal traffic cannot "leak" to the management network. A dedicated nvos_user SHALL be added to the BMC image. When NVOS connects to the BMC, it SHALL change the password to a unique password based on a seed from the CPU TPM. If the CPU fails to connect, it reverts to the default password. Each entity MUST be capable of operating without the other, meaning the following:

  • BMC SHALL operate normally while CPU is undergoing reboot; BMC must detect the connectivity state towards the CPU.

  • CPU SHALL operate normally while BMC is undergoing reboot; CPU must detect the connectivity state towards the BMC.
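The manual does not describe how either side detects connectivity state. One common pattern, shown here purely as a sketch, is to probe the peer periodically and declare it down only after several consecutive failures, so that a brief hiccup is not mistaken for a reboot. The class name, state names, and threshold below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerLinkMonitor:
    """Tracks peer reachability while tolerating brief probe failures."""

    fail_threshold: int = 3  # hypothetical: consecutive misses before DOWN
    state: str = "UNKNOWN"
    _misses: int = 0

    def record(self, probe_ok: bool) -> str:
        """Feed in one probe result and return the resulting state."""
        if probe_ok:
            self._misses = 0
            self.state = "UP"
        else:
            self._misses += 1
            if self._misses >= self.fail_threshold:
                self.state = "DOWN"
        return self.state


monitor = PeerLinkMonitor()
# One successful probe, three misses (peer rebooting), then recovery.
history = [monitor.record(ok) for ok in (True, False, False, False, True)]
```

In a real deployment the probe itself might be a Redfish request over the USB link; here the probe results are supplied directly so the state logic can be read in isolation.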

Firmware Upgrade

  • The BMC upgrades the following via the relevant ERoT: BMC firmware, FPGA, CPU BIOS, ERoT firmware, and NVSwitch firmware. The CPLD is upgraded directly (from the CPU, not from the BMC).

  • The flow uses PLDM Type 5 to write to staging flash; the ERoT then validates the integrity of the image and continues the flow.

  • A firmware update can be triggered either from an external Redfish client or from NVOS.

  • NVOS SHALL utilize MFT for the actual process through BMC.

  • NVSwitch firmware upgrade MUST be done via NVOS to ensure compatibility with the driver and SDK version.
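The stage-then-validate flow described above can be pictured with a toy model. This is only a sketch of the general pattern: the manual does not say how the ERoT checks integrity, so a SHA-256 digest comparison is used here as a stand-in, and the class and method names are hypothetical.

```python
import hashlib
import hmac


class StagedUpdate:
    """Toy model: an image is written to staging and activated only if valid."""

    def __init__(self, active_image: bytes):
        self.active = active_image
        self.staging = None

    def stage(self, image: bytes) -> None:
        """Write the candidate image to the staging area."""
        self.staging = image

    def validate_and_activate(self, expected_sha256: str) -> bool:
        """Activate the staged image only if its digest matches (assumed check)."""
        if self.staging is None:
            return False
        digest = hashlib.sha256(self.staging).hexdigest()
        if not hmac.compare_digest(digest, expected_sha256):
            self.staging = None  # discard the bad image, keep the active one
            return False
        self.active, self.staging = self.staging, None
        return True


slot = StagedUpdate(active_image=b"firmware-v1")
new_image = b"firmware-v2"

slot.stage(new_image)
rejected = slot.validate_and_activate("0" * 64)  # wrong digest
slot.stage(new_image)
accepted = slot.validate_and_activate(hashlib.sha256(new_image).hexdigest())
```

The point of the pattern is that a failed validation leaves the running image untouched, which is why the update is written to staging rather than directly over the active copy.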

Term

Description

ASIC

An application-specific integrated circuit (ASIC) is an integrated circuit (IC) chip customized for a particular use, rather than intended for general-purpose use.

Attestation

Authenticated, machine-readable metadata about one or more software artifacts. An attestation MUST contain at least: Envelope.

BMC

Baseboard Management Controller

DMTF

Founded in 1992, DMTF (formerly known as the Distributed Management Task Force) is an industry standards organization working to simplify the manageability of network-accessible technologies through open and collaborative efforts by leading technology companies.

EEPROM

A type of non-volatile ROM that enables individual bytes of data to be erased and reprogrammed.

EROT

External root of trust

FPGA

Field Programmable Gate Arrays (FPGAs) are integrated circuits often sold off-the-shelf. They're referred to as 'field programmable' because they provide customers the ability to reconfigure the hardware to meet specific use case requirements after the manufacturing process.

Host

A computer platform executing an Operating System which may control one or more network adapters. A host in networking is a device that is connected to a network and is able to communicate with other hosts on the network.

Network Interfaces

A network interface is the point of interconnection between a computer and a private or public network. A network interface is generally a network interface card (NIC), but does not have to have a physical form. Instead, the network interface can be implemented in software.

RF

DMTF's Redfish® is a standard designed to deliver simple and secure management for converged, hybrid IT and the Software Defined Data Center (SDDC).

SPDM

DMTF's Security Protocol and Data Model (SPDM) Specification defines messages, data objects, and sequences for performing message exchanges between devices over a variety of transport and physical media to enable authentication of components, firmware measurement, and protection of data in flight.

VPD

Vital product data

System Features

Feature

Detail

Firmware Management

  • Show Firmware Inventory

  • Show Firmware Version & Health

  • Update Firmware

  • Show Firmware Update Status

  • Filter Next Update Firmware End Components

User Management

  • Show BMC Users

  • Change BMC User Password

Attestation

  • Show System EROT List

  • Show EROT Security Information

  • Generate EROT SPDM Information

  • Show EROT SPDM Generation Status

  • Show EROT SPDM Information

  • Set EROT Automatic Background Copy State

  • Show EROT Automatic Background Copy Status

Power Management

  • Apply Reset

Network Interfaces

  • Show Network Interface List

  • Show Network Interface Details

EEPROM

  • Show EEPROM Information

Temperature Sensor

  • Show Temperature Sensor Information

Debug Information

  • Generate Debug Information

  • Show Debug Information Generation Status

  • Output Debug Information To File

Debug Token

  • Generate Debug Tokens

  • Show Debug Token Generation Status

  • Download Debug Token

  • Install Debug Token Signed Firmware

  • Show Debug Token Installation Status

  • Generate Installed Debug Token Attachments

  • Show Installed Debug Token Attachments Generation Status

  • Show Installed Debug Token Attachments

Leakage Sensor

  • Show the status of the Leak Detect sensors

  • Leakage notification via Redfish (RF) events

Rsyslog

  • The BMC can stream out local logs (that go to the systemd journal) by using rsyslog.
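On the receiving side of such a log stream, each message in classic syslog framing begins with a <PRI> value that encodes facility and severity as facility * 8 + severity. The sketch below decodes that prefix. It assumes the receiver sees this framing, which depends on how rsyslog forwarding is configured; the sample line is made up for illustration.

```python
import re

# Severity names in numeric order (0 through 7), as defined by syslog.
SEVERITIES = ("emerg", "alert", "crit", "err", "warning", "notice", "info", "debug")


def decode_pri(line: str):
    """Split a syslog line into (facility number, severity name, remainder).

    Returns None if the line does not start with a <PRI> prefix.
    """
    match = re.match(r"<(\d{1,3})>", line)
    if not match:
        return None
    pri = int(match.group(1))
    return pri // 8, SEVERITIES[pri % 8], line[match.end():]


# Hypothetical forwarded line: PRI 30 = facility 3 (daemon), severity 6 (info).
sample = "<30>bmc-host systemd[1]: Started example.service"
facility, severity, rest = decode_pri(sample)
```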

© Copyright 2025, NVIDIA. Last updated on May 4, 2025.