NVIDIA BlueField Management and Initial Provisioning

Appendix - BlueField Management in DPU Mode

The BlueField networking platform (DPU or SuperNIC) incorporates an integrated BMC, the ASPEED AST2600. The on-board BMC provides security in untrusted platforms and is therefore needed in most BlueField use cases.

Like the host BMC, which manages the host platform, the BlueField BMC is a trusted entity (with its own ERoT to ensure that its firmware is secured). It enables provisioning and managing the BlueField over a separate management network, using standard interfaces, protocols, and security to manage the full lifecycle of the BlueField. In addition, the BlueField BMC enables managing the BlueField even if the BlueField's OS is down, and it has a separate power input so it can also hard reset the BlueField if required.

The main interface for the BlueField BMC is a 1GbE RJ45 OOB management port that is connected to the internal management Ethernet network of the cloud service provider or to the enterprise IT management network.

Managing the BlueField using its BMC is detailed hereafter.

Remote Management Using Redfish Protocol

Supported by BlueField, the Redfish standard is a suite of specifications that delivers an industry standard protocol providing a secured RESTful interface for the management of servers, storage, networking, and converged infrastructure.

Redfish is supported by both the host BMC and the BlueField BMC.

Redfish replaces IPMI, providing the following advantages:

  • Human readable schemas

  • Interoperable, equally usable by apps, GUIs, and scripts

  • Extensible to add capabilities

  • Secured using HTTPS
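As a minimal sketch of what "RESTful over HTTPS" means in practice, the following Python builds (but does not send) an authenticated GET request for the Redfish service root. The `/redfish/v1/` path comes from the DMTF Redfish standard. The BMC address, user name, password, the helper name `build_service_root_request`, and the use of HTTP Basic authentication are assumptions made for illustration, not values taken from the BlueField documentation.

```python
import base64
import urllib.request


def build_service_root_request(bmc_host: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET for the Redfish service root."""
    # /redfish/v1/ is the service root defined by the DMTF Redfish standard.
    url = f"https://{bmc_host}/redfish/v1/"
    # HTTP Basic authentication is assumed here; a session token could be used instead.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


# Hypothetical address and credentials, for illustration only.
request = build_service_root_request("192.0.2.10", "root", "example-password")
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) additionally requires a TLS context that trusts the BMC's certificate, which depends on how the deployment manages its certificates.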

Management Architecture

The following diagram illustrates the architecture and connectivity for managing the BlueField.

[Figure: BlueField management architecture and connectivity]


Management Interfaces

Info

See this page for a detailed description of the interfaces of BlueField-3.

Info

See this page for a detailed description of the interfaces of BlueField-2.

The following table describes the interfaces available to manage the BlueField.

| Management Interface | Description | Comment |
|---|---|---|
| OOB Management Port (1GbE RJ45) | A dedicated, separate Ethernet interface to manage the BlueField from the remote management controller (RMC). Info: NVIDIA recommends using this interface as the main management interface. | Enables managing the BlueField life cycle using the BlueField's BMC. Supports Redfish commands to the BlueField BMC (eth0). Recovery flows, monitoring, and configuration operations are all available through this interface. In addition, this physical interface allows users to SSH directly to the BlueField (oob_net). Note: IPMI is supported for backward compatibility, but it is recommended to start new deployments with Redfish only. |
| SMBus (PCIe Golden Fingers) | Enables PLDM/NC-SI over MCTP between the BlueField and the host BMC | Enables the host BMC to monitor the BlueField |
| PCIe | PCIe interface between the BlueField and host server | Enables the host to recover the BlueField using RShim PCIe physical function (PF) when the host is trusted. Info: Unavailable while in zero-trust mode. Use the 1GbE OOB interface instead. |


Recommended Management Approach

The BlueField BMC allows managing the BlueField over the 1GbE OOB interface using the Redfish protocol. The following functions are available:

  • BlueField and BlueField BMC upgrade and recovery

  • Monitoring of the BlueField device

  • BlueField reset control (even when the BlueField OS is halted)

  • Setting BlueField UEFI configuration

  • Console interface to BlueField

The following subsections describe the recommended management methods for specific tasks on the BlueField.

Note

The following methods are not supported by BlueField SuperNIC.

BlueField Update and Recovery

The NVIDIA BlueField offers two file formats for performing software upgrades:

  • The standard ISO format

  • The BlueField bootstream (BFB) file format

Each file format has a different update and recovery method:

  • For ISO, a PXE server may be used to load an ISO image which contains the necessary updates. This can be accomplished through the BlueField's UEFI, which can PXE boot over the 1GbE OOB interface or through the high-speed data ports. The BlueField SoC runs UEFI/PXE, which sends a DHCP DISCOVER over the 1GbE OOB interface. The DISCOVER includes a vendor class ( "NVIDIA/BF/PXE" ) for the BlueField SoC, which allows the customer's server to differentiate between the BlueField SoC and the BlueField BMC, and the MAC address for identification and discovery.

  • BFB serves both as a comprehensive upgrade tool and a recovery solution for the BlueField. There are two ways to facilitate these upgrade and recovery methods:

    • The BlueField BMC is under the control of a remote management controller (RMC) that utilizes the Redfish protocol over the 1GbE OOB connection. A pre-installed golden image can be used, which allows flashing and recovery of the BlueField devices. This can be triggered by either the RMC or a trusted platform's BMC. For more information, refer to this page.

    • The host BMC is under the control of an RMC that utilizes the Redfish protocol over a PCIe connection.
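The vendor-class check described for the ISO/PXE flow can be sketched as follows. This is a minimal illustration of how a provisioning server might tell a BlueField SoC PXE client apart from other DHCP clients on the same OOB network and pick an image by MAC address. Only the vendor class string "NVIDIA/BF/PXE" comes from the text above. The function names, the exact-match comparison, the MAC address, and the image file name are hypothetical.

```python
from typing import Dict, Optional

# Vendor class string quoted in the text for the BlueField SoC's UEFI/PXE client.
BLUEFIELD_SOC_PXE_VENDOR_CLASS = "NVIDIA/BF/PXE"


def is_bluefield_soc_pxe(vendor_class: Optional[str]) -> bool:
    """Return True when a DHCP DISCOVER carries the BlueField SoC PXE vendor class.

    Exact string matching is an assumption of this sketch.
    """
    return vendor_class is not None and vendor_class.strip() == BLUEFIELD_SOC_PXE_VENDOR_CLASS


def select_boot_image(vendor_class: Optional[str], mac: str,
                      images_by_mac: Dict[str, str]) -> Optional[str]:
    """Pick an ISO for a BlueField SoC PXE client, keyed by its MAC address."""
    if not is_bluefield_soc_pxe(vendor_class):
        # Not a BlueField SoC PXE client (for example, the BlueField BMC): serve nothing.
        return None
    # MAC keys are stored lowercase in this sketch.
    return images_by_mac.get(mac.lower())


# Hypothetical inventory: MAC address and image name are placeholders.
images = {"00:00:5e:00:53:01": "example-bluefield.iso"}
print(select_boot_image("NVIDIA/BF/PXE", "00:00:5E:00:53:01", images))
```

In a real deployment this decision is usually expressed in the DHCP server's own configuration rather than in application code; the sketch only shows the matching logic.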

Note

After an upgrade/recovery, a system power cycle may be required to apply changes.


BlueField BMC Update

The BlueField BMC can be updated by the RMC using Redfish over the 1GbE OOB port to the BlueField BMC.

BlueField BMC update is A/B redundant, using a dual firmware flash. If both flashes fail to boot, the BlueField BMC may be recovered from the platform's BMC using the SMBus or UART interfaces.

Please refer to this page for more information.

Note

After an upgrade, a system power cycle may be required to apply changes.


BlueField Monitoring and Telemetry

The RMC may monitor and read telemetry of the BlueField using Redfish over the 1GbE OOB port to the BlueField BMC.

  • BlueField temperatures (board, DDR, and ports), voltages and link states

  • BlueField FRU information about NIC FW, CPU, DDR, eMMC, network interface, etc.

  • Device sensor data record (SDR), sensor threshold and events, system event logs (SEL), etc.

Please refer to this page for more information.
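To illustrate how an RMC might consume temperature telemetry, here is a minimal sketch that parses a JSON payload shaped like the DMTF Redfish `Thermal` resource (`Temperatures` entries with `Name`, `ReadingCelsius`, and `UpperThresholdCritical`) and lists the sensors at or above their critical threshold. The sample payload, the sensor names, and the readings are invented for illustration. The resource path and the sensor set that the BlueField BMC actually exposes should be taken from its Redfish documentation.

```python
import json

# Invented sample shaped like a DMTF Redfish Thermal resource; not real BlueField output.
SAMPLE_THERMAL_JSON = """
{
  "Temperatures": [
    {"Name": "example_board_temp", "ReadingCelsius": 48.0, "UpperThresholdCritical": 95.0},
    {"Name": "example_ddr_temp", "ReadingCelsius": 97.5, "UpperThresholdCritical": 95.0},
    {"Name": "example_port_temp", "ReadingCelsius": null, "UpperThresholdCritical": 90.0}
  ]
}
"""


def sensors_over_critical(thermal: dict) -> list:
    """Return the names of sensors whose reading meets or exceeds the critical threshold."""
    flagged = []
    for sensor in thermal.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        limit = sensor.get("UpperThresholdCritical")
        # Skip sensors that report no reading or define no threshold.
        if reading is None or limit is None:
            continue
        if reading >= limit:
            flagged.append(sensor.get("Name", "<unnamed>"))
    return flagged


thermal = json.loads(SAMPLE_THERMAL_JSON)
print(sensors_over_critical(thermal))
```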

BlueField and BlueField BMC Reset Control

The RMC may issue a reset (soft or hard) to the BlueField, or a reset to the BlueField BMC, in both cases using Redfish over the 1GbE OOB port to the BlueField BMC.

Please refer to these pages (1,2) for more information.
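As a sketch of what a Redfish reset request looks like, the following builds the path and JSON body for the standard DMTF `ComputerSystem.Reset` action. The action name and the `ResetType` values come from the Redfish schema. The system identifier `example-system` is a placeholder, and which reset types the BlueField BMC accepts (and how they map to "soft" and "hard" reset) is an assumption to be checked against the referenced pages.

```python
import json

# A subset of the ResetType values defined by the DMTF Redfish schema.
STANDARD_RESET_TYPES = {
    "On",
    "ForceOff",
    "GracefulShutdown",
    "GracefulRestart",
    "ForceRestart",
    "PowerCycle",
}


def build_reset_action(system_id: str, reset_type: str):
    """Return the (path, body) of a Redfish ComputerSystem.Reset POST request."""
    if reset_type not in STANDARD_RESET_TYPES:
        raise ValueError(f"unsupported ResetType: {reset_type!r}")
    path = f"/redfish/v1/Systems/{system_id}/Actions/ComputerSystem.Reset"
    body = json.dumps({"ResetType": reset_type}).encode("utf-8")
    return path, body


# "example-system" is a placeholder; the real identifier is listed under /redfish/v1/Systems.
path, body = build_reset_action("example-system", "GracefulRestart")
print(path)
print(body.decode("utf-8"))
```

Validating the reset type before building the request keeps typos from reaching the BMC, which would otherwise reject them with an HTTP error.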

BlueField UEFI Configuration

BlueField UEFI settings may be modified using Redfish over the 1GbE OOB port to the BlueField BMC. This includes changing the default UEFI password (which is mandatory), setting the BlueField to zero-trust mode, setting the date and time, etc.

Please refer to this page for more information.

Console Interface

The BlueField console interface is accessible via the BlueField BMC using Serial-over-LAN (SoL) over the 1GbE OOB port. The RMC may access the console interface of the BlueField device to track its boot progress.

Please refer to this page for more information.

© Copyright 2024, NVIDIA. Last updated on Aug 29, 2024.