HowTo Configure NVIDIA BlueField-3 to NIC Mode on VMware vSphere 8.0

Created on Mar 12, 2024

This guide outlines how to configure the NVIDIA® BlueField®-3 DPU to NIC mode on VMware ESXi 8.0.x.

Prerequisites:

  • A server platform with an adapter card based on an NVIDIA BlueField-3 device

  • Administrator privileges are necessary for installation on the target machine

The setup consists of an ESXi 8.0 server with one or more NVIDIA BlueField-3 adapter cards.

Note

Installation and configuration of the VMware ESXi server, vSphere cluster, and vCenter are outside the scope of this post.

NIC mode can be configured from the server's BIOS device settings.

On a Dell server:

    1. Enter the server's BIOS menu.

    2. Click Device Settings:

    3. Select the desired BlueField device(s).

    4. Verify that the Chip Type is a BlueField device (BlueField-3) and the VMware Distributed Services Engine (DPU) is Disabled.

      Click BlueField Internal Cpu Configuration.

    5. Set the Internal Cpu Offload Engine setting to Disable (i.e., NIC mode).

    6. Repeat steps 3-5 for all relevant BlueField devices in the server.

    7. Save BIOS settings.

    8. Power cycle the server.

On a Lenovo server:

    1. Enter the server's BIOS menu.

    2. Select the desired BlueField device(s).

    3. Verify that the Chip Type is a BlueField device (e.g., BlueField-3) and the VMware Distributed Services Engine (DPU) is Disabled.

    4. Click BlueField Internal Cpu Configuration.

    5. Set Internal Cpu Offload Engine setting to Disable (i.e., NIC mode).

    6. Repeat steps 2-5 for all relevant BlueField devices in the server.

    7. Save BIOS settings.

    8. Power cycle the server.

NIC mode can also be configured from the ESXi console using NVIDIA Firmware Tools (MFT):

  1. Enable SSH from the vSphere Client to the desired ESXi server.

  2. Access the remote ESXi Shell.
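
    For example, open an SSH session from a management workstation (the host name below matches the example prompts used later in this post):

    ssh root@qa-esxio-lnv-06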

  3. Enter Maintenance Mode with ESXCLI.
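
    A minimal ESXCLI example (assuming any running VMs have already been migrated off or powered down):

    ESXi console

    [root@qa-esxio-lnv-06:~] esxcli system maintenanceMode set --enable true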

  4. Install NVIDIA Firmware Tools (MFT).
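
    For example, after copying the MFT offline bundle for ESXi to the host (the bundle path below is a placeholder; use the package downloaded from the NVIDIA website), the installation typically looks like this:

    ESXi console

    [root@qa-esxio-lnv-06:~] esxcli software vib install -d /tmp/<mft-esxi-bundle>.zip
    [root@qa-esxio-lnv-06:~] reboot

    A reboot is generally required before the tools under /opt/mellanox/bin become available.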

  5. Locate the physical NVIDIA network device.

    ESXi console

    [root@qa-esxio-lnv-06:~] lspci |grep -i mel

    Example output:

    ESXi console

    0000:38:00.0 Ethernet controller: Mellanox Technologies MT43244 Family [BlueField-3 integrated ConnectX-7]
    0000:38:00.1 Ethernet controller: Mellanox Technologies MT43244 Family [BlueField-3 integrated ConnectX-7]
    0000:38:00.2 Ethernet controller: Mellanox Technologies MT42822 Family [BlueField Auxiliary Comm Channel]
    0000:38:00.3 DMA controller: Mellanox Technologies MT43244 BlueField-3 SoC Management Interface
    0000:a8:00.0 Ethernet controller: Mellanox Technologies MT43244 Family [BlueField-3 integrated ConnectX-7]
    0000:a8:00.1 Ethernet controller: Mellanox Technologies MT43244 Family [BlueField-3 integrated ConnectX-7]
    0000:a8:00.2 Ethernet controller: Mellanox Technologies MT42822 Family [BlueField Auxiliary Comm Channel]
    0000:a8:00.3 DMA controller: Mellanox Technologies MT43244 BlueField-3 SoC Management Interface

  6. Check the NIC link status from the ESXi service console.

    ESXi console

    [root@qa-esxio-lnv-06:~] esxcli network nic list

  7. Query NVIDIA device firmware.

    ESXi console

    [root@qa-esxio-lnv-06:~] /opt/mellanox/bin/mlxfwmanager --query

    Warning

    If the following is printed:

    ESXi console

    /opt/mellanox/bin/mlxfwmanager
    -E- No devices found or specified, mst might be stopped, run 'mst start' to load MST modules

    Then the device is in DPU mode, so MFT does not recognize it.

  8. Check whether each device is in DPU mode by querying INTERNAL_CPU_OFFLOAD_ENGINE; a value of ENABLED indicates DPU mode.

    ESXi console

    [root@qa-esxio-lnv-06:~] /opt/mellanox/bin/mlxconfig -d mt41692_pciconf0 q | grep -i offload
             INTERNAL_CPU_OFFLOAD_ENGINE                 ENABLED(0)
    [root@qa-esxio-lnv-06:~] /opt/mellanox/bin/mlxconfig -d mt41692_pciconf1 q | grep -i offload
             INTERNAL_CPU_OFFLOAD_ENGINE                 ENABLED(0)

  9. Enter device recovery mode and reboot the server.

    ESXi console

    [root@qa-esxio-lnv-06:~] esxcli system module parameters set -m nmlx5_core -p mst_recovery=1
    [root@qa-esxio-lnv-06:~] reboot -f

  10. Configure the desired device to NIC mode.

    ESXi console

    [root@qa-esxio-lnv-06:~] /opt/mellanox/bin/mlxconfig -d mt41692_pciconf0 set INTERNAL_CPU_OFFLOAD_ENGINE=1

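    mlxconfig prompts for confirmation before writing the new value; the output is roughly similar to the following (illustrative):

    ESXi console

    Configurations:                              Next Boot       New
             INTERNAL_CPU_OFFLOAD_ENGINE         ENABLED(0)      DISABLED(1)

     Apply new Configuration? (y/n) [n] : y
    Applying... Done!

    If the server holds more than one BlueField-3 device, repeat the command for each one (e.g., mt41692_pciconf1).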

  11. Power cycle the system after switching to NIC mode.

  12. Verify that INTERNAL_CPU_OFFLOAD_ENGINE is DISABLED after power cycling the system.

    ESXi console

    [root@qa-esxio-lnv-06:~] /opt/mellanox/bin/mlxconfig -d mt41692_pciconf0 q | grep -i offload

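    If the mode change took effect, the query should report the offload engine as disabled, along the lines of:

    ESXi console

             INTERNAL_CPU_OFFLOAD_ENGINE                 DISABLED(1)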

Another option is to configure NIC mode from the DPU's UEFI menu: add the DPU to a VM as a PCI passthrough device, connect to the DPU UEFI console, and set NIC mode there.

  1. Change the management interfaces to passthrough.

  2. Create a virtual machine (VM) and assign the PCIe passthrough devices.

  3. Power on the VM.

  4. Deploy Linux OS from PXE.

  5. Install the driver to get RShim and MFT:

    Note

    Installing minicom is necessary.

    • For RedHat, run:

      VM console

      yum install minicom
      yum install pv
      yum install gcc-gfortran

    • For Ubuntu, run:

      VM console

      apt install minicom
      apt install pv

    • For SLES, use yast or zypper. For example:

      VM console

      zypper install minicom
      zypper install pv
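
    Once the distribution packages and the BlueField driver bundle are installed, it can help to confirm that the RShim service is running before continuing (this assumes the driver package registers a systemd unit named rshim):

    VM console

    systemctl status rshim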

  6. Connect to DPU via minicom:

    Info

    If you have multiple DPUs in your setup, there is one RShim device per DPU (e.g., rshim0 for DPU #1, rshim1 for DPU #2, etc.).

    Open a separate console session for each DPU you need to access, using the following command:

    VM console

    minicom -D /dev/rshim<N>/console

    Where <N> represents the index associated with each DPU.

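    For example, to list the available RShim devices and then attach to the first DPU's console:

    VM console

    ls -d /dev/rshim*
    minicom -D /dev/rshim0/console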

  7. Open a new SSH connection to the same VM and reset each DPU from the host:

    VM console

    echo "SW_RESET 1" > /dev/rshim<N>/misc

  8. Navigate to the console of your DPU and press the Esc key to enter the UEFI menu.

  9. Define a new password for the DPU.

    Info

    The default password for the UEFI BIOS is bluefield.

  10. Select Device Manager.

  11. Select Network Device List.

  12. Select the network device that represents the uplink (i.e., the device with the uplink MAC address).

  13. Select NVIDIA Network adapter - <uplink-mac>.

  14. Select BlueField Internal Cpu Configuration.

  15. Set NIC mode by setting Internal CPU Offload Engine to Disabled.

    Info

    To set DPU mode, set Internal CPU Offload Engine to Enabled.

  16. Power cycle the system.

Boris Kovalev

Boris Kovalev has worked for the past several years as a Solutions Architect, focusing on NVIDIA Networking/Mellanox technology, and is responsible for complex machine learning, Big Data and advanced VMware-based cloud research and design. Boris previously spent more than 20 years as a senior consultant and solutions architect at multiple companies, most recently at VMware. He has written multiple reference designs covering VMware, machine learning, Kubernetes, and container solutions which are available at the Mellanox Documents website.

This document is provided for information purposes only and shall not be regarded as a warranty of a certain functionality, condition, or quality of a product. NVIDIA Corporation (“NVIDIA”) makes no representations or warranties, expressed or implied, as to the accuracy or completeness of the information contained in this document and assumes no responsibility for any errors contained herein. NVIDIA shall have no liability for the consequences or use of such information or for any infringement of patents or other rights of third parties that may result from its use. This document is not a commitment to develop, release, or deliver any Material (defined below), code, or functionality. NVIDIA reserves the right to make corrections, modifications, enhancements, improvements, and any other changes to this document, at any time without notice. Customer should obtain the latest relevant information before placing orders and should verify that such information is current and complete. NVIDIA products are sold subject to the NVIDIA standard terms and conditions of sale supplied at the time of order acknowledgement, unless otherwise agreed in an individual sales agreement signed by authorized representatives of NVIDIA and customer (“Terms of Sale”). NVIDIA hereby expressly objects to applying any customer general terms and conditions with regards to the purchase of the NVIDIA product referenced in this document. No contractual obligations are formed either directly or indirectly by this document.

© Copyright 2024, NVIDIA. Last updated on Jan 27, 2025.