HowTo Install NVIDIA Firmware Tools (MFT) on VMware ESXi 8.0

Created on Mar 21, 2023

Introduction

This post describes how to install and run NVIDIA® Firmware Tools (MFT) on VMware ESXi 8.0.

Overview

The NVIDIA Firmware Tools (MFT) package is a set of firmware management tools used to:

  • Generate a standard or customized NVIDIA firmware image

  • Query for firmware information

  • Burn a firmware image

Hardware and Software Requirements

  • A server platform with a ConnectX®-6 Dx adapter card.

  • Installer Privileges: The installation requires administrator privileges on the target machine.
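The adapter requirement can be checked from the ESXi shell before starting. The following is a minimal sketch, not part of the original procedure: the helper name count_cx6dx is hypothetical, and it assumes that the lspci output on the host describes the adapter with the string "ConnectX-6 Dx". Adjust the pattern if your host prints a different description.

```shell
# count_cx6dx: read `lspci` output on stdin and print how many PCI
# functions mention "ConnectX-6 Dx" (case-insensitive).
# Assumption: the lspci description contains that string.
count_cx6dx() {
    # grep -c prints 0 and returns non-zero when nothing matches,
    # so "|| true" keeps the helper usable under `set -e`.
    grep -ci 'ConnectX-6 Dx' || true
}
```

A possible use on the host is `lspci | count_cx6dx`. A dual-port card normally appears as two PCI functions, so a count of 2 would be expected for a single adapter.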

Setup

The setup includes:

Note

Installation and configuration of the VMware ESXi server, vSphere Cluster and vCenter are out of the scope of this post.

Installation

The first step is to add the Mellanox Firmware Tools (MFT) depot to the vSphere Cluster image in VMware Lifecycle Manager (vLCM).

To add the Mellanox Firmware Tools (MFT) depot to the image:

  1. Download Mellanox Firmware Tools version 4.22.1 from the MFT web page: Mellanox Firmware Tools (MFT) (nvidia.com).

    Cluster_Configuration_04a.png

  2. Open a browser, connect to the vSphere web interface at https://<vcenter_fqdn>, and log in with the administrator@vsphere.local account.

    Cluster_Configuration_00a.png

  3. At the Inventory tab, select the cluster, then select the Updates tab. Select Image and check LCM compliance.

    Cluster_Configuration_00.png

  4. In the top left corner, click the three-line menu icon, then select Lifecycle Manager.

    Cluster_Configuration_04.png

  5. Click on Action, then select Import Updates.

    Cluster_Configuration_05.png

  6. At the Import Updates popup, click on Browse.

    Cluster_Configuration_06.png

  7. At the Open popup, select the Mellanox depot, then click Open.

    Cluster_Configuration_07.png

  8. Repeat steps 5 to 7 for the second depot bundle.

    Cluster_Configuration_07b.png

    Cluster_Configuration_08.png

  9. At the Inventory tab, select the cluster, then select the Updates tab. Select Image, then click on Edit.

    Cluster_Configuration_09.png

  10. Click on Show details.

    Cluster_Configuration_10.png

  11. Click on ADD COMPONENTS.

    Cluster_Configuration_11.png

  12. Select the Mellanox depots and click SELECT.

    Cluster_Configuration_12.png

  13. Click SAVE.

    Cluster_Configuration_13.png

  14. A compliance check starts automatically.

    Cluster_Configuration_14.png

  15. Click on REMEDIATE ALL to start the MFT installation on the hosts.

    Cluster_Configuration_15.png

  16. Click START REMEDIATION.

    Cluster_Configuration_16.png

    Cluster_Configuration_17.png

  17. To put a host into maintenance mode, you may need to manually power off the vCLS VMs on that host.

    Cluster_Configuration_18.png

    Cluster_Configuration_19.png

    Cluster_Configuration_20.png

  18. All hosts now have MFT installed.

    Cluster_Configuration_22.png
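The result of the remediation can also be confirmed from the ESXi shell of each host by listing the installed VIBs. The sketch below is an illustration only: the helper name list_mft_vibs is hypothetical, and the name pattern (any VIB whose name contains "mft" or "mst") is an assumption about how the two MFT bundles are named on your system.

```shell
# list_mft_vibs: read `esxcli software vib list` output on stdin and
# print the name and version (first two columns) of every VIB whose
# name contains "mft" or "mst", case-insensitive.
# Assumption: the MFT bundles install VIBs with such names.
list_mft_vibs() {
    awk 'tolower($1) ~ /mft|mst/ { print $1, $2 }'
}
```

A possible use on the host is `esxcli software vib list | list_mft_vibs`. Empty output would suggest that the MFT components were not installed on that host, or that the VIB names differ from the assumed pattern.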

Verification

  1. Enable SSH access to the ESXi server.

  2. Log in to the ESXi console with root permissions.

  3. Start the mst driver.

    ESXi console

    [root@clx-host-153:~] /opt/mellanox/bin/mst start
    Module mst is already loaded
    [root@clx-host-153:~]

  4. Check the current status of NVIDIA devices.

    ESXi console

    [root@clx-host-153:~] /opt/mellanox/bin/mst status
    PCI devices:
    ------------
    DEVICE_TYPE             MST                   PCI       RDMA    NET    NUMA
    ConnectX6DX(rev:0)      mt4125_pciconf0       39:00.0
    ConnectX6DX(rev:0)      mt4125_pciconf0.1     39:00.1
    [root@clx-host-153:~]

  5. Query the device information.

    ESXi console

    [root@clx-host-153:~] /opt/mellanox/bin/mlxfwmanager --query
    Querying Mellanox devices firmware ...

    Device #1:
    ----------

      Device Type:      ConnectX6DX
      Part Number:      MCX623106AC-CDA_Ax
      Description:      ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0 x16; Crypto and Secure Boot
      PSID:             MT_0000000436
      PCI Device Name:  mt4125_pciconf6
      Base GUID:        0c42a103002404ea
      Base MAC:         0c42a12404ea
      Versions:         Current        Available
         FW             22.30.1004     N/A
         PXE            3.6.0301       N/A
         UEFI           14.23.0017     N/A

      Status:           No matching image found

    [root@clx-host-153:~]
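When several hosts need to be verified, the output of steps 4 and 5 can be reduced to the values of interest. The following is a minimal sketch: the helper names list_mst_devices and fw_versions are hypothetical, and both helpers assume that the output has the same layout as the samples shown in this post.

```shell
# list_mst_devices: read `mst status` output on stdin and print the MST
# device names, i.e. the second column of rows such as
#   ConnectX6DX(rev:0)  mt4125_pciconf0.1  39:00.1
list_mst_devices() {
    awk '$2 ~ /^mt[0-9]+_pciconf[0-9]+(\.[0-9]+)?$/ { print $2 }'
}

# fw_versions: read `mlxfwmanager --query` output on stdin and print the
# "Current" column of each FW row, one line per queried device.
fw_versions() {
    awk '$1 == "FW" { print $2 }'
}
```

Possible use on the host:

    /opt/mellanox/bin/mst status | list_mst_devices
    /opt/mellanox/bin/mlxfwmanager --query | fw_versions

Both helpers only filter text, so they do not change any device state.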

Appendix A

mst Synopsis

mst [switches]

Commands and Switches Description:

ESXi cli

mst start
    # Create special files that represent Mellanox devices in directory /dev.
    # Load appropriate modules. After successfully completing this command,
    # the mst driver will be ready to work.

mst stop
    # Stop Mellanox mst driver service and unload the kernel modules.

mst restart
    # "mst stop" followed by "mst start"

mst server start [-p|--port port]
    # Start mst server to allow incoming connection. Default port is 23108.

mst server stop
    # Stop the mst server.

mst status
    # Print current status of Mellanox devices.
    # Options:
    #   -v  run with a high verbosity level (print more info on each device)

mst version
    # Print the version info
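As an illustration of the optional port switch of "mst server start", the sketch below builds the command line and validates the port before anything is run. The helper name mst_server_start_cmd and the MST_BIN variable are hypothetical; the binary path is the one used in the Verification section, and the 1-65535 range check is a general TCP port assumption, not something taken from the MFT documentation.

```shell
# Path to the mst binary, as used earlier in this post (override if needed).
MST_BIN="${MST_BIN:-/opt/mellanox/bin/mst}"

# mst_server_start_cmd [port]: print the command line that starts the
# mst server. Without an argument the port switch is omitted, so mst
# falls back to its default port (23108 per the synopsis above).
mst_server_start_cmd() {
    port="${1:-}"
    if [ -z "$port" ]; then
        echo "$MST_BIN server start"
        return 0
    fi
    # Reject anything that is not a plain decimal number.
    case "$port" in
        *[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
    esac
    # Assumed valid range for a TCP port.
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        echo "invalid port: $port" >&2
        return 1
    fi
    echo "$MST_BIN server start -p $port"
}
```

The helper only prints the command, for example `mst_server_start_cmd 5000`, so the line can be reviewed before it is executed on the host.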

Done!

Authors

Boris Kovalev

Boris Kovalev has worked for the past several years as a Solutions Architect, focusing on NVIDIA Networking/Mellanox technology, and is responsible for complex machine learning, Big Data and advanced VMware-based cloud research and design. Boris previously spent more than 20 years as a senior consultant and solutions architect at multiple companies, most recently at VMware. He has written multiple reference designs covering VMware, machine learning, Kubernetes, and container solutions which are available at the Mellanox Documents website.

Last updated on Sep 12, 2023.