Software Management

The systems include an embedded management CPU card that runs MLNX-OS® management software.

The MLNX-OS systems management package and related documentation can be downloaded at https://docs.nvidia.com/networking/category/mlnxos.

The InfiniBand Subnet Manager (SM) is a centralized entity running in the system. The SM applies network traffic-related configurations, such as QoS, routing, and partitioning, to the fabric devices. You can view and configure the Subnet Manager parameters via the CLI or WebUI. Each subnet needs one Subnet Manager to discover, activate, and manage the subnet.

Each network requires a Subnet Manager running either in the system itself (system based) or on one of the nodes connected to the fabric (host based).

Warning

No more than two subnet managers are recommended for any single fabric.

The InfiniBand Subnet Manager running on the system supports up to 2048 nodes. If the fabric includes more than 2048 nodes, you may need to purchase NVIDIA's Unified Fabric Manager (UFM®) software package.

The subnet manager (OpenSM) assigns a Local Identifier (LID) to each port connected to the fabric, and builds a routing table based on the assigned LIDs.
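
The idea of assigning LIDs and deriving routes from them can be illustrated with a toy model. The sketch below is not OpenSM's actual algorithm: it treats each node as a single port, numbers nodes sequentially, and uses a plain breadth-first search to pick a next hop for every destination LID. The node names, the `assign_lids` and `build_routes` helpers, and the three-node fabric are all hypothetical.

```python
from collections import deque


def assign_lids(nodes):
    """Give every node a unique LID, starting at 1 (toy numbering)."""
    return {name: lid for lid, name in enumerate(sorted(nodes), start=1)}


def build_routes(links, lids):
    """For each node, map destination LID -> next-hop neighbor (shortest path)."""
    neighbors = {name: set() for name in lids}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    routes = {}
    for src in lids:
        first_hop = {}  # destination name -> neighbor of src used to reach it
        queue = deque()
        for n in sorted(neighbors[src]):
            first_hop[n] = n
            queue.append(n)
        seen = {src} | set(first_hop)
        # Breadth-first search: each newly reached node inherits the first hop
        # of the node it was discovered from.
        while queue:
            cur = queue.popleft()
            for nxt in sorted(neighbors[cur]):
                if nxt not in seen:
                    seen.add(nxt)
                    first_hop[nxt] = first_hop[cur]
                    queue.append(nxt)
        routes[src] = {lids[dst]: hop for dst, hop in first_hop.items()}
    return routes


# Hypothetical fabric: two hosts attached to one switch.
links = [("host-a", "switch-1"), ("host-b", "switch-1")]
lids = assign_lids({"host-a", "host-b", "switch-1"})
routes = build_routes(links, lids)
print(lids)
print(routes["switch-1"])
```

In a real fabric the routing tables live in the switches and map each destination LID to an output port; the toy model maps to a neighbor name only to keep the example short.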

A typical installation using the OFED package runs the OpenSM subnet manager at system startup, after the drivers are loaded. This automatically started OpenSM instance is resident in memory, and sweeps the fabric approximately every 5 seconds for new adapters to add to the subnet routing tables.

Software and firmware updates are available from the NVIDIA Support website. Check whether your current revision matches the one posted on the NVIDIA website. If it does not, upgrade your software. Copy the update to a known location on a remote server within your LAN.

Use the CLI or the GUI to perform software upgrades. For further information, refer to the Upgrading MLNX-OS® Software section in the MLNX-OS Software User Manual.

Be sure to read and follow all of the instructions regarding the updating of the software on your system.

Managed systems do not require firmware to be updated separately: firmware updating is done through the MLNX-OS management software. The system comes standard with a management software module called NVIDIA Operating System (MLNX-OS). MLNX-OS® is installed on all Quantum based managed systems. MLNX-OS® includes a CLI, WebUI, SNMP, system management software, and IB management software (OpenSM).

There are two methods to update system firmware:

  • (Typical) In-band via a switch network port across a cable connecting the server to the switch port.

  • (Non-typical) Via the I²C port of the switch, using an NVIDIA MTUSB-1 device connected to a server's USB port on one end and to the I²C port of the switch on the other.

Firmware updates should normally be conducted in-band. The MTUSB-1 device is intended for debugging or for recovering from firmware corruption, and should be used by NVIDIA Field or Support engineers, or by trained users at the customer's site.

Both types of update require the installation of the NVIDIA Firmware Tools (MFT) package. The MFT package and user manual are available for download at https://network.nvidia.com/products/adapter-software/firmware-tools/. Select the package that suits your operating system.

To obtain information regarding the externally managed system, you must download the NVIDIA MFT tools from https://network.nvidia.com/products/adapter-software/firmware-tools/.

Select and download the release that matches your system. Follow the instructions in the MFT User Manual (https://docs.nvidia.com/networking/category/mft) to install the tools.

Updating Firmware In-band (Typical)

Check the currently programmed firmware on the system and compare it to the latest firmware available under https://network.nvidia.com/support/firmware/firmware-downloads/ (check under Quantum Switch Systems).
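
The comparison itself can be sketched in a few lines. This is a minimal illustration, assuming firmware versions are written as dot-separated numbers; the `parse_version` and `needs_update` helpers and the two version strings are hypothetical and are not part of MFT.

```python
def parse_version(text):
    """Turn a dotted version string such as '27.2008.2102' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(current, latest):
    """Return True when the installed firmware is older than the latest release."""
    return parse_version(current) < parse_version(latest)


# Hypothetical version strings, for illustration only.
print(needs_update("27.2008.2102", "27.2010.1012"))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes when a field gains a digit, for example "9.1" versus "10.1".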

To obtain the firmware version of the externally managed system:

  1. Obtain the LID of the target system by performing the following steps. These steps use one of the utilities provided by the installed MFT package; other methods are described in the MFT User Manual.

    1. Note the GUID printed on the inventory pull-out tab of the system.

    2. Run the command ibnetdiscover and search for the row that starts with the word "Switch" and contains the GUID of the system.

    3. Note the LID displayed on that row (a decimal number).

  2. Run the following command from a host:

    # flint -d <device> q

  3. Compare the results of this command with the latest version for your system posted on https://network.nvidia.com/support/firmware/firmware-downloads/ (select the Quantum System page).

  4. If the current version is not the latest version, follow the directions in the MFT User Manual to burn the new firmware in-band.
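
The LID lookup in step 1 can be scripted. The following is a rough sketch, not an official tool: it assumes that switch rows in the ibnetdiscover output look like the made-up sample row shown, with the GUID after `S-` and the LID after the word `lid`, and that the in-band device name passed to flint takes the form `lid-<LID>`. Both assumptions should be confirmed against the MFT User Manual and your own ibnetdiscover output. The `find_switch_lid` and `flint_query_command` helpers are hypothetical names.

```python
import re

# Matches a switch row; the row layout is an assumption about ibnetdiscover output.
SWITCH_ROW = re.compile(r'^Switch\s+\d+\s+"S-([0-9a-fA-F]+)".*\blid\s+(\d+)')


def find_switch_lid(ibnetdiscover_output, guid):
    """Return the LID (int) of the switch whose GUID matches, or None."""
    wanted = int(guid, 16)  # accepts the GUID with or without a leading 0x
    for line in ibnetdiscover_output.splitlines():
        match = SWITCH_ROW.match(line)
        if match and int(match.group(1), 16) == wanted:
            return int(match.group(2))
    return None


def flint_query_command(lid):
    """Build the query command; the 'lid-<LID>' device form is an assumption."""
    return ["flint", "-d", f"lid-{lid}", "q"]


# Made-up sample row, for illustration only.
sample = 'Switch\t36 "S-0002c90200123456"\t\t# "Example switch" base port 0 lid 3 lmc 0\n'
lid = find_switch_lid(sample, "0x0002c90200123456")
print(lid, flint_query_command(lid))
```

The command is returned as an argument list so that it can be passed to `subprocess.run` without shell quoting, once the device name format has been verified.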

For further information, refer to the MFT User Manual at https://docs.nvidia.com/networking/category/mft.

© Copyright 2023, NVIDIA. Last updated on Jan 7, 2024.