NVIDIA Firmware Tools (MFT) Documentation v4.25.1

mlxfwreset – Loading Firmware on 5th Generation Devices Tool

The mlxfwreset tool enables the user to load updated firmware on a NIC/switch without having to reboot the machine. The mlxfwreset tool supports 5th Generation (Group II) HCAs and allows a smooth firmware upgrade.

Tool requirements:

  • Access to the device through PCI configuration cycles

  • Supported OSs: FreeBSD, Linux, Windows

mlxfwreset -d <device> query

mlxfwreset -d <device> reset [-y] [--level <0,3,4>] [--type <0..2>] [--sync <0,1>] [-s] [-m]

Where:

q|query
    Query supported reset level/type/sync. N/A for switch devices.

r|reset
    Execute reset.

reset_fsm_register
    Reset the multi-host synchronization register.

-d|--device <device>
    Device to work with.

-l|--level <0,3,4>
    Run reset with the specified reset-level. N/A for switch devices.

-t|--type <0..2>
    Run reset with the specified reset-type. N/A for switch devices.

--sync <0,1>
    Run reset with the specified reset-sync. N/A for switch devices.

-y|--yes
    Answer “yes” on prompt.

-m|--mst_flags MST_FLAGS
    Provide mst flags to be used when invoking the mst restart step. For example: --mst_flags="--with_fpga".

      • This option is supported in Linux OSes only.

      • N/A for switch devices.

-s|--skip_driver
    Skip driver start/stop stage (driver must be stopped manually). N/A for switch devices.

-v|--version
    Print tool version.

-h|--help
    Show help message and exit.

Reset levels and types depend on the extent of the changes introduced when updating the device's firmware. The tool will display the supported reset levels and types that will ensure the loading of the new firmware. Those reset levels and types are:

  • Reset-levels:

    • 0: Driver, PCI link, network link will remain up ("live-Patch")

    • 3: Driver restart and PCI reset

    • 4: Warm Reboot

  • Reset-types (relevant only for reset-levels 3, 4):

    • 0: Full chip reset

    • 1: Phy-less reset ("port-alive" - network link will remain up)

    • 2: NIC only reset (for SoC devices)

Warning

The exact reset-level and reset-type needed to load the new firmware may differ, as they depend on the difference between the running firmware and the firmware being upgraded to.

  • Reset-sync indicates which entity owns the synchronization mechanism between the hosts in a Multi-Host setup (relevant only for reset-level 3):

    • 0: Tool is the owner

    • 1: Driver is the owner
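The value ranges and relevance rules listed above can be checked before the tool is invoked. The following is a minimal Python sketch under stated assumptions: `build_reset_command` is a hypothetical helper (not part of MFT) that only assembles an argument list from the flags in the synopsis. It cannot tell what a given device supports; the query command remains the authority for that.

```python
from typing import List, Optional

VALID_LEVELS = {0, 3, 4}
VALID_TYPES = {0, 1, 2}
VALID_SYNCS = {0, 1}


def build_reset_command(device: str,
                        level: Optional[int] = None,
                        reset_type: Optional[int] = None,
                        sync: Optional[int] = None,
                        assume_yes: bool = False) -> List[str]:
    """Assemble an mlxfwreset reset command line (hypothetical helper)."""
    if level is not None and level not in VALID_LEVELS:
        raise ValueError("reset-level must be 0, 3 or 4")
    if reset_type is not None:
        if reset_type not in VALID_TYPES:
            raise ValueError("reset-type must be 0, 1 or 2")
        # Reset-types are relevant only for reset-levels 3 and 4.
        # When no level is given, the tool's default level applies and
        # this sketch cannot check relevance.
        if level is not None and level not in (3, 4):
            raise ValueError("reset-type is relevant only for reset-levels 3 and 4")
    if sync is not None:
        if sync not in VALID_SYNCS:
            raise ValueError("reset-sync must be 0 or 1")
        # Reset-sync is relevant only for reset-level 3.
        if level is not None and level != 3:
            raise ValueError("reset-sync is relevant only for reset-level 3")

    cmd = ["mlxfwreset", "-d", device]
    if level is not None:
        cmd += ["--level", str(level)]
    if reset_type is not None:
        cmd += ["--type", str(reset_type)]
    if sync is not None:
        cmd += ["--sync", str(sync)]
    cmd.append("reset")
    if assume_yes:
        cmd.append("-y")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where MFT is installed. Returning a list rather than a single string avoids shell quoting issues with device paths.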

Running mlxfwreset on a switch device takes the same form as running it on a NIC. The only difference is that there are no level, type, or sync parameters.

Running mlxfwreset on a Multi-Host setup lets you choose one of the supported reset-sync modes. To check which reset-sync modes your device supports, run the query command before the reset command.

  • When running reset with reset-sync "0" ("tool is the owner"), the tool must be run simultaneously on all the hosts. Note that there is a time-out of 3 minutes for all the hosts to join the reset process.

  • When running reset with reset-sync "1" ("driver is the owner"), the tool must be run on a single host.

Warning

reset-sync "1" ("driver is the owner") is supported only when the firmware and all the drivers on all the hosts support it.
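The two reset-sync modes differ in which hosts have to run the tool. This can be sketched as follows. `hosts_running_reset` is a hypothetical helper, the host names are placeholders, and picking the first host for reset-sync 1 is an arbitrary choice made for the sketch, since the text only requires a single host.

```python
from typing import List

# Time-out for all hosts to join when the tool is the owner (reset-sync 0).
JOIN_TIMEOUT_SECONDS = 3 * 60


def hosts_running_reset(sync: int, hosts: List[str]) -> List[str]:
    """Return the hosts on which mlxfwreset has to be run (hypothetical helper)."""
    if not hosts:
        raise ValueError("at least one host is required")
    if sync == 0:
        # Tool is the owner: run simultaneously on all the hosts.
        return list(hosts)
    if sync == 1:
        # Driver is the owner: run on a single host (first one chosen here).
        return hosts[:1]
    raise ValueError("reset-sync must be 0 or 1")
```

For example, `hosts_running_reset(0, ["host-a", "host-b"])` returns both hosts, while `hosts_running_reset(1, ["host-a", "host-b"])` returns only `["host-a"]`.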

Running mlxfwreset on a SmartNIC device is identical to running it on a Multi-Host setup, with the integrated Arm considered one of the hosts.

The procedure on a SmartNIC device is to run mlxfwreset first from the integrated Arm and then from all other hosts.

Warning

Depending on the reset-type, the integrated Arm might be reset. If the Arm is reset, mlxfwreset on the host waits for the Arm to complete its reboot.

Some configuration changes require a PCI rescan by the user. In this case, mlxfwreset prints the following warning message:

"-W- PCI rescan is required after device reset."

To query the default and supported options to reset a device, run:

# mlxfwreset -d /dev/mst/mt4113_pciconf0 query

Example:

Reset-levels:
0: Driver, PCI link, network link will remain up ("live-Patch")   -Not Supported
3: Driver restart and PCI reset                                   -Supported     (default)
4: Warm Reboot                                                    -Supported

Reset-types (relevant only for reset-levels 3,4):
0: Full chip reset                                                -Supported     (default)
1: Phy-less reset ("port-alive" - network link will remain up)    -Not Supported
2: NIC only reset (for SoC devices)                               -Not Supported

Reset-sync (relevant only for reset-level 3):
0: Tool is the owner                                              -Supported     (default)
1: Driver is the owner                                            -Supported

In the new mlxfwreset for BF2 and BF3, sync1 is the default. Sync0 is not supported.
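For scripted use, the query output can be turned into a lookup of supported and default options. The sketch below is not part of MFT: `parse_query_output` is a hypothetical helper, and it assumes the output keeps the shape of the example above (section headers starting with `Reset-levels`, `Reset-types` and `Reset-sync`, and entries ending in `-Supported` or `-Not Supported` with an optional `(default)`). Other tool versions may print something different.

```python
import re
from typing import Dict

# Assumed layout, taken from the example query output above.
SECTION_RE = re.compile(r"(Reset-levels|Reset-types|Reset-sync)[^:\n]*:")
ENTRY_RE = re.compile(
    r"(\d+):\s*(.+?)\s+-(Not Supported|Supported)(\s*\(default\))?")

SAMPLE = '''Reset-levels:
0: Driver, PCI link, network link will remain up ("live-Patch") -Not Supported
3: Driver restart and PCI reset -Supported (default)
4: Warm Reboot -Supported
Reset-types (relevant only for reset-levels 3,4):
0: Full chip reset -Supported (default)
1: Phy-less reset ("port-alive" - network link will remain up) -Not Supported
2: NIC only reset (for SoC devices) -Not Supported
Reset-sync (relevant only for reset-level 3):
0: Tool is the owner -Supported (default)
1: Driver is the owner -Supported
'''


def parse_query_output(text: str) -> Dict[str, Dict[int, Dict[str, object]]]:
    """Map each section name to its numbered options (hypothetical helper)."""
    headers = list(SECTION_RE.finditer(text))
    result: Dict[str, Dict[int, Dict[str, object]]] = {}
    for i, header in enumerate(headers):
        # A section body runs from its header to the next header (or the end).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        options: Dict[int, Dict[str, object]] = {}
        for m in ENTRY_RE.finditer(text[header.end():end]):
            options[int(m.group(1))] = {
                "description": m.group(2),
                "supported": m.group(3) == "Supported",
                "default": m.group(4) is not None,
            }
        result[header.group(1)] = options
    return result
```

With the sample text, `parse_query_output(SAMPLE)["Reset-levels"][3]` reports level 3 as supported and default, which a script could use to decide whether an explicit `--level` is needed.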

To reset the device in order to load the new firmware, run:

# mlxfwreset -d /dev/mst/mt4113_pciconf0 reset

Or:

# mlxfwreset -d /dev/mst/mt53100_pciconf0 reset -y

Example:

3: Driver restart and PCI reset
Continue with reset?[y/N] y
-I- Stopping Driver                 -Done
-I- Sending Reset Command To Fw     -Done
-I- Resetting PCI                   -Done
-I- Starting Driver                 -Done
-I- Restarting MST                  -Done
-I- FW was loaded successfully.
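When the reset is driven from a script, the captured output can be summarized rather than read by eye. The following Python sketch is not part of MFT and assumes the message strings appear exactly as in this document: step lines of the form `-I- <step> -Done`, the final success line, and the PCI rescan warning quoted earlier. `summarize_reset_output` is a hypothetical name.

```python
import re
from typing import Dict

# Message formats assumed from the example output in this document.
STEP_RE = re.compile(r"-I- (.+?)\s+-Done")
SUCCESS_LINE = "-I- FW was loaded successfully."
PCI_RESCAN_WARNING = "-W- PCI rescan is required after device reset."


def summarize_reset_output(output: str) -> Dict[str, object]:
    """Summarize captured mlxfwreset reset output (hypothetical helper)."""
    return {
        # Names of the steps that reported -Done, in order of appearance.
        "completed_steps": STEP_RE.findall(output),
        "success": SUCCESS_LINE in output,
        "pci_rescan_required": PCI_RESCAN_WARNING in output,
    }
```

A caller would typically treat `success` being false as a failure and, when `pci_rescan_required` is true, prompt the operator to rescan the PCI bus.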

To reset a device with a specific reset level to load new firmware, run:

# mlxfwreset -d /dev/mst/mt4113_pciconf0 -l 4 reset

Example:

Requested reset level for device, /dev/mst/mt4113_pciconf0:
4: Warm Reboot
Continue with reset?[y/N] y
-I- Sending reboot command to machine

The following are the limitations of mlxfwreset:

  • Executing a reset-level or reset-type or reset-sync that is not supported (as shown in the query command) will yield an error

  • When burning firmware with flint/mlxburn, the following message is displayed at the end of the burn:
    -I- To load new FW run mlxfwreset or reboot machine.
    If this message is not displayed, a reboot is required to load the new firmware.

  • On old firmware, after a successful reset, attempting to query or reset again yields an error because the load-new-firmware command has already been sent to the firmware.

  • If mlxfwreset exits with an error after the “Stopping driver” step and before the “Starting driver” step, the driver remains down. In this case, the user must start the driver manually.

  • mlxfwreset for switch devices does not work over InfiniBand.
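Two of the limitations above can be detected from captured tool output. The sketch below makes two assumptions that the text does not confirm: that the step lines of a failed run have the same `-I- <step> -Done` format as in the successful example, and that the burn message appears verbatim in the flint/mlxburn output. Both function names are hypothetical.

```python
import re

# Message taken verbatim from the limitation described above.
BURN_HINT = "-I- To load new FW run mlxfwreset or reboot machine."


def driver_left_down(reset_output: str) -> bool:
    """True if the driver was stopped but never reported as started again."""
    stopped = re.search(r"-I- Stopping Driver\s+-Done", reset_output) is not None
    started = re.search(r"-I- Starting Driver\s+-Done", reset_output) is not None
    return stopped and not started


def reboot_required_after_burn(burn_output: str) -> bool:
    """True if the burn output lacks the hint that mlxfwreset can load the firmware."""
    return BURN_HINT not in burn_output
```

If `driver_left_down` returns true for a failed run, the driver has to be started manually. If `reboot_required_after_burn` returns true, mlxfwreset should be skipped in favor of a reboot.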

© Copyright 2023, NVIDIA. Last updated on Oct 23, 2023.