mlxfwreset – Loading Firmware on 5th Generation Devices Tool

NVIDIA ConnectX-5 Adapter Cards Firmware Release Notes v16.35.3502 LTS

The mlxfwreset tool enables the user to load updated firmware on a NIC without rebooting the machine. It supports 5th Generation (Group II) HCAs and allows a smooth firmware upgrade.

  • Access to device through PCI configuration cycles

  • Supported OSs: FreeBSD, Linux, Windows

mlxfwreset -d <device> query

mlxfwreset -d <device> reset [-y] [--level <0,3..5>] [--type <0..1>] [--sync <0..1>] [-s] [-m]

where:

q|query

Query supported reset level/type/sync

r|reset

Execute reset

reset_fsm_register

Reset the multi-host synchronization register

-d|--device <device>

Device to work with

-l|--level <0,3..5>

Run reset with the specified reset-level

-t|--type <0,1>

Run reset with the specified reset-type

--sync <0,1>

Run reset with the specified reset-sync

-y|--yes

Answer “yes” on prompt

-m|--mst_flags MST_FLAGS

Provide mst flags to be used when invoking the mst restart step. For example: --mst_flags="--with_fpga"

Note: This option is supported in Linux OSes only.

-s|--skip_driver

Skip driver start/stop stage (driver must be stopped manually)

-v|--version

Print tool version

-h|--help

Show help message and exit

Reset levels and types depend on the extent of the changes introduced when updating the device's firmware. The tool will display the supported reset levels and types that will ensure the loading of the new firmware. Those reset levels and types are:

  • Reset-levels:

    • 0: Driver, PCI link, network link will remain up ("live-Patch")

    • 3: Driver restart and PCI reset

    • 4: Warm Reboot

    • 5: Cold Reboot

  • Reset-types (relevant only for reset-levels 3, 4):

    • 0: Full chip reset

    • 1: Phy-less reset ("port-alive" - network link will remain up)

Warning

The exact reset-level and reset-type needed to load the new firmware may differ, as they depend on the differences between the running firmware and the firmware being upgraded to.

  • Reset-sync indicates which component owns the synchronization mechanism between the hosts in a Multi-Host setup (relevant only for reset-level 3):

    • 0: Tool is the owner

    • 1: Driver is the owner
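The relevance rules above (reset-type applies only to reset-levels 3 and 4, reset-sync only to reset-level 3) can be sketched as a small helper that composes a command line without running it. This is a minimal sketch, not part of the tool: `build_reset_cmd` is a hypothetical name, and it assumes the options may precede the `reset` command, as in the `-l 4 reset` example in this document. It does not check what the device actually supports; the query command remains the authority for that.

```bash
#!/usr/bin/env bash
# build_reset_cmd <device> <level> [type] [sync]   (hypothetical helper)
# Prints, but does not execute, an mlxfwreset command line that uses only
# the options documented in this section.
build_reset_cmd() {
    local device="$1" level="$2" rtype="${3:-}" rsync="${4:-}"

    # Only reset-levels 0, 3, 4 and 5 are defined.
    case "$level" in
        0|3|4|5) ;;
        *) echo "unknown reset-level: $level" >&2; return 1 ;;
    esac

    local cmd="mlxfwreset -d $device -l $level"

    # Reset-type is relevant only for reset-levels 3 and 4.
    if [ -n "$rtype" ]; then
        case "$level" in
            3|4) cmd="$cmd -t $rtype" ;;
            *)   echo "reset-type dropped for reset-level $level" >&2 ;;
        esac
    fi

    # Reset-sync is relevant only for reset-level 3.
    if [ -n "$rsync" ]; then
        if [ "$level" = 3 ]; then
            cmd="$cmd --sync $rsync"
        else
            echo "reset-sync dropped for reset-level $level" >&2
        fi
    fi

    # -y answers "yes" to the confirmation prompt.
    echo "$cmd -y reset"
}

# Example: build_reset_cmd /dev/mst/mt4113_pciconf0 3 0 1
```

Printing the command instead of executing it leaves room to review it against the query output before anything is reset.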

Running mlxfwreset on a Multi-Host setup lets you choose one of the supported reset-sync values. To check which reset-sync values are supported on your device, run the query command before the reset command.

  • When running reset with reset-sync "0" ("tool is the owner"), the tool must be run simultaneously on all the hosts. Note that the tool waits up to 3 minutes for all the hosts to join the reset process.

  • When running reset with reset-sync "1" ("driver is the owner"), the tool must be run on a single host.

Warning

reset-sync "1" ("driver is the owner") is supported only when the firmware and all the drivers on all the hosts support it.
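For reset-sync "0", starting the tool on all hosts at roughly the same time could be sketched as follows. This is an illustration under stated assumptions rather than a documented procedure: `run_on_all_hosts` and the `RUNNER` variable are hypothetical names, the hosts are assumed to be reachable over ssh with sufficient privileges, and the device is assumed to have the same name on every host.

```bash
#!/usr/bin/env bash
# run_on_all_hosts <device> <host>...   (hypothetical helper)
# Starts "mlxfwreset ... --sync 0" on every host in parallel so that all
# hosts join the reset within the time-out, then waits for all of them.
# RUNNER defaults to ssh; set RUNNER=echo for a dry run.
run_on_all_hosts() {
    local device="$1"; shift
    local runner="${RUNNER:-ssh}"
    local pids=() host pid rc=0

    for host in "$@"; do
        # Assumption: the same device name is valid on every host.
        "$runner" "$host" mlxfwreset -d "$device" -l 3 --sync 0 -y reset &
        pids+=("$!")
    done

    # Report failure if the tool failed on any host.
    for pid in "${pids[@]}"; do
        wait "$pid" || rc=1
    done
    return "$rc"
}

# Dry run: RUNNER=echo run_on_all_hosts /dev/mst/mt4113_pciconf0 hostA hostB
```

With reset-sync "1" none of this is needed, since the tool is run on a single host.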

Running mlxfwreset on a SmartNIC device is identical to running it on a Multi-Host setup, with the integrated Arm considered one of the hosts.

The procedure on a SmartNIC device is to run mlxfwreset first from the integrated Arm and then from all other hosts.

Warning

Depending on the reset-type, the integrated Arm might be reset. If the Arm is reset, mlxfwreset on the host waits for the Arm to complete its reboot process.

To query the default and supported options to reset a device, run:

# mlxfwreset -d /dev/mst/mt4113_pciconf0 query

Example:

Reset-levels:
0: Driver, PCI link, network link will remain up ("live-Patch") -Not Supported
3: Driver restart and PCI reset -Supported (default)
4: Warm Reboot -Supported
5: Cold Reboot -Supported

Reset-types (relevant only for reset-levels 3,4):
0: Full chip reset -Supported (default)
1: Phy-less reset ("port-alive" - network link will remain up) -Not Supported

Reset-sync (relevant only for reset-level 3):
0: Tool is the owner -Supported (default)
1: Driver is the owner -Supported
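When scripting around the query command, it can help to extract the supported options before attempting a reset. The sketch below is one possible approach, not a feature of the tool: `supported_options` is a hypothetical helper, and it assumes the query output lists one option per line under a header line starting with `Reset-levels`, `Reset-types` or `Reset-sync`, with supported entries marked `-Supported`. The exact layout may differ between tool versions.

```bash
#!/usr/bin/env bash
# supported_options <section>   (hypothetical helper)
# Reads mlxfwreset query output on stdin and prints, separated by spaces,
# the option numbers marked "-Supported" in the section whose header
# starts with <section> (Reset-levels, Reset-types or Reset-sync).
supported_options() {
    awk -v section="$1" '
        # A header line starts a new section; remember whether it is ours.
        $1 ~ /^Reset-/ { in_section = (index($1, section) == 1); next }
        # "-Not Supported" does not contain the string "-Supported".
        in_section && /-Supported/ {
            sub(/:.*/, "", $1)        # "3:" becomes "3"
            printf "%s%s", sep, $1
            sep = " "
        }
        END { print "" }
    '
}

# Example (requires the real tool and device):
#   mlxfwreset -d /dev/mst/mt4113_pciconf0 query | supported_options Reset-levels
```

For the example output in this section, the helper prints `3 4 5` for `Reset-levels`, `0` for `Reset-types`, and `0 1` for `Reset-sync`.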

To reset the device in order to load the new firmware, run:

# mlxfwreset -d /dev/mst/mt4113_pciconf0 reset

Example

3: Driver restart and PCI reset
Continue with reset?[y/N] y
-I- Stopping Driver -Done
-I- Sending Reset Command To Fw -Done
-I- Resetting PCI -Done
-I- Starting Driver -Done
-I- Restarting MST -Done
-I- FW was loaded successfully.

To reset a device with a specific reset level to load new firmware, run:

# mlxfwreset -d /dev/mst/mt4113_pciconf0 -l 4 reset

Example

Requested reset level for device, /dev/mst/mt4113_pciconf0:
4: Warm Reboot
Continue with reset?[y/N] y
-I- Sending reboot command to machine

The following are the limitations of mlxfwreset:

  • Executing a reset-level or reset-type or reset-sync that is not supported (as shown in the query command) will yield an error

  • When burning firmware with flint/mlxburn, the following message is displayed at the end of the burn:
    -I- To load new FW run mlxfwreset or reboot machine.
    If this message is not displayed, a reboot is required to load the new firmware.

  • On an old firmware, after a successful reset execution, attempting to query or reset again will yield an error as the load new firmware command was already sent to the firmware.

  • If mlxfwreset exits with an error after the “Stopping driver” step and before the “Starting driver” step, the driver remains down. In this case, the user must start the driver manually.
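Two of the limitations above lend themselves to simple checks on captured tool output. The following is a rough sketch with hypothetical helper names (`burn_allows_fwreset`, `driver_left_down`). It assumes the burn message appears verbatim as quoted above, and that mlxfwreset prints the "Stopping Driver" and "Starting Driver ... -Done" step lines as in the reset example in this document. How the driver is started manually depends on the platform and is not covered here.

```bash
#!/usr/bin/env bash
# burn_allows_fwreset   (hypothetical helper)
# Reads flint/mlxburn output on stdin. Succeeds only if the burn printed
# the message saying the firmware can be loaded with mlxfwreset; if it
# fails, a reboot is required instead.
burn_allows_fwreset() {
    grep -qF -- '-I- To load new FW run mlxfwreset or reboot machine.'
}

# driver_left_down   (hypothetical helper)
# Reads mlxfwreset output on stdin. Succeeds if the driver stop step was
# reached but no completed driver start step is present, which suggests
# the driver must be started manually.
driver_left_down() {
    local log
    log="$(cat)"
    # No stop step at all: the driver was never touched.
    grep -qi 'Stopping Driver' <<<"$log" || return 1
    # A completed start step means the driver is back up.
    if grep -i 'Starting Driver' <<<"$log" | grep -q -- '-Done'; then
        return 1
    fi
    return 0
}

# Example (burn.log and reset.log are hypothetical captured outputs):
#   burn_allows_fwreset < burn.log || echo "reboot required"
#   driver_left_down < reset.log && echo "start the driver manually"
```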

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.