mlxfwreset – Loading Firmware on 5th Generation Devices Tool
The mlxfwreset tool enables the user to load updated firmware on a NIC/switch without having to reboot the machine. The mlxfwreset tool supports 5th Generation (Group II) HCAs and allows a smooth firmware upgrade.
Note: This feature is supported if either reset type 3 (ARM RESET), which is the default starting from this version, or 4 (ARM SHUT DOWN), and reset level 1 (IMMEDIATE RESET) are supported.
Tool Requirements
Access to device through PCI configuration cycles
Supported OSs: FreeBSD, Linux, Windows
Query Command
mlxfwreset -d <device> query
Reset Command
mlxfwreset -d <device> reset [-y] [--level <0,3,4>] [--type <0..2>] [--sync <0,1,2>] [-s] [-m] [--method <0,1>]
mlxfwreset Synopsis
Where:
q|query: Query supported reset level/type/sync. N/A for switch devices.
r|reset: Execute reset.
reset_fsm_register: Reset the multi-host synchronization register.
-d|--device <device>: Device to work with.
-l|--level <0,1,3,4>: Run reset with the specified reset-level. N/A for switch devices.
-t|--type <0..4>: Run reset with the specified reset-type. N/A for switch devices.
--sync <0,1,2>: Run reset with the specified reset-sync. N/A for switch devices.
--method <0,1>: Run reset with the specified request method (relevant only for reset-level 3).
-y|--yes: Answer “yes” on prompt.
-m|--mst_flags MST_FLAGS: Provide mst flags to be used when invoking the mst restart step. For example: --mst_flags="--with_fpga". This option is supported on Linux only. N/A for switch devices.
-s|--skip_driver: Skip driver start/stop stage (driver must be stopped manually). N/A for switch devices.
-v|--version: Print tool version.
-h|--help: Show help message and exit.
Reset Levels and Types
Reset levels and types depend on the extent of the changes introduced when updating the device's firmware. The tool will display the supported reset levels and types that will ensure the loading of the new firmware. Those reset levels and types are:
Reset-levels:
0: Driver, PCI link, network link will remain up ("live-Patch")
1: Only ARM side will not remain up ("Immediate reset")
3: Driver restart and PCI reset
4: Warm Reboot
Reset-types (relevant only for reset-levels 1, 3, 4):
0: Full chip reset
1: Phy-less reset ("port-alive" - network link will remain up) – Not Supported
2: NIC only reset (for SoC devices)
3: ARM only reset
4: ARM OS shut down
Note: The exact reset-level and reset-type needed to load the new firmware may differ, as they depend on the difference between the running firmware and the firmware being upgraded to.
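Because the supported and default values vary by device and firmware, a script may prefer to read them from the query output instead of hard-coding them. The following is a minimal sketch that assumes the query output format shown in the usage examples of this document; `default_reset_option` is a hypothetical helper name, not part of the tool.

```shell
# Print the number of the entry marked "(default)" in one section of the
# query output.
#   $1    section name: Reset-levels, Reset-types or Reset-sync
#   stdin text in the format printed by "mlxfwreset -d <device> query"
#         (assumed to match the example output in this document)
default_reset_option() {
    awk -v section="$1" '
        # A section header starts a new section; remember if it is ours.
        $1 ~ /^Reset-/ { in_section = (index($1, section) == 1); next }
        # Inside our section, the first "(default)" entry wins.
        in_section && /\(default\)/ { sub(/:.*/, "", $1); print $1; exit }
    '
}

# Possible use:
#   mlxfwreset -d /dev/mst/mt4113_pciconf0 query | default_reset_option Reset-levels
```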
Reset Sync
Reset-sync indicates who is responsible for the synchronization mechanism between the hosts in a Multi-Host setup (relevant only for reset-level 3):
0: Tool is the owner
1: Driver is the owner
2: FW is the owner
Reset Method
Reset request method (relevant only for reset-level 3):
0: Link Disable method
1: Hot reset (SBR)
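Since reset-sync and the request method are relevant only for reset-level 3, a wrapper script can reject inconsistent combinations before calling the tool. The sketch below only prints the command line it would run; `build_reset_cmd` is a hypothetical helper, and the accepted values are taken from the option list in this document rather than from the tool itself.

```shell
# Print an mlxfwreset reset command line after basic sanity checks.
# Usage: build_reset_cmd <device> <level> [sync] [method]
build_reset_cmd() {
    local device="$1" level="$2" sync="${3:-}" method="${4:-}"

    case "$level" in
        0|1|3|4) ;;
        *) echo "unsupported reset-level: $level" >&2; return 1 ;;
    esac

    # reset-sync and method apply only to reset-level 3.
    if [ "$level" != 3 ] && { [ -n "$sync" ] || [ -n "$method" ]; }; then
        echo "--sync/--method are relevant only for reset-level 3" >&2
        return 1
    fi

    local cmd="mlxfwreset -d $device --level $level"
    if [ -n "$sync" ]; then cmd="$cmd --sync $sync"; fi
    if [ -n "$method" ]; then cmd="$cmd --method $method"; fi
    echo "$cmd reset"
}
```

The query command remains the authority on what a given device supports; this check only catches combinations that the text above already rules out.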
PCIe Switch Reset via Hot Reset
We provide a mechanism to reset a PCIe switch using a hot reset, eliminating the need to reboot the entire setup.
This is achieved using sync 2 and method 1 arguments, which together define the hot reset flow.
This feature currently supports ConnectX-7 and BlueField devices.
To trigger the reset, either run the tool with the --sync 2 --method 1 arguments, or run it without any parameters, in which case the tool applies the default values automatically.
Important:
Before executing the reset, the user should stop any driver bound to or associated with devices downstream of the PCIe switch. The tool may also prompt the user to unload specific drivers or terminate processes that it is aware of, but it is not necessarily aware of all the bound drivers.
It is the user's responsibility to ensure these steps are completed to guarantee a successful reset.
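One way to find drivers that are still bound downstream of the switch on Linux is to walk the sysfs directory of the switch port and look for `driver` symlinks. This is a minimal sketch that assumes the standard Linux sysfs PCI layout; `list_bound_drivers` is a hypothetical helper, and the PCI address in the usage comment is a placeholder.

```shell
# Print "<device> <driver>" for every device that has a bound driver under
# a sysfs PCI device directory (the device itself and everything nested
# below it, i.e. downstream of it).
# Assumption: Linux sysfs, where a bound device directory contains a
# "driver" symlink pointing at its driver.
list_bound_drivers() {
    local root="$1" link
    # -H follows the symlink given on the command line, which is how
    # /sys/bus/pci/devices/<address> entries are exposed.
    find -H "$root" -name driver -type l 2>/dev/null | sort |
    while read -r link; do
        printf '%s %s\n' \
            "$(basename "$(dirname "$link")")" \
            "$(basename "$(readlink "$link")")"
    done
}

# Possible use (placeholder address of the switch upstream port):
#   list_bound_drivers /sys/bus/pci/devices/0000:02:00.0
```

The list may include port service drivers of the switch itself, so it is a starting point for review rather than an exact list of what must be stopped.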
mlxfwreset for Switch Devices
Running mlxfwreset on a switch device is done in the same way as running mlxfwreset on a NIC. The only difference is that there are no level, type, or sync parameters.
mlxfwreset for Multi-Host NICs
Running mlxfwreset on a Multi-Host setup enables you to choose one of the supported reset-sync values. To check which reset-sync values are supported on your device, run the query command prior to the reset command.
When running reset with reset-sync "0" ("tool is the owner"), the tool must be run simultaneously on all the hosts. Note that a time-out of 3 minutes is expected for all the hosts until they join the reset process.
When running reset with reset-sync "1" ("driver is the owner"), the tool must be run on a single host.
Note: reset-sync "1" ("driver is the owner") is supported only when the firmware and all the drivers on all the hosts support it.
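With reset-sync "0", starting the tool by hand on each host within the time-out can be awkward, so the launch could be scripted. The sketch below starts the same reset on every host in the background and then waits for all of them. It assumes password-less SSH access and identical device names on all hosts; `reset_all_hosts` and the `RUNNER` variable are hypothetical names introduced for illustration.

```shell
# Start the same reset on every host at roughly the same time.
# Usage: reset_all_hosts <device> <host>...
# RUNNER is the command used to reach each host. The default "ssh" is an
# assumption; RUNNER=echo gives a dry run that only prints the commands.
reset_all_hosts() {
    local device="$1" host pid pids="" rc=0
    shift
    for host in "$@"; do
        ${RUNNER:-ssh} "$host" mlxfwreset -d "$device" --level 3 --sync 0 reset -y &
        pids="$pids $!"
    done
    # Wait for every host and report failure if any of them failed.
    for pid in $pids; do
        wait "$pid" || rc=1
    done
    return "$rc"
}

# Possible use (host names are placeholders):
#   reset_all_hosts /dev/mst/mt4113_pciconf0 hostA hostB
```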
mlxfwreset for SmartNICs (BlueField)
After upgrading the FW, you can run mlxfwreset with level 3 and sync 1.
If you use Level 4 standalone (without shutting down the ARM), you need to execute mlxfwreset on both the integrated ARM and the host.
To update the FW using Level 4, the ARM must be shut down first.
Note: Depending on the reset-type, the integrated Arm might get reset. If the Arm is reset, mlxfwreset on the host will wait for the Arm to complete the reboot process.
mlxfwreset after Changing Configurations using mlxconfig
Some configuration changes require a PCI rescan by the user. In this case, mlxfwreset will print the following warning message:
"-W- PCI rescan is required after device reset."
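A script that runs the reset unattended could look for this warning and trigger the rescan itself. This is a minimal sketch: `needs_pci_rescan` is a hypothetical helper, and the rescan step in the usage comment assumes Linux, root privileges, and the standard sysfs rescan file.

```shell
# Succeed if mlxfwreset output (on stdin) contains the rescan warning.
needs_pci_rescan() {
    grep -qF -- '-W- PCI rescan is required after device reset'
}

# Possible use:
#   out=$(mlxfwreset -d /dev/mst/mt4113_pciconf0 reset -y 2>&1)
#   printf '%s\n' "$out"
#   if printf '%s\n' "$out" | needs_pci_rescan; then
#       echo 1 > /sys/bus/pci/rescan
#   fi
```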
Examples of mlxfwreset Usage
To query the default and supported options to reset a device, run:
# mlxfwreset -d /dev/mst/mt4113_pciconf0 query
Example:
Reset-levels:
0: Driver, PCI link, network link will remain up ("live-Patch") -Not Supported
1: Only ARM side will not remain up ("Immediate reset"). -Not Supported
3: Driver restart and PCI reset -Supported (default)
4: Warm Reboot -Supported
Reset-types (relevant only for reset-levels 1,3,4):
0: Full chip reset -Supported (default)
1: Phy-less reset ("port-alive" - network link will remain up) -Not Supported
2: NIC only reset (for SoC devices) -Not Supported
3: ARM only reset -Not Supported
4: ARM OS shut down -Not Supported
Reset-sync (relevant only for reset-level 3):
0: Tool is the owner -Supported (default)
1: Driver is the owner -Supported
In the new mlxfwreset for BF2 and BF3, sync 1 is the default. Sync 0 is not supported.
To reset the device in order to load the new firmware, run:
# mlxfwreset -d /dev/mst/mt4113_pciconf0 reset
Or:
# mlxfwreset -d /dev/mst/mt53100_pciconf0 reset -y
Example:
3: Driver restart and PCI reset
Continue with reset?[y/N] y
-I- Stopping Driver -Done
-I- Sending Reset Command To Fw -Done
-I- Resetting PCI -Done
-I- Starting Driver -Done
-I- Restarting MST -Done
-I- FW was loaded successfully.
To reset a device with a specific reset level to load new firmware, run:
# mlxfwreset -d /dev/mst/mt4113_pciconf0 -l 4 reset
Example:
Requested reset level for device, /dev/mst/mt4113_pciconf0:
4: Warm Reboot
Continue with reset?[y/N] y
-I- Sending reboot command to machine
mlxfwreset Limitations
The following are the limitations of mlxfwreset:
Executing a reset-level or reset-type or reset-sync that is not supported (as shown in the query command) will yield an error
When burning firmware with flint/mlxburn, the following message is displayed at the end of the burn:
-I- To load new FW run mlxfwreset or reboot machine
If this message is not displayed, a reboot is required to load the new firmware
On old firmware, after a successful reset execution, attempting to query or reset again will yield an error, as the command to load the new firmware was already sent to the firmware
If mlxfwreset exits with an error after the “Stopping driver” step and before the “Starting driver” step, the driver will remain down. In this case, the user should start the driver manually
mlxfwreset for switch devices does not work over InfiniBand or remote connection
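The burn-message limitation above lends itself to a simple check in an update script: capture the flint/mlxburn output and decide whether mlxfwreset can be used or a reboot is needed. This is a minimal sketch that matches the message text quoted above; `activation_method` is a hypothetical helper name.

```shell
# Decide how to activate newly burned firmware.
#   stdin  captured output of the burn tool (flint or mlxburn)
#   stdout "mlxfwreset" if the tool announced that mlxfwreset can load the
#          new firmware, otherwise "reboot"
activation_method() {
    if grep -qF -- '-I- To load new FW run mlxfwreset or reboot machine'; then
        echo mlxfwreset
    else
        echo reboot
    fi
}

# Possible use, with the burn output saved to a file named burn.log:
#   activation_method < burn.log
```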