NVIDIA BlueField-2 DPU Firmware Release Notes v24.35.4554 LTS
Bug Fixes History

Internal Ref.

Issue

4149511

Description: Fixed an issue that resulted in a setup crash when create_sq used an invalid mbox. The invalid mbox is now replaced with a valid DB.

Keywords: mbox

Discovered in Version: 24.35.4030

Fixed in Release: 24.35.4506

4161453

Description: Fixed a rare case in which traffic stopped and could not be recovered when the emulation doorbell did not function properly.

Keywords: Emulation doorbell

Discovered in Version: 24.35.4030

Fixed in Release: 24.35.4506

4149392

Description: Added validation in MLNX OEM CMD 0x0032 (get debug info) that the address is 4-byte aligned.

Keywords: Address validation, 0x0032

Discovered in Version: 24.35.4030

Fixed in Release: 24.35.4506
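
The alignment requirement described for 4149392 above can be illustrated with a short check. This is a minimal Python sketch, not the firmware's implementation; the function name `is_4_byte_aligned` is hypothetical and chosen only for this example.

```python
def is_4_byte_aligned(address: int) -> bool:
    """Return True when a non-negative address is a multiple of 4.

    An address is 4-byte aligned when its two least significant
    bits are both zero, so masking with 0x3 must yield 0.
    """
    if address < 0:
        return False  # negative values are not valid addresses
    return (address & 0x3) == 0


if __name__ == "__main__":
    # Print the verdict for a few sample addresses.
    for addr in (0x1000, 0x1002, 0x1004):
        verdict = "aligned" if is_4_byte_aligned(addr) else "rejected"
        print(hex(addr), verdict)
```

Under this check, a request carrying an address such as 0x1002 would be rejected, while 0x1000 or 0x1004 would pass.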

3887759

Description: Fixed an issue that caused a Completion Timeout to be mistakenly treated as an Advisory Non-Fatal error. Completion Timeout is now treated as an uncorrectable error.

Keywords: Completion Timeout, Advisory Non-Fatal error

Discovered in Version: 24.35.3006

Fixed in Release: 24.35.4030

3817699

Description: Fixed an issue that caused the TX to hang and create a "TX timeout" error in dmesg after unplugging the device forcefully during server warm reboot.

Keywords: hotplug, virtio, nvme, warm reboot, TX timeout

Discovered in Version: 24.35.3006

Fixed in Release: 24.35.4030

3588590

Description: Fixed an issue on the customized server with an independent power supply that resulted in an assert with ext_synd as 0x4010 during a power cycle process.

Keywords: Virtio full emulation, independent power supply

Discovered in Version: 24.35.3006

Fixed in Release: 24.35.4030

3813815

Description: Fixed an issue that resulted in no traffic after vDPA live migration resume when using DOCA version 2.6.0 onwards.

Keywords: vDPA, Live Migration, DOCA

Discovered in Version: 24.35.3006

Fixed in Release: 24.35.4030

3588742

Description: Fixed an issue on a server with an independent power supply and hotplugged virtio devices that led to a timeout when power cycling the server. The following errors were reported: "DESTROY_GENERAL_OBJECT(0xa03)" and "MODIFY_GENERAL_OBJECT(0xa01)".

Keywords: Virtio full emulation, independent power supply

Discovered in Version: 24.35.3006

Fixed in Release: 24.35.4030

3745192 / 3895938

Description: Fixed an issue on the customized server with an independent power supply that led to non-functional virtio when power cycling the server during stressful traffic. The following error was reported: "DESTROY_GENERAL_OBJECT(0xa03) No done completion".

Keywords: Virtio full emulation, independent power supply

Discovered in Version: 24.35.3006

Fixed in Release: 24.35.4030

3670349

Description: Fixed an issue that prevented the Power Controller Control bit in the Slot Control register from returning to default when forcing the Unplug sequence.

Keywords: Power Controller Control

Discovered in Version: 24.35.2000

Fixed in Release: 24.35.3502

3670324

Description: Fixed an assert with ext_synd as 0x8ce5 during a power cycle process for a virtio case on the customized server with an independent power supply.

Keywords: virtio, assert

Discovered in Version: 24.35.2000

Fixed in Release: 24.35.3502

3217896

Description: Fixed the status code reported by the RDE PATCH operation when the property is "read-only".

Keywords: RDE

Discovered in Version: 24.35.1012

Fixed in Release: 24.35.2000

3241357

Description: Fixed an issue in MCTP-over-PCIe where the target ID was not set to 0x0 in VDM messages of type Route-to-Root Complex.

Keywords: MCTP-over-PCIe, VDM message

Discovered in Version: 24.35.1012

Fixed in Release: 24.35.2000

3250924

Description: Fixed an issue that resulted in the NVMe driver not loading on a hotplug-able device when VIRTIO_EMULATION_HOTPLUG_TRANS was enabled by mlxconfig.

Keywords: VIRTIO_EMULATION_HOTPLUG_TRANS, NVMe

Discovered in Version: 24.35.1012

Fixed in Release: 24.35.2000

3215393

Description: Fixed an issue that caused the virtual QoS mechanism to prevent traffic from reaching the full line rate of 200GbE in each direction when LAG was enabled.

Keywords: Virtual QoS mechanism, 200GbE, LAG

Discovered in Version: 24.35.1012

Fixed in Release: 24.35.2000

3023205

Description: Fixed an issue that resulted in:

  • data corruption

  • NVMF queue getting stuck, following backend controller timeout and RNR NAK exceeded syndrome on initiator

due to incorrect access to the internal NVMF Backend Controller database, caused by wrong data in the database.

Keywords: data_corruption

Discovered in Version: 24.33.1048

Fixed in Release: 24.34.1002

3101645

Description: Fixed an issue that caused some requests to get stuck in the pacer, preventing the detach NVMF namespace command from progressing, when the pacer was configured to use the byte-in-flight limitation mechanism and an NVMF backend controller timeout occurred under traffic.

Keywords: CMD queue

Discovered in Version: 24.33.1048

Fixed in Release: 24.34.1002

3021669

Description: Added a new NVconfig parameter “MULTI_PCI_RESOURCE_SHARE” to support modes that allow choosing the utilization of the card's resources on each host in Socket-Direct / Multi host setup.

Keywords: Performance

Discovered in Version: 24.33.1048

Fixed in Release: 24.34.1002

2665773

Description: Added a 50 µs delay during PML1 exit to avoid PCIe replay timer timeouts.

Keywords: PCIe, PML1

Discovered in Version: 24.33.1048

Fixed in Release: 24.34.1002

3134894

Description: Fixed an issue where set_flow_table_entry failed when aso_flow_meter action was used.

Keywords: ASO Flow Meter, FW Steering

Discovered in Version: 24.33.1048

Fixed in Release: 24.34.1002

3039007

Description: Enabled Multi-Host RX Rate-limiter configuration via the QEEC mlxreg and the max_shaper_rate field.

Keywords: RX Rate-Limiter, Multi-host

Discovered in Version: 24.33.1048

Fixed in Release: 24.34.1002

3059379

Description: Added the "Command Unsupported" response code for cases where the MCTP control command "Get Vendor Defined Messages Supported" is run and there are no supported VDMs.

Keywords: MCTP control command

Discovered in Version: 24.33.1048

Fixed in Release: 24.34.1002

2994292

Description: Fixed a race condition that occurred between the duplicate read and QP commands (2RST, 2ERR and Destroy) in the signature, which caused the command to hang.

Keywords: Race condition

Discovered in Version: 24.33.1048

Fixed in Release: 24.34.1002

3068927

Description: Added support for dynamic MSI-X when in SmartNIC mode.

Keywords: Dynamic MSI-X

Discovered in Version: 24.33.1048

Fixed in Version: 24.34.1002

3110286

Description: Fixed an issue where vPort counters had wrong values.

Keywords: vPort counters

Discovered in Version: 24.33.1048

Fixed in Version: 24.34.1002

2887966

Description: A hardware issue in the illegal_flowq that raises even when there is a drop results in the adapter card getting stuck during high-scale traffic.

Keywords: Performance

Discovered in Version: 24.33.1048

Fixed in Version: 24.34.1002

3056461

Description: Creating a Channel Service Object with bad parameters that lead to command rollback results in the command getting stuck.

Keywords: Open SNAPI

Discovered in Version: 24.33.1048

Fixed in Version: 24.34.1002

2702118

Description: Changed the policy of VDPA queue number capability.

  • When the device count is <= 8, the VDPA queue number in cap is 256

  • When the device count is >= 32, the VDPA queue number in cap is 64

  • When the device count is in the range 9~31, the VDPA queue number in cap is 128

Here, the device count covers all port functions configured in mlxconfig, including PF, VF and SF.

Keywords: VDPA

Discovered in Version: 24.32.1010

Fixed in Version: 24.33.1048
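
The queue-number policy described for 2702118 above can be summarized as a simple lookup. The following is a minimal Python sketch of the three thresholds as stated in the description; the function name `vdpa_queue_cap` is hypothetical, and the handling of negative input is an assumption made only so the example is well defined.

```python
def vdpa_queue_cap(device_count: int) -> int:
    """Map the configured device count to the VDPA queue number in cap.

    device_count is assumed to include all port functions configured
    in mlxconfig (PF, VF and SF), as stated in the description.
    """
    if device_count < 0:
        # Assumption for this sketch: negative counts are invalid input.
        raise ValueError("device_count must be non-negative")
    if device_count <= 8:
        return 256
    if device_count >= 32:
        return 64
    return 128  # device counts 9 through 31


if __name__ == "__main__":
    # Show the reported capability on both sides of each threshold.
    for count in (8, 9, 31, 32):
        print(count, vdpa_queue_cap(count))
```

The boundary values 8, 9, 31 and 32 are the ones worth checking, since they sit on either side of each threshold.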

2991255

Description: Fixed an issue that caused the host driver to hang when the received packet was bigger than the receive buffer (according to the device's MTU), as the device reported a packet length bigger than the receive buffer length.

Keywords: virtio net RX packet, RX buffer

Discovered in Version: 24.32.1010

Fixed in Version: 24.33.1048

3021734

Description: Fixed an issue that caused BMC to fail to detect the PCIe device when using the MCTP-over-PCIe protocol.

Keywords: BMC, MCTP-over-PCIe protocol, PCIe

Discovered in Version: 24.32.1010

Fixed in Version: 24.33.1048

2802943

Description: Implemented SLD detection code. Surprise Down Error Reporting Capable value was changed from 1 to 0 in boards where the downstream perst was not controlled thus causing SLD detection not to function properly.

Keywords: SLD detection, Surprise Down Error Reporting Capable

Discovered in Version: 24.32.1010

Fixed in Version: 24.33.1048

2513453

Description: Fixed a rare lane skew issue that caused the CPU to time out in Rec.idle.

Keywords: PCIe

Discovered in Version: 24.32.1010

Fixed in Version: 24.33.1048

2932436

Description: Optimized the virtio data path to reach line speed for Tx bandwidth.

Keywords: VDPA, virtio full emulation

Discovered in Version: 24.32.1010

Fixed in Version: 24.33.1048

2907707

Description: Fixed a configuration issue which flipped the MSB of Partition Key field in CNP packets and led to P_KEY mismatch between CNP packets and regular packets.

Keywords: Partition Key, PKEY, CNP, ECN

Discovered in Version: 24.32.1010

Fixed in Version: 24.33.1048
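
To illustrate why the flipped MSB described for 2907707 above produces a mismatch, the sketch below flips the top bit of a P_Key value and compares the result with the original. It is a minimal Python illustration and not the firmware's logic: it assumes the 16-bit P_Key field defined by InfiniBand, the helper names `flip_msb` and `pkeys_match` are hypothetical, and the exact 16-bit comparison is a simplification chosen for this example.

```python
PKEY_MASK = 0xFFFF  # assumed 16-bit P_Key field
PKEY_MSB = 0x8000   # most significant bit of that field


def flip_msb(pkey: int) -> int:
    """Return the P_Key with its most significant bit inverted."""
    return (pkey ^ PKEY_MSB) & PKEY_MASK


def pkeys_match(a: int, b: int) -> bool:
    """Simplified check (assumption): both 16-bit values must be identical."""
    return (a & PKEY_MASK) == (b & PKEY_MASK)


if __name__ == "__main__":
    regular_pkey = 0xFFFF               # sample P_Key carried by regular packets
    cnp_pkey = flip_msb(regular_pkey)   # value a CNP packet would carry
    print(hex(regular_pkey), hex(cnp_pkey), pkeys_match(regular_pkey, cnp_pkey))
```

With the sample value 0xFFFF, the flipped value is 0x7FFF, so the two packets no longer carry the same P_Key.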
