NVIDIA BlueField-2 DPU Firmware Release Notes v24.42.1000

Bug Fixes History

Internal Ref.

Issue

3665350

Description: Fixed an issue on a customized server with an independent power supply that led to a virtio assert with ext_synd 0x8ce5 during a power cycle.

Keywords: virtio emulation, independent power supply

Discovered in Version: 24.39.2048

Fixed in Release: 24.41.1000

3798733

Description: Fixed an issue that caused traffic to malfunction after performing Live Migration with ingress traffic in the vDPA over VFE scenario.

Keywords: virtio, vDPA over VFE, Live Migration

Discovered in Version: 24.39.2048

Fixed in Release: 24.41.1000

3555832

Description: Fixed an issue that caused traffic failure when the VIRTIO_NET_F_MRG_RXBUF bit was modified for the VDPA device while traffic was running.

Keywords: VDPA, MRG_RXBUF

Discovered in Version: 24.39.2048

Fixed in Release: 24.41.1000
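
For readers unfamiliar with the bit named in issue 3555832, the following is a minimal sketch of how a feature mask can be tested for VIRTIO_NET_F_MRG_RXBUF. The bit position (15) comes from the virtio specification; the helper names and the sample mask are hypothetical and do not correspond to any NVIDIA tool or firmware interface.

```python
# VIRTIO_NET_F_MRG_RXBUF is feature bit 15 in the virtio specification.
VIRTIO_NET_F_MRG_RXBUF = 15


def has_feature(features: int, bit: int) -> bool:
    """Return True when the given feature bit is set in the mask."""
    return (features >> bit) & 1 == 1


def toggle_feature(features: int, bit: int) -> int:
    """Return a copy of the mask with the given feature bit flipped."""
    return features ^ (1 << bit)


# Hypothetical feature mask with only the mergeable-buffer bit set.
features = 1 << VIRTIO_NET_F_MRG_RXBUF
print(has_feature(features, VIRTIO_NET_F_MRG_RXBUF))
print(has_feature(toggle_feature(features, VIRTIO_NET_F_MRG_RXBUF),
                  VIRTIO_NET_F_MRG_RXBUF))
```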

3771100

Description: Fixed an issue where querying a virtio q object returned the second mkey index even if it had not been set when the virtio q was created.

Keywords: VDPA, virtio, query object

Discovered in Version: 24.39.2048

Fixed in Release: 24.41.1000

3783686

Description: Fixed an issue on a customized server with an independent power supply that left virtio non-functional when the server was power cycled under heavy traffic. The following error was reported: "DESTROY_GENERAL_OBJECT(0xa03) No done completion".

Keywords: virtio full emulation, independent power supply

Discovered in Version: 24.39.2048

Fixed in Release: 24.41.1000

3691774

Description: Fixed an issue that resulted in traffic loss after performing Live Migration with the virtio vq "frozen-ready" feature.

Note: When the traffic load is high, and the vq frozen-ready cap is on, traffic loss might still be experienced after modifying the vq from suspend to ready mode.

Keywords: VDPA, live migration, virtio, resume

Discovered in Version: 24.39.2048

Fixed in Release: 24.41.1000
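
The "Discovered in Version" and "Fixed in Release" fields can be compared programmatically to check whether an installed firmware already includes a given fix. The sketch below assumes that versions order numerically, component by component, which these notes do not state explicitly; the function names are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Split a dotted version such as '24.41.1000' into integer components."""
    return tuple(int(part) for part in text.strip().split("."))


def contains_fix(installed: str, fixed_in: str) -> bool:
    """Assumption: a fix is present in its 'Fixed in Release' and in any later release."""
    return parse_version(installed) >= parse_version(fixed_in)


# Versions taken from the entries in these notes.
print(contains_fix("24.39.2048", "24.41.1000"))
print(contains_fix("24.42.1000", "24.41.1000"))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes when components have different digit counts.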

Internal Ref.

Issue

3634184

Description: Changed HW ETS (QETCR RL) default to be per host-port instead of per physical port to avoid bandwidth degradation.

Keywords: HW ETS

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000

3728130

Description: Fixed an issue that resulted in DESTROY_GENERAL_OBJECT(0xa03) and MODIFY_GENERAL_OBJECT(0xa01) timing out when performing a host power cycle with an independently powered BlueField-2 on which the virtio devices are hotplugged.

Keywords: Virtio full emulation, BlueField-2

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000

3708035

Description: Fixed an issue with the Selective-Repeat configuration that occasionally caused retransmission to wait for a timeout instead of being triggered by an out-of-sequence NACK.

Keywords: RoCE, SR

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000

3695219

Description: Enabled the lowest minimum rate for SW DCQCN to allow congestion control to handle a larger number of QPs without pauses or drops.

Keywords: Congestion control, PCC, DCQCN

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000

3609404

Description: Redirected multicast traffic to loopback only on MNG PF port using PT Tx loopback CAM HW mechanism.

Keywords: Multicast traffic, loopback, MNG PF

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000

3629353

Description: Fixed the cr_space in the port configuration to prevent wrong timestamps on CQEs.

Keywords: Hardware timestamp

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000

3547022

Description: Fixed an issue that resulted in reset failure when unloading network drivers on an external host while the sync1 reset was still reported as 'supported' although it was not.

Keywords: sync1 reset

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000

3534774

Description: Fixed an issue that prevented the Power Controller Control bit in the Slot Control register from returning to default when forcing the Unplug sequence.

Keywords: Power Controller Control

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000
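
As background for issue 3534774, the sketch below decodes the Power Controller Control bit from a 16-bit PCI Express Slot Control register value. It relies on the PCI Express specification placing that bit at position 10 (mask 0x0400, the value Linux names PCI_EXP_SLTCTL_PCC), with a set bit meaning power off. The function name and the sample register values are illustrative assumptions and were not read from a BlueField-2 device.

```python
# Bit 10 of the PCIe Slot Control register: Power Controller Control.
PCI_EXP_SLTCTL_PCC = 0x0400


def slot_power_requested_off(slot_control: int) -> bool:
    """Return True when the Power Controller Control bit is set (power off)."""
    return bool(slot_control & PCI_EXP_SLTCTL_PCC)


# Hypothetical register values, for illustration only.
for value in (0x0000, 0x0400, 0x17C0):
    print(f"0x{value:04x}: power off requested = {slot_power_requested_off(value)}")
```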

3602169

Description: Added a locking mechanism to protect the firmware from a race condition between parallel insertion and deletion of the same rule, which occasionally resulted in the firmware accessing memory that had already been released, causing an IOMMU / translation error.

Note: This fix will not impact insertion rate for tables owned by SW steering.

Keywords: Firmware steering

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000
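
The general idea behind issue 3602169, serializing insertion and deletion of the same rule so that neither operation observes a half-released entry, can be sketched as follows. This is a generic illustration and not the firmware's implementation, which these notes do not describe; the RuleTable class and its methods are hypothetical.

```python
import threading


class RuleTable:
    """Toy rule table where insert, delete, and lookup share one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._rules: dict[int, str] = {}

    def insert(self, rule_id: int, payload: str) -> None:
        with self._lock:
            self._rules[rule_id] = payload

    def delete(self, rule_id: int) -> bool:
        # Returns True only if the rule existed and was removed.
        with self._lock:
            return self._rules.pop(rule_id, None) is not None

    def lookup(self, rule_id: int):
        with self._lock:
            return self._rules.get(rule_id)


table = RuleTable()


def churn() -> None:
    # Repeatedly insert and delete the same rule from several threads.
    for _ in range(1000):
        table.insert(7, "forward")
        table.delete(7)


threads = [threading.Thread(target=churn) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Whatever the interleaving, the rule is either fully present or fully absent.
print(table.lookup(7) in ("forward", None))
```

A single table-wide lock is the simplest choice for a sketch; a real design might use finer-grained locking to limit the effect on insertion rate.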

3612682

Description: Enabled live migration for virtio with mergeable buffer.

Keywords: Virtio, Mergeable buffer, Live migration

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000

3571251

Description: Fixed an issue that resulted in migration data corruption when running parallel save_vhca_state/load_vhca_state commands on the same PF.

Keywords: VF live migration

Discovered in Version: 24.38.1002

Fixed in Release: 24.40.1000

Internal Ref.

Issue

3609404

Description: Redirected multicast traffic to loopback only on MNG PF port using PT Tx loopback CAM HW mechanism.

Keywords: Multicast traffic, loopback, MNG PF

Discovered in Version: 24.38.1002

Fixed in Release: 24.39.2048

3629353

Description: Fixed the cr_space in the port configuration to prevent wrong timestamps on CQEs.

Keywords: Hardware timestamp

Discovered in Version: 24.38.1002

Fixed in Release: 24.39.2048

3547022

Description: Fixed an issue that resulted in reset failure when unloading network drivers on an external host while the sync1 reset was still reported as 'supported' although it was not.

Keywords: sync1 reset

Discovered in Version: 24.38.1002

Fixed in Release: 24.39.2048

3534774

Description: Fixed an issue that prevented the Power Controller Control bit in the Slot Control register from returning to default when forcing the Unplug sequence.

Keywords: Power Controller Control

Discovered in Version: 24.38.1002

Fixed in Release: 24.39.2048

3602169

Description: Added a locking mechanism to protect the firmware from a race condition between parallel insertion and deletion of the same rule, which occasionally resulted in the firmware accessing memory that had already been released, causing an IOMMU / translation error.

Note: This fix will not impact insertion rate for tables owned by SW steering.

Keywords: Firmware steering

Discovered in Version: 24.38.1002

Fixed in Release: 24.39.2048

3612682

Description: Enabled live migration for virtio with mergeable buffer.

Keywords: Virtio, Mergeable buffer, Live migration

Discovered in Version: 24.38.1002

Fixed in Release: 24.39.2048

3571251

Description: Fixed an issue that resulted in migration data corruption when running parallel save_vhca_state/load_vhca_state commands on the same PF.

Keywords: VF live migration

Discovered in Version: 24.38.1002

Fixed in Release: 24.39.2048

Internal Ref.

Issue

3365411

Description: Fixed a link failure that occurred due to a wrong 'is_inphi_cable' indication.

Keywords: Link failure

Discovered in Version: 24.37.1300

Fixed in Release: 24.38.1002

3435583

Description: Under certain configurations, during the loading of the PXE driver in Smart-NIC mode, the firmware attempts to lock the CMAS resources in ICMC sets that are full. This results in the failure of the locking and raises a health buffer indication.

To prevent the above scenario, in this firmware version we improved the distribution of resource locking in ICMC.

Keywords: ICMC locking

Discovered in Version: 24.37.1300

Fixed in Release: 24.38.1002

3331179

Description: Improved token calculation.

Keywords: Token calculation

Discovered in Version: 24.37.1300

Fixed in Release: 24.38.1002

3491841

Description: Fixed a firmware assert that occurred when trying to verify whether the module supported "swap".

Keywords: Firmware assert

Discovered in Version: 24.37.1300

Fixed in Release: 24.38.1002

Internal Ref.

Issue

3432548

Description: Closed the attached QP doorbell to avoid any impact from the software side or the db_recovery mechanism.

Keywords: QP doorbell

Discovered in Version: 24.35.2000

Fixed in Release: 24.37.1300

3385129

Description: Fixed an issue that resulted in a high PTP offset by changing the RST value to 200 and adjusting the PTP Tx offset in the PTP4L configuration.

Keywords: PTP glitch, PTP constant offset

Discovered in Version: 24.35.2000

Fixed in Release: 24.37.1300

3233113

Description: Disabled some HW optimization to prevent a HW race that caused an SQ to get stuck.

Keywords: HW race, SQ

Discovered in Version: 24.33.1048

Fixed in Release: 24.37.1300

3306318

Description: Fixed an issue that caused virtio PXE boot to fail because the virtio BLK controller got stuck during continuous host warm reboots.

Keywords: virtio full emulation, PXE boot, warm reboot

Discovered in Version: 24.35.2000

Fixed in Release: 24.37.1300

3327847

Description: Fixed an issue that prevented the CNP received, handled, and ignored hardware counters from working after moving to Programmable Congestion Control mode.

Keywords: CNP, Programmable Congestion Control

Discovered in Version: 24.35.2000

Fixed in Release: 24.37.1300

© Copyright 2024, NVIDIA. Last updated on Aug 14, 2024.