Bug Fixes History

Internal Ref.

Issues

3326692

Description: Wrap-around of the time_since_last_clear counter caused incorrect reporting of counters on the port.

Keywords: Counters

Discovered in Version: 27.2010.3118

Fixed in Version: 27.2010.6102

3436317

Description: On rare occasions, when a SHARP QP exceeds the allowed number of retries, the switch may hang due to an incorrect flow execution.

Keywords: SHARP

Discovered in Version: 27.2010.2300

Fixed in Version: 27.2010.6102

3283303/3298590

Description: In the rare event of an error burst, link maintenance stopped working.

Keywords: Link Maintenance

Discovered in Version: 27.2010.3118

Fixed in Version: 27.2010.6064

3339363

Description: The pFRN notification state machine halted in a busy-wait on all RISCs because TX credits could not be freed.

Keywords: pFRN

Discovered in Version: 27.2010.3118

Fixed in Version: 27.2010.6064

3301825

Description: The firmware did not return values for the counters "PortSwLifetimeLimitDiscards" and "PortSwHOQLifetimeLimitDiscards". Support for these counters has now been added.

Keywords: Counters

Discovered in Version: 27.2010.3118

Fixed in Version: 27.2010.5042

3335002

Description: The pFRN mirror v1 header pad count showed an invalid padding size.

Keywords: pFRN

Discovered in Version: 27.2010.4010

Fixed in Version: 27.2010.5042

3261861

Description: Connecting an HDR device to an NDR device with optical cables longer than 30m caused bandwidth degradation.

Keywords: HDR-to-NDR

Discovered in Version: 27.2010.4102

Fixed in Version: 27.2010.5002

3269531

Description: After multiple MSPS (Management System Power Supply register) calls, the switch gets stuck.

Keywords: MSPS

Discovered in Version: 27.2010.3118

Fixed in Version: 27.2010.5002

3199650

Description: A physical link failure between switches while a SHARP job is running and utilizing the link can cause one of the switches to become invalid for further SHARP jobs. This can result in either a "No resource" response for new SHARP job requests or in jobs getting stuck.
The bug fix requires SHARP version 3.2.

Keywords: SHARP

Discovered in Version: 27.2010.4010

Fixed in Version: 27.2010.4102

3245821

Description: On an AR group table set request, the ARN mask was flushed for a group that has an active pFRN timer.

Keywords: pFRN

Discovered in Version: 27.2010.4010

Fixed in Version: 27.2010.4102

3253717

Description: The mask_force_clear_timeout timer in the pFRN feature was not functional (the mask was not cleared when the timer expired).

Keywords: pFRN

Discovered in Version: 27.2010.4010

Fixed in Version: 27.2010.4102

3242209

Description: The Set pFRN MAD did not return an error on invalid input in the mask_clear_timer and mask_force_clear_timer fields.

Keywords: pFRN

Discovered in Version: 27.2010.4010

Fixed in Version: 27.2010.4102

3174239

Description: On rare occasions, traps were not properly repressed, which caused redundant traps to be sent multiple times.

Keywords: Traps

Discovered in Version: 27.2010.3118

Fixed in Version: 27.2010.4010

2998597

Description: Bandwidth degradation may be visible in large scale random traffic patterns (e.g., all2all and Adaptive Routing) due to wrong fast path configurations.

Keywords: Performance

Discovered in Version: 31.2008.2500

Fixed in Version: 31.2010.3004

3040232

Description: PLFT mapping for SMA port (port 0) was configured in a way that caused PLFT of FDB 0 to be used instead of PLFT of FDB 1.

Keywords: PLFT, SMA

Discovered in Version: 27.2010.2110

Fixed in Version: 27.2010.2246

2646440

Description: I2C bus got stuck in start state.

Keywords: I2C

Discovered in Release: 27.2008.2102

Fixed in Release: 27.2010.2036

2709851

Description: In some cases, traps generated on a link state change may not be sent to the SM due to incorrect logic in the firmware link state machine.

Keywords: SM Traps

Discovered in Version: 27.2008.2500

Fixed in Release: 27.2010.1128

2635607

Description: SM timeouts on PortInfo MAD SET may occur when the Operational VLs are decreased (for example, when running different SMs with different op_vl configurations) due to incorrect firmware logic for per-VL buffer allocation. The fix first shrinks the VLs whose buffers must decrease in size and then enlarges the ones that must increase in size.

Keywords: SM, Operational VL, Timeout

Fixed in Release: 27.2008.3328
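
The ordering described for issue 2635607 (shrink first, then enlarge) can be pictured with a small Python sketch. This is only an illustration of the ordering idea, not the firmware implementation: the function name resize_vl_buffers, the shared pool_size, and the dictionary layout are all assumptions made for the example.

```python
def resize_vl_buffers(current, target, pool_size):
    """Return ordered resize steps that move `current` to `target`.

    `current` and `target` map a VL number to its buffer size (hypothetical
    units). Shrinks are emitted before grows, so the running total never
    exceeds `pool_size` at any intermediate step.
    """
    if sum(target.values()) > pool_size:
        raise ValueError("target allocation exceeds the shared pool")

    alloc = dict(current)
    vls = sorted(set(current) | set(target))
    steps = []

    # Pass 1: release space from every VL whose buffer must shrink.
    for vl in vls:
        new_size = target.get(vl, 0)
        if new_size < alloc.get(vl, 0):
            alloc[vl] = new_size
            steps.append(("shrink", vl, new_size))

    # Pass 2: only now enlarge the VLs that need more space.
    for vl in vls:
        new_size = target.get(vl, 0)
        if new_size > alloc.get(vl, 0):
            alloc[vl] = new_size
            steps.append(("grow", vl, new_size))

    return steps


# Going from four VLs of 4 units each to two VLs of 8 units each:
steps = resize_vl_buffers({0: 4, 1: 4, 2: 4, 3: 4}, {0: 8, 1: 8}, pool_size=16)
# steps == [("shrink", 2, 0), ("shrink", 3, 0), ("grow", 0, 8), ("grow", 1, 8)]
```

Growing VL 0 before VLs 2 and 3 are released would need 20 units from a 16-unit pool, which is the kind of ordering problem the fix description points at.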

2578261

Description: In rare cases, on FR4 CMIS MMS1W50-HM, unplugging and plugging the module during link up flow may cause the link to get stuck on "Polling".

Keywords: Cables, FR4

Discovered in Version: 27.2008.2402

Fixed in Release: 27.2008.3328

2700834

Description: A division-by-zero issue in the uC code caused an infinite loop. uC database alignment now prevents memory corruption that resulted from illegal access to neighboring lanes.

Keywords: Memory

Fixed in Release: 27.2008.3328

2627108

Description: Setting SHARP QuotaConfig with tree_if higher than 95 resulted in a buffer overrun and could lead to zombie jobs on the switch.

Keywords: SHARP

Discovered in Version: 27.2008.2500

Fixed in Release: 27.2008.3328

2483974

Description: Configuring a split port with mlxconfig from MFT 4.15 resulted in configuring the incorrect ports on the unmanaged switch. The issue was fixed in version 27.2008.3100. Make sure to use MFT 4.15 or later.

Keywords: MFT, Port Split

Fixed in Release: 27.2008.3328

2646158

Description: In some cases, traps that are sent when there is a change in link state may not be sent to SM due to a race between trap generation and trap repress. The solution ensures that the latest information will always be sent to SM.

Keywords: SM Traps

Discovered in Version: 27.2008.2500

Fixed in Release: 27.2008.3328

2668318

Description: In SHARP, when a QP is reused for a son after the Set Parent flow used it as a father, the father bit indication might remain set in the QP and the Resource Cleanup flow may fail. The solution resets the QPC entry in the QPAlloc flow.

Keywords: SHARP

Discovered in Version: 27.2008.2500

Fixed in Release: 27.2008.3328

2697623

Description: In SHARP, during the Set Parent flow, a misconfiguration in the TX domain caused credits to return to the wrong hardware unit.

Keywords: SHARP

Discovered in Version: 27.2008.2500

Fixed in Release: 27.2008.3328

2712117

Description: In SHARP, the switch may hang on a locked semaphore due to a misconfiguration in the streaming aggregation TreeConfig MAD while ports are toggling.

Keywords: SHARP

Discovered in Version: 27.2008.2500

Fixed in Release: 27.2008.3328

2571800

Description: New SHARP jobs may hang after abrupt termination of SHARP jobs.

Keywords: SHARP

Discovered in Version: 27.2008.2402

Fixed in Release: 27.2008.2500

2579752

Description: Modules failed over 400 kHz. The default I2C frequency has now been set to 100 kHz for all modules.

Keywords: Modules, I2C

Discovered in Version: 27.2008.2102

Fixed in Release: 27.2008.2402

2439961

Description: The IsPLRMaxRetransmissionRateSupported and IsEffectiveCounterSupported counters were incorrectly added to the Virtual Port in the IB switch.

Keywords: Counters

Discovered in Version: 27.2008.2300

Fixed in Release: 27.2008.2402

2445274

Description: Packet bandwidth does not spread according to the VL Arbitration configuration in split ports.

Keywords: VL Arbitration, Split Ports

Discovered in Version: 27.2008.2102

Fixed in Version: 27.2008.2402

2441016

Description: In rare cases, SHARP jobs may fail, followed by multiple "SHARP error" traps. When this occurs, subsequent jobs on the same tree may fail as well.

Keywords: SHARP

Fixed in Version: 27.2008.2402

2323467

Description: 32-bit counters per SL or VL wrongly overflowed at 16 bits instead of 32 bits.

Keywords: Counters

Discovered in Version: 27.2008.1904

Fixed in Release: 27.2008.2300
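
To make the symptom of issue 2323467 concrete, the following Python sketch shows how the same increment behaves on a counter that wraps at 16 bits versus one that wraps at 32 bits. It is purely illustrative: the helper name increment and its parameters are assumptions for the example, and the source does not describe how the firmware stores these counters.

```python
def increment(value, delta, width_bits):
    """Add `delta` to a counter that wraps at 2**width_bits."""
    mask = (1 << width_bits) - 1
    return (value + delta) & mask


# A counter sitting at the 16-bit maximum:
at_16_bit_max = 0xFFFF

# Wrapping at 16 bits loses the count.
wrapped_early = increment(at_16_bit_max, 1, width_bits=16)   # 0

# Wrapping at 32 bits keeps counting as expected.
wrapped_late = increment(at_16_bit_max, 1, width_bits=32)    # 65536
```

A reader polling such a counter would see it drop back to a small value after only 65,535 events instead of after roughly 4.29 billion.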

2373063

Description: Packet bandwidth does not spread according to the VL Arbitration configuration on 4x port.

Keywords: VL Arbitration

Fixed in Release: 27.2008.2202

2384211

Description: PKEY may return with a value of zero when sending aggregation class MADs to an aggregation node.

Keywords: PKEY

Discovered in Version: 27.2008.2102

Fixed in Release: 27.2008.2202

2395304

Description: When running non-SHARP traffic, packet drop may occur when SHARP is enabled.

Keywords: SHARP

Fixed in Release: 27.2008.2202

2196422

Description: On rare occasions, due to a suboptimal configuration of the NVIDIA Rx clock tracking, a link with challenging signal integrity resulted in link failures.

Keywords: Rx clock tracking

Discovered in Version: 27.2008.0232

Fixed in Release: 27.2008.1904

1848091

Description: Although the effective BER (after FEC) is expected to meet our design targets (e.g. 10e-14 or lower), occasionally it may be higher.

Keywords: Cables

Discovered in Version: 27.2000.2708

Fixed in Release: 27.2008.0232

2073222

Description: In rare cases, HDR active copper cable link up time might be higher than expected (up to 2 minutes).

Keywords: Cables

Discovered in Version: 27.2000.3276

Fixed in Release: 27.2008.0232

2169355

Description: The TCA port (ports 41/81) counter returned a non-zero value because the TCA counters were not supported.

Keywords: SHARP, TCA, Port Counters

Discovered in Version: 27.2007.0618

Fixed in Release: 27.2008.0232

2136877

Description: Port Counters with the "all_ports" attribute returned wrong values because the TCA counters were not supported.

Keywords: TCA, Port Counters

Discovered in Version: 27.2007.0618

Fixed in Release: 27.2008.0232

2133393

Description: On rare occasions, when a link is flapping or is toggled by the user, the switch may hang.

Keywords: Link Flapping

Discovered in Version: 27.2007.0618

Fixed in Release: 27.2008.0232

2122186

Description: Traffic loss may be experienced during a spine failover, when two SHARP (SAT) flows are enabled.

Keywords: InfiniBand; SHARP (SAT)

Discovered in Version: 27.2007.0618

Fixed in Release: 27.2008.0232

2063786

Description: Running 2 flows in parallel is currently not functional in SHARP (SAT).

Keywords: SHARP (SAT), 2 flows

Discovered in Version: 27.2000.3276

Fixed in Release: 27.2007.0618

1972573

Description: Reading the Serial Number by the MSPS register is not functional on the new Delta PSU model.

Keywords: Delta PSU model, MSPS register

Discovered in Version: 27.2000.2708

Fixed in Release: 27.2007.0618

1970878

Description: When using NVIDIA AOC cables longer than 50m, use one VL to achieve full wire speed.

Keywords: Cables

Fixed in Release: 27.2007.0618

2022524

Description: As the switch does not send auto-negotiation indication, after resetting/power cycling a ConnectX-6 HCA, some HCAs get stuck in "polling" state.

Keywords: Auto-negotiation, HCA, switch

Discovered in Version: 27.2000.2708

Fixed in Release: 27.2007.0300

1996051

Description: After performing a software reset on the switch while using an Active Copper Cable or Optics Cable, the link gets high BER and is not available for traffic forwarding.

Keywords: Cables, BER

Discovered in Version: 27.2000.2708

Fixed in Release: 27.2007.0300

2036930

Description: Degradation in throughput might be experienced when using HDR100 cables with a length of 30m and above.

Keywords: Cables, Bandwidth

Discovered in Version: 27.2000.2708

Fixed in Release: 27.2000.3276

1946287

Description: Fixed an issue that resulted in SHARP jobs getting stuck after stopping a job during SAT operation.

Keywords: SHARP

Discovered in Version: 27.2000.2306

Fixed in Release: 27.2000.2626

1778566

Description: Fixed an issue that caused the Rx buffers allocation after running OpenSM to be based on the default VLCap configuration instead of the Operational VL configuration.

Keywords: Rx buffers allocation, OpenSM

Discovered in Version: 27.2000.2306

Fixed in Release: 27.2000.2626

1930686

Description: Fixed an issue that caused a multicast packet to be forwarded to a wrong port when the switch was configured to use the Split mode.

Keywords: Switch multicast forwarding

Discovered in Version: 27.2000.2182

Fixed in Release: 27.2000.2626

1761271

Description: CWDM4 AOM cable is currently not supported on Quantum switch systems.

Keywords: Modules/Cables

Discovered in Version: 27.2000.1400

Fixed in Release: 27.2000.2626

1713747

Description: When using splitter HDR optical cables, toggling the upper port causes the lower port to be toggled as well.

Keywords: Cables, port toggling

Discovered in Version: 27.2000.2046

Fixed in Release: 27.2000.2626

1834740

Description: Fixed an issue that resulted in high BER when using optical module with module firmware older than 37.50.316.

Keywords: Optical cables, BER, cables firmware

Discovered in Version: 27.2000.2182

Fixed in Release: 27.2000.2306

1899441

Description: Fixed an issue that caused packets to be transmitted from a wrong output port. A wrong configuration of the packet classification decision in the switch forwarding database cache key caused both AR-eligible and AR-ineligible packets to hit the same cache entry.

Keywords: Switch forwarding, Adaptive Routing

Discovered in Version: 27.2000.2046

Fixed in Release: 27.2000.2182
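
One way to picture the cache collision in issue 1899441 is the Python sketch below. It assumes a forwarding cache keyed either by destination only or by destination plus the AR-eligibility classification. The function forward, the dlid parameter, the key_includes_class flag, and the port names are hypothetical and do not reflect the actual switch data structures.

```python
def forward(cache, dlid, ar_eligible, key_includes_class):
    """Return the output port for a packet, consulting a small cache."""
    # The key either ignores or includes the AR-eligibility classification.
    key = (dlid, ar_eligible) if key_includes_class else (dlid,)
    if key not in cache:
        # Hypothetical decision: AR-eligible packets take an adaptive port,
        # all other packets take the static port.
        cache[key] = "adaptive_port" if ar_eligible else "static_port"
    return cache[key]


# Key without the classification: the second packet reuses the first entry.
shared = {}
first = forward(shared, 7, ar_eligible=True, key_includes_class=False)
second = forward(shared, 7, ar_eligible=False, key_includes_class=False)
# first == "adaptive_port", second == "adaptive_port" (wrong port)

# Key with the classification: each packet class gets its own entry.
split = {}
forward(split, 7, ar_eligible=True, key_includes_class=True)
correct = forward(split, 7, ar_eligible=False, key_includes_class=True)
# correct == "static_port"
```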

1885460

Description: On rare occasions, and under high SHARP load, switch SHARP operation might get stuck.

Keywords: SHARP

Discovered in Version: 27.2000.2046

Fixed in Release: 27.2000.2182

1859715

Description: The bandwidth on MFS1S00-H050E cables is 99Gb/s and on MFS1S00-H100E cables is 67Gb/s when connecting at HDR speed to an HDR switch.

Keywords: Cables

Discovered in Version: 27.2000.1886

Fixed in Release: 27.2000.2046

1797452

Description: A port may hang while Link-Maintenance runs on it and the second port’s link is toggled.

Keywords: Link-Maintenance, port toggling

Discovered in Version: 27.2000.1600

Fixed in Release: 27.2000.1886

1698990

Description: HDR link up time when using optical cables may take 6 minutes or more (up to 20 minutes).

Keywords: HDR, optical cables, link up times

Discovered in Version: 27.2000.1100

Fixed in Release: 27.2000.1886

1718734/1723236/1718645/1710631

Description: On rare occasions, an HDR link may not come up properly when using optical cables.

Keywords: HDR link

Discovered in Version: 27.2000.1012

Fixed in Release: 27.2000.1600

1774870

Description: Link flapping and packet loss occurred during high/low temperature changes.

Keywords: Link

Discovered in Version: 27.2000.1400

Fixed in Release: 27.2000.1600

1778837

Description: When using a copper splitter cable up to 2m length in HDR100 mode, traffic may drop.

Keywords: Cable, HDR100

Discovered in Version: 27.2000.1400

Fixed in Release: 27.2000.1600

1534459

Description: When working with 8 VLs, TP does not function due to the buffer configuration.

Keywords: VLs, latency, performance

Discovered in Version: 27.2000.1100

Fixed in Release: 27.2000.1400

1605587

Description: Fixed an issue that caused the green port LED to blink at the same frequency regardless of the link speed rate set.

Keywords: Port LED

Discovered in Version: 27.1910.0618

Fixed in Release: 27.2000.1142

1598550

Description: Fixed an issue that prevented the port from being split when the request (command) was sent from the NV config tool.

Keywords: Split Port

Discovered in Version: 27.1910.0618

Fixed in Release: 27.1910.0620

© Copyright 2023, NVIDIA. Last updated on Sep 9, 2023.