NVIDIA Quantum-2 Firmware Release Notes v31.2012.1068

Bug Fixes History

The following table provides a list of bugs fixed in version 31.2012.1024.

Internal Ref.

Issues

3554182

Description: Link does not come up with second-source MMS4X00-NS transceivers.

Keywords: Cables, link up

Discovered in Version: 31.2010.6064

Fixed in Version: 31.2012.1024

3538638

Description: The message of code 57 in the PDDR Troubleshooting information page was incorrect.

Keywords: Link Diagnostics

Discovered in Version: 31.2010.6064

Fixed in Version: 31.2012.1024

3407038

Description: An unresponsive PSU client can cause the SDA I2C line to hang.

Keywords: I2C

Discovered in Version: 31.2010.6064

Fixed in Version: 31.2012.1024

3477039

Description: Wrong RTT value is exposed under PRTL PRM.

Keywords: Registers, RTT Value

Discovered in Version: 31.2010.6064

Fixed in Version: 31.2012.1024

3481394

Description: When trying to choose the threshold for the Fast Recovery feature (BER Config), it is possible that threshold 0 will be loaded.

Keywords: Fast Recovery, BER Configuration

Discovered in Version: 31.2010.6064

Fixed in Version: 31.2012.1024

3499997

Description: In some cases, the combination of SHARP SAT traffic and SHARP MADs can cause the switch to get stuck.

Keywords: SHARP

Discovered in Version: 31.2010.4210

Fixed in Version: 31.2012.1024

3451519

Description: When using ibdiagnet, an incorrect module alarm type was reported.

Keywords: ibdiagnet, Module Temperature Alarm Type

Discovered in Version: 31.2010.5108

Fixed in Version: 31.2012.1024

The following table provides a list of bugs fixed in previous versions. For a list of bugs fixed in the current version, see Bug Fixes.

Internal Ref.

Issues

3326692

Description: Wrap-around of the time_since_last_clear counter caused incorrect reporting of counters on the port.

Keywords: Counters

Discovered in Version: 31.2010.3118

Fixed in Version: 31.2010.6102

3389432

Description: The flint firmware burning process might take longer than expected, possibly leading to SM timeouts and logical link drops by the SM, which, in turn, may cause the flint burn command to fail.

Keywords: SM, Timeout, Flint, Failure

Discovered in Version: 31.2010.6064

Fixed in Version: 31.2010.6102

3339363

Description: The pFRN notification state machine halted in busy-wait on all RISCs because it was unable to free TX credits.

Keywords: pFRN

Discovered in Version: 31.2010.3118

Fixed in Version: 31.2010.6064

3393378

Description: In some cases, pFRN configuration over multi-SWID caused out-of-bound access to an array and overran FLID configuration.

Keywords: pFRN

Fixed in Version: 31.2010.6064

3342918

Description: On rare occasions, the port might get stuck (in all speeds) during the link up flow when using optical modules.

Keywords: Port Link Up, Port Toggling, Optical Modules

Fixed in Version: 31.2010.6064

3395821

Description: Bandwidth is lower than expected on MMS4X00-NL-QP1 cable.

Keywords: MMS4X00-NL-QP1, Bandwidth

Fixed in Version: 31.2010.6064

2824249

Description: After a firmware update failure, the bad image was not erased.

Keywords: Installation, Firmware

Discovered in Version: 31.2010.2036

Fixed in Version: 31.2010.6064

3362685

Description: In QM9700 systems, when a transceiver module is plugged in while only one of the optic cables is connected (and the second cable is disconnected), the port LED may be incorrectly displayed on the disconnected side.

Keywords: Port LED, Optic Cables

Discovered in Version: 31.2010.4102

Fixed in Version: 31.2010.5108

3377608

Description: When operating in dynamic trees allocation mode, MAD error responses might be received in libsharp.

Keywords: sharp_am, libsharp

Fixed in Version: 31.2010.5108

3362200

Description: In rare cases involving traffic stress, unexpected hardware fast path behavior may occur, possibly leading to the switch firmware hanging when toggling the ports.

Keywords: Turbo Path

Discovered in Version: 31.2010.5002

Fixed in Version: 31.2010.5108

3301825

Description: The firmware does not return values for the counters "PortSwLifetimeLimitDiscards" and "PortSwHOQLifetimeLimitDiscards". Support has now been added for the counters.

Keywords: Counters

Discovered in Version: 31.2010.3118

Fixed in Version: 31.2010.5042

3335002

Description: pFRN mirror v1 header pad count showed an invalid padding size.

Keywords: pFRN

Discovered in Version: 31.2010.4010

Fixed in Version: 31.2010.5042

3269531

Description: After multiple MSPS (Management System Power Supply register) calls, the switch gets stuck.

Keywords: MSPS

Discovered in Version: 27.2010.3118

Fixed in Version: 27.2010.5002

3267152

Description: On NDR devices, when collecting BER data, the peer falls, causing the switch to hang.

Keywords: BER COLLECT

Discovered in Version: 31.2010.4102

Fixed in Version: 31.2010.5002

3261861

Description: Connecting an HDR device to an NDR device with optical cables longer than 30m causes bandwidth degradation.

Keywords: HDR-to-NDR

Discovered in Version: 31.2010.4102

Fixed in Version: 31.2010.5002

2974424

Description: The link does not come up on cables that perform polarity inversion.

Keywords: Cables, Polarity Inversion

Discovered in Version: 31.2010.3118

Fixed in Version: 31.2010.5002

3199650

Description: A physical link failure between switches while a SHARP job is running and utilizing the link can cause one of the switches to become invalid for further SHARP jobs. This can result in either a "No resource" response for new SHARP job requests or in jobs getting stuck.
The bug fix requires SHARP version 3.2.

Keywords: SHARP

Discovered in Version: 31.2010.4010

Fixed in Version: 31.2010.4102

3245821

Description: In case of an AR group table set request, the ARN mask is flushed for a group that has an active pFRN timer.

Keywords: pFRN

Discovered in Version: 31.2010.4010

Fixed in Version: 31.2010.4102

3253717

Description: The mask_force_clear_timeout timer in the pFRN feature was not functional (the mask was not cleared when the timer expired).

Keywords: pFRN

Discovered in Version: 31.2010.4010

Fixed in Version: 31.2010.4102

3242209

Description: The Set pFRN MAD did not return an error on invalid inputs in the mask_clear_timer and mask_force_clear_timer fields.

Keywords: pFRN

Discovered in Version: 31.2010.4010

Fixed in Version: 31.2010.4102

3143685

Description: The switch does not return the SN or PN when queried via mlxlink or ibdiagnet.

Keywords: SN, PN, mlxlink, ibdiagnet

Discovered in Version: 31.2010.2300

Fixed in Version: 31.2010.4010

3174239

Description: On rare occasions, traps were not properly repressed, which caused redundant traps to be sent multiple times.

Keywords: Traps

Discovered in Version: 31.2010.3118

Fixed in Version: 31.2010.4010

3002314

Description: On rare occasions, when a port is configured to mloop, toggling it may cause the link not to come up.

Keywords: Optic in Mloop

Discovered in Version: 31.2010.2110

Fixed in Version: 31.2010.3118

3127727

Description: On rare occasions, when an egress port is split in two, the egress port may get stuck due to a wrong Fast Path configuration.

Keywords: Switch Hang, Fast Path, Split

Discovered in Version: 31.2010.3004

Fixed in Version: 31.2010.3118

3082569

Description: In some traffic patterns involving small packets, the PortRcvErrors counter may mistakenly count events of local physical errors due to an internal flow in the hardware that involves link packets.

Keywords: Counters

Discovered in Version: 31.2010.2246

Fixed in Version: 31.2010.3004

3085427

Description: On rare occasions, SHARP semaphore may remain locked on a port following an event of a port link down or an application crash.

Keywords: SHARPv3

Discovered in Version: 31.2010.2036

Fixed in Version: 31.2010.3004

3011581

Description: On rare occasions, job failures with SharpError trap may be experienced as a result of previous jobs that have failed.

Keywords: SHARPv3

Discovered in Version: 31.2010.2036

Fixed in Version: 31.2010.3004

3000602

Description: After disconnecting an MMS4X00-NL* cable and connecting an Ultron cable to the same port, the port fails to link up.

Keywords: Cables

Discovered in Version: 31.2010.2110

Fixed in Version: 31.2010.2300

3060122

Description: If a link between a root switch and a non-root switch faults while a job is running, the next job run on the non-root switch may fail.

Keywords: SHARPv3

Discovered in Version: 31.2010.2036

Fixed in Version: 31.2010.2300

2923464

Description: When using the MMS4X00-NL optical module, on rare occasions a port at NDR speed may get stuck and remain in the Polling state.

Keywords: NDR, Optical Module

Discovered in Version: 31.2010.1404

Fixed in Version: 31.2010.2246

2859363

Description: When using NVIDIA Quantum-2 systems in Auto-Neg mode, NDR speed in one lane (1x) is not supported.

Keywords: Auto-Negotiation

Discovered in Version: 31.2010.1310

Fixed in Version: 31.2010.2246

3033131

Description: The number of flows changed from 2 to 1, as intended.

Keywords: SHARPv3

Discovered in Version: 31.2010.2110

Fixed in Version: 31.2010.2246

2972388

Description: Running of concurrent jobs may lead to states where jobs unexpectedly terminate or get stuck.

Keywords: SHARPv3

Discovered in Version: 31.2010.2036

Fixed in Version: 31.2010.2110

2982113

Description: On rare occasions, job resource cleanup may fail.

Keywords: SHARPv3

Discovered in Version: 31.2010.2036

Fixed in Version: 31.2010.2110

2971339

Description: During high load scenarios, performance degradation may be experienced.

Keywords: SHARPv3

Discovered in Version: 31.2010.2036

Fixed in Version: 31.2010.2110

2849215

Description: On NVIDIA Quantum-2 switches, when working with MFA7U10-H0xx cables, if one of the ports in a cage is disabled at the time of initialization by user configuration, reenabling the port will require toggling the link (i.e. enable → disable → enable).

Keywords: NVIDIA Quantum-2, Cables

Discovered in Version: 31.2010.1310

Fixed in Version: 31.2010.2036

2890632

Description: On NVIDIA Quantum-2 systems, changing the Optical module rate was not allowed.

Keywords: Optical Modules

Discovered in Version: 31.2010.1310

Fixed in Version: 31.2010.2036

2885798

Description: In NVIDIA Quantum-2 systems, effective errors may occur with short Copper cable MCP4Y10-N00B.

Workaround: N/A

Discovered in Version: 31.2010.1310

Fixed in Version: 31.2010.2036

2910161

Description: In the auto-negotiation flow with copper cables, toggling both sides of the port may, on rare occasions, cause the port to get stuck.

Keywords: Auto-Negotiation, Copper Cables

Discovered in Version: 31.2010.1310

Fixed in Version: 31.2010.2036

© Copyright 2023, NVIDIA. Last updated on Jun 16, 2024.