Appendix—NVLink Switch SSD Firmware Update Customer Guide
This document describes a rare issue in which the system SSD (NVMe device) becomes unresponsive during normal operation while traffic continues to run. It outlines the symptoms and affected components, and provides a detailed procedure for installing a firmware fix through the regular NVOS update mechanism.
When the issue occurs, the operating system (NVOS) error logs resemble the following output:
WARNING kernel: [2729682.680668] nvme nvme0: I/O 132 QID 3 timeout, aborting
WARNING kernel: [2729682.682007] nvme nvme0: Abort status: 0x0
WARNING kernel: [2729713.399278] nvme nvme0: I/O 132 QID 3 timeout, reset controller
WARNING kernel: [2729774.836562] nvme nvme0: I/O 21 QID 0 timeout, reset controller
ERR kernel: [2729795.339632] nvme nvme0: Device not ready; aborting reset, CSTS=0x1
WARNING kernel: [2729815.909129] nvme nvme0: Removing after probe failure status: -19
...
EXT4-fs error (device nvme0n1p3): ext4_get_inode_loc:4513: inode #531990: block 2097665: comm healthd: unable to read itable block
EXT4-fs error (device nvme0n1p3): ext4_get_inode_loc:4513: inode #534476: block 2097820: comm python3: unable to read itable block
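To check a saved log for these symptoms, a short script along the following lines may help. It is a minimal sketch: the patterns are derived only from the sample output above, the function name find_ssd_failure_lines is hypothetical, and real logs may vary in prefix or wording.

```python
import re

# Patterns derived from the sample log in this guide (an assumption:
# other NVOS releases may word these kernel messages differently).
SYMPTOM_PATTERNS = [
    re.compile(r"nvme nvme\d+: I/O \d+ QID \d+ timeout, (aborting|reset controller)"),
    re.compile(r"nvme nvme\d+: Device not ready; aborting reset"),
    re.compile(r"nvme nvme\d+: Removing after probe failure status: -?\d+"),
    re.compile(r"EXT4-fs error \(device nvme\d+n\d+p\d+\): .*unable to read itable block"),
]


def find_ssd_failure_lines(log_text):
    """Return the log lines that match any of the symptom patterns."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in SYMPTOM_PATTERNS)
    ]
```

An empty result does not prove the SSD is healthy; it only means none of the sample signatures appeared in the text that was scanned.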
Affected Part Number: VTPM24CEXI080-BM110006 (MEM000490)
Affected Firmware: CE00A400
Solution: The issue is fixed in SSD firmware version CE00A450.
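The affected combination can be expressed as a simple check. The sketch below uses only the part number and firmware versions listed above; the function name is_affected is hypothetical, and the substring match on the part number is an assumption made because the CLI output prints a vendor prefix before it.

```python
AFFECTED_PART_NUMBER = "VTPM24CEXI080-BM110006"
AFFECTED_FIRMWARE = "CE00A400"
FIXED_FIRMWARE = "CE00A450"


def is_affected(part_number, firmware):
    """True when the SSD is the affected part running the affected firmware.

    The part number is matched as a substring so that a value such as
    "Virtium VTPM24CEXI080-BM110006" is also recognized.
    """
    return (
        AFFECTED_PART_NUMBER in part_number
        and firmware.strip().upper() == AFFECTED_FIRMWARE
    )
```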
NVOS supports two options for updating SSD firmware:
Automatic installation of the SSD firmware as part of the NVOS upgrade—available from version 1.3.3 onward.
Manual installation using the installation script—available on any version of NVOS from 1.x.x.
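Which options apply to a given system depends on its NVOS version. The sketch below encodes the two rules above; it assumes purely numeric dotted version strings such as "1.3.3", and the function names are hypothetical.

```python
def parse_version(version):
    """Turn a dotted version string such as '1.3.3' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def available_methods(nvos_version):
    """List the SSD firmware update options for an NVOS 1.x.x version.

    Manual installation is always listed; automatic installation is
    added for NVOS 1.3.3 and later.
    """
    methods = ["manual"]
    if parse_version(nvos_version) >= (1, 3, 3):
        methods.insert(0, "automatic")
    return methods
```

Comparing tuples of integers rather than raw strings keeps a version such as 1.10.0 ordered after 1.3.3.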
Automatic Installation of the SSD Firmware as Part of the NVOS Upgrade
In NVOS systems, SSD (NVMe) firmware is installed automatically during the standard NVOS software upgrade process. No special user action is required.
The updated SSD firmware becomes active after the system power cycles, which happens as part of the NVOS upgrade flow.
If NVOS determines during the upgrade that an SSD firmware update is needed, the system power cycles itself regardless of whether the user requested a power cycle or a cold reboot.
Prerequisites
NVOS image from the 1.3.3 bundle or later
Verification
After the NVOS update and power cycle of the system, do the following:
Log in to the system.
Run the following command to check the SSD firmware version:
nv show platform firmware SSD
Expected output:
                 operational                     applied
part-number      Virtium VTPM24CEXI080-BM110006
actual-firmware  CE00A450
fw-source        N/A
actual-firmware should show CE00A450, indicating that the new firmware is active. If it does, the update was successful.
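The verification can also be scripted against captured command output. The following is a minimal sketch that assumes the output contains the field name actual-firmware followed by its value, as in the example above; the function names are hypothetical, and the exact layout of the CLI output on a given release is not guaranteed.

```python
import re

EXPECTED_FIRMWARE = "CE00A450"


def actual_firmware(show_output):
    """Return the value that follows 'actual-firmware', or None if absent."""
    match = re.search(r"actual-firmware\s+(\S+)", show_output)
    return match.group(1) if match else None


def update_succeeded(show_output):
    """True when the reported firmware is the fixed version."""
    return actual_firmware(show_output) == EXPECTED_FIRMWARE


# Possible usage on the switch (assumes Python and the nv CLI are available):
#   import subprocess
#   out = subprocess.run(["nv", "show", "platform", "firmware", "SSD"],
#                        capture_output=True, text=True).stdout
#   print(update_succeeded(out))
```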
Manual Installation Using the Installation Script
Prerequisites
SSH access to the system
Steps for Upgrade
(Optional) Check whether the current firmware version differs from CE00A450:
nv show platform firmware SSD
Example output:
                 operational                     applied
part-number      Virtium VTPM24CEXI080-BM110006
actual-firmware  CE00A400
fw-source        N/A
Upgrade to A450:
cd <directory>
sudo ./deploy_fw_A450.sh
The script will automatically power cycle the system after the firmware update.
Steps for Downgrade
(Optional) Check the current firmware version to confirm that the system is running CE00A450:
nv show platform firmware SSD
Example output:
                 operational                     applied
part-number      Virtium VTPM24CEXI080-BM110006
actual-firmware  CE00A450
fw-source        N/A
Downgrade to A400:
cd <directory>
sudo ./deploy_fw_A400.sh
The script will automatically power cycle the system after the firmware update.
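The choice between the two scripts can be summarized as follows. This sketch only builds the command and does not run it; the script names come from the steps above, while the function name deploy_command and the rule of skipping the run when the firmware already matches the target are assumptions for illustration.

```python
# Deployment scripts named in this guide, keyed by the firmware they install.
SCRIPTS = {
    "CE00A450": "deploy_fw_A450.sh",  # upgrade
    "CE00A400": "deploy_fw_A400.sh",  # downgrade
}


def deploy_command(current_firmware, target_firmware):
    """Return the command to run from the script directory, or None.

    None means the SSD already reports the target firmware, so no script
    run (and no resulting power cycle) is needed.
    """
    if target_firmware not in SCRIPTS:
        raise ValueError("no deployment script for " + target_firmware)
    if current_firmware == target_firmware:
        return None
    return ["sudo", "./" + SCRIPTS[target_firmware]]
```

Because each script power cycles the system when it finishes, returning None for a matching version avoids an unnecessary outage.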
Expected Behavior After Installation
The system will automatically power cycle after the firmware update.
Once the system is back online, the SSD should operate normally.
No additional steps are required from the customer after the update completes.
