NVIDIA NVOS User Manual for NVLink Switches v25.02.1884

Chassis Management

The chassis manager provides the user access to the following information:

Accessible Parameters    Description

switch temperatures      Displays system’s temperature
switch leakage           Displays leakage sensors' status
fan unit                 Displays system fans’ status
power unit               Displays system power consumers
Flash memory             Displays information about system memory utilization

Additionally, it monitors:

  • AC power to the PSUs

  • DC power out from the PSUs

  • Chassis failures

  • Leakage detection from the switch tray

The system health monitor scans the system to determine whether it is healthy. When the monitor discovers that one of the system's modules (leaf, spine, fan, or power supply) has entered an unhealthy state or has returned from one, it notifies users through the following methods:

  • System logs: accessible to the user at any time, because they are saved permanently on the system

  • Status LEDs: changed by the system health monitor when an error is found in the system and again when it is resolved
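The transition-based notification described above can be illustrated with a minimal Python sketch. This is not NVOS code: the class name, the module names, the boolean health states, and the LED colors are all hypothetical and chosen only to show the idea that a notification is raised when a module changes state, not on every scan.

```python
from dataclasses import dataclass, field


# Hypothetical model: names, states, and LED colors are illustrative
# assumptions, not NVOS identifiers or documented behavior.
@dataclass
class HealthMonitor:
    last_state: dict = field(default_factory=dict)  # module name -> healthy?
    log: list = field(default_factory=list)         # stands in for the system logs
    status_led: str = "green"                       # stands in for the status LED

    def scan(self, current: dict) -> list:
        """Compare a new scan with the previous one and report transitions."""
        events = []
        for module, healthy in current.items():
            # A module seen for the first time is assumed to have been healthy.
            previous = self.last_state.get(module, True)
            if previous and not healthy:
                events.append(f"{module}: became unhealthy")
            elif not previous and healthy:
                events.append(f"{module}: returned to healthy")
        self.last_state.update(current)
        self.log.extend(events)
        # The LED reflects whether any known module is currently unhealthy.
        self.status_led = "green" if all(self.last_state.values()) else "amber"
        return events
```

In this sketch, scanning the same unhealthy module twice produces one log entry, and a second entry appears only when the module recovers.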

The system has six leakage sensors placed at different locations. When a sensor detects leakage, NVOS immediately publishes an event and updates the system health accordingly. The user can see which sensor detected the leak by running the 'nv show platform environment leakage' command. Once the leak is remedied, the sensors re-arm automatically, with no need to clear their leak state; re-arming can take up to 1 hour.
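For scripted checks, the leakage command can be wrapped in a small helper. The Python sketch below is only an illustration: the function name and the timeout value are assumptions, and because the command's output format is not described here, the helper returns the raw text without parsing it.

```python
import shutil
import subprocess

# The command itself comes from the text above; everything else is illustrative.
LEAKAGE_CMD = ["nv", "show", "platform", "environment", "leakage"]


def show_leakage(timeout_s: float = 30.0):
    """Return the raw output of the leakage command, or None if it cannot run."""
    # Skip the call entirely when the `nv` CLI is not on this machine's PATH.
    if shutil.which(LEAKAGE_CMD[0]) is None:
        return None
    try:
        result = subprocess.run(
            LEAKAGE_CMD,
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=False,  # leave interpretation of the exit code to the caller
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout
```

Because re-arming can take up to 1 hour after a leak is remedied, a script that calls this helper repeatedly should allow for the sensor continuing to report the leak during that period.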

© Copyright 2025, NVIDIA. Last updated on Mar 3, 2025.