NVIDIA UFM Enterprise User Manual v6.21.2

Fabric Validation Tab

The Fabric Validation Tab presents the available fabric validation tests and allows users to execute them and view the results as a job summary. The summary includes all errors and warnings identified during the test run.


Test Name

Description / Checks

Check LIDs

Detects invalid LIDs such as:

  • Zero LID

  • Duplicate LIDs

Check Links

Identifies connectivity issues where connected ports are not all in the same state (e.g., not all active).

Check Subnet Manager

Checks for SM-related errors:

  • Failed to get SMInfo MAD

  • SM Not Found

  • SM Not Correct (Incorrect master SM priority)

  • Multiple master SMs

Check Duplicate Nodes

Detects nodes with duplicated descriptions.

Check Duplicate GUIDs

Identifies duplicate GUIDs in the fabric.

Check Routing

Detects failures in retrieving routing MADs.

Check Link Speed

Validates link speed consistency. Issues include:

  • Speed mismatches between ports

  • Wrong/unsupported configuration: an enabled speed is not among the port's supported speeds

  • Unexpected speeds

Check Link Width

Validates link width consistency. Issues include:

  • Width mismatches between ports

  • Wrong/unsupported configurations

  • Unexpected widths

Check Partition Key

Checks for PKey errors, such as:

  • Failed retrieval of PKey tables

  • Mismatching PKeys between ports

Check Temperature

Verifies ability to retrieve temperature sensor data.

Check Cables

Detects cable-related issues:

  • Unsupported cable info

  • Failure in retrieving data (provides a reason)

Check Effective BER

Ensures the Effective Bit Error Rate (BER) does not exceed the configured threshold.

Dragonfly Topology Validation

Validates whether the fabric topology is of the Dragonfly type.

SHARP Fabric Validation

Checks for SHARP configuration correctness in the fabric.

Tree Topology Validation

Validates if the fabric follows a tree topology.

Socket Direct Mode Reporting

Displays inventory of HCAs using socket direct mode.

Validate SM Configuration

Confirms all HCAs share consistent Subnet Manager configuration.

Validate Nodes Firmware Version

Ensures all HCAs are running the latest firmware version.

Validate Switches CPLD Version

Verifies that all switches have the latest CPLD version installed.
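
As an illustration of the kind of logic behind two of the checks listed above (Check LIDs and the "wrong/unsupported configuration" case of Check Link Speed), here is a minimal Python sketch. It is not UFM's implementation: the function names, the port labels, and the input shapes (a list of `(endpoint, lid)` pairs and plain sets of speed names) are all hypothetical, and each entry is assumed to be a distinct LID-owning endpoint.

```python
from collections import defaultdict

ZERO_LID = 0  # LID 0 is treated as invalid, matching the "Zero LID" issue above


def check_lids(endpoints):
    """Return a list of issue strings for zero and duplicate LIDs.

    `endpoints` is a hypothetical list of (endpoint_name, lid) pairs,
    one entry per LID-owning endpoint.
    """
    issues = []
    names_by_lid = defaultdict(list)
    for name, lid in endpoints:
        if lid == ZERO_LID:
            issues.append(f"zero LID on {name}")
        else:
            names_by_lid[lid].append(name)
    # Any LID claimed by more than one endpoint is reported as a duplicate.
    for lid, names in sorted(names_by_lid.items()):
        if len(names) > 1:
            issues.append(f"duplicate LID {lid}: {', '.join(names)}")
    return issues


def unsupported_enabled_speeds(enabled, supported):
    """Return the enabled speeds that are missing from the supported set."""
    return sorted(set(enabled) - set(supported))


# Hypothetical sample data, for illustration only.
sample_endpoints = [
    ("switch-1", 1),
    ("hca-1/port-1", 0),
    ("hca-2/port-1", 7),
    ("hca-3/port-1", 7),
]
lid_issues = check_lids(sample_endpoints)
speed_issues = unsupported_enabled_speeds({"HDR", "NDR"}, {"EDR", "HDR"})
```

With this sample data, `lid_issues` holds one zero-LID entry and one duplicate-LID entry, and `speed_issues` lists `NDR` as enabled but not supported. The same set-difference idea would apply to link widths.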

To run a specific test, click its play button. The resulting job is displayed once the test completes.


Note

The job will also be displayed in the Jobs window.

Device and Port GUID

Some validation test results include data related to devices or ports, such as a device GUID or a port GUID.

Based on that information, a context menu can be shown for each related device or port.

Note

If the data is related to a port, the context menu will contain both port and device options.


© Copyright 2025, NVIDIA. Last updated on Jun 3, 2025.