Fabric Validation Tab

NVIDIA UFM Enterprise User Manual v6.15.1

The Fabric Validation tab lists the fabric validation tests and lets you run each test and view its summary as job output. The job summary contains all errors and warnings found during test execution.


Test

Description

Check Lids

Checks for bad LIDs. Possible LID errors are:

  • Zero LID

  • LID duplication

Check Links

Checks for connectivity issues in which connected ports are not all in the same (active) state

Check Subnet Manager

Checks for errors related to the subnet manager (SM). Possible SM errors are:

  • Failed to get SMInfo Mad

  • SM Not Found

  • SM Not Correct (master SM with wrong priority)

  • Multiple master SMs exist

Check Duplicate Nodes

Checks for duplicate node descriptions

Check Duplicate Guids

Checks for duplicate GUIDs

Check Routing

Checks for failures in getting routing MADs

Check Link Speed

Checks for errors related to link speed. Possible link speed errors are:

  • Different speed between ports

  • Wrong configuration: the 'enabled' speed is not part of the 'supported' speeds

  • Unexpected speed

Check Link Width

Checks for errors related to link width. Possible link width errors are:

  • Different width between ports

  • Wrong configuration: the 'enabled' width is not part of the 'supported' widths

  • Unexpected width

Check Partition Key

Checks for errors related to PKey. Possible PKey errors are:

  • Failed to get PKey tables

  • Mismatching PKeys between ports

Check Temperature

Checks for failures in retrieving temperature sensor readings

Check Cables

Checks for errors related to cables. Possible cable errors are:

  • This device does not support cable info capability

  • Failed to get cable information (provides a reason)

Check Effective BER

Checks that the Effective BER does not exceed the threshold

Dragonfly Topology Validation

Validates whether the topology is Dragonfly

SHARP Fabric Validation

Checks for SHARP Configurations in the fabric

Tree Topology Validation

Checks if the fabric is a tree topology

Socket Direct Mode Reporting

Presents the inventory of fabric HCAs that use Socket Direct
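As an illustration of what the LID and link speed checks in the table look for, here is a minimal Python sketch. It is not UFM's implementation: the port records, field names (`guid`, `lid`, `speed`, `enabled`, `supported`), the `LINKS` list, and both function names are hypothetical, chosen only to show the logic of the "zero LID", "LID duplication", "different speed between ports", and "enabled not part of supported" conditions. The "unexpected speed" condition is not covered.

```python
from collections import defaultdict

# Hypothetical port records; the schema is an assumption, not UFM's data model.
PORTS = [
    {"guid": "0xa", "lid": 1, "speed": "HDR", "enabled": {"HDR"}, "supported": {"HDR", "EDR"}},
    {"guid": "0xb", "lid": 1, "speed": "EDR", "enabled": {"EDR"}, "supported": {"HDR", "EDR"}},
    {"guid": "0xc", "lid": 0, "speed": "HDR", "enabled": {"HDR"}, "supported": {"HDR", "EDR"}},
    {"guid": "0xd", "lid": 4, "speed": "HDR", "enabled": {"HDR", "NDR"}, "supported": {"HDR", "EDR"}},
]

# Hypothetical links, each given as a pair of port GUIDs.
LINKS = [("0xa", "0xb"), ("0xc", "0xd")]


def check_lids(ports):
    """Report ports with LID 0 and LIDs assigned to more than one port."""
    errors = []
    by_lid = defaultdict(list)
    for port in ports:
        if port["lid"] == 0:
            errors.append(f"zero lid: {port['guid']}")
        else:
            by_lid[port["lid"]].append(port["guid"])
    for lid, guids in sorted(by_lid.items()):
        if len(guids) > 1:
            errors.append(f"lid duplication: lid {lid} on {', '.join(guids)}")
    return errors


def check_link_speed(ports, links):
    """Report links whose two ends differ in speed, and ports that
    enable a speed that is not in their supported set."""
    by_guid = {port["guid"]: port for port in ports}
    errors = []
    for a, b in links:
        if by_guid[a]["speed"] != by_guid[b]["speed"]:
            errors.append(f"different speed between ports: {a} and {b}")
    for port in ports:
        unsupported = sorted(port["enabled"] - port["supported"])
        if unsupported:
            errors.append(
                f"wrong configuration: {port['guid']} enables {', '.join(unsupported)}"
            )
    return errors


if __name__ == "__main__":
    for line in check_lids(PORTS) + check_link_speed(PORTS, LINKS):
        print(line)
```

The link width check follows the same pattern as `check_link_speed`, with width values in place of speed values.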

To run a specific test, click its play button. The job output is displayed once the job completes.


Warning

The job will also be displayed in the Jobs window.

Some validation tests return data related to devices or ports, such as device GUIDs and port GUIDs. Based on that information, a context menu can be shown for each related device or port.

Warning

If the data is related to a port, the context menu will contain both port and device options.


© Copyright 2023, NVIDIA. Last updated on Dec 19, 2023.