InfiniBand Related Troubleshooting

Issue: The InfiniBand interfaces are not up after the first reboot following installation.

    Cause: Port status might be PORT_DOWN: the switch port state might be "disabled", or the cable is disconnected.
    Solution: Enable the switch port (via switch administration) or connect the cable.

    Cause: Port status might be PORT_INITIALIZED: the Subnet Manager (SM) might not be running on the fabric.
    Solution: Run the SM on the fabric.

    Cause: Port status might be PORT_ARMED: firmware issue.
    Solution: Contact Mellanox Support.

Issue: The Ethernet interface is started instead of InfiniBand.

    Cause: BMC is enabled.
    Solution: Disable BMC.

    Cause: The firmware version is not up to date.
    Solution: Burn the updated firmware version.
    Note: This issue can occur with firmware version 2.40.5000. To avoid it, upgrade to version 2.40.5030 or later.
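
The causes and solutions listed above can be encoded as a small lookup, which may be convenient when triaging several hosts. The following is a minimal Python sketch, not vendor tooling: the names PORT_STATUS_ADVICE, advise, and firmware_needs_upgrade are hypothetical, the status strings are taken verbatim from this page, and how the port status or firmware version is read from a host is assumed to happen elsewhere.

```python
# Sketch only: maps the port statuses named on this page to the
# cause and solution given for each. All names here are hypothetical.
PORT_STATUS_ADVICE = {
    "PORT_DOWN": (
        "The switch port might be disabled, or the cable is disconnected.",
        "Enable the switch port or connect the cable.",
    ),
    "PORT_INITIALIZED": (
        "The Subnet Manager (SM) might not be running on the fabric.",
        "Run the SM on the fabric.",
    ),
    "PORT_ARMED": (
        "Firmware issue.",
        "Contact Mellanox Support.",
    ),
}


def advise(port_status: str) -> str:
    """Return the likely cause and suggested action for a port status."""
    key = port_status.strip().upper()
    if key not in PORT_STATUS_ADVICE:
        return f"{key}: not covered by this troubleshooting list."
    cause, action = PORT_STATUS_ADVICE[key]
    return f"{key}: {cause} Suggested action: {action}"


def firmware_needs_upgrade(version: str, minimum: str = "2.40.5030") -> bool:
    """Return True if a dotted numeric version is older than `minimum`.

    The default minimum is the version this page recommends upgrading to.
    """
    def parse(v: str) -> tuple:
        return tuple(int(part) for part in v.split("."))

    return parse(version) < parse(minimum)


if __name__ == "__main__":
    print(advise("PORT_INITIALIZED"))
    print(firmware_needs_upgrade("2.40.5000"))
```

The version check compares the dotted fields numerically rather than as strings, so a version such as 2.40.10000 would not be mistaken for one older than 2.40.5030.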

© Copyright 2023, NVIDIA. Last updated on Oct 26, 2023.