InfiniBand Related Troubleshooting

Linux Kernel Upstream Release Notes v6.5

| Issue | Cause | Solution |
|---|---|---|
| The InfiniBand interfaces are not up after the first reboot following installation. | Port status might be PORT_DOWN: the switch port state might be “disabled” or the cable is disconnected. | Enable the port on the switch or connect the cable. |
| | Port status might be PORT_INITIALIZED: the Subnet Manager (SM) might not be running on the fabric. | Run the SM on the fabric. |
| | Port status might be PORT_ARMED: firmware issue. | Contact Mellanox Support. |
| Ethernet interface is started instead of InfiniBand. | BMC is enabled. | Disable BMC. |
| | The firmware version is not up-to-date. | Burn the updated firmware version. |

Note: The Ethernet-instead-of-InfiniBand issue can occur with firmware version 2.40.5000. To avoid it, upgrade to version 2.40.5030 or later.
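The checks above can be expressed as a small lookup. The following Python sketch is illustrative only: the names `PORT_STATUS_GUIDE`, `diagnose_port`, `parse_version`, and `firmware_needs_upgrade` are hypothetical and not part of any NVIDIA tool. The status strings are taken as written in this section, so output from a real diagnostic tool may need to be normalized first. Treating every version below 2.40.5030 as an upgrade candidate is also an assumption, since the note names only 2.40.5000.

```python
# Hypothetical helpers that mirror the troubleshooting entries in this section.
# Nothing here queries real hardware; the port status and firmware version
# are expected to be supplied by the caller.

# Port status -> (likely cause, suggested solution), as listed in this section.
PORT_STATUS_GUIDE = {
    "PORT_DOWN": (
        "Switch port may be disabled or the cable is disconnected.",
        "Enable the port on the switch or connect the cable.",
    ),
    "PORT_INITIALIZED": (
        "The SM might not be running on the fabric.",
        "Run the SM on the fabric.",
    ),
    "PORT_ARMED": (
        "Firmware issue.",
        "Contact Mellanox Support.",
    ),
}

# First firmware version the note recommends.
FIXED_FIRMWARE = (2, 40, 5030)


def diagnose_port(status):
    """Return (cause, solution) for a listed port status, else None."""
    return PORT_STATUS_GUIDE.get(status.strip().upper())


def parse_version(text):
    """Turn a dotted version such as '2.40.5000' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def firmware_needs_upgrade(version):
    """Assumption: any version below 2.40.5030 is an upgrade candidate."""
    return parse_version(version) < FIXED_FIRMWARE


if __name__ == "__main__":
    hint = diagnose_port("PORT_INITIALIZED")
    if hint is not None:
        cause, solution = hint
        print(f"Cause: {cause}\nSolution: {solution}")
    print("Upgrade firmware:", firmware_needs_upgrade("2.40.5000"))
```

Comparing versions as integer tuples rather than strings avoids ordering mistakes when components differ in digit count.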

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.