General Issues
Issue |
Cause |
Solution |
The system panics when it is booted with a failed adapter installed. |
Malfunctioning hardware component |
|
NVIDIA adapter is not identified as a PCI device. |
Malfunctioning PCI slot or adapter PCI connector |
|
NVIDIA adapters are not installed in the system. |
Misidentification of the NVIDIA adapter installed |
Run one of the commands below and check the adapter's MAC address to identify the NVIDIA adapter installed: 'lspci | grep Mellanox' or 'lspci -d 15b3:'. Note: NVIDIA MAC addresses start with 00:02:C9:xx:xx:xx, 00:25:8B:xx:xx:xx, or F4:52:14:xx:xx:xx. |
Insufficient memory to be used by udev upon OS boot. |
udev is designed to fork() a new process for each event it receives so that it can handle many events in parallel, and each udev instance consumes some RAM. |
Limit the udev instances running simultaneously per boot by adding udev.children-max=<number> to the kernel command line in grub. |
An operating system running from a root file system located on remote storage (over NVIDIA devices) hangs during reboot/shutdown (errors such as “No such file or directory” appear). |
The operating system calls the mlnx-en.d service script with the ‘stop’ option, which unloads the driver stack. As a result, the OS root file system disappears before the reboot/shutdown procedure completes, leaving the OS in a hung state. |
Disable the openibd ‘stop’ option by setting 'ALLOW_STOP=no' in the /etc/mlnx-en.conf configuration file. |
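The MAC prefix check from the adapter identification row can be scripted. The following is a minimal sketch, not an official tool: the function name `is_nvidia_mac` is hypothetical, the prefix list is limited to the three prefixes given in the table (adapters with other prefixes would not be matched), and the loop assumes a Linux system that exposes interface addresses under /sys/class/net.

```shell
#!/bin/sh
# Return 0 if the given MAC address starts with one of the prefixes
# listed in the table (00:02:C9, 00:25:8B, F4:52:14), 1 otherwise.
# is_nvidia_mac is a hypothetical helper name used for illustration.
is_nvidia_mac() {
    # Normalize hex letters to upper case so the match is case-insensitive.
    mac=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    case "$mac" in
        00:02:C9:*|00:25:8B:*|F4:52:14:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Walk the network interfaces exposed under sysfs and report matches.
for addr_file in /sys/class/net/*/address; do
    # Skip entries that cannot be read (also covers an unmatched glob).
    [ -r "$addr_file" ] || continue
    iface=$(basename "$(dirname "$addr_file")")
    mac=$(cat "$addr_file")
    if is_nvidia_mac "$mac"; then
        echo "$iface: $mac (matches an NVIDIA MAC prefix)"
    fi
done
```

This only inspects interfaces that already have a network device, so it complements the lspci commands rather than replacing them.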
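For the udev memory row, it can help to confirm that the parameter actually reached the running kernel after GRUB was updated. The sketch below assumes the parameter is spelled exactly as in the table (udev.children-max=<number>) and reads /proc/cmdline; the function name `children_max_from_cmdline` is hypothetical. How the parameter is added to the GRUB configuration depends on the distribution and is not shown.

```shell
#!/bin/sh
# Print the value of udev.children-max found in a kernel command line
# string passed as $1; print nothing if the parameter is absent.
# children_max_from_cmdline is a hypothetical helper name.
children_max_from_cmdline() {
    # Put each argument on its own line, then strip the parameter name.
    printf '%s\n' "$1" | tr ' ' '\n' \
        | sed -n 's/^udev\.children-max=//p' | tail -n 1
}

# Check the command line of the currently running kernel.
if [ -r /proc/cmdline ]; then
    value=$(children_max_from_cmdline "$(cat /proc/cmdline)")
    if [ -n "$value" ]; then
        echo "udev.children-max is set to $value"
    else
        echo "udev.children-max is not set on the running kernel command line"
    fi
fi
```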
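For the remote root file system row, the setting can be verified before the next reboot. This is a sketch under stated assumptions: it assumes /etc/mlnx-en.conf holds plain KEY=value lines with no quoting or trailing comments, and the function name `allow_stop_value` is hypothetical.

```shell
#!/bin/sh
# Print the value assigned to ALLOW_STOP in the file given as $1.
# If the key appears more than once, the last assignment is printed.
# allow_stop_value is a hypothetical helper name.
allow_stop_value() {
    sed -n 's/^[[:space:]]*ALLOW_STOP=//p' "$1" | tail -n 1
}

# Path taken from the table; the check is skipped if the file is absent.
conf=/etc/mlnx-en.conf
if [ -r "$conf" ]; then
    if [ "$(allow_stop_value "$conf")" = "no" ]; then
        echo "ALLOW_STOP=no is set in $conf"
    else
        echo "ALLOW_STOP=no is not set in $conf"
    fi
fi
```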