Troubleshooting
Hints
When debugging an issue, the following environment variables are useful:
CUSOLVERMP_LOG_LEVEL: enables detailed logging of cuSOLVERMp operations.
NCCL_DEBUG: controls NCCL's debug output. NCCL exposes further debugging options through other environment variables; for a comprehensive list, refer to the NCCL Environment Variables documentation.
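As a rough illustration of setting these variables, the Python sketch below builds an environment for a launcher process. The helper name debug_env is hypothetical, and the CUSOLVERMP_LOG_LEVEL value "1" is only a placeholder assumption, since the accepted levels are not listed here. INFO is one of the documented NCCL_DEBUG values.

```python
import os


def debug_env(base=None, cusolvermp_log_level="1", nccl_debug="INFO"):
    """Return a copy of an environment with the debug variables set.

    `base` defaults to the current process environment. The default
    cusolvermp_log_level is a placeholder (assumption): check the
    cuSOLVERMp documentation for the values it actually accepts.
    """
    env = dict(os.environ if base is None else base)
    env["CUSOLVERMP_LOG_LEVEL"] = str(cusolvermp_log_level)  # assumed value
    env["NCCL_DEBUG"] = nccl_debug  # e.g. WARN or INFO
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when starting the application through an MPI launcher. Depending on the launcher, extra options may be needed to forward the variables to every rank.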
Known Issues
cuSOLVERMp versions older than 0.7.0: Some users may experience hangs due to lazy initialization of NCCL in UCC. To disable lazy NCCL initialization, set the UCC_TL_NCCL_LAZY_INIT environment variable to no.
cuSOLVERMp versions older than 0.7.0: Some users may see errors with HPC-X v2.18 caused by a clash between the UCC initialization in OMPI and the one in cuSOLVERMp. To disable UCC initialization in OMPI, set the OMPI_MCA_coll_ucc_enable environment variable to 0.
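The two workarounds above could be applied conditionally, as in the following Python sketch. The helper names parse_version and workaround_env and the uses_hpcx_ompi flag are hypothetical. Only the variable names and values (UCC_TL_NCCL_LAZY_INIT=no, OMPI_MCA_coll_ucc_enable=0) and the 0.7.0 version threshold come from the notes above.

```python
import os


def parse_version(text):
    """Turn a dotted version string such as '0.6.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def workaround_env(cusolvermp_version, base=None, uses_hpcx_ompi=False):
    """Return a copy of an environment with the known-issue workarounds set.

    Workarounds are applied only for cuSOLVERMp versions older than 0.7.0.
    `uses_hpcx_ompi` is a caller-supplied flag (assumption) indicating that
    the affected HPC-X / OMPI stack is in use.
    """
    env = dict(os.environ if base is None else base)
    if parse_version(cusolvermp_version) < (0, 7, 0):
        # Disable lazy NCCL initialization in UCC to avoid hangs.
        env["UCC_TL_NCCL_LAZY_INIT"] = "no"
        if uses_hpcx_ompi:
            # Disable UCC initialization in OMPI to avoid the clash.
            env["OMPI_MCA_coll_ucc_enable"] = "0"
    return env
```

Comparing versions as integer tuples rather than as strings keeps a version such as 0.10.0 from being mistaken for one older than 0.7.0.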