Known Issues
Refer to the section for each version to see which issues are open in that version.
Issue
After enabling MIG and creating a MIG partition for the GPU, no output is returned for non-device-specific fields, for example when running `dcgmi dmon -e 1,2,3,4,5`.
Explanation
This issue affects EL 8 with:
Driver Version: 470.141.03
CUDA Version: 11.4.152
DCGM: 2.4.5
Issue
The DGX A100/A800 Firmware Update Container log may show error messages such as "Unable to send RAW command (channel=0x0 netfn=0x3c lun=0x0 cmd=0xf rsp=0xd3): Destination unavailable".
Explanation
This error is displayed when running supported commands and can be safely ignored.
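When reviewing a saved copy of the container log, it can help to hide these benign lines so that other messages stand out. The following is a minimal sketch, not part of the product: the function name `filter_benign_raw_errors` is hypothetical, and the pattern assumes the benign lines always contain both "Unable to send RAW command" and "Destination unavailable", as in the message quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not an NVIDIA tool): reads log text on stdin and
# prints every line except the benign "Unable to send RAW command ...
# Destination unavailable" messages described in this issue.
filter_benign_raw_errors() {
    grep -v 'Unable to send RAW command.*Destination unavailable'
}
```

For example, `filter_benign_raw_errors < update.log` prints the remaining lines of a log saved to the hypothetical file `update.log`. Any other error that remains in the output should still be investigated.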
Issue
After rebooting, `nvsm show controllers` may display a blank serial number.
Explanation
This issue is specific to the DGX-1 platform with the MegaRAID controller and can be remedied by restarting the nvsm service after 30 minutes. To restart the service, run `systemctl restart nvsm`.
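The restart and the follow-up check can be combined in a small wrapper. This is a sketch under stated assumptions: the function names `run` and `restart_nvsm_and_check` and the `DRY_RUN` variable are hypothetical conveniences, and only the two commands themselves come from this section. Depending on the system configuration, the commands may need to be run with root privileges.

```shell
#!/bin/sh
# Hypothetical wrapper (not an NVIDIA tool).
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Restart the nvsm service, then list the controllers again so the
# serial number field can be checked.
restart_nvsm_and_check() {
    run systemctl restart nvsm
    run nvsm show controllers
}
```

Calling `restart_nvsm_and_check` with `DRY_RUN=1` set prints the two commands without touching the service, which is a safe way to review the sequence before running it for real.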