NVML API Reference Guide :: GPU Deployment and Management Documentation

NVML API Reference Guide (PDF) - vR550 (older) - Last updated March 12, 2024

2. Known Issues

This is a list of known NVML issues in the current driver:

On systems where GPUs are NUMA nodes, the accuracy of the FB memory utilization reported by NVML depends on the operating system's memory accounting, because FB memory is managed by the operating system rather than by the NVIDIA GPU driver. To improve performance, pages allocated from FB memory are typically not released even after the process terminates. When the operating system is under memory pressure, it may also resort to using FB memory. Such behavior can cause discrepancies in the reported memory usage.
On Linux, GPU Reset cannot be triggered while a GPU Operation Mode (GOM) change is pending.
On Linux, GPU Reset may not successfully apply a pending ECC mode change. A full reboot may be required for the mode change to take effect.
nvmlAccountingStats supports only one process per GPU at a time (the CUDA proxy server counts as one process).
nvmlAccountingStats_t.time reports time and utilization values from cuInit until process termination. Future driver versions might change this behavior slightly and account for a process only from cuCtxCreate until cuCtxDestroy.
On GPUs from the Fermi family, the current P0 clocks (reported by nvmlDeviceGetClockInfo) can differ from the max clocks by a few MHz.
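
Because of the Fermi clock issue above, code that checks whether a GPU is running at its maximum clock may want to compare with a small tolerance rather than test for exact equality. The following is a minimal sketch of such a comparison, not part of NVML. The helper names (clock_abs_diff_mhz, clock_at_max, demo_clock_check) and the 5 MHz tolerance are assumptions made for illustration; the guide only says "a few MHz". In real use, the two readings would come from nvmlDeviceGetClockInfo and nvmlDeviceGetMaxClockInfo; the sample values here are made up so that the block compiles without the NVML headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed tolerance: the guide only says "a few MHz", so 5 is a guess
 * chosen for illustration, not a documented value. */
#define P0_CLOCK_TOLERANCE_MHZ 5u

/* Absolute difference between two clock readings in MHz.
 * NVML reports clocks as unsigned int, so subtract the smaller value
 * from the larger one to avoid unsigned wrap-around. */
unsigned int clock_abs_diff_mhz(unsigned int a_mhz, unsigned int b_mhz)
{
    return (a_mhz > b_mhz) ? (a_mhz - b_mhz) : (b_mhz - a_mhz);
}

/* Returns true when the current clock is within tolerance_mhz of the
 * max clock, so a difference of a few MHz is not treated as throttling. */
bool clock_at_max(unsigned int current_mhz, unsigned int max_mhz,
                  unsigned int tolerance_mhz)
{
    return clock_abs_diff_mhz(current_mhz, max_mhz) <= tolerance_mhz;
}

/* Usage sketch with hypothetical readings (not measured on real hardware). */
void demo_clock_check(void)
{
    /* 3 MHz below max: treated as running at max under the assumed tolerance. */
    assert(clock_at_max(772u, 775u, P0_CLOCK_TOLERANCE_MHZ));
    /* Far below max: reported as not at max. */
    assert(!clock_at_max(405u, 775u, P0_CLOCK_TOLERANCE_MHZ));
}
```

The tolerance is passed as a parameter so that callers can tighten or loosen it per GPU family; on GPUs not affected by this issue, a tolerance of 0 reduces the check to exact equality.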
