NVML API Reference Guide :: GPU Deployment and Management Documentation

NVML API Reference Guide - vR515 (older) - Last updated May 4, 2022

2.16. Device Commands

This chapter describes NVML operations that change the state of the device. Each of these requires root/admin access. Non-admin users will see an NVML_ERROR_NO_PERMISSION error code when invoking any of these methods.
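Since every command in this chapter can fail with NVML_ERROR_NO_PERMISSION, callers may want to separate "rerun with root/admin rights" from other failures. The following is a minimal sketch of one way to do that. The helper classify_command_result and the command_outcome_t enum are hypothetical names, not part of NVML, and the fallback enum is a stand-in that is only used when nvml.h is not on the include path (its numeric values are assumed to mirror the header).

```c
#include <assert.h>

/* Use the real header when it is available. Otherwise fall back to a
 * minimal stand-in so the sketch compiles without the NVML SDK. The
 * numeric values in the stand-in are assumed to mirror nvml.h. */
#if defined(__has_include)
#  if __has_include(<nvml.h>)
#    include <nvml.h>
#    define SKETCH_HAVE_NVML_H 1
#  endif
#endif
#ifndef SKETCH_HAVE_NVML_H
typedef enum {
    NVML_SUCCESS = 0,
    NVML_ERROR_UNINITIALIZED = 1,
    NVML_ERROR_INVALID_ARGUMENT = 2,
    NVML_ERROR_NOT_SUPPORTED = 3,
    NVML_ERROR_NO_PERMISSION = 4,
    NVML_ERROR_GPU_IS_LOST = 15,
    NVML_ERROR_UNKNOWN = 999
} nvmlReturn_t;
#endif

/* Hypothetical outcome categories for a device command. */
typedef enum {
    CMD_DONE,        /* the command succeeded */
    CMD_NEEDS_ROOT,  /* rerun with root/admin rights */
    CMD_UNSUPPORTED, /* the device lacks the feature */
    CMD_BAD_CALL,    /* library not initialized or bad argument */
    CMD_DEVICE_GONE, /* GPU fell off the bus or is inaccessible */
    CMD_FAILED       /* any other error */
} command_outcome_t;

/* Hypothetical helper: map the return code of any device command in this
 * chapter to a coarse outcome the caller can act on. */
command_outcome_t classify_command_result(nvmlReturn_t rc)
{
    switch (rc) {
    case NVML_SUCCESS:                return CMD_DONE;
    case NVML_ERROR_NO_PERMISSION:    return CMD_NEEDS_ROOT;
    case NVML_ERROR_NOT_SUPPORTED:    return CMD_UNSUPPORTED;
    case NVML_ERROR_UNINITIALIZED:
    case NVML_ERROR_INVALID_ARGUMENT: return CMD_BAD_CALL;
    case NVML_ERROR_GPU_IS_LOST:      return CMD_DEVICE_GONE;
    default:                          return CMD_FAILED;
    }
}
```

A caller would pass the value returned by, for example, nvmlDeviceSetComputeMode straight into classify_command_result and branch on the outcome.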

Functions

nvmlReturn_t nvmlDeviceClearEccErrorCounts ( nvmlDevice_t device, nvmlEccCounterType_t counterType )
nvmlReturn_t nvmlDeviceGetClkMonStatus ( nvmlDevice_t device, nvmlClkMonStatus_t* status )
nvmlReturn_t nvmlDeviceResetGpuLockedClocks ( nvmlDevice_t device )
nvmlReturn_t nvmlDeviceResetMemoryLockedClocks ( nvmlDevice_t device )
nvmlReturn_t nvmlDeviceSetAPIRestriction ( nvmlDevice_t device, nvmlRestrictedAPI_t apiType, nvmlEnableState_t isRestricted )
nvmlReturn_t nvmlDeviceSetApplicationsClocks ( nvmlDevice_t device, unsigned int memClockMHz, unsigned int graphicsClockMHz )
nvmlReturn_t nvmlDeviceSetComputeMode ( nvmlDevice_t device, nvmlComputeMode_t mode )
nvmlReturn_t nvmlDeviceSetDriverModel ( nvmlDevice_t device, nvmlDriverModel_t driverModel, unsigned int flags )
nvmlReturn_t nvmlDeviceSetEccMode ( nvmlDevice_t device, nvmlEnableState_t ecc )
nvmlReturn_t nvmlDeviceSetGpuLockedClocks ( nvmlDevice_t device, unsigned int minGpuClockMHz, unsigned int maxGpuClockMHz )
nvmlReturn_t nvmlDeviceSetGpuOperationMode ( nvmlDevice_t device, nvmlGpuOperationMode_t mode )
nvmlReturn_t nvmlDeviceSetMemoryLockedClocks ( nvmlDevice_t device, unsigned int minMemClockMHz, unsigned int maxMemClockMHz )
nvmlReturn_t nvmlDeviceSetPersistenceMode ( nvmlDevice_t device, nvmlEnableState_t mode )
nvmlReturn_t nvmlDeviceSetPowerManagementLimit ( nvmlDevice_t device, unsigned int limit )

Functions

nvmlReturn_t nvmlDeviceClearEccErrorCounts ( nvmlDevice_t device, nvmlEccCounterType_t counterType )

Parameters

device: The identifier of the target device
counterType: Flag that indicates which type of errors should be cleared.

Returns

NVML_SUCCESS if the error counts were cleared
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or counterType is invalid
NVML_ERROR_NOT_SUPPORTED if the device does not support this feature
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Clear the ECC error and other memory error counts for the device.

For Kepler or newer fully supported devices. Only applicable to devices with ECC. Requires NVML_INFOROM_ECC version 2.0 or higher to clear aggregate location-based ECC counts. Requires NVML_INFOROM_ECC version 1.0 or higher to clear all other ECC counts. Requires root/admin permissions. Requires ECC Mode to be enabled.

Sets all of the specified ECC counters to 0, including both detailed and total counts.

This operation takes effect immediately.

See nvmlEccCounterType_t for details on available counter types.
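The InfoROM version requirements above can be checked before attempting to clear counters. The sketch below assumes the ECC InfoROM version has already been read as a string (for instance with nvmlDeviceGetInforomVersion and NVML_INFOROM_ECC) and that the string has the form "major.minor". Both helper names are hypothetical, and the string format is an assumption rather than something this page specifies.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if a "major.minor" version string is at
 * least reqMajor.reqMinor, and 0 otherwise. Strings that do not parse as
 * two unsigned numbers separated by a dot (e.g. "N/A") yield 0. */
int inforom_version_at_least(const char *version,
                             unsigned int reqMajor, unsigned int reqMinor)
{
    unsigned int major = 0, minor = 0;

    assert(version != NULL);
    if (sscanf(version, "%u.%u", &major, &minor) != 2)
        return 0;
    if (major != reqMajor)
        return major > reqMajor;
    return minor >= reqMinor;
}

/* Hypothetical helper: applies the rule from the description. Aggregate
 * location-based counts need ECC InfoROM 2.0 or higher; all other ECC
 * counts need 1.0 or higher. */
int can_clear_ecc_counts(const char *eccInforomVersion,
                         int aggregateLocationBased)
{
    return aggregateLocationBased
        ? inforom_version_at_least(eccInforomVersion, 2, 0)
        : inforom_version_at_least(eccInforomVersion, 1, 0);
}
```

This check covers only the InfoROM version. ECC mode must also be enabled and the caller still needs root/admin permissions.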

nvmlReturn_t nvmlDeviceGetClkMonStatus ( nvmlDevice_t device, nvmlClkMonStatus_t* status )

Parameters

device: The identifier of the target device
status: Reference in which to return the clkmon fault status

Returns

NVML_SUCCESS if status has been set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or status is NULL
NVML_ERROR_NOT_SUPPORTED if the device does not support this feature
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Retrieves the frequency monitor fault status for the device.

For Ampere or newer fully supported devices. Requires root user.

See nvmlClkMonStatus_t for details on decoding the status output.

See also:

nvmlDeviceGetClkMonStatus()

nvmlReturn_t nvmlDeviceResetGpuLockedClocks ( nvmlDevice_t device )

Parameters

device: The identifier of the target device

Returns

NVML_SUCCESS if new settings were successfully set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid
NVML_ERROR_NOT_SUPPORTED if the device does not support this feature
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Resets the GPU clock to the default value.

This is the GPU clock that will be used after system reboot or driver reload. Default values are idle clocks, but the current values can be changed using nvmlDeviceSetApplicationsClocks.

See also:

nvmlDeviceSetGpuLockedClocks

For Volta or newer fully supported devices.

nvmlReturn_t nvmlDeviceResetMemoryLockedClocks ( nvmlDevice_t device )

Parameters

device: The identifier of the target device

Returns

NVML_SUCCESS if new settings were successfully set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid
NVML_ERROR_NOT_SUPPORTED if the device does not support this feature
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Resets the memory clock to the default value.

This is the memory clock that will be used after system reboot or driver reload. Default values are idle clocks, but the current values can be changed using nvmlDeviceSetApplicationsClocks.

See also:

nvmlDeviceSetMemoryLockedClocks

For Ampere or newer fully supported devices.

nvmlReturn_t nvmlDeviceSetAPIRestriction ( nvmlDevice_t device, nvmlRestrictedAPI_t apiType, nvmlEnableState_t isRestricted )

Parameters

device: The identifier of the target device
apiType: Target API type for this operation
isRestricted: The target restriction

Returns

NVML_SUCCESS if isRestricted has been set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or apiType is incorrect
NVML_ERROR_NOT_SUPPORTED if the device does not support changing API restrictions or the device does not support the feature that API restrictions are being set for (e.g. enabling/disabling auto boosted clocks is not supported by the device)
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Changes the root/admin restrictions on certain APIs. See nvmlRestrictedAPI_t for the list of supported APIs. This method can be used by a root/admin user to give non-root/admin users access to certain otherwise-restricted APIs. The new setting lasts for the lifetime of the NVIDIA driver; it is not persistent. See nvmlDeviceGetAPIRestriction to query the current restriction settings.

For Kepler or newer fully supported devices. Requires root/admin permissions.

See also:

nvmlRestrictedAPI_t

nvmlReturn_t nvmlDeviceSetApplicationsClocks ( nvmlDevice_t device, unsigned int memClockMHz, unsigned int graphicsClockMHz )

Parameters

device: The identifier of the target device
memClockMHz: Requested memory clock in MHz
graphicsClockMHz: Requested graphics clock in MHz

Returns

NVML_SUCCESS if new settings were successfully set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or memClockMHz and graphicsClockMHz are not a valid clock combination
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_NOT_SUPPORTED if the device doesn't support this feature
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Set clocks that applications will lock to.

Sets the clocks that compute and graphics applications will run at. For example, the CUDA driver requests these clocks during context creation, so this property defines the clocks at which CUDA applications run unless an overspec event occurs (e.g. over power, over thermal, or external HW brake).

Can be used as a setting to request constant performance.

On Pascal and newer hardware, this will automatically disable automatic boosting of clocks.

On K80 and newer Kepler and Maxwell GPUs, users desiring fixed performance should also call nvmlDeviceSetAutoBoostedClocksEnabled to prevent clocks from automatically boosting above the clock value being set.

For Kepler or newer non-GeForce fully supported devices and Maxwell or newer GeForce devices. Requires root/admin permissions.

See nvmlDeviceGetSupportedMemoryClocks and nvmlDeviceGetSupportedGraphicsClocks for details on how to list available clock combinations.

After system reboot or driver reload, applications clocks go back to their default value. See nvmlDeviceResetApplicationsClocks.
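Because only supported clock combinations are accepted, a caller typically picks values from the lists returned by nvmlDeviceGetSupportedMemoryClocks and nvmlDeviceGetSupportedGraphicsClocks. The sketch below shows one possible selection policy: take the highest listed clock that does not exceed a target. The helper name pick_clock_at_most is hypothetical, and the policy is an illustration rather than NVML's own behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: from a list of supported clocks (in MHz), choose the
 * highest one that does not exceed limitMHz. The list may be in any order.
 * Returns 1 and stores the result in *chosenMHz, or returns 0 if no listed
 * clock is at or below the limit (in which case *chosenMHz is untouched). */
int pick_clock_at_most(const unsigned int *clocksMHz, unsigned int count,
                       unsigned int limitMHz, unsigned int *chosenMHz)
{
    int found = 0;
    unsigned int best = 0;
    unsigned int i;

    assert(clocksMHz != NULL || count == 0);
    assert(chosenMHz != NULL);

    for (i = 0; i < count; ++i) {
        if (clocksMHz[i] <= limitMHz && (!found || clocksMHz[i] > best)) {
            best = clocksMHz[i];
            found = 1;
        }
    }
    if (found)
        *chosenMHz = best;
    return found;
}
```

The chosen memory clock would then be used to query the graphics clocks valid for it, and the resulting pair passed to nvmlDeviceSetApplicationsClocks.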

nvmlReturn_t nvmlDeviceSetComputeMode ( nvmlDevice_t device, nvmlComputeMode_t mode )

Parameters

device: The identifier of the target device
mode: The target compute mode

Returns

NVML_SUCCESS if the compute mode was set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or mode is invalid
NVML_ERROR_NOT_SUPPORTED if the device does not support this feature
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Set the compute mode for the device.

For all products. Requires root/admin permissions.

The compute mode determines whether a GPU can be used for compute operations and whether it can be shared across contexts.

This operation takes effect immediately. Under Linux it is not persistent across reboots and always resets to "Default". Under Windows it is persistent.

Under Windows, compute mode may only be set to DEFAULT when running in WDDM.

Note:

On MIG-enabled GPUs, compute mode is set to DEFAULT and changing it is not supported.

See nvmlComputeMode_t for details on available compute modes.
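The Windows restriction described above (only DEFAULT is allowed while the device runs in WDDM) can be checked before calling nvmlDeviceSetComputeMode. The following is a minimal sketch. The helper name is hypothetical, and the fallback enums are stand-ins used only when nvml.h is not on the include path, with values assumed to mirror the header.

```c
#include <assert.h>

/* Use the real header when it is available. Otherwise fall back to minimal
 * stand-ins so the sketch compiles without the NVML SDK. The numeric values
 * in the stand-ins are assumed to mirror nvml.h. */
#if defined(__has_include)
#  if __has_include(<nvml.h>)
#    include <nvml.h>
#    define SKETCH_HAVE_NVML_H 1
#  endif
#endif
#ifndef SKETCH_HAVE_NVML_H
typedef enum {
    NVML_COMPUTEMODE_DEFAULT = 0,
    NVML_COMPUTEMODE_PROHIBITED = 2,
    NVML_COMPUTEMODE_EXCLUSIVE_PROCESS = 3
} nvmlComputeMode_t;
typedef enum {
    NVML_DRIVER_WDDM = 0,
    NVML_DRIVER_WDM = 1
} nvmlDriverModel_t;
#endif

/* Hypothetical helper: returns 1 if the requested compute mode is
 * compatible with the current Windows driver model, 0 otherwise.
 * Under WDDM only DEFAULT is allowed; under WDM (TCC) this rule places
 * no restriction on the mode. */
int compute_mode_allowed_on_windows(nvmlDriverModel_t currentModel,
                                    nvmlComputeMode_t requested)
{
    if (currentModel == NVML_DRIVER_WDDM)
        return requested == NVML_COMPUTEMODE_DEFAULT;
    return 1;
}
```

The current driver model would come from nvmlDeviceGetDriverModel. The check is advisory only: the library remains the authority and may still reject the call for other reasons.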

See also:

nvmlDeviceGetComputeMode()

nvmlReturn_t nvmlDeviceSetDriverModel ( nvmlDevice_t device, nvmlDriverModel_t driverModel, unsigned int flags )

Parameters

device: The identifier of the target device
driverModel: The target driver model
flags: Flags that change the default behavior

Returns

NVML_SUCCESS if the driver model has been set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or driverModel is invalid
NVML_ERROR_NOT_SUPPORTED if the platform is not Windows or the device does not support this feature
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Set the driver model for the device.

For Fermi or newer fully supported devices. For Windows only. Requires root/admin permissions.

On Windows platforms the device driver can run in either WDDM or WDM (TCC) mode. If a display is attached to the device it must run in WDDM mode.

It is possible to force the change to WDM (TCC) while the display is still attached with a force flag (nvmlFlagForce). This should only be done if the host is subsequently powered down and the display is detached from the device before the next reboot.

This operation takes effect after the next reboot.

Windows driver model may only be set to WDDM when running in DEFAULT compute mode.

Changing the driver model to WDDM is not supported when the GPU doesn't support graphics acceleration or will not support it after reboot. See nvmlDeviceSetGpuOperationMode.

See nvmlDriverModel_t for details on available driver models. See nvmlFlagDefault and nvmlFlagForce.

See also:

nvmlDeviceGetDriverModel()

nvmlReturn_t nvmlDeviceSetEccMode ( nvmlDevice_t device, nvmlEnableState_t ecc )

Parameters

device: The identifier of the target device
ecc: The target ECC mode

Returns

NVML_SUCCESS if the ECC mode was set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or ecc is invalid
NVML_ERROR_NOT_SUPPORTED if the device does not support this feature
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Set the ECC mode for the device.

For Kepler or newer fully supported devices. Only applicable to devices with ECC. Requires NVML_INFOROM_ECC version 1.0 or higher. Requires root/admin permissions.

The ECC mode determines whether the GPU enables its ECC support.

This operation takes effect after the next reboot.

See nvmlEnableState_t for details on available modes.

See also:

nvmlDeviceGetEccMode()

nvmlReturn_t nvmlDeviceSetGpuLockedClocks ( nvmlDevice_t device, unsigned int minGpuClockMHz, unsigned int maxGpuClockMHz )

Parameters

device: The identifier of the target device
minGpuClockMHz: Requested minimum gpu clock in MHz
maxGpuClockMHz: Requested maximum gpu clock in MHz

Returns

NVML_SUCCESS if new settings were successfully set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or minGpuClockMHz and maxGpuClockMHz are not a valid clock combination
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_NOT_SUPPORTED if the device doesn't support this feature
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Set clocks that the device will lock to.

Sets the clocks that the device will run at to a value in the range minGpuClockMHz to maxGpuClockMHz. This setting supersedes application clock values and takes effect regardless of whether a CUDA app is running. See nvmlDeviceSetApplicationsClocks.

Can be used as a setting to request constant performance.

This can be called with a pair of integer clock frequencies in MHz, or a pair of nvmlClockLimitId_t values. See the table below for valid combinations of these values.

If one arg takes one of these values, the other must be one of these values as well. Mixed numeric and symbolic calls return NVML_ERROR_INVALID_ARGUMENT.
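The "no mixing" rule above can be checked on the caller's side before the call is made. The sketch below is built on an assumption this page does not confirm: that the symbolic nvmlClockLimitId_t values are numbered upward from a range-start constant (NVML_CLOCK_LIMIT_ID_RANGE_START in nvml.h), so any argument at or above that constant is symbolic. The helper name is hypothetical, and the caller supplies the range start so that no constant is hard-coded here.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if both arguments are plain MHz values or
 * both are symbolic clock limit IDs, and 0 if they are mixed.
 * symbolicRangeStart is the first symbolic ID; values at or above it are
 * treated as symbolic (an assumption, see the lead-in). */
int locked_clock_args_consistent(unsigned int minArg, unsigned int maxArg,
                                 unsigned int symbolicRangeStart)
{
    int minSymbolic, maxSymbolic;

    /* A range start of 0 would classify every value as symbolic. */
    assert(symbolicRangeStart != 0u);

    minSymbolic = minArg >= symbolicRangeStart;
    maxSymbolic = maxArg >= symbolicRangeStart;
    return minSymbolic == maxSymbolic;
}
```

A mixed pair would be rejected by nvmlDeviceSetGpuLockedClocks with NVML_ERROR_INVALID_ARGUMENT anyway. The local check only lets a tool report the mistake earlier and more specifically.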

Requires root/admin permissions.

After system reboot or driver reload applications clocks go back to their default value. See nvmlDeviceResetGpuLockedClocks.

For Volta or newer fully supported devices.

nvmlReturn_t nvmlDeviceSetGpuOperationMode ( nvmlDevice_t device, nvmlGpuOperationMode_t mode )

Parameters

device: The identifier of the target device
mode: Target GOM

Returns

NVML_SUCCESS if mode has been set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or mode is incorrect
NVML_ERROR_NOT_SUPPORTED if the device does not support GOM or the specific mode
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Sets new GOM. See nvmlGpuOperationMode_t for details.

For GK110 M-class and X-class Tesla products from the Kepler family. Modes NVML_GOM_LOW_DP and NVML_GOM_ALL_ON are supported on fully supported GeForce products. Not supported on Quadro and Tesla C-class products. Requires root/admin permissions.

Changing GOMs requires a reboot. The reboot requirement might be removed in the future.

Compute-only GOMs don't support graphics acceleration. Under Windows, switching to these GOMs when the pending driver model is WDDM is not supported. See nvmlDeviceSetDriverModel.

See also:

nvmlGpuOperationMode_t

nvmlDeviceGetGpuOperationMode

nvmlReturn_t nvmlDeviceSetMemoryLockedClocks ( nvmlDevice_t device, unsigned int minMemClockMHz, unsigned int maxMemClockMHz )

Parameters

device: The identifier of the target device
minMemClockMHz: Requested minimum memory clock in MHz
maxMemClockMHz: Requested maximum memory clock in MHz

Returns

NVML_SUCCESS if new settings were successfully set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or minMemClockMHz and maxMemClockMHz are not a valid clock combination
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_NOT_SUPPORTED if the device doesn't support this feature
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Set memory clocks that the device will lock to.

Sets the device's memory clocks to a value in the range minMemClockMHz to maxMemClockMHz. This setting supersedes application clock values and takes effect regardless of whether a CUDA app is running. See nvmlDeviceSetApplicationsClocks.

Can be used as a setting to request constant performance.

Requires root/admin permissions.

After system reboot or driver reload applications clocks go back to their default value. See nvmlDeviceResetMemoryLockedClocks.

For Ampere or newer fully supported devices.

nvmlReturn_t nvmlDeviceSetPersistenceMode ( nvmlDevice_t device, nvmlEnableState_t mode )

Parameters

device: The identifier of the target device
mode: The target persistence mode

Returns

NVML_SUCCESS if the persistence mode was set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or mode is invalid
NVML_ERROR_NOT_SUPPORTED if the device does not support this feature
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Set the persistence mode for the device.

For all products. For Linux only. Requires root/admin permissions.

The persistence mode determines whether the GPU driver software is torn down after the last client exits.

This operation takes effect immediately. It is not persistent across reboots. After each reboot the persistence mode is reset to "Disabled".

See nvmlEnableState_t for available modes.

After calling this API with mode set to NVML_FEATURE_DISABLED on a device that has its own NUMA memory, the given device handle will no longer be valid, and to continue to interact with this device, a new handle should be obtained from one of the nvmlDeviceGetHandleBy*() APIs. This limitation is currently only applicable to devices that have a coherent NVLink connection to system memory.

See also:

nvmlDeviceGetPersistenceMode()

nvmlReturn_t nvmlDeviceSetPowerManagementLimit ( nvmlDevice_t device, unsigned int limit )

Parameters

device: The identifier of the target device
limit: Power management limit in milliwatts to set

Returns

NVML_SUCCESS if limit has been set
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid or limit is out of range
NVML_ERROR_NOT_SUPPORTED if the device does not support this feature
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_UNKNOWN on any unexpected error

Description

Set new power limit of this device.

For Kepler or newer fully supported devices. Requires root/admin permissions.

See nvmlDeviceGetPowerManagementLimitConstraints to check the allowed ranges of values.
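Since the limit is given in milliwatts and must fall within the allowed range, a tool that accepts watts from its user may want to convert and clamp the value first. The sketch below assumes the minimum and maximum have already been obtained from nvmlDeviceGetPowerManagementLimitConstraints. The helper name clamp_power_limit_mw is hypothetical, and clamping (rather than rejecting) an out-of-range request is a design choice of this example, not NVML behavior.

```c
#include <assert.h>

/* Hypothetical helper: convert a requested limit in watts to milliwatts and
 * clamp it to [minMw, maxMw]. The comparison against maxMw / 1000 is done
 * before multiplying, so the conversion cannot overflow. */
unsigned int clamp_power_limit_mw(unsigned int requestedWatts,
                                  unsigned int minMw, unsigned int maxMw)
{
    unsigned int requestedMw;

    assert(minMw <= maxMw);

    /* requestedWatts * 1000 would exceed maxMw: cap at the maximum. */
    if (requestedWatts > maxMw / 1000u)
        return maxMw;

    requestedMw = requestedWatts * 1000u;
    return requestedMw < minMw ? minMw : requestedMw;
}
```

The returned value is what would be passed as the limit argument of nvmlDeviceSetPowerManagementLimit. A stricter tool could instead report an error when the request lies outside the range.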

Note:

The limit is not persistent across reboots or driver unloads. Enable persistence mode to prevent the driver from unloading when no application is using the device.

See also:

nvmlDeviceGetPowerManagementDefaultLimit
