4.25. vGPU Migration
This chapter describes operations that are associated with vGPU Migration.
Functions
- nvmlReturn_t nvmlDeviceGetPgpuMetadataString ( nvmlDevice_t device, char* pgpuMetadata, unsigned int* bufferSize )
- nvmlReturn_t nvmlDeviceGetVgpuMetadata ( nvmlDevice_t device, nvmlVgpuPgpuMetadata_t* pgpuMetadata, unsigned int* bufferSize )
- nvmlReturn_t nvmlDeviceGetVgpuSchedulerCapabilities ( nvmlDevice_t device, nvmlVgpuSchedulerCapabilities_t* pCapabilities )
- nvmlReturn_t nvmlDeviceGetVgpuSchedulerLog ( nvmlDevice_t device, nvmlVgpuSchedulerLog_t* pSchedulerLog )
- nvmlReturn_t nvmlDeviceGetVgpuSchedulerState ( nvmlDevice_t device, nvmlVgpuSchedulerGetState_t* pSchedulerState )
- nvmlReturn_t nvmlDeviceSetVgpuSchedulerState ( nvmlDevice_t device, nvmlVgpuSchedulerSetState_t* pSchedulerState )
- nvmlReturn_t nvmlGetVgpuCompatibility ( nvmlVgpuMetadata_t* vgpuMetadata, nvmlVgpuPgpuMetadata_t* pgpuMetadata, nvmlVgpuPgpuCompatibility_t* compatibilityInfo )
- nvmlReturn_t nvmlGetVgpuVersion ( nvmlVgpuVersion_t* supported, nvmlVgpuVersion_t* current )
- nvmlReturn_t nvmlSetVgpuVersion ( nvmlVgpuVersion_t* vgpuVersion )
- nvmlReturn_t nvmlVgpuInstanceGetMetadata ( nvmlVgpuInstance_t vgpuInstance, nvmlVgpuMetadata_t* vgpuMetadata, unsigned int* bufferSize )
Enumerations
- enum nvmlVgpuPgpuCompatibilityLimitCode_t
vGPU-pGPU compatibility limit codes
Values
- NVML_VGPU_COMPATIBILITY_LIMIT_NONE = 0x0
- Compatibility is not limited.
- NVML_VGPU_COMPATIBILITY_LIMIT_HOST_DRIVER = 0x1
- Compatibility is limited by host driver version.
- NVML_VGPU_COMPATIBILITY_LIMIT_GUEST_DRIVER = 0x2
- Compatibility is limited by guest driver version.
- NVML_VGPU_COMPATIBILITY_LIMIT_GPU = 0x4
- Compatibility is limited by GPU hardware.
- NVML_VGPU_COMPATIBILITY_LIMIT_OTHER = 0x80000000
- Compatibility is limited by an undefined factor.
- enum nvmlVgpuVmCompatibility_t
vGPU VM compatibility codes
Values
- NVML_VGPU_VM_COMPATIBILITY_NONE = 0x0
- vGPU is not runnable
- NVML_VGPU_VM_COMPATIBILITY_COLD = 0x1
- vGPU is runnable from a cold / powered-off state (ACPI S5)
- NVML_VGPU_VM_COMPATIBILITY_HIBERNATE = 0x2
- vGPU is runnable from a hibernated state (ACPI S4)
- NVML_VGPU_VM_COMPATIBILITY_SLEEP = 0x4
- vGPU is runnable from a sleep state (ACPI S3)
- NVML_VGPU_VM_COMPATIBILITY_LIVE = 0x8
- vGPU is runnable from a live/paused state (ACPI S0)
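The VM compatibility values above are distinct powers of two, so they read as bit flags. The following is a minimal sketch, assuming the codes are combined into a single mask; the DEMO_ constants are local copies of the listed values and the helper name is hypothetical, so that the sketch compiles without nvml.h.

```c
#include <string.h>

/* Local copies of the values listed above. The DEMO_ prefix is hypothetical;
   real code would include nvml.h and use the NVML_VGPU_VM_COMPATIBILITY_* names. */
#define DEMO_VM_COMPAT_COLD      0x1u
#define DEMO_VM_COMPAT_HIBERNATE 0x2u
#define DEMO_VM_COMPAT_SLEEP     0x4u
#define DEMO_VM_COMPAT_LIVE      0x8u

/* Writes a comma-separated list of the states set in mask into out
   ("none" if no bit is set). Returns the number of states found. */
int demo_describe_vm_compat(unsigned int mask, char *out, size_t outSize)
{
    static const struct { unsigned int bit; const char *name; } table[] = {
        { DEMO_VM_COMPAT_COLD,      "cold" },
        { DEMO_VM_COMPAT_HIBERNATE, "hibernate" },
        { DEMO_VM_COMPAT_SLEEP,     "sleep" },
        { DEMO_VM_COMPAT_LIVE,      "live" },
    };
    int count = 0;
    size_t i;

    if (out == NULL || outSize == 0)
        return 0;
    out[0] = '\0';
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (mask & table[i].bit) {
            if (count > 0)
                strncat(out, ",", outSize - strlen(out) - 1);
            strncat(out, table[i].name, outSize - strlen(out) - 1);
            count++;
        }
    }
    if (count == 0)
        strncat(out, "none", outSize - strlen(out) - 1);
    return count;
}
```

Under that assumption, a mask of 0x9 would be described as "cold,live".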
Functions
- nvmlReturn_t nvmlDeviceGetPgpuMetadataString ( nvmlDevice_t device, char* pgpuMetadata, unsigned int* bufferSize )
Parameters
- device
- The identifier of the target device
- pgpuMetadata
- Pointer to caller-supplied buffer into which pgpuMetadata is written
- bufferSize
- Pointer to size of pgpuMetadata buffer
Returns
- NVML_SUCCESS GPU metadata string was successfully returned
- NVML_ERROR_INSUFFICIENT_SIZE pgpuMetadata buffer is too small; the required size is returned in bufferSize
- NVML_ERROR_INVALID_ARGUMENT if bufferSize is NULL or device is invalid, or if pgpuMetadata is NULL and the value of bufferSize is not 0
- NVML_ERROR_NOT_SUPPORTED if vGPU is not supported by the system
- NVML_ERROR_UNKNOWN on any unexpected error
Description
Returns the properties of the physical GPU indicated by device in an ASCII-encoded string format.
The caller passes in a buffer via pgpuMetadata, with the size of the buffer in bufferSize. If the string is too large to fit in the supplied buffer, the function returns NVML_ERROR_INSUFFICIENT_SIZE with the size needed in bufferSize.
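The buffer contract described above suggests a two-call pattern: query the required size first, then allocate and fetch. The sketch below illustrates this against a stub; demo_get_metadata_string, the DEMO_ return codes, and the payload text are all hypothetical stand-ins so that the example compiles without nvml.h. It also assumes that a NULL buffer with a size of 0 yields the insufficient-size result, which the argument rules above imply but do not state outright.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the NVML return codes used here. */
typedef enum {
    DEMO_SUCCESS = 0,
    DEMO_ERROR_INVALID_ARGUMENT,
    DEMO_ERROR_INSUFFICIENT_SIZE
} demo_return_t;

/* Stub that imitates the documented buffer contract. A real caller would
   invoke nvmlDeviceGetPgpuMetadataString(device, buf, &size) instead. */
demo_return_t demo_get_metadata_string(char *buf, unsigned int *bufferSize)
{
    static const char payload[] = "demo-pgpu-metadata"; /* made-up content */
    unsigned int needed = (unsigned int)sizeof payload;  /* includes the NUL */

    if (bufferSize == NULL || (buf == NULL && *bufferSize != 0))
        return DEMO_ERROR_INVALID_ARGUMENT;
    if (*bufferSize < needed) {
        *bufferSize = needed; /* report the size the caller must supply */
        return DEMO_ERROR_INSUFFICIENT_SIZE;
    }
    memcpy(buf, payload, needed);
    return DEMO_SUCCESS;
}

/* Queries the size with an empty buffer, allocates, then fetches.
   Returns a malloc'd string the caller must free, or NULL on failure. */
char *demo_fetch_metadata_string(void)
{
    unsigned int size = 0;
    char *buf;

    if (demo_get_metadata_string(NULL, &size) != DEMO_ERROR_INSUFFICIENT_SIZE)
        return NULL;
    buf = malloc(size);
    if (buf == NULL)
        return NULL;
    if (demo_get_metadata_string(buf, &size) != DEMO_SUCCESS) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

The same size-then-fetch shape would apply to the other functions in this chapter that take a bufferSize pointer.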
- nvmlReturn_t nvmlDeviceGetVgpuMetadata ( nvmlDevice_t device, nvmlVgpuPgpuMetadata_t* pgpuMetadata, unsigned int* bufferSize )
Parameters
- device
- The identifier of the target device
- pgpuMetadata
- Pointer to caller-supplied buffer into which pgpuMetadata is written
- bufferSize
- Pointer to size of pgpuMetadata buffer
Returns
- NVML_SUCCESS GPU metadata structure was successfully returned
- NVML_ERROR_INSUFFICIENT_SIZE pgpuMetadata buffer is too small; the required size is returned in bufferSize
- NVML_ERROR_INVALID_ARGUMENT if bufferSize is NULL or device is invalid, or if pgpuMetadata is NULL and the value of bufferSize is not 0
- NVML_ERROR_NOT_SUPPORTED vGPU is not supported by the system
- NVML_ERROR_UNKNOWN on any unexpected error
Description
Returns a vGPU metadata structure for the physical GPU indicated by device. The structure contains information about the GPU and the currently installed NVIDIA host driver version that controls it, together with an opaque data section containing internal state.
The caller passes in a buffer via pgpuMetadata, with the size of the buffer in bufferSize. If the pgpuMetadata structure is too large to fit in the supplied buffer, the function returns NVML_ERROR_INSUFFICIENT_SIZE with the size needed in bufferSize.
- nvmlReturn_t nvmlDeviceGetVgpuSchedulerCapabilities ( nvmlDevice_t device, nvmlVgpuSchedulerCapabilities_t* pCapabilities )
Parameters
- device
- The identifier of the target device
- pCapabilities
- Pointer to the caller-supplied structure into which the scheduler capabilities are written
Returns
- NVML_SUCCESS vGPU scheduler capabilities were successfully obtained
- NVML_ERROR_INVALID_ARGUMENT if pCapabilities is NULL or device is invalid
- NVML_ERROR_NOT_SUPPORTED if the API is not supported in the current state or the device is not in vGPU host mode
- NVML_ERROR_UNKNOWN on any unexpected error
Description
Returns the vGPU scheduler capabilities. The supported scheduler policies reported in nvmlVgpuSchedulerCapabilities_t are drawn from the NVML_VGPU_SCHEDULER_POLICY_* values. This list, and the other values in nvmlVgpuSchedulerCapabilities_t, apply when the engine is of Graphics type; other engine types use the BEST EFFORT policy. If ARR is supported and enabled, the scheduling frequency and averaging factor are applicable; otherwise, timeSlice is applicable.
For Pascal or newer fully supported devices.
- nvmlReturn_t nvmlDeviceGetVgpuSchedulerLog ( nvmlDevice_t device, nvmlVgpuSchedulerLog_t* pSchedulerLog )
Parameters
- device
- The identifier of the target device
- pSchedulerLog
- Pointer to the caller-allocated structure into which the scheduler log is written
Returns
- NVML_SUCCESS vGPU scheduler logs were successfully obtained
- NVML_ERROR_INVALID_ARGUMENT if pSchedulerLog is NULL or device is invalid
- NVML_ERROR_NOT_SUPPORTED if the API is not supported in the current state or the device is not in vGPU host mode
- NVML_ERROR_UNKNOWN on any unexpected error
Description
Returns the vGPU software scheduler logs. pSchedulerLog points to a caller-allocated structure that receives the logs. The number of elements returned never exceeds NVML_SCHEDULER_SW_MAX_LOG_ENTRIES.
To capture the complete log without gaps, call the function at least 5 times a second.
For Pascal or newer fully supported devices.
- nvmlReturn_t nvmlDeviceGetVgpuSchedulerState ( nvmlDevice_t device, nvmlVgpuSchedulerGetState_t* pSchedulerState )
Parameters
- device
- The identifier of the target device
- pSchedulerState
- Pointer to the caller-supplied structure into which the scheduler state is written
Returns
- NVML_SUCCESS vGPU scheduler state was successfully obtained
- NVML_ERROR_INVALID_ARGUMENT if pSchedulerState is NULL or device is invalid
- NVML_ERROR_NOT_SUPPORTED if the API is not supported in the current state or the device is not in vGPU host mode
- NVML_ERROR_UNKNOWN on any unexpected error
Description
Returns the vGPU scheduler state. The information returned in nvmlVgpuSchedulerGetState_t is not relevant if the BEST EFFORT policy is set.
For Pascal or newer fully supported devices.
- nvmlReturn_t nvmlDeviceSetVgpuSchedulerState ( nvmlDevice_t device, nvmlVgpuSchedulerSetState_t* pSchedulerState )
Parameters
- device
- The identifier of the target device
- pSchedulerState
- Pointer to the vGPU scheduler state to set
Returns
- NVML_SUCCESS vGPU scheduler state has been successfully set
- NVML_ERROR_INVALID_ARGUMENT if pSchedulerState is NULL or device is invalid
- NVML_ERROR_RESET_REQUIRED if setting pSchedulerState failed with a fatal error; a reboot is required to recover from this error
- NVML_ERROR_NOT_SUPPORTED if the API is not supported in the current state, the device is not in vGPU host mode, or any vGPU instance currently exists on the device
- NVML_ERROR_UNKNOWN on any unexpected error
Description
Sets the vGPU scheduler state.
For Pascal or newer fully supported devices.
The scheduler state change does not persist across module load/unload. The scheduler state and parameters can be set only when no VM is running. In nvmlVgpuSchedulerSetState_t, if and only if enableARRMode is enabled, provide avgFactorForARR and frequency as input. If enableARRMode is disabled, provide timeslice as input.
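The either/or rule for the inputs can be sketched with a small tagged structure. This is an illustration only: demo_sched_state_t is a hypothetical simplification that reuses the field names mentioned above, not the actual layout of nvmlVgpuSchedulerSetState_t, and the numeric values in the usage note are arbitrary.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for the set-state input. Only one
   member of params is meaningful, selected by enableARRMode. */
typedef struct {
    unsigned int enableARRMode; /* 1 = ARR enabled, 0 = ARR disabled */
    union {
        struct { unsigned int avgFactorForARR; unsigned int frequency; } arr;
        struct { unsigned int timeslice; } fixed;
    } params;
} demo_sched_state_t;

/* ARR enabled: supply the averaging factor and frequency, no timeslice. */
demo_sched_state_t demo_sched_with_arr(unsigned int avgFactorForARR,
                                       unsigned int frequency)
{
    demo_sched_state_t s;
    memset(&s, 0, sizeof s);
    s.enableARRMode = 1;
    s.params.arr.avgFactorForARR = avgFactorForARR;
    s.params.arr.frequency = frequency;
    return s;
}

/* ARR disabled: supply only the timeslice. */
demo_sched_state_t demo_sched_without_arr(unsigned int timeslice)
{
    demo_sched_state_t s;
    memset(&s, 0, sizeof s);
    s.enableARRMode = 0;
    s.params.fixed.timeslice = timeslice;
    return s;
}
```

For example, demo_sched_with_arr(10, 63) fills in only the ARR inputs, while demo_sched_without_arr(2000) fills in only the timeslice. Building the request through one of the two constructors keeps a caller from supplying both sets of inputs at once.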
- nvmlReturn_t nvmlGetVgpuCompatibility ( nvmlVgpuMetadata_t* vgpuMetadata, nvmlVgpuPgpuMetadata_t* pgpuMetadata, nvmlVgpuPgpuCompatibility_t* compatibilityInfo )
Parameters
- vgpuMetadata
- Pointer to caller-supplied vGPU metadata structure
- pgpuMetadata
- Pointer to caller-supplied GPU metadata structure
- compatibilityInfo
- Pointer to caller-supplied buffer to hold compatibility info
Returns
- NVML_SUCCESS vGPU compatibility information was successfully returned
- NVML_ERROR_INVALID_ARGUMENT if vgpuMetadata, pgpuMetadata, or compatibilityInfo is NULL
- NVML_ERROR_UNKNOWN on any unexpected error
Description
Takes a vGPU instance metadata structure read from nvmlVgpuInstanceGetMetadata(), and a vGPU metadata structure for a physical GPU read from nvmlDeviceGetVgpuMetadata(), and returns compatibility information of the vGPU instance and the physical GPU.
The caller passes in a buffer via compatibilityInfo, into which a compatibility information structure is written. The structure defines the states in which the vGPU / VM may be booted on the physical GPU. If the vGPU / VM compatibility with the physical GPU is limited, a limit code indicates the factor limiting compatibility (see nvmlVgpuPgpuCompatibilityLimitCode_t for details).
Note: vGPU compatibility does not take into account dynamic capacity conditions that may limit a system's ability to boot a given vGPU or associated VM.
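A caller deciding whether a live migration can proceed might interpret the result along these lines. This is a minimal sketch, assuming the compatibility structure exposes the VM compatibility states and the limit code as two bit masks; demo_compat_info_t, its field names, and the helper functions are hypothetical, and the DEMO_ constants are local copies of the enumeration values so that the sketch compiles without nvml.h.

```c
/* Local copies of the enumeration values in this chapter (hypothetical names). */
#define DEMO_VM_COMPAT_COLD      0x1u
#define DEMO_VM_COMPAT_LIVE      0x8u
#define DEMO_LIMIT_NONE          0x0u
#define DEMO_LIMIT_HOST_DRIVER   0x1u
#define DEMO_LIMIT_GUEST_DRIVER  0x2u
#define DEMO_LIMIT_GPU           0x4u
#define DEMO_LIMIT_OTHER         0x80000000u

/* Hypothetical stand-in for the compatibility information structure. */
typedef struct {
    unsigned int vmCompatibility; /* mask of VM compatibility bits */
    unsigned int limitCode;       /* mask of limit code bits */
} demo_compat_info_t;

/* Returns 1 if the vGPU is runnable from a live/paused state, else 0. */
int demo_supports_live(const demo_compat_info_t *info)
{
    return (info->vmCompatibility & DEMO_VM_COMPAT_LIVE) != 0;
}

/* Returns a description of the first limiting factor found,
   checking the known codes in ascending order. */
const char *demo_first_limit(const demo_compat_info_t *info)
{
    if (info->limitCode == DEMO_LIMIT_NONE)
        return "none";
    if (info->limitCode & DEMO_LIMIT_HOST_DRIVER)
        return "host driver version";
    if (info->limitCode & DEMO_LIMIT_GUEST_DRIVER)
        return "guest driver version";
    if (info->limitCode & DEMO_LIMIT_GPU)
        return "GPU hardware";
    return "undefined factor";
}
```

As the note above points out, a positive result here would still not guarantee that the target host has the capacity to boot the vGPU.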
- nvmlReturn_t nvmlGetVgpuVersion ( nvmlVgpuVersion_t* supported, nvmlVgpuVersion_t* current )
Parameters
- supported
- Pointer to the structure in which the preset range of vGPU versions supported by the NVIDIA vGPU Manager is written
- current
- Pointer to the structure in which the range of supported vGPU versions set by an administrator is written
Returns
- NVML_SUCCESS The vGPU version range structures were successfully obtained.
- NVML_ERROR_NOT_SUPPORTED The API is not supported.
- NVML_ERROR_INVALID_ARGUMENT The supported parameter or the current parameter is NULL.
- NVML_ERROR_UNKNOWN An error occurred while the data was being fetched.
Description
Query the ranges of supported vGPU versions.
This function gets the linear range of supported vGPU versions that is preset for the NVIDIA vGPU Manager and the range set by an administrator. If the preset range has not been overridden by nvmlSetVgpuVersion, both ranges are the same.
The caller passes pointers to the following nvmlVgpuVersion_t structures, into which the NVIDIA vGPU Manager writes the ranges:
1. supported structure that represents the preset range of vGPU versions supported by the NVIDIA vGPU Manager.
2. current structure that represents the range of supported vGPU versions set by an administrator. By default, this range is the same as the preset range.
- nvmlReturn_t nvmlSetVgpuVersion ( nvmlVgpuVersion_t* vgpuVersion )
Parameters
- vgpuVersion
- Pointer to a caller-supplied range of supported vGPU versions.
Returns
- NVML_SUCCESS The preset range of supported vGPU versions was successfully overridden.
- NVML_ERROR_NOT_SUPPORTED The API is not supported.
- NVML_ERROR_IN_USE The range was not overridden because a VM is running on the host.
- NVML_ERROR_INVALID_ARGUMENT The vgpuVersion parameter is NULL or specifies a range that is outside the range supported by the NVIDIA vGPU Manager.
Description
Override the preset range of vGPU versions supported by the NVIDIA vGPU Manager with a range set by an administrator.
This function configures the NVIDIA vGPU Manager with a range of supported vGPU versions set by an administrator. This range must be a subset of the preset range that the NVIDIA vGPU Manager supports. The custom range set by an administrator takes precedence over the preset range and is advertised to the guest VM for negotiating the vGPU version. See nvmlGetVgpuVersion for details of how to query the preset range of versions supported.
This function takes a pointer to vGPU version range structure nvmlVgpuVersion_t as input to override the preset vGPU version range that the NVIDIA vGPU Manager supports.
After host system reboot or driver reload, the range of supported versions reverts to the range that is preset for the NVIDIA vGPU Manager.
Note:
1. The range set by the administrator must be a subset of the preset range that the NVIDIA vGPU Manager supports. Otherwise, an error is returned.
2. If the range of supported guest driver versions does not overlap the range set by the administrator, the guest driver fails to load.
3. If the range of supported guest driver versions overlaps the range set by the administrator, the guest driver loads with a negotiated vGPU version that is the maximum value in the overlapping range.
4. No VM may be running on the host when this function is called. If a VM is running on the host, the call to this function fails.
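The subset and negotiation rules in the note can be sketched as plain range arithmetic. The code below is an illustration under stated assumptions: demo_version_range_t is a local stand-in that models a range as inclusive minimum and maximum version numbers, and both helper names are hypothetical.

```c
/* Hypothetical stand-in for a vGPU version range: inclusive min and max. */
typedef struct {
    unsigned int minVersion;
    unsigned int maxVersion;
} demo_version_range_t;

/* Returns 1 if inner is a well-formed range lying entirely within outer. */
int demo_is_subset(const demo_version_range_t *inner,
                   const demo_version_range_t *outer)
{
    return inner->minVersion <= inner->maxVersion
        && inner->minVersion >= outer->minVersion
        && inner->maxVersion <= outer->maxVersion;
}

/* If the administrator's range and the guest driver's range overlap, stores
   the maximum version in the overlap in *negotiated and returns 1.
   Returns 0 when there is no overlap (the guest driver would fail to load). */
int demo_negotiate(const demo_version_range_t *admin,
                   const demo_version_range_t *guest,
                   unsigned int *negotiated)
{
    unsigned int lo = admin->minVersion > guest->minVersion
                    ? admin->minVersion : guest->minVersion;
    unsigned int hi = admin->maxVersion < guest->maxVersion
                    ? admin->maxVersion : guest->maxVersion;

    if (lo > hi)
        return 0;
    *negotiated = hi;
    return 1;
}
```

With a preset range of 5 to 12 and an administrator range of 7 to 10, demo_is_subset accepts the override, and a guest driver supporting 9 to 14 would negotiate version 10. The version numbers are made up for illustration.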
- nvmlReturn_t nvmlVgpuInstanceGetMetadata ( nvmlVgpuInstance_t vgpuInstance, nvmlVgpuMetadata_t* vgpuMetadata, unsigned int* bufferSize )
Parameters
- vgpuInstance
- vGPU instance handle
- vgpuMetadata
- Pointer to caller-supplied buffer into which vGPU metadata is written
- bufferSize
- Pointer to size of vgpuMetadata buffer
Returns
- NVML_SUCCESS vGPU metadata structure was successfully returned
- NVML_ERROR_INSUFFICIENT_SIZE vgpuMetadata buffer is too small; the required size is returned in bufferSize
- NVML_ERROR_INVALID_ARGUMENT if bufferSize is NULL or vgpuInstance is 0, or if vgpuMetadata is NULL and the value of bufferSize is not 0
- NVML_ERROR_NOT_FOUND if vgpuInstance does not match a valid active vGPU instance on the system
- NVML_ERROR_UNKNOWN on any unexpected error
Description
Returns a vGPU metadata structure for a running vGPU. The structure contains information about the vGPU and its associated VM, such as the currently installed NVIDIA guest driver version, together with the host driver version and an opaque data section containing internal state.
nvmlVgpuInstanceGetMetadata() may be called at any time for a vGPU instance. Some fields in the returned structure are dependent on information obtained from the guest VM, which may not yet have reached a state where that information is available. The current state of these dependent fields is reflected in the info structure's nvmlVgpuGuestInfoState_t field.
The VMM may choose to read and save the vGPU's VM info as persistent metadata associated with the VM, and provide it to Virtual GPU Manager when creating a vGPU for subsequent instances of the VM.
The caller passes in a buffer via vgpuMetadata, with the size of the buffer in bufferSize. If the vGPU Metadata structure is too large to fit in the supplied buffer, the function returns NVML_ERROR_INSUFFICIENT_SIZE with the size needed in bufferSize.