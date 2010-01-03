Description

After the Virtual GPU Manager for VMware vSphere Hypervisor (ESXi) has been updated by using vSphere Lifecycle Management (vLCM), vLCM fails to validate the host compliance. As a result, the vLCM patching and automation process fails.



Status

Resolved in NVIDIA vGPU software 16.10



Ref. #

5205011

Description

XID error 120 can put GPUs based on the Ada Lovelace architecture into a bad state. As a result, the VM to which the GPU is assigned might become unstable and the hypervisor host might crash. When this issue occurs, XID error 120 messages are written to the log files on the hypervisor host.



Status

Resolved in NVIDIA vGPU software 16.11



Ref. #

5137781

4973227

Description

On Windows 11 VMs with more than 1 TB of system memory, GPU device unavailable errors (Error 43) occur. This issue affects NVIDIA vGPU and GPU pass through deployments.



Version

This issue affects Windows 11 guest VMs.



Workaround

Limit the amount of system memory assigned to the VM to less than 1 TB.



Status

Open



Ref. #

5115698

Description

In air-gapped environments where root certificates are not available on the host machine, timestamps cannot be verified. As a result, the NVIDIA vGPU software graphics driver fails to create the default client configuration token folder on Windows (%SystemDrive%\Program Files\NVIDIA Corporation\vGPU Licensing\ClientConfigToken). If the folder is created manually and the client configuration token is copied there, the client fails to obtain a license. Typically, root certificates are imported by Windows updates from the Microsoft Trusted Root Program.



Workaround

Determine whether the NVIDIA Authenticode signature certificate and the timestamp signature certificate are installed and, if not, install them.

To determine whether the root NVIDIA Authenticode signature certificate is installed:

Context-click the file and click the Digital Signatures tab.

In the Signature list, select the NVIDIA certificate and click Details.

Click View Certificate, then click Certification Path. The root certificate that is needed appears at the top of the certification path.

Run the certmgr.msc command and in the certmgr window that opens, expand Trusted Root Certification Authorities and click Certificates to see whether the certificate that you identified in the previous step is installed.

To determine whether the root timestamp signature certificate is installed:

Context-click the file and click the Digital Signatures tab.

In the Signature list, select the NVIDIA certificate and click Details.

In the Countersignatures section, click the timestamp authority, for example, Digicert or Entrust, then click Details below the countersignature section.

Click View Certificate, then click Certification Path. The root certificate that is needed appears at the top of the certification path.

Run the certmgr.msc command and in the certmgr window that opens, expand Trusted Root Certification Authorities and click Certificates to see whether the certificate that you identified in the previous step is installed.

Root certificates for both Digicert and Entrust are required for timestamping and can be downloaded from the following websites:

Status

Not an NVIDIA bug



Ref. #

4684895

Description

An IOMMU fault causes a purple screen crash with XID error 32 on hypervisor hosts with GPUs based on the NVIDIA Hopper and NVIDIA Ada Lovelace GPU architectures.



Status

Resolved in NVIDIA vGPU software 16.10



Ref. #

4998685

Description

On Windows VMs configured with NVIDIA vGPU, the performance of the CATIA application is degraded if the Optimize CGR for large assembly visualization setting is enabled.



Workaround

In NVIDIA Control Panel, on the Manage 3D settings page, set Threaded optimization to Off.



Status

Open



Ref. #

4480745

Description

In Windows 11 24H2 guest VMs, the display is driven in Omnissa Horizon sessions by the Omnissa Horizon Indirect Display Driver (IDD) instead of the NVIDIA vGPU software graphics driver. This issue does not cause any visual corruption. However, OpenGL applications run at 30 fps instead of 60 fps, and pages for controlling the settings of multiple displays are missing from NVIDIA Control Panel.



Version

This issue affects only Omnissa Horizon with Windows 11 24H2 guest VMs. To find out which Omnissa Horizon versions support Windows 11 24H2, refer to Omnissa Knowledge Base Article: Supported Windows 10 and Windows 11 Guest Operating Systems for Horizon Agent and Remote Experience, for Omnissa Horizon 8.x (2006 and Later) (78714).



Workaround

Status

Not an NVIDIA bug



Ref. #

4923798

Description

Disabling or disconnecting a display can cause a Timeout Detection and Recovery (TDR) error on a Windows VM that is configured with NVIDIA vGPU. The TDR error might cause a VM crash or intermittent black screens with remoting solutions such as Omnissa Horizon. When this error occurs, TDR error, XID error 44, and XID error 109 messages are written to the log file on the hypervisor host.



Status

Resolved in NVIDIA vGPU software 16.8

After upgrading to a release in which this issue is resolved, you must set a registry key value in the guest VM for the vGPU to avoid this issue.

Obtain the driver key of the vGPU.

In Device Manager, expand Display Device, context-click the vGPU and from the menu that pops up, choose Properties.

In the Properties window that opens, click the Details tab and in the Property drop-down list, select Driver key.

Open the Registry Editor and navigate to the Windows registry key Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\driver-key, where driver-key is the driver key for the vGPU that you got in the previous step, for example, {4d36e968-e325-11ce-bfc1-08002be10318}\0001.

Add the following registry value to this registry key:

Name: R550_4558671

Type: REG_DWORD

Data: 1

Restart the guest VM to reload the graphics driver.
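The registry value described above can also be written out as a .reg fragment for review before it is applied. The following Python sketch is illustrative only: the function name build_reg_fragment is hypothetical, and the driver key in the example is the sample value from the steps above, which may differ from the key on your system.

```python
CLASS_ROOT = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class"


def build_reg_fragment(driver_key: str, name: str = "R550_4558671", data: int = 1) -> str:
    """Return .reg file text that sets a REG_DWORD value under the vGPU's class key.

    driver_key is the value shown as "Driver key" in Device Manager.
    """
    if not 0 <= data <= 0xFFFFFFFF:
        raise ValueError("REG_DWORD data must fit in 32 bits")
    key_path = CLASS_ROOT + "\\" + driver_key.strip("\\")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key_path}]\n"
        f'"{name}"=dword:{data:08x}\n'
    )


if __name__ == "__main__":
    # Example driver key taken from the steps above; substitute your own.
    print(build_reg_fragment(r"{4d36e968-e325-11ce-bfc1-08002be10318}\0001"))
```

The sketch only produces text. Importing the fragment and restarting the guest VM remain manual steps.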

Ref. #

4558671

Description

XID error 120 causes multiple issues with VMs configured with NVIDIA vGPU on a physical GPU that includes a GPU System Processor (GSP), such as GPUs based on the NVIDIA Ada Lovelace GPU architecture. Examples of these issues include:

VMs hang or crash

VMs fail to power on after hanging or crashing.

The hypervisor host crashes.

Status

Resolved in NVIDIA vGPU software 16.6



Ref. #

4600308

Description

When multiple VMs configured with NVIDIA vGPU are shut down simultaneously, XID error 119 causes the hypervisor host to hang or crash. This issue affects VMs configured with NVIDIA vGPU on a physical GPU that includes a GPU System Processor (GSP), such as GPUs based on the NVIDIA Ada Lovelace GPU architecture.



Status

Resolved in NVIDIA vGPU software 16.6



Ref. #

4644559

Description

A VM that is configured with NVIDIA vGPU on any NVIDIA RTX Ada graphics card, such as the NVIDIA RTX 6000 Ada and NVIDIA RTX 5000 Ada, fails to boot. When this issue occurs, the error message vmiop-display unable to reserve vgpu is written to the log files for the VM on the hypervisor host. This issue occurs because an issue with VMware vSphere Hypervisor (ESXi) prevents the hypervisor software from parsing the names of the virtual GPU types for these cards.



Version

This issue affects only the VMware vSphere Hypervisor (ESXi) 8.0U2 General Availability (GA) release. Other VMware vSphere Hypervisor (ESXi) releases that NVIDIA vGPU software supports are not affected.



Status

Not an NVIDIA bug



Ref. #

4293546

Description

After a VM that is configured with NVIDIA vGPU is migrated, the vGPU is disabled and the remote desktop session is disconnected from the VM. When this issue occurs, the following error messages are written to the log files on the hypervisor host.

XID error 69

Timeout detection and recovery (TDR) failures

Status

Resolved in NVIDIA vGPU software 16.3



Ref. #

4302812

Description

When run in a guest VM to which an NVIDIA vGPU or physical GPU has been assigned, the nvidia-smi -q command shows the status of the vGPU or physical GPU as Display Active: Disabled even if the vGPU or physical GPU is functioning correctly. In this situation, the status shown by nvidia-smi -q does not indicate an issue and can be ignored.



Status

Resolved in NVIDIA vGPU software 16.3



Ref. #

4437545

Description

After the hypervisor host is rebooted, a black screen occurs if an attempt is made to connect to a VM configured with NVIDIA vGPU. This issue affects only GPUs based on the Ampere GPU architecture and later GPU architectures. When this issue occurs, the following error messages are written to the log files on the hypervisor host.

XID error 38

XID error 43

XID error 109

vGPU Message 22

Timeout detection and recovery (TDR) failures

Status

Resolved in NVIDIA vGPU software 16.3



Ref. #

4399699

Description

Users might experience poor graphics quality on a Windows VM that is configured with a vGPU on a Tesla T4 GPU. This issue can cause random pixelation on the entire screen, or only on some patches of the screen. No errors are reported or written to the log files when this issue occurs.



Workaround

Contact NVIDIA Enterprise Support for assistance with a workaround for this issue.



Status

Resolved in NVIDIA vGPU software 16.4



Ref. #

3973158

Description

After the NVIDIA vGPU software graphics driver for Windows is installed, the NVIDIA Control Panel app might be missing from the system. This issue typically occurs in the following situations:

Multiple users connect to virtual machines by using remote desktop applications such as Microsoft RDP, Omnissa Horizon, and Citrix Virtual Apps and Desktops.

VM instances are created by using Citrix Machine Creation Services (MCS) or VMware Instant Clone technology.

Roaming user desktop profiles are deployed.

This issue occurs because the NVIDIA Control Panel app is now distributed through the Microsoft Store. The NVIDIA Control Panel app might fail to be installed when the NVIDIA vGPU software graphics driver for Windows is installed if the Microsoft Store app is disabled, the system is not connected to the Internet, or installation of apps from the Microsoft Store is blocked by your system settings.

To determine whether the NVIDIA Control Panel app is installed on your system, use the Windows Settings app or the Get-AppxPackage Windows PowerShell command.

To use the Windows Settings app:

From the Windows Start menu, choose Settings > Apps > Apps & features.

In the Apps & features window, type nvidia control panel in the search box and confirm that the NVIDIA Control Panel app is found.

To use the Get-AppxPackage Windows PowerShell command:

Run Windows PowerShell as Administrator.

Determine whether the NVIDIA Control Panel app is installed for the current user.

PS C:\> Get-AppxPackage -Name NVIDIACorp.NVIDIAControlPanel

Determine whether the NVIDIA Control Panel app is installed for all users.

PS C:\> Get-AppxPackage -AllUsers -Name NVIDIACorp.NVIDIAControlPanel

This example shows that the NVIDIA Control Panel app is installed for the users Administrator, pliny, and trajan.

PS C:\> Get-AppxPackage -AllUsers -Name NVIDIACorp.NVIDIAControlPanel

Name                   : NVIDIACorp.NVIDIAControlPanel
Publisher              : CN=D6816951-877F-493B-B4EE-41AB9419C326
Architecture           : X64
ResourceId             :
Version                : 8.1.964.0
PackageFullName        : NVIDIACorp.NVIDIAControlPanel_8.1.964.0_x64__56jybvy8sckqj
InstallLocation        : C:\Program Files\WindowsApps\NVIDIACorp.NVIDIAControlPanel_8.1.964.0_x64__56jybvy8sckqj
IsFramework            : False
PackageFamilyName      : NVIDIACorp.NVIDIAControlPanel_56jybvy8sckqj
PublisherId            : 56jybvy8sckqj
PackageUserInformation : {S-1-12-1-530092550-1307989247-1105462437-500 [Administrator]: Installed,
                         S-1-12-1-530092550-1307989247-1105462437-1002 [pliny]: Installed,
                         S-1-12-1-530092550-1307989247-1105462437-1003 [trajan]: Installed}
IsResourcePackage      : False
IsBundle               : False
IsDevelopmentMode      : False
NonRemovable           : False
IsPartiallyStaged      : False
SignatureKind          : Store
Status                 : Ok
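When the output of Get-AppxPackage -AllUsers is saved to a text file for many VMs, the per-user install state can be extracted with a short script. The Python sketch below is an assumption-laden illustration: the function name installed_users is hypothetical, and the pattern is based only on the PackageUserInformation layout in the example above, which might differ on other Windows releases.

```python
import re
from typing import List

# Matches entries such as "S-1-12-1-...-500 [Administrator]: Installed".
_USER_STATE = re.compile(r"\[(?P<user>[^\]]+)\]\s*:\s*(?P<state>\w+)")


def installed_users(appx_output: str) -> List[str]:
    """Return the user names whose package state is reported as Installed."""
    return [
        match.group("user")
        for match in _USER_STATE.finditer(appx_output)
        if match.group("state") == "Installed"
    ]


# PackageUserInformation field copied from the example output above.
SAMPLE = (
    "PackageUserInformation : {S-1-12-1-530092550-1307989247-1105462437-500 "
    "[Administrator]: Installed , S-1-12-1-530092550-1307989247-1105462437-1002 "
    "[pliny]: Installed , S-1-12-1-530092550-1307989247-1105462437-1003 "
    "[trajan]: Installed }"
)

if __name__ == "__main__":
    print(installed_users(SAMPLE))
```

An empty result for a given VM suggests that the app is not installed for any user there, in which case the prevention and workaround steps that follow apply.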



Preventing this Issue

Since 16.3: If your system does not allow the installation of apps from the Microsoft Store, download and run the standalone NVIDIA Control Panel installer that is available from NVIDIA Licensing Portal. For instructions, refer to Virtual GPU Software User Guide.

If your system can allow the installation of apps from the Microsoft Store, ensure that:

The Microsoft Store app is enabled.

Installation of Microsoft Store apps is not blocked by your system settings.

No local or group policies are set to block Microsoft Store apps.

Workaround

If the NVIDIA Control Panel app is missing, install it separately from the graphics driver.

Since 16.3: You can install the NVIDIA Control Panel app by downloading and running the standalone NVIDIA Control Panel installer that is available from NVIDIA Licensing Portal. For instructions, refer to Virtual GPU Software User Guide.

16.0-16.2 only: For a system that is running Windows 11 or a modern version of Windows 10, you can install the NVIDIA Control Panel app by using the winget command-line tool of Windows Package Manager.

Note: The winget command-line tool is not available on the Windows Server OS.

Before using the winget command-line tool to install the NVIDIA Control Panel app, ensure that the following prerequisites are met:

Your system is connected to the Internet.

The Microsoft Store app is enabled.

Packages on which winget depends, such as Microsoft.UI.Xaml and Microsoft.VCLibs.x64, are installed.

To use the winget command-line tool to install the NVIDIA Control Panel app, run the following command:

PS C:\> winget install "NVIDIA Control Panel" --id 9NF8H0H7WMLT -s msstore --accept-package-agreements --accept-source-agreements

For information about how to download and use the latest winget version, refer to Use the winget tool to install and manage applications on the Microsoft documentation site.

If the issue persists, contact NVIDIA Enterprise Support for further assistance.



Status

Open



Ref. #

3999308

Description

The NVIDIA Enterprise Management Toolkit (NVWMI) functions for faking Extended Display Identification Data (EDID), namely fakeEDID, fakeEDIDAll, and fakeEDIDOnPort, have no effect. This issue affects only Windows guest VMs and can prevent a VM from being enabled with multiple displays. When this issue occurs, unable to fake EDID events can be seen in Event Viewer.



Status

Resolved in NVIDIA vGPU software 16.2



Ref. #

4309888

Description

Windows Server 2022 guest VMs support only a maximum of nine Remote Desktop Protocol (RDP) sessions. An attempt to launch a 10th session on a Windows Server 2022 guest VM fails. When this issue occurs, the following error messages are logged.

2023-08-21T22:55:40.279Z Er(02) vthread-3390694 - vmiop_log: (0x0): Cannot use virtual context buffers in sysmem
2023-08-21T22:55:40.279Z Er(02) vthread-3390694 - vmiop_log: (0x0): Invalid promote context input
2023-08-21T22:55:40.279Z Er(02) vthread-3390694 - vmiop_log: (0x0): VGPU message 111 failed, result code: 0x1f

Version

This issue affects only Windows Server 2022 guest VMs that are configured with NVIDIA vGPU.



Status

Resolved in NVIDIA vGPU software 16.2

Resolution of this issue increases the maximum number of RDP sessions to 16. Issues similar to this issue might still occur if the channels allocated to a vGPU are exhausted. For more information, refer to Issues occur when the channels allocated to a vGPU are exhausted.



Ref. #

4242693

Description

While the nvidia-bug-report.sh script is running on the hypervisor host to capture configuration data for a bug report, the following error message is displayed:

sysctl: cannot stat /proc/sys/vm/compaction_proactiveness: No such file or directory

Workaround

Ignore this message as it is benign. The bug report is generated correctly.



Status

Resolved in NVIDIA vGPU software 16.1



Ref. #

4052185

Description

If GPU System Processor (GSP) firmware is disabled, the NVIDIA Virtual GPU Manager incorrectly identifies the brand of the NVIDIA L40 GPU. This incorrect identification of the GPU brand might cause performance degradation with some applications that are optimised for features of the NVIDIA L40 that are not available in the incorrect brand. However, the output from the nvidia-smi command is not affected.

This issue occurs only if GPU System Processor (GSP) firmware is disabled. It does not occur if GSP firmware is enabled.



Status

Resolved in NVIDIA vGPU software 16.1



Ref. #

4142288

Description

On all supported Windows Server guest OS releases, NVIDIA Control Panel crashes if a user session is disconnected and then reconnected while NVIDIA Control Panel is open.



Version

This issue affects all supported Windows Server guest OS releases.



Status

Open



Ref. #

4086605

Description

When multiple VMs that are configured with VMware vSGA are powered on simultaneously, an Input-Output Memory Management Unit (IOMMU) fault causes a purple screen crash. This issue does not affect VMs that are configured with NVIDIA vGPU.



Workaround

Power on each VMware vSGA VM separately. Do not power on multiple VMware vSGA VMs simultaneously.



Status

Open



Ref. #

3688024

Description

Graphics applications are corrupted on Windows VMs that are configured with one or more vGPUs that are based on the NVIDIA Ampere or NVIDIA Ada Lovelace GPU architecture.



Status

Resolved in NVIDIA vGPU software 16.1



Ref. #

3641947

Description

A VM that has been assigned multiple fractional vGPUs from the same physical GPU hangs or becomes inaccessible during installation of the NVIDIA vGPU software graphics driver in the VM. This issue affects only GPUs based on the NVIDIA Turing and NVIDIA Volta GPU architectures. This issue does not occur if the VM has been assigned multiple fractional vGPUs from different physical GPUs.



Version

This issue affects only GPUs based on the NVIDIA Turing and NVIDIA Volta GPU architectures.



Status

Open



Ref. #

4020171

Description

NVIDIA CUDA Toolkit profilers cannot gather hardware metrics on NVIDIA vGPU. This issue affects only traces that gather hardware metrics. Other traces are not affected by this issue and work normally.



Version

This issue affects NVIDIA vGPU software releases starting with 15.2.



Status

Open



Ref. #

4041169

Description

After the NVIDIA vGPU software graphics driver for Windows has been installed in the guest VM, the driver sends a remote call to ngx.download.nvidia.com to download and install additional components. Such a remote call might be a security issue.



Workaround

Before running the NVIDIA vGPU software graphics driver installer, disable the remote call to ngx.download.nvidia.com by setting the following Windows registry key:

[HKEY_LOCAL_MACHINE\SOFTWARE\NVIDIA Corporation\Global\NGXCore]
"EnableOTA"=dword:00000000

Note: If this Windows registry key is set to 1 or deleted, the remote call to ngx.download.nvidia.com is enabled again.





Status

Open



Ref. #

4031840

Description

Multiple RDP session reconnections in a Windows Server 2022 guest VM can consume all the frame buffer of a vGPU or physical GPU. When this issue occurs, users' screens become black, their sessions are disconnected but left intact, and they cannot log on again. The following error message is written to the event log on the hypervisor host:

The Desktop Window Manager process has exited. (Process exit code: 0xe0464645, Restart count: 1, Primary display device ID: )

Version

This issue affects only the Windows Server 2022 guest OS.



Workaround

Periodically restart the Windows Server 2022 guest VM to prevent all frame buffer from being consumed.



Status

Open



Ref. #

3583766

Description

A VM to which multiple legacy fractional vGPUs on the same physical GPU are assigned fails to boot. A fractional vGPU is assigned only a fraction of the physical GPU's frame buffer. A legacy NVIDIA vGPU does not support single root I/O virtualization (SR-IOV). When this issue occurs, error messages similar to the following examples are written to the vmware.log file on the hypervisor host:

2022-11-23T09:01:06.643Z In(05) vmx - VMIOP: Registered device 0000:da:00.0
...
2022-11-23T09:01:06.715Z In(05) vmx - VMIOP: Failed to register device 0000:da:00.0 error = Failure

Status

Not an NVIDIA bug



Ref. #

3879209

Description

A licensed client of NVIDIA License System (NLS) fails to acquire a license with the error The allowed time to process response has expired. This error can affect clients of a Cloud License Service (CLS) instance or a Delegated License Service (DLS) instance.

This error occurs when the time difference between the system clocks on the client and the server that hosts the CLS or DLS instance is greater than 10 minutes. A common cause of this error is the failure of either the client or the server to adjust its system clock when daylight savings time begins or ends. The failure to acquire a license is expected to prevent clock windback from causing licensing errors.
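The 10-minute limit described above can be checked before a client requests a license. The following Python sketch is a minimal illustration, not part of NVIDIA License System: the function name skew_exceeds_limit is hypothetical, and how the client and server timestamps are collected is left as an assumption outside the sketch.

```python
from datetime import datetime, timedelta, timezone

# Limit stated in the issue description above.
MAX_SKEW = timedelta(minutes=10)


def skew_exceeds_limit(client_utc: datetime, server_utc: datetime,
                       limit: timedelta = MAX_SKEW) -> bool:
    """Return True if the two clocks differ by more than the limit, in either direction."""
    return abs(client_utc - server_utc) > limit


if __name__ == "__main__":
    server = datetime(2024, 3, 10, 12, 0, tzinfo=timezone.utc)
    # A clock that missed a daylight saving adjustment is off by one hour.
    client = server + timedelta(hours=1)
    print(skew_exceeds_limit(client, server))
```

Comparing both timestamps in UTC avoids false alarms when the client and server are in different time zones.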



Workaround

Ensure that the system clock time of the client and any server that hosts a DLS instance match the current time in the time zone where they are located.

To prevent this error from occurring when daylight savings time begins or ends, enable the option to automatically adjust the system clock for daylight savings time:

Windows: Set the Adjust for daylight saving time automatically option.

Linux: Use the hwclock command.

Status

Not a bug



Ref. #

3859889

Description

In an environment with multiple active desktop sessions, the Manage License page of NVIDIA Control Panel shows that a licensed system is unlicensed. However, the nvidia-smi command and the management interface of the NVIDIA vGPU software license server correctly show that the system is licensed. When an active session is disconnected and reconnected, the NVIDIA Display Container service crashes.

The Manage License page incorrectly shows that the system is unlicensed because of stale data in NVIDIA Control Panel in an environment with multiple sessions. The data is stale because NVIDIA Control Panel fails to get and update the settings for remote sessions when multiple sessions or no sessions are active in the VM. The NVIDIA Display Container service crashes when a session is reconnected because the session is not active at the moment of reconnection.



Status

Open



Ref. #

3761243

Description

VP9 and AV1 decoding with web browsers are not supported on Microsoft Windows Server 2019 and later supported releases. This issue occurs because starting with Windows Server 2019, the required codecs are not included with the OS and are not available through the Microsoft Store app. As a result, hardware decoding is not available for viewing YouTube videos or using collaboration tools such as Google Meet in a web browser.



Version

This issue affects Microsoft Windows Server releases starting with Windows Server 2019.



Status

Not an NVIDIA bug



Ref. #

200756564

Description

After a second NVIDIA vGPU device is added to a Microsoft Windows Server 2016 VM, the device does not appear in the output from the nvidia-smi command. This issue occurs only if the VM is already running NVIDIA vGPU software for the existing NVIDIA vGPU device when the second device is added to the VM.

The nvidia-smi command cannot retrieve the guest driver version, license status, and accounting mode of the second NVIDIA vGPU device.

nvidia-smi vgpu --query
GPU 00000000:37:00.0
    Active vGPUs                      : 1
    vGPU ID                           : 3251695793
        VM ID                         : 3575923
        VM Name                       : SVR-Reg-W(P)-KuIn
        vGPU Name                     : GRID V100D-32Q
        vGPU Type                     : 185
        vGPU UUID                     : 29097249-2359-11b2-8a5b-8e896866496b
        Guest Driver Version          : 537.13
        License Status                : Licensed
        Accounting Mode               : Disabled
...
GPU 00000000:86:00.0
    Active vGPUs                      : 1
    vGPU ID                           : 3251695797
        VM ID                         : 3575923
        VM Name                       : SVR-Reg-W(P)-KuIn
        vGPU Name                     : GRID V100D-32Q
        vGPU Type                     : 185
        vGPU UUID                     : 2926dd83-2359-11b2-8b13-5f22f0f74801
        Guest Driver Version          : Not Available
        License Status                : N/A
        Accounting Mode               : N/A
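To spot this symptom across several hosts, saved output from nvidia-smi vgpu --query can be scanned for vGPUs whose guest driver version is reported as Not Available. The Python sketch below is a rough illustration: the function name vgpus_missing_driver_info is hypothetical, and it assumes one "key : value" field per line with the vGPU UUID field appearing before the Guest Driver Version field, as in the example above.

```python
from typing import List, Optional


def vgpus_missing_driver_info(query_output: str) -> List[str]:
    """Return the vGPU UUIDs whose guest driver version is reported as Not Available."""
    missing: List[str] = []
    current_uuid: Optional[str] = None
    for line in query_output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key : value" line
        key, value = key.strip(), value.strip()
        if key == "vGPU UUID":
            current_uuid = value
        elif key == "Guest Driver Version" and value == "Not Available" and current_uuid:
            missing.append(current_uuid)
    return missing


# Abbreviated from the example output above.
SAMPLE = """\
GPU 00000000:37:00.0
    vGPU ID                   : 3251695793
        vGPU UUID             : 29097249-2359-11b2-8a5b-8e896866496b
        Guest Driver Version  : 537.13
GPU 00000000:86:00.0
    vGPU ID                   : 3251695797
        vGPU UUID             : 2926dd83-2359-11b2-8b13-5f22f0f74801
        Guest Driver Version  : Not Available
"""

if __name__ == "__main__":
    print(vgpus_missing_driver_info(SAMPLE))
```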

Version

This issue affects only VMs that are running Microsoft Windows Server 2016 as a guest OS.



Workaround

To avoid this issue, configure the guest VM with both NVIDIA vGPU devices before installing the NVIDIA vGPU software graphics driver.

If you encounter this issue after the VM is configured, use one of the following workarounds:

Reinstall the NVIDIA vGPU software graphics driver.

Forcibly uninstall the Microsoft Basic Display Adapter and reboot the VM.

Upgrade the guest OS on the VM to Microsoft Windows Server 2019.

Status

Not an NVIDIA bug



Ref. #

3562801

Description

After the NVIDIA vGPU software graphics driver for Linux is upgraded from an RPM package in a licensed VM, licensing fails. The nvidia-smi vgpu -q command shows the driver version and license status as N/A. Restarting the nvidia-gridd service fails with a Unit not found error.



Workaround

Perform a clean installation of the NVIDIA vGPU software graphics driver for Linux from an RPM package.

Remove the currently installed driver.

Install the new version of the driver.

$ rpm -iv nvidia-linux-grid-525_535.274.02_amd64.rpm

Status

Open



Ref. #

3512766

Description

After the NVIDIA vGPU software graphics driver for Linux is upgraded from a Debian package, the driver is not loaded into the VM.



Workaround

Use one of the following workarounds to load the driver into the VM:

Reboot the VM.

Status

Not a bug



Ref. #

200748806

Description

When a VM configured with a Tesla V100 or Tesla T4 vGPU is migrated between a host running an NVIDIA vGPU software 14 release and a host running an NVIDIA vGPU software 13 release, the remote desktop session freezes. After the session freezes, the VM must be rebooted to recover the session. This issue occurs only when the NVIDIA hardware-based H.264/HEVC video encoder (NVENC) is enabled.



Version

The issue affects migrations between a host running an NVIDIA vGPU software 14 release and a host running an NVIDIA vGPU software 13 release.



Workaround

Disable NVENC.



Status

Open



Ref. #

3512790

Description

When multiple application instances are launched on a legacy vGPU that is allocated only a fraction of the physical GPU's frame buffer, the application or VM to which the vGPU is assigned crashes. A legacy NVIDIA vGPU does not support single root I/O virtualization (SR-IOV). This issue does not affect NVIDIA vGPUs that support SR-IOV.

The symptoms of this issue depend on the release of VMware vSphere Hypervisor (ESXi).

With VMware vSphere Hypervisor (ESXi) 7.0.3 and later releases, the application crashes but the guest VM remains accessible. When this issue occurs, the following error message is written to the vmware.log file:

vmiop_log: (0x0): VGPU message 7 failed

With VMware vSphere Hypervisor (ESXi) releases before 7.0.3, the guest VMX process crashes. When this issue occurs, the following error message is written to the vmware.log file in the host VMFS datastore folder for the VM:

E105: PANIC: PhysMem: creating too many Global lookups.

This issue occurs when the plugin for legacy NVIDIA vGPUs creates more BAR1 mappings than VMware vSphere Hypervisor (ESXi) allows a VM to create. These mappings depend on the number and type of applications running in the VM.



Workaround

A workaround is available for the following GPUs, all of which have a large physical BAR1 memory size:

Quadro RTX 6000 Passive

Quadro RTX 8000 Passive

Tesla P6

Tesla P40

Tesla P100 (all variants)

Tesla V100 (all variants)

Note: This workaround is not available for other GPUs that are affected by this issue.

To employ this workaround, set the vGPU plugin parameter pciPassthru0.cfg.plugin_managed_bar1_va_override to 1.



Status

Open



Ref. #

200680865

Description

Only one VM configured with NVIDIA vGPU can be powered on with VMware vSphere Hypervisor (ESXi) 7.0.3. Any attempt to power on a second VM fails with the following error message:

Insufficient resources. At least one device (pcipassthru0) required for VM vm-name is not available on host. host-name

This issue occurs because the release of VMware vCenter Server is incompatible with VMware vSphere Hypervisor (ESXi) 7.0.3. Only VMware vCenter Server 7.0.3 is compatible with VMware vSphere Hypervisor (ESXi) 7.0.3.



Version

VMware vSphere Hypervisor (ESXi) 7.0.3



Workaround

Upgrade VMware vCenter Server to release 7.0.3 to match the release of VMware vSphere Hypervisor (ESXi).



Status

Not an NVIDIA bug



Ref. #

3419013

Description

The frame rate in frames per second (FPS) for the NVIDIA hardware-based H.264/HEVC video encoder (NVENC) reported by the nvidia-smi encodersessions command and NVWMI is double the actual frame rate. Only the reported frame rate is incorrect. The actual encoding of frames is not affected.

This issue affects only Windows VMs that are configured with NVIDIA vGPU.



Status

Open



Ref. #

2997564

Description

After a second vGPU is added to a VM and the VM is restarted, the VM fails. NVIDIA vGPU software supports up to a maximum of 16 vGPUs per VM on VMware vSphere Hypervisor (ESXi).

When this issue occurs, the following messages are written to the log file on the hypervisor host:

2021-09-27T17:11:42.303Z| vthread-2105551| | I005: vmiop_log: (0x0): Start restoring vGPU state
...
2021-09-27T17:11:43.465Z| vcpu-0| | E002: vmiop_log: (0x0): Deferred restore for RPCs cannot continue, since restore data was not saved
2021-09-27T17:11:43.465Z| vcpu-0| | E002: vmiop_log: (0x0): Deferred call for vmiopd_restore_rpc_data failed at un-stun!
2021-09-27T17:11:43.465Z| vcpu-0| | E002: vmiop_log: (0x0): Failed to complete restore for deferred functions.
2021-09-27T18:44:27.034Z| vthread-2105550| | E002: vmiop_log: (0x0): VGPU message 1 failed, guest VGX version is already initialized...
2021-09-27T18:44:27.034Z| vthread-2105550| | E002: vmiop_log: (0x0): VGPU message 1 failed, result code: 0x40
...
2021-09-27T18:44:35.359Z| vthread-2105550| | I005: vmiop_log: (0x0): Guest driver unloaded!





Workaround

To avoid this issue, create your VMs in EFI mode.

If you encounter this issue with a VM that was created in legacy BIOS mode, shut down and restart the VM or power off the VM and power it on again.



Status

Not an NVIDIA bug



Ref. #

3386681

Description

The NVIDIA hardware-based H.264/HEVC video encoder (NVENC) does not work with Teradici Cloud Access Software on Windows. This issue affects NVIDIA vGPU and GPU pass through deployments.

This issue occurs because the check that Teradici Cloud Access Software performs on the DLL signer name is case sensitive and NVIDIA recently changed the case of the company name in the signature certificate.



Status

Not an NVIDIA bug

This issue is resolved in the latest 21.07 and 21.03 Teradici Cloud Access Software releases.



Ref. #

200749065

Description

When a user logs out of a VM deployed by using Omnissa Horizon instant clone technology, the VM is deleted and the OS is not shut down cleanly. The NVIDIA vGPU software license that was being used by the VM is not returned to the license server, which could cause the license server to run out of licenses.



Workaround

Deploy the instant-clone desktop pool with the following options:

Floating user assignment

All Machines Up-Front provisioning

This configuration will allow the MAC address to be reused on the newly cloned VMs.

For more information, refer to the documentation for the version of Omnissa Horizon or VMware Horizon that you are using.

Status

Not an NVIDIA bug



Ref. #

200744338

Description

If a proxy is set with a system environment variable such as HTTP_PROXY or HTTPS_PROXY, a licensed client might fail to acquire a license.



Workaround

Perform this workaround on each affected licensed client.

1. Add the address of the NVIDIA vGPU software license server to the system environment variable NO_PROXY. The address must be specified exactly as it is specified in the client's license server settings, either as a fully-qualified domain name or an IP address. If the NO_PROXY environment variable contains multiple entries, separate the entries with a comma (,). If high availability is configured for the license server, add the addresses of the primary license server and the secondary license server to the system environment variable NO_PROXY.

2. Restart the NVIDIA driver service that runs the core NVIDIA vGPU software logic.

On Windows, restart the NVIDIA Display Container service.

On Linux, restart the nvidia-gridd service.
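The comma-separated format that the workaround requires for NO_PROXY can be sketched as follows. This is only an illustration of how entries are merged: the function name and the example host names are hypothetical, and the code does not set the variable on the system or restart any service.

```python
def add_to_no_proxy(current: str, addresses: list[str]) -> str:
    """Return a NO_PROXY value that keeps existing entries and adds the
    license server addresses, separated by commas."""
    # Split the existing value on commas and drop empty or padded entries.
    entries = [entry.strip() for entry in current.split(",") if entry.strip()]
    for address in addresses:
        # Add each address exactly as given; skip it if already present.
        if address not in entries:
            entries.append(address)
    return ",".join(entries)


if __name__ == "__main__":
    # Hypothetical primary and secondary license servers in a
    # high-availability configuration.
    print(add_to_no_proxy(
        "localhost,127.0.0.1",
        ["gridlicense1.example.com", "gridlicense2.example.com"],
    ))
```

The addresses must match the form used in the client's license server settings, so the sketch deliberately does no normalization beyond trimming whitespace.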

Status

Closed



Ref. #

200704733

Description

Desktop session connections fail for a 2Q, 3Q, or 4Q vGPU that is configured with four 4K displays and for which the NVIDIA hardware-based H.264/HEVC video encoder (NVENC) is enabled. This issue affects only Teradici Cloud Access Software sessions on Linux guest VMs.

This issue is accompanied by the following error message:

This Desktop has no resources available or it has timed out

This issue is caused by insufficient frame buffer.



Workaround

Ensure that sufficient frame buffer is available for all the virtual displays that are connected to a vGPU by changing the configuration in one of the following ways:

Reducing the number of virtual displays. The number of 4K displays supported with NVENC enabled depends on the vGPU.

vGPU    4K Displays Supported with NVENC Enabled
2Q      1
3Q      2
4Q      3

Disabling NVENC. The number of 4K displays supported with NVENC disabled depends on the vGPU.

vGPU    4K Displays Supported with NVENC Disabled
2Q      2
3Q      2
4Q      4

Using a vGPU type with more frame buffer. Four 4K displays with NVENC enabled on any Q-series vGPU with at least 6144 MB of frame buffer are supported.
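As a rough aid for checking a planned configuration against the display limits listed in this workaround, the values can be put in a lookup table. The numbers below are copied from the two tables in this workaround; the function names are hypothetical and the sketch covers only the 2Q, 3Q, and 4Q vGPU types listed there.

```python
# Maximum number of 4K displays, keyed by (vGPU type, NVENC enabled).
# Values are copied from the tables in this workaround.
MAX_4K_DISPLAYS = {
    ("2Q", True): 1, ("3Q", True): 2, ("4Q", True): 3,
    ("2Q", False): 2, ("3Q", False): 2, ("4Q", False): 4,
}


def max_4k_displays(vgpu_type: str, nvenc_enabled: bool) -> int:
    """Look up the supported number of 4K displays for a listed vGPU type."""
    return MAX_4K_DISPLAYS[(vgpu_type, nvenc_enabled)]


def configuration_fits(vgpu_type: str, nvenc_enabled: bool, displays: int) -> bool:
    """Return True if the requested number of 4K displays is within the limit."""
    return displays <= max_4k_displays(vgpu_type, nvenc_enabled)


if __name__ == "__main__":
    # Four 4K displays on a 4Q vGPU: over the limit with NVENC enabled,
    # within the limit with NVENC disabled.
    print(configuration_fits("4Q", True, 4))
    print(configuration_fits("4Q", False, 4))
```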

Status

Not an NVIDIA bug



Ref. #

200701959

Description

Disconnected sessions cannot be reconnected or might be reconnected very slowly when the NVIDIA Enterprise Management Toolkit (NVWMI) is installed. This issue affects Citrix Virtual Apps and Desktops and Omnissa Horizon sessions on Windows guest VMs.



Workaround

Ensure that the NVWMI service is disabled.

Note: By default, NVWMI is disabled in the NVIDIA vGPU software graphics driver.

Status

Not a bug



Ref. #

3262923

Description

When the NVIDIA vGPU software graphics driver in a Windows VM is upgraded with the Custom (Advanced) option selected, the VM crashes.

Status

Open



Ref. #

200700291

Description

An otherwise correctly configured VMware vSphere ESXi 7.0 Update 2 server fails to boot VMs with vGPUs on GPUs based on the NVIDIA Ampere architecture if the server is being managed by a version of VMware vCenter Server older than 7.0.2. This version of VMware vCenter Server is released with VMware vSphere ESXi 7.0 Update 2.

When this issue occurs, the following error message is seen:

Insufficient resources. One or more devices (pciPassthru0) required by VM vm-name are not available on host host-name





Workaround

Use VMware vCenter Server 7.0.2 or a later compatible update.



Status

Open

Description

When a Linux VM configured with a Tesla V100 or Tesla T4 vGPU is migrated from a host that is running a vGPU manager 11 release before 11.6 to a host that is running a vGPU manager 13 release, the VM hangs. After the migration, the destination host and VM become unstable. When this issue occurs, XID error 31 is written to the log files on the destination hypervisor host.



Version

This issue affects migration from a host that is running a vGPU manager 11 release before 11.6 to a host that is running a vGPU manager 13 release.



Workaround

If the VM is configured with a Tesla T4 vGPU, perform the following sequence of steps before attempting the migration:

1. Upgrade the host that is running a vGPU manager 11 release to release 11.6 or a later vGPU manager 11 release.

2. Disconnect any remoting tool that is using NVENC.

Note: You cannot use this workaround for a VM that is configured with a Tesla V100 vGPU.





Status

Open



Ref. #

200691445

Description

After a Teradici Cloud Access Software session has been idle for a short period of time, the session disconnects from the VM. When this issue occurs, the error messages NVOS status 0x19 and vGPU Message 21 failed are written to the log files on the hypervisor host. This issue affects only Linux guest VMs.



Status

Open



Ref. #

200689126

Description

NVIDIA GPU Operator doesn't support vGPU deployments on GPUs based on architectures before the NVIDIA Turing™ architecture. This issue is caused by the omission of version information for the vGPU manager from the configuration information that GPU Operator requires. Without this information, GPU Operator does not deploy the NVIDIA driver container because the container cannot determine if the driver is compatible with the vGPU manager.



Status

Open



Ref. #

3227576

Description

The nvidia-smi command shows 100% GPU utilization for NVIDIA A100, NVIDIA A40, and NVIDIA A10 GPUs even if no vGPUs have been configured or no VMs are running.

[root@host ~]# nvidia-smi
Fri Oct 10 11:45:28 2025
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.274.03   Driver Version: 535.274.03   CUDA Version: 12.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-PCIE-40GB      On   | 00000000:5E:00.0 Off |                    0 |
| N/A   50C    P0    97W / 250W |      0MiB / 40537MiB |    100%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+





Workaround

Boot any VMs that are configured with a vGPU that resides on the GPU.

After this workaround has been completed, the nvidia-smi command shows 0% GPU utilization for affected GPUs when they are idle.

[root@host ~]# nvidia-smi
Fri Oct 10 11:47:38 2025
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.274.03   Driver Version: 535.274.03   CUDA Version: 12.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-PCIE-40GB      On   | 00000000:5E:00.0 Off |                    0 |
| N/A   50C    P0    97W / 250W |      0MiB / 40537MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+





Status

Open



Ref. #

200605527

Description

Upgrading the NVIDIA vGPU software graphics driver in a Linux guest VM with multiple vGPUs might fail. This issue occurs if the driver is upgraded by overinstalling the new release of the driver on the current release of the driver while the nvidia-gridd service is running in the VM.



Workaround

1. Stop the nvidia-gridd service.

2. Try again to upgrade the driver.

Status

Open



Ref. #

200633548

Description

If NVIDIA licensing information is not configured on the system, any attempt to start NVIDIA Control Panel by right-clicking on the desktop within 30 seconds of the VM being started fails.



Workaround

Restart the VM and wait at least 30 seconds before trying to launch NVIDIA Control Panel.



Status

Open



Ref. #

200623179

Description

When a window is dragged across the desktop in a Citrix Virtual Apps and Desktops session, corruption of the session in the form of residual window borders occurs.



Version

This issue affects only Citrix Virtual Apps and Desktops version 7 2003.



Workaround

Use Citrix Virtual Apps and Desktops version 7 1912 or 2006.



Status

Not an NVIDIA bug



Ref. #

200608675

Description

Some Omnissa Horizon clients cannot connect to a Windows 10 2004 VM with multiple displays. When this issue occurs, the VM becomes unusable and clients cannot connect to the VM even if only a single display is connected to it.

This issue occurs because the desktop capture mechanism for the affected Omnissa Horizon clients is provided by NVIDIA® Frame Buffer Capture (NVFBC) and NVFBC is deprecated on Windows 10 starting with Windows 10 October 2019 Update. For more information, see NVFBC Windows 10 Support Deprecation Technical Bulletin (PDF).



Version

This issue affects only Windows 10 May 2020 Update (2004) guest VMs.



Workaround

Obtain a version of Omnissa Horizon for which the desktop capture mechanism is not provided by NVFBC.



Status

Not an NVIDIA bug



Ref. #

200607827

Description

Suspending a VM configured with vGPU on a host running one version of the vGPU manager and resuming the VM on a host running a version from an older main release branch fails. For example, suspending a VM on a host that is running the vGPU manager from release 16.12 and resuming the VM on a host running the vGPU manager from release 15.4 fails. When this issue occurs, the error One or more devices (pciPassthru0) required by VM vm-name are not available on host host-name is reported on VMware vCenter Server.
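The condition described above depends only on the main release branch, that is, the number before the first dot in the release number. A minimal sketch of a pre-check follows; the function names are hypothetical, and a result of False means only that this particular issue is not expected, not that the resume is guaranteed to succeed.

```python
def main_branch(release: str) -> int:
    """Return the main release branch of a release number such as '16.12'."""
    return int(release.split(".")[0])


def resume_hits_known_issue(suspended_on: str, resumed_on: str) -> bool:
    """Return True if the VM would be resumed on a host whose vGPU manager
    comes from an older main release branch than the host it was suspended on."""
    return main_branch(resumed_on) < main_branch(suspended_on)


if __name__ == "__main__":
    # The example from the description: suspend on 16.12, resume on 15.4.
    print(resume_hits_known_issue("16.12", "15.4"))
```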

Status

Not an NVIDIA bug



Ref. #

200602087

Description

On a Linux VM configured with a -1Q vGPU, one 4K display, and VMware Horizon 7.12, the VMware Horizon session might become unresponsive after a switch from large screen (windowed) to full screen. When this issue occurs, the VMware vSphere VM’s log file contains the error message Unable to set requested topology.



Version

This issue affects deployments that use VMware Horizon 7.12.



Workaround

Use VMware Horizon 7.11.



Status

Open



Ref. #

200617112

Description

On a Linux VM configured with a -1Q vGPU, two 4K displays, and VMware Horizon 7.12, the VMware Horizon session might become unresponsive. When this issue occurs, the VMware vSphere VM’s log file contains the error message Failed to setup capture session (error 8). Unable to allocate video memory.



Version

This issue affects deployments that use VMware Horizon 7.12.



Workaround

Use VMware Horizon 7.11 or a vGPU with more frame buffer.



Status

Open



Ref. #

200617081

Description

On Linux, the frame rate might drop to 1 frame per second (FPS) after NVIDIA vGPU software has been running for several minutes. Only some applications are affected, for example, glxgears. Other applications, such as Unigine Heaven, are not affected. This behavior occurs because Display Power Management Signaling (DPMS) for the Xorg server is enabled by default and the display is detected to be inactive even when the application is running. When DPMS is enabled, it enables power saving behavior of the display after several minutes of inactivity by setting the frame rate to 1 FPS.



Workaround

1. If necessary, stop the Xorg server.

# /etc/init.d/xorg stop

2. In a plain text editor, edit the /etc/X11/xorg.conf file to set the options to disable DPMS and disable the screen saver.

a. In the Monitor section, set the DPMS option to false.

Option "DPMS" "false"

b. At the end of the file, add a ServerFlags section that contains the option to disable the screen saver.

Section "ServerFlags"
    Option "BlankTime" "0"
EndSection

3. Save your changes to the /etc/X11/xorg.conf file and quit the editor.

4. Start the Xorg server.

# /etc/init.d/xorg start
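After editing, it can be useful to confirm that both options from the workaround are present in the file. The following sketch checks the text of an xorg.conf file for the two Option lines; the function names and the Monitor0 identifier in the sample are hypothetical, and the check is a plain pattern match on uncommented lines, not a full xorg.conf parser.

```python
import re

# Match uncommented Option lines, tolerating varying whitespace and case.
DPMS_OFF = re.compile(r'^\s*Option\s+"DPMS"\s+"false"', re.IGNORECASE | re.MULTILINE)
BLANK_OFF = re.compile(r'^\s*Option\s+"BlankTime"\s+"0"', re.IGNORECASE | re.MULTILINE)


def dpms_disabled(xorg_conf: str) -> bool:
    """Return True if the text contains Option "DPMS" "false"."""
    return DPMS_OFF.search(xorg_conf) is not None


def screen_saver_disabled(xorg_conf: str) -> bool:
    """Return True if the text contains Option "BlankTime" "0"."""
    return BLANK_OFF.search(xorg_conf) is not None


# Sample configuration text; "Monitor0" is an illustrative identifier.
SAMPLE = '''
Section "Monitor"
    Identifier "Monitor0"
    Option "DPMS" "false"
EndSection

Section "ServerFlags"
    Option "BlankTime" "0"
EndSection
'''

if __name__ == "__main__":
    print(dpms_disabled(SAMPLE), screen_saver_disabled(SAMPLE))
```

To check a real system, pass the contents of /etc/X11/xorg.conf to both functions.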

Status

Open



Ref. #

200605900

Description

When Omnissa Horizon is used with the Blast Extreme display protocol, frame buffer consumption increases over time after multiple disconnections from and reconnections to a VM. This issue occurs even if the VM is in an idle state and no graphics applications are running.



Workaround

Reboot the VM.



Status

Not an NVIDIA bug



Ref. #

200602520

Description

Desktop Window Manager (DWM) crashes randomly occur in Windows VMs, causing a blue-screen crash and the bug check CRITICAL_PROCESS_DIED. Computer Management shows problems with the primary display device.



Version

This issue affects Windows 10 1809, 1903 and 1909 VMs.



Status

Not an NVIDIA bug



Ref. #

2730037

Description

After multiple VMs configured with vGPU on a single hypervisor host are migrated simultaneously, the remote desktop session freezes with an assertion failure and XID error 43. This issue affects only GPUs that are based on the Volta GPU architecture. It does not occur if only a single VM is migrated.

When this error occurs, the following error messages are logged to the VMware vSphere Hypervisor (ESXi) log file:

Jan 3 14:35:48 ch81-m1 vgpu-12[8050]: error: vmiop_log: NVOS status 0x1f
Jan 3 14:35:48 ch81-m1 vgpu-12[8050]: error: vmiop_log: Assertion Failed at 0x4b8cacf6:286
...
Jan 3 14:35:59 ch81-m1 vgpu-12[8050]: error: vmiop_log: (0x0): XID 43 detected on physical_chid:0x174, guest_chid:0x14
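To find occurrences of this error in a large log file, the XID number can be extracted from vmiop_log lines with a regular expression. This is a minimal sketch that assumes log lines have the same shape as the messages shown above; the function name is hypothetical.

```python
import re

# Matches lines such as "... vmiop_log: (0x0): XID 43 detected on ...".
XID_PATTERN = re.compile(r"vmiop_log:.*\bXID (\d+) detected")


def xids_in_log(lines):
    """Return the XID numbers reported in the given log lines, in order."""
    found = []
    for line in lines:
        match = XID_PATTERN.search(line)
        if match:
            found.append(int(match.group(1)))
    return found


# The log lines from the description above.
SAMPLE = [
    "Jan 3 14:35:48 ch81-m1 vgpu-12[8050]: error: vmiop_log: NVOS status 0x1f",
    "Jan 3 14:35:48 ch81-m1 vgpu-12[8050]: error: vmiop_log: Assertion Failed at 0x4b8cacf6:286",
    "Jan 3 14:35:59 ch81-m1 vgpu-12[8050]: error: vmiop_log: (0x0): XID 43 detected on physical_chid:0x174, guest_chid:0x14",
]

if __name__ == "__main__":
    print(xids_in_log(SAMPLE))
```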





Status

Open



Ref. #

200581703

Description

When a Citrix Virtual Apps and Desktops session that is locked is unlocked by pressing Ctrl+Alt+Del, the session freezes. This issue affects only VMs that are running Microsoft Windows 10 1809 as a guest OS.



Version

Microsoft Windows 10 1809 guest OS



Workaround

Restart the VM.



Status

Not an NVIDIA bug



Ref. #

2767012

Description

After the Linux kernel is upgraded (for example, by running sudo apt full-upgrade) with Dynamic Kernel Module Support (DKMS) enabled, the nvidia-smi command fails to run. If DKMS is enabled, an upgrade to the Linux kernel triggers a rebuild of the NVIDIA vGPU software graphics driver. The rebuild of the driver fails because the compiler version is incorrect. Any attempt to reinstall the driver fails because the kernel fails to build.

When the failure occurs, the following messages are displayed:

-> Installing DKMS kernel module:
ERROR: Failed to run `/usr/sbin/dkms build -m nvidia -v 535.54.03 -k 5.3.0-28-generic`:
Kernel preparation unnecessary for this kernel. Skipping...
Building module:
cleaning build area...
'make' -j8 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.3.0-28-generic IGNORE_CC_MISMATCH='' modules...(bad exit status: 2)
ERROR (dkms apport): binary package for nvidia: 535.54.03 not found
Error! Bad return status for module build on kernel: 5.3.0-28-generic (x86_64)
Consult /var/lib/dkms/nvidia/535.54.03/build/make.log for more information.
-> error.
ERROR: Failed to install the kernel module through DKMS. No kernel module was installed; please try installing again without DKMS, or check the DKMS logs for more information.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.





Workaround

When installing the NVIDIA vGPU software graphics driver with DKMS enabled, use one of the following workarounds:

Before running the driver installer, install the dkms package, then run the driver installer with the -dkms option.

Run the driver installer with the --no-cc-version-check option.

Status

Not a bug



Ref. #

2836271

Description

During installation of the NVIDIA vGPU software graphics driver in a Red Hat Enterprise Linux or CentOS 6 guest VM, a kernel panic occurs, and the VM hangs and cannot be rebooted. This issue is observed on older Linux kernels when the NVIDIA device is using message-signaled interrupts (MSIs).



Version

This issue affects the following guest OS releases:

Red Hat Enterprise Linux 6.6 and later compatible 6.x versions

CentOS 6.6 and later compatible 6.x versions

Workaround

1. Disable MSI in the guest VM to fall back to INTx interrupts by adding the following line to the file /etc/modprobe.d/nvidia.conf:

options nvidia NVreg_EnableMSI=0

If the file /etc/modprobe.d/nvidia.conf does not exist, create it.

2. Install the NVIDIA vGPU software graphics driver in the guest VM.
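When this change is applied to many guest VMs, it helps to add the option line only if it is not already present. The sketch below works on the text of /etc/modprobe.d/nvidia.conf and returns the updated text; the function name is hypothetical, and reading and writing the file (which requires root privileges) is left to the caller.

```python
MSI_OPTION = "options nvidia NVreg_EnableMSI=0"


def with_msi_disabled(conf_text: str) -> str:
    """Return the configuration text with the MSI option line present.

    An empty string stands for a file that does not exist yet.
    """
    lines = conf_text.splitlines()
    # Append the option only if no existing line already matches it.
    if MSI_OPTION not in (line.strip() for line in lines):
        lines.append(MSI_OPTION)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A missing file yields a new file containing only the option line.
    print(with_msi_disabled(""), end="")
```

Applying the function twice gives the same result as applying it once, so it is safe to rerun.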

Status

Closed



Ref. #

200556896

Description

Some servers, for example, the Dell R740, do not configure SR-IOV capability if the SR-IOV SBIOS setting is disabled on the server. If the SR-IOV SBIOS setting is disabled on such a server that is being used with the Tesla T4 GPU, VMware vSphere ESXi enumerates the Tesla T4 as 32 separate GPUs. In this state, you cannot use the GPU to configure a VM with NVIDIA vGPU or for GPU pass through.



Workaround

Ensure that the SR-IOV SBIOS setting is enabled on the server.



Status

Not an NVIDIA bug

A fix is available from VMware in VMware vSphere ESXi 7.0 Update 2.



Ref. #

2697051

Description

When vMotion is used to migrate a VM configured with vGPU to another host, users' sessions may freeze for up to several seconds during the migration.

These factors may increase the length of time for which a session freezes:

Continuous use of the frame buffer by the workload, which typically occurs with workloads such as video streaming

A large amount of vGPU frame buffer

A large amount of system memory

Limited network bandwidth

Workaround

Administrators can mitigate the effects on end users by avoiding migration of VMs configured with vGPU during business hours or warning end users that migration is about to start and that they may experience session freezes.

End users experiencing this issue must wait for their sessions to resume when the migration is complete.



Status

Open



Ref. #

2569578

Description

When a VM configured with vGPU is migrated to another host, the migration stops before it is complete.

This issue occurs if the ECC memory configuration (enabled or disabled) on the source host differs from the configuration on the destination host. The ECC memory configuration on both the source and destination hosts must be identical.



Workaround

Before attempting to migrate the VM again, ensure that the ECC memory configuration on the source and destination hosts is identical.



Status

Not an NVIDIA bug



Ref. #

200520027

Description

The ECC memory settings for a vGPU cannot be changed from a Linux guest VM by using NVIDIA X Server Settings. After the ECC memory state has been changed on the ECC Settings page and the VM has been rebooted, the ECC memory state remains unchanged.



Workaround

Use the nvidia-smi command in the guest VM to enable or disable ECC memory for the vGPU as explained in Virtual GPU Software User Guide.

If the ECC memory state remains unchanged even after you use the nvidia-smi command to change it, use the workaround in Changes to ECC memory settings for a Linux vGPU VM by nvidia-smi might be ignored.



Status

Open



Ref. #

200523086

Description

After the ECC memory state for a Linux vGPU VM has been changed by using the nvidia-smi command and the VM has been rebooted, the ECC memory state might remain unchanged.

This issue occurs when multiple NVIDIA configuration files in the system cause the kernel module option for setting the ECC memory state RMGuestECCState in /etc/modprobe.d/nvidia.conf to be ignored.

When the nvidia-smi command is used to enable ECC memory, the file /etc/modprobe.d/nvidia.conf is created or updated to set the kernel module option RMGuestECCState. Another configuration file in /etc/modprobe.d/ that contains the keyword NVreg_RegistryDwordsPerDevice might cause the kernel module option RMGuestECCState to be ignored.



Workaround

This workaround requires administrator privileges.

1. Move the entry containing the keyword NVreg_RegistryDwordsPerDevice from the other configuration file to /etc/modprobe.d/nvidia.conf.

2. Reboot the VM.
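To locate the other configuration file that the workaround refers to, the files in /etc/modprobe.d/ can be searched for the keyword. The following is a minimal sketch with hypothetical function names; it assumes the relevant files end in .conf and only reports file names, leaving the move itself to the administrator.

```python
from pathlib import Path

KEYWORD = "NVreg_RegistryDwordsPerDevice"


def files_with_keyword(contents: dict) -> list:
    """Given a mapping of file name to file text, return the names of files
    other than nvidia.conf that contain the keyword."""
    return sorted(
        name for name, text in contents.items()
        if name != "nvidia.conf" and KEYWORD in text
    )


def read_modprobe_dir(directory: str = "/etc/modprobe.d") -> dict:
    """Read every *.conf file in the directory into a name-to-text mapping."""
    return {
        path.name: path.read_text(errors="ignore")
        for path in Path(directory).glob("*.conf")
    }
```

On an affected VM, `files_with_keyword(read_modprobe_dir())` lists the files whose entry needs to be moved.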

Status

Open



Ref. #

200505777

Description

When GPU performance is being monitored, host core CPU utilization is higher than expected for moderate workloads. For example, host CPU utilization when only a small number of VMs are running is as high as when several times as many VMs are running.



Workaround

Disable monitoring of the following GPU performance statistics:

vGPU engine usage by applications across multiple vGPUs

Encoder session statistics

Frame buffer capture (FBC) session statistics

Statistics gathered by performance counters in guest VMs

Status

Open



Ref. #

2414897

Description

On 1Q vGPUs with a 4K display, a shortage of frame buffer causes the H.264 encoder to fall back to software encoding.



Workaround

Use a 2Q or larger virtual GPU type to provide more frame buffer for each vGPU.



Status

Open



Ref. #

2422580

Description

On 2Q vGPUs with three or more 4K displays, a shortage of frame buffer causes the H.264 encoder to fall back to software encoding.

This issue affects only vGPUs assigned to VMs that are running a Linux guest OS.



Workaround

Use a 4Q or larger virtual GPU type to provide more frame buffer for each vGPU.



Status

Open



Ref. #

200457177

Description

Because of a known limitation with NvFBC, a frame capture while the interactive logon message is displayed returns a blank screen.

An NvFBC session can capture only screen updates that occur after the session is created. After the logon message is shown, no further screen update occurs. Therefore, if the NvFBC session is created after the message has appeared, NvFBC cannot get a frame to capture and a black screen is returned instead.



Workaround

Press Enter or wait for the screen to update for NvFBC to capture the frame.



Status

Not a bug



Ref. #

2115733

Description

When Windows Server is used as a guest OS, Remote Desktop Services (RDS) sessions do not use the GPU. By default, the RDS sessions use the Microsoft Basic Render Driver instead of the GPU. This default setting enables 2D DirectX applications such as Microsoft Office to use software rendering, which can be more efficient than using the GPU for rendering. However, as a result, 3D applications that use DirectX are prevented from using the GPU.



Version

This issue affects all Windows Server releases that are supported as a guest OS.



Solution

Change the local computer policy to use the hardware graphics adapter for all RDS sessions.

1. Choose Local Computer Policy > Computer Configuration > Administrative Templates > Windows Components > Remote Desktop Services > Remote Desktop Session Host > Remote Session Environment.

2. Set the Use the hardware default graphics adapter for all Remote Desktop Services sessions option.

Description

Migrating a VM configured with vGPU fails gracefully if the VM is running an intensive workload.

The error stack in the task details on the vSphere web client contains the following error message:

The migration has exceeded the maximum switchover time of 100 second(s). ESX has preemptively failed the migration to allow the VM to continue running on the source. To avoid this failure, either increase the maximum allowable switchover time or wait until the VM is performing a less intensive workload.





Workaround

Increase the maximum switchover time by increasing the vmotion.maxSwitchoverSeconds option from the default value of 100 seconds.

For more information, see VMware Knowledge Base Article: vMotion or Storage vMotion of a VM fails with the error: The migration has exceeded the maximum switchover time of 100 second(s) (2141355).



Status

Not an NVIDIA bug



Ref. #

200416700

Description

In a Linux VM, the view session can sometimes freeze after the VM acquires a license.



Workaround

Resize the view session.



Status

Not an NVIDIA bug



Ref. #

200426961

Description

When the scheduling policy is fixed share, GPU engine utilization can be reported as higher than expected for a vGPU.

For example, GPU engine usage for six P40-4Q vGPUs on a Tesla P40 GPU might be reported as follows:

[root@localhost:~] nvidia-smi vgpu
Mon Aug 20 10:33:18 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.42                 Driver Version: 390.42                    |
|-------------------------------+--------------------------------+------------+
| GPU  Name                     | Bus-Id                         | GPU-Util   |
|      vGPU ID    Name          | VM ID    VM Name               | vGPU-Util  |
|===============================+================================+============|
|   0  Tesla P40                | 00000000:81:00.0               |  99%       |
|      85109      GRID P40-4Q   | 85110    win7-xmpl-146048-1    |     32%    |
|      87195      GRID P40-4Q   | 87196    win7-xmpl-146048-2    |     39%    |
|      88095      GRID P40-4Q   | 88096    win7-xmpl-146048-3    |     26%    |
|      89170      GRID P40-4Q   | 89171    win7-xmpl-146048-4    |      0%    |
|      90475      GRID P40-4Q   | 90476    win7-xmpl-146048-5    |      0%    |
|      93363      GRID P40-4Q   | 93364    win7-xmpl-146048-6    |      0%    |
+-------------------------------+--------------------------------+------------+
|   1  Tesla P40                | 00000000:85:00.0               |   0%       |
+-------------------------------+--------------------------------+------------+

The vGPU utilization of vGPU 85109 is reported as 32%. For vGPU 87195, vGPU utilization is reported as 39%. And for 88095, it is reported as 26%. However, the expected vGPU utilization of any vGPU should not exceed approximately 16.7%.

This behavior is a result of the mechanism that is used to measure GPU engine utilization.
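The 16.7% figure follows from dividing the GPU equally among the six vGPUs in the example: 100% / 6 ≈ 16.7%. The short sketch below reproduces that arithmetic and flags the reported values that exceed it; the function name is hypothetical, and the calculation assumes that fixed share gives each vGPU an equal share of the physical GPU.

```python
def fixed_share_ceiling(vgpus_per_gpu: int) -> float:
    """Expected upper bound on vGPU utilization, in percent, when the GPU
    is divided equally among the given number of vGPUs."""
    return 100.0 / vgpus_per_gpu


# Reported vGPU-Util values from the example output, keyed by vGPU ID.
reported = {85109: 32, 87195: 39, 88095: 26, 89170: 0, 90475: 0, 93363: 0}

ceiling = fixed_share_ceiling(len(reported))
# vGPUs whose reported utilization is higher than the expected ceiling.
over = {vgpu_id: pct for vgpu_id, pct in reported.items() if pct > ceiling}

if __name__ == "__main__":
    print(f"Expected ceiling: {ceiling:.1f}%")
    for vgpu_id, pct in over.items():
        print(f"vGPU {vgpu_id} reports {pct}%")
```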



Status

Open



Ref. #

2227591

Description

The command nvidia-smi vgpu -m shows that vGPU migration is supported on all hypervisors, even hypervisors or hypervisor versions that do not support vGPU migration.



Status

Closed



Ref. #

200407230

Description

A GPU resources not available error might occur during VMware instant clone provisioning. On Windows VMs, a Video TDR failure - NVLDDMKM.sys error causes a blue screen crash.

This error occurs when options for VMware Virtual Shared Graphics Acceleration (vSGA) are set for a VM that is configured with NVIDIA vGPU. VMware vSGA is a feature of VMware vSphere that enables multiple virtual machines to share the physical GPUs on ESXi hosts and can be used as an alternative to NVIDIA vGPU.

Depending on the combination of options set, one of the following error messages is seen when the VM is powered on:

Module ‘MKS’ power on failed.
This message is seen when the following options are set:
    Enable 3D support is selected.
    3D Renderer is set to Hardware.
    The graphics type of all GPUs on the ESXi host is Shared Direct.

Hardware GPU resources are not available. The virtual machine will use software rendering.
This message is seen when the following options are set:
    Enable 3D support is selected.
    3D Renderer is set to Automatic.
    The graphics type of all GPUs on the ESXi host is Shared Direct.



Resolution

If you want to use NVIDIA vGPU, unset any options for VMware vSGA that are set for the VM.

1. Ensure that the VM is powered off.

2. Open the vCenter Web UI.

3. In the vCenter Web UI, right-click the VM and choose Edit Settings.

4. Click the Virtual Hardware tab.

5. In the device list, expand the Video card node and de-select the Enable 3D support option.

6. Start the VM.

Status

Not a bug



Ref. #

2369683

Description

Some registry keys are available only with the R390 Virtual GPU Manager, for example, NVreg_IgnoreMMIOCheck. If any keys that are available only with the R390 Virtual GPU Manager are set, the NVIDIA module fails to load after a downgrade from R390 to R384.

When nvidia-smi is run without any arguments to verify the installation, the following error message is displayed:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.





Workaround

Before uninstalling the R390 VIB, clear all parameters of the nvidia module to remove any registry keys that are available only for the R390 Virtual GPU Manager.

# esxcli system module parameters set -p "" -m nvidia





Status

Not an NVIDIA bug



Ref. #

200366884

Description

Pass-through mode on Tesla P40 GPUs and other GPUs based on the Pascal architecture does not work as expected. In some situations, after the VM is powered on, the guest OS crashes or fails to boot.



Workaround

Ensure that your GPUs are configured as described in Requirements for Using GPUs Requiring Large MMIO Space in Pass-Through Mode.



Status

Not a bug



Ref. #

1944539

Description

When windows for 3D applications on Linux are dragged, the frame rate drops substantially and the application runs slowly.

This issue does not affect 2D applications.



Status

Open



Ref. #

1949482

Description

On Red Hat Enterprise Linux 6.8 and 6.9, and CentOS 6.8 and 6.9, a segmentation fault in DBus code causes the nvidia-gridd service to exit.

The nvidia-gridd service uses DBus for communication with NVIDIA X Server Settings to display licensing information through the Manage License page. Disabling the GUI for licensing resolves this issue.

To prevent this issue, the GUI for licensing is disabled by default. You might encounter this issue if you have enabled the GUI for licensing and are using Red Hat Enterprise Linux 6.8 or 6.9, or CentOS 6.8 and 6.9.



Version

Red Hat Enterprise Linux 6.8 and 6.9

CentOS 6.8 and 6.9



Status

Open



Ref. #

200358191

200319854

1895945

Description

By default, the Manage License option is not available in NVIDIA X Server Settings. This option is missing because the GUI for licensing on Linux is disabled by default to work around the issue that is described in A segmentation fault in DBus code causes nvidia-gridd to exit on Red Hat Enterprise Linux and CentOS.



Workaround

This workaround requires sudo privileges.

Note: Do not use this workaround with Red Hat Enterprise Linux 6.8 and 6.9 or CentOS 6.8 and 6.9. To prevent a segmentation fault in DBus code from causing the nvidia-gridd service to exit, the GUI for licensing must be disabled with these OS versions.

If you are licensing a physical GPU for NVIDIA vGPU for Compute, you must use the configuration file /etc/nvidia/gridd.conf.

1. If NVIDIA X Server Settings is running, shut it down.

2. If the /etc/nvidia/gridd.conf file does not already exist, create it by copying the supplied template file /etc/nvidia/gridd.conf.template.

3. As root, edit the /etc/nvidia/gridd.conf file to set the EnableUI option to TRUE.

4. Start the nvidia-gridd service.

# sudo service nvidia-gridd start

When NVIDIA X Server Settings is restarted, the Manage License option is now available.
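The edit to /etc/nvidia/gridd.conf in this workaround can be scripted. The sketch below assumes that the file uses one Key=Value setting per line with # for comments; the function name is hypothetical, the ServerAddress line in the example is illustrative only, and the function operates on the file's text rather than on the file itself.

```python
def set_option(conf_text: str, key: str, value: str) -> str:
    """Return the configuration text with key set to value.

    An existing uncommented setting is replaced in place; otherwise the
    setting is appended. Assumes a Key=Value line format.
    """
    new_line = f"{key}={value}"
    lines = conf_text.splitlines()
    for index, line in enumerate(lines):
        is_comment = line.lstrip().startswith("#")
        if not is_comment and line.split("=", 1)[0].strip() == key:
            lines[index] = new_line
            break
    else:
        # No uncommented setting for this key was found.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative file content, not a complete gridd.conf.
    print(set_option("ServerAddress=\nEnableUI=FALSE\n", "EnableUI", "TRUE"), end="")
```

Commented-out settings are left untouched so that the template's documentation lines remain readable.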



Status

Open

Description

NVIDIA vGPU software licenses remain checked out on the license server when non-persistent VMs are forcibly powered off.

The NVIDIA service running in a VM returns checked out licenses when the VM is shut down. In environments where non-persistent licensed VMs are not cleanly shut down, licenses on the license server can become exhausted. For example, this issue can occur in automated test environments where VMs are frequently changing and are not guaranteed to be cleanly shut down. The licenses from such VMs remain checked out against their MAC address for seven days before they time out and become available to other VMs.



Resolution

If VMs are routinely being powered off without clean shutdown in your environment, you can avoid this issue by shortening the license borrow period. To shorten the license borrow period, set the LicenseInterval configuration setting in your VM image. For details, refer to Virtual GPU Client Licensing User Guide.
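To see why a shorter borrow period helps, a rough steady-state estimate can be made. The following Python sketch assumes that every forcibly powered-off VM holds its license for the full borrow period and that its license is never reused; the function name and the numbers are illustrative, not measured values or recommended settings.

```python
def stranded_licenses(forced_poweroffs_per_day, borrow_period_days):
    """Estimate licenses held by VMs that were powered off uncleanly.

    Simplifying assumption: each such VM keeps one license checked out
    until the borrow period expires, so the steady-state count is
    rate * duration.
    """
    return forced_poweroffs_per_day * borrow_period_days


# 10 forced power-offs per day with the seven-day period described above
with_seven_days = stranded_licenses(10, 7)
# The same rate with a hypothetical one-day borrow period
with_one_day = stranded_licenses(10, 1)
```

Under these assumptions, 70 licenses are unavailable with a seven-day period and 10 with a one-day period.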



Status

Closed



Ref. #

1694975

Description

Memory exhaustion can occur with vGPU profiles that have 512 Mbytes or less of frame buffer.

This issue typically occurs in the following situations:

Full screen 1080p video content is playing in a browser. In this situation, the session hangs and session reconnection fails.

Multiple display heads are used with Citrix Virtual Apps and Desktops or Omnissa Horizon on a Windows 10 guest VM.

Higher resolution monitors are used.

Applications that are frame-buffer intensive are used.

NVENC is in use.

To reduce the possibility of memory exhaustion, NVENC is disabled on profiles that have 512 Mbytes or less of frame buffer.

When memory exhaustion occurs, the NVIDIA host driver reports Xid error 31 and Xid error 43 in the VMware vSphere log file vmware.log in the guest VM’s storage directory.

The following vGPU profiles have 512 Mbytes or less of frame buffer:

Tesla M6-0B, M6-0Q

Tesla M10-0B, M10-0Q

Tesla M60-0B, M60-0Q

The root cause is a known issue associated with changes to the way that recent Microsoft operating systems handle and allow access to overprovisioning messages and errors. If your systems are provisioned with enough frame buffer to support your use cases, you should not encounter these issues.



Workaround

Use an appropriately sized vGPU to ensure that the frame buffer supplied to a VM through the vGPU is adequate for your workloads.

Monitor your frame buffer usage.

If you are using Windows 10, consider these workarounds and solutions:

Use a profile that has 1 Gbyte of frame buffer.

Optimize your Windows 10 resource usage.

To obtain information about best practices for improved user experience using Windows 10 in virtual environments, complete the NVIDIA GRID vGPU Profile Sizing Guide for Windows 10 download request form.

Additionally, you can use the VMware OS Optimization Tool to make and apply optimization recommendations for Windows 10 and other operating systems.
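One way to monitor frame buffer usage is to parse the output of nvidia-smi. The following Python sketch assumes the text was produced by nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits, which prints one "used, total" pair in MiB per GPU. The function names and the 90% threshold are illustrative assumptions, not NVIDIA recommendations.

```python
def frame_buffer_usage(csv_text):
    """Return the used/total frame buffer fraction for each GPU line."""
    fractions = []
    for line in csv_text.strip().splitlines():
        used, total = (float(field) for field in line.split(","))
        fractions.append(used / total)
    return fractions


def gpus_over_threshold(csv_text, threshold=0.9):
    """Return indexes of GPUs whose usage is at or above threshold.

    The default threshold is an arbitrary example value.
    """
    return [
        index
        for index, fraction in enumerate(frame_buffer_usage(csv_text))
        if fraction >= threshold
    ]


# Illustrative sample output, not captured from a real system.
sample_output = "480, 512\n256, 1024\n"
busy = gpus_over_threshold(sample_output)
```

In this sample, only the first GPU (480 of 512 MiB in use) is reported as close to exhaustion.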



Status

Open



Ref. #

200130864

1803861

Description

Note: If vSGA is being used, this issue shouldn't be encountered and changing the default graphics type is not necessary.

On VMware vSphere Hypervisor (ESXi), after vGPU is configured, VMs to which a vGPU is assigned may fail to start and the following error message may be displayed:

The amount of graphics resource available in the parent resource pool is insufficient for the operation.

The vGPU Manager VIB provides vSGA and vGPU functionality in a single VIB. After this VIB is installed, the default graphics type is Shared, which provides vSGA functionality. To enable vGPU support for VMs in VMware vSphere, you must change the default graphics type to Shared Direct. If you do not change the default graphics type, you will encounter this issue.



Workaround

Change the default graphics type to Shared Direct as explained in Virtual GPU Software User Guide.



Status

Open



Ref. #

200256224

Description

GDM fails to start on Red Hat Enterprise Linux 7.2 and CentOS 7.0 with the following error:

Oh no! Something has gone wrong!





Workaround

Permanently enable permissive mode for Security Enhanced Linux (SELinux).

As root, edit the /etc/selinux/config file to set SELINUX to permissive.

SELINUX=permissive

Reboot the system.

~]# reboot

For more information, see Permissive Mode in Red Hat Enterprise Linux 7 SELinux User's and Administrator's Guide.
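Before rebooting, you can confirm that the configuration file contains the intended setting. The following Python sketch reads the SELINUX value from the text of /etc/selinux/config; the function name selinux_mode and the sample content are assumptions for illustration.

```python
def selinux_mode(config_text):
    """Return the configured SELINUX value in lowercase, or None if unset.

    Comment lines are ignored, and the SELINUXTYPE key is not confused
    with SELINUX because keys are compared exactly.
    """
    mode = None
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "SELINUX":
            mode = value.strip().lower()
    return mode


# Illustrative file content.
sample_config = "# SELinux configuration\nSELINUX=permissive\nSELINUXTYPE=targeted\n"
mode = selinux_mode(sample_config)
```

A result other than permissive indicates that the edit has not been applied.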



Status

Not an NVIDIA bug



Ref. #

200167868

Description

When you launch NVIDIA Control Panel on a VM configured with vGPU, it fails to start and reports that you are not using a display attached to an NVIDIA GPU. This happens because Windows is using VMware’s SVGA device instead of NVIDIA vGPU.



Fix

Make NVIDIA vGPU the primary display adapter.

Use the Windows screen resolution control panel to make the second display, identified as “2” and corresponding to NVIDIA vGPU, the active display, and select the Show desktop only on 2 option. Click Apply to accept the configuration.

You may need to click the Detect button for Windows to recognize the display connected to NVIDIA vGPU.

Note: If the Omnissa Horizon agent is installed in the VM, the NVIDIA GPU is automatically selected in preference to the SVGA device.





Status

Open



Ref. #

Description

Using the current VMware vCenter user interface, it is possible to configure a VM with more than one vGPU device. When booted, the VM boots in VMware SVGA mode and doesn’t load the NVIDIA driver. The additional vGPU devices are present in Windows Device Manager but display a warning sign, and the following device status:

Windows has stopped this device because it has reported problems. (Code 43)





Workaround

NVIDIA vGPU currently supports a single virtual GPU device per VM. Remove any additional vGPUs from the VM configuration before booting the VM.



Status

Open



Ref. #

Description

Using the current VMware vCenter user interface, it is possible to configure a VM with a vGPU device and a passthrough (direct path) GPU device. This is not a currently supported configuration for vGPU. The passthrough GPU appears in Windows Device Manager with a warning sign, and the following device status:

Windows has stopped this device because it has reported problems. (Code 43)





Workaround

Do not assign vGPU and passthrough GPUs to a VM simultaneously.



Status

Open



Ref. #

1735002

Description

If multiple VMs are started simultaneously, vSphere may not adhere to the placement policy currently in effect. For example, if the default placement policy (breadth-first) is in effect, and 4 physical GPUs are available with no resident vGPUs, then starting 4 VMs simultaneously should result in one vGPU on each GPU. In practice, more than one vGPU may end up resident on a GPU.
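The expected breadth-first result in this example can be modeled and compared with what is observed. The following Python sketch is a simplified model only: it ignores GPU capacity and vGPU type constraints, it is not vSphere's placement algorithm, and the function names are hypothetical.

```python
def expected_breadth_first(resident, new_vgpus):
    """Place each new vGPU on the GPU that currently has the fewest vGPUs.

    resident is a list of resident vGPU counts, one entry per physical GPU.
    """
    counts = list(resident)
    for _ in range(new_vgpus):
        target = counts.index(min(counts))
        counts[target] += 1
    return counts


def follows_policy(observed, expected):
    """Compare distributions, ignoring which physical GPU holds which count."""
    return sorted(observed) == sorted(expected)


# Four empty GPUs and four VMs, as in the example above.
expected = expected_breadth_first([0, 0, 0, 0], 4)
# A hypothetical observed result after starting the VMs simultaneously.
adheres = follows_policy([2, 1, 1, 0], expected)
```

Here the model expects one vGPU per GPU, so an observed distribution with two vGPUs on one GPU does not adhere to the policy.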



Workaround

Start VMs individually.



Status

Not an NVIDIA bug



Ref. #

200042690

Description

When a VM is configured with a vGPU, the Sleep option remains available in the Windows Start menu. Sleep is not supported on vGPU and attempts to use it will lead to undefined behavior.



Workaround

Do not use Sleep with vGPU.

Installing the Omnissa Horizon agent will disable the Sleep option.



Status

Closed



Ref. #

200043405

Description

If vGPU-enabled VMs are assigned too high a proportion of the server’s total memory, the following errors occur:

One or more of the VMs may fail to start with the following error:

The available Memory resources in the parent resource pool are insufficient for the operation

When run in the host shell, the nvidia-smi utility returns this error:

-sh: can't fork

For example, on a server configured with 256G of memory, these errors may occur if vGPU-enabled VMs are assigned more than 243G of memory.
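The proportion in this example can be computed as follows. This Python sketch is illustrative only: the function name and the per-VM sizes are assumptions, and the 243G figure comes from the single example above, not from a documented general limit.

```python
def assigned_memory_fraction(vm_memory_gb, host_memory_gb):
    """Return the fraction of host memory assigned to vGPU-enabled VMs."""
    return sum(vm_memory_gb) / host_memory_gb


# Three hypothetical VMs of 81G each total 243G on a 256G host.
fraction = assigned_memory_fraction([81, 81, 81], 256)
```

In this example, the VMs are assigned about 95% of the server's memory.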



Workaround

Reduce the total amount of system memory assigned to the VMs.



Status

Closed



Ref. #

200060499

Description

On a system running a maximal configuration, that is, with the maximum number of vGPU VMs the server can support, some VMs might fail to start after a reset or restart operation.



Fix

Upgrade to ESXi 6.0 Update 1.



Status

Closed



Ref. #

200097546

Description

vGPU VMs with an active Horizon connection utilize a high percentage of the GPU on the ESXi host. The GPU utilization remains high for the duration of the Horizon session even if there are no active applications running on the VM.



Workaround

None



Status

Open

Partially resolved for Horizon 7.0.1:

For Blast connections, GPU utilization is no longer high.

For PCoIP connections, utilization remains high.

Ref. #

1735009