Date: 2025-04-21

Description

XID error 120 can put GPUs based on the Ada Lovelace architecture into a bad state. As a result, the VM to which the GPU is assigned might become unstable and the hypervisor host might crash. When this issue occurs, XID error 120 messages are written to the log files on the hypervisor host.



Status

Resolved in NVIDIA vGPU software 18.3



Ref. #

5137781

4973227

Description

The error message VGPU message 95 failed, result code: 0x80 from the vGPU plugin component is written to the log files on the hypervisor host.



Status

Resolved in NVIDIA vGPU software 18.2



Ref. #

4664109

Description

On Windows 11 VMs with more than 1 TB of system memory, GPU device unavailable errors (Error 43) occur. This issue affects NVIDIA vGPU and GPU pass through deployments.



Version

This issue affects Windows 11 guest VMs.



Workaround

Limit the amount of system memory assigned to the VM to less than 1 TB.



Status

Open



Ref. #

5115698

Description

In air-gapped environments where root certificates are not available on the host machine, timestamps cannot be verified. As a result, the NVIDIA vGPU software graphics driver fails to create the default client configuration token folder on Windows (%SystemDrive%\Program Files\NVIDIA Corporation\vGPU Licensing\ClientConfigToken). If the folder is created manually and the client configuration token is copied there, the client fails to obtain a license. Typically, root certificates are imported by Windows updates from the Microsoft Trusted Root Program.



Workaround

Determine whether the NVIDIA Authenticode signature certificate and the timestamp signature certificate are installed and, if not, install them.

To determine whether the root NVIDIA Authenticode signature certificate is installed:

Context-click the file and click the Digital Signatures tab.

In the Signature list, select the NVIDIA certificate and click Details.

Click View Certificate, then click Certification Path. The root certificate that is needed appears at the top of the certification path.

Run the certmgr.msc command and in the certmgr window that opens, expand Trusted Root Certification Authorities and click Certificates to see whether the certificate that you identified in the previous step is installed.

To determine whether the root timestamp signature certificate is installed:

Context-click the file and click the Digital Signatures tab.

In the Signature list, select the NVIDIA certificate and click Details.

In the Countersignatures section, click the timestamp authority, for example, Digicert or Entrust, then click Details below the countersignature section.

Click View Certificate, then click Certification Path. The root certificate that is needed appears at the top of the certification path.

Run the certmgr.msc command and in the certmgr window that opens, expand Trusted Root Certification Authorities and click Certificates to see whether the certificate that you identified in the previous step is installed.

Root certificates for both Digicert and Entrust are required for timestamping and can be downloaded from the following websites:

Status

Not an NVIDIA bug



Ref. #

4684895

Description

If the NVIDIA vGPU Manager on a hypervisor host with a Tesla M10 GPU is upgraded but the Windows guest VM driver is not upgraded, a blue screen crash occurs.



Version

This issue affects any Windows VM running a guest VM driver 16.x release on a hypervisor host running an NVIDIA vGPU Manager 17.x release.



Workaround

Upgrade the guest VM driver to the driver from the same NVIDIA vGPU software release as the NVIDIA vGPU Manager.



Status

Resolved in NVIDIA vGPU software 17.4



Ref. #

4631262

Description

After the NVIDIA vGPU software graphics driver for Windows is installed, the NVIDIA Control Panel app might be missing from the system. This issue typically occurs in the following situations:

Multiple users connect to virtual machines by using remote desktop applications such as Microsoft RDP, Omnissa Horizon, and Citrix Virtual Apps and Desktops.

VM instances are created by using Citrix Machine Creation Services (MCS) or VMware Instant Clone technology.

Roaming user desktop profiles are deployed.

This issue occurs because the NVIDIA Control Panel app is now distributed through the Microsoft Store. The NVIDIA Control Panel app might fail to be installed when the NVIDIA vGPU software graphics driver for Windows is installed if the Microsoft Store app is disabled, the system is not connected to the Internet, or installation of apps from the Microsoft Store is blocked by your system settings.

To determine whether the NVIDIA Control Panel app is installed on your system, use the Windows Settings app or the Get-AppxPackage Windows PowerShell command.

To use the Windows Settings app:

From the Windows Start menu, choose Settings > Apps > Apps & features.

In the Apps & features window, type nvidia control panel in the search box and confirm that the NVIDIA Control Panel app is found.

To use the Get-AppxPackage Windows PowerShell command:

Run Windows PowerShell as Administrator.

Determine whether the NVIDIA Control Panel app is installed for the current user.

    PS C:\> Get-AppxPackage -Name NVIDIACorp.NVIDIAControlPanel

Determine whether the NVIDIA Control Panel app is installed for all users.

    PS C:\> Get-AppxPackage -AllUsers -Name NVIDIACorp.NVIDIAControlPanel

This example shows that the NVIDIA Control Panel app is installed for the users Administrator, pliny, and trajan.

    PS C:\> Get-AppxPackage -AllUsers -Name NVIDIACorp.NVIDIAControlPanel

    Name                   : NVIDIACorp.NVIDIAControlPanel
    Publisher              : CN=D6816951-877F-493B-B4EE-41AB9419C326
    Architecture           : X64
    ResourceId             :
    Version                : 8.1.964.0
    PackageFullName        : NVIDIACorp.NVIDIAControlPanel_8.1.964.0_x64__56jybvy8sckqj
    InstallLocation        : C:\Program Files\WindowsApps\NVIDIACorp.NVIDIAControlPanel_8.1.964.0_x64__56jybvy8sckqj
    IsFramework            : False
    PackageFamilyName      : NVIDIACorp.NVIDIAControlPanel_56jybvy8sckqj
    PublisherId            : 56jybvy8sckqj
    PackageUserInformation : {S-1-12-1-530092550-1307989247-1105462437-500 [Administrator]: Installed,
                             S-1-12-1-530092550-1307989247-1105462437-1002 [pliny]: Installed,
                             S-1-12-1-530092550-1307989247-1105462437-1003 [trajan]: Installed}
    IsResourcePackage      : False
    IsBundle               : False
    IsDevelopmentMode      : False
    NonRemovable           : False
    IsPartiallyStaged      : False
    SignatureKind          : Store
    Status                 : Ok
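When the check has to be repeated across many VMs, the per-user install state can be pulled out of captured Get-AppxPackage -AllUsers output. The following Python sketch is only an illustration: the function name installed_users is hypothetical, and it assumes that each user entry has the form [name]: Installed, as in the example output for this issue.

```python
import re


def installed_users(appx_output: str) -> list[str]:
    """Return the user names marked 'Installed' in captured Get-AppxPackage output.

    Assumes entries of the form 'S-1-... [name]: Installed', as in the
    PackageUserInformation field of the example for this issue.
    """
    return re.findall(r"\[([^\]]+)\]\s*:\s*Installed", appx_output)


# Sample text copied from the PackageUserInformation field in the example.
sample = (
    "PackageUserInformation : {S-1-12-1-530092550-1307989247-1105462437-500 "
    "[Administrator]: Installed , "
    "S-1-12-1-530092550-1307989247-1105462437-1002 [pliny]: Installed , "
    "S-1-12-1-530092550-1307989247-1105462437-1003 [trajan]: Installed }"
)

print(installed_users(sample))  # ['Administrator', 'pliny', 'trajan']
```

An empty result for a user who should have the app suggests that the standalone installer is needed for that system.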



Preventing this Issue

If your system does not allow the installation of apps from the Microsoft Store, download and run the standalone NVIDIA Control Panel installer that is available from NVIDIA Licensing Portal. For instructions, refer to Virtual GPU Software User Guide.

If your system can allow the installation of apps from the Microsoft Store, ensure that:

The Microsoft Store app is enabled.

Installation of Microsoft Store apps is not blocked by your system settings.

No local or group policies are set to block Microsoft Store apps.

Workaround

If the NVIDIA Control Panel app is missing, install it separately from the graphics driver by downloading and running the standalone NVIDIA Control Panel installer that is available from NVIDIA Licensing Portal. For instructions, refer to Virtual GPU Software User Guide.

If the issue persists, contact NVIDIA Enterprise Support for further assistance.



Status

Open



Ref. #

3999308

Description

On all supported Windows Server guest OS releases, NVIDIA Control Panel crashes if a user session is disconnected and then reconnected while NVIDIA Control Panel is open.



Version

This issue affects all supported Windows Server guest OS releases.



Status

Open



Ref. #

4086605

Description

A VM that has been assigned multiple fractional vGPUs from the same physical GPU hangs or becomes inaccessible during installation of the NVIDIA vGPU software graphics driver in the VM. This issue affects only GPUs based on the NVIDIA Turing and NVIDIA Volta GPU architectures. This issue does not occur if the VM has been assigned multiple fractional vGPUs from different physical GPUs.



Version

This issue affects only GPUs based on the NVIDIA Turing and NVIDIA Volta GPU architectures.



Status

Open



Ref. #

4020171

Description

NVIDIA CUDA Toolkit profilers cannot gather hardware metrics on NVIDIA vGPU. This issue affects only traces that gather hardware metrics. Other traces are not affected by this issue and work normally.



Version

This issue affects NVIDIA vGPU software releases starting with 15.2.



Status

Open



Ref. #

4041169

Description

After the NVIDIA vGPU software graphics driver for Windows has been installed in the guest VM, the driver sends a remote call to ngx.download.nvidia.com to download and install additional components. Such a remote call might be a security issue.



Workaround

Before running the NVIDIA vGPU software graphics driver installer, disable the remote call to ngx.download.nvidia.com by setting the following Windows registry key:

    [HKEY_LOCAL_MACHINE\SOFTWARE\NVIDIA Corporation\Global\NGXCore]
    "EnableOTA"=dword:00000000

Note: If this Windows registry key is set to 1 or deleted, the remote call to ngx.download.nvidia.com is enabled again.





Status

Open



Ref. #

4031840

Description

After compute instances are created and deleted on an NVIDIA H100 GPU, creation of multiple instances in a single nvidia-smi command fails. For example, the command nvidia-smi mig -cci 0,1,2 fails with the following error message:

    Unable to create a compute instance on GPU 0 GPU instance ID 0 using profile 0: Invalid Argument
    Failed to create compute instances: Invalid Argument

Workaround

Create each compute instance in a separate nvidia-smi command, for example:

    $ nvidia-smi mig -cci 0
    $ nvidia-smi mig -cci 1
    $ nvidia-smi mig -cci 2
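For scripted setups, the one-instance-per-command workaround can be expressed as a small helper that expands a list of IDs into separate nvidia-smi invocations. This Python sketch only builds the argument lists and does not run anything. The function name split_cci_commands is hypothetical, and the IDs are simply the values that would otherwise be passed together to -cci.

```python
def split_cci_commands(ids):
    """Build one 'nvidia-smi mig -cci <id>' argument list per ID.

    The lists are suitable for subprocess.run(); nothing is executed here.
    """
    return [["nvidia-smi", "mig", "-cci", str(i)] for i in ids]


# Instead of the failing single command 'nvidia-smi mig -cci 0,1,2':
for argv in split_cci_commands([0, 1, 2]):
    print(" ".join(argv))
```

Passing each list to subprocess.run(argv, check=True) on the host would stop at the first command that fails, which makes the failing ID easy to identify.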





Status

Open



Ref. #

3829786

Description

A licensed client of NVIDIA License System (NLS) fails to acquire a license with the error The allowed time to process response has expired. This error can affect clients of a Cloud License Service (CLS) instance or a Delegated License Service (DLS) instance.

This error occurs when the time difference between the system clocks on the client and the server that hosts the CLS or DLS instance is greater than 10 minutes. A common cause of this error is the failure of either the client or the server to adjust its system clock when daylight saving time begins or ends. The failure to acquire a license is expected to prevent clock windback from causing licensing errors.



Workaround

Ensure that system clock time of the client and any server that hosts a DLS instance match the current time in the time zone where they are located.

To prevent this error from occurring when daylight saving time begins or ends, enable the option to automatically adjust the system clock for daylight saving time:

Windows: Set the Adjust for daylight saving time automatically option.

Linux: Use the hwclock command.
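The 10-minute limit from the description can be checked before a client requests a license. The following Python sketch assumes that both clock readings have already been obtained and converted to UTC. The names MAX_SKEW and skew_exceeds_limit are hypothetical and are not part of NVIDIA License System.

```python
from datetime import datetime, timedelta, timezone

# Threshold stated in the description of this issue.
MAX_SKEW = timedelta(minutes=10)


def skew_exceeds_limit(client_utc, server_utc, limit=MAX_SKEW):
    """Return True if the two clocks differ by more than the limit."""
    return abs(client_utc - server_utc) > limit


# A client that missed a one-hour daylight saving adjustment is well over the limit.
server = datetime(2025, 3, 9, 13, 0, tzinfo=timezone.utc)
client = datetime(2025, 3, 9, 12, 0, tzinfo=timezone.utc)
print(skew_exceeds_limit(client, server))  # True
```

A difference of exactly 10 minutes does not trip the check, because the description says the error occurs only when the difference is greater than 10 minutes.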

Status

Not a bug



Ref. #

3859889

Description

The NVIDIA vGPU software graphics driver fails to load on hypervisors based on Linux with KVM. This issue affects UEFI VMs configured with a vGPU or pass-through GPU that requires a large BAR address space. This issue does not affect VMs that are booted in legacy BIOS mode. The issue occurs because BAR resources are not mapped into the VM.

On a Windows VM, error code 12 is reported in Device Manager for the vGPU or pass-through GPU.



Workaround

In virsh, open for editing the XML document of the VM to which the vGPU or GPU is assigned.

    # virsh edit vm-name

vm-name
    The name of the VM to which the vGPU or GPU is assigned.

Declare the custom libvirt XML namespace that supports command-line pass through of QEMU arguments. Declare this namespace by modifying the start tag of the top-level domain element in the first line of the XML document.

    <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>

At the end of the XML document, between the </devices> end tag and the </domain> end tag, add the qemu elements shown in the following example. These elements pass the QEMU arguments for mapping the required BAR resources into the VM, setting the MMIO aperture size to 262144. If necessary, replace the value of 262144 with the MMIO aperture size that your VM requires.

    </devices>
    <qemu:commandline>
      <qemu:arg value='-fw_cfg'/>
      <qemu:arg value='opt/ovmf/X-PciMmio64Mb,string=262144'/>
    </qemu:commandline>
    </domain>

Start the VM to which the vGPU or GPU is assigned.

    # virsh start vm-name

vm-name
    The name of the VM to which the vGPU or GPU is assigned.
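Where many VMs need the same change, the XML edit in this workaround could be scripted instead of made by hand. The following Python sketch applies the namespace declaration and the qemu:commandline element to a domain XML string by using the standard xml.etree.ElementTree module. The function name add_mmio_aperture and the sample domain are hypothetical, and the sketch assumes that the domain does not already contain a qemu:commandline element.

```python
import xml.etree.ElementTree as ET

QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"
# Make ElementTree serialize the namespace with the 'qemu' prefix.
ET.register_namespace("qemu", QEMU_NS)


def add_mmio_aperture(domain_xml: str, size: int = 262144) -> str:
    """Append the qemu:commandline element from the workaround to a domain XML string.

    Assumes no qemu:commandline element is present yet; a second call
    would add a duplicate element.
    """
    root = ET.fromstring(domain_xml)
    cmdline = ET.SubElement(root, f"{{{QEMU_NS}}}commandline")
    ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": "-fw_cfg"})
    ET.SubElement(
        cmdline,
        f"{{{QEMU_NS}}}arg",
        {"value": f"opt/ovmf/X-PciMmio64Mb,string={size}"},
    )
    return ET.tostring(root, encoding="unicode")


# Minimal, hypothetical domain document used only to exercise the function.
sample = "<domain type='kvm'><name>vm-name</name><devices/></domain>"
result = add_mmio_aperture(sample)
print(result)
```

Appending to the root element places qemu:commandline after the devices element and before the domain end tag, which matches the position that the workaround describes. The result would still need to be loaded back into libvirt, which this sketch does not do.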

Status

Not an NVIDIA bug



Ref. #

200719557

Description

In an environment with multiple active desktop sessions, the Manage License page of NVIDIA Control Panel shows that a licensed system is unlicensed. However, the nvidia-smi command and the management interface of the NVIDIA vGPU software license server correctly show that the system is licensed. When an active session is disconnected and reconnected, the NVIDIA Display Container service crashes.

The Manage License page incorrectly shows that the system is unlicensed because of stale data in NVIDIA Control Panel in an environment with multiple sessions. The data is stale because NVIDIA Control Panel fails to get and update the settings for remote sessions when multiple sessions or no sessions are active in the VM. The NVIDIA Display Container service crashes when a session is reconnected because the session is not active at the moment of reconnection.



Status

Open



Ref. #

3761243

Description

VP9 and AV1 decoding with web browsers are not supported on Microsoft Windows Server 2019 and later supported releases. This issue occurs because starting with Windows Server 2019, the required codecs are not included with the OS and are not available through the Microsoft Store app. As a result, hardware decoding is not available for viewing YouTube videos or using collaboration tools such as Google Meet in a web browser.



Version

This issue affects Microsoft Windows Server releases starting with Windows Server 2019.



Status

Not an NVIDIA bug



Ref. #

200756564

Description

After a second NVIDIA vGPU device is added to a Microsoft Windows Server 2016 VM, the device does not appear in the output from the nvidia-smi command. This issue occurs only if the VM is already running NVIDIA vGPU software for the existing NVIDIA vGPU device when the second device is added to the VM.

The nvidia-smi command cannot retrieve the guest driver version, license status, and accounting mode of the second NVIDIA vGPU device.

    nvidia-smi vgpu --query
    GPU 00000000:37:00.0
        Active vGPUs              : 1
        vGPU ID                   : 3251695793
            VM ID                 : 3575923
            VM Name               : SVR-Reg-W(P)-KuIn
            vGPU Name             : GRID V100D-32Q
            vGPU Type             : 185
            vGPU UUID             : 29097249-2359-11b2-8a5b-8e896866496b
            Guest Driver Version  : 572.83
            License Status        : Licensed
            Accounting Mode       : Disabled
    ...
    GPU 00000000:86:00.0
        Active vGPUs              : 1
        vGPU ID                   : 3251695797
            VM ID                 : 3575923
            VM Name               : SVR-Reg-W(P)-KuIn
            vGPU Name             : GRID V100D-32Q
            vGPU Type             : 185
            vGPU UUID             : 2926dd83-2359-11b2-8b13-5f22f0f74801
            Guest Driver Version  : Not Available
            License Status        : N/A
            Accounting Mode       : N/A
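Affected devices can be spotted automatically by scanning captured nvidia-smi vgpu --query output for vGPUs whose guest driver version is reported as Not Available. The following Python sketch is an illustration only. The function name vgpus_missing_guest_info is hypothetical, and it assumes one Key : Value pair per line with vGPU UUID listed before Guest Driver Version, as in the example for this issue.

```python
def vgpus_missing_guest_info(query_output: str) -> list[str]:
    """Return the UUIDs of vGPUs whose Guest Driver Version is 'Not Available'."""
    missing = []
    current_uuid = None
    for line in query_output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Key : Value' line
        key, value = key.strip(), value.strip()
        if key == "vGPU UUID":
            current_uuid = value
        elif key == "Guest Driver Version" and value == "Not Available":
            missing.append(current_uuid)
    return missing


# Abbreviated from the example output for this issue.
sample = """\
GPU 00000000:37:00.0
    vGPU UUID             : 29097249-2359-11b2-8a5b-8e896866496b
    Guest Driver Version  : 572.83
GPU 00000000:86:00.0
    vGPU UUID             : 2926dd83-2359-11b2-8b13-5f22f0f74801
    Guest Driver Version  : Not Available
"""

print(vgpus_missing_guest_info(sample))
```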

Version

This issue affects only VMs that are running Microsoft Windows Server 2016 as a guest OS.



Workaround

To avoid this issue, configure the guest VM with both NVIDIA vGPU devices before installing the NVIDIA vGPU software graphics driver.

If you encounter this issue after the VM is configured, use one of the following workarounds:

Reinstall the NVIDIA vGPU software graphics driver.

Forcibly uninstall the Microsoft Basic Display Adapter and reboot the VM.

Upgrade the guest OS on the VM to Microsoft Windows Server 2019.

Status

Not an NVIDIA bug



Ref. #

3562801

Description

After the NVIDIA vGPU software graphics driver for Linux is upgraded from an RPM package in a licensed VM, licensing fails. The nvidia-smi vgpu -q command shows the driver version and license status as N/A. Restarting the nvidia-gridd service fails with a Unit not found error.



Workaround

Perform a clean installation of the NVIDIA vGPU software graphics driver for Linux from an RPM package.

Remove the currently installed driver.

Install the new version of the driver.

    $ rpm -iv nvidia-linux-grid-570_570.172.08_amd64.rpm

Status

Open



Ref. #

3512766

Description

The frame rate in frames per second (FPS) for the NVIDIA hardware-based H.264/HEVC video encoder (NVENC) reported by the nvidia-smi encodersessions command and NVWMI is double the actual frame rate. Only the reported frame rate is incorrect. The actual encoding of frames is not affected.

This issue affects only Windows VMs that are configured with NVIDIA vGPU.



Status

Open



Ref. #

2997564

Description

The NVIDIA hardware-based H.264/HEVC video encoder (NVENC) does not work with Teradici Cloud Access Software on Windows. This issue affects NVIDIA vGPU and GPU pass through deployments.

This issue occurs because the check that Teradici Cloud Access Software performs on the DLL signer name is case sensitive and NVIDIA recently changed the case of the company name in the signature certificate.



Status

Not an NVIDIA bug

This issue is resolved in the latest 21.07 and 21.03 Teradici Cloud Access Software releases.



Ref. #

200749065

Description

If a proxy is set with a system environment variable such as HTTP_PROXY or HTTPS_PROXY, a licensed client might fail to acquire a license.



Workaround

Perform this workaround on each affected licensed client.

Add the address of the NVIDIA vGPU software license server to the system environment variable NO_PROXY. The address must be specified exactly as it is specified in the client's license server settings either as a fully-qualified domain name or an IP address. If the NO_PROXY environment variable contains multiple entries, separate the entries with a comma (,). If high availability is configured for the license server, add the addresses of the primary license server and the secondary license server to the system environment variable NO_PROXY.

Restart the NVIDIA driver service that runs the core NVIDIA vGPU software logic.

On Windows, restart the NVIDIA Display Container service.

On Linux, restart the nvidia-gridd service.
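The comma-separated NO_PROXY value described in this workaround can be composed programmatically. The following Python sketch appends license server addresses to an existing value without duplicating entries. The function name add_to_no_proxy and the host names are hypothetical, and the sketch only builds the string. It does not set the system environment variable or restart any service.

```python
def add_to_no_proxy(current: str, *addresses: str) -> str:
    """Return a NO_PROXY value with the given addresses appended once each.

    Entries are separated by commas; addresses are kept exactly as given
    so that they match the client's license server settings.
    """
    entries = [e.strip() for e in current.split(",") if e.strip()]
    for address in addresses:
        if address not in entries:
            entries.append(address)
    return ",".join(entries)


# Hypothetical primary and secondary license servers in a high availability setup.
value = add_to_no_proxy(
    "localhost,127.0.0.1",
    "license-primary.example.com",
    "license-secondary.example.com",
)
print(value)
```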

Status

Closed



Ref. #

200704733

Description

Desktop session connections fail for a 2Q, 3Q, or 4Q vGPU that is configured with four 4K displays and for which the NVIDIA hardware-based H.264/HEVC video encoder (NVENC) is enabled. This issue affects only Teradici Cloud Access Software sessions on Linux guest VMs.

This issue is accompanied by the following error message:

    This Desktop has no resources available or it has timed out

This issue is caused by insufficient frame buffer.



Workaround

Ensure that sufficient frame buffer is available for all the virtual displays that are connected to a vGPU by changing the configuration in one of the following ways:

Reducing the number of virtual displays. The number of 4K displays supported with NVENC enabled depends on the vGPU.

    vGPU    4K Displays Supported with NVENC Enabled
    2Q      1
    3Q      2
    4Q      3

Disabling NVENC. The number of 4K displays supported with NVENC disabled depends on the vGPU.

    vGPU    4K Displays Supported with NVENC Disabled
    2Q      2
    3Q      2
    4Q      4

Using a vGPU type with more frame buffer. Four 4K displays with NVENC enabled on any Q-series vGPU with at least 6144 MB of frame buffer are supported.
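The display limits listed in this workaround can be captured in a small lookup so that a planned configuration is checked before users connect. The following Python sketch covers only the 2Q, 3Q, and 4Q limits stated for this issue. The names MAX_4K_DISPLAYS and is_supported are hypothetical.

```python
# (vGPU, NVENC enabled) -> maximum number of 4K displays, from this workaround.
MAX_4K_DISPLAYS = {
    ("2Q", True): 1,
    ("3Q", True): 2,
    ("4Q", True): 3,
    ("2Q", False): 2,
    ("3Q", False): 2,
    ("4Q", False): 4,
}


def is_supported(vgpu: str, displays: int, nvenc_enabled: bool) -> bool:
    """Return True if the vGPU has enough frame buffer for the 4K display count."""
    return displays <= MAX_4K_DISPLAYS[(vgpu, nvenc_enabled)]


print(is_supported("4Q", 4, nvenc_enabled=True))   # False: the failing case
print(is_supported("4Q", 4, nvenc_enabled=False))  # True: after disabling NVENC
```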

Status

Not an NVIDIA bug



Ref. #

200701959

Description

The names of vGPUs that reside on the NVIDIA A100 80GB GPU are incorrectly shown as Graphics Device by the nvidia-smi command. The correct names indicate the vGPU type, for example, A100DX-40C.

    $ nvidia-smi
    Mon Jan 25 02:52:57 2021
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 460.32.04    Driver Version: 460.32.04    CUDA Version: 11.2     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Graphics Device     On   | 00000000:07:00.0 Off |                    0 |
    | N/A   N/A    P0    N/A /  N/A |   6053MiB / 81915MiB |      0%      Default |
    |                               |                      |             Disabled |
    +-------------------------------+----------------------+----------------------+
    |   1  Graphics Device     On   | 00000000:08:00.0 Off |                    0 |
    | N/A   N/A    P0    N/A /  N/A |   6053MiB / 81915MiB |      0%      Default |
    |                               |                      |             Disabled |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+





Status

Open



Ref. #

200691204

Description

After a Teradici Cloud Access Software session has been idle for a short period of time, the session disconnects from the VM. When this issue occurs, the error messages NVOS status 0x19 and vGPU Message 21 failed are written to the log files on the hypervisor host. This issue affects only Linux guest VMs.



Status

Open



Ref. #

200689126

Description

NVIDIA GPU Operator doesn't support vGPU deployments on GPUs based on architectures before the NVIDIA Turing™ architecture. This issue is caused by the omission of version information for the vGPU manager from the configuration information that GPU Operator requires. Without this information, GPU Operator does not deploy the NVIDIA driver container because the container cannot determine if the driver is compatible with the vGPU manager.



Status

Open



Ref. #

3227576

Description

The nvidia-smi command shows 100% GPU utilization for NVIDIA A100, NVIDIA A40, and NVIDIA A10 GPUs even if no vGPUs have been configured or no VMs are running. A GPU is affected by this issue only if the sriov-manage script has not been run to enable the virtual function for the GPU in the sysfs file system.

    [root@host ~]# nvidia-smi
    Fri Jun 13 11:45:28 2025
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 570.172.07    Driver Version: 570.172.07    CUDA Version: 12.8   |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  A100-PCIE-40GB      On   | 00000000:5E:00.0 Off |                    0 |
    | N/A   50C    P0    97W / 250W |      0MiB / 40537MiB |    100%      Default |
    |                               |                      |             Disabled |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+





Workaround

Run the sriov-manage script to enable the virtual function for the GPU in the sysfs file system as explained in Virtual GPU Software User Guide.

After this workaround has been completed, the nvidia-smi command shows 0% GPU utilization for affected GPUs when they are idle.

    [root@host ~]# nvidia-smi
    Fri Jun 13 11:47:38 2025
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 570.172.07    Driver Version: 570.172.07    CUDA Version: 12.8   |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  A100-PCIE-40GB      On   | 00000000:5E:00.0 Off |                    0 |
    | N/A   50C    P0    97W / 250W |      0MiB / 40537MiB |      0%      Default |
    |                               |                      |             Disabled |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+





Status

Open



Ref. #

200605527

Description

The amount of frame buffer listed in a guest VM by the nvidia-smi command for vGPUs on GPUs that support Single Root I/O Virtualization (SR-IOV) is incorrect. Specifically, the amount of frame buffer listed is the amount of frame buffer allocated for the vGPU type minus the size of the VMMU segment (vmmu_page_size). Examples of GPUs that support SR-IOV are GPUs based on the NVIDIA Ampere architecture, such as NVIDIA A100 PCIe 40GB or NVIDIA A100 HGX 40GB.

For example, frame buffer for -4C and -20C vGPU types is listed as follows:

For -4C vGPU types, frame buffer is listed as 3963 MB instead of 4096 MB.

For -20C vGPU types, frame buffer is listed as 20347 MB instead of 20480 MB.
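The two examples imply the same VMMU segment size, which can be confirmed with a line of arithmetic. The following Python sketch uses only the figures quoted for this issue. The function name implied_vmmu_segment_mb is hypothetical.

```python
def implied_vmmu_segment_mb(allocated_mb: int, listed_mb: int) -> int:
    """Return the VMMU segment size implied by allocated minus listed frame buffer."""
    return allocated_mb - listed_mb


# Figures from the -4C and -20C examples for this issue.
print(implied_vmmu_segment_mb(4096, 3963))    # 133
print(implied_vmmu_segment_mb(20480, 20347))  # 133
```

In both cases the listed value is 133 MB lower than the allocated value, which is consistent with a fixed-size segment being subtracted regardless of the vGPU type.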

Status

Open



Ref. #

200524749

Description

On RHV 4.4, VMs fail to boot with the error Host doesn't support passthru of host PCI device. This issue affects GPU pass through deployments with all supported GPUs and NVIDIA vGPU deployments with GPUs based on the NVIDIA Ampere architecture. This issue occurs because the intel_iommu parameter and the nouveau.modeset parameter are not set correctly.



Version

This issue affects RHV 4.4.



Workaround

Perform this workaround on the hypervisor host. This workaround requires root user privileges on the hypervisor host.

In a plain-text editor, edit the file /boot/loader/entries/rhvh-4.4.1.1-0.20200722.0+1-4.18.0-193.13.2.el8_2.x86_64.conf to add the following options to the boot options.

nouveau.modeset=0

intel_iommu=on

Note: Line breaks have been added to this example to enhance readability.

    title rhvh-4.4.1.1-0.20200722.0 (4.18.0-193.13.2.el8_2.x86_64)
    version 4.18.0-193.13.2.el8_2.x86_64
    linux //rhvh-4.4.1.1-0.20200722.0+1/vmlinuz-4.18.0-193.13.2.el8_2.x86_64
    initrd //rhvh-4.4.1.1-0.20200722.0+1/initramfs-4.18.0-193.13.2.el8_2.x86_64.img
    options crashkernel=auto resume=/dev/mapper/rhvh00-swap \
    rd.lvm.lv=rhvh00/rhvh-4.4.1.1-0.20200722.0+1 rd.lvm.lv=rhvh00/swap \
    root=/dev/rhvh00/rhvh-4.4.1.1-0.20200722.0+1 \
    boot=UUID=38ff2175-b761-403d-8a91-d7ec9f7ec2f7 rootflags=discard \
    img.bootid=rhvh-4.4.1.1-0.20200722.0+1 intel_iommu=on nouveau.modeset=0
    id rhel-20200825140238-4.18.0-193.13.2.el8_2.x86_64
    grub_users $grub_users
    grub_arg --unrestricted
    grub_class kernel

Reboot the hypervisor host machine.
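If several hosts need the same boot options, the edit to the options line can be sketched as a string transformation. The following Python example appends intel_iommu=on and nouveau.modeset=0 only when they are missing. The names REQUIRED_OPTIONS and ensure_boot_options are hypothetical, and the sketch assumes that the options are on a single line. It does not read or write the boot entry file.

```python
REQUIRED_OPTIONS = ("intel_iommu=on", "nouveau.modeset=0")


def ensure_boot_options(options_line: str) -> str:
    """Return the 'options ...' line with the required options present.

    Options that are already present are left alone. A conflicting value
    such as intel_iommu=off is NOT replaced and must be fixed by hand.
    """
    tokens = options_line.split()
    for option in REQUIRED_OPTIONS:
        if option not in tokens:
            tokens.append(option)
    return " ".join(tokens)


# Shortened, hypothetical options line used only for illustration.
line = "options crashkernel=auto rootflags=discard"
print(ensure_boot_options(line))
```

Applying the function twice gives the same result as applying it once, so it is safe to run against a file that has already been edited.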

Status

Not an NVIDIA bug



Ref. #

200653675

Description

Upgrading the NVIDIA vGPU software graphics driver in a Linux guest VM with multiple vGPUs might fail. This issue occurs if the driver is upgraded by overinstalling the new release of the driver on the current release of the driver while the nvidia-gridd service is running in the VM.



Workaround

Stop the nvidia-gridd service.

Try again to upgrade the driver.

Status

Open



Ref. #

200633548

Description

If NVIDIA licensing information is not configured on the system, any attempt to start NVIDIA Control Panel by right-clicking on the desktop within 30 seconds of the VM being started fails.



Workaround

Restart the VM and wait at least 30 seconds before trying to launch NVIDIA Control Panel.



Status

Open



Ref. #

200623179

Description

On Linux, the frame rate might drop to 1 frame per second (FPS) after NVIDIA vGPU software has been running for several minutes. Only some applications are affected, for example, glxgears. Other applications, such as Unigine Heaven, are not affected. This behavior occurs because Display Power Management Signaling (DPMS) for the Xorg server is enabled by default and the display is detected to be inactive even when the application is running. When DPMS is enabled, it enables power saving behavior of the display after several minutes of inactivity by setting the frame rate to 1 FPS.



Workaround

If necessary, stop the Xorg server.

    # /etc/init.d/xorg stop

In a plain text editor, edit the /etc/X11/xorg.conf file to set the options to disable DPMS and disable the screen saver.

In the Monitor section, set the DPMS option to false.

    Option "DPMS" "false"

At the end of the file, add a ServerFlags section that contains the option to disable the screen saver.

    Section "ServerFlags"
        Option "BlankTime" "0"
    EndSection

Save your changes to the /etc/X11/xorg.conf file and quit the editor.

Start the Xorg server.

    # /etc/init.d/xorg start

Status

Open



Ref. #

200605900

Description

Desktop Window Manager (DWM) crashes occur randomly in Windows VMs, causing a blue-screen crash and the bug check CRITICAL_PROCESS_DIED. Computer Management shows problems with the primary display device.



Version

This issue affects Windows 10 1809, 1903 and 1909 VMs.



Status

Not an NVIDIA bug



Ref. #

2730037

Description

When a VM configured with vGPU is migrated to another host, the migration stops before it is complete.

This issue occurs if the ECC memory configuration (enabled or disabled) on the source host differs from the configuration on the destination host. The ECC memory configuration on both the source and destination hosts must be identical.



Workaround

Before attempting to migrate the VM again, ensure that the ECC memory configuration on the source and destination hosts is identical.



Status

Not an NVIDIA bug



Ref. #

200520027

Description

The ECC memory settings for a vGPU cannot be changed from a Linux guest VM by using NVIDIA X Server Settings. After the ECC memory state has been changed on the ECC Settings page and the VM has been rebooted, the ECC memory state remains unchanged.



Workaround

Use the nvidia-smi command in the guest VM to enable or disable ECC memory for the vGPU as explained in Virtual GPU Software User Guide.

If the ECC memory state remains unchanged even after you use the nvidia-smi command to change it, use the workaround in Changes to ECC memory settings for a Linux vGPU VM by nvidia-smi might be ignored.



Status

Open



Ref. #

200523086

Description

After the ECC memory state for a Linux vGPU VM has been changed by using the nvidia-smi command and the VM has been rebooted, the ECC memory state might remain unchanged.

This issue occurs when multiple NVIDIA configuration files in the system cause the kernel module option for setting the ECC memory state RMGuestECCState in /etc/modprobe.d/nvidia.conf to be ignored.

When the nvidia-smi command is used to enable ECC memory, the file /etc/modprobe.d/nvidia.conf is created or updated to set the kernel module option RMGuestECCState. Another configuration file in /etc/modprobe.d/ that contains the keyword NVreg_RegistryDwordsPerDevice might cause the kernel module option RMGuestECCState to be ignored.



Workaround

This workaround requires administrator privileges.

Move the entry containing the keyword NVreg_RegistryDwordsPerDevice from the other configuration file to /etc/modprobe.d/nvidia.conf.

Reboot the VM.
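Finding the other configuration file is the first step of this workaround, and a short scan of the directory can do it. The following Python sketch lists the .conf files, other than nvidia.conf, that mention NVreg_RegistryDwordsPerDevice. The function name files_with_keyword is hypothetical, and the demonstration uses a temporary directory with made-up file names and placeholder contents rather than a real /etc/modprobe.d.

```python
import tempfile
from pathlib import Path

KEYWORD = "NVreg_RegistryDwordsPerDevice"


def files_with_keyword(modprobe_dir="/etc/modprobe.d", skip="nvidia.conf"):
    """List the names of *.conf files, other than `skip`, that contain KEYWORD."""
    hits = []
    for path in sorted(Path(modprobe_dir).glob("*.conf")):
        if path.name == skip:
            continue
        if KEYWORD in path.read_text(errors="ignore"):
            hits.append(path.name)
    return hits


# Demonstration with hypothetical files; the contents are placeholders only.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "nvidia.conf").write_text("# placeholder\n")
    Path(tmp, "other.conf").write_text(f"# placeholder line with {KEYWORD}\n")
    Path(tmp, "unrelated.conf").write_text("# placeholder\n")
    demo_hits = files_with_keyword(tmp)

print(demo_hits)  # ['other.conf']
```

On a real VM, calling files_with_keyword() with its default argument would report the files whose entry needs to be moved. The sketch only reads files and leaves the move itself to the administrator.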

Status

Open



Ref. #

200505777

Description

When GPU performance is being monitored, host core CPU utilization is higher than expected for moderate workloads. For example, host CPU utilization when only a small number of VMs are running is as high as when several times as many VMs are running.



Workaround

Disable monitoring of the following GPU performance statistics:

vGPU engine usage by applications across multiple vGPUs

Encoder session statistics

Frame buffer capture (FBC) session statistics

Statistics gathered by performance counters in guest VMs

Status

Open



Ref. #

2414897

Description

Because of a known limitation with NvFBC, a frame capture while the interactive logon message is displayed returns a blank screen.

An NvFBC session can capture only screen updates that occur after the session is created. After the logon message is shown, no further screen update occurs. If the NvFBC session is created after this update has occurred, NvFBC cannot get a frame to capture and, therefore, a blank screen is returned instead.



Workaround

Press Enter or wait for the screen to update for NvFBC to capture the frame.



Status

Not a bug



Ref. #

2115733

Description

When Windows Server is used as a guest OS, Remote Desktop Services (RDS) sessions do not use the GPU. By default, the RDS sessions use the Microsoft Basic Render Driver instead of the GPU. This default setting enables 2D DirectX applications such as Microsoft Office to use software rendering, which can be more efficient than using the GPU for rendering. However, as a result, 3D applications that use DirectX are prevented from using the GPU.



Version

This issue affects all Windows Server releases that are supported as a guest OS.



Solution

Change the local computer policy to use the hardware graphics adapter for all RDS sessions.

1. Choose Local Computer Policy > Computer Configuration > Administrative Templates > Windows Components > Remote Desktop Services > Remote Desktop Session Host > Remote Session Environment.

2. Set the Use the hardware default graphics adapter for all Remote Desktop Services sessions option.

Description

When the scheduling policy is fixed share, GPU engine utilization can be reported as higher than expected for a vGPU.

For example, GPU engine usage for six P40-4Q vGPUs on a Tesla P40 GPU might be reported as follows:

[root@localhost:~] nvidia-smi vgpu
Mon Aug 20 10:33:18 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.42                 Driver Version: 390.42                    |
|-------------------------------+--------------------------------+------------+
| GPU  Name                     | Bus-Id                         | GPU-Util   |
|      vGPU ID    Name          | VM ID    VM Name               | vGPU-Util  |
|===============================+================================+============|
|   0  Tesla P40                | 00000000:81:00.0               |  99%       |
|      85109      GRID P40-4Q   | 85110    win7-xmpl-146048-1    |     32%    |
|      87195      GRID P40-4Q   | 87196    win7-xmpl-146048-2    |     39%    |
|      88095      GRID P40-4Q   | 88096    win7-xmpl-146048-3    |     26%    |
|      89170      GRID P40-4Q   | 89171    win7-xmpl-146048-4    |      0%    |
|      90475      GRID P40-4Q   | 90476    win7-xmpl-146048-5    |      0%    |
|      93363      GRID P40-4Q   | 93364    win7-xmpl-146048-6    |      0%    |
+-------------------------------+--------------------------------+------------+
|   1  Tesla P40                | 00000000:85:00.0               |   0%       |
+-------------------------------+--------------------------------+------------+

The vGPU utilization is reported as 32% for vGPU 85109, 39% for vGPU 87195, and 26% for vGPU 88095. However, with six vGPUs each holding a fixed, equal share of the GPU, the utilization of any one vGPU should not exceed approximately 16.7% (100% / 6).

This behavior is a result of the mechanism that is used to measure GPU engine utilization.
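The arithmetic behind the 16.7% figure can be sketched as follows. The sketch assumes that, under the fixed share policy, each of N equal vGPUs is entitled to 1/N of the GPU. The function name fixed_share_ceiling is hypothetical, and the values are taken from the vGPU-Util column of the example output.

```python
def fixed_share_ceiling(num_vgpus: int) -> float:
    """Expected upper bound, in percent, for one vGPU under equal fixed shares."""
    if num_vgpus <= 0:
        raise ValueError("num_vgpus must be positive")
    return 100.0 / num_vgpus


# vGPU ID -> reported vGPU-Util (%) from the example nvidia-smi vgpu output.
reported = {85109: 32, 87195: 39, 88095: 26, 89170: 0, 90475: 0, 93363: 0}

ceiling = fixed_share_ceiling(len(reported))
over = {vgpu: util for vgpu, util in reported.items() if util > ceiling}

print(f"ceiling = {ceiling:.1f}%")            # ceiling = 16.7%
print(f"above ceiling: {sorted(over)}")       # above ceiling: [85109, 87195, 88095]
```

All three active vGPUs are reported above the expected bound, which is the symptom that this issue describes.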



Status

Open



Ref. #

2227591

Description

When a Windows VM configured with a licensed vGPU is started, the VM fails to acquire a license.

Error messages in the following format are written to the NVIDIA service logs:

[000000020.860152600 sec] - [Logging.lib] ERROR: [nvGridLicensing.FlexUtility] 353@FlexUtility::LogFneError : Error: Failed to add trusted storage. Server URL : license-server-url - [1,7E2,2,1[7000003F,0,9B00A7]] System machine type does not match expected machine type..





Workaround

This workaround requires administrator privileges.

1. Stop the NVIDIA Display Container LS service.

2. Delete the contents of the folder %SystemDrive%\Program Files\NVIDIA Corporation\Grid Licensing.

3. Start the NVIDIA Display Container LS service.
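The delete step removes what is inside the folder, not the folder itself. The following minimal sketch shows that distinction. The function name clear_folder_contents is hypothetical, and stopping and starting the service is deliberately left to the administrator.

```python
import shutil
from pathlib import Path


def clear_folder_contents(folder: str) -> int:
    """Delete everything inside folder but keep the folder; return the number of entries removed."""
    removed = 0
    for entry in Path(folder).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # remove a subfolder and everything in it
        else:
            entry.unlink()         # remove a file or a symbolic link
        removed += 1
    return removed
```

Run it only while the NVIDIA Display Container LS service is stopped, and pass the Grid Licensing folder path named in the workaround.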

Status

Closed



Ref. #

200407287

Description

The command nvidia-smi vgpu -m shows that vGPU migration is supported on all hypervisors, even hypervisors or hypervisor versions that do not support vGPU migration.



Status

Closed



Ref. #

200407230

Description

Hot plugging or unplugging vCPUs causes a blue-screen crash in Windows VMs that are running NVIDIA vGPU software graphics drivers.

When the blue-screen crash occurs, one of the following error messages may also be seen:

SYSTEM_SERVICE_EXCEPTION(nvlddmkm.sys)

DRIVER_IRQL_NOT_LESS_OR_EQUAL(nvlddmkm.sys)

NVIDIA vGPU software graphics drivers do not support hot plugging and unplugging of vCPUs.



Status

Closed



Ref. #

2101499

Description

If the Luxmark application is run on a Linux guest VM configured with NVIDIA vGPU that is booted without acquiring a license, a segmentation fault occurs and the application core dumps. The fault occurs when the application cannot allocate a CUDA object on NVIDIA vGPUs where CUDA is disabled. On NVIDIA vGPUs that can support CUDA, CUDA is disabled in unlicensed mode.



Status

Not an NVIDIA bug



Ref. #

200330956

Description

On Red Hat Enterprise Linux 6.8 and 6.9, and CentOS 6.8 and 6.9, a segmentation fault in DBus code causes the nvidia-gridd service to exit.

The nvidia-gridd service uses DBus for communication with NVIDIA X Server Settings to display licensing information through the Manage License page. Disabling the GUI for licensing resolves this issue.

To prevent this issue, the GUI for licensing is disabled by default. You might encounter this issue if you have enabled the GUI for licensing and are using Red Hat Enterprise Linux 6.8 or 6.9, or CentOS 6.8 or 6.9.



Version

Red Hat Enterprise Linux 6.8 and 6.9

CentOS 6.8 and 6.9



Status

Open



Ref. #

200358191

200319854

1895945

Description

By default, the Manage License option is not available in NVIDIA X Server Settings. This option is missing because the GUI for licensing on Linux is disabled by default to work around the issue that is described in A segmentation fault in DBus code causes nvidia-gridd to exit on Red Hat Enterprise Linux and CentOS.



Workaround

This workaround requires sudo privileges.

Note: Do not use this workaround with Red Hat Enterprise Linux 6.8 and 6.9 or CentOS 6.8 and 6.9. To prevent a segmentation fault in DBus code from causing the nvidia-gridd service to exit, the GUI for licensing must remain disabled with these OS versions.

If you are licensing a physical GPU for vCS, you must use the configuration file /etc/nvidia/gridd.conf.

1. If NVIDIA X Server Settings is running, shut it down.

2. If the /etc/nvidia/gridd.conf file does not already exist, create it by copying the supplied template file /etc/nvidia/gridd.conf.template.

3. As root, edit the /etc/nvidia/gridd.conf file to set the EnableUI option to TRUE.

4. Start the nvidia-gridd service.

# sudo service nvidia-gridd start

When NVIDIA X Server Settings is restarted, the Manage License option is now available.
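The edit to gridd.conf can be sketched as a small text transformation. This sketch assumes that the file consists of KEY=VALUE lines with # comments. The function name set_option is hypothetical, and the function works on the file's text so that reading and writing the file as root stays with the caller.

```python
def set_option(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with every active `key=` line set to value, appending one if none exists."""
    new_line = f"{key}={value}"
    out = []
    found = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_active = bool(stripped) and not stripped.startswith("#")
        if is_active and stripped.split("=", 1)[0].strip() == key:
            out.append(new_line)  # overwrite the existing setting
            found = True
        else:
            out.append(line)      # keep comments and other options untouched
    if not found:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

Calling set_option(text, "EnableUI", "TRUE") on the contents of /etc/nvidia/gridd.conf yields the text to write back before the nvidia-gridd service is started.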



Status

Open

Description

NVIDIA vGPU software licenses remain checked out on the license server when non-persistent VMs are forcibly powered off.

The NVIDIA service running in a VM returns checked out licenses when the VM is shut down. In environments where non-persistent licensed VMs are not cleanly shut down, licenses on the license server can become exhausted. For example, this issue can occur in automated test environments where VMs are frequently changing and are not guaranteed to be cleanly shut down. The licenses from such VMs remain checked out against their MAC address for seven days before they time out and become available to other VMs.



Resolution

If VMs in your environment are routinely powered off without a clean shutdown, you can avoid this issue by shortening the license borrow period. To shorten the license borrow period, set the LicenseInterval configuration setting in your VM image. For details, refer to the Virtual GPU Client Licensing User Guide.



Status

Closed



Ref. #

1694975

Description

When the VM is rebooted after the guest VM driver for Windows 10 RS2 is installed, the VM bug checks. When Windows boots, it selects one of the standard supported video modes. If Windows is booted directly with a display that is driven by an NVIDIA driver, for example a vGPU on XenServer, a blue screen crash occurs.

This issue occurs when the screen resolution is switched from VGA mode to a resolution that is higher than 1920×1200.



Fix

Download and install Microsoft Windows Update KB4020102 from the Microsoft Update Catalog.



Workaround

If you have applied the fix, ignore this workaround.

Otherwise, you can work around this issue until you are able to apply the fix by not using resolutions higher than 1920×1200.

1. Choose a GPU profile in Citrix XenCenter that does not allow resolutions higher than 1920×1200.

2. Before rebooting the VM, set the display resolution to 1920×1200 or lower.

Status

Not an NVIDIA bug



Ref. #

200310861

Description

GDM fails to start on Red Hat Enterprise Linux 7.2 and CentOS 7.0 with the following error:

Oh no! Something has gone wrong!





Workaround

Permanently enable permissive mode for Security Enhanced Linux (SELinux).

1. As root, edit the /etc/selinux/config file to set SELINUX to permissive.

SELINUX=permissive

2. Reboot the system.

~]# reboot

For more information, see Permissive Mode in the Red Hat Enterprise Linux 7 SELinux User's and Administrator's Guide.
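To confirm that the edit is in place before rebooting, the configured mode can be read back from the file. The following is a minimal sketch that assumes /etc/selinux/config contains KEY=VALUE lines with # comments. The function name configured_selinux_mode is hypothetical.

```python
from typing import Optional


def configured_selinux_mode(config_text: str) -> Optional[str]:
    """Return the value of the last active SELINUX= line, or None if there is none."""
    mode = None
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # skip comments and lines that are not settings
        key, value = stripped.split("=", 1)
        if key.strip() == "SELINUX":  # exact match, so SELINUXTYPE is not picked up
            mode = value.strip()
    return mode
```

The workaround is in place when the function returns permissive for the contents of /etc/selinux/config.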



Status

Not an NVIDIA bug



Ref. #

200167868