Suspend-Resume#

The suspend-resume feature allows VMs configured with NVIDIA vGPU to be temporarily paused and later resumed without losing their operational state. During suspension, the entire VM state, including GPU state, is saved to disk, which frees the GPU and compute resources on the host. On resumption, the state is restored and the workload continues from where it stopped.

Typical uses include planned host maintenance, pausing non-critical VMs to free GPUs, and saving a known state for development and testing.

Unlike live migration, suspend-resume involves downtime during both suspension and resumption. Resuming a VM on a different host requires strict compatibility between the two hosts in GPU type, Virtual GPU Manager version, memory configuration, and NVLink topology.
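The cross-host requirement can be pictured as a pre-flight comparison of the four properties listed above. The following is only a minimal sketch: `HostProfile`, its field names, and the placeholder values are hypothetical, and it assumes that "strict compatibility" means exact equality of each property, which the hypervisor-specific documentation may define differently.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HostProfile:
    """Hypothetical descriptor holding the four properties named in the text."""
    gpu_type: str
    vgpu_manager_version: str
    memory_config: str
    nvlink_topology: str


def compatibility_mismatches(source: HostProfile, target: HostProfile) -> list[str]:
    """Return the names of the properties that differ between two hosts.

    Assumption: compatibility is modeled as exact equality per property.
    """
    return [
        f.name
        for f in fields(HostProfile)
        if getattr(source, f.name) != getattr(target, f.name)
    ]


# Placeholder values, not real product or version strings.
source = HostProfile("gpu-model-x", "manager-a", "mem-layout-1", "no-nvlink")
target = HostProfile("gpu-model-x", "manager-b", "mem-layout-1", "no-nvlink")

print(compatibility_mismatches(source, target))  # ['vgpu_manager_version']
```

An empty list means that none of the four modeled properties differs; it does not replace the checks the hypervisor itself performs.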

Suspend-resume is supported on all GPUs that enable vGPU functionality; however, compatibility varies by hypervisor, NVIDIA vGPU software release, and guest operating system.

For additional information and operational instructions across different hypervisors, refer to the vGPU Suspend-Resume documentation.

Suspend-Resume Known Issues and Limitations#

Table 40 Suspend-Resume Known Issues and Limitations#

| Hypervisor Platform | Documentation |
|---|---|
| VMware vSphere | Known Issues and Limitations with Suspend Resume on VMware vSphere |

Note

While a suspended VM can generally be resumed on any host that runs a compatible Virtual GPU Manager, a current bug in Red Hat Enterprise Linux 9.4 and Ubuntu 24.04 LTS limits suspend, resume, and migration to hosts with an identical Virtual GPU Manager version. The issue has been resolved in Red Hat Enterprise Linux 9.6 and later.
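The rule in this note can be written as a small decision function. This is an illustrative sketch, not an NVIDIA API: the function names are hypothetical, it assumes both hosts run the same hypervisor release, and because this page does not define what makes two manager versions "compatible", that check is supplied by the caller.

```python
from typing import Callable

# Hypervisor releases the note names as affected by the bug.
AFFECTED_RELEASES = {
    ("Red Hat Enterprise Linux", "9.4"),
    ("Ubuntu", "24.04 LTS"),
}


def requires_identical_manager(hypervisor: str, release: str) -> bool:
    """True if the note's identical-version restriction applies."""
    return (hypervisor, release) in AFFECTED_RELEASES


def can_resume_on(
    hypervisor: str,
    release: str,
    source_manager: str,
    target_manager: str,
    is_compatible: Callable[[str, str], bool],
) -> bool:
    """Decide whether a target host's manager version is acceptable.

    On affected releases the versions must be identical; otherwise the
    caller-supplied compatibility predicate (an assumption) decides.
    """
    if requires_identical_manager(hypervisor, release):
        return source_manager == target_manager
    return is_compatible(source_manager, target_manager)
```

For example, with a permissive predicate such as `lambda s, t: True`, two different manager versions are rejected on Red Hat Enterprise Linux 9.4 but accepted on 9.6.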

Platform Support for Suspend-Resume#

Suspend-resume is supported on all GPUs that support NVIDIA vGPU for Compute, but compatibility varies by hypervisor, release version, and guest operating system.

Table 41 Platform Support for Suspend-Resume#

| Hypervisor Platform | Version | NVIDIA AI Enterprise Infra Release | Documentation |
|---|---|---|---|
| Red Hat Enterprise Linux with KVM | 10.0, 9.6, 9.4 | NVIDIA AI Enterprise Infra 8.x, NVIDIA AI Enterprise Infra 7.x | Suspending and Resuming a VM Configured with NVIDIA vGPU for Compute on RHEL KVM |
| Ubuntu with KVM | 24.04 LTS | NVIDIA AI Enterprise Infra 8.x, NVIDIA AI Enterprise Infra 7.x | Suspending and Resuming a VM Configured with NVIDIA vGPU for Compute on Ubuntu KVM |
| VMware vSphere | 9, 8 | All active NVIDIA AI Enterprise Infra Releases | Suspending and Resuming a VM Configured with NVIDIA vGPU for Compute on VMware vSphere |
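For automation such as inventory checks, the platform support table can be encoded as data. The sketch below is one possible encoding, not an official schema: the `SUPPORT` dictionary and `is_supported` function are hypothetical names, and for VMware vSphere the "all active releases" entry is modeled as `None`, which leaves it to the caller to confirm that a given release is still active.

```python
# Platform support table encoded as data. A value of None for "infra"
# stands for "all active NVIDIA AI Enterprise Infra releases".
SUPPORT = {
    "Red Hat Enterprise Linux with KVM": {
        "versions": {"10.0", "9.6", "9.4"},
        "infra": {"8.x", "7.x"},
    },
    "Ubuntu with KVM": {
        "versions": {"24.04 LTS"},
        "infra": {"8.x", "7.x"},
    },
    "VMware vSphere": {
        "versions": {"9", "8"},
        "infra": None,
    },
}


def is_supported(platform: str, version: str, infra_release: str) -> bool:
    """Check a platform, version, and Infra release against the table.

    Does not verify that a release is active when "infra" is None.
    """
    entry = SUPPORT.get(platform)
    if entry is None or version not in entry["versions"]:
        return False
    return entry["infra"] is None or infra_release in entry["infra"]


print(is_supported("Ubuntu with KVM", "24.04 LTS", "7.x"))  # True
print(is_supported("Ubuntu with KVM", "22.04 LTS", "7.x"))  # False
```

GPU support and guest operating system constraints are not modeled here; those are covered by the support matrix referenced in the next section.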

vGPU Support for Suspend-Resume#

For a list of supported GPUs, refer to the Supported NVIDIA GPUs and Networking section in the NVIDIA AI Enterprise Infra Support Matrix.