vGPU Configuration Issues
vGPU VM Startup Issue on vSphere – “Out of Resources” Error
When attempting to start a GPU-enabled VM on vSphere, the VM fails to start and displays only a generic “out of resources” error. Because the message is generic, the underlying cause can be any of several configuration or resource allocation problems.
Next Steps
Review available GPU resources to ensure sufficient capacity.
Verify that the hypervisor properly assigns and recognizes the GPU.
Follow this checklist for a step-by-step resolution.
vGPU VM Startup Issue on KVM Hypervisor
When attempting to start a sixth vGPU VM on a KVM hypervisor with an SR-IOV-capable GPU, the VM hangs during startup. This occurs because PCIe Alternate Routing ID (ARI) is disabled in the System BIOS, which causes virtual devices beyond the fifth to be marked as “rev ff” instead of “rev a1.”
Next Steps
Access the System BIOS.
Enable PCIe Alternate Routing ID (ARI).
For more detailed instructions and additional information, visit the full article here.
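Before changing the BIOS setting, it can help to confirm which virtual devices are affected. The following is a minimal sketch, assuming the “rev ff” and “rev a1” strings are the revision fields printed by `lspci`. The function name, the regular expression, and the sample device lines are hypothetical illustrations, not output captured from a real host.

```python
import re
from typing import List

# Matches lspci lines such as:
#   41:00.4 3D controller: NVIDIA Corporation Device 2235 (rev a1)
LSPCI_LINE = re.compile(r"^(?P<addr>\S+)\s+.*\(rev (?P<rev>[0-9a-fA-F]{2})\)\s*$")


def find_rev_ff_devices(lspci_output: str) -> List[str]:
    """Return the PCI addresses of devices whose revision is reported as ff."""
    affected = []
    for line in lspci_output.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if match and match.group("rev").lower() == "ff":
            affected.append(match.group("addr"))
    return affected


# Hypothetical sample text. On a real host, the input could be captured
# with `lspci -d 10de:` (10de is NVIDIA's PCI vendor ID).
SAMPLE = """\
41:00.4 3D controller: NVIDIA Corporation Device 2235 (rev a1)
41:00.5 3D controller: NVIDIA Corporation Device 2235 (rev a1)
41:01.0 3D controller: NVIDIA Corporation Device 2235 (rev ff)
"""

print(find_rev_ff_devices(SAMPLE))  # prints ['41:01.0']
```

If the list is empty after ARI is enabled and the host is rebooted, every virtual device reports a valid revision.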
Mixing Different GPUs in a Single Node
Combining GPUs of different architectures in the same node, such as Ampere and Ada-based GPUs, is unsupported because their Resource Manager (RM) software and hardware differ. NVIDIA’s mixed-size vGPU mode allows different vGPU profiles on the same GPU, but it does not enable mixing different GPU architectures within a single node.
Next Steps
Use a single GPU architecture per host to ensure compatibility with the vGPU manager.
To test multiple architectures, separate them across different nodes.
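The single-architecture rule can be checked against a host inventory before deployment. The following is a minimal sketch: the `ARCH_BY_MODEL` table and both function names are hypothetical, and the table is deliberately incomplete and would need to be extended for the GPUs actually in use. The model names could be collected with `nvidia-smi --query-gpu=name --format=csv,noheader`.

```python
from typing import Dict, List

# Hypothetical, incomplete lookup table from GPU model name to architecture.
ARCH_BY_MODEL: Dict[str, str] = {
    "NVIDIA A16": "Ampere",
    "NVIDIA A40": "Ampere",
    "NVIDIA L4": "Ada",
    "NVIDIA L40": "Ada",
}


def group_by_architecture(models: List[str]) -> Dict[str, List[str]]:
    """Group GPU model names by architecture; reject models not in the table."""
    groups: Dict[str, List[str]] = {}
    for model in models:
        if model not in ARCH_BY_MODEL:
            raise ValueError(f"unknown GPU model: {model!r}")
        groups.setdefault(ARCH_BY_MODEL[model], []).append(model)
    return groups


def is_single_architecture(models: List[str]) -> bool:
    """Return True when every GPU in the host shares one architecture."""
    return len(group_by_architecture(models)) <= 1


# Two Ampere GPUs pass; an Ampere GPU next to an Ada GPU does not.
print(is_single_architecture(["NVIDIA A40", "NVIDIA A16"]))  # prints True
print(is_single_architecture(["NVIDIA A40", "NVIDIA L40"]))  # prints False
```

Unknown models raise an error rather than being grouped together, so a gap in the table cannot hide a mixed-architecture host.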