Turing Architecture vGPU Types

The NVIDIA Turing architecture adds hardware ray tracing, tensor cores for AI inference, and improved general-purpose compute over prior generations. The Tesla T4 is a common choice for inference, video processing, and edge workloads licensed with NVIDIA AI Enterprise.

On Turing, vGPU is time-sliced only: VMs share the physical GPU by scheduler quanta, not by fixed hardware partitions. That model fits multi-tenant sharing when workloads do not need deterministic isolation between tenants.
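As a rough illustration of sharing by scheduler quanta, the sketch below simulates a plain round-robin loop in which each VM runs for at most one quantum before the next VM gets the GPU. It is a toy model under stated assumptions, not NVIDIA's scheduler: the function name `round_robin`, the VM names, and the millisecond figures are all hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple


def round_robin(work_ms: Dict[str, int], quantum_ms: int) -> List[Tuple[str, int]]:
    """Return (vm, run_ms) turns for a toy round-robin time-slicing model."""
    if quantum_ms < 1:
        raise ValueError("quantum_ms must be positive")
    queue = deque(work_ms.items())
    timeline = []
    while queue:
        vm, remaining = queue.popleft()
        run = min(quantum_ms, remaining)  # a VM never runs longer than one quantum
        timeline.append((vm, run))
        if remaining > run:
            # Unfinished work goes to the back of the queue for a later turn.
            queue.append((vm, remaining - run))
    return timeline


# Hypothetical workloads: vm-a needs 5 ms of GPU time, vm-b needs 2 ms.
timeline = round_robin({"vm-a": 5, "vm-b": 2}, quantum_ms=2)
```

In this model no VM owns a fixed slice of the hardware; each one only gets turns, which is why time slicing alone does not give deterministic isolation between tenants.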

Turing GPU Architecture

Physical GPUs per board: 1

The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.
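The rule above is a single multiplication. A minimal Python sketch, where the helper name `max_vgpus_per_board` is hypothetical and the inputs are taken from the values listed in this section:

```python
def max_vgpus_per_board(max_vgpus_per_gpu: int, gpus_per_board: int = 1) -> int:
    """Board-level vGPU limit: per-GPU limit times physical GPUs per board."""
    if max_vgpus_per_gpu < 1 or gpus_per_board < 1:
        raise ValueError("both counts must be positive")
    return max_vgpus_per_gpu * gpus_per_board


# The boards in this section carry one physical GPU, so the board limit
# equals the per-GPU limit (here, a type that allows 4 vGPUs per GPU).
per_board = max_vgpus_per_board(4, gpus_per_board=1)
```

With one physical GPU per board, the per-board and per-GPU limits are always the same number.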

This GPU does not support mixed-size mode.

Intended use cases:

  • vGPUs with more than 40 GB of framebuffer: Training Workloads

  • vGPUs with 40 GB of framebuffer: Inference Workloads

Required license edition: NVIDIA AI Enterprise

These vGPU types support a single display with a fixed maximum resolution.

Table 127 NVIDIA vGPU for Compute for NVIDIA Tesla T4

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [1] | Virtual Displays per vGPU |
|---|---|---|---|---|
| T4-16C | 16 | 1 | 3840x2400 | 1 |
| T4-8C | 8 | 2 | 3840x2400 | 1 |
| T4-4C | 4 | 4 | 3840x2400 | 1 |
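The per-GPU limits in Table 127 are consistent with dividing the GPU's framebuffer evenly among same-size vGPUs. The sketch below assumes a total of 16 GB on the Tesla T4 (inferred from the T4-16C type allowing only one vGPU per GPU) and uses a hypothetical helper name; it illustrates the arithmetic only and is not an NVIDIA API.

```python
# Assumption: total framebuffer inferred from T4-16C occupying the whole GPU.
T4_TOTAL_FRAMEBUFFER_GB = 16


def vgpus_that_fit(total_gb: int, vgpu_gb: int) -> int:
    """Number of equal-size vGPUs whose framebuffers fit in total_gb."""
    if vgpu_gb < 1:
        raise ValueError("vgpu_gb must be positive")
    return total_gb // vgpu_gb


# Framebuffer sizes copied from Table 127.
t4_types = {"T4-16C": 16, "T4-8C": 8, "T4-4C": 4}
t4_limits = {
    name: vgpus_that_fit(T4_TOTAL_FRAMEBUFFER_GB, fb_gb)
    for name, fb_gb in t4_types.items()
}
```

Each computed value matches the "Maximum vGPUs per GPU" column for the corresponding type.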

Table 128 NVIDIA vGPU for Compute for NVIDIA Quadro RTX 6000 Passive

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [1] | Virtual Displays per vGPU |
|---|---|---|---|---|
| RTX6000P-24C | 24 | 1 | 3840x2400 | 1 |
| RTX6000P-12C | 12 | 2 | 3840x2400 | 1 |
| RTX6000P-8C | 8 | 3 | 3840x2400 | 1 |
| RTX6000P-6C | 6 | 4 | 3840x2400 | 1 |
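When sizing VMs against Table 128, one common approach is to pick the smallest vGPU type whose framebuffer still covers the workload, since smaller types allow more vGPUs per GPU. The sketch below encodes the table rows as data; the function name `smallest_fitting_type` and the selection policy are illustrative assumptions, not part of any NVIDIA tool.

```python
from typing import Optional

# (type name, framebuffer in GB, maximum vGPUs per GPU), copied from Table 128.
RTX6000P_TYPES = [
    ("RTX6000P-24C", 24, 1),
    ("RTX6000P-12C", 12, 2),
    ("RTX6000P-8C", 8, 3),
    ("RTX6000P-6C", 6, 4),
]


def smallest_fitting_type(required_gb: float) -> Optional[str]:
    """Return the type with the least framebuffer that is >= required_gb."""
    candidates = [row for row in RTX6000P_TYPES if row[1] >= required_gb]
    if not candidates:
        return None  # no single vGPU type on this board is large enough
    return min(candidates, key=lambda row: row[1])[0]


# Hypothetical workload needing about 7 GB of framebuffer.
choice = smallest_fitting_type(7)
```

A workload needing about 7 GB lands on RTX6000P-8C, which allows up to three such vGPUs per GPU; a requirement above 24 GB returns `None` because no listed type covers it.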


Footnotes