GPU Configuration#
Overview#
This reference provides technical specifications for GPU allocation, shared memory configuration, and platform-specific limitations for project containers.
Use this reference to understand GPU request ranges, reservation behavior, CUDA compatibility requirements, and Windows GPU allocation limitations.
For step-by-step instructions on configuring GPUs, see Configure GPU Settings for Project Container.
For GPU configuration in multi-container environments, see Multi-Container Environments (Docker Compose).
Key Concepts#
- GPU Request:
The number of GPUs (0-8) specified for a project container.
- GPU Reservation:
The allocation mechanism that explicitly assigns available GPUs to containers.
- Shared Memory:
The size of /dev/shm allocated to the container for inter-GPU communication.
- CUDA Compatibility:
The requirement that container CUDA version matches host GPU driver capabilities.
Requirements#
To configure GPUs, you must meet the following requirements:
- NVIDIA GPU available on the host system
- Project container with CUDA installed (for GPU-enabled projects)
- CUDA version in the container compatible with the host GPU drivers
- Container runtime (Docker or Podman) configured for GPU access
Note
AI Workbench installs or updates all necessary software for NVIDIA GPUs. You do not need to install anything.
The only exception is Windows, where you must install the NVIDIA drivers yourself.
GPU Allocation Specifications#
GPU Request Range#
The GPU request range has the following characteristics:
- Valid values: 0 to 8
- Behavior:
  - 0 GPUs: The container starts without GPU access (CPU-only mode).
  - 1-8 GPUs: AI Workbench checks available GPUs and reserves the requested number.
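For illustration only, a GPU request of this kind might be recorded in a project's spec.yaml along the following lines. This is a minimal sketch assuming a plausible layout: apart from the 0-8 range described above, the field names shown are assumptions, not the authoritative AI Workbench schema.

```yaml
# Hypothetical sketch of a GPU request in a project specification.
# The keys under "resources" are illustrative assumptions, not the
# documented AI Workbench schema.
environment:
  resources:
    gpus: 1                 # valid range: 0-8; 0 starts the container CPU-only
    shared_memory_mb: 1024  # size of /dev/shm allocated to the container
```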
Allocation Mechanism#
When a container starts with a GPU request, AI Workbench performs the following sequence of actions:
1. AI Workbench checks an internal record of available GPUs on the host.
2. If enough GPUs are available, AI Workbench reserves them and explicitly passes them into the container.
3. If not enough GPUs are available, AI Workbench notifies you and the container does not start.
CUDA Compatibility#
Container Requirements#
The container requirements for CUDA compatibility are:
CUDA Installation: The project container must have CUDA installed to use GPUs.
Version Compatibility: The CUDA version in the container must be compatible with the host’s GPU drivers. Incompatible versions will cause runtime errors.
AI Workbench uses the cuda_version field in the project specification (spec.yaml) to verify compatibility. If this field is incorrect or missing, GPU allocation may fail without clear warnings.
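As an illustration, the compatibility field might appear in spec.yaml as follows. The cuda_version field name is taken from the description above; the enclosing structure and the version value are assumptions, not the exact schema.

```yaml
# Sketch of the CUDA compatibility field in a project's spec.yaml.
# Only the cuda_version field name is documented here; the surrounding
# key and the value shown are illustrative.
environment:
  cuda_version: "12.2"   # must be compatible with the host GPU drivers
```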
Platform-Specific Behavior#
Windows GPU Allocation Limitation#
Limitation: On Windows, requesting any number of GPUs (1 or more) causes all GPUs on the system to be passed into the container, due to a Windows driver limitation.
Example:
If you have a Windows system with 4 GPUs and you request 1 GPU for a project, all 4 GPUs will be passed into the container.
Impact:
- You cannot restrict a Windows project to a subset of available GPUs.
- All GPUs on the system will be accessible to the container when any GPU request is made.
This limitation does not affect Linux or macOS.
Linux and macOS Behavior#
On Linux and macOS, GPU requests work as expected:
- Requesting 1 GPU passes exactly 1 GPU into the container.
- Requesting N GPUs passes exactly N GPUs (if available).
- You can restrict projects to specific GPU counts.
Project Containers vs Multi-Container Environments#
Single-Container Projects#
Single-container projects have the following characteristics:
Configuration method: Desktop App or CLI
Options available:
- Number of Desired GPUs (0-8)
- Shared Memory (MiB)
Location: Project Tab > Project Container > Hardware
Multi-Container Environments (Compose)#
Multi-container environments (Compose) have the following characteristics:
Configuration method: Compose file specification.
Format: Use deploy.resources.reservations.devices in the service definition (see the sketch at the end of this section).
Documentation: See Multi-Container Environments (Docker Compose) for complete GPU configuration details.
Key difference: GPU allocation in compose files uses a different syntax and is configured per-service rather than per-project.
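As a hedged sketch of the compose syntax, the following service definition reserves one NVIDIA GPU and also sets the shared memory size. The deploy.resources.reservations.devices block is standard Docker Compose syntax; the service name, image, and size values are placeholders.

```yaml
# Example compose service that reserves one NVIDIA GPU.
# The service name, image, shm_size, and count are placeholder values.
services:
  trainer:
    image: my-cuda-app:latest
    shm_size: "1gb"           # compose-level analogue of the Shared Memory (MiB) option
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1        # number of GPUs to reserve for this service
              capabilities: [gpu]
```

Because the reservation is declared per service, each service in the compose file can request its own GPU count, which is the key difference from the per-project setting described above.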