GPU Configuration#

This page documents the technical specifications for GPU allocation, shared memory, and platform-specific limitations.

Requirements#

  • NVIDIA GPU available on the host system

  • Project container with CUDA installed (for GPU-enabled projects)

  • CUDA version in container must be compatible with host GPU drivers

  • Container runtime (Docker or Podman) configured for GPU access

GPU Allocation Specifications#

GPU Request Range#

The GPU request range has the following characteristics:

  • Valid values: 0 to 8

  • Behavior:

    • 0 GPUs: Container starts without GPU access (CPU-only mode)

    • 1-8 GPUs: Workbench checks available GPUs and reserves the requested number
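The request range above can be expressed as a small validation check. This is a minimal sketch only: `validate_gpu_request` and the mode strings are hypothetical names chosen for illustration, not part of AI Workbench.

```python
# Hypothetical helper: models the documented 0-8 request range.
MIN_GPUS = 0
MAX_GPUS = 8


def validate_gpu_request(requested: int) -> str:
    """Return the container mode for a GPU request, or raise if it is invalid."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(requested, bool) or not isinstance(requested, int):
        raise TypeError("GPU request must be an integer")
    if not MIN_GPUS <= requested <= MAX_GPUS:
        raise ValueError(
            f"GPU request must be between {MIN_GPUS} and {MAX_GPUS}, got {requested}"
        )
    # 0 means the container starts without GPU access.
    return "cpu-only" if requested == 0 else "gpu"
```

For example, `validate_gpu_request(0)` returns `"cpu-only"`, while `validate_gpu_request(9)` raises `ValueError`.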

Allocation Mechanism#

When a container starts with a GPU request, AI Workbench does the following:

  1. Checks an internal reference of available GPUs on the host.

  2. If enough GPUs are available, they are reserved and explicitly passed into the container.

  3. If not enough GPUs are available, the user is notified and the container will not start.
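The three steps above can be sketched as a simple reservation pool. This illustrates the described behavior under stated assumptions and is not AI Workbench's actual implementation: the `GpuPool` class, its method names, and the use of integer GPU indices are all hypothetical.

```python
class GpuPool:
    """Hypothetical model of an internal reference of available host GPUs."""

    def __init__(self, gpu_ids):
        self._free = list(gpu_ids)   # GPUs not yet reserved by any container
        self._reserved = {}          # container name -> list of reserved GPU ids

    def reserve(self, container: str, count: int) -> list:
        """Reserve `count` GPUs for `container`, or raise if too few are free."""
        if count == 0:
            return []  # CPU-only container, nothing to reserve
        # Steps 1 and 3: check availability and refuse to start if it falls short.
        if count > len(self._free):
            raise RuntimeError(
                f"requested {count} GPU(s) but only {len(self._free)} available; "
                "container will not start"
            )
        # Step 2: reserve the GPUs and pass exactly these ids to the container.
        granted, self._free = self._free[:count], self._free[count:]
        self._reserved[container] = granted
        return granted
```

With a four-GPU host, `GpuPool([0, 1, 2, 3]).reserve("project-a", 2)` returns `[0, 1]`, and a later request that exceeds the remaining GPUs raises `RuntimeError` instead of starting the container.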

Shared Memory Configuration#

The shared memory configuration has the following characteristics:

  • Parameter: Shared Memory (in MiB)

  • Purpose: Sets the size of /dev/shm in the container for inter-GPU communication.

  • When required: Should be configured when requesting more than 1 GPU.

  • Valid values: Positive integers, in mebibytes (MiB).

  • Specification location: Project Tab > Project Container > Hardware in the Desktop App, or execution.resources.sharedMemoryMB in spec.yaml.
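As a rough illustration of the setting above, the following sketch reads execution.resources.sharedMemoryMB from an already-parsed spec.yaml (represented here as a plain dictionary) and converts it to bytes. The helper name `shared_memory_bytes` is hypothetical, and the sketch assumes the value is in MiB as this page states, even though the field name ends in "MB".

```python
from typing import Optional

MIB = 1024 * 1024  # one mebibyte in bytes


def shared_memory_bytes(spec: dict) -> Optional[int]:
    """Return the configured /dev/shm size in bytes, or None if it is unset."""
    resources = spec.get("execution", {}).get("resources", {})
    value = resources.get("sharedMemoryMB")
    if value is None:
        return None
    # Valid values are positive integers (bool is excluded explicitly).
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"sharedMemoryMB must be a positive integer, got {value!r}")
    return value * MIB


# Example spec fragment; only the sharedMemoryMB path is taken from this page.
spec = {"execution": {"resources": {"sharedMemoryMB": 1024}}}
print(shared_memory_bytes(spec))  # 1073741824
```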

CUDA Compatibility#

Container Requirements#

The container requirements for CUDA compatibility are:

  • CUDA Installation: The project container must have CUDA installed to use GPUs.

  • Version Compatibility: The CUDA version in the container must be compatible with the host’s GPU drivers. Incompatible versions will cause runtime errors.

AI Workbench uses the cuda_version field in the project specification (spec.yaml) to verify compatibility. If this field is incorrect or missing, GPU allocation may fail without clear warnings.
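To make the idea of a version check concrete, here is a deliberately simplified sketch. It assumes the host reports the highest CUDA version its driver supports (for example, the "CUDA Version" shown in the `nvidia-smi` header) and treats a container as compatible when its cuda_version does not exceed that value. NVIDIA's real compatibility rules are more nuanced, and this is not how AI Workbench is known to perform the check; both function names are hypothetical.

```python
def parse_version(text: str) -> tuple:
    """Turn a version string such as '12.2' into a tuple of ints: (12, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


def cuda_is_compatible(container_cuda: str, driver_max_cuda: str) -> bool:
    """Simplified rule: container CUDA must not be newer than the driver supports."""
    # Compare only major.minor; patch components are ignored in this sketch.
    return parse_version(container_cuda)[:2] <= parse_version(driver_max_cuda)[:2]
```

Under this rule, a container with cuda_version 12.2 on a host whose driver supports up to 12.4 passes the check, while a 12.4 container on a host limited to 11.8 does not.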

Platform-Specific Behavior#

Windows GPU Allocation Limitation#

Limitation: On Windows, if you request one or more GPUs, all GPUs on the system are passed into the container due to a Windows driver limitation.

Example:

If you have a Windows system with 4 GPUs and you request 1 GPU for a project, all 4 GPUs will be passed into the container.

Impact:

  • You cannot restrict a Windows project to a subset of available GPUs.

  • All GPUs on the system will be accessible to the container when any GPU request is made.

  • This limitation does not affect Linux or macOS.

Ubuntu Behavior#

GPU requests work as expected on Ubuntu.

  • Requesting 1 GPU passes exactly 1 GPU into the container.

  • Requesting N GPUs passes exactly N GPUs (if available).

  • You can restrict projects to specific GPU counts.
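The difference between the Windows and Ubuntu behavior described above can be summarized in a few lines. This is a sketch of the documented outcome only: the function name and the platform strings are hypothetical, and macOS is not modeled.

```python
def gpus_visible_in_container(platform: str, requested: int, host_gpus: int) -> int:
    """Return how many GPUs the container can see for a given request."""
    if requested == 0:
        return 0  # CPU-only mode on every platform
    if requested > host_gpus:
        raise RuntimeError("not enough GPUs available; container will not start")
    if platform == "windows":
        # Windows driver limitation: any GPU request exposes every host GPU.
        return host_gpus
    # Ubuntu: exactly the requested number is passed in.
    return requested


print(gpus_visible_in_container("windows", 1, 4))  # 4
print(gpus_visible_in_container("ubuntu", 1, 4))   # 1
```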

Project Containers vs Multi-Container Environments#

Single-Container Projects#

Single-container projects have the following characteristics:

  • Configuration method: Desktop App or CLI

  • Options available:

    • Number of Desired GPUs (0-8)

    • Shared Memory (MiB)

  • Location: Project Tab > Project Container > Hardware

Multi-Container Environments (Compose)#

Multi-container environments (Compose) have the following characteristics:

  • Configuration method: Compose file specification.

  • Format: Use deploy.resources.reservations.devices in the service definition.

  • Documentation: See Multi-Container Environments (Docker Compose) for complete GPU configuration details.

  • Key difference: GPU allocation in Compose files uses a different syntax and is configured per service rather than per project.
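To show what per-service configuration looks like, the sketch below builds a Compose document in Python and prints it as JSON, which is also valid YAML. The `driver`, `count`, and `capabilities` keys follow the standard Docker Compose device reservation syntax; the service names and image names are placeholders, the `gpu_service` helper is hypothetical, and any additional fields AI Workbench may expect are not covered here.

```python
import json


def gpu_service(image: str, gpu_count: int) -> dict:
    """Build a Compose service definition that reserves `gpu_count` NVIDIA GPUs."""
    return {
        "image": image,
        "deploy": {
            "resources": {
                "reservations": {
                    "devices": [
                        {
                            "driver": "nvidia",
                            "count": gpu_count,
                            "capabilities": ["gpu"],
                        }
                    ]
                }
            }
        },
    }


# GPU access is declared per service: "trainer" reserves one GPU, "web" none.
compose = {
    "services": {
        "trainer": gpu_service("example/trainer:latest", 1),  # placeholder image
        "web": {"image": "example/web:latest"},               # placeholder image
    }
}

print(json.dumps(compose, indent=2))
```

Only the `trainer` service carries a device reservation, which reflects the per-service model described above.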