Multi-Container Environments (Docker Compose)#

Overview#

This reference provides technical specifications for using Docker Compose with AI Workbench projects.

Use this reference to understand compose file requirements, valid file names and locations, service profile specifications, GPU allocation in multi-container environments, and environment variable handling.

For step-by-step instructions on creating and running multi-container applications, see Use Multi-Container Environments.

Key Concepts#

Compose File:

The docker-compose.yml or compose.yaml file that defines services, networks, and volumes for multi-container environments.

Service Profile:

A Compose feature that conditionally enables services based on selected profiles.

NVWB_TRIM_PREFIX:

An environment variable that controls proxy routing behavior for web services.

Shared Volume:

The /nvwb-shared-volume mount automatically created for all compose containers.

Requirements#

To use multi-container environments with AI Workbench, you must have:

  • AI Workbench project with Docker Compose file in the repository.

  • Basic understanding of Docker Compose syntax and concepts.

  • For GPU services: NVIDIA GPU available on the host.

For more information about Docker Compose, see the Docker Compose documentation.

Compose File Specifications#

The compose file is a YAML file that defines the services, networks, and volumes for a multi-container environment.

File Names and Locations#

The compose file must use one of the supported file names and locations described below.

Valid File Names#

AI Workbench recognizes the following file names:

  • compose.yaml

  • compose.yml

  • docker-compose.yml

  • docker-compose.yaml

Default Locations#

The compose file must be in one of the following locations:

  • Project root: /project/

  • Deploy folder: /project/deploy/

Custom Location#

You can specify a different location by editing the project specification file:

  1. Add the field environment:compose-file-path to .project/spec.yaml

  2. Set the value to the relative path to the compose file in the project repository, as shown in the sketch below
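
For example, a minimal sketch of the relevant section of .project/spec.yaml, assuming the field nests under environment as described above (the path is hypothetical):

environment:
  # Relative path from the project repository root to the compose file
  compose-file-path: code/compose/compose.yaml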

For more information about the project specification file, see AI Workbench Project Specification.

Service Profiles#

Service profiles allow conditional service activation based on hardware or environment requirements.

For complete information about service profiles, see the Docker Compose profiles documentation.

Usage in AI Workbench:

  • Select one or more profiles in Project Tab > Environment > Compose before starting.

  • Services without a profile always run.

  • Services with a profile only run when that profile is selected (see the sketch below).
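
A minimal sketch of this behavior (the image names are placeholders; Example 2 at the end of this page shows a complete file):

services:
  web:
    # No profile: this service always runs
    image: myapp/web:latest

  gpu-worker:
    # Runs only when the gpu-service profile is selected before starting
    image: myapp/gpu-worker:latest
    profiles: [gpu-service]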

GPU Configuration in Compose#

The compose file specifies how GPUs are allocated to services that require GPU acceleration.

GPU Allocation by Count#

You can specify GPU requests using the deploy.resources.reservations format with a count value.

AI Workbench manages GPU reservations and explicitly passes GPUs into each container, preventing conflicts between services.

Example with count:

services:
  gpu-service:
    image: nvidia/cuda:12.2.0-base-ubuntu22.04
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Configuration Fields:

  • runtime: Should be nvidia for GPU access

  • driver: Must be nvidia

  • count: Number of GPUs to allocate to this service (Docker assigns available GPUs)

  • capabilities: Must include [gpu]

GPU Allocation by Device ID#

You can assign specific GPU devices to services using device_ids instead of count.

Example with device_ids:

services:
  embedding-service:
    image: nvcr.io/nim/nvidia/llama-3.2-nv-embedqa-1b-v2:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]

  llm-service:
    image: nvcr.io/nim/meta/llama-3.1-70b-instruct:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['1', '2']
              capabilities: [gpu]

Configuration Fields:

  • device_ids: List of specific GPU device IDs to assign

  • Device IDs correspond to GPU indices shown in nvidia-smi output

  • Multiple device IDs can be specified for multi-GPU services

When to use device_ids vs count:

  • Use count when you don’t care which specific GPUs are assigned

  • Use device_ids when you need precise GPU assignment across multiple services

  • Use device_ids to prevent services from competing for the same GPU

Multi-GPU Configurations#

Services requiring multiple GPUs can specify them using either count or device_ids.

Example with multiple GPUs:

services:
  large-model:
    image: nvcr.io/nim/meta/llama-3.1-405b-instruct:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 8
              capabilities: [gpu]
    shm_size: 20gb

Shared Memory for Multi-GPU:

  • Multi-GPU services often require increased shared memory

  • Set shm_size to allocate shared memory (e.g., 20gb, 16GB)

  • Required for inter-GPU communication in many AI workloads

GPU Allocation Conflicts#

AI Workbench detects GPU reservation conflicts: if the requested GPUs exceed the GPUs available on the host, the compose environment fails to start with an error message.

Common conflict scenarios:

  • Project container using GPUs that compose services request

  • Multiple services requesting the same specific device_ids (illustrated in the sketch below)

  • Total count across all services exceeds available GPUs
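
As an illustration of the second scenario, a sketch (placeholder images) in which both services pin the same GPU, so the environment fails to start:

services:
  service-a:
    image: myorg/service-a:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']   # same device as service-b: conflict
              capabilities: [gpu]

  service-b:
    image: myorg/service-b:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']   # same device as service-a: conflict
              capabilities: [gpu]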

Resolution:

  • Check GPU availability with nvidia-smi before starting services

  • Ensure the project container is not using GPUs needed by compose services

  • Use device_ids to explicitly control GPU assignment across services

CPU and Memory Resources#

Services can specify CPU and memory resource limits and reservations to prevent resource exhaustion.

Resource Limits#

Resource limits define the maximum resources a service can consume.

Example:

services:
  my-service:
    image: myapp/service:latest
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '1.0'
          memory: 2G

Configuration Fields:

  • limits.cpus: Maximum CPU cores (e.g., '2.0' for 2 cores, '0.5' for half a core)

  • limits.memory: Maximum memory (e.g., 4G, 512M)

  • reservations.cpus: Minimum CPU cores reserved for the service

  • reservations.memory: Minimum memory reserved for the service

Best Practices:

  • Set limits for inference services to prevent consuming all host resources

  • Use reservations to guarantee minimum resources for critical services

  • Important when running multiple AI services on a single host

Environment Variables and Secrets#

You can use environment variables and secrets in your compose file to configure your services.

Environment Variables#

Environment variables defined in your AI Workbench project are available in compose services through interpolation:

services:
  myservice:
    environment:
      - MY_VAR=${TEST_VAR}

Special Environment Variables#

  • PROXY_PREFIX: Automatically injected into services, contains the proxy routing path.

  • NVWB_TRIM_PREFIX: Set to true to enable proxy routing with prefix trimming (see the sketch below).
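
A minimal sketch of a web service that opts into proxy routing (the image and port are placeholders); PROXY_PREFIX is injected automatically, so it does not appear in the file:

services:
  web-ui:
    image: myapp/web-ui:latest
    environment:
      # Enable proxy routing with prefix trimming
      - NVWB_TRIM_PREFIX=true
    ports:
      - '8080:8080'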

Secrets#

Secrets are used to store sensitive information in your compose file.

You can use AI Workbench secrets in your compose file in two ways:

  1. Interpolation (as environment variables):

    environment:
      - MY_SECRET=${SECRET_NAME}
    
  2. Compose secrets (as files):

    services:
      myservice:
        secrets:
          - SECRET_NAME
        environment:
          - SECRET_FILE=/run/secrets/SECRET_NAME
    
    secrets:
      SECRET_NAME:
        environment: "HOME"
    

Important

When using compose secrets, the global secrets section must include the secret name with environment: "HOME". AI Workbench automatically replaces this value at runtime.

Networks#

Networks enable service-to-service communication in multi-container environments.

Network Configuration#

Services on the same network can reach each other using service names as hostnames.

Example:

services:
  frontend:
    image: myapp/frontend:latest
    networks:
      - app-network
    environment:
      - API_URL=http://backend:8080

  backend:
    image: myapp/backend:latest
    networks:
      - app-network
    ports:
      - "8080:8080"

networks:
  app-network:
    driver: bridge

Network Types:

  • bridge: Default network driver for standalone containers on a single host

  • Services on the same bridge network can communicate by service name

Service Discovery:

  • A service named backend is accessible at hostname backend from other services

  • Port numbers in inter-service communication refer to container ports, not host ports

  • External port mappings (ports) expose services to the host or AI Workbench proxy

Multiple Networks:

Services can connect to multiple networks for network segmentation and isolation.

services:
  app:
    networks:
      - frontend-network
      - backend-network

networks:
  frontend-network:
  backend-network:

Mounts and Volumes#

You can use mounts and volumes in your compose file to share files and data between your services.

Volume Types#

Docker Compose supports three volume types for persistent storage:

Named Volumes:

Managed by the container engine (Docker or Podman), useful for data that should persist but doesn’t need direct host access.

services:
  database:
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:

Bind Mounts:

Connect to specific host directories, useful for development or accessing existing data.

services:
  nim:
    volumes:
      - type: bind
        source: /tmp
        target: /opt/nim/.cache/

Tmpfs Mounts:

Store data in host memory, useful for temporary sensitive data.

services:
  app:
    tmpfs:
      - /tmp

Shared Volume#

AI Workbench automatically creates a shared volume accessible to all containers. This shared volume has the following characteristics:

  • Mount path: /nvwb-shared-volume

  • Purpose: Enables file sharing between the project container and compose services (see the sketch after this list).

  • Permissions: If services run as different users, you may need to modify file permissions for shared access.
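
As a sketch, a service can write results to the shared volume so that the project container or other services can read them; because AI Workbench injects the mount automatically, no volumes entry is needed (the busybox image and file name are illustrative):

services:
  exporter:
    image: busybox:latest
    # /nvwb-shared-volume is mounted automatically by AI Workbench
    command: ["sh", "-c", "echo done > /nvwb-shared-volume/status.txt"]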

Bind Mounts#

AI Workbench does not manage bind mount source paths for compose containers.

You must specify bind mount source paths directly in the compose file. These paths can be relative to the project repository or absolute paths on the host.

Common Bind Mount Patterns:

services:
  service:
    volumes:
      # Relative to project root
      - ./data:/app/data
      # Absolute host path
      - /mnt/datasets:/data
      # Model cache directory
      - ${MODEL_DIRECTORY:-/tmp}:/opt/nim/.cache

External Volumes#

External volumes are managed outside of Compose and must be created before starting services.

services:
  nim:
    volumes:
      - nim_cache:/opt/nim/.cache

volumes:
  nim_cache:
    external: true

Create external volumes with docker volume create nim_cache before running compose.

Service Dependencies and Healthchecks#

Service dependencies control startup order and ensure services are ready before dependent services start.

depends_on Configuration#

The depends_on option specifies service dependencies.

Basic Dependency:

services:
  frontend:
    depends_on:
      - backend
  backend:
    image: myapp/backend:latest

With Healthcheck Conditions:

services:
  frontend:
    depends_on:
      backend:
        condition: service_healthy
  backend:
    image: myapp/backend:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 10s
      timeout: 5s
      retries: 3

Dependency Conditions:

  • service_started: Default, waits for service to start (not necessarily ready)

  • service_healthy: Waits for service to pass healthcheck

  • service_completed_successfully: Waits for service to complete and exit successfully

Healthcheck Configuration#

Healthchecks verify that a service is ready to accept requests.

Healthcheck Fields:

services:
  nim-llm:
    healthcheck:
      test: ["CMD", "python3", "-c", "import requests; requests.get('http://localhost:8000/v1/health/ready')"]
      interval: 10s
      timeout: 20s
      retries: 100
      start_period: 10m

Configuration Options:

  • test: Command to run (returns 0 for healthy, 1 for unhealthy)

  • interval: Time between healthcheck attempts

  • timeout: Maximum time to wait for healthcheck to complete

  • retries: Number of consecutive failures before marking unhealthy

  • start_period: Grace period before first healthcheck (useful for slow-starting services)

Common Healthcheck Patterns:

HTTP endpoint check:

test: ["CMD", "curl", "-f", "http://localhost:8000/health"]

Python script check:

test: ["CMD", "python3", "-c", "import requests; requests.get('http://localhost:8000/v1/health/ready')"]

Port check:

test: ["CMD-SHELL", "nc -z localhost 8080"]

Best Practices:

  • Define healthchecks for all critical services

  • Use appropriate start_period for services that take time to initialize

  • Set realistic retries and timeout values for AI services that may take time to respond

  • Healthchecks are essential for complex multi-service pipelines

Versioning#

The compose file is versioned with the project repository.

What is Versioned#

As long as the compose file is in a tracked folder in the project repository, Git versions it along with the rest of the project.

What is NOT Versioned#

The container images used by the services are not versioned, for the following reasons:

  • Service container images are pulled from registries at runtime.

  • Image versions are determined by tags in the compose file (see the sketch below).

  • Updating the compose file and committing changes does not version the actual container images.
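
For example, pinning an explicit tag keeps the compose file reproducible even though the image itself is still pulled from the registry at runtime (the tag below is illustrative):

services:
  llm-service:
    # The pinned tag is versioned with the compose file; the image contents are not
    image: nvcr.io/nim/meta/llama-3.1-70b-instruct:1.1.2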

Examples#

Example 1: Simple Web Service#

A basic compose file with a single web service that always runs:

services:

   web1:
      image: hashicorp/http-echo
      environment:
         # NVWB_TRIM_PREFIX=true enables proxy routing with prefix trimming
         # PROXY_PREFIX is automatically injected
         - NVWB_TRIM_PREFIX=true
      ports:
         - '5678:5678'
      command: ["-text=hello from service 1"]

Example 2: GPU-Enabled Service with Profile#

A compose file with two services - one always runs, one requires GPU and uses a profile:

services:

   web1:
      image: hashicorp/http-echo
      environment:
         - NVWB_TRIM_PREFIX=true
      ports:
         - '5678:5678'
      command: ["-text=hello from service 1"]

   web2:
      image: hashicorp/http-echo
      profiles: [gpu-service]
      environment:
         - NVWB_TRIM_PREFIX=true
      ports:
         - '5679:5679'
      # GPU allocation specification
      deploy:
        resources:
          reservations:
            devices:
              - driver: nvidia
                count: 1
                capabilities: [gpu]
      command: ["-text=hello from service 2", "-listen=:5679"]

Example 3: Environment Variables and Secrets via Interpolation#

A compose file using AI Workbench environment variables and secrets through interpolation:

services:

   web3:
      image: hashicorp/http-echo
      environment:
         - NVWB_TRIM_PREFIX=true
         # Environment variables from AI Workbench project
         - TEST_ENV_VAR=${TEST_VAR}
         # Secrets also available via interpolation
         - TEST_SECRET_FROM_ENV_VAR=${TEST_SECRET}
      ports:
         - '5678:5678'
      command: ["-text=${TEST_VAR}"]

Note

Create the variable TEST_VAR and secret TEST_SECRET in your AI Workbench project before using this example.

Example 4: Compose Secrets as Files#

A compose file using AI Workbench secrets through the Compose secrets mechanism:

services:

   web1:
      image: hashicorp/http-echo
      environment:
         - NVWB_TRIM_PREFIX=true
      ports:
         - '5678:5678'
      command: ["-text=hello from service 1"]

   web4:
      image: hashicorp/http-echo
      profiles: [compose-secret]
      environment:
         - NVWB_TRIM_PREFIX=true
         # Secret mounted as file by compose
         - TEST_SECRET_FILE=/run/secrets/TEST_SECRET
      ports:
         - '5680:5680'
      # Link to the global secrets section
      secrets:
         - TEST_SECRET
      command: ["-text=hello from service 4", "-listen=:5680"]

# Global secrets section - AI Workbench replaces the value at runtime
# The name must match the secret name in your AI Workbench project
secrets:
  TEST_SECRET:
    environment: "HOME"

Note

Create the secret TEST_SECRET in your AI Workbench project before using this example.