Multi-Container Environments (Docker Compose)#
Overview#
This reference provides technical specifications for using Docker Compose with AI Workbench projects.
Use this reference to understand compose file requirements, valid file names and locations, service profile specifications, GPU allocation in multi-container environments, and environment variable handling.
For step-by-step instructions on creating and running multi-container applications, see Use Multi-Container Environments.
Key Concepts#
- Compose File:
The docker-compose.yml or compose.yaml file that defines services, networks, and volumes for multi-container environments.
- Service Profile:
A Compose feature that conditionally enables services based on selected profiles.
- NVWB_TRIM_PREFIX:
An environment variable that controls proxy routing behavior for web services.
- Shared Volume:
The /nvwb-shared-volume mount automatically created for all compose containers.
Requirements#
To use multi-container environments with AI Workbench, you must have:
An AI Workbench project with a Docker Compose file in the repository.
A basic understanding of Docker Compose syntax and concepts.
For GPU services: an NVIDIA GPU available on the host.
For more information about Docker Compose, see the Docker Compose documentation.
Compose File Specifications#
The compose file is a YAML file that defines the services, networks, and volumes for a multi-container environment.
File Names and Locations#
AI Workbench places restrictions on the compose file name and location.
Valid File Names#
AI Workbench recognizes the following file names:
compose.yaml
compose.yml
docker-compose.yml
docker-compose.yaml
Default Locations#
The compose file must be in one of the following locations:
Project root: /project/
Deploy folder: /project/deploy/
Custom Location#
You can specify a different location by editing the project specification file:
Add the field environment:compose-file-path to .project/spec.yaml
Set the value to the relative path to the compose file in the project repository
For more information about the project specification file, see AI Workbench Project Specification.
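For illustration, a hypothetical excerpt of .project/spec.yaml with this field set; the surrounding structure is abbreviated, so check the project specification reference for the exact schema:

# .project/spec.yaml (excerpt, abbreviated for illustration)
environment:
  # Hypothetical path, relative to the project repository root
  compose-file-path: code/deploy/docker-compose.yaml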
Service Profiles#
Service profiles allow conditional service activation based on hardware or environment requirements.
For complete information about service profiles, see the Docker Compose profiles documentation.
Usage in AI Workbench:
Select one or more profiles in Project Tab > Environment > Compose before starting.
Services without a profile always run.
Services with a profile only run when that profile is selected.
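For example, a minimal compose file might run one service unconditionally and gate a second service behind a profile. This is standard Docker Compose syntax; the service names and images below are illustrative:

services:
  web:
    image: hashicorp/http-echo
    command: ["-text=always runs"]

  gpu-worker:
    # Runs only when the gpu-service profile is selected before starting
    profiles: [gpu-service]
    image: nvidia/cuda:12.2.0-base-ubuntu22.04
    command: ["sleep", "infinity"]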
GPU Configuration in Compose#
GPU configuration in compose specifies GPU allocation for services requiring GPU acceleration.
GPU Allocation by Count#
You can specify GPU requests using the deploy.resources.reservations format with a count value.
AI Workbench manages GPU reservations and explicitly passes GPUs into each container, preventing conflicts between services.
Example with count:
services:
  gpu-service:
    image: nvidia/cuda:12.2.0-base-ubuntu22.04
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
Configuration Fields:
runtime: Should be nvidia for GPU access
driver: Must be nvidia
count: Number of GPUs to allocate to this service (Docker assigns available GPUs)
capabilities: Must include [gpu]
GPU Allocation by Device ID#
You can assign specific GPU devices to services using device_ids instead of count.
Example with device_ids:
services:
  embedding-service:
    image: nvcr.io/nim/nvidia/llama-3.2-nv-embedqa-1b-v2:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]

  llm-service:
    image: nvcr.io/nim/meta/llama-3.1-70b-instruct:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['1', '2']
              capabilities: [gpu]
Configuration Fields:
device_ids: List of specific GPU device IDs to assign
Device IDs correspond to GPU indices shown in nvidia-smi output
Multiple device IDs can be specified for multi-GPU services
When to use device_ids vs count:
Use count when you don’t care which specific GPUs are assigned
Use device_ids when you need precise GPU assignment across multiple services
Use device_ids to prevent services from competing for the same GPU
Multi-GPU Configurations#
Services requiring multiple GPUs can specify them using either count or device_ids.
Example with multiple GPUs:
services:
  large-model:
    image: nvcr.io/nim/meta/llama-3.1-405b-instruct:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 8
              capabilities: [gpu]
    shm_size: 20gb
Shared Memory for Multi-GPU:
Multi-GPU services often require increased shared memory
Set shm_size to allocate shared memory (e.g., 20gb, 16GB)
Required for inter-GPU communication in many AI workloads
GPU Allocation Conflicts#
AI Workbench handles GPU reservation conflicts: if the requested GPUs exceed the GPUs available on the host, the compose environment fails to start with an error message.
Common conflict scenarios:
Project container using GPUs that compose services request
Multiple services requesting the same specific device_ids
Total count across all services exceeds available GPUs
Resolution:
Check GPU availability with nvidia-smi before starting services
Ensure the project container is not using GPUs needed by compose services
Use device_ids to explicitly control GPU assignment across services
CPU and Memory Resources#
Services can specify CPU and memory resource limits and reservations to prevent resource exhaustion.
Resource Limits#
Resource limits define the maximum resources a service can consume.
Example:
services:
  my-service:
    image: myapp/service:latest
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '1.0'
          memory: 2G
Configuration Fields:
limits.cpus: Maximum CPU cores (e.g., '2.0' for 2 cores, '0.5' for half a core)
limits.memory: Maximum memory (e.g., 4G, 512M)
reservations.cpus: Minimum CPU cores reserved for the service
reservations.memory: Minimum memory reserved for the service
Best Practices:
Set limits for inference services to prevent consuming all host resources
Use reservations to guarantee minimum resources for critical services
Important when running multiple AI services on a single host
Environment Variables and Secrets#
You can use environment variables and secrets in your compose file to configure your services.
Environment Variables#
Environment variables defined in your AI Workbench project are available in compose services through interpolation:
services:
  myservice:
    environment:
      - MY_VAR=${TEST_VAR}
Special Environment Variables#
PROXY_PREFIX: Automatically injected into services; contains the proxy routing path.
NVWB_TRIM_PREFIX: Set to true to enable proxy routing with prefix trimming.
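A minimal sketch of how a web service typically uses these variables; the service name, image, and port are illustrative:

services:
  web:
    image: hashicorp/http-echo
    environment:
      # Ask the AI Workbench proxy to strip its routing prefix before
      # forwarding requests to this service
      - NVWB_TRIM_PREFIX=true
      # PROXY_PREFIX is injected automatically at runtime and does not
      # need to be declared here; an app can read it to build URLs
    ports:
      - '5678:5678'
    command: ["-text=hello"]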
Secrets#
Secrets are used to store sensitive information in your compose file.
You can use AI Workbench secrets in your compose file in two ways:
Interpolation (as environment variables):
environment:
  - MY_SECRET=${SECRET_NAME}
Compose secrets (as files):
services:
  myservice:
    secrets:
      - SECRET_NAME
    environment:
      - SECRET_FILE=/run/secrets/SECRET_NAME

secrets:
  SECRET_NAME:
    environment: "HOME"
Important
When using compose secrets, the global secrets section must include the secret name with environment: "HOME". AI Workbench automatically replaces this value at runtime.
Networks#
Networks enable service-to-service communication in multi-container environments.
Network Configuration#
Services on the same network can reach each other using service names as hostnames.
Example:
services:
  frontend:
    image: myapp/frontend:latest
    networks:
      - app-network
    environment:
      - API_URL=http://backend:8080

  backend:
    image: myapp/backend:latest
    networks:
      - app-network
    ports:
      - "8080:8080"

networks:
  app-network:
    driver: bridge
Network Types:
bridge: Default network driver for standalone containers on a single host
Services on the same bridge network can communicate by service name
Service Discovery:
A service named backend is accessible at hostname backend from other services
Port numbers in inter-service communication refer to container ports, not host ports
External port mappings (ports) expose services to the host or AI Workbench proxy
Multiple Networks:
Services can connect to multiple networks for network segmentation and isolation.
services:
  app:
    networks:
      - frontend-network
      - backend-network

networks:
  frontend-network:
  backend-network:
Mounts and Volumes#
You can use mounts and volumes in your compose file to share files and data between your services.
Volume Types#
Docker Compose supports three mount types for service data:
Named Volumes:
Named volumes are managed by the container engine (Docker or Podman) and are useful for data that should persist but doesn’t need direct host access.
services:
  database:
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:
Bind Mounts:
Connect to specific host directories, useful for development or accessing existing data.
services:
  nim:
    volumes:
      - type: bind
        source: /tmp
        target: /opt/nim/.cache/
Tmpfs Mounts:
Store data in host memory, useful for temporary sensitive data.
services:
  app:
    tmpfs:
      - /tmp
Bind Mounts#
AI Workbench does not manage bind mount source paths for compose containers.
You must specify bind mount source paths directly in the compose file. These paths are relative to the project repository or absolute paths on the host.
Common Bind Mount Patterns:
services:
  service:
    volumes:
      # Relative to project root
      - ./data:/app/data
      # Absolute host path
      - /mnt/datasets:/data
      # Model cache directory
      - ${MODEL_DIRECTORY:-/tmp}:/opt/nim/.cache
External Volumes#
External volumes are managed outside of Compose and must be created before starting services.
services:
  nim:
    volumes:
      - nim_cache:/opt/nim/.cache

volumes:
  nim_cache:
    external: true
Create external volumes with docker volume create nim_cache before running compose.
Service Dependencies and Healthchecks#
Service dependencies control startup order and ensure services are ready before dependent services start.
depends_on Configuration#
The depends_on option specifies service dependencies.
Basic Dependency:
services:
  frontend:
    depends_on:
      - backend

  backend:
    image: myapp/backend:latest
With Healthcheck Conditions:
services:
  frontend:
    depends_on:
      backend:
        condition: service_healthy

  backend:
    image: myapp/backend:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 10s
      timeout: 5s
      retries: 3
Dependency Conditions:
service_started: Default; waits for the service to start (not necessarily ready)
service_healthy: Waits for the service to pass its healthcheck
service_completed_successfully: Waits for the service to complete and exit successfully
Healthcheck Configuration#
Healthchecks verify that a service is ready to accept requests.
Healthcheck Fields:
services:
  nim-llm:
    healthcheck:
      test: ["CMD", "python3", "-c", "import requests; requests.get('http://localhost:8000/v1/health/ready')"]
      interval: 10s
      timeout: 20s
      retries: 100
      start_period: 10m
Configuration Options:
test: Command to run (returns 0 for healthy, 1 for unhealthy)
interval: Time between healthcheck attempts
timeout: Maximum time to wait for the healthcheck to complete
retries: Number of consecutive failures before marking the service unhealthy
start_period: Grace period before the first healthcheck (useful for slow-starting services)
Common Healthcheck Patterns:
HTTP endpoint check:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
Python script check:
test: ["CMD", "python3", "-c", "import requests; requests.get('http://localhost:8000/v1/health/ready')"]
Port check:
test: ["CMD-SHELL", "nc -z localhost 8080"]
Best Practices:
Define healthchecks for all critical services
Use an appropriate start_period for services that take time to initialize
Set realistic retries and timeout values for AI services that may take time to respond
Healthchecks are essential for complex multi-service pipelines
Versioning#
The compose file is versioned with the project repository.
What is Versioned#
Git versions the compose file as long as it is in a tracked folder of the project repository.
What is NOT Versioned#
The container images used by compose services are not versioned, for the following reasons:
Service container images are pulled from registries at runtime.
Image versions are determined by tags in the compose file.
Updating the compose file and committing changes does not version the actual container images.
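One common mitigation, not specific to AI Workbench, is to pin an explicit image tag in the compose file so the tracked file at least records which image version the project expects; the tag below is hypothetical:

services:
  llm-service:
    # Pinning a specific tag instead of "latest" keeps the reference reproducible,
    # although the image content itself is still not stored in the repository
    image: nvcr.io/nim/meta/llama-3.1-70b-instruct:1.0.0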
Examples#
Example 1: Simple Web Service#
A basic compose file with a single web service that always runs:
services:
  web1:
    image: hashicorp/http-echo
    environment:
      # NVWB_TRIM_PREFIX=true enables proxy routing with prefix trimming
      # PROXY_PREFIX is automatically injected
      - NVWB_TRIM_PREFIX=true
    ports:
      - '5678:5678'
    command: ["-text=hello from service 1"]
Example 2: GPU-Enabled Service with Profile#
A compose file with two services - one always runs, one requires GPU and uses a profile:
services:
  web1:
    image: hashicorp/http-echo
    environment:
      - NVWB_TRIM_PREFIX=true
    ports:
      - '5678:5678'
    command: ["-text=hello from service 1"]

  web2:
    image: hashicorp/http-echo
    profiles: [gpu-service]
    environment:
      - NVWB_TRIM_PREFIX=true
    ports:
      - '5679:5679'
    # GPU allocation specification
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    command: ["-text=hello from service 2", "-listen=:5679"]
Example 3: Environment Variables and Secrets via Interpolation#
A compose file using AI Workbench environment variables and secrets through interpolation:
services:
  web3:
    image: hashicorp/http-echo
    environment:
      - NVWB_TRIM_PREFIX=true
      # Environment variables from AI Workbench project
      - TEST_ENV_VAR=${TEST_VAR}
      # Secrets also available via interpolation
      - TEST_SECRET_FROM_ENV_VAR=${TEST_SECRET}
    ports:
      - '5678:5678'
    command: ["-text=${TEST_VAR}"]
Note
Create the variable TEST_VAR and secret TEST_SECRET in your AI Workbench project before using this example.
Example 4: Compose Secrets as Files#
A compose file using AI Workbench secrets through the Compose secrets mechanism:
services:
  web1:
    image: hashicorp/http-echo
    environment:
      - NVWB_TRIM_PREFIX=true
    ports:
      - '5678:5678'
    command: ["-text=hello from service 1"]

  web4:
    image: hashicorp/http-echo
    profiles: [compose-secret]
    environment:
      - NVWB_TRIM_PREFIX=true
      # Secret mounted as file by compose
      - TEST_SECRET_FILE=/run/secrets/TEST_SECRET
    ports:
      - '5680:5680'
    # Link to the global secrets section
    secrets:
      - TEST_SECRET
    command: ["-text=hello from service 4", "-listen=:5680"]

# Global secrets section - AI Workbench replaces the value at runtime
# The name must match the secret name in your AI Workbench project
secrets:
  TEST_SECRET:
    environment: "HOME"
Note
Create the secret TEST_SECRET in your AI Workbench project before using this example.