Deploy NeMo Data Designer Using Docker Compose#
You can deploy the NeMo Data Designer microservice using Docker Compose for local development, testing, and quickstart scenarios. This deployment method provides a simple way to get Data Designer running quickly without complex Kubernetes configurations.
Prerequisites#
Docker and Docker Compose installed on your system
NGC API Key for accessing NGC Catalog
At least 8GB of available RAM (for complete stack including PostgreSQL and MinIO)
Sufficient disk space for generated artifacts, database, and object storage (recommended: 20GB+)
Access to LLM endpoints (NVIDIA API, local NIM, or other compatible endpoints)
Authenticate with NGC#
Before pulling container images, log in to the NVIDIA NGC container registry:
echo $NGC_CLI_API_KEY | docker login nvcr.io -u '$oauthtoken' --password-stdin
Replace $NGC_CLI_API_KEY with your actual NGC API key.
Deployment#
Download the Docker Compose configuration from NGC:
ngc registry resource download-version "nvidia/nemo-microservices/nemo-data-designer-docker-compose:25.08"
cd nemo-data-designer-docker-compose_v25.08
Set up environment variables:
export NEMO_MICROSERVICES_IMAGE_REGISTRY="nvcr.io/nvidia/nemo-microservices"
export NEMO_MICROSERVICES_IMAGE_TAG="25.08"
Note: The Data Designer service will automatically configure additional environment variables:
NEMO_MICROSERVICES_DATA_DESIGNER_ARTIFACTS_ROOT=/artifacts_root
NEMO_MICROSERVICES_DATA_DESIGNER_DATA_STORE_ENDPOINT=http://datastore:3000/v1/hf
NEMO_MICROSERVICES_DATA_DESIGNER_DATA_STORE_TOKEN (optional)
Start Data Designer:
docker compose -f docker-compose.ea.yaml up -d
This will start a complete backend stack with the following services:
data-designer: The main Data Designer service (accessible on port 8000)
data-designer-volume-permissions: Initializes proper permissions for artifacts storage. Runs once before data-designer and exits.
datastore: Backend storage service built on Gitea for dataset management
datastore-volume-permissions: Initializes proper permissions for datastore storage. Runs once before datastore and exits.
postgres: PostgreSQL database for datastore metadata and configuration. A dependency of datastore.
minio: Object storage service for large files and artifacts. A dependency of datastore.
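If you want to script this startup check rather than eyeball `docker ps`, a small Python helper can compare the running container names against the services above. The helper name `missing_services` is hypothetical, not part of the product; matching is by substring because Docker Compose may prefix container names with the project name.

```python
def missing_services(ps_names: str,
                     expected=("data-designer", "datastore", "postgres", "minio")):
    """Given the output of `docker ps --format '{{.Names}}'`, return the
    expected services that have no running container.

    Substring matching tolerates Compose project prefixes such as
    `<project>-data-designer-1`.
    """
    running = ps_names.split()
    return [svc for svc in expected
            if not any(svc in name for name in running)]
```

For example, feed it `subprocess.check_output(["docker", "ps", "--format", "{{.Names}}"], text=True)` and treat a non-empty result as a failed deployment. The two `*-volume-permissions` containers are expected to be absent, since they run once and exit.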
Verify Deployment#
After starting the services, verify everything is working:
Check service status:
docker ps
The response should show four containers running: data-designer, datastore, postgres, and minio.
Test the health endpoint:
curl localhost:8000/health
The response should have status 200 and body {"status": "healthy"}.
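To automate this verification, you could parse the health response in Python. This is a minimal sketch using only the standard library; the `is_healthy` function is illustrative and simply encodes the expected {"status": "healthy"} body documented above.

```python
import json

def is_healthy(body: str) -> bool:
    """Return True if a health-endpoint body reports a healthy service."""
    try:
        return json.loads(body).get("status") == "healthy"
    except json.JSONDecodeError:
        return False

# Against a live deployment, fetch the body first, e.g.:
#   from urllib.request import urlopen
#   body = urlopen("http://localhost:8000/health").read().decode()
print(is_healthy('{"status": "healthy"}'))  # True
```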
Service Endpoints#
After starting Data Designer, the following services will be accessible:
Primary Services#
Data Designer API: http://localhost:8000
Health check:
GET /health
Data preview:
POST /v1beta1/data-designer/preview
Batch jobs:
POST /v1beta1/data-designer/jobs
List jobs:
GET /v1beta1/data-designer/jobs
Job status:
GET /v1beta1/data-designer/jobs/{job_id}
Job logs:
GET /v1beta1/data-designer/jobs/{job_id}/logs
Job results:
GET /v1beta1/data-designer/jobs/{job_id}/results
Download result:
GET /v1beta1/data-designer/jobs/{job_id}/results/{result_id}/download
Data Store API: http://localhost:3000
Health check:
GET /v1/health
Repository management for datasets and artifacts
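The per-job routes listed above differ only in their suffix, so client code can derive them from a job ID. The helper below is a hypothetical convenience, not part of the service's SDK; it assumes the default localhost base URL from this deployment.

```python
BASE = "http://localhost:8000/v1beta1/data-designer"

def job_endpoints(job_id: str) -> dict:
    """Build the per-job URLs documented above from a job ID."""
    return {
        "status": f"{BASE}/jobs/{job_id}",
        "logs": f"{BASE}/jobs/{job_id}/logs",
        "results": f"{BASE}/jobs/{job_id}/results",
    }
```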
Backend Services (for troubleshooting)#
PostgreSQL Database:
localhost:5432
Database: ndsdb
User: ndsuser / Password: ndspass
MinIO Object Storage:
Console: http://localhost:9001
Credentials: minioadmin / minioadmin
Quick API Test#
Test the service with a simple preview request (generates basic categorical data):
curl --json @- localhost:8000/v1beta1/data-designer/preview <<EOF
{
  "config": {
    "model_configs": [],
    "columns": [
      {
        "name": "school_subject",
        "type": "category",
        "params": {
          "values": ["math", "science", "history", "art"]
        }
      }
    ]
  }
}
EOF
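The same preview request can be issued from Python. This sketch builds the identical payload with the standard library; the network call is left commented out so the snippet stands on its own without a running stack.

```python
import json
from urllib import request

# The same categorical-column preview config as the curl example above.
config = {
    "config": {
        "model_configs": [],
        "columns": [
            {
                "name": "school_subject",
                "type": "category",
                "params": {"values": ["math", "science", "history", "art"]},
            }
        ],
    }
}

req = request.Request(
    "http://localhost:8000/v1beta1/data-designer/preview",
    data=json.dumps(config).encode(),
    headers={"Content-Type": "application/json"},
)
# With the stack running:
#   print(request.urlopen(req).read().decode())
```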
Backend Architecture#
The Data Designer deployment includes a complete backend stack:
Data Flow#
Data Designer processes requests and generates synthetic data
Datastore manages dataset repositories and metadata via Gitea
PostgreSQL stores datastore configuration and repository metadata
MinIO provides object storage for large files and artifacts
Storage Volumes#
artifacts_root: Stores generated synthetic datasets
datastore_storage: Stores datastore application data
postgres_storage: PostgreSQL database files
minio_storage: Object storage data
Networking#
All services communicate via the nmp Docker network bridge.
Troubleshooting#
Check All Services#
docker ps
All services should show as “Up” or “healthy”.
Service Health Checks#
# Data Designer
curl localhost:8000/health
# Datastore
curl localhost:3000/v1/health
# PostgreSQL
docker compose -f docker-compose.ea.yaml exec postgres pg_isready -d ndsdb -U ndsuser
# MinIO
curl localhost:9000/minio/health/ready
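Services can take a few seconds to become healthy after `docker compose up`, so a one-shot check may fail spuriously. A small polling helper (hypothetical, written here with an injectable probe so the retry logic is easy to test) can wrap any of the checks above:

```python
import time

def wait_until_healthy(probe, attempts=10, delay=1.0):
    """Poll `probe` (a zero-arg callable returning True when the service
    is up) until it succeeds or `attempts` are exhausted."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```

For example, pass a probe that issues `urllib.request.urlopen("http://localhost:8000/health")` inside a try/except and returns True on a 200 response.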
View Service Logs#
# Data Designer logs
docker compose -f docker-compose.ea.yaml logs data-designer
# Datastore logs
docker compose -f docker-compose.ea.yaml logs datastore
# Database logs
docker compose -f docker-compose.ea.yaml logs postgres
Stop the Service#
To stop Data Designer:
docker compose -f docker-compose.ea.yaml down