Docker Compose Deployment

This section describes how to deploy and configure FTMS using Docker Compose, with configurable settings and optional airgapped support. This approach is a simpler alternative to a Kubernetes-based deployment.

Overview

The FTMS Docker Compose setup provides a multi-container deployment of FTMS with the following features:

  • Minimal Setup: Requires only NGC API keys, Docker, and an NVIDIA GPU-enabled machine.

  • Configurable Settings: All configuration is managed through environment variables and configuration files.

  • Airgapped Support: FTMS can be deployed in airgapped environments.

  • Easy Management: Simple scripts for starting, stopping, and managing services.

  • Job Orchestration: Same job orchestration as Kubernetes-based deployments.

  • AutoML Support: Same AutoML support as Kubernetes-based deployments.

Note

Docker Compose does not currently support multi-node deployments. For multi-node capabilities, refer to the Kubernetes-based option for deploying FTMS.

Architecture

The following diagram depicts the high-level architecture of FTMS when deployed using Docker Compose in either airgapped or internet-connected mode.

Figure: FTMS Docker Compose Architecture

Components

  1. TAO API App: Exposes REST endpoints, validates requests, resolves model/dataset metadata, and persists state. Reads/writes to MongoDB.

  2. TAO Workflow: Orchestrates workflows, manages job state, and handles job execution by checking for pending jobs from MongoDB.

  3. TAO Job: A short-lived container launched by the workflow service to execute training and other actions. Pulls its container image from a registry, pulls datasets/pretrained models, and uploads checkpoints/logs.

  4. MongoDB: Database backend. Stores job state, workflow state, and other metadata. More information at MongoDB.

  5. SeaweedFS: (Optional) Local storage solution for airgapped environments. Natively supports the AWS S3 CLI. More information at SeaweedFS.

  6. AWS S3/Azure Blob Storage: (Optional) Remote storage solution for non-airgapped environments.

  7. NGC Registry: (Optional) NVIDIA container registry for pulling container images.

Prerequisites

Refer to Microservices Setup to install the prerequisites for Docker Compose deployment.

Note

FTMS requires at least 100 GB of disk space to download models, datasets, and containers. To check available disk space on each partition, run:

df -h
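To turn the 100 GB requirement into a yes/no check, a minimal sketch follows. The helper name has_free_space is hypothetical, and the path checked ("/") is an assumption; /var/lib/docker is the default Docker data-root on Linux, so point the check there if it lives on a different partition. The check treats 1 GB as 1024^3 bytes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of FTMS): succeed when the filesystem that
# holds PATH has at least NEED_GB gigabytes available.
has_free_space() {
    local path="$1" need_gb="$2" avail_kb
    # -P: POSIX format, one line per filesystem; -k: sizes in 1024-byte blocks.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Assumption: check "/"; change this to your Docker data-root if it differs.
check_path="/"
if has_free_space "$check_path" 100; then
    echo "OK: at least 100 GB available on $check_path"
else
    echo "WARNING: less than 100 GB available on $check_path"
fi
```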

If your default partition does not have enough disk space, you may need to modify the data-root in your Docker daemon configuration. Edit /etc/docker/daemon.json and add:

{
  "data-root": "/path/to/larger/partition"
}

After modifying the configuration, restart the Docker daemon:

sudo systemctl restart docker
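A malformed daemon.json prevents the Docker daemon from starting, so it can be worth confirming that the file parses before the restart. The sketch below assumes python3 is installed and uses its standard json.tool module; is_valid_json is a hypothetical helper name. If daemon.json already contains other settings, add the data-root key to the existing object rather than replacing the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when FILE contains well-formed JSON.
# Assumes python3 is available; json.tool is part of its standard library.
is_valid_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

daemon_config="/etc/docker/daemon.json"
if [ ! -f "$daemon_config" ]; then
    echo "$daemon_config not found"
elif is_valid_json "$daemon_config"; then
    echo "$daemon_config parses as JSON"
else
    echo "$daemon_config is not valid JSON; fix it before restarting Docker"
fi

# After the restart, Docker reports the directory it is actually using:
#   docker info --format '{{ .DockerRootDir }}'
```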

Deployment Steps

  1. Clone the FTMS tao_tutorials repository.

    git clone https://github.com/NVIDIA/tao_tutorials.git
    
  2. Navigate to the tao_tutorials/setup/tao-docker-compose directory.

    cd tao_tutorials/setup/tao-docker-compose
    
  3. Edit secrets.json to set your NGC API keys:

    {
       "ngc_api_key": "nvapi-xxx",
       "ptm_api_key": "your-legacy-api-key"
    }
    

    Note

    • Both keys can be generated from NGC.

    • ngc_api_key: Your NGC Personal API Key (starts with nvapi-). Required for authentication and scoped to downloading models from your NGC organization.

    • ptm_api_key: NGC Legacy API Key. Required for downloading models from across NGC organizations.
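A quick shape check on secrets.json can catch a swapped or missing key before the services start. This is only a sketch: check_secrets is a hypothetical helper, it assumes each key and its value sit on a single line as in the example above, and it cannot tell whether a key is actually valid.

```shell
#!/usr/bin/env bash
# Hypothetical sanity check for secrets.json (not an FTMS tool). It only
# looks at the shape of the values; it does not contact NGC.
check_secrets() {
    local file="$1"
    # ngc_api_key: present and starting with the nvapi- prefix.
    if ! grep -Eq '"ngc_api_key"[[:space:]]*:[[:space:]]*"nvapi-[^"]+"' "$file"; then
        echo "ngc_api_key is missing or does not start with nvapi-" >&2
        return 1
    fi
    # ptm_api_key: present and non-empty.
    if ! grep -Eq '"ptm_api_key"[[:space:]]*:[[:space:]]*"[^"]+"' "$file"; then
        echo "ptm_api_key is missing or empty" >&2
        return 1
    fi
}

if [ -f secrets.json ]; then
    check_secrets secrets.json && echo "secrets.json contains both keys"
fi
```

Note that the placeholder values from the example also pass this check, so it complements rather than replaces a real authentication test.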

API Endpoints

The FTMS API endpoints are available at:

{base_url}/api/v2/orgs/{ngc_org_name}/

Note

  • base_url: http://localhost:8090 by default. If you changed NGINX_HTTP_PORT in config.env, replace 8090 with that value.

  • ngc_org_name: The name of the NGC organization.
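As an illustration of how the two placeholders combine, the sketch below assembles the organization-level root URL. The helper name ftms_org_url and the organization name my-org are hypothetical; no specific endpoint beneath this root is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble the organization-level API root from its parts.
ftms_org_url() {
    local base_url="$1" org="$2"
    # Strip one trailing slash so the result never contains "//api".
    printf '%s/api/v2/orgs/%s/\n' "${base_url%/}" "$org"
}

# "my-org" is a placeholder; substitute your NGC organization name.
ftms_org_url "http://localhost:8090" "my-org"
# prints: http://localhost:8090/api/v2/orgs/my-org/
```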

Quick Start

Default Setup (FTMS only)

./run.sh up

With SeaweedFS Storage

./run.sh up-all

With Custom Settings

./run.sh up-all --airgapped
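After any of the start commands above, the containers may need some time before the API port accepts connections. One way to wait for that in a script is sketched here. It relies on bash's /dev/tcp redirection, so it needs bash rather than plain sh; wait_for_port is a hypothetical helper, and 8090 is the default NGINX_HTTP_PORT. An open port only shows that something is listening, not that every service behind it is ready, so ./run.sh status remains the authoritative check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll until a TCP connection to localhost:PORT succeeds,
# giving up after TIMEOUT seconds (default 60).
wait_for_port() {
    local port="$1" timeout="${2:-60}" waited=0
    # The subshell opens and immediately closes a connection on descriptor 3.
    until (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Usage, assuming the default port:
#   ./run.sh up && wait_for_port 8090 120 && echo "API port is accepting connections"
```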

Configuration

Settings File: config.env

All settings are controlled through the config.env file. Key configurable options are:

  • AIRGAPPED_MODE: Enable/disable airgapped mode (true/false)

  • PTM_PULL: Enable/disable pretrained models pull (true/false)

  • PYTHON_VERSION: Python version for main services (e.g., 3.12)

  • DEBUG_MODE: Enable debug mode (true/false)

  • DEPLOYMENT_MODE: Deployment mode (PROD/DEV)

  • SEAWEEDFS_ENABLED: Enable SeaweedFS integration (true/false)
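When scripting around these options, it can help to read a single value out of config.env. The sketch below assumes the file uses plain KEY=value lines without inline comments, which is an assumption about its format; get_setting is a hypothetical helper. For an interactive overview, ./run.sh config shows the current configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value assigned to KEY in an env-style file.
# Assumes plain KEY=value lines; if KEY appears more than once, the last wins.
get_setting() {
    local key="$1" file="$2"
    # Keep everything after the first "=", then drop surrounding quotes.
    grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2- | tr -d "\"'"
}

if [ -f config.env ]; then
    for key in AIRGAPPED_MODE PTM_PULL SEAWEEDFS_ENABLED; do
        echo "$key=$(get_setting "$key" config.env)"
    done
fi
```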

Port Configuration

If you encounter port conflicts with existing services on your system, you can modify the following port settings in config.env:

  • NGINX_HTTP_PORT: FTMS API port (default: 8090)

  • SEAWEEDFS_FILER_PORT: SeaweedFS Filer port (default: 8888)

  • SEAWEEDFS_S3_PORT: SeaweedFS S3 API port (default: 8333)

  • SEAWEEDFS_VOLUME_PORT: SeaweedFS Volume port (default: 8080)

  • SEAWEEDFS_MASTER_PORT: SeaweedFS Master port (default: 9333)

Note

After changing port configurations, restart the services using ./run.sh restart for the changes to take effect.

If the required ports are already in use by other processes, first identify which processes are using them:

sudo lsof -i :8090  # Check specific port (e.g., NGINX_HTTP_PORT)

If needed, you can stop the processes using the required ports:

sudo kill $(sudo lsof -ti :8090)  # Kill process on specific port

Warning

The kill command terminates the processes using the specified port (by default it sends SIGTERM). This may cause data loss or service interruption for existing applications. Only use this command if you are certain you want to stop the conflicting process. Changing the port configuration in config.env is a safer alternative.
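Before stopping anything, it may be useful to survey all of the default ports at once and decide which settings to change in config.env. The following sketch does this without sudo by attempting a local TCP connection to each port. It requires bash (for /dev/tcp), port_in_use is a hypothetical helper, and it only detects listeners reachable on 127.0.0.1, so a service bound solely to another interface would be missed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when something accepts TCP connections
# on localhost at the given port.
port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Default ports listed in this section:
# NGINX_HTTP_PORT, SEAWEEDFS_FILER_PORT, SEAWEEDFS_S3_PORT,
# SEAWEEDFS_VOLUME_PORT, SEAWEEDFS_MASTER_PORT.
for port in 8090 8888 8333 8080 9333; do
    if port_in_use "$port"; then
        echo "port $port: in use"
    else
        echo "port $port: free"
    fi
done
```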

Available Commands

  • ./run.sh up - Start FTMS services (pretrained models are pulled according to the PTM_PULL setting)

  • ./run.sh up-all - Start all services (including SeaweedFS)

  • ./run.sh down - Stop all services (including all profiles)

  • ./run.sh restart - Restart services

  • ./run.sh logs - Show service logs

  • ./run.sh status - Show service status

  • ./run.sh config - Show current configuration

Note

The down command automatically stops all services including those in profiles (SeaweedFS, PTM) to ensure a clean shutdown regardless of how services were started.

Services

FTMS API Services

  • mongodb: Database backend

  • tao_api_app: Main FTMS API application (http://localhost)

  • tao_api_workflow: Workflow management service

  • nginx: Reverse proxy

Pretrained Model Service (Optional)

The pretrained model service downloads pretrained models from NGC and makes them available to the FTMS API. This service is disabled when using airgapped mode.

  • tao_api_pretrained_models: Pretrained models initialization (enabled via PTM_PULL setting)

SeaweedFS Services (Optional)

The SeaweedFS service provides a local storage solution for the FTMS API. This service is required when using airgapped mode.

Airgapped Workflow

Docker Compose is compatible with airgapped deployments. Please refer to Airgapped Deployment for more details.

Next Steps

Once you have completed the Docker Compose setup, you can interact with the FTMS API using a tutorial notebook that uses AutoML hyperparameter optimization to find the best model for object detection.