Running TAO via Fine-Tuning Micro-Services (FTMS)#
Fine-Tuning Micro-Services (FTMS) provides a comprehensive API-driven interface for training, optimizing, and deploying deep learning models. The service features the new TAO API v2 with a unified job-centric architecture, along with an integrated Python SDK and CLI for seamless access to all TAO functionality.
Navigate to any section below to get started:
Fine-Tuning Micro-Services Overview#
Get started with a high-level introduction to Fine-Tuning Micro-Services (FTMS), including the new TAO API v2 architecture, unified jobs interface, and enhanced authentication. Learn about the three access methods: REST API, Python SDK, and CLI.
Microservices Prerequisites and Setup#
Follow step-by-step guidance to configure prerequisites and prepare your environment for deployment across various platforms.
Kubernetes Deployment#
Deploy the microservice using Kubernetes with detailed explanations of configurable values in the Helm chart.
Docker Compose Deployment#
Deploy the microservice using Docker Compose with configurable settings for simplified deployment.
Air-gapped Environment Deployment#
Deploy TAO Toolkit in secure, isolated environments without internet connectivity using pre-downloaded models and SeaweedFS storage.
Python SDK and CLI#
Interact with the TAO API v2 using two integrated tools:
- Python SDK
The nvidia-tao package provides programmatic access to all TAO operations through a TaoClient class. Features include:
Environment variable authentication
Unified job creation and management
Workspace and dataset operations
Inference microservice control
Comprehensive error handling
- Command-Line Interface (CLI)
The tao command provides terminal access to all TAO functionality, organized by network architecture. Features include:
36+ supported network architectures (classification_pyt, rtdetr, mask2former, etc.)
Consistent command structure across all networks
Interactive authentication with tao login
Job management, workspace operations, and dataset handling
Inference microservice deployment
Installation
pip install nvidia-tao
REST API Overview and Examples#
Access comprehensive documentation of the TAO API v2 endpoints, including request/response formats and complete workflow examples. The v2 API features:
Unified Jobs Endpoint - Single endpoint for all experiment and dataset operations
Environment Variable Authentication - JWT token-based auth with CI/CD integration
Resource-Specific Metadata - Dedicated endpoints for workspaces, datasets, and jobs
Enhanced Job Control - Pause, resume, cancel, and delete operations
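One way to picture the unified jobs design is that every control operation above becomes a verb on a single jobs resource. The sketch below models that idea in plain Python; the action-suffix paths (e.g. :pause) are illustrative assumptions, not the authoritative endpoint reference, which lives in the API Reference section.

```python
# Illustrative sketch only: the real endpoint paths are defined in the
# TAO API v2 OpenAPI spec; the ":pause"-style action suffixes are assumptions.
def job_control_request(job_id: str, action: str) -> tuple:
    """Map a job-control action to an HTTP method and a v2 jobs path."""
    actions = {
        "pause":  ("POST",   f"/api/v2/jobs/{job_id}:pause"),
        "resume": ("POST",   f"/api/v2/jobs/{job_id}:resume"),
        "cancel": ("POST",   f"/api/v2/jobs/{job_id}:cancel"),
        "delete": ("DELETE", f"/api/v2/jobs/{job_id}"),
    }
    if action not in actions:
        raise ValueError(f"unsupported job action: {action}")
    return actions[action]

method, path = job_control_request("abc123", "pause")
```

The point of the sketch is the consolidation: one resource, four verbs, rather than a separate endpoint family per operation.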
Inference Microservice#
Deploy trained models as persistent inference servers for fast, repeated inference without model reloading overhead. Features include:
Start/stop microservices via API, SDK, or CLI
Support for multiple inference modes (base64, cloud media paths, VLM with prompts)
Scalable GPU allocation
Real-time status monitoring
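The three inference modes listed above differ mainly in what the request carries. The following sketch shows one plausible payload shape per mode; the field names ("mode", "media", "prompt") are illustrative assumptions, not the documented request schema.

```python
# Hypothetical payload builder for the three inference modes; field names
# are assumptions chosen to mirror the modes described above.
import base64

def build_inference_payload(mode: str, media: str, prompt: str = None) -> dict:
    if mode == "base64":
        # Inline the media bytes directly in the request body.
        return {"mode": "base64",
                "media": base64.b64encode(media.encode()).decode()}
    if mode == "cloud":
        # Reference media already in cloud storage by path.
        return {"mode": "cloud", "media": media}
    if mode == "vlm":
        # Vision-language models additionally take a text prompt.
        if prompt is None:
            raise ValueError("VLM mode requires a text prompt")
        return {"mode": "vlm", "media": media, "prompt": prompt}
    raise ValueError(f"unknown inference mode: {mode}")
```

Because the microservice stays resident, repeated calls with payloads like these skip the model-reloading overhead entirely.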
AutoML#
Discover the supported AutoML algorithms (Bayesian, Hyperband), their configuration, and how to use them for automated hyperparameter optimization. AutoML is fully integrated with the v2 API and can be enabled during job creation.
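Since AutoML is enabled at job-creation time, a natural way to think about it is as an extra section of the job spec alongside the training hyperparameters. The key names below (enabled, algorithm, parameters) are assumptions for illustration; only the two algorithm choices come from the documentation.

```python
# Hypothetical AutoML-enabled job spec; key names are illustrative
# assumptions. Bayesian and Hyperband are the documented algorithms.
def automl_job_spec(algorithm: str = "bayesian") -> dict:
    if algorithm not in ("bayesian", "hyperband"):
        raise ValueError("TAO AutoML supports Bayesian and Hyperband search")
    return {
        "specs": {"epochs": 100},
        "automl": {
            "enabled": True,
            "algorithm": algorithm,
            # Hyperparameters the search is allowed to vary:
            "parameters": ["learning_rate", "batch_size"],
        },
    }
```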
API Reference#
Access comprehensive OpenAPI specifications for TAO API v2 and previous versions. The v2 API documentation includes:
Interactive Swagger UI
ReDoc documentation
OpenAPI specs (JSON/YAML)
Complete endpoint reference
Request/response schemas
Example notebooks
After deployment, access the v2 API documentation at:
Swagger UI: /api/v2/swagger
ReDoc: /api/v2/redoc
OpenAPI Specs: /api/v2/openapi.json
Quick Start Guide#
1. Install the SDK/CLI
pip install nvidia-tao
2. Authenticate
# Using CLI
tao login --ngc-key YOUR_NGC_KEY --ngc-org-name YOUR_ORG
# Or using Python SDK
from tao_sdk.client import TaoClient
client = TaoClient()
client.login(ngc_key="YOUR_NGC_KEY", ngc_org_name="YOUR_ORG")
3. Create Your First Job
# Using CLI
tao classification_pyt create-job \
--kind experiment \
--name "my_first_job" \
--encryption-key "my_key" \
--workspace "workspace_id" \
--action train \
--specs '{"epochs": 100, "learning_rate": 0.001}' \
--train-datasets '["dataset_id"]'
# Using Python SDK
job = client.create_job(
kind="experiment",
name="my_first_job",
network_arch="classification_pyt",
encryption_key="my_key",
workspace="workspace_id",
action="train",
specs={"epochs": 100, "learning_rate": 0.001},
train_datasets=["dataset_id"]
)
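Both the CLI and SDK calls above go through the unified v2 jobs endpoint, so the single-step creation reduces to one JSON body carrying the same fields. The sketch below assembles that body; the exact endpoint path and header format are assumptions, so treat the API Reference as authoritative.

```python
# Sketch of single-step job creation as a raw JSON body, mirroring the
# CLI/SDK fields above. Endpoint path and auth-header format are assumptions.
import json

def create_job_body() -> str:
    body = {
        "kind": "experiment",
        "name": "my_first_job",
        "network_arch": "classification_pyt",
        "encryption_key": "my_key",
        "workspace": "workspace_id",
        "action": "train",
        "specs": {"epochs": 100, "learning_rate": 0.001},
        "train_datasets": ["dataset_id"],
    }
    return json.dumps(body)

# POST this body to the unified jobs endpoint (e.g. /api/v2/jobs), sending
# the JWT obtained at login in the request headers.
```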
4. Monitor Progress
# CLI
tao classification_pyt get-job-status --job-id "job_id"
# Python SDK
status = client.get_job_status("job_id")
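For unattended runs, the status call above is typically wrapped in a polling loop. The sketch below uses a stub in place of TaoClient so it runs offline, and the status strings ("Running", "Done", etc.) are illustrative assumptions; the real client returns richer job metadata.

```python
# Generic polling pattern around get_job_status. StubClient stands in for
# TaoClient so the sketch is runnable offline; status strings are assumptions.
import time

class StubClient:
    """Fake client that reports "Done" on the third status check."""
    def __init__(self):
        self._calls = 0

    def get_job_status(self, job_id: str) -> str:
        self._calls += 1
        return "Done" if self._calls >= 3 else "Running"

def wait_for_job(client, job_id: str, poll_seconds: float = 0.01) -> str:
    """Poll until the job reaches a terminal state, then return it."""
    while True:
        status = client.get_job_status(job_id)
        if status in ("Done", "Error", "Canceled"):
            return status
        time.sleep(poll_seconds)

final_status = wait_for_job(StubClient(), "job_id")
```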
Migration from v1 to v2#
If you’re using TAO API v1, the v2 API offers significant improvements:
Key Changes
Unified /jobs endpoint replaces separate /experiments and dataset action endpoints
Single-step job creation instead of a two-step process
Environment variable authentication instead of file-based configuration
Resource-specific metadata endpoints
Enhanced job control operations
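The key changes above amount to an endpoint consolidation: several v1 resource- and action-specific endpoints collapse into the single v2 jobs endpoint. The mapping below sketches that idea; the precise v1 paths are assumptions reconstructed from the description, not the published v1 spec.

```python
# Rough picture of the v1 -> v2 consolidation. The v1 paths here are
# assumptions based on the migration notes above.
V1_TO_V2 = {
    "/api/v1/experiments": "/api/v2/jobs",            # experiment actions
    "/api/v1/datasets/{id}/actions": "/api/v2/jobs",  # dataset actions
}

def v2_endpoint(v1_path: str) -> str:
    """Return the consolidated v2 endpoint for a known v1 path."""
    return V1_TO_V2.get(v1_path, v1_path)
```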
Migration Benefits
Simpler API structure with fewer endpoints
Better authentication for CI/CD pipelines
Comprehensive job management
Inference microservice support
Improved error handling
See the individual documentation sections for detailed migration guidance and examples.