Fine-Tuning Microservices Overview
The Fine-Tuning Micro-Service (FTMS) is TAO Toolkit’s new interface for accelerating model training without the overhead of setting up and managing compute infrastructure. This interface makes it easy to offer a managed training service to your development teams. It automates model fine-tuning workflows and improves the experience for non-domain experts by hiding training-flow intricacies and significantly reducing user input mistakes. It also makes the service easy to integrate into other applications and MLOps services.
If you want to use this version of TAO Toolkit with the legacy TAO Launcher CLI, refer to TAO Launcher. To work directly with the DNN containers, refer to Working With The Containers.
The following diagram depicts the high-level architecture, in which a remote client accesses an API that allows you to train, optimize, and test your model, as well as augment and annotate your data. This version of FTMS includes AutoML: given a dataset and a pretrained model, AutoML hyperparameter optimization searches for the parameters that yield the best accuracy, using Bayesian or Hyperband algorithms.

The FTMS securely accesses remotely stored datasets and pushes experiment artifacts to your remote storage.
Actions such as train, evaluate, prune, retrain, export, and inference can be spawned through API calls. For each action, you can request the action’s default parameters, update them as needed, and then pass them when running the action. The specs are in JSON format.
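The fetch-defaults/override/run pattern above can be sketched as follows. This is a minimal local illustration of how default specs might be merged with user overrides; the spec keys (`num_epochs`, `optimizer`, `lr`) are hypothetical placeholders, not the exact FTMS schema, and the actual specs are fetched from and submitted to the API endpoints.

```python
# Hypothetical sketch of the spec workflow: take an action's default
# specs, override only the fields you care about, and use the result
# when running the action. Key names below are illustrative, not the
# exact FTMS spec schema.
import copy

def update_specs(defaults: dict, overrides: dict) -> dict:
    """Return a copy of the default specs with user overrides applied."""
    specs = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(specs.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            specs[key] = update_specs(specs[key], value)
        else:
            specs[key] = value
    return specs

# Defaults as they might be returned by a "get default specs" request:
default_train_specs = {
    "num_epochs": 100,
    "optimizer": {"type": "sgd", "lr": 0.01},
}

# Override only what you need; everything else keeps its default value.
train_specs = update_specs(
    default_train_specs,
    {"num_epochs": 20, "optimizer": {"lr": 0.001}},
)
```

Merging overrides into a deep copy keeps the fetched defaults intact, so the same defaults can seed several experiment variants.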
The service exposes Job API endpoints that allow you to cancel, download, and monitor jobs. These endpoints also provide useful information such as the epoch number, accuracy, loss values, and ETA.
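A client monitoring a job might summarize that status information as a one-line progress report, as sketched below. The field names (`epoch`, `max_epoch`, `loss`, `eta`) are assumptions about the shape of a job-status payload, not the exact FTMS response schema.

```python
# Hypothetical sketch: condense a job-status payload into a readable
# progress line. Field names are illustrative assumptions, not the
# exact FTMS Job API schema.
def summarize_job(status: dict) -> str:
    """Render a one-line progress summary from a job-status payload."""
    epoch = status.get("epoch", 0)
    max_epoch = status.get("max_epoch", "?")
    loss = status.get("loss")        # may be absent before training starts
    eta = status.get("eta", "unknown")
    parts = [f"epoch {epoch}/{max_epoch}"]
    if loss is not None:
        parts.append(f"loss {loss:.4f}")
    parts.append(f"ETA {eta}")
    return ", ".join(parts)

print(summarize_job({"epoch": 3, "max_epoch": 20, "loss": 0.1234, "eta": "42m"}))
# prints: epoch 3/20, loss 0.1234, ETA 42m
```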

After FTMS deployment, you can access the API’s OpenAPI specs from /swagger or /redoc, and download notebooks using /tao_api_notebooks.zip.
If deployed outside of NVIDIA’s NVCF platform, FTMS also offers a Remote Client CLI.
To get started, refer to Microservices Setup.