Helm Chart Deployment#
Use the following steps to deploy or update the Fine-Tuning Microservice (FTMS) API on an existing Kubernetes cluster. The chart options described here also let you enable HTTPS and enforce user authentication for secure multi-tenancy.
(Optional) Create a namespace to deploy the FTMS API.
kubectl create namespace nvidia-ftms
Note
Deploying the FTMS API in a non-default namespace, like nvidia-ftms, affects API paths.
If ingress is enabled, you might need to access the API at https://<host>/<namespace>/api instead of https://<host>/api.
Consider using the default namespace to avoid modifying paths in notebooks or commands.
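The path rule in this note can be sketched as a small helper. This is only an illustration: the function name api_base_url is hypothetical, and the rule it encodes (a namespace segment is added only for non-default namespaces when ingress is enabled) is taken from the note, which itself says "might". Verify the path against your own deployment.

```python
def api_base_url(host: str, namespace: str = "default") -> str:
    """Return the assumed FTMS API base URL for a host and namespace."""
    host = host.rstrip("/")
    # Assumption from the note: only non-default namespaces add a
    # path segment in front of /api.
    if namespace == "default":
        return f"{host}/api"
    return f"{host}/{namespace}/api"


print(api_base_url("https://example.com", "nvidia-ftms"))
# prints https://example.com/nvidia-ftms/api
```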
You can shut down an already deployed FTMS API.
helm delete tao-api --namespace nvidia-ftms
You must use the provided Helm chart to deploy FTMS resources.
helm fetch https://helm.ngc.nvidia.com/nvidia/tao/charts/tao-toolkit-api-6.0.0.tgz --username='$oauthtoken' --password=<YOUR API KEY>
mkdir tao-api && tar -zxvf tao-toolkit-api-6.0.0.tgz
You can customize the deployment if necessary by updating the chart’s tao-api/values.yaml.
Required values:
ngc_api_key: The admin NGC Personal Key to create imagepullsecret for nvcr.io access.
ptmApiKey: The NGC Legacy API key to pull pretrained models from across NGC orgs. A required value if ptmPull is true.
Please visit NGC to create your NGC Personal Key and Legacy API key (requires an NGC account).
Optional values:
Deployment-related parameters:
backend: Platform used for training jobs. Options are local-k8s and NVCF. Defaults to local-k8s.
hostPlatform: Platform used for hosting the API orchestration service. Options are local and NVCF. Defaults to local.
ingressEnabled: Whether to enable the ingress controller. Must be disabled when hostPlatform is NVCF. Defaults to true.
hostBaseUrl: Base URL of the API service. Format is https://<host>:<port>, for example https://10.10.10.10:32080.
serviceAdminUUID: UUID of the service admin user. This user has access to internal API endpoints.
Note
To obtain your serviceAdminUUID, run the following Python code:
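import uuid
import requests

key = "<YOUR_NGC_PERSONAL_KEY>"  # Replace with your actual NGC Personal Key
url = 'https://api.ngc.nvidia.com/v3/keys/get-caller-info'

# Ask NGC which user the key belongs to
r = requests.post(
    url,
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
    data={'credentials': key},
    timeout=5,
)
r.raise_for_status()  # Stop on HTTP errors, for example an invalid key
ngc_user_id = r.json().get('user', {}).get('id')
if ngc_user_id is None:
    # Without this check, the UUID would be derived from the string "None"
    raise SystemExit("No user ID in the NGC response; check your key")

# Derive a deterministic version-5 UUID from the NGC user ID
service_admin_uuid = str(uuid.uuid5(uuid.UUID(int=0), str(ngc_user_id)))
print(f"Your serviceAdminUUID is: {service_admin_uuid}")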
host, tlsSecret: For enabling HTTPS, enforcing user authentication, and enabling secure multi-tenancy.
corsOrigin: For enabling CORS and setting the origin.
authClientID: Reserved for future NVIDIA Starfleet authentication.
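The serviceAdminUUID derivation from the note above can be isolated into a function that needs no network access, which makes its behavior easier to see. This is a minimal sketch: the function name and the example user ID are hypothetical, and only the uuid5 call mirrors the documented snippet.

```python
import uuid


def service_admin_uuid(ngc_user_id) -> str:
    """Derive the serviceAdminUUID from an NGC user ID."""
    if ngc_user_id is None:
        # str(None) would silently yield a UUID for the text "None"
        raise ValueError("ngc_user_id is missing")
    # Same derivation as the note: a version-5 UUID in the all-zero namespace
    return str(uuid.uuid5(uuid.UUID(int=0), str(ngc_user_id)))


example_id = 12345  # hypothetical NGC user ID, for illustration only
print(service_admin_uuid(example_id))
```

Because uuid5 is a deterministic hash of the namespace and the name, the same NGC user ID always maps to the same serviceAdminUUID, whether the ID arrives as an integer or as a string.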
Container-related parameters:
image: Location of the TAO API container image.
ngcImagePullSecretName: Secret name set up to access the NVIDIA nvcr.io registry. Defaults to ‘imagePullSecret’.
imagePullPolicy: Set to always fetch from nvcr.io instead of using a locally cached image. Defaults to ‘Always’.
pythonVersion: Version of Python used in the container. Defaults to 3.12.
pythonBasePath: Path to the Python executable. Defaults to /usr/local/lib/python.
Other parameters:
ptmOrgTeams: List of org/teams that pretrained models are available for. Defaults to nvidia/tao,ea-tlt/tao_ea.
ptmPull: Whether to pull pretrained models from NGC when deploying the API. Defaults to true.
maxNumGpuPerNode: Number of GPUs assigned to each job.
mongoOperatorEnabled: Whether to enable the MongoDB operator. Defaults to false.
telemetryOptOut: Set to true to opt out from NVIDIA collection of anonymous usage metrics.
Additional configurable parameters are provided for dependent services:
mongo*: List of parameters for MongoDB memory, CPU, and storage configuration.
community-operator: Configuration for the MongoDB community operator.
ingress-nginx: Configuration for the ingress-nginx controller.
notebooksDir: Path to the notebooks directory in JupyterLab. Defaults to notebooks.
enableVault: Whether to enable Vault for secrets management. Defaults to false.
vault: Configuration for the Vault operator.
profiler: Whether to enable the Python profiler. Defaults to False.
kube-prometheus-stack.enabled: Whether to enable Prometheus in the cluster. Defaults to false.
kratosClientCert: Client certificate to export telemetry to Kratos.
kratosClientKey: Client key to export telemetry to Kratos.
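The constraints stated in the parameter lists above can be checked before running helm install. The following is a sketch, not part of the chart: validate_values is a hypothetical helper, and it assumes the listed keys sit at the top level of values.yaml and that the file has already been loaded into a dictionary (for example with a YAML parser).

```python
def validate_values(values: dict) -> list[str]:
    """Return a list of problems found in a loaded values.yaml dictionary."""
    problems = []
    if not values.get("ngc_api_key"):
        problems.append("ngc_api_key is required")
    # ptmPull defaults to true, so ptmApiKey is needed unless it is disabled
    if values.get("ptmPull", True) and not values.get("ptmApiKey"):
        problems.append("ptmApiKey is required when ptmPull is true")
    backend = values.get("backend", "local-k8s")
    if backend not in ("local-k8s", "NVCF"):
        problems.append(f"backend must be local-k8s or NVCF, got {backend!r}")
    host_platform = values.get("hostPlatform", "local")
    if host_platform not in ("local", "NVCF"):
        problems.append(f"hostPlatform must be local or NVCF, got {host_platform!r}")
    # ingressEnabled defaults to true and must be off on NVCF
    if host_platform == "NVCF" and values.get("ingressEnabled", True):
        problems.append("ingressEnabled must be false when hostPlatform is NVCF")
    return problems


# Placeholder keys; hostPlatform is NVCF but ingressEnabled is left at its default
example = {"ngc_api_key": "<KEY>", "ptmApiKey": "<KEY>", "hostPlatform": "NVCF"}
for problem in validate_values(example):
    print(problem)
```

An empty list means that none of the documented constraints are violated; it does not prove that the keys themselves are valid.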
Example for creating a tlsSecret. Create the secret in the namespace where the chart is installed; the command below uses default:
openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=ec2-34-221-205-157.us-west-2.compute.amazonaws.com/O=ec2-34-221-205-157.us-west-2.compute.amazonaws.com" -addext "subjectAltName = DNS:ec2-34-221-205-157.us-west-2.compute.amazonaws.com"
kubectl create secret tls tls-secret --key tls.key --cert tls.crt --namespace default
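The hostname appears three times in the openssl command, so it is easy to update one occurrence and miss another. As an illustration only, the sketch below builds the same argument list from a single hostname; self_signed_cert_args and ftms.example.com are hypothetical, and the flags are the ones used in the example.

```python
import shlex


def self_signed_cert_args(hostname: str, days: int = 365) -> list[str]:
    """Build the openssl arguments for a self-signed certificate."""
    return [
        "openssl", "req", "-x509", "-sha256", "-nodes",
        "-days", str(days),
        "-newkey", "rsa:2048",
        "-keyout", "tls.key",
        "-out", "tls.crt",
        # The subject and the SAN entry must both carry the same hostname
        "-subj", f"/CN={hostname}/O={hostname}",
        "-addext", f"subjectAltName = DNS:{hostname}",
    ]


# Print a shell-quoted command line that can be reviewed before running it
print(shlex.join(self_signed_cert_args("ftms.example.com")))
```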
Then install the FTMS API service:
helm install tao-api tao-api/ --namespace nvidia-ftms
The FTMS deployment is complete when all pods are in the Running state. This can take 10-15 minutes.
kubectl get pods -n nvidia-ftms
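The readiness check can also be scripted against the JSON form of the same command (kubectl get pods -n nvidia-ftms -o json). The sketch below is an assumption-laden illustration: pods_not_running is a hypothetical helper, the pod names in the sample are made up, and it inspects only status.phase, which is coarser than the STATUS column that kubectl prints (a crash-looping pod can still report the Running phase).

```python
import json


def pods_not_running(kubectl_json: str) -> list[str]:
    """Return the names of pods whose phase is not Running."""
    pod_list = json.loads(kubectl_json)
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") != "Running"
    ]


# Hypothetical sample shaped like `kubectl get pods -o json` output
sample_output = json.dumps({
    "items": [
        {"metadata": {"name": "tao-api-app-pod"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "tao-api-workflow-pod"}, "status": {"phase": "Pending"}},
    ]
})
print(pods_not_running(sample_output))
# prints ['tao-api-workflow-pod']
```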
To debug a deployment, look for events toward the bottom of the following command’s output:
kubectl describe pods tao-api -n nvidia-ftms
Next Steps#
The Swagger UI can be accessed at <host_url>/swagger
The notebooks can be downloaded at <host_url>/tao_api_notebooks.zip
host_url in the notebooks: The base URL of the API service. Format is http://<host>:<port>, for example http://10.10.10.10:32080
After a successful deployment, you can start using the FTMS API through any of the following:
The Remote Client CLI - A command-line interface for interacting with the API
The REST API - Direct HTTP endpoints for programmatic access
A tutorial notebook - Distills an RT-DETR model down to 1/4 of its size while keeping the same accuracy
Choose the interface that best suits your needs and refer to the corresponding documentation section for detailed usage instructions.
Quick Start: Log-In#
The following diagram and examples show how to interact with the FTMS API quickly after a successful deployment, using either the Remote Client CLI or direct REST API calls.
User interaction flow with FTMS API#
Log-In Example
Using Remote Client CLI:
BASE_URL=<host_url>/default/api/v1 tao-client login --ngc-key <NGC_KEY> --ngc-org-name <NGC_ORG_NAME> --enable-telemetry
Using curl (REST API):
curl -X POST "<host_url>/api/v1/login" \
  -H "Content-Type: application/json" \
  -d '{"ngc_org_name": "<NGC_ORG_NAME>", "ngc_key": "<NGC_KEY>", "enable_telemetry": true}'
Replace <host_url>, <NGC_ORG_NAME>, and <NGC_KEY> with your actual API endpoint, NGC org name, and NGC key.
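For scripted access, the curl call can be mirrored in Python with only the standard library. This is a sketch under assumptions: build_login_request is a hypothetical helper, the endpoint path and JSON field names are copied from the curl example, and the shape of the response is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request


def build_login_request(host_url: str, ngc_org_name: str, ngc_key: str,
                        enable_telemetry: bool = True) -> urllib.request.Request:
    """Build the POST request used by the login example."""
    body = json.dumps({
        "ngc_org_name": ngc_org_name,
        "ngc_key": ngc_key,
        "enable_telemetry": enable_telemetry,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{host_url.rstrip('/')}/api/v1/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_login_request("http://10.10.10.10:32080", "<NGC_ORG_NAME>", "<NGC_KEY>")
print(req.get_method(), req.full_url)
# To send it against a real deployment:
#     with urllib.request.urlopen(req, timeout=30) as resp:
#         print(resp.read().decode("utf-8"))
```

Building the request separately from sending it lets you inspect the URL and payload before any credentials leave your machine.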
For more details, see the Remote Client CLI and REST API documentation sections.
Common issues are:
GPU Operator pods not in Ready or Completed states
Invalid values.yaml file
Missing or invalid imagepullsecret
Missing or invalid ngc_api_key
Missing or invalid ptmApiKey