Intermediate Users
You can run TAO directly using the Docker container. TAO ships multiple containers, and depending on the model that you want to train, you must pull the appropriate one. This step is not required when using the Launcher CLI.
export DOCKER_REGISTRY="nvcr.io"
export DOCKER_NAME="nvidia/tao/tao-toolkit"
export DOCKER_TAG="5.0.0-tf1.15.5" ## for TensorFlow docker
export DOCKER_CONTAINER=$DOCKER_REGISTRY/$DOCKER_NAME:$DOCKER_TAG
docker run -it --rm --gpus all -v /path/in/host:/path/in/docker $DOCKER_CONTAINER \
detectnet_v2 train -e /path/to/experiment/spec.txt -r /path/to/results/dir -k $KEY --gpus 4
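If the image is not already available locally, you can pull it explicitly before running. Access to nvcr.io follows the standard NGC login convention (username $oauthtoken, with your NGC API key as the password):

docker login nvcr.io
docker pull $DOCKER_CONTAINER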
For detailed instructions on how to run directly from containers, refer to this section.
TAO API is a Kubernetes service that enables building end-to-end AI models using REST APIs. The API service can be installed on a Kubernetes cluster (local or AWS EKS) using a Helm chart along with minimal dependencies. TAO jobs can run on the GPUs available in the cluster and can scale to a multi-node setting. You can use the TAO client CLI to interact with TAO services remotely, or you can integrate your apps and services directly using the REST APIs.
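As an illustration of the REST workflow, the sketch below queries the service with curl. The host, port, and route shown here are assumptions for illustration only; the actual endpoints, user IDs, and tokens are defined by your deployed API service:

# Hypothetical sketch: adjust the host, port, and routes to match your deployment.
# $USER_ID and $TOKEN are placeholders provided by your TAO API setup.
BASE_URL="http://<api-service-address>/api/v1"
curl -s "$BASE_URL/users/$USER_ID/datasets" -H "Authorization: Bearer $TOKEN"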
To get started, use the provided one-click deploy script to deploy on a bare-metal setup or on a managed Kubernetes service like Amazon EKS. Jupyter notebooks for training through the APIs directly, or through the client app, are provided under notebooks/api_starter_kit.
bash setup/quickstart_api_bare_metal/setup.sh install
bash setup/quickstart_api_aws_eks/setup.sh install
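Once the setup script completes, you can sanity-check the deployment with standard kubectl commands; the exact pod, service, and namespace names depend on your installation:

kubectl get pods
kubectl get services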
More information about setting up the API services and the API is provided here.
You can also run TAO directly on bare-metal without Docker or K8s by using the Python wheels, which contain standalone implementations of the DNN functionality that are pre-built and packaged into the TAO containers.
The table below maps each TAO wheel to its container and captures any exceptions associated with these wheels.

| Wheel Name | Container Mapping | Networks Supported |
|---|---|---|
| nvidia-tao-pytorch | nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pytorch | |
| nvidia-tao-deploy | nvcr.io/nvidia/tao/tao-toolkit:5.5.0-deploy | |
TAO provides sample tutorials that allow you to interact with the Python wheels on Google Colab without having to configure your infrastructure. Full instructions on how to work with Google Colab are provided in the TAO with Google Colab section.
Installing nvidia_tao_deploy Locally
This section details how to install the nvidia_tao_deploy wheel locally.
Install the following Python pip dependencies:

python3 -m pip install --upgrade pip
python3 -m pip install Cython==0.29.36
python3 -m pip install nvidia-ml-py
python3 -m pip install nvidia-pyindex
python3 -m pip install --upgrade setuptools
python3 -m pip install pycuda==2020.1
python3 -m pip install nvidia-eff-tao-encryption
python3 -m pip install nvidia-eff
python3 -m pip install cffi
Set up OpenMPI and mpi4py:

sudo apt-get install libopenmpi-dev -y
python3 -m pip install mpi4py
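To confirm that mpi4py was built against the OpenMPI installation, you can print the detected MPI library version (a standard mpi4py call):

python3 -c "from mpi4py import MPI; print(MPI.Get_library_version())"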
Install the nvidia_tao_deploy wheel:

python3 -m pip install nvidia-tao-deploy
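To verify the installation, inspect the package metadata with pip and check that the module imports cleanly:

python3 -m pip show nvidia-tao-deploy
python3 -c "import nvidia_tao_deploy"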
Installing nvidia_tao_pytorch Locally
The nvidia-tao-pytorch wheel has several third-party dependencies, which can be cumbersome to install. To complete the installation, refer to the steps in this script:
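As a minimal sketch, assuming the third-party dependencies handled by the referenced script are already in place, the wheel itself installs like any other package. Using a virtual environment (the environment name below is arbitrary) helps isolate its pinned dependencies from the rest of the system:

# Illustrative only: dependency setup from the referenced script is assumed done.
python3 -m venv tao-pytorch-env
source tao-pytorch-env/bin/activate
python3 -m pip install nvidia-tao-pytorch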