# TAO Deploy Installation

When a `tao-deploy` command is invoked through the TAO launcher, the `tao-deploy` container is pulled from NGC and instantiated. The TAO Deploy container contains only a few lightweight Python packages, such as OpenCV, NumPy, Pillow, and ONNX, and is based on the NGC TensorRT container. In addition to the NGC container, `tao-deploy` is also released as a public wheel on PyPI. The TensorRT engines generated by `tao-deploy` are specific to the GPU on which they are generated. Therefore, based on the platform the model is being deployed to, you must download the matching version of the `tao-deploy` wheel, install the corresponding TensorRT version for that platform, and generate the engine there.
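Because engines are tied to both the GPU and the TensorRT build, a common pre-deployment check is to confirm that the TensorRT version on the build machine matches the one on the target. The sketch below is illustrative only; the version strings are made-up examples, not requirements:

```shell
# Hypothetical version strings for the build machine and the
# deployment target -- substitute the real ones for your systems.
build_trt="8.5.2.2"
target_trt="8.5.1.7"

# Engines are generally only usable across patch releases of the
# same TensorRT major.minor version; strip the last two fields.
if [ "${build_trt%.*.*}" = "${target_trt%.*.*}" ]; then
    echo "compatible"
else
    echo "rebuild engine on target"
fi
```

On the target itself, `python3 -c "import tensorrt; print(tensorrt.__version__)"` is one way to read the installed version.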

## Invoking the TAO Deploy Container Directly

To deploy TAO models to TensorRT from the `tao-deploy` container, first identify the latest Docker tag associated with the TAO launcher by running `tao-deploy info --verbose`.

The following is sample output from TAO 4.0.0:


```
Configuration of the TAO Toolkit Instance
dockers:
        nvidia/tao/tao-toolkit:
                4.0.0-deploy:
                        docker_registry: nvcr.io
                        tasks:
                                1. classification_tf1
                                2. classification_tf2
                                3. deformable_detr
                                4. detectnet_v2
                                5. dssd
                                6. efficientdet_tf1
                                7. efficientdet_tf2
                                8. faster_rcnn
                                9. lprnet
                                12. retinanet
                                13. segformer
                                14. ssd
                                15. unet
                                16. yolo_v3
                                17. yolo_v4
                                18. yolo_v4_tiny
format_version: 2.0
toolkit_version: 4.0.0
published_date: 12/06/2022
```


The container name associated with the task can be retrieved as `$DOCKER_REGISTRY/$DOCKER_NAME:$DOCKER_TAG`. For example, from the log above, the Docker name to run `detectnet_v2` can be derived as follows:

```shell
export DOCKER_REGISTRY="nvcr.io"
export DOCKER_NAME="nvidia/tao/tao-toolkit"
export DOCKER_TAG="4.0.0-deploy"
export DOCKER_CONTAINER=$DOCKER_REGISTRY/$DOCKER_NAME:$DOCKER_TAG
```


Once you have the Docker name, invoke the container by running the commands defined for the network, without the `tao-deploy` prefix. For example, the following command runs `detectnet_v2` TensorRT engine generation for FP16:


```shell
docker run -it --rm --gpus all \
  -v /path/in/host:/path/in/docker \
  $DOCKER_CONTAINER \
  detectnet_v2 gen_trt_engine \
    -e /path/to/experiment/spec.txt \
    -m /path/to/etlt/file \
    -k $KEY \
    --data_type fp16 \
    --engine_file /path/to/engine/file
```
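Since the `tao-deploy` image is based on the NGC TensorRT container, `trtexec` (installed under `/usr/src/tensorrt/bin` in TensorRT images) can be used to sanity-check the generated engine. This is a sketch, not part of the official workflow; the paths are placeholders:

```shell
# Load the generated engine with trtexec to confirm it deserializes
# and runs on this GPU; /path/in/docker must contain the engine file.
docker run -it --rm --gpus all \
  -v /path/in/host:/path/in/docker \
  $DOCKER_CONTAINER \
  /usr/src/tensorrt/bin/trtexec --loadEngine=/path/in/docker/engine.file
```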


## Installing TAO Deploy through wheel

TAO Deploy is also distributed as a public wheel on PyPI. The wheel does not include TensorRT or TensorRT OSS among its dependencies, so you must either install these dependencies by following the official TensorRT documentation or run inside the TensorRT container available on NGC.
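For the container route, one option is to start from an NGC TensorRT image and then install the wheel inside it as described below. The tag shown is only an example; pick one whose TensorRT version matches your deployment target:

```shell
# Start an NGC TensorRT container (example tag -- choose the release
# that ships the TensorRT version you need) with your data mounted.
docker run -it --rm --gpus all \
  -v /path/in/host:/path/in/docker \
  nvcr.io/nvidia/tensorrt:22.12-py3
```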

Run the following command to install the `nvidia-tao-deploy` wheel in your Python environment:

```shell
pip install nvidia-tao-deploy
```
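To confirm the wheel installed correctly, you can query pip and check that a task entrypoint is on your `PATH`; `detectnet_v2` is used here only because it is the example task in this section:

```shell
# Show the installed package metadata.
pip show nvidia-tao-deploy

# Check that a task entrypoint is available (prints its usage text).
detectnet_v2 gen_trt_engine --help
```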


Then, you can run TAO Deploy tasks directly, without the `tao-deploy` prefix. For example, the following command runs `detectnet_v2` TensorRT engine generation for FP16:


```shell
detectnet_v2 gen_trt_engine -e /path/to/experiment/spec.txt \
  -m /path/to/etlt/file \
  -k $KEY \
  --data_type fp16 \
  --engine_file /path/to/engine/file
```


### Installing TAO Deploy on Google Colab

You can download the nvidia-tao-deploy wheel to Google Colab using the same commands as the x86 platform installation.

Note

The general limitations of Colab are outlined here.

1. Get the TensorRT TAR archive:

   1. Visit the TensorRT webpage: <https://developer.nvidia.com/tensorrt>
   2. After logging in, choose TensorRT 8 from the available versions.
   3. Agree to the Terms and Conditions.
   4. On the next landing page, click TensorRT 8.5 GA to expand the available options.
   5. Click "TensorRT 8.5 GA for Linux x86_64 and CUDA 11.0, 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7 and 11.8 TAR Package" to download the TAR file.
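After downloading, the archive must be extracted and its Python wheel installed before `nvidia-tao-deploy` can find TensorRT. The sketch below uses placeholder file and directory names; substitute the exact version you downloaded, and pick the wheel matching your Python interpreter (`cp38` is shown only as an example):

```shell
# Extract the TAR archive (filename is a placeholder for the
# exact package you downloaded).
tar -xzf TensorRT-8.5.x.x.Linux.x86_64-gnu.cuda-11.8.tar.gz

# Point the dynamic linker at the extracted TensorRT libraries.
export TRT_DIR=$PWD/TensorRT-8.5.x.x
export LD_LIBRARY_PATH=$TRT_DIR/lib:$LD_LIBRARY_PATH

# Install the TensorRT Python wheel shipped inside the archive.
python3 -m pip install $TRT_DIR/python/tensorrt-*-cp38-none-linux_x86_64.whl
```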

### Installing TAO Deploy on a Jetson Platform

You can download the `nvidia-tao-deploy` wheel to a Jetson platform using the same commands as the x86 platform installation. We recommend using the NVIDIA TensorRT Docker container, which already includes the TensorRT installation. Due to memory constraints, you should first run the `gen_trt_engine` subtask on the x86 platform to generate the engine; you can then use the generated engine to run inference or evaluation on the Jetson platform with the target dataset.