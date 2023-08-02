You can deploy the trained deep-learning and computer-vision models on edge devices, such as a Jetson Xavier, Jetson Nano, or Tesla, or in the cloud with NVIDIA GPUs. The exported \*.etlt model can also be used with TAO Toolkit Triton Apps.

The TAO Toolkit Triton Apps provide an inference sample for ReIdentificationNet. It consumes a TensorRT engine and supports running with a directory of query (probe) images and a directory of test (gallery) images containing the same identities.

To use this sample, you need to generate the TensorRT engine from an \*.etlt model using tao-converter.

Generating TensorRT Engine Using tao-converter

The tao-converter tool is provided with the TAO Toolkit to facilitate the deployment of TAO-trained models on TensorRT and/or DeepStream. This section describes how to generate a TensorRT engine using tao-converter.

For deployment platforms with an x86-based CPU and discrete GPUs, the tao-converter is distributed within the TAO docker. Therefore, we suggest using the docker to generate the engine. However, this requires that the user adhere to the same minor version of TensorRT as distributed with the docker. The TAO docker includes TensorRT version 8.0.
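The "same minor version" requirement above can be checked before converting. The following is a minimal sketch, not part of the TAO Toolkit: the helper name `same_minor_version` is hypothetical, and the version strings are illustrative values supplied by the caller.

```python
def same_minor_version(installed: str, required: str) -> bool:
    """Return True when two dotted version strings share major and minor parts.

    Hypothetical helper for illustration; it only compares the first two
    dot-separated fields, e.g. "8.2" in "8.2.5.1".
    """
    return installed.split(".")[:2] == required.split(".")[:2]


if __name__ == "__main__":
    # Illustrative values only. On a machine with the TensorRT Python
    # bindings installed, the installed version string is available as
    # tensorrt.__version__.
    print(same_minor_version("8.2.5.1", "8.2.1.8"))
    print(same_minor_version("8.0.1.6", "8.2.5.1"))
```

If the check fails, an engine built in one environment should not be assumed to load in the other.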

Instructions for x86

For an x86 platform with discrete GPUs, the default TAO package includes the tao-converter built for TensorRT 8.2.5.1 with CUDA 11.4 and cuDNN 8.2. For any other version of CUDA and TensorRT, refer to the overview section for the download links. Once the tao-converter is downloaded, follow the instructions below to generate a TensorRT engine.

1. Unzip the zip file on the target machine.

2. Install the OpenSSL package using the command:

    sudo apt-get install libssl-dev

3. Export the following environment variables:

    $ export TRT_LIB_PATH="/usr/lib/x86_64-linux-gnu"
    $ export TRT_INC_PATH="/usr/include/x86_64-linux-gnu"

4. Run the tao-converter using the sample command below and generate the engine. Instructions to build TensorRT OSS on x86 can be found in the TensorRT OSS on x86 section above or in this GitHub repo.

Note: Make sure to use the output node names listed in the Exporting the Model section of the respective model.





Instructions for Jetson

For the Jetson platform, the tao-converter is available to download in the NVIDIA developer zone. You may choose the version you wish to download as listed in the overview section. Once the tao-converter is downloaded, please follow the instructions below to generate a TensorRT engine.

1. Unzip the zip file on the target machine.

2. Install the OpenSSL package using the command:

    sudo apt-get install libssl-dev

3. Export the following environment variables:

    $ export TRT_LIB_PATH="/usr/lib/aarch64-linux-gnu"
    $ export TRT_INC_PATH="/usr/include/aarch64-linux-gnu"

4. For Jetson devices, TensorRT comes pre-installed with JetPack. If you are using an older JetPack version, upgrade to JetPack-5.0DP. Instructions to build TensorRT OSS on Jetson can be found in the TensorRT OSS on Jetson (ARM64) section above or in this GitHub repo.

5. Run the tao-converter using the sample command below and generate the engine.

Note: Make sure to use the output node names listed in the Exporting the Model section of the respective model.





Using the tao-converter

Here is a sample command to generate the ReIdentificationNet engine through tao-converter:

    # Convert a ResNet50 model with an input image of width 128 and height 256:
    tao-converter <etlt_model> \
        -k <key_to_etlt_model> \
        -d 3,256,128 \
        -p input,1x3x256x128,4x3x256x128,16x3x256x128 \
        -o fc_pred \
        -t fp16 \
        -m 16 \
        -e <path_to_generated_trt_engine>

This command will generate an optimized TensorRT engine.
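The -p argument in the sample command packs the input name and three batch shapes into one comma-separated string, which is easy to mistype. The sketch below assembles the same argument list from Python. The function names `dynamic_shape_profile` and `build_converter_command` are hypothetical, reading the three shapes as minimum, optimal, and maximum batch is an assumption based on their order, and only flags that appear in the sample command are used.

```python
import shlex


def dynamic_shape_profile(input_name, chw, min_batch, opt_batch, max_batch):
    """Format the -p value as <name>,<shape>,<shape>,<shape>, each NxCxHxW."""
    def shape(batch):
        return "x".join(str(dim) for dim in (batch, *chw))

    return ",".join(
        [input_name, shape(min_batch), shape(opt_batch), shape(max_batch)]
    )


def build_converter_command(etlt_model, key, engine_path,
                            chw=(3, 256, 128), batches=(1, 4, 16),
                            output_node="fc_pred", precision="fp16"):
    """Return the tao-converter argument list used in the sample command."""
    min_b, opt_b, max_b = batches
    return [
        "tao-converter", etlt_model,
        "-k", key,
        "-d", ",".join(str(dim) for dim in chw),
        "-p", dynamic_shape_profile("input", chw, min_b, opt_b, max_b),
        "-o", output_node,
        "-t", precision,
        "-m", str(max_b),
        "-e", engine_path,
    ]


if __name__ == "__main__":
    # Placeholder paths and key; substitute real values before running.
    cmd = build_converter_command("model.etlt", "my_key", "model.engine")
    print(shlex.join(cmd))
    # To execute: subprocess.run(cmd, check=True)
```

Building the list programmatically keeps the -d dimensions, the -p shapes, and the -m maximum batch size consistent with each other.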

Running the Triton Inference Sample

You can generate the TensorRT engine when starting the Triton server using the following command:

    bash scripts/start_server.sh

When the server is running, you can get results from a directory of query images and a directory of test images using the following command with a client:

    python tao_client.py <path_to_query_directory> \
        --test_dir <path_to_test_directory> \
        -m re_identification_tao \
        -x 1 \
        -b 16 \
        --mode Re_identification \
        -i https \
        -u localhost:8000 \
        --async \
        --output_path <path_to_output_directory>

Note: The server performs inference on the input image directories. The results are saved as a JSON file. The following is a sample of the JSON output:

    [
      ...,
      {
        "img_path": "/localhome/Data/market1501/query/1121_c3s2_156744_00.jpg",
        "embedding": [-1.1530249118804932, -1.8521332740783691, ..., 0.380886435508728]
      },
      ...
      {
        "img_path": "/localhome/Data/market1501/bounding_box_test/1377_c2s3_038007_05.jpg",
        "embedding": [0.09496910870075226, 0.26107653975486755, ..., 0.2835155725479126]
      },
      ...
    ]
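The JSON output can be consumed directly to find, for a given query image, the most similar test images. The following is a minimal sketch that assumes the file is a JSON array of objects with the "img_path" and "embedding" keys shown in the sample. The function names are hypothetical, and cosine distance is an illustrative choice; the source does not state which metric the sample itself uses.

```python
import json
import math


def load_embeddings(json_path):
    """Map each img_path to its embedding vector (assumed file layout)."""
    with open(json_path) as handle:
        entries = json.load(handle)
    return {entry["img_path"]: entry["embedding"] for entry in entries}


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def rank_gallery(query_vec, gallery, top_k=5):
    """Return the top_k gallery paths ordered from closest to farthest."""
    ordered = sorted(
        gallery, key=lambda path: cosine_distance(query_vec, gallery[path])
    )
    return ordered[:top_k]


if __name__ == "__main__":
    # Tiny in-memory example with made-up two-dimensional embeddings.
    gallery = {"a.jpg": [1.0, 0.0], "b.jpg": [0.0, 1.0], "c.jpg": [1.0, 1.0]}
    print(rank_gallery([1.0, 0.1], gallery, top_k=2))
```

With the real output, entries can be split into query and test sets by the directory component of "img_path" before ranking.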





End-to-End Inference Using Triton

The TAO Toolkit Triton Apps provide a sample for end-to-end inference from a directory of query images and a directory of test images. The sample downloads the Market-1501 dataset and randomly samples a subset of 100 identities. The client implicitly converts the image samples into arrays and sends them to the Triton server. The feature embedding for each image is returned and saved to the JSON output. An image of sampled matches and a figure of the CMC curve are also generated for visualization.
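To make the CMC curve concrete, the sketch below computes rank-k match rates from query and test embeddings. It is a simplified illustration, not the sample's implementation: the function names are hypothetical, squared Euclidean distance is an assumed metric, and the identity is read from the Market-1501 file-name prefix (for example, 1121 in 1121_c3s2_156744_00.jpg). The standard Market-1501 protocol also excludes same-camera matches and junk images, which this sketch does not do.

```python
import os


def identity_of(img_path):
    """Return the person ID, taken from the file-name prefix before '_'."""
    return os.path.basename(img_path).split("_")[0]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cmc_curve(queries, gallery, max_rank=10):
    """Return match rates for ranks 1..max_rank.

    queries and gallery map img_path -> embedding. Entry k-1 of the result
    is the fraction of queries whose identity appears in the top k results.
    """
    if not queries:
        raise ValueError("queries must not be empty")
    hits = [0] * max_rank
    for q_path, q_vec in queries.items():
        ranked = sorted(
            gallery, key=lambda g: squared_distance(q_vec, gallery[g])
        )
        q_id = identity_of(q_path)
        for rank, g_path in enumerate(ranked[:max_rank]):
            if identity_of(g_path) == q_id:
                # A hit at this rank also counts for every larger rank.
                for r in range(rank, max_rank):
                    hits[r] += 1
                break
    return [h / len(queries) for h in hits]


if __name__ == "__main__":
    # Made-up embeddings; file names follow the Market-1501 pattern.
    queries = {"query/0001_c1s1_000001_00.jpg": [0.0, 0.0],
               "query/0002_c1s1_000002_00.jpg": [10.0, 10.0]}
    gallery = {"test/0001_c2s1_000005_00.jpg": [0.0, 1.0],
               "test/0002_c2s1_000006_00.jpg": [5.0, 5.0],
               "test/0003_c2s1_000007_00.jpg": [9.0, 9.0]}
    print(cmc_curve(queries, gallery, max_rank=3))
```

The curve is non-decreasing by construction, because a correct match found at rank k is counted for all ranks from k upward.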

You can start the Triton server using the following command (only the ReIdentificationNet model will be downloaded and converted into a TensorRT engine):

    bash scripts/re_id_e2e_inference/start_server.sh

Once the Triton server has started, open another terminal and use the following command to run re-identification on the query and test images using the Triton server instance that you have previously spun up: