TLT CV Inference Pipeline Quick Start Scripts
=============================================

.. _tlt_cv_quick_start_scripts:

This page describes how to use the TLT CV Inference Pipeline Quick Start Scripts after completing the
:ref:`Installation Prerequisites`. The Quick Start Scripts hide container downloads, updates, model
compilation, and more.

Here is a flow diagram of the Quick Start:

.. image:: ../../content/tlt_cv_inf_pipeline_quick_start_flow.png

Enter the directory that houses the scripts and ensure that they are executable:

.. code-block:: bash

   cd scripts
   chmod +x *.sh

All the scripts should be executed within this :code:`scripts` directory. These scripts automatically pull
containers and models for x86 or aarch64 (Jetson).

Configuring
^^^^^^^^^^^

General configuration for the containers deployed using the quick start scripts can be viewed in the file
:code:`config.sh`. By default, the configuration file is set to launch all available containers on the
supported GPU, which is selected automatically based on the system architecture.

If you would like to use a video handle, ensure your video device handle (for example, :code:`/dev/video0`)
has been entered in :code:`config.sh` to make it discoverable to the relevant Client container.

.. Note:: Make note of the resolutions and FPS supported by your video handle (e.g. using the command
   :code:`v4l2-ctl --list-formats-ext`).

Models are automatically downloaded to the host machine at the location (an absolute path) specified by the
variable :code:`models_location` inside :code:`config.sh`. This location becomes important in the context of
retraining and replacing the TensorRT models.

By default, deployable TLT models come encrypted with their own keys. The keys listed in the config are
specific to the models that exist on NGC. They do not need to be modified unless you wish to work with
retrained and re-encrypted TLT models.

Also inside :code:`config.sh` is a field to specify a volume mount for the sample applications. This is
useful if you want to modify the applications and save the new source on the host machine rather than inside
the container (which, if exited, can result in loss of modifications).

All of the configuration options are documented within the configuration file itself.

.. Note:: The NVIDIA Triton server will be listening/broadcasting on ports :code:`8001` for gRPC,
   :code:`8000` for HTTP, and :code:`8002` for Triton metrics.

Initialization
^^^^^^^^^^^^^^

Run:

.. code-block:: bash

   bash tlt_cv_init.sh

The :code:`tlt_cv_init.sh` script pulls all the relevant containers and models to the machine. It also
downloads specific third-party dependencies to the client container. Successful completion of this download
will result in:

.. code-block:: bash

   [INFO] Finished pulling containers and models

The script then compiles the TLT models into TensorRT models to deploy for the NVIDIA Triton Server. This
step can take up to 10 minutes as it compiles all the TLT models for the Inference Pipeline. Upon successful
completion, you will see the following:

.. code-block:: bash

   [INFO] SUCCESS: Proceed to 'tlt_cv_start_server.sh'

Launching the Server and Client Containers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Run:

.. code-block:: bash

   bash tlt_cv_start_server.sh

This launches the NVIDIA Triton Server to allow inference requests. To verify that the server has started
correctly, check whether the output shows:

.. code-block:: bash

   I0428 03:14:46.464529 1 grpc_server.cc:1973] Started GRPCService at 0.0.0.0:8001
   I0428 03:14:46.464569 1 http_server.cc:1443] Starting HTTPService at 0.0.0.0:8000
   I0428 03:14:46.507043 1 http_server.cc:1458] Starting Metrics Service at 0.0.0.0:8002
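Optionally, from another terminal, you can also confirm that the server is ready to accept requests. The
following is a minimal sketch that assumes the Triton server deployed by the scripts exposes the standard
v2 HTTP health endpoint on port :code:`8000` (the HTTP port noted above); an HTTP :code:`200` response
indicates readiness:

.. code-block:: bash

   # Optional readiness probe against Triton's HTTP port (8000). A "200" response
   # means the server is up and ready to serve inference requests.
   curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/v2/health/ready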
To stop the server, use :code:`ctrl-c` in the relevant terminal.

Next, in another terminal, run:

.. code-block:: bash

   bash tlt_cv_start_client.sh

This opens an interactive container session with sample applications and all the necessary libraries. For
more information regarding the Inference Pipeline sample applications, refer to
:ref:`Running and Building Sample Applications`.

Stopping
^^^^^^^^

To stop active containers, run:

.. code-block:: bash

   bash tlt_cv_stop.sh

Cleaning
^^^^^^^^

To clean your machine of containers and/or models that were downloaded at init, run the following and follow
the prompts:

.. code-block:: bash

   bash tlt_cv_clean.sh

Integration with TLT
^^^^^^^^^^^^^^^^^^^^

A utility script, :code:`tlt_cv_compile.sh`, is provided to ease the deployment of TLT models into the
Inference Pipeline. The models are downloaded to the host system in the :code:`models_location` specified in
:code:`config.sh`. Simply replace the default "deployable" model with the newly trained ETLT model in the
respective :code:`tlt_*/` folder while preserving the name, then run one of the commands in the next section
for the new model.

For convenience, save the encoding key beforehand:

.. code-block:: bash

   export ENCODING_KEY=

**It is important that the new TLT model is renamed to match the default model that is already present.**

.. Note:: Default encoding keys for the original deployable TLT models exist in :code:`config.sh`.

The NVIDIA Triton Server points to the :code:`models_location`, so during the next
:code:`tlt_cv_start_server.sh` call, the newly deployed TensorRT model will serve inferences.

Emotion
-------

Let us say we have a new emotion TLT model. To deploy it into the TLT CV Inference Pipeline:

1. Stop the Triton Server by using :code:`ctrl-c`.

2. Locate the :code:`models_location` in :code:`config.sh` and change directories into it.

3. Replace the default TLT model in the location :code:`${models_location}/tlt_emotionnet_vdeployable/` so
   that the new TLT model is named "model.etlt".

4. Run the following script, which uses the :code:`tlt-converter` to generate a TensorRT model that will
   work with the Inference Pipeline:

   .. code-block:: bash

      bash tlt_cv_compile.sh emotion $ENCODING_KEY

   This will automatically drop the TensorRT model into the Triton Server models location. Ensure that the
   conversion is successful.

5. Start the Triton Server again and ensure the startup is successful.

Face Detect
-----------

Let us say we have a new face detect TLT model. To deploy it into the TLT CV Inference Pipeline:

1. Stop the Triton Server by using :code:`ctrl-c`.

2. Locate the :code:`models_location` in :code:`config.sh` and change directories into it.

3. Replace the default TLT model in the location :code:`${models_location}/tlt_facenet_vdeployable/` so
   that the new TLT model is named "model.etlt".

4. Run the following script, which uses the :code:`tlt-converter` to generate a TensorRT model that will
   work with the Inference Pipeline:

   .. code-block:: bash

      bash tlt_cv_compile.sh facedetect $ENCODING_KEY

   This will automatically drop the TensorRT model into the Triton Server models location. Ensure that the
   conversion is successful.

5. Start the Triton Server again and ensure the startup is successful.
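Putting the Face Detect steps above together, the end-to-end sequence might look like the following sketch.
It assumes the retrained model was saved as :code:`new_facenet.etlt` in the current directory (a hypothetical
filename) and that :code:`/path/to/models` stands in for the :code:`models_location` value from your
:code:`config.sh`:

.. code-block:: bash

   # Sketch only -- adjust the hypothetical filename and paths to your environment.
   export ENCODING_KEY=<key_used_when_retraining>
   models_location=/path/to/models                  # must match config.sh

   # Replace the default deployable model, preserving the expected file name.
   cp new_facenet.etlt ${models_location}/tlt_facenet_vdeployable/model.etlt

   # Regenerate the TensorRT model and restart the server (run from the scripts directory).
   bash tlt_cv_compile.sh facedetect $ENCODING_KEY
   bash tlt_cv_start_server.sh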
Facial Landmarks
----------------

Let us say we have a new facial landmarks TLT model. To deploy it into the TLT CV Inference Pipeline:

1. Stop the Triton Server by using :code:`ctrl-c`.

2. Locate the :code:`models_location` in :code:`config.sh` and change directories into it.

3. Replace the default TLT model in the location :code:`${models_location}/tlt_fpenet_vdeployable/` so
   that the new TLT model is named "model.etlt".

4. Run the following script, which uses the :code:`tlt-converter` to generate a TensorRT model that will
   work with the Inference Pipeline:

   .. code-block:: bash

      bash tlt_cv_compile.sh faciallandmarks $ENCODING_KEY

   This will automatically drop the TensorRT model into the Triton Server models location. Ensure that the
   conversion is successful.

   .. Note:: By default, the TLT CV Inference Pipeline assumes 80 landmarks from the TensorRT model. In the
      case of a newly trained TLT model with 68 output landmarks (for example), you must modify the Triton
      configuration, which exists at
      :code:`${models_location}/triton_model_repository/faciallandmarks_tlt/config.pbtxt`. Ensure that both
      outputs (not inputs) are changed to 68 (or the corresponding output of the new model).

5. Start the Triton Server again and ensure the startup is successful.
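To illustrate the note in step 4 above, the following sketch simply locates the two :code:`output` entries in
the landmarks Triton configuration; the actual field names and shapes come from your own
:code:`config.pbtxt`, so edit the landmark count (80 by default) in both outputs with a text editor rather
than scripting the change:

.. code-block:: bash

   # The Triton configuration for the landmarks model (models_location must match config.sh).
   config=${models_location}/triton_model_repository/faciallandmarks_tlt/config.pbtxt

   # Show the "output" entries; update the landmark count (80 by default) to the new model's
   # value (e.g. 68) in BOTH outputs, leaving the inputs unchanged.
   grep -n -A 5 "output" "$config"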
Gaze
----

Let us say we have a new gaze TLT model. To deploy it into the TLT CV Inference Pipeline:

1. Stop the Triton Server by using :code:`ctrl-c`.

2. Locate the :code:`models_location` in :code:`config.sh` and change directories into it.

3. Replace the default TLT model in the location :code:`${models_location}/tlt_gazenet_vdeployable/` so
   that the new TLT model is named "model.etlt".

4. Run the following script, which uses the :code:`tlt-converter` to generate a TensorRT model that will
   work with the Inference Pipeline:

   .. code-block:: bash

      bash tlt_cv_compile.sh gaze $ENCODING_KEY

   This will automatically drop the TensorRT model into the Triton Server models location. Ensure that the
   conversion is successful.

5. Start the Triton Server again and ensure the startup is successful.

Gesture
-------

Let us say we have a new gesture TLT model. To deploy it into the TLT CV Inference Pipeline:

1. Stop the Triton Server by using :code:`ctrl-c`.

2. Locate the :code:`models_location` in :code:`config.sh` and change directories into it.

3. Replace the default TLT model in the location :code:`${models_location}/tlt_gesturenet_vdeployable/` so
   that the new TLT model is named "model.etlt".

4. Run the following script, which uses the :code:`tlt-converter` to generate a TensorRT model that will
   work with the Inference Pipeline:

   .. code-block:: bash

      bash tlt_cv_compile.sh gesture $ENCODING_KEY

   This will automatically drop the TensorRT model into the Triton Server models location. Ensure that the
   conversion is successful.

5. Start the Triton Server again and ensure the startup is successful.

Heart Rate
----------

Let us say we have a new heart rate TLT model. To deploy it into the TLT CV Inference Pipeline:

1. Stop the Triton Server by using :code:`ctrl-c`.

2. Locate the :code:`models_location` in :code:`config.sh` and change directories into it.

3. Replace the default TLT model in the location :code:`${models_location}/tlt_heartratenet_vdeployable/` so
   that the new TLT model is named "model.etlt".

4. Run the following script, which uses the :code:`tlt-converter` to generate a TensorRT model that will
   work with the Inference Pipeline:

   .. code-block:: bash

      bash tlt_cv_compile.sh heartrate $ENCODING_KEY

   This will automatically drop the TensorRT model into the Triton Server models location. Ensure that the
   conversion is successful.

5. Start the Triton Server again and ensure the startup is successful.
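For quick reference, the model keywords accepted by :code:`tlt_cv_compile.sh` throughout this section map to
the deployable folders under :code:`models_location` as summarized below; substitute the keyword for the
model you retrained and supply the matching encoding key:

.. code-block:: bash

   # Keyword passed to tlt_cv_compile.sh  ->  folder holding model.etlt in ${models_location}
   #   emotion          ->  tlt_emotionnet_vdeployable/
   #   facedetect       ->  tlt_facenet_vdeployable/
   #   faciallandmarks  ->  tlt_fpenet_vdeployable/
   #   gaze             ->  tlt_gazenet_vdeployable/
   #   gesture          ->  tlt_gesturenet_vdeployable/
   #   heartrate        ->  tlt_heartratenet_vdeployable/
   bash tlt_cv_compile.sh <model_keyword> $ENCODING_KEY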