Gesture Recognition
===================

.. _gesturenet:

GestureNet is an NVIDIA-developed gesture classification model that is included in the Transfer Learning Toolkit (TLT). GestureNet supports the following tasks:

* dataset_convert
* train
* evaluate
* inference
* export

These tasks can be invoked from the TLT launcher using the following convention on the command line:

.. code::

    tlt gesturenet <sub_task> <args_per_subtask>

where :code:`args_per_subtask` are the command-line arguments required for a given subtask. Each subtask is explained in detail below.

Pre-processing the Dataset
--------------------------

.. _data_input_for_gesture_recognition:

The GestureNet app requires the dataset images and labels to be in a specific format. Once the data is prepared, the Transfer Learning Toolkit provides the :code:`dataset_convert` tool to convert it for model training.

Image Format
^^^^^^^^^^^^

A gesture recognition model should perform well on users outside the training dataset, so model training requires that users be segregated when the data is split into train, validation, and test sets. To enable this, each subject needs a unique identifier, :code:`user_id`. In addition, each subject may record multiple videos (sessions). The dataset should be organized in the following format:

.. code::

    .
    |-- original dataset root
        |-- uid_1
            |-- session_1
                |-- 000000.png
                |-- 000001.png
                .
                .
                |-- xxxxxx.png
            |-- session_2
                |-- 000000.png
                |-- 000001.png
                .
                .
                |-- xxxxxx.png
        |-- uid_2
            |-- session_1
                |-- 000000.png
                |-- 000001.png
                .
                .
                |-- xxxxxx.png
            |-- session_2
                |-- 000000.png
                |-- 000001.png
                .
                .
                |-- xxxxxx.png
        |-- uid_3
            |-- session_1
                |-- 000000.png
                |-- 000001.png
                .
                .
                |-- xxxxxx.png

For each set, we also prepare a metadata file that captures fields that can be used for dataset sampling:

.. code::

    {
        "set": "data",
        "users": {
            "uid_1": {
                "location": "outdoor",
                "illumination": "good",
                "class_fps": {
                    "session_1": 30,
                    "session_2": 30
                }
            },
            "uid_2": {
                "location": "indoor",
                "illumination": "good",
                "class_fps": {
                    "session_1": 10,
                    "session_2": 15
                }
            },
            "uid_3": {
                "location": "indoor",
                "illumination": "poor",
                "class_fps": {
                    "session_1": 10
                }
            }
        }
    }

Label Format
^^^^^^^^^^^^

Each image corresponds to a subject performing a gesture, and each image requires a corresponding label JSON that contains a bounding box for the hand of interest and the gesture label. The labels follow the `Label Studio <https://labelstud.io/>`_ format. A sample label for an image is:

.. code::

    {
        "completions": [
            {
                "result": [
                    {
                        "type": "rectanglelabels",
                        "original_width": 320,
                        "original_height": 240,
                        "value": {
                            "x": 58.1,
                            "y": 18.3,
                            "width": 18.8,
                            "height": 49.5
                        }
                    },
                    {
                        "type": "choices",
                        "value": {
                            "choices": [
                                "Thumbs-up"
                            ]
                        }
                    }
                ]
            }
        ],
        "task_path": "/workspace/tlt-experiments/gesturenet/data/uid_1/session_1/image_0001.png"
    }

* :code:`task_path`: Specifies the full path to the image.
* :code:`completions`: A chunk that contains the labels under :code:`result`. The bounding box and the gesture class are separate entries with the following :code:`type` values.
* :code:`rectanglelabels`: Specifies the label corresponding to the hand bounding box.
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
| **Parameter name** | **Description**                                                                          | **Type**        | **Range**              |
+====================+==========================================================================================+=================+========================+
| type               | The type of label                                                                        | String          | rectanglelabels        |
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
| original_width     | Width of the image being labelled (in pixels)                                            | Integer         | [1, inf)               |
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
| original_height    | Height of the image being labelled (in pixels)                                           | Integer         | [1, inf)               |
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
| value["x"]         | x coordinate of the top-left corner of the hand bounding box                             | Float           | [0, 100]               |
|                    | (as a percentage of image width)                                                         |                 |                        |
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
| value["y"]         | y coordinate of the top-left corner of the hand bounding box                             | Float           | [0, 100]               |
|                    | (as a percentage of image height)                                                        |                 |                        |
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
| value["width"]     | Width of the hand bounding box (as a percentage of image width)                          | Float           | [0, 100]               |
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
| value["height"]    | Height of the hand bounding box (as a percentage of image height)                        | Float           | [0, 100]               |
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+

* :code:`choices`: Specifies the label corresponding to the gesture class.

+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
| **Parameter name** | **Description**                                                                          | **Type**        | **Range**              |
+====================+==========================================================================================+=================+========================+
| type               | The type of label                                                                        | String          | choices                |
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
| value["choices"]   | List of attributes. For the GestureNet app, this is a single entry with the gesture      | List of strings | Valid gesture classes  |
|                    | class name.                                                                              |                 |                        |
+--------------------+------------------------------------------------------------------------------------------+-----------------+------------------------+
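Because the bounding-box values are stored as percentages of the image dimensions, a common first step when inspecting labels is converting them to pixel coordinates. The sketch below shows that conversion; the helper name :code:`bbox_to_pixels` is illustrative and not part of TLT.

.. code:: python

    def bbox_to_pixels(value, original_width, original_height):
        """Convert a Label Studio percentage bbox to pixel coordinates.

        `value` is the "value" dict of a "rectanglelabels" entry.
        Returns (x_min, y_min, x_max, y_max) in pixels.
        """
        x_min = value["x"] / 100.0 * original_width
        y_min = value["y"] / 100.0 * original_height
        x_max = x_min + value["width"] / 100.0 * original_width
        y_max = y_min + value["height"] / 100.0 * original_height
        return x_min, y_min, x_max, y_max

    # Using the sample label above (320x240 image):
    # bbox_to_pixels({"x": 58.1, "y": 18.3, "width": 18.8, "height": 49.5}, 320, 240)
    # -> (185.92, 43.92, 246.08, 162.72)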
The :code:`dataset_convert` tool requires an extraction config and an experiment config spec file as input. The details of these configuration files and sample usage examples are included in the following sections.

Dataset Extraction Config
^^^^^^^^^^^^^^^^^^^^^^^^^

The dataset_config spec specifies the parameters needed to crop the hand bounding box and prepare the dataset. Here's a sample spec:

.. code::

    {
        "org_dataset": "data",
        "mount_dir_path": "/workspace/tlt-experiments/gesturenet/",
        "org_data_dir": "original",
        "post_data_dir": "extracted",
        "kpi_users": ["uid_1", "uid_2"],
        "sampling_rate": 1,
        "convert_bbox_square": true,
        "bbox_enlarge_ratio": 1.1,
        "annotation_config": {
            "annotation_path": "annotation"
        }
    }

The following table describes the parameters:

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| org_dataset               | Name of the dataset.                                                                                    | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| mount_dir_path            | Path to the root directory relative to which the data is stored.                                        | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| org_data_dir              | Path to the original images directory, relative to :code:`mount_dir_path`.                              | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| post_data_dir             | Path to the directory, relative to :code:`mount_dir_path`, where crops are to be extracted.             | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| kpi_users                 | List of user IDs set aside for the test set.                                                            | List of String                |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| sampling_rate             | Rate at which to select frames for labeling. If the data is not from video, set this to 1.              | Integer                       | 1 if dataset is not from video |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| convert_bbox_square       | Boolean variable indicating whether the labelled bounding box should be converted to a square.          | Boolean                       |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| bbox_enlarge_ratio        | Scale factor used to enlarge the bounding box.                                                          | Float                         | [1.0, 1.2]                     |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| annotation_config         | Nested annotation dictionary containing the path to the folder with labels (relative to :code:`/`).     | Dictionary                    |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
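The cropping behavior implied by :code:`convert_bbox_square` and :code:`bbox_enlarge_ratio` can be pictured with the following sketch. It is only an illustration of the concept, assuming squaring about the box center; the actual cropping logic is internal to :code:`dataset_convert`.

.. code:: python

    def square_and_enlarge(x_min, y_min, x_max, y_max, enlarge_ratio=1.1):
        """Illustrative sketch: make a bbox square around its center, then
        scale its side by `enlarge_ratio` (cf. convert_bbox_square and
        bbox_enlarge_ratio). Not the actual TLT implementation; a real
        pipeline would also clamp the result to the image bounds.
        """
        cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
        side = max(x_max - x_min, y_max - y_min) * enlarge_ratio
        half = side / 2.0
        return cx - half, cy - half, cx + half, cy + half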
Dataset Experiment Config
^^^^^^^^^^^^^^^^^^^^^^^^^

The dataset_experiment_config spec specifies the parameters needed to combine different datasets. It allows the user to specify user IDs that are set aside for the validation or test set, as well as different sampling strategies based on metadata and class counts. Here's a sample spec:

.. code::

    {
        "mount_dir_path": "/workspace/tlt-experiments/gesturenet",
        "org_data_dir": "original",
        "post_data_dir": "extracted",
        "set_list": {
            "train_val": [
                "data"
            ],
            "kpi": [
                "data"
            ]
        },
        "uid_list": {
            "uid_name": "user_id",
            "predefined_val_users": false,
            "val_fraction": 0.25,
            "validation": [
            ],
            "kpi": [
                "uid_1"
            ]
        },
        "image_feature_filter": {
            "train_val": {
                "*": [
                    {
                        "location": "outdoor"
                    }
                ]
            },
            "kpi": {
            }
        },
        "sampling": {
            "sampling_mode": "average",
            "use_class_weights": false,
            "class_weights": {
                "thumbs_up": 0.5,
                "v": 0.5
            }
        }
    }

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| mount_dir_path            | Path to the root directory relative to which the data is stored.                                        | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| org_data_dir              | Path to the original images directory, relative to :code:`mount_dir_path`.                              | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| post_data_dir             | Path to the directory, relative to :code:`mount_dir_path`, where crops were extracted.                  | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| set_list                  | Nested configuration for parameters related to datasets.                                                | Dictionary                    |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| uid_list                  | Nested configuration for parameters related to user IDs.                                                | Dictionary                    |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| image_feature_filter      | Nested configuration for parameters related to filtering images based on metadata.                      | Dictionary                    |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| sampling                  | Nested configuration for parameters related to class weights and sampling strategy.                     | Dictionary                    |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
The following table describes the :code:`set_list` parameters:

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| train_val                 | List of datasets from which to select users for training and validation.                                | List of String                |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| kpi                       | List of datasets from which to select users for the test set.                                           | List of String                |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+

The following table describes the :code:`uid_list` parameters:

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| uid_name                  | Name of the field that represents the unique identifier of each subject.                                | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| predefined_val_users      | Flag indicating whether the train-validation split is specified by the config.                          | Boolean                       |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| val_fraction              | Fraction of non-KPI users used for the validation set. Only used if                                     | Float                         |                                |
|                           | :code:`predefined_val_users=false`.                                                                     |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| validation                | List of user IDs used in the validation set. Only used if :code:`predefined_val_users=true`.            | List of String                |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| kpi                       | List of user IDs assigned to the test set.                                                              | List of String                |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
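For intuition, a user-level split driven by :code:`val_fraction` might look like the sketch below. This illustrates the concept only; the function and its seed are hypothetical, not TLT's implementation.

.. code:: python

    import random

    def split_users(all_users, kpi_users, val_fraction=0.25, seed=108):
        """Illustrative user-level split: KPI users go to the test set;
        the remaining users are divided into train/validation according
        to val_fraction (cf. the uid_list parameters)."""
        candidates = sorted(u for u in all_users if u not in set(kpi_users))
        random.Random(seed).shuffle(candidates)
        n_val = int(len(candidates) * val_fraction)
        # Returns (train_users, val_users, kpi_users).
        return candidates[n_val:], candidates[:n_val], list(kpi_users)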
The following table describes the :code:`image_feature_filter` parameters:

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| train_val                 | Metadata fields used to discard images from the training and validation sets.                           | Dictionary                    |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| kpi                       | Metadata fields used to discard images from the test set.                                               | Dictionary                    |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+

The following table describes the :code:`sampling` parameters:

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| sampling_mode             | Sampling methodology when using class_weights:                                                          | String                        | "average"                      |
|                           |                                                                                                         |                               |                                |
|                           | * :code:`under`: undersampling                                                                          |                               |                                |
|                           | * :code:`over`: oversampling                                                                            |                               |                                |
|                           | * :code:`average`: some classes are oversampled and others are undersampled                             |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| use_class_weights         | Boolean variable indicating whether sampling should be based on class weights.                          | Boolean                       | True / False                   |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| class_weights             | Dictionary mapping the gesture classes of interest to their class weights.                              | Dictionary                    |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
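To make the sampling options concrete, the sketch below shows generic under- and oversampling of one class's image list toward a target count. How :code:`dataset_convert` derives the target from :code:`sampling_mode` and :code:`class_weights` is internal to the tool; this is only an illustration.

.. code:: python

    import random

    def resample_class(images, target_count, seed=108):
        """Illustrative resampling of one class's image list to target_count:
        undersample if the class has too many images, oversample (with
        replacement) if it has too few."""
        rng = random.Random(seed)
        if len(images) >= target_count:
            return rng.sample(images, target_count)        # undersampling
        extra = rng.choices(images, k=target_count - len(images))  # oversampling
        return images + extra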
Sample Usage of the Dataset Converter Tool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TLT has a built-in command to prepare a dataset for the GestureNet model, shown below:

.. code::

    tlt gesturenet dataset_convert --dataset_spec <dataset_spec>
                                   --experiment_spec <experiment_spec>
                                   --k_folds <num_folds>
                                   --output_filename <output_filename>
                                   --experiment_name <experiment_name>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`--dataset_spec`: The path to the dataset spec.
* :code:`--experiment_spec`: The path to the dataset experiment spec.
* :code:`--k_folds`: Number of folds.
* :code:`--output_filename`: Output JSON that is ingested by the GestureNet training pipeline.
* :code:`--experiment_name`: Name of the experiment.

Sample Usage
^^^^^^^^^^^^

Here is an example of preparing the dataset for a GestureNet model:

.. code::

    tlt gesturenet dataset_convert --dataset_spec $SPECS_DIR/dataset_config.json \
                                   --k_folds 0 \
                                   --experiment_spec $SPECS_DIR/dataset_experiment_config.json \
                                   --output_filename $USER_EXPERIMENT_DIR/data.json \
                                   --experiment_name v1

Creating a Configuration File
-----------------------------

To run training, evaluation, and inference for GestureNet, several components need to be configured, each with its own parameters. The :code:`gesturenet train`, :code:`gesturenet evaluate`, and :code:`gesturenet inference` commands for a GestureNet experiment share the same configuration file. The main components of the configuration file are given below:

* Trainer
* Model
* Evaluator

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| random_seed               | The random seed for the experiment.                                                                     | Unsigned Int                  | 108                            |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| batch_size                | Batch size used for the experiment.                                                                     | Unsigned Int                  | 64                             |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| output_experiments_fld    | Directory where experiments will be saved.                                                              | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| save_weights_path         | Folder in :code:`output_experiments_fld` that the model will be saved to.                               | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| trainer                   | Trainer configuration.                                                                                  |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| model                     | Model configuration.                                                                                    |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| evaluator                 | Evaluator configuration.                                                                                |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
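Putting the components together, a skeleton of the shared configuration file might look like the following. This sketch is assembled only from the fields documented in this section (the path value is hypothetical); consult the sample :code:`train_spec.json` shipped with TLT for the authoritative schema.

.. code:: python

    import json

    # Hypothetical skeleton of a GestureNet train spec, assembled from the
    # fields documented in this section; the authoritative schema is the
    # sample spec shipped with TLT.
    train_spec = {
        "random_seed": 108,
        "batch_size": 64,
        "output_experiments_fld": "/workspace/tlt-experiments/gesturenet/experiments",
        "save_weights_path": "model",
        "trainer": {
            "num_workers": 1,
            "top_training": {},   # see "Top Training Config" below
            "finetuning": {},     # see "Fine Tuning Config" below
        },
        "model": {},              # see "Model Config" below
        "evaluator": {},          # see "Evaluator Config" below
    }

    with open("train_spec.json", "w") as f:
        json.dump(train_spec, f, indent=4)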
Trainer Config
^^^^^^^^^^^^^^

The trainer configuration allows you to configure how to train your model. The two main components are :code:`top_training` and :code:`finetuning`; :code:`num_workers` specifies how many workers to use to train the model. The :code:`top_training` configuration is explained in detail in the next section.

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| top_training              | Top training configuration.                                                                             |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| finetuning                | Fine tuning configuration.                                                                              |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| num_workers               | Number of workers used to train the model.                                                              | Unsigned Int                  | 1                              |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+

Top Training Config
*******************

The top training configuration allows you to customize how the model trains. There are five main components to the :code:`top_training` configuration:

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| stage_order               |                                                                                                         |                               | 1                              |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| loss_fn                   | Loss function used for top training the model. Currently only categorical cross entropy is supported.   | String                        | categorical_crossentropy       |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| train_epochs              | The number of epochs of top training to perform.                                                        | Unsigned Int                  | 5                              |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| num_layers_unfreeze       | The number of layers whose weights are updated during training. For example, if 3 layers are            | Unsigned Int                  | 3                              |
|                           | unfrozen, all layers starting from the inputs are frozen and only the last 3 layers of the model        |                               |                                |
|                           | remain unfrozen.                                                                                        |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| optimizer                 | The optimizer used for top training. Currently :code:`sgd`, :code:`adam`, and :code:`rmsprop`           | String                        | rmsprop                        |
|                           | are supported.                                                                                          |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
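To make :code:`num_layers_unfreeze` concrete, the following Keras-style sketch freezes every layer from the input side and leaves only the last N layers trainable, mirroring the description above. It assumes a Keras-style model object and is not TLT's internal code.

.. code:: python

    def freeze_all_but_last_n(model, num_layers_unfreeze):
        """Illustrative Keras-style sketch of num_layers_unfreeze: freeze
        every layer from the input side, leaving only the last
        `num_layers_unfreeze` layers trainable (assumes a value >= 1)."""
        for layer in model.layers[:-num_layers_unfreeze]:
            layer.trainable = False
        for layer in model.layers[-num_layers_unfreeze:]:
            layer.trainable = True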
Fine Tuning Config
******************

The fine tuning configuration has nine different options, listed below.

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| stage_order               |                                                                                                         |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| train_epochs              | The number of epochs of fine tuning to perform. Fine tuning allows you to obtain the best results       | Unsigned Int                  | 50                             |
|                           | when switching datasets; usually more layers are frozen and a lower learning rate is used.              |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| loss_fn                   | The loss function used for fine tuning. Currently only categorical cross entropy is supported.          | String                        | categorical_crossentropy       |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| initial_lrate             | The initial learning rate for fine tuning. Fine tuning uses a step learning-rate annealing schedule     | Float                         | 3e-04                          |
|                           | based on the progress of the current experiment, defined as the ratio of the current iteration to       |                               |                                |
|                           | the maximum iterations; the scheduler adjusts the learning rate in steps at regular intervals.          |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| decay_step_size           | The decay step size for the learning rate, i.e. the interval of training progress between               | Float                         | 33                             |
|                           | learning-rate drops (see :code:`initial_lrate` for the schedule).                                       |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| lr_drop_rate              | The factor by which the learning rate drops at each step (see :code:`initial_lrate` for the schedule).  | Float                         | 0.5                            |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| enable_checkpointing      | Flag that determines whether to enable checkpoints.                                                     | Bool                          | True                           |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| num_layers_unfreeze       | The number of layers unfrozen (whose weights are updated) during training. It is advised to             | Unsigned Int                  | 100                            |
|                           | unfreeze most layers for the fine-tuning step.                                                          |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| optimizer                 | The optimizer used for fine tuning. :code:`sgd`, :code:`adam`, and :code:`rmsprop` are supported.       | String                        | sgd                            |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
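Taken together, :code:`initial_lrate`, :code:`decay_step_size`, and :code:`lr_drop_rate` describe a step-decay schedule. A plausible sketch of such a schedule is shown below; the exact formula TLT uses is internal, so treat this only as a reading aid.

.. code:: python

    import math

    def step_decay_lr(progress_pct, initial_lrate=3e-4,
                      decay_step_size=33, lr_drop_rate=0.5):
        """Illustrative step learning-rate annealing: progress_pct is the
        training progress (current iteration / max iterations) in percent;
        the learning rate drops by lr_drop_rate every decay_step_size
        percent of progress."""
        return initial_lrate * lr_drop_rate ** math.floor(progress_pct / decay_step_size)

    # e.g. progress 0% -> 3e-4, 40% -> 1.5e-4, 70% -> 7.5e-05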
Model Config
^^^^^^^^^^^^

The model configuration allows you to customize the architecture you want to use and its hyperparameters. The key options are given below.

+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| base_model                | The base model to use. The public version uses a vanilla ResNet, but the release version uses an        | String                        | resnet_vanilla                 |
|                           | optimized model that obtains better results.                                                            |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| num_layers                | The number of layers in the model. The currently supported values are 6, 10, 12, 18, 26, 34, 50,        | Unsigned Int                  | 18                             |
|                           | 101, and 152.                                                                                           |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| weights_init              | The path to the saved weights that the model loads.                                                     | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| gray_scale_input          | Image input type. RGB images work best, but grayscale inputs work as well. If the images are RGB,       | Bool                          | False                          |
|                           | set this flag to false.                                                                                 |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| data_format               | The image format to use; this must align with the model provided. The options are channels_first        | String                        | channels_first                 |
|                           | (NCHW) or channels_last (NHWC). At the moment, NCHW is the preferred format.                            |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| image_height              | Image height of the model input.                                                                        | Unsigned Int                  | 160                            |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| image_width               | Image width of the model input.                                                                         | Unsigned Int                  | 160                            |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| use_batch_norm            | Flag that determines whether to use batch normalization; set to True to enable it.                      | Bool                          | False                          |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| kernel_regularizer_type   | The regularization to use for the convolutional layers. If you want to prune the model, l1 (lasso)      | String                        | l2                             |
|                           | regularization is recommended: it helps generate sparse weights that can later be pruned away by        |                               |                                |
|                           | minimum-weight pruning. The options are l1 or l2.                                                       |                               |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| kernel_regularizer_factor | The value to use for the regularization.                                                                | Float                         | 0.001                          |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
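As an illustration of the regularizer options, the sketch below maps :code:`kernel_regularizer_type` and :code:`kernel_regularizer_factor` to a Keras regularizer, assuming the straightforward correspondence; TLT's own model-building code may differ.

.. code:: python

    from tensorflow.keras import regularizers

    def make_kernel_regularizer(reg_type="l2", factor=0.001):
        """Illustrative mapping of kernel_regularizer_type/_factor to a
        Keras regularizer: l1 encourages sparse weights (which suits
        minimum-weight pruning), l2 penalizes large weights."""
        if reg_type == "l1":
            return regularizers.l1(factor)
        return regularizers.l2(factor)

    # Used when building a layer, e.g.:
    # Conv2D(64, 3, kernel_regularizer=make_kernel_regularizer("l1", 0.001))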
Evaluator Config
^^^^^^^^^^^^^^^^

The evaluator configuration specifies the options for evaluating your GestureNet model.
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| **Field**                 | **Description**                                                                                         | **Data Type and Constraints** | **Recommended/Typical Value**  |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| evaluation_exp_name       | Name of the experiment.                                                                                 | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+
| data_path                 | Path to the evaluation JSON file.                                                                       | String                        |                                |
+---------------------------+---------------------------------------------------------------------------------------------------------+-------------------------------+--------------------------------+

Training the Model
------------------

TLT has a built-in command to train a GestureNet model, shown below:

.. code::

    tlt gesturenet train -e <experiment_spec_filename>
                         -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_filename`: Path to the spec file.
* :code:`-k, --key`: User-specific encoding key to save or load a :code:`.tlt` model.

Sample Usage
^^^^^^^^^^^^

Here's an example of using the train command on GestureNet:

.. code::

    tlt gesturenet train -e $SPECS_DIR/train_spec.json \
                         -k $KEY

Evaluating the Model
--------------------

TLT has a built-in command to evaluate a GestureNet model, shown below:

.. code::

    tlt gesturenet evaluate -e <experiment_spec_filename>
                            -m <model>
                            -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_filename`: Experiment spec file to set up the evaluation experiment. This should be the same as the training spec file.
* :code:`-m, --model`: Path to the model file to use for evaluation.
* :code:`-k, --key`: The encryption key to decrypt the model. This argument is required only with a :code:`.tlt` model file.

Sample Usage
^^^^^^^^^^^^

Here's an example of using the evaluate command on a GestureNet model:

.. code::

    tlt gesturenet evaluate -e $USER_EXPERIMENT_DIR/model/train_spec.json \
                            -m $USER_EXPERIMENT_DIR/model/model.tlt \
                            -k $KEY

Running Inference on the Model
------------------------------

TLT has a built-in command to run inference on a GestureNet model, shown below:

.. code::

    tlt gesturenet inference -e <experiment_spec_filename>
                             -m <model>
                             -k <key>
                             --image_root_path <image_root_path>
                             --data_json <data_json>
                             --data_type <data_type>
                             --results_dir <results_dir>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_filename`: Experiment spec file to set up the inference experiment. The model parameters should be the same as in the training spec file.
* :code:`-m, --model`: Path to the model file to use for inference.
* :code:`-k, --key`: The encryption key to decrypt the model. This argument is required only with a :code:`.tlt` model file.
* :code:`--image_root_path`: The root directory that the dataset is mounted at.
* :code:`--data_json`: The JSON spec with image paths and hand bounding boxes.
* :code:`--data_type`: The dataset type within data_json on which inference is to be run.
* :code:`--results_dir`: Directory where the results are saved.

Sample Usage
^^^^^^^^^^^^

Here is an example of running inference using a GestureNet model:
.. code::

    tlt gesturenet inference -e $USER_EXPERIMENT_DIR/model/train_spec.json \
                             -m $USER_EXPERIMENT_DIR/model/model.tlt \
                             -k $KEY \
                             --image_root_path /workspace/tlt-experiments/gesturenet \
                             --data_json /workspace/tlt-experiments/gesturenet/data.json \
                             --data_type kpi_set \
                             --results_dir $USER_EXPERIMENT_DIR/model

Exporting the Model
-------------------

The command to export the GestureNet model to a TensorRT plan is given below. Only FP16 is supported at the moment.

.. code::

    tlt gesturenet export -m <model_filename>
                          -k <key>
                          -o <out_file>
                          -t <export_type>
                          -ll <log_level>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-m, --model_filename`: The full path to the model to export.
* :code:`-k, --key`: Encryption key used to train the model.
* :code:`-o, --out_file`: Path where the exported model is saved.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-t, --export_type`: Type of export to use. Options are :code:`onnx` or :code:`tfonnx`.
* :code:`-ll, --log_level`: The log level to use.

Sample Usage
^^^^^^^^^^^^

Here is an example of exporting a GestureNet model:

.. code::

    tlt gesturenet export -m $USER_EXPERIMENT_DIR/model/model.tlt \
                          -k $KEY \
                          -o $USER_EXPERIMENT_DIR/model/model.etlt \
                          -t 'tfonnx'

Deploying to the TLT CV Inference Pipeline
------------------------------------------

The pretrained gesture classification model provided through NGC is available by default inside the TLT CV Inference Pipeline. You can also deploy a model trained through the TLT workflow to the TLT CV Inference Pipeline. Refer to the :ref:`TLT CV Quick Start Scripts` section for instructions on both options.