Gaze Estimation
===============

.. _gazenet:

GazeNet is an NVIDIA-developed gaze estimation model included in the Transfer Learning Toolkit (TLT).
GazeNet supports the following tasks:

* :code:`dataset_convert`
* :code:`train`
* :code:`evaluate`
* :code:`inference`
* :code:`export`

These tasks may be invoked from the TLT launcher using the following command-line convention:

.. code::

   tlt gazenet <sub_task> <args_per_subtask>

where :code:`args_per_subtask` are the command-line arguments required for a given subtask. Each subtask is explained
in detail below.

Pre-processing the Dataset
--------------------------

.. _conversion_to_tfrecords_gazenet:

As described in the :ref:`Data Annotation Format<json_label_data_format>` section, the GazeNet app
requires data in a defined JSON format, which must be converted to TFRecords. This can be done using the
:code:`dataset_convert` subtask under GazeNet.

The :code:`dataset_convert` tool takes data in the defined JSON format and converts it to the TFRecords that the
GazeNet model ingests. See the following sections for sample usage.

Sample Usage of the Dataset Converter Tool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. _sample_usage_of_the_dataset_converter_tool_gazenet:

The labeling JSON data format is the accepted dataset format for GazeNet. Data in this
format must be converted to the TFRecord file format for ingestion. Sample usage of the
:code:`dataset_convert` tool is shown below.

.. code::

   tlt gazenet dataset_convert [-h] -folder-suffix TFRECORDS_FOLDER_SUFFIX
                                    -norm_folder_name NORM_DATA_FOLDER_NAME
                                    -sets DATASET_NAME
                                    -data_root_path DATASET_ROOT_PATH

The tool accepts the following arguments:

* :code:`-h, --help`: Show this help message and exit.
* :code:`-folder-suffix, --ground_truth_experiment_folder_suffix`: Suffix of the folder that contains the
  generated :code:`.tfrecords` files.
* :code:`-norm_folder_name, --norm_folder_name`: Name of the folder in which to generate normalized data.
* :code:`-data_root_path, --data_root_path`: Root path to the dataset.
* :code:`-sets, --set_ids`: Name of the dataset.

Here's an example of using the command with the dataset:

.. code::

    tlt gazenet dataset_convert -folder-suffix <tfrecord_folder_suffix> -norm_folder_name <norm data folder name> \
                                -sets sample-set <dataset name> -data_root_path <root path to the dataset>

Output log from executing :code:`tlt gazenet dataset_convert`:

.. code::

  Using TensorFlow backend.

  Test ['p01-1']
  Validation ['p01-0']
  Train ['p01-4', 'p01-3', 'p01-2']
  Test ['p01-1']
  Validation ['p01-0']
  Train ['p01-4', 'p01-3', 'p01-2']


Creating an Experiment Specification File
-----------------------------------------

.. _creating_an_experiment_specification_file_gazenet:

To do training, evaluation, and inference for GazeNet, several components need to be configured, each with
their own parameters. The :code:`train`, :code:`evaluate`, and :code:`inference` tasks for a GazeNet
experiment share the same configuration file.

The specification file for GazeNet training configures the following components of the training pipeline:

* Trainer/Evaluator
* Model
* Loss
* Optimizer
* Dataloader
* Augmentation

Trainer/Evaluator
^^^^^^^^^^^^^^^^^

.. _trainer_evaluator_specification_gazenet:

The GazeNet trainer and evaluator share the same configuration.

Here's a sample configuration for the GazeNet trainer:

.. code::

  __class_name__: GazeNetTrainer
  checkpoint_dir: null
  checkpoint_n_epoch: 1
  dataloader:
    ...
  evaluation_metric: rmse
  network_inputs: face_and_eyes_and_fgrid
  infrequent_summary_every_n_steps: 0
  log_every_n_secs: 10
  model_selection_metric: logcosh
  num_epoch: 2
  random_seed: 42
  hooks: null
  enable_visualization: false
  loss:
    ...
  model:
    ...
  optimizer:
    ...
  visualize_bins_2d: 5
  visualize_bins_3d: 100
  visualize_num_images: 3

The following table describes the parameters used to configure the :code:`trainer`:

+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| **Parameter**                                | **Datatype**     | **Default**                          | **Description**                                                                                            | **Supported Values**                         |
+==============================================+==================+======================================+============================================================================================================+==============================================+
| __class_name__                               | string           | GazeNetTrainer                       | Name for the trainer specification                                                                         | GazeNetTrainer                               |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`checkpoint_dir`                       | string           | :code:`null`                         | Path to the checkpoint. If not specified, will save all checkpoints in the output folder                   | NA                                           |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`checkpoint_n_epoch`                   | int              | :code:`1`                            | Save checkpoint per n number of epochs                                                                     | 1 to num_epoch                               |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`dataloader`                           | structure        | :code:`NA`                           | Dataloader specification                                                                                   | NA                                           |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`evaluation_metric`                    | string           | :code:`rmse`                         | Metric used during KPI testing                                                                             | rmse                                         |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`network_inputs`                       | string           | :code:`face_and_eyes_and_fgrid`      | Input type (only 'face_and_eyes_and_fgrid' is supported in TLT 3.0 release)                                | face_and_eyes_and_fgrid                      |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`infrequent_summary_every_n_steps`     | int              | :code:`0`                            | Infrequent summary every n epoch                                                                           | 0 to num_epoch                               |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`log_every_n_secs`                     | int              | :code:`10`                           | Log the training output for every n secs                                                                   | NA                                           |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`model_selection_metric`               | string           | :code:`logcosh`                      | Metric used to select final model                                                                          | logcosh                                      |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`num_epoch`                            | int              | :code:`40`                           | Number of epochs                                                                                           | NA                                           |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`random_seed`                          | int              | :code:`42`                           | Random seed used during the experiments                                                                    | NA                                           |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`enable_visualization`                 | boolean          | :code:`false`                        | Toggle to enable visualization                                                                             | false/true                                   |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`visualize_bins_2d`                    | int              | :code:`5`                            | Resolution for 2D data distribution visualization                                                          | NA                                           |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`visualize_bins_3d`                    | int              | :code:`100`                          | Resolution for 3D data distribution visualization                                                          | NA                                           |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`visualize_num_images`                 | int              | :code:`3`                            | Number of data images to show on Tensorboard                                                               | NA                                           |
+----------------------------------------------+------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+

Model
^^^^^

.. _model_specification_gazenet:

GazeNet can be configured using the model option in the spec file.

Here's a sample model config that instantiates a GazeNet model with pretrained weights and a set number of frozen blocks.

.. code::

  model:
    __class_name__: GazeNetBaseModel
    model_parameters:
      dropout_rate: 0.25
      frozen_blocks: 5
      num_outputs: 5
      pretrained_model_path: /workspace/tlt-experiments/gazenet/pretrain_models/model.tlt
      regularizer_type: l2
      regularizer_weight: 0.002
      type: GazeNet_public
      use_batch_norm: true

The following table describes the :code:`model` parameters:

+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| **Parameter**                   | **Datatype**     | **Default**                 | **Description**                                                                                            | **Supported Values**                         |
+=================================+==================+=============================+============================================================================================================+==============================================+
| __class_name__                  | string           | GazeNetBaseModel            | Name for the model config                                                                                  | GazeNetBaseModel                             |
+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`dropout_rate`            | float            | :code:`0.3`                 | Probability for drop out                                                                                   | 0.0-1.0                                      |
+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`frozen_blocks`           | int              | :code:`2`                   | This parameter defines how many blocks are frozen during training.                                         | 0,1,2,3,4,5,6                                |
|                                 |                  |                             | If the value for this variable is set to be larger than 0, provide a pretrained model.                     |                                              |
+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`num_outputs`             | int              | :code:`5`                   | Number of outputs                                                                                          | 5                                            |
|                                 |                  |                             | (x, y, z point of regard, and theta, phi gaze vector)                                                      |                                              |
+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`pretrained_model_path`   | string           | :code:`null`                | Path to the pretrained model                                                                               | NA                                           |
+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`regularizer_type`        | string           | :code:`l2`                  | Type of the regularization                                                                                 | l1/l2/None                                   |
+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`regularizer_weight`      | float            | :code:`0.002`               | Factor of the regularization                                                                               | 0.0-1.0                                      |
+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`type`                    | string           | :code:`GazeNet_public`      | Type of supported GazeNet model                                                                            | GazeNet_public                               |
|                                 |                  |                             | Only GazeNet_public is supported in TLT 3.0                                                                |                                              |
+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
| :code:`use_batch_norm`          | boolean          | :code:`true`                | Boolean variable to use batch normalization layers or not                                                  | true/false                                   |
+---------------------------------+------------------+-----------------------------+------------------------------------------------------------------------------------------------------------+----------------------------------------------+
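
The documented ranges can be checked before launching a run. The following is a minimal sketch in
Python, not part of TLT: :code:`check_model_parameters` is a hypothetical helper, and it assumes the
:code:`model_parameters` block has already been loaded into a plain dictionary. The defaults and ranges
come from the table above.

.. code:: python

   def check_model_parameters(params):
       """Return a list of problems found in a model_parameters dictionary."""
       problems = []
       frozen = params.get("frozen_blocks", 2)    # documented default
       dropout = params.get("dropout_rate", 0.3)  # documented default

       # The table lists 0 through 6 as the supported values.
       if frozen not in range(7):
           problems.append("frozen_blocks must be an integer from 0 to 6")
       # Freezing blocks only makes sense when pretrained weights are loaded.
       if frozen > 0 and not params.get("pretrained_model_path"):
           problems.append("frozen_blocks > 0 requires pretrained_model_path")
       if not 0.0 <= dropout <= 1.0:
           problems.append("dropout_rate must be between 0.0 and 1.0")
       return problems


   # Values taken from the sample model config above.
   sample = {
       "dropout_rate": 0.25,
       "frozen_blocks": 5,
       "pretrained_model_path":
           "/workspace/tlt-experiments/gazenet/pretrain_models/model.tlt",
   }
   print(check_model_parameters(sample))  # prints []

An empty list means that none of the checked constraints is violated. The helper does not verify that
the pretrained model file exists.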

Loss
^^^^

This section helps you configure the loss function. The optimizer and its learning rate schedule are
covered in the next section. Here's a sample loss config:

.. code::

  loss:
    __class_name__: GazeLoss
    loss_type: logcosh

The following table describes the :code:`loss` parameters:

+-------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| **Parameter**           | **Datatype**     | **Default**                               | **Description**                                                                                            | **Supported Values**                                                                 |
+=========================+==================+===========================================+============================================================================================================+======================================================================================+
| __class_name__          | string           | GazeLoss                                  | Name of the loss config                                                                                    | NA                                                                                   |
+-------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`loss_type`       | string           | :code:`logcosh`                           | Type of the loss function                                                                                  | l1/rmse/cosine/l1_cosine_joint/l2_cosine_joint/logcosh                               |
|                         |                  |                                           |                                                                                                            | l1: l1 loss                                                                          |
|                         |                  |                                           |                                                                                                            | rmse: root mean square error                                                         |
|                         |                  |                                           |                                                                                                            | l1_cosine_joint: l1 loss for x, y, z point of regard, cosine loss for theta and phi  |
|                         |                  |                                           |                                                                                                            | l2_cosine_joint: l2 loss for x, y, z point of regard, cosine loss for theta and phi  |
|                         |                  |                                           |                                                                                                            | logcosh: log-cosh loss                                                               |
+-------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
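
To give a feel for the default :code:`logcosh` option, here is a minimal sketch of the textbook
log-cosh loss in Python. It is an illustration only: :code:`logcosh_loss` is a hypothetical helper, and
averaging over all elements is an assumption, since this section does not state how :code:`GazeLoss`
reduces over the five outputs or over a batch.

.. code:: python

   import math


   def logcosh_loss(predictions, targets):
       """Mean of log(cosh(prediction - target)) over all elements."""
       errors = [p - t for p, t in zip(predictions, targets)]
       return sum(math.log(math.cosh(e)) for e in errors) / len(errors)


   # Small errors are penalized roughly quadratically (about e**2 / 2) ...
   small = logcosh_loss([0.1], [0.0])
   # ... while large errors are penalized roughly linearly (about |e| - log 2).
   large = logcosh_loss([10.0], [0.0])
   print(small, large)

This mix of quadratic and linear behavior is what makes log-cosh less sensitive to outliers than a
plain squared error.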

Optimizer
^^^^^^^^^

.. code::

  optimizer:
    __class_name__: AdamOptimizer
    beta1: 0.9
    beta2: 0.999
    epsilon: 1.0e-08
    learning_rate_schedule:
      __class_name__: SoftstartAnnealingLearningRateSchedule
      annealing: 0.8
      base_learning_rate: 0.003
      last_step: 263000
      min_learning_rate: 5.0e-07
      soft_start: 0.2

The following table describes the :code:`optimizer` parameters:

+---------------------------------+------------------+---------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| **Parameter**                   | **Datatype**     | **Default**                                       | **Description**                                                                                            | **Supported Values**                                                                 |
+=================================+==================+===================================================+============================================================================================================+======================================================================================+
| __class_name__                  | string           | AdamOptimizer                                     | Name of the optimizer config                                                                               | AdamOptimizer/AdadeltaOptimizer/GradientDescentOptimizer                             |
+---------------------------------+------------------+---------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`beta1`                   | float            | :code:`0.9`                                       | The exponential decay rate for the 1st moment estimates                                                    | 0-1                                                                                  |
+---------------------------------+------------------+---------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`beta2`                   | float            | :code:`0.999`                                     | The exponential decay rate for the 2nd moment estimates                                                    | 0-1                                                                                  |
+---------------------------------+------------------+---------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`epsilon`                 | float            | :code:`1.0e-08`                                   | A small constant for numerical stability                                                                   | NA                                                                                   |
+---------------------------------+------------------+---------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`learning_rate_schedule`  | structure        | :code:`SoftstartAnnealingLearningRateSchedule`    | Type of learning rate schedule                                                                             | SoftstartAnnealingLearningRateSchedule                                               |
|                                 |                  |                                                   |                                                                                                            | ConstantLearningRateSchedule                                                         |
|                                 |                  |                                                   |                                                                                                            | ExponentialDecayLearningRateSchedule                                                 |
+---------------------------------+------------------+---------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
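
The roles of :code:`beta1`, :code:`beta2`, and :code:`epsilon` are easiest to see in a single update
step. The sketch below follows the textbook Adam formulation for one scalar parameter.
:code:`adam_step` is a hypothetical helper written for illustration, and the optimizer used by TLT may
differ in detail, for example in where :code:`epsilon` is applied.

.. code:: python

   import math


   def adam_step(param, grad, m, v, step, learning_rate,
                 beta1=0.9, beta2=0.999, epsilon=1.0e-08):
       """Apply one textbook Adam update to a scalar parameter."""
       m = beta1 * m + (1.0 - beta1) * grad         # 1st moment estimate
       v = beta2 * v + (1.0 - beta2) * grad * grad  # 2nd moment estimate
       m_hat = m / (1.0 - beta1 ** step)            # bias correction
       v_hat = v / (1.0 - beta2 ** step)
       param = param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
       return param, m, v


   # First update (step=1) with the base learning rate from the sample config.
   param, m, v = adam_step(param=1.0, grad=0.5, m=0.0, v=0.0,
                           step=1, learning_rate=0.003)
   print(param)

On the first step the bias-corrected moments reduce to the gradient and its square, so the parameter
moves by approximately :code:`learning_rate` regardless of the gradient's magnitude.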

The following table describes the :code:`learning_rate_schedule` parameters:

+---------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| **Parameter**                   | **Datatype**     | **Default**                               | **Description**                                                                                            | **Supported Values**                                                                 |
+=================================+==================+===========================================+============================================================================================================+======================================================================================+
| __class_name__                  | string           | SoftstartAnnealingLearningRateSchedule    | Name of the learning rate schedule config                                                                  | SoftstartAnnealingLearningRateSchedule                                               |
|                                 |                  |                                           |                                                                                                            | - This scheduling has soft starting and ending learning rate value                   |
|                                 |                  |                                           |                                                                                                            | ConstantLearningRateSchedule                                                         |
|                                 |                  |                                           |                                                                                                            | - This scheduling has constant learning rate value                                   |
|                                 |                  |                                           |                                                                                                            | ExponentialDecayLearningRateSchedule                                                 |
|                                 |                  |                                           |                                                                                                            | - This scheduling has a learning rate that decays exponentially                      |
+---------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`soft_start`              | float            | :code:`0.2`                               | Indicating the fraction of `last_step` that will be taken before reaching the base_learning_rate           | 0-1                                                                                  |
+---------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`annealing`               | float            | :code:`0.8`                               | Indicating the fraction of `last_step` after which the learning rate ramps down from base_learning_rate    | 0-1                                                                                  |
+---------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`base_learning_rate`      | float            | :code:`0.0002`                            | Learning rate                                                                                              | 0-1                                                                                  |
+---------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`min_learning_rate`       | float            | :code:`2.0e-07`                           | Minimum value the learning rate will be set to                                                             | 0-1                                                                                  |
+---------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`last_step`               | int              | :code:`953801`                            | Last step the schedule is made for                                                                         | NA                                                                                   |
+---------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+

GazeNet currently supports the soft-start annealing learning rate schedule. Plotted as a function
of training progress (from 0.0 to 1.0), the learning rate follows this curve:

.. image:: ../../content/learning_rate.png

In this experiment, :code:`soft_start` was set to 0.2 and :code:`annealing` to 0.8, with a minimum
learning rate (:code:`min_learning_rate`) of 2.0e-07 and a maximum learning rate
(:code:`base_learning_rate`) of 0.0002.
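
The shape of this schedule can be sketched in a few lines of Python. This is only an illustration
and not the GazeNet implementation: the function name :code:`soft_start_annealing_lr` is
hypothetical, and the log-linear (exponential) ramps in the warm-up and annealing phases are an
assumption. The plotted curve remains the reference for the actual shape.

.. code:: python

    import math


    def soft_start_annealing_lr(progress, soft_start=0.2, annealing=0.8,
                                base_lr=0.0002, min_lr=2.0e-07):
        """Return the learning rate for a training progress value in [0, 1].

        Assumes 0 < soft_start <= annealing < 1.
        """
        if progress < soft_start:
            # Warm-up: ramp from min_lr up to base_lr (assumed log-linear).
            t = progress / soft_start
            return math.exp(math.log(min_lr) + t * math.log(base_lr / min_lr))
        if progress < annealing:
            # Plateau: hold the base learning rate.
            return base_lr
        # Annealing: ramp from base_lr back down to min_lr (assumed log-linear).
        t = (progress - annealing) / (1.0 - annealing)
        return math.exp(math.log(base_lr) + t * math.log(min_lr / base_lr))


    # Training progress is the current step divided by last_step.
    last_step = 953801
    for step in (0, last_step // 2, last_step):
        print(step, soft_start_annealing_lr(step / last_step))

With the values from the table, the sketch starts at :code:`min_learning_rate`, holds
:code:`base_learning_rate` between 20% and 80% of :code:`last_step`, and returns to
:code:`min_learning_rate` at the final step.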

Dataloader
^^^^^^^^^^

.. _dataloader_specification_gazenet:

The dataloader module defines the dataset, pre-processing, and augmentation parameters used during training. Here
is a sample :code:`dataloader specification` element:

.. code::

  dataloader:
    __class_name__: GazeNetDataloaderAugV2
    augmentation_info:
      ...
    batch_size: 128
    dataset_info:
      ...
    eye_scale_factor: 1.8
    face_scale_factor: 1.3
    filter_phases:
      - training
      - testing
      - validation
      - kpi_testing
    filter_info:
      ...
    image_info:
      facegrid:
        channel: 1
        height: 25
        width: 25
      image_face:
        channel: 1
        height: 224
        width: 224
      image_frame:
        channel: 1
        height: 480
        width: 640
      image_left:
        channel: 1
        height: 224
        width: 224
      image_right:
        channel: 1
        height: 224
        width: 224
    input_normalization_type: zero-one
    kpiset_info:
      ...
    learn_delta: false
    use_head_norm: false
    num_outputs: 5
    theta_phi_degrees: false
    use_narrow_eye: true
    add_test_to: null

The following table describes the :code:`dataloader specification` parameters:

+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| **Parameter**                         | **Datatype**     | **Default**                                             | **Description**                                                                                            | **Supported Values**                                                                 |
+=======================================+==================+=========================================================+============================================================================================================+======================================================================================+
| __class_name__                        | string           | GazeNetDataloaderAugV2                                  | Name of the dataloader specification                                                                       | GazeNetDataloaderAugV2                                                               |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`augmentation_info`             | structure        | :code:`NA`                                              | Augmentation specification                                                                                 | NA                                                                                   |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`batch_size`                    | int              | :code:`128`                                             | Batch size                                                                                                 | >=1                                                                                  |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`dataset_info`                  | structure        | :code:`NA`                                              | Dataset specification                                                                                      | NA                                                                                   |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`eye_scale_factor`              | float            | :code:`1.8`                                             | Scaling factor for eyes (if value is larger than 1, then eye crop is enlarged)                             | NA                                                                                   |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`face_scale_factor`             | float            | :code:`1.3`                                             | Scaling factor for the face (if value is larger than 1, then face crop is enlarged)                        | NA                                                                                   |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`filter_phases`                 | structure        | :code:`training, testing, validation, kpi_testing`      | Phase to apply the filter                                                                                  | training/testing/validation/kpi_testing                                              |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`filter_info`                   | structure        | :code:`NA`                                              | Data filter variables and criteria                                                                         | NA                                                                                   |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`image_info`                    | structure        | :code:`NA`                                              | Input image information                                                                                    | facegrid, image_face, image_frame, image_left, image_right                           |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`channel`                       | int              | :code:`NA`                                              | Number of image channels                                                                                   | NA                                                                                   |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`input_normalization_type`      | string           | :code:`NA`                                              | Input normalization type                                                                                   | zero-one                                                                             |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`kpiset_info`                   | structure        | :code:`NA`                                              | KPI set information                                                                                        | NA                                                                                   |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`learn_delta`                   | boolean          | :code:`false`                                           | Boolean values to enable/disable learning of variable difference                                           | false                                                                                |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`use_head_norm`                 | boolean          | :code:`false`                                           | Enable/disable head normalization                                                                          | false                                                                                |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`num_outputs`                   | int              | :code:`5`                                               | Number of outputs                                                                                          | 5                                                                                    |
|                                       |                  |                                                         | (x, y, z point of regard and theta, phi gaze vector)                                                       |                                                                                      |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`theta_phi_degrees`             | boolean          | :code:`false`                                           | Boolean values to enable/disable theta phi learning                                                        | false                                                                                |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`use_narrow_eye`                | boolean          | :code:`true`                                            | Boolean values to enable/disable tight eye input                                                           | true/false                                                                           |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`add_test_to`                   | string           | :code:`null`                                            | Testing dataset from dataio can be added to training or validation.                                        | null/training/validation                                                             |
|                                       |                  |                                                         | By default, will keep testing dataset for KPI usage                                                        |                                                                                      |
+---------------------------------------+------------------+---------------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
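
The effect of :code:`eye_scale_factor` and :code:`face_scale_factor` on a crop can be illustrated
with a short Python sketch. The helper :code:`scale_crop` is hypothetical, and two details are
assumptions rather than documented behavior: the box is scaled about its center, and no clipping
to the frame boundary is applied.

.. code:: python

    def scale_crop(x, y, width, height, scale_factor):
        """Scale an (x, y, width, height) box about its center.

        A scale_factor larger than 1 enlarges the crop; a value of 1
        leaves it unchanged.
        """
        center_x = x + width / 2.0
        center_y = y + height / 2.0
        new_width = width * scale_factor
        new_height = height * scale_factor
        return (center_x - new_width / 2.0, center_y - new_height / 2.0,
                new_width, new_height)


    # Hypothetical face box inside a 640x480 frame.
    face_box = (200.0, 120.0, 100.0, 100.0)
    print(scale_crop(*face_box, scale_factor=1.3))

With :code:`face_scale_factor: 1.3`, the face crop in this sketch grows by 30% in each dimension
while staying centered on the original box.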

Here is a sample :code:`dataset_info` element:

.. code::

    dataset_info:
      ground_truth_folder_name:
      - Ground_Truth_DataFactory_pipeline
      image_extension: png
      root_path: null
      test_file_name: test.tfrecords
      tfrecord_folder_name:
      - TfRecords_joint_combined
      tfrecords_directory_path:
      - /workspace/tlt-experiments/gazenet/data/MPIIFaceGaze/sample-dataset
      tfrecords_set_id:
      - p01-day03
      train_file_name: train.tfrecords
      validate_file_name: validate.tfrecords

The following table describes the :code:`dataset_info` parameters:

+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| **Parameter**                         | **Datatype**     | **Default**                                     | **Description**                                                                                            | **Supported Values**                                                                 |
+=======================================+==================+=================================================+============================================================================================================+======================================================================================+
| :code:`ground_truth_folder_name`      | string           | :code:`NA`                                      | Ground truth folder name                                                                                   | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`image_extension`               | string           | :code:`NA`                                      | Image extension                                                                                            | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`root_path`                     | string           | :code:`NA`                                      | Root path                                                                                                  | null                                                                                 |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`test_file_name`                | string           | :code:`NA`                                      | File name for test tfrecords                                                                               | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`tfrecord_folder_name`          | string           | :code:`NA`                                      | Tfrecords folder name                                                                                      | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`tfrecords_directory_path`      | string           | :code:`NA`                                      | Path to Tfrecords directory                                                                                | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`tfrecords_set_id`              | string           | :code:`NA`                                      | Set ID                                                                                                     | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`train_file_name`               | string           | :code:`NA`                                      | File name for train tfrecords                                                                              | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`validate_file_name`            | string           | :code:`NA`                                      | File name for validate tfrecords                                                                           | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+

Here is a sample :code:`filter_info` element:

.. code::

    filter_info:
      - desired_val_max: 400.0
        desired_val_min: -400.0
        feature_names:
        - label/gaze_cam_x
      - desired_val_max: 400.0
        desired_val_min: -400.0
        feature_names:
        - label/gaze_cam_y
      - desired_val_max: 300.0
        desired_val_min: -300.0
        feature_names:
        - label/gaze_cam_z

The following table describes the :code:`filter_info` parameters:

+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| **Parameter**                         | **Datatype**     | **Default**                                     | **Description**                                                                                            | **Supported Values**                                                                 |
+=======================================+==================+=================================================+============================================================================================================+======================================================================================+
| :code:`feature_names`                 | string           | :code:`NA`                                      | Feature name                                                                                               | label/gaze_cam_x, label/gaze_cam_y, label/gaze_cam_z                                 |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`desired_val_max`               | float            | :code:`NA`                                      | Maximum value for the feature                                                                              | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`desired_val_min`               | float            | :code:`NA`                                      | Minimum value for the feature                                                                              | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
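
The range check that :code:`filter_info` describes can be sketched in Python as follows. The
function :code:`passes_filters` and the sample feature dictionaries are hypothetical, and treating
both bounds as inclusive is an assumption.

.. code:: python

    def passes_filters(features, filter_info):
        """Return True if every filtered feature lies within its allowed range."""
        for rule in filter_info:
            for name in rule["feature_names"]:
                value = features[name]
                # Assumption: both bounds are inclusive.
                if not rule["desired_val_min"] <= value <= rule["desired_val_max"]:
                    return False
        return True


    # The same rules as in the sample filter_info element.
    filter_info = [
        {"desired_val_max": 400.0, "desired_val_min": -400.0,
         "feature_names": ["label/gaze_cam_x"]},
        {"desired_val_max": 400.0, "desired_val_min": -400.0,
         "feature_names": ["label/gaze_cam_y"]},
        {"desired_val_max": 300.0, "desired_val_min": -300.0,
         "feature_names": ["label/gaze_cam_z"]},
    ]

    # Hypothetical samples: the second one has an out-of-range z value.
    inside = {"label/gaze_cam_x": 120.0, "label/gaze_cam_y": -35.0,
              "label/gaze_cam_z": 10.0}
    outside = {"label/gaze_cam_x": 120.0, "label/gaze_cam_y": -35.0,
               "label/gaze_cam_z": 350.0}

    print(passes_filters(inside, filter_info))
    print(passes_filters(outside, filter_info))

In this sketch, a sample is kept only if every listed feature falls inside its
:code:`desired_val_min` to :code:`desired_val_max` range.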

Here is a sample :code:`kpiset_info` element:

.. code::

    kpiset_info:
      ground_truth_folder_name_kpi:
      - Ground_Truth_DataFactory_pipeline
      kpi_file_name: test.tfrecords
      kpi_root_path: null
      kpi_tfrecords_directory_path:
      - /workspace/tlt-experiments/gazenet/data/MPIIFaceGaze/sample-dataset
      tfrecord_folder_name_kpi:
      - TfRecords_joint_combined
      tfrecords_set_id_kpi:
      - p01-day03

The following table describes the :code:`kpiset_info` parameters:

+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| **Parameter**                         | **Datatype**     | **Default**                                     | **Description**                                                                                            | **Supported Values**                                                                 |
+=======================================+==================+=================================================+============================================================================================================+======================================================================================+
| :code:`ground_truth_folder_name_kpi`  | string           | :code:`NA`                                      | Ground truth folder name for KPI dataset                                                                   | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`kpi_file_name`                 | string           | :code:`NA`                                      | File name for KPI tfrecords                                                                                | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`kpi_root_path`                 | string           | :code:`null`                                    | KPI root path                                                                                              | Reserved value, currently only null is supported                                     |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`kpi_tfrecords_directory_path`  | string           | :code:`NA`                                      | Path to KPI Tfrecords directory                                                                            | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`tfrecord_folder_name_kpi`      | string           | :code:`NA`                                      | KPI tfrecords folder name                                                                                  | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`tfrecords_set_id_kpi`          | string           | :code:`NA`                                      | KPI tfrecords set ID                                                                                       | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+

Augmentation
^^^^^^^^^^^^

.. _augmentation_gazenet:

The augmentation module provides basic pre-processing and augmentation during training. Here
is a sample :code:`augmentation_info` element:

.. code::

    augmentation_info:
      blur_augmentation:
        blur_probability: 0.0
        kernel_sizes:
        - 1
        - 3
        - 5
        - 7
        - 9
      enable_online_augmentation: true
      gamma_augmentation:
        gamma_max: 1.1
        gamma_min: 0.9
        gamma_probability: 0.1
        gamma_type: uniform
      modulus_color_augmentation:
        contrast_center: 127.5
        contrast_scale_max: 0.0
        hue_rotation_max: 0.0
        saturation_shift_max: 0.0
      modulus_spatial_augmentation:
        hflip_probability: 0.5
        zoom_max: 1.0
        zoom_min: 1.0
      random_shift_bbx_augmentation:
        shift_percent_max: 0.16
        shift_probability: 0.9


The following table describes the :code:`augmentation` parameters:

+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| **Parameter**                         | **Datatype**     | **Default**                               | **Description**                                                                                            | **Supported Values**                                                                 |
+=======================================+==================+===========================================+============================================================================================================+======================================================================================+
| __class_name__                        | string           | GazeNetDataloaderAugV2                    | Name of the dataloader specification                                                                       | GazeNetDataloaderAugV2                                                               |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`augmentation_info`             | structure        | :code:`NA`                                | Augmentation specification                                                                                 | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`blur_augmentation`             | structure        | :code:`NA`                                | Blur augmentation specification                                                                            | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`blur_probability`              | float            | :code:`0.0`                               | Probability of applying blur augmentation to data                                                          | 0.0 - 1.0                                                                            |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`kernel_sizes`                  | int              | :code:`1,3,5,7,9`                         | Kernel size for the blur operation                                                                         | 1, 3, 5, 7, 9                                                                        |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`enable_online_augmentation`    | boolean          | :code:`true`                              | Boolean values to enable/disable augmentation                                                              | true/false                                                                           |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`gamma_augmentation`            | structure        | :code:`NA`                                | Gamma augmentation specification                                                                           | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`gamma_max`                     | float            | :code:`1.1`                               | Maximum value of gamma variable                                                                            | 1.0 - 1.4                                                                            |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`gamma_min`                     | float            | :code:`0.9`                               | Minimum value of gamma variable                                                                            | 0.7 - 1.0                                                                            |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`gamma_probability`             | float            | :code:`0.1`                               | Probability of data to apply gamma augmentation                                                            | 0.0 - 1.0                                                                            |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`gamma_type`                    | string           | :code:`uniform`                           | Type of gamma augmentation                                                                                 | uniform                                                                              |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`modulus_color_augmentation`    | structure        | :code:`NA`                                | Color augmentation specification                                                                           | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`contrast_center`               | float            | :code:`127.5`                             | Contrast center for color augmentations                                                                    | 0 - 255                                                                              |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`contrast_scale_max`            | float            | :code:`0.0`                               | Maximum scale of contrast change                                                                           | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`hue_rotation_max`              | float            | :code:`0.0`                               | Maximum hue rotation change                                                                                | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`saturation_shift_max`          | float            | :code:`0.0`                               | Maximum saturation shift change                                                                            | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`modulus_spatial_augmentation`  | structure        | :code:`NA`                                | Spatial augmentation specification                                                                         | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`hflip_probability`             | float            | :code:`0.5`                               | Probability of data to apply horizontal flip                                                               | 0.0 - 1.0                                                                            |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`zoom_max`                      | float            | :code:`1.0`                               | Maximum zoom scale                                                                                         | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`zoom_min`                      | float            | :code:`1.0`                               | Minimum zoom scale                                                                                         | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`random_shift_bbx_augmentation` | structure        | :code:`NA`                                | Random shift augmentation of bounding box                                                                  | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`shift_percent_max`             | float            | :code:`0.16`                              | Maximum percent shift of the bounding box                                                                  | NA                                                                                   |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
| :code:`shift_probability`             | float            | :code:`0.9`                               | Probability of data to apply random shift augmentation                                                     | 0.0 - 1.0                                                                            |
+---------------------------------------+------------------+-------------------------------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+
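
As a further illustration, the fragment below sketches how the sample above might be modified to
turn on blur augmentation and widen the gamma range. The values are illustrative assumptions rather
than tuned recommendations; :code:`blur_probability`, :code:`gamma_min`, and :code:`gamma_max` stay
within the supported ranges listed in the table, and only the changed sub-elements of
:code:`augmentation_info` are shown.

.. code::

    augmentation_info:
      blur_augmentation:
        # Illustrative value: blur roughly half of the training samples
        blur_probability: 0.5
        kernel_sizes:
        - 1
        - 3
        - 5
        - 7
        - 9
      enable_online_augmentation: true
      gamma_augmentation:
        # Illustrative values inside the supported gamma ranges
        gamma_max: 1.2
        gamma_min: 0.8
        gamma_probability: 0.5
        gamma_type: uniform

Going by the parameter description in the table, setting :code:`enable_online_augmentation` to
:code:`false` instead would presumably disable these augmentations altogether.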

Training the Model
------------------

.. _training_the_model_gazenet:

After following the steps in :ref:`Pre-processing the Dataset
<conversion_to_tfrecords_gazenet>` to create TFRecords that TLT training can
ingest, and setting up a :ref:`spec file <creating_an_experiment_specification_file_gazenet>`,
you are ready to start training a gaze estimation network.

GazeNet training command:

.. code::

    tlt gazenet train [-h] -e <spec_file>
                           -r <result directory>
                           -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-r, --results_dir`: Path to a folder where experiment outputs should be written.
* :code:`-k, --key`: User-specific encoding key to save or load a :code:`.tlt` model.
* :code:`-e, --experiment_spec_file`: Path to the spec file, either absolute or relative to the
  working directory.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: Print the help message and exit.

Sample Usage
^^^^^^^^^^^^

Here is an example command for GazeNet training:

.. code::

    tlt gazenet train -e <path_to_spec_file> \
                      -r <path_to_experiment_output> \
                      -k <key_to_load_the_model>

.. Note:: The :code:`tlt gazenet train` tool can support training on images of different resolutions.
          The face, left-eye, and right-eye crops are obtained online through the dataloader. However,
          all input images in a given training run must have the same resolution.


Evaluating the Model
--------------------

.. _evaluating_the_model_gazenet:

Execute :code:`evaluate` on a GazeNet model.

.. code::

    tlt gazenet evaluate [-h] -type <testing dataset type>
                              -m <model_file>
                              -e <experiment_spec>
                              -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: Experiment spec file to set up the evaluation experiment.
  This should be the same as training spec file.
* :code:`-m, --model`: Path to the model file to use for evaluation. This could be a
  :code:`.tlt` model file or a TensorRT engine generated using the export tool.
* :code:`-k, --key`: The encryption key to decrypt the model. This argument is required
  only with a :code:`.tlt` model file.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: Show this help message and exit.

If you have followed the example in :ref:`Training the Model<training_the_model_gazenet>`,
you may now evaluate the model using the following command:

.. code::

    tlt gazenet evaluate -type <testing data type> \
                         -e <path to training spec file> \
                         -m <path to the model> \
                         -k <key to load the model>

.. Note:: This command runs evaluation on the testing/KPI dataset.

Use these steps to evaluate on a new test set with labeled ground truth:

1. Create TFRecords for this test set by following the steps listed in the :ref:`Pre-processing the Dataset
   <conversion_to_tfrecords_gazenet>` section.
2. In the dataloader configuration of the training experiment spec file, update :code:`kpiset_info`
   to point to the newly generated TFRecords for the test set. For more information on the dataset config,
   refer to :ref:`Creating an Experiment Specification File<creating_an_experiment_specification_file_gazenet>`.
   The evaluate tool iterates through all the folds in :code:`kpiset_info`.

.. code::

    kpiset_info:
      ground_truth_folder_name_kpi:
      - Ground_Truth_Folder_Dataset1
      - Ground_Truth_Folder_Dataset2
      kpi_file_name: test.tfrecords
      kpi_root_path: null
      kpi_tfrecords_directory_path:
      - /path_to_kpi_dataset1
      - /path_to_kpi_dataset2
      tfrecord_folder_name_kpi:
      - TfRecords_joint_combined
      - TfRecords_joint_combined
      tfrecords_set_id_kpi:
      - kpi_dataset1
      - kpi_dataset2

The rest of the experiment spec file remains the same as the training spec file.
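
For a single new test set, a minimal sketch of the same element could look like the following. It
assumes that each list holds one entry per dataset, matched by position, as the two-dataset sample
above suggests; the folder name, path, and set ID are hypothetical placeholders.

.. code::

    kpiset_info:
      ground_truth_folder_name_kpi:
      - Ground_Truth_Folder_NewTestSet
      kpi_file_name: test.tfrecords
      kpi_root_path: null
      # Hypothetical location of the newly converted test set
      kpi_tfrecords_directory_path:
      - /path_to_new_test_set
      tfrecord_folder_name_kpi:
      - TfRecords_joint_combined
      tfrecords_set_id_kpi:
      - new_test_set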

Run Inference on the Model
--------------------------

.. _run_inference_on_the_model_gazenet:

The :code:`inference` task for GazeNet may be used to visualize the gaze vector. An
example of the command for this task is shown below:

.. code::

    tlt gazenet inference [-h] -e </path/to/inference/spec/file> \
                               -i </path/to/inference/input> \
                               -m <model_file> \
                               -o </path/to/inference/output> \
                               -k <model key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --inference_spec`: Path to an inference spec file.
* :code:`-i, --inference_input`: The directory of input images or a single image for inference.
* :code:`-m`: Path to the model file to use for inference.
* :code:`-o, --inference_output`: The directory for the output images and labels.
* :code:`-k, --enc_key`: Key to load the model.

Sample usage for the inference sub-task
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here's a sample command to run inference for a testing dataset.

.. code::

    tlt gazenet inference -e $SPECS_DIR/gazenet_tlt_pretrain.yaml \
                          -i $DATA_DOWNLOAD_DIR/inference-set \
                          -m $USER_EXPERIMENT_DIR/experiment_result/exp1/model.tlt \
                          -o $USER_EXPERIMENT_DIR/experiment_result/exp1 \
                          -k $KEY

Exporting the GazeNet Model
---------------------------

Here's an example of the command line arguments of the export command:

.. code::

    tlt gazenet export [-h] -m <path to the .tlt model file generated by tlt train>
                            -o <path to output file>
                            -t tfonnx
                            -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-m, --model_filename`: Path to the :code:`.tlt` model file to be exported using :code:`export`.
* :code:`-o, --output_filename`: Path to the output file for the exported model.
* :code:`-k, --key`: Key used to save the :code:`.tlt` model file.
* :code:`-t, --export_type`: Model type to export to. Only :code:`tfonnx` is supported in TLT 3.0.

Sample usage for the export sub-task
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here's a sample command to export a GazeNet model.

.. code::

    tlt gazenet export -m $USER_EXPERIMENT_DIR/experiment_result/exp1/model.tlt \
                       -o $USER_EXPERIMENT_DIR/experiment_dir_final/gazenet_onnx.etlt \
                       -t tfonnx \
                       -k $KEY

Deploying to the TLT CV Inference Pipeline
------------------------------------------

The pretrained model for gaze estimation provided through NGC is available by default
inside the TLT CV Inference Pipeline.
You can also deploy a model trained through the TLT workflow to the TLT CV Inference Pipeline.
Refer to the :ref:`TLT CV Quick Start Scripts<tlt_cv_quick_start_scripts>` section for instructions on
both options.