Emotion Classification

EmotionNet is an emotion detection model developed by NVIDIA and included in the Transfer Learning Toolkit (TLT) as one of its supported tasks. EmotionNet supports the following subtasks:

  • dataset_convert

  • train

  • evaluate

  • inference

  • export

These subtasks can be invoked from the TLT launcher using the following command-line convention:

tlt emotionnet <sub_task> <args_per_subtask>

where args_per_subtask are the command-line arguments required for a given subtask. Each of these subtasks is explained in detail below.
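As a rough illustration of this convention, the launcher call can be assembled programmatically. The sketch below uses a hypothetical helper, build_tlt_command, which is not part of TLT. It only builds the argument list and does not run the launcher.

```python
import shlex

# Subtasks listed above for EmotionNet.
EMOTIONNET_SUBTASKS = ("dataset_convert", "train", "evaluate", "inference", "export")


def build_tlt_command(sub_task, args):
    """Return the argv list for `tlt emotionnet <sub_task> <args_per_subtask>`.

    `args` maps a flag (for example "-c") to its value. This helper is
    hypothetical: it only assembles the command and never invokes TLT.
    """
    if sub_task not in EMOTIONNET_SUBTASKS:
        raise ValueError(f"unsupported EmotionNet subtask: {sub_task!r}")
    argv = ["tlt", "emotionnet", sub_task]
    for flag, value in args.items():
        argv.extend([flag, str(value)])
    return argv


# The config path below is a placeholder.
cmd = build_tlt_command("dataset_convert", {"-c": "/path/to/config.json"})
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell quoting problems if it is later passed to subprocess.run.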

Pre-processing the Dataset

As described in the Data Annotation Format section, EmotionNet requires its JSON label data to be converted to TFRecords before training. The dataset_convert subtask performs this conversion: it takes data in the defined JSON format and writes the TFRecords that the EmotionNet model ingests. See the following sections for sample usage.

Sample Usage of the Dataset Converter Tool

The labeling JSON format is the accepted dataset format for EmotionNet. Data in this format must be converted to the TFRecord file format before it is passed to EmotionNet training. Use this command to do the conversion:

tlt emotionnet dataset_convert [-h] -c CONFIG_PATH

The command takes the following arguments:

  • -h, --help: Show this help message and exit.

  • -c, --config_path: Path to the config file (required).

The config file contains various parameters to generate the dataset:

  • ground_truth_folder_suffix: suffix of the generated tfrecords folder.

  • is_filtered: whether to filter the dataset.

  • set_name: name of the processed set.

  • is_datafactory: whether to use data factory labels.

  • sdk_label_folder: if SDK labels are used, the SDK label folder name.

  • data_path: root path to the dataset.

  • num_keypoints: number of keypoints.

  • emotion_map: map between emotion class name and id.
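To make the parameter list concrete, the sketch below assembles a config with these keys and serializes it as JSON. Only the key names come from the list above. Every value is a placeholder, the flat layout is an assumption (the real file may nest these keys), and the emotion ids are assumed to run from 0 to 5 over the class names used in the trainer sample of this document.

```python
import json

# Keys come from the parameter list above; every value is a placeholder
# chosen for illustration, not a default documented by TLT.
dataio_config = {
    "ground_truth_folder_suffix": "DataFactory",
    "is_filtered": False,
    "set_name": "ckplus",
    "is_datafactory": True,
    "sdk_label_folder": "",
    "data_path": "/path/to/dataset/root",
    "num_keypoints": 68,
    # Assumed mapping: class name -> integer id.
    "emotion_map": {
        "neutral": 0,
        "happy": 1,
        "surprise": 2,
        "squint": 3,
        "disgust": 4,
        "scream": 5,
    },
}

# Serialize and confirm the text parses back to the same structure.
config_text = json.dumps(dataio_config, indent=2)
assert json.loads(config_text) == dataio_config
print(config_text)
```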

Here’s an example of using the command with the dataset:

tlt emotionnet dataset_convert -c /workspace/examples/emotionnet/dataset_specs/dataio_config_ckplus.json

Output log from executing dataset_convert:

2021-01-06 18:35:29,690 - __main__ - INFO - Generate Tfrecords for data with required json labels
/workspace/tlt-experiments/emotionnet/postData/ckplus/Ground_Truth_DataFactory/TfRecords
/workspace/tlt-experiments/emotionnet/postData/ckplus/Ground_Truth_DataFactory/GT
2021-01-06 18:35:29,690 - __main__ - INFO - Start to parse data...
2021-01-06 18:35:29,690 - __main__ - INFO - Run full conversion...
/workspace/tlt-experiments/emotionnet/postData/ckplus/GT_user_json
2021-01-06 18:35:29,690 - __main__ - INFO - Convert json file...
2021-01-06 18:35:33,196 - __main__ - INFO - Start to write user tfrecord...
2021-01-06 18:35:33,488 - __main__ - INFO - Start to split data...
/workspace/tlt-experiments/emotionnet/postData/ckplus/Ground_Truth_DataFactory/TfRecords_combined
2021-01-06 18:35:33,489 - __main__ - INFO - Test: ['S051', 'S108', 'S158', 'S149', 'S137', 'S032',
                                                  'S066', 'S046', 'S097', 'S504', 'S091']
2021-01-06 18:35:33,489 - __main__ - INFO - Validation ['S094', 'S122', 'S082', 'S147', 'S060', 'S042',
                                                        'S096', 'S014', 'S083', 'S089', 'S113']
2021-01-06 18:35:33,489 - __main__ - INFO - Train ['S005', 'S129', 'S157', 'S068', 'S063', 'S111',
                                                  'S044', 'S074', 'S139', 'S011', 'S127', 'S155',
                                                  'S105', 'S010', 'S154', 'S061', 'S088', 'S125',
                                                  'S101', 'S062', 'S090', 'S160', 'S106', 'S131',
                                                  'S078', 'S895', 'S112', 'S092', 'S071', 'S126',
                                                  'S087', 'S148', 'S057', 'S128', 'S080', 'S506',
                                                  'S052', 'S029', 'S081', 'S055', 'S095', 'S079',
                                                  'S502', 'S116', 'S099', 'S076', 'S098', 'S053',
                                                  'S093', 'S136', 'S065', 'S085', 'S059', 'S156',
                                                  'S100', 'S064', 'S501', 'S077', 'S505', 'S037',
                                                  'S110', 'S069', 'S026', 'S124', 'S028', 'S058',
                                                  'S067', 'S050', 'S084', 'S138', 'S070', 'S073',
                                                  'S132', 'S135', 'S151', 'S119', 'S034', 'S133',
                                                  'S086', 'S109', 'S107', 'S503', 'S114', 'S056',
                                                  'S134', 'S045', 'S035', 'S072', 'S115', 'S022',
                                                  'S075', 'S102', 'S130', 'S054', 'S117', 'S999']
/workspace/tlt-experiments/emotionnet/postData/ckplus/Ground_Truth_DataFactory/GT_combined

Creating an Experiment Specification File

To do training, evaluation, and inference for EmotionNet, several components need to be configured, each with its own parameters. The emotionnet train and emotionnet evaluate commands for an EmotionNet experiment share the same configuration file.

The training and evaluation tools use an experiment specification file for emotion detection. The specification file consists of the following components:

  • Trainer

  • Model

  • Loss

  • Optimizer

  • Dataloader

Trainer

Here’s a sample list of parameters to configure the EmotionNet trainer:

__class_name__: EmotionNetTrainer
checkpoint_dir: null
random_seed: 42
log_every_n_secs: 10
checkpoint_n_epoch: 1
num_epoch: 100
infrequent_summary_every_n_steps: 0
use_landmarks_input: True
class_list: ['neutral',
              'happy',
              'surprise',
              'squint',
              'disgust',
              'scream']
dataloader:
...
model:
...
loss:
...
optimizer:
...

The following table describes the trainer parameters:

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| __class_name__ | string | EmotionNetTrainer | Name for the trainer specification section | EmotionNetTrainer |
| checkpoint_dir | string | null | Path to the checkpoint. If not specified, all checkpoints are saved in the output folder | NA |
| random_seed | int | 42 | Random seed used during the experiments | NA |
| log_every_n_secs | int | 10 | Log the training output every n seconds | NA |
| checkpoint_n_epoch | int | 1 | Save a checkpoint every n epochs | 1 to num_epoch |
| num_epoch | int | 100 | Number of epochs | NA |
| infrequent_summary_every_n_steps | int | 0 | Infrequent summary every n epochs | 0 to num_epoch |
| use_landmarks_input | boolean | True | Whether input is landmarks (in TLT 3.0, only landmarks input is supported) | True/False |
| class_list | list | ‘neutral’, ‘happy’, ‘surprise’, ‘squint’, ‘disgust’, ‘scream’ | List of emotion classes | NA |
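The supported-value ranges for the trainer can be checked before launching a run. The following is a minimal sketch, assuming the trainer section has already been loaded into a Python dict. The function check_trainer_config is hypothetical and is not part of TLT.

```python
def check_trainer_config(trainer):
    """Return a list of range problems found in a trainer config dict.

    The ranges follow the trainer parameter table; the function itself is
    an illustrative helper, not a TLT API.
    """
    problems = []
    num_epoch = trainer["num_epoch"]
    if not 1 <= trainer["checkpoint_n_epoch"] <= num_epoch:
        problems.append("checkpoint_n_epoch must be between 1 and num_epoch")
    if not 0 <= trainer["infrequent_summary_every_n_steps"] <= num_epoch:
        problems.append(
            "infrequent_summary_every_n_steps must be between 0 and num_epoch"
        )
    return problems


# Values taken from the sample trainer configuration.
trainer = {
    "checkpoint_n_epoch": 1,
    "num_epoch": 100,
    "infrequent_summary_every_n_steps": 0,
}
print(check_trainer_config(trainer))  # prints []
```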

Model

Here’s a sample model config to instantiate an EmotionNet model. It includes the path to the pretrained weights and the number of frozen blocks.

model:
 __class_name__: EmotionNetModel
 model_parameters:
   use_batch_norm: True
   data_format: channels_first
   regularization_type: l2
   regularization_factor: 0.0015
   bias_regularizer: null
   use_landmarks_input: True
   activation_type: 'relu'
   dropout_rate: 0.3
   num_class: 6
   pretrained_model_path: null
   frozen_blocks: 2

The following table describes the model parameters:

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| __class_name__ | string | EmotionNetModel | Name of the model configuration section | NA |
| use_batch_norm | boolean | True | Whether to use batch normalization layers | True/False |
| data_format | string | channels_first | Input data format | channels_first/channels_last |
| regularization_type | string | l2 | Type of the regularization | l1/l2/None |
| regularization_factor | float | 0.0015 | Factor of the regularization | 0.0-1.0 |
| bias_regularizer | float | null | Regularizer to apply a penalty on the layer’s bias | l1/l2/None |
| use_landmarks_input | boolean | True | Whether input is landmarks (in TLT 3.0, only landmarks input is supported) | True/False |
| activation_type | string | relu | Type of the activation | relu, sigmoid |
| dropout_rate | float | 0.3 | Dropout probability | 0.0-1.0 |
| num_class | int | 6 | Number of emotion classes | 6 |
| pretrained_model_path | string | null | Path to the pretrained model | NA |
| frozen_blocks | int | 0 | Number of blocks that are frozen during training. If this value is larger than 0, provide a pretrained model. | 0,1,2,3,4,5 |
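The dependency between frozen_blocks and pretrained_model_path is easy to miss. The sketch below checks it on a dict holding the model_parameters section. The helper check_model_parameters is hypothetical and only encodes the constraints stated in the table.

```python
def check_model_parameters(params):
    """Return a list of problems found in a model_parameters dict.

    Illustrative helper, not a TLT API. It encodes two rules from the
    table: frozen_blocks is one of 0..5, and a value above 0 needs a
    pretrained model path.
    """
    problems = []
    frozen = params.get("frozen_blocks", 0)
    if frozen not in range(0, 6):
        problems.append("frozen_blocks must be one of 0,1,2,3,4,5")
    if frozen > 0 and not params.get("pretrained_model_path"):
        problems.append("frozen_blocks > 0 requires pretrained_model_path")
    return problems


# Values from the sample model config (YAML null becomes None).
sample = {"frozen_blocks": 2, "pretrained_model_path": None}
print(check_model_parameters(sample))
```

Applied to the sample config as written, which sets frozen_blocks: 2 with pretrained_model_path: null, the check reports the missing pretrained model. Setting a model path or using frozen_blocks: 0 clears it.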

Loss

This section configures the cost function, including the type of loss:

loss:
  __class_name__: EmotionNetLoss
  loss_function_name: CE
  class_weights_dict: None

The following table describes the parameters used to configure loss:

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| __class_name__ | string | EmotionNetLoss | Name of the loss section | NA |
| loss_function_name | string | CE | Type of the loss function | CE: cross entropy loss; BCE: binary cross entropy loss; MSE: mean square error |
| class_weights_dict | dict | None | Dictionary of class weights | |
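For reference, the CE option corresponds to the standard cross entropy loss, which for one sample is the negative log of the softmax probability assigned to the true class. The sketch below computes it in plain Python. How TLT applies class_weights_dict is not described here, so the weighting shown (a per-class multiplier keyed by class name) is an assumption, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits, label, class_names, class_weights=None):
    """Cross entropy for one sample, optionally scaled by a class weight.

    Assumption: class_weights maps a class name to a multiplier; classes
    missing from the dict get weight 1.0.
    """
    probs = softmax(logits)
    weight = 1.0
    if class_weights is not None:
        weight = class_weights.get(class_names[label], 1.0)
    return -weight * math.log(probs[label])


class_names = ["neutral", "happy", "surprise", "squint", "disgust", "scream"]
logits = [0.0] * 6  # uniform prediction, so each class has probability 1/6
print(weighted_cross_entropy(logits, 1, class_names))
print(weighted_cross_entropy(logits, 1, class_names, {"happy": 2.0}))
```

With a uniform prediction over six classes the unweighted loss equals ln(6), and a weight of 2.0 doubles it.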

Optimizer

This section helps you configure the optimizer and learning rate schedule:

optimizer:
  __class_name__: AdamOptimizer
  beta1: 0.9
  beta2: 0.999
  epsilon: 1.0e-08
  learning_rate_schedule:
    __class_name__: SoftstartAnnealingLearningRateSchedule
    soft_start: 0.2
    annealing: 0.8
    base_learning_rate: 0.0002
    min_learning_rate: 2.0e-07
    last_step: 953801

The following table describes the parameters used to configure optimizer:

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| __class_name__ | string | AdamOptimizer | Type of optimizer | AdamOptimizer, AdadeltaOptimizer, GradientDescentOptimizer |
| beta1 | float | 0.9 | The exponential decay rate for the 1st moment estimates | 0-1 |
| beta2 | float | 0.999 | The exponential decay rate for the 2nd moment estimates | 0-1 |
| epsilon | float | 1.0e-08 | A small constant for numerical stability | NA |
| learning_rate_schedule | structure | SoftstartAnnealingLearningRateSchedule | Type of learning rate schedule | SoftstartAnnealingLearningRateSchedule, ConstantLearningRateSchedule, ExponentialDecayLearningRateSchedule |

The following table describes the parameters used to configure learning rate schedule:

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| __class_name__ | string | SoftstartAnnealingLearningRateSchedule | Name of the learning rate schedule section | SoftstartAnnealingLearningRateSchedule: soft starting and ending learning rate value; ConstantLearningRateSchedule: constant learning rate value; ExponentialDecayLearningRateSchedule: exponentially decaying learning rate |
| soft_start | float | 0.2 | Fraction of last_step taken before reaching base_learning_rate | 0-1 |
| annealing | float | 0.8 | Fraction of last_step after which the learning rate ramps down from base_learning_rate | 0-1 |
| base_learning_rate | float | 0.0002 | Learning rate | 0-1 |
| min_learning_rate | float | 2.0e-07 | Minimum value the learning rate will be set to | 0-1 |
| last_step | int | 953801 | Last step the schedule is made for | NA |
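The interaction of soft_start, annealing, and last_step is easier to see in code. The sketch below ramps the rate up from min_learning_rate to base_learning_rate over the first soft_start fraction of training, holds it, then ramps it back down after the annealing fraction. The log-space interpolation used for both ramps is an assumption, since the exact ramp shape is not stated here, and the function name is hypothetical.

```python
import math


def softstart_annealing_lr(step, last_step, soft_start, annealing, base_lr, min_lr):
    """Learning rate at `step` for a soft-start / annealing schedule.

    Assumption: both ramps interpolate linearly in log space between
    min_lr and base_lr. This is a sketch, not TLT's implementation.
    """
    progress = min(max(step / last_step, 0.0), 1.0)
    if progress < soft_start:
        # Warm-up: min_lr -> base_lr.
        frac = progress / soft_start
        return math.exp(math.log(min_lr) + frac * (math.log(base_lr) - math.log(min_lr)))
    if progress > annealing:
        # Annealing: base_lr -> min_lr.
        frac = (progress - annealing) / (1.0 - annealing)
        return math.exp(math.log(base_lr) + frac * (math.log(min_lr) - math.log(base_lr)))
    # Plateau between the two ramps.
    return base_lr


# Values from the sample optimizer configuration.
schedule = dict(last_step=953801, soft_start=0.2, annealing=0.8,
                base_lr=0.0002, min_lr=2.0e-07)
for step in (0, 95380, 476900, 858421, 953801):
    print(step, softstart_annealing_lr(step, **schedule))
```

With the sample values, the rate stays at base_learning_rate for the middle 60% of the steps.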

Dataloader

Here’s a sample list of parameters to configure the EmotionNet dataloader:

dataloader:
  __class_name__: EmotionNetDataloader
  batch_size: 64
  face_scale_factor: 1.3
  num_keypoints: 68
  prefetch_num: 3
  image_info:
    image_frame:
      channel: 1
      height: 480
      width: 640
    image_face:
      channel: 1
      height: 224
      width: 224
  dataset_info:
  ...
  kpiset_info:
  ...

The following table describes the dataloader parameters:

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| batch_size | int | 64 | Number of samples per batch | NA |
| face_scale_factor | float | 1.3 | Face scaling factor | 1.0 - 1.5 |
| num_keypoints | int | 68 | Number of keypoints for landmarks | 68 |
| prefetch_num | int | 3 | Number of prefetch samples | 0 - 8 |
| image_info | structure | NA | Image information specification | Reserved for image input, not supported in TLT 3.0 |
| channel | int | NA | Image channel | Reserved for image input, not supported in TLT 3.0 |
| height | int | NA | Image height | Reserved for image input, not supported in TLT 3.0 |
| width | int | NA | Image width | Reserved for image input, not supported in TLT 3.0 |
| dataset_info | structure | NA | Dataset information specification | NA |
| kpiset_info | structure | NA | KPI dataset information specification | NA |

dataset_info:
  root_path: null
  image_extension: png
  tfrecords_directory_path:
  - /workspace/tlt-experiments/emotionnet/postData
  tfrecords_set_id:
  - s1-x1-faceoms-0
  ground_truth_folder_name:
  - Ground_Truth_DataFactory
  tfrecord_folder_name:
  - TfRecords_combined
  train_file_name: test.tfrecords
  validate_file_name: test.tfrecords
  test_file_name: test.tfrecords

The following table describes the dataset_info parameters:

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| root_path | string | NA | Root path to the dataset | NA |
| image_extension | string | png | Extension of the image | Reserved variable (image input is not supported in TLT 3.0) |
| tfrecords_directory_path | string | NA | Path to tfrecords directory | NA |
| tfrecords_set_id | string | NA | Set ID for tfrecords | NA |
| ground_truth_folder_name | string | NA | Ground truth folder name | NA |
| tfrecord_folder_name | string | NA | Tfrecords folder name | NA |
| train_file_name | string | NA | File name for tfrecords file for training | NA |
| validate_file_name | string | NA | File name for tfrecords file for validation | NA |
| test_file_name | string | NA | File name for tfrecords file for testing | NA |
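The dataset_info fields appear to combine into a single TFRecords path. The sketch below joins them in the order directory, set ID, ground truth folder, TFRecords folder, file name. This layout is an assumption inferred from the directories printed in the dataset_convert log (for example .../postData/ckplus/Ground_Truth_DataFactory/TfRecords_combined), and the helper tfrecord_paths is hypothetical.

```python
import posixpath


def tfrecord_paths(dataset_info, file_key="train_file_name"):
    """Build one TFRecords path per configured set.

    Assumed layout (not confirmed by the documentation):
    <directory>/<set_id>/<ground_truth_folder>/<tfrecord_folder>/<file>
    """
    paths = []
    for directory, set_id, gt_folder, tf_folder in zip(
        dataset_info["tfrecords_directory_path"],
        dataset_info["tfrecords_set_id"],
        dataset_info["ground_truth_folder_name"],
        dataset_info["tfrecord_folder_name"],
    ):
        paths.append(
            posixpath.join(directory, set_id, gt_folder, tf_folder, dataset_info[file_key])
        )
    return paths


# Values from the sample dataset_info section.
dataset_info = {
    "tfrecords_directory_path": ["/workspace/tlt-experiments/emotionnet/postData"],
    "tfrecords_set_id": ["s1-x1-faceoms-0"],
    "ground_truth_folder_name": ["Ground_Truth_DataFactory"],
    "tfrecord_folder_name": ["TfRecords_combined"],
    "train_file_name": "test.tfrecords",
}
print(tfrecord_paths(dataset_info))
```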

kpiset_info:
  kpi_root_path: null
  kpi_tfrecords_directory_path:
  - /workspace/tlt-experiments/emotionnet/postData
  tfrecords_set_id_kpi:
  - s1-x1-faceoms-0
  ground_truth_folder_name_kpi:
  - Ground_Truth_DataFactory
  tfrecord_folder_name_kpi:
  - TfRecords_combined
  kpi_file_name: test.tfrecords

The following table describes the kpiset_info parameters:

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| kpi_root_path | string | NA | Root path for KPI dataset | NA |
| kpi_tfrecords_directory_path | string | NA | Path to KPI tfrecords directory | NA |
| tfrecords_set_id_kpi | string | NA | Set ID for KPI tfrecords | NA |
| ground_truth_folder_name_kpi | string | NA | Ground truth folder name for KPI dataset | NA |
| tfrecord_folder_name_kpi | string | NA | KPI tfrecords folder name | NA |
| kpi_file_name | string | NA | KPI tfrecords file name | NA |

Training the Model

After following the steps in Pre-processing the Dataset to create TFRecords that TLT training can ingest, and after setting up a spec file, you are ready to start training an emotion classification network.

EmotionNet training command:

tlt emotionnet train [-h] -e <spec_file>
                          -r <result directory>
                          -k <key>

Required Arguments

  • -r, --results_dir: Path to a folder where experiment outputs should be written.

  • -k, --key: User specific encoding key to save or load a .tlt model.

  • -e, --experiment_spec_file: Path to spec file. Absolute path or relative to working directory.

Optional Arguments

  • -h, --help: Print the help message.

Sample Usage

Here is an example of a command for EmotionNet training:

tlt emotionnet train -r <path_to_experiment_output>
                     -e <path_to_spec_file>
                     -k <key_to_load_the_model>

Note

The tlt emotionnet train tool supports training on inputs with different numbers of fiducial landmark points.

Evaluating the Model

Run the evaluate subtask on an EmotionNet model:

tlt emotionnet evaluate [-h] -r <result directory>
                             -m <model_file>
                             -e <experiment_spec>
                             -k <key>

Required Arguments

  • -r, --results_dir: Path to a folder where experiment outputs should be written.

  • -e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as training spec file.

  • -m, --model: Path to the model file to use for evaluation. This can be a .tlt model file or a TensorRT engine generated using the export tool.

  • -k, --key: The encryption key used to decrypt the model. This argument is required only with a .tlt model file.

Optional Arguments

  • -h, --help: show this help message and exit.

If you have followed the example in Training the Model, you may now evaluate the model using the following command:

tlt emotionnet evaluate -r <path_to_experiment_output>
                        -m <path to the model>
                        -e <path to training spec file>
                        -k <key to load the model>

Use these steps to evaluate on a new test set with ground truth labeled:

  1. Create TFRecords for this test set by following the steps listed in the Pre-processing the Dataset section.

  2. In the dataloader section of the training experiment spec file, update kpiset_info to point to the newly generated TFRecords for the test set. For more information on the dataset config, refer to Creating an Experiment Specification File. The evaluate tool iterates through all the folds in kpiset_info.

kpiset_info:
  kpi_root_path: null
  kpi_tfrecords_directory_path:
  - /path_to_tfrecords_for_kpi_dataset
  tfrecords_set_id_kpi:
  - kpi_dataset
  ground_truth_folder_name_kpi:
  - Ground_Truth_Data_Folder
  tfrecord_folder_name_kpi:
  - TfRecords_folder
  kpi_file_name: test.tfrecords

The rest of the experiment spec file remains the same as the training spec file.
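When evaluating on several test sets, it can help to derive the evaluation spec from the training spec instead of editing it by hand. The sketch below assumes the spec has already been loaded into a Python dict with a top-level dataloader section, as in the trainer sample. The helper with_new_kpi_set is hypothetical, and the paths are the placeholders used above.

```python
import copy


def with_new_kpi_set(spec, directory, set_id, gt_folder, tf_folder,
                     file_name="test.tfrecords"):
    """Return a copy of `spec` whose kpiset_info points at a new test set.

    Illustrative helper, not a TLT API. The input spec is left untouched
    so the training configuration stays as it was.
    """
    new_spec = copy.deepcopy(spec)
    new_spec["dataloader"]["kpiset_info"] = {
        "kpi_root_path": None,
        "kpi_tfrecords_directory_path": [directory],
        "tfrecords_set_id_kpi": [set_id],
        "ground_truth_folder_name_kpi": [gt_folder],
        "tfrecord_folder_name_kpi": [tf_folder],
        "kpi_file_name": file_name,
    }
    return new_spec


# A trimmed stand-in for a loaded training spec.
train_spec = {"num_epoch": 100, "dataloader": {"batch_size": 64, "kpiset_info": {}}}
eval_spec = with_new_kpi_set(
    train_spec,
    "/path_to_tfrecords_for_kpi_dataset",
    "kpi_dataset",
    "Ground_Truth_Data_Folder",
    "TfRecords_folder",
)
print(eval_spec["dataloader"]["kpiset_info"])
```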

Run Inference on the Model

The inference task for EmotionNet can be used to visualize the predicted emotion class label. An example of the command for this task is shown below:

tlt emotionnet inference -e </path/to/inference/spec/file>
                         -i </path/to/inference/input>
                         -m <model_file>
                         -r <path_to_experiment_output>
                         -o </path/to/inference/output>
                         -k <model key>

Required Parameters

  • -e, --inference_spec: Path to an inference spec file.

  • -i, --inference_input: The directory of input images or a single image for inference.

  • -m, --model: Path to the model file to use for inference. This can be a .tlt model file or a TensorRT engine generated using the export tool.

  • -r, --results_dir: Path to a folder where experiment outputs should be written.

  • -o, --inference_output: The directory for the output images and labels.

  • -k, --enc_key: Key to load model.

Sample usage for the inference sub-task

Here’s a sample command to run inference for a testing dataset.

tlt emotionnet inference -e $SPECS_DIR/emotionnet_tlt_pretrain.yaml \
                         -i $USER_EXPERIMENT_DIR/inferSamples/001.json \
                         -m $USER_EXPERIMENT_DIR/experiment_result/exp1/model.tlt \
                         -r $USER_EXPERIMENT_DIR/inferSamples \
                         -o $USER_EXPERIMENT_DIR/inferSamples \
                         -k encode_key

Exporting the EmotionNet Model

Here’s an example of the command line arguments of the export command:

tlt emotionnet export -m <path to the .tlt model file generated by tlt train>
                      -o <path to output file>
                      -t tfonnx
                      -k <key>

Required Arguments

  • -m, --model_filename: Path to the .tlt model file to be exported.

  • -o, --output_filename: Path to the output file.

  • -k, --key: Key used to save the .tlt model file.

  • -t, --export_type: Model type to export to. Only ‘tfonnx’ is supported in TLT 3.0.

Sample usage for the export sub-task

Here’s a sample command to export an EmotionNet model.

tlt emotionnet export -m $USER_EXPERIMENT_DIR/experiment_result/exp1/model.tlt \
                      -o $USER_EXPERIMENT_DIR/experiment_dir_final/emotionnet_onnx.etlt \
                      -t tfonnx \
                      -k $KEY

Deploying to the TLT CV Inference Pipeline

The pretrained model for emotion classification provided through NGC is available by default inside the TLT CV Inference Pipeline. You can also deploy a model trained through the TLT workflow to the TLT CV Inference Pipeline. Refer to the TLT CV Quick Start Scripts section for instructions on both options.