Emotion Classification¶
EmotionNet is an NVIDIA-developed emotion detection model that is included in the Transfer Learning Toolkit (TLT) as one of the supported tasks. EmotionNet supports the following subtasks:
dataset_convert
train
evaluate
inference
export
These tasks may be invoked from the TLT launcher using the following command-line convention:
tlt emotionnet <sub_task> <args_per_subtask>
where args_per_subtask are the command-line arguments required for a given subtask. Each of these subtasks is explained in detail below.
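As a rough illustration of the convention above, the following Python sketch assembles a launcher command as an argument list. The helper name build_tlt_command, the paths, and the key are hypothetical; only the tlt emotionnet <sub_task> pattern, the subtask names, and the -e, -r, and -k flags come from this page. The sketch only formats the command and does not run it.

```python
import shlex

# Subtasks listed on this page for the emotionnet task.
SUBTASKS = ("dataset_convert", "train", "evaluate", "inference", "export")


def build_tlt_command(sub_task, args):
    """Return the argv list for `tlt emotionnet <sub_task> <args...>`.

    `args` is a list of (flag, value) pairs, e.g. [("-e", "spec.yaml")].
    Hypothetical helper: it formats the command but never executes it.
    """
    if sub_task not in SUBTASKS:
        raise ValueError(f"unsupported subtask: {sub_task}")
    argv = ["tlt", "emotionnet", sub_task]
    for flag, value in args:
        argv.extend([flag, str(value)])
    return argv


# Placeholder paths and key, for illustration only.
cmd = build_tlt_command(
    "train",
    [("-e", "/path/to/spec.yaml"), ("-r", "/path/to/results"), ("-k", "my_key")],
)
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids quoting problems if a path contains spaces.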
Pre-processing the Dataset¶
As described in the Data Annotation Format section, the EmotionNet app requires labels in a defined JSON format, which must be converted to TFRecords before training. The dataset_convert subtask under EmotionNet performs this conversion: it takes the JSON-format data and produces the TFRecords that the EmotionNet model ingests. See the following sections for sample usage.
Sample Usage of the Dataset Converter Tool¶
The labeling JSON data format is the accepted dataset format for EmotionNet. Data in this format must be converted to the TFRecord file format before it is passed to EmotionNet training. Use this command to do the conversion:
tlt emotionnet dataset_convert [-h] -c CONFIG_PATH
The command accepts the following arguments:
-h, --help: Show this help message and exit.
-c, --config_path: Path to the config file (required).
The config file contains various parameters to generate the dataset:
ground_truth_folder_suffix: Suffix of the generated tfrecords folder.
is_filtered: Whether to filter the dataset.
set_name: Name of the processed set.
is_datafactory: Whether to use data factory labels.
sdk_label_folder: If SDK labels are used, the SDK label folder name.
data_path: Root path to the dataset.
num_keypoints: Number of keypoints.
emotion_map: Map between emotion class name and ID.
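To make the parameter list concrete, here is a minimal Python sketch that writes a config file containing those keys. The flat layout, the output file name, and every value (including the class-to-ID assignment in emotion_map) are assumptions for illustration; compare against the sample dataio_config_ckplus.json shipped with the examples before relying on it.

```python
import json
import os
import tempfile

# Keys are the parameters listed above; every value is a placeholder.
config = {
    "ground_truth_folder_suffix": "DataFactory",
    "is_filtered": False,
    "set_name": "ckplus",
    "is_datafactory": True,
    "sdk_label_folder": "",
    "data_path": "/workspace/tlt-experiments/emotionnet/postData",
    "num_keypoints": 68,
    # Assumed ID assignment; the real mapping is defined by your dataset.
    "emotion_map": {
        "neutral": 0,
        "happy": 1,
        "surprise": 2,
        "squint": 3,
        "disgust": 4,
        "scream": 5,
    },
}

# Write to a temporary directory so the sketch has no side effects elsewhere.
out_path = os.path.join(tempfile.mkdtemp(), "dataio_config_example.json")
with open(out_path, "w") as f:
    json.dump(config, f, indent=2)

# Read the file back to confirm it round-trips as valid JSON.
with open(out_path) as f:
    loaded = json.load(f)
print(sorted(loaded["emotion_map"].items(), key=lambda kv: kv[1]))
```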
Here’s an example of using the command with the dataset:
tlt emotionnet dataset_convert -c /workspace/examples/emotionnet/dataset_specs/dataio_config_ckplus.json
Output log from executing dataset_convert:
2021-01-06 18:35:29,690 - __main__ - INFO - Generate Tfrecords for data with required json labels
/workspace/tlt-experiments/emotionnet/postData/ckplus/Ground_Truth_DataFactory/TfRecords
/workspace/tlt-experiments/emotionnet/postData/ckplus/Ground_Truth_DataFactory/GT
2021-01-06 18:35:29,690 - __main__ - INFO - Start to parse data...
2021-01-06 18:35:29,690 - __main__ - INFO - Run full conversion...
/workspace/tlt-experiments/emotionnet/postData/ckplus/GT_user_json
2021-01-06 18:35:29,690 - __main__ - INFO - Convert json file...
2021-01-06 18:35:33,196 - __main__ - INFO - Start to write user tfrecord...
2021-01-06 18:35:33,488 - __main__ - INFO - Start to split data...
/workspace/tlt-experiments/emotionnet/postData/ckplus/Ground_Truth_DataFactory/TfRecords_combined
2021-01-06 18:35:33,489 - __main__ - INFO - Test: ['S051', 'S108', 'S158', 'S149', 'S137', 'S032',
'S066', 'S046', 'S097', 'S504', 'S091']
2021-01-06 18:35:33,489 - __main__ - INFO - Validation ['S094', 'S122', 'S082', 'S147', 'S060', 'S042',
'S096', 'S014', 'S083', 'S089', 'S113']
2021-01-06 18:35:33,489 - __main__ - INFO - Train ['S005', 'S129', 'S157', 'S068', 'S063', 'S111',
'S044', 'S074', 'S139', 'S011', 'S127', 'S155',
'S105', 'S010', 'S154', 'S061', 'S088', 'S125',
'S101', 'S062', 'S090', 'S160', 'S106', 'S131',
'S078', 'S895', 'S112', 'S092', 'S071', 'S126',
'S087', 'S148', 'S057', 'S128', 'S080', 'S506',
'S052', 'S029', 'S081', 'S055', 'S095', 'S079',
'S502', 'S116', 'S099', 'S076', 'S098', 'S053',
'S093', 'S136', 'S065', 'S085', 'S059', 'S156',
'S100', 'S064', 'S501', 'S077', 'S505', 'S037',
'S110', 'S069', 'S026', 'S124', 'S028', 'S058',
'S067', 'S050', 'S084', 'S138', 'S070', 'S073',
'S132', 'S135', 'S151', 'S119', 'S034', 'S133',
'S086', 'S109', 'S107', 'S503', 'S114', 'S056',
'S134', 'S045', 'S035', 'S072', 'S115', 'S022',
'S075', 'S102', 'S130', 'S054', 'S117', 'S999']
/workspace/tlt-experiments/emotionnet/postData/ckplus/Ground_Truth_DataFactory/GT_combined
Creating an Experiment Specification File¶
To do training, evaluation, and inference for EmotionNet, several components need to be configured, each with its own parameters. The emotionnet train and emotionnet evaluate commands for an EmotionNet experiment share the same configuration file.
The training and evaluation tools use an experiment specification file for emotion detection. The specification file consists of the following components:
Trainer
Model
Loss
Optimizer
Dataloader
Trainer¶
Here’s a sample list of parameters to configure the EmotionNet trainer.
__class_name__: EmotionNetTrainer
checkpoint_dir: null
random_seed: 42
log_every_n_secs: 10
checkpoint_n_epoch: 1
num_epoch: 100
infrequent_summary_every_n_steps: 0
use_landmarks_input: True
class_list: ['neutral',
             'happy',
             'surprise',
             'squint',
             'disgust',
             'scream']
dataloader:
  ...
model:
  ...
loss:
  ...
optimizer:
  ...
The following table describes the trainer parameters:
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
__class_name__ | string | EmotionNetTrainer | Name for the trainer specification section | EmotionNetTrainer |
checkpoint_dir | string | | Path to the checkpoint. If not specified, all checkpoints are saved in the output folder | NA |
random_seed | int | | Random seed used during the experiments | NA |
log_every_n_secs | int | | Log the training output every n seconds | NA |
checkpoint_n_epoch | int | | Save a checkpoint every n epochs | 1 to num_epoch |
num_epoch | int | | Number of epochs | NA |
infrequent_summary_every_n_steps | int | | Infrequent summary every n epoch | 0 to num_epoch |
use_landmarks_input | boolean | | Whether input is landmarks (in TLT 3.0, only landmarks input is supported) | True/False |
class_list | list | ‘neutral’, ‘happy’, ‘surprise’, ‘squint’, ‘disgust’, ‘scream’ | List of emotion classes | NA |
Model¶
Here’s a sample model config to instantiate an EmotionNet model with pretrained weights and the number of frozen blocks.
model:
  __class_name__: EmotionNetModel
  model_parameters:
    use_batch_norm: True
    data_format: channels_first
    regularization_type: l2
    regularization_factor: 0.0015
    bias_regularizer: null
    use_landmarks_input: True
    activation_type: 'relu'
    dropout_rate: 0.3
    num_class: 6
    pretrained_model_path: null
    frozen_blocks: 2
The following table describes the model parameters:
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
__class_name__ | string | EmotionNetModel | Name of the model configuration section | NA |
use_batch_norm | boolean | | Whether to use batch normalization layers | True/False |
data_format | string | | Input data format | channels_first/channels_last |
regularization_type | string | | Type of the regularization | l1/l2/None |
regularization_factor | float | | Factor of the regularization | 0.0-1.0 |
bias_regularizer | float | | Regularizer to apply a penalty on the layer’s bias | l1/l2/None |
use_landmarks_input | boolean | | Whether input is landmarks (in TLT 3.0, only landmarks input is supported) | True/False |
activation_type | string | | Type of the activation | relu, sigmoid |
dropout_rate | float | | Dropout probability | 0.0-1.0 |
num_class | int | | Number of emotion classes | 6 |
pretrained_model_path | string | | Path to the pretrained model | NA |
frozen_blocks | int | | Number of blocks that are frozen during training. If this value is larger than 0, provide a pretrained model. | 0,1,2,3,4,5 |
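The constraints in the table can be checked before launching a run. The sketch below is a hypothetical pre-flight check, not part of TLT: the function name check_model_params is invented here, it expects the model parameters as a flat dictionary, and the rule that num_class should equal the length of the trainer's class_list is an assumption based on the sample values (6 classes in both places).

```python
def check_model_params(params, class_list):
    """Return a list of human-readable problems found in the model parameters.

    Hypothetical helper: `params` is a flat dict of model parameters and
    `class_list` is the trainer's list of emotion classes.
    """
    problems = []

    # Assumption: the number of classes should match the trainer's class_list.
    if params.get("num_class") != len(class_list):
        problems.append("num_class does not match len(class_list)")

    # From the table: frozen_blocks is 0-5 and needs a pretrained model if > 0.
    frozen = params.get("frozen_blocks", 0)
    if frozen not in range(0, 6):
        problems.append("frozen_blocks must be between 0 and 5")
    elif frozen > 0 and not params.get("pretrained_model_path"):
        problems.append("frozen_blocks > 0 requires pretrained_model_path")

    # From the table: dropout_rate and regularization_factor lie in [0.0, 1.0].
    for name in ("dropout_rate", "regularization_factor"):
        value = params.get(name, 0.0)
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be between 0.0 and 1.0")

    return problems


classes = ["neutral", "happy", "surprise", "squint", "disgust", "scream"]
sample = {
    "num_class": 6,
    "dropout_rate": 0.3,
    "regularization_factor": 0.0015,
    "pretrained_model_path": None,
    "frozen_blocks": 2,
}
print(check_model_params(sample, classes))
```

With the sample values shown earlier (frozen_blocks: 2 and pretrained_model_path: null), this check reports that a pretrained model path is needed, which is a reminder to set pretrained_model_path when adapting the sample.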
Loss¶
This section helps you configure the cost function to select the type of loss.
loss:
  __class_name__: EmotionNetLoss
  loss_function_name: CE
  class_weights_dict: None
The following table describes the parameters used to configure loss:
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
__class_name__ | string | EmotionNetLoss | Name of the loss section | NA |
loss_function_name | string | | Type of the loss function | CE (cross entropy loss), BCE (binary cross entropy loss), MSE (mean square error) |
class_weights_dict | dict | | | |
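For readers unfamiliar with the three loss names, the following Python sketch computes each one on a single six-class prediction using the generic textbook definitions. It is not TLT's implementation: how TLT reduces or normalizes the loss, and how class_weights_dict is applied, is not described on this page.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(probs, target):
    """CE: negative log-likelihood of the one-hot target."""
    return -sum(t * math.log(p + EPS) for p, t in zip(probs, target))


def binary_cross_entropy(probs, target):
    """BCE: each class treated as an independent yes/no decision, then averaged."""
    terms = [
        t * math.log(p + EPS) + (1 - t) * math.log(1 - p + EPS)
        for p, t in zip(probs, target)
    ]
    return -sum(terms) / len(terms)


def mean_squared_error(probs, target):
    """MSE: average squared difference between prediction and target."""
    return sum((p - t) ** 2 for p, t in zip(probs, target)) / len(probs)


# One prediction over six classes; the true class is index 0.
probs = [0.7, 0.1, 0.05, 0.05, 0.05, 0.05]
target = [1, 0, 0, 0, 0, 0]

ce = cross_entropy(probs, target)
bce = binary_cross_entropy(probs, target)
mse = mean_squared_error(probs, target)
print(ce, bce, mse)
```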
Optimizer¶
This section helps you configure the optimizer and learning rate schedule:
optimizer:
  __class_name__: AdamOptimizer
  beta1: 0.9
  beta2: 0.999
  epsilon: 1.0e-08
  learning_rate_schedule:
    __class_name__: SoftstartAnnealingLearningRateSchedule
    soft_start: 0.2
    annealing: 0.8
    base_learning_rate: 0.0002
    min_learning_rate: 2.0e-07
    last_step: 953801
The following table describes the parameters used to configure optimizer:
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
__class_name__ | string | AdamOptimizer | Type of optimizer | AdamOptimizer, AdadeltaOptimizer, GradientDescentOptimizer |
beta1 | float | | The exponential decay rate for the 1st moment estimates | 0-1 |
beta2 | float | | The exponential decay rate for the 2nd moment estimates | 0-1 |
epsilon | float | | A small constant for numerical stability | NA |
learning_rate_schedule | structure | | Type of learning rate schedule | SoftstartAnnealingLearningRateSchedule, ConstantLearningRateSchedule, ExponentialDecayLearningRateSchedule |
The following table describes the parameters used to configure the learning rate schedule:
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
__class_name__ | string | SoftstartAnnealingLearningRateSchedule | Name of the learning rate schedule section | SoftstartAnnealingLearningRateSchedule: soft starting and ending learning rate value; ConstantLearningRateSchedule: constant learning rate value; ExponentialDecayLearningRateSchedule: exponentially decaying learning rate |
soft_start | float | | Fraction of last_step taken before the learning rate reaches base_learning_rate | 0-1 |
annealing | float | | Fraction of last_step after which the learning rate ramps down from base_learning_rate | 0-1 |
base_learning_rate | float | | Learning rate | 0-1 |
min_learning_rate | float | | Minimum value the learning rate will be set to | 0-1 |
last_step | int | | Last step the schedule is made for | NA |
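To show how soft_start, annealing, base_learning_rate, min_learning_rate, and last_step interact, here is a small Python sketch of a soft-start annealing schedule. The three phases (ramp up, hold, ramp down) follow the parameter descriptions above, but the shape of the ramps is an assumption: this sketch interpolates in log space between min_learning_rate and base_learning_rate, and the curve TLT actually uses is not stated on this page.

```python
import math


def softstart_annealing_lr(step, last_step, base_lr, min_lr, soft_start, annealing):
    """Illustrative learning rate at `step` (log-space ramps are an assumption)."""
    # Fraction of the schedule completed, clamped to [0, 1].
    progress = min(max(step / last_step, 0.0), 1.0)

    if progress < soft_start:
        # Soft start: ramp up from min_lr to base_lr.
        t = progress / soft_start
        return math.exp(math.log(min_lr) + t * (math.log(base_lr) - math.log(min_lr)))

    if progress > annealing:
        # Annealing: ramp down from base_lr to min_lr.
        t = (progress - annealing) / (1.0 - annealing)
        return math.exp(math.log(base_lr) + t * (math.log(min_lr) - math.log(base_lr)))

    # Between soft_start and annealing the rate stays at base_lr.
    return base_lr


# Values taken from the sample optimizer configuration.
schedule = dict(last_step=953801, base_lr=0.0002, min_lr=2.0e-07,
                soft_start=0.2, annealing=0.8)

for fraction in (0.0, 0.1, 0.2, 0.5, 0.8, 0.9, 1.0):
    step = fraction * schedule["last_step"]
    print(fraction, softstart_annealing_lr(step, **schedule))
```

With the sample values, the rate holds at base_learning_rate for the middle 60% of training (from 20% to 80% of last_step).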
Dataloader¶
Here’s a sample list of parameters to configure the EmotionNet dataloader.
dataloader:
  __class_name__: EmotionNetDataloader
  batch_size: 64
  face_scale_factor: 1.3
  num_keypoints: 68
  prefetch_num: 3
  image_info:
    image_frame:
      channel: 1
      height: 480
      width: 640
    image_face:
      channel: 1
      height: 224
      width: 224
  dataset_info:
    ...
  kpiset_info:
    ...
The following table describes the dataloader parameters:
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
batch_size | int | | Number of samples per batch | NA |
face_scale_factor | float | | Face scaling factor | 1.0 - 1.5 |
num_keypoints | int | | Number of keypoints for landmarks | 68 |
prefetch_num | int | | Number of prefetch samples | 0 - 8 |
image_info | structure | | Image information specification | Reserved for image input, not supported in TLT 3.0 |
channel | int | | Image channel | Reserved for image input, not supported in TLT 3.0 |
height | int | | Image height | Reserved for image input, not supported in TLT 3.0 |
width | int | | Image width | Reserved for image input, not supported in TLT 3.0 |
dataset_info | structure | | Dataset information specification | NA |
kpiset_info | structure | | KPI dataset information specification | NA |
dataset_info:
  root_path: null
  image_extension: png
  tfrecords_directory_path:
    - /workspace/tlt-experiments/emotionnet/postData
  tfrecords_set_id:
    - s1-x1-faceoms-0
  ground_truth_folder_name:
    - Ground_Truth_DataFactory
  tfrecord_folder_name:
    - TfRecords_combined
  train_file_name: test.tfrecords
  validate_file_name: test.tfrecords
  test_file_name: test.tfrecords
The following table describes the dataset_info parameters:
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
root_path | string | | Root path to the dataset | NA |
image_extension | string | | Extension of the image | Reserved variable (image input is not supported in TLT 3.0) |
tfrecords_directory_path | string | | Path to tfrecords directory | NA |
tfrecords_set_id | string | | Set ID for tfrecords | NA |
ground_truth_folder_name | string | | Ground truth folder name | NA |
tfrecord_folder_name | string | | Tfrecords folder name | NA |
train_file_name | string | | File name of the tfrecords file for training | NA |
validate_file_name | string | | File name of the tfrecords file for validation | NA |
test_file_name | string | | File name of the tfrecords file for testing | NA |
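The dataset_info fields appear to describe one directory path in pieces. The Python sketch below joins them in the order directory, set ID, ground truth folder, tfrecords folder, file name. That order is an assumption inferred from the dataset_convert log earlier on this page (.../postData/ckplus/Ground_Truth_DataFactory/TfRecords_combined); the dataloader's actual lookup logic is not documented here, and the helper name tfrecord_path is hypothetical.

```python
import posixpath

# Maps a split name to the dataset_info key holding its file name.
FILE_KEYS = {
    "train": "train_file_name",
    "validate": "validate_file_name",
    "test": "test_file_name",
}


def tfrecord_path(dataset_info, split="train", fold=0):
    """Compose the assumed tfrecords file path for one fold and split."""
    return posixpath.join(
        dataset_info["tfrecords_directory_path"][fold],
        dataset_info["tfrecords_set_id"][fold],
        dataset_info["ground_truth_folder_name"][fold],
        dataset_info["tfrecord_folder_name"][fold],
        dataset_info[FILE_KEYS[split]],
    )


# Values copied from the sample dataset_info above.
dataset_info = {
    "tfrecords_directory_path": ["/workspace/tlt-experiments/emotionnet/postData"],
    "tfrecords_set_id": ["s1-x1-faceoms-0"],
    "ground_truth_folder_name": ["Ground_Truth_DataFactory"],
    "tfrecord_folder_name": ["TfRecords_combined"],
    "train_file_name": "test.tfrecords",
    "validate_file_name": "test.tfrecords",
    "test_file_name": "test.tfrecords",
}

print(tfrecord_path(dataset_info, "train"))
```

Because the list-valued fields are indexed by fold, adding a second dataset would mean appending one entry to each of the four lists.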
kpiset_info:
  kpi_root_path: null
  kpi_tfrecords_directory_path:
    - /workspace/tlt-experiments/emotionnet/postData
  tfrecords_set_id_kpi:
    - s1-x1-faceoms-0
  ground_truth_folder_name_kpi:
    - Ground_Truth_DataFactory
  tfrecord_folder_name_kpi:
    - TfRecords_combined
  kpi_file_name: test.tfrecords
The following table describes the kpiset_info parameters:
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
kpi_root_path | string | | Root path for KPI dataset | NA |
kpi_tfrecords_directory_path | string | | Path to KPI tfrecords directory | NA |
tfrecords_set_id_kpi | string | | Set ID for KPI tfrecords | NA |
ground_truth_folder_name_kpi | string | | Ground truth folder name for KPI dataset | NA |
tfrecord_folder_name_kpi | string | | KPI tfrecords folder name | NA |
kpi_file_name | string | | KPI tfrecords file name | NA |
Training the Model¶
After following the steps in Pre-processing the Dataset to create TFRecords ingestible by TLT training, and setting up a spec file, you are ready to start training an emotion classification network.
EmotionNet training command:
tlt emotionnet train [-h] -e <spec_file>
                          -r <result directory>
                          -k <key>
Required Arguments¶
-r, --results_dir: Path to a folder where experiment outputs should be written.
-k, --key: User-specific encoding key to save or load a .tlt model.
-e, --experiment_spec_file: Path to the spec file. Absolute path or relative to the working directory.
Optional Arguments¶
-h, --help: Print the help message.
Sample Usage¶
Here is an example command for EmotionNet training:
tlt emotionnet train -r <path_to_experiment_output> \
    -e <path_to_spec_file> \
    -k <key_to_load_the_model>
Note
The tlt emotionnet train tool supports training on inputs with different numbers of fiducial landmark points.
Evaluating the Model¶
Run evaluate on an EmotionNet model:
tlt emotionnet evaluate [-h] -r <result directory>
                             -m <model_file>
                             -e <experiment_spec>
                             -k <key>
Required Arguments¶
-r, --results_dir: Path to a folder where experiment outputs should be written.
-e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as the training spec file.
-m, --model: Path to the model file to use for evaluation. This could be a .tlt model file or a TensorRT engine generated using the export tool.
-k, --key: Encryption key used to decrypt the model. This argument is required only with a .tlt model file.
Optional Arguments¶
-h, --help: Show this help message and exit.
If you have followed the example in Training the Model, you may now evaluate the model using the following command:
tlt emotionnet evaluate -r <path_to_experiment_output> \
    -m <path to the model> \
    -e <path to training spec file> \
    -k <key to load the model>
Use these steps to evaluate on a new test set with labeled ground truth:
Create tfrecords for this test set by following the steps listed in the Pre-processing the Dataset section.
Update the dataloader section of the training experiment spec file so that kpiset_info points to the newly generated tfrecords for the test set. For more information on the dataset config, refer to Creating an Experiment Specification File. The evaluate tool iterates through all the folds in kpiset_info.
kpiset_info:
  kpi_root_path: null
  kpi_tfrecords_directory_path:
    - /path_to_tfrecords_for_kpi_dataset
  tfrecords_set_id_kpi:
    - kpi_dataset
  ground_truth_folder_name_kpi:
    - Ground_Truth_Data_Folder
  tfrecord_folder_name_kpi:
    - TfRecords_folder
  kpi_file_name: test.tfrecords
The rest of the experiment spec file remains the same as the training spec file.
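If the spec is already loaded into a Python dictionary, the kpiset_info update described above can be sketched as follows. The helper name with_new_kpiset is hypothetical, the nesting of kpiset_info under dataloader follows the dataloader sample on this page, and reading or writing the YAML file itself is left out.

```python
import copy


def with_new_kpiset(spec, tfrecords_dir, set_id, gt_folder, tfrecord_folder,
                    file_name="test.tfrecords"):
    """Return a copy of `spec` whose kpiset_info points at a new test set.

    Hypothetical helper: only dataloader.kpiset_info changes; everything
    else is carried over from the training spec unchanged.
    """
    new_spec = copy.deepcopy(spec)
    old = new_spec["dataloader"].get("kpiset_info", {})
    new_spec["dataloader"]["kpiset_info"] = {
        "kpi_root_path": old.get("kpi_root_path"),
        "kpi_tfrecords_directory_path": [tfrecords_dir],
        "tfrecords_set_id_kpi": [set_id],
        "ground_truth_folder_name_kpi": [gt_folder],
        "tfrecord_folder_name_kpi": [tfrecord_folder],
        "kpi_file_name": file_name,
    }
    return new_spec


# A trimmed-down training spec, using values from the samples on this page.
training_spec = {
    "num_epoch": 100,
    "dataloader": {
        "batch_size": 64,
        "kpiset_info": {
            "kpi_root_path": None,
            "kpi_tfrecords_directory_path": ["/workspace/tlt-experiments/emotionnet/postData"],
            "tfrecords_set_id_kpi": ["s1-x1-faceoms-0"],
            "ground_truth_folder_name_kpi": ["Ground_Truth_DataFactory"],
            "tfrecord_folder_name_kpi": ["TfRecords_combined"],
            "kpi_file_name": "test.tfrecords",
        },
    },
}

eval_spec = with_new_kpiset(
    training_spec,
    "/path_to_tfrecords_for_kpi_dataset",
    "kpi_dataset",
    "Ground_Truth_Data_Folder",
    "TfRecords_folder",
)
print(eval_spec["dataloader"]["kpiset_info"])
```

Working on a deep copy keeps the training spec intact, so the same dictionary can still be used to resume or repeat training.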
Run Inference on the Model¶
The inference task for EmotionNet may be used to visualize the emotion class label. An example of the command for this task is shown below:
tlt emotionnet inference -e </path/to/inference/spec/file> \
    -i </path/to/inference/input> \
    -m <model_file> \
    -r <path_to_experiment_output> \
    -o </path/to/inference/output> \
    -k <model key>
Required Parameters¶
-e, --inference_spec: Path to an inference spec file.
-i, --inference_input: The directory of input images or a single image for inference.
-m, --model: Path to the model file to use for inference. This could be a .tlt model file or a TensorRT engine generated using the export tool.
-r, --results_dir: Path to a folder where experiment outputs should be written.
-o, --inference_output: The directory for the output images and labels.
-k, --enc_key: Key to load the model.
Sample usage for the inference sub-task¶
Here’s a sample command to run inference for a testing dataset.
tlt emotionnet inference -e $SPECS_DIR/emotionnet_tlt_pretrain.yaml \
    -i $USER_EXPERIMENT_DIR/inferSamples/001.json \
    -m $USER_EXPERIMENT_DIR/experiment_result/exp1/model.tlt \
    -r $USER_EXPERIMENT_DIR/inferSamples \
    -o $USER_EXPERIMENT_DIR/inferSamples \
    -k encode_key
Exporting the EmotionNet Model¶
Here’s an example of the command line arguments of the export command:
tlt emotionnet export -m <path to the .tlt model file generated by tlt train> \
    -o <path to output file> \
    -t tfonnx \
    -k <key>
Required Arguments¶
-m, --model_filename: Path to the .tlt model file to be exported using export.
-o, --output_filename: Path to the output file.
-k, --key: Key used to save the .tlt model file.
-t, --export_type: Model type to export to. Only ‘tfonnx’ is supported in TLT 3.0.
Sample usage for the export sub-task¶
Here’s a sample command to export an EmotionNet model.
tlt emotionnet export -m $USER_EXPERIMENT_DIR/experiment_result/exp1/model.tlt \
    -o $USER_EXPERIMENT_DIR/experiment_dir_final/emotionnet_onnx.etlt \
    -t tfonnx \
    -k $KEY
Deploying to the TLT CV Inference Pipeline¶
The pretrained model for emotion classification provided through NGC is available by default inside the TLT CV Inference Pipeline. You can also deploy a model trained through the TLT workflow to the TLT CV Inference Pipeline. Refer to the TLT CV Quick Start Scripts section for instructions on both options.