Gesture Recognition

Model Overview

The model described in this card is a classification network that assigns each hand-crop image to one of the following six classes:

thumbs up
fist
stop
ok
two
random

GestureNet is generally cascaded with BodyPoseNet for gesture-based applications.

Model Architecture

This is a classification model with a ResNet-18 backbone (He et al., 2016).

Training Algorithm

The training algorithm optimizes the network to minimize the categorical cross-entropy loss over the gesture classes. This model was trained using the Gesture Recognition training app in TLT v3.0.
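To make the loss concrete, the following is a minimal sketch of categorical cross-entropy for a single sample, written in plain Python. It is not the TLT implementation; the function names and the example scores are made up for illustration, and it assumes the network produces one raw score per class that is turned into probabilities with a softmax.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(probs, true_index, eps=1e-12):
    """Loss for one sample: -log of the probability given to the true class."""
    return -math.log(max(probs[true_index], eps))

# Six illustrative scores, one per class (values are invented).
logits = [2.0, 0.5, 0.1, -1.0, 0.3, 0.0]
probs = softmax(logits)
loss = categorical_cross_entropy(probs, true_index=0)
```

The loss is near zero when the network puts almost all probability on the correct class and grows as that probability shrinks, which is what training drives down.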

Reference

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).

Intended Use

GestureNet is cascaded with a hand-detection network or a body-pose network. For example, BodyPoseNet detects human body joints, which are used to create hand crops, and GestureNet then acts as a classifier that determines the gesture of each cropped hand.
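How a hand crop is derived from the detected joints is not specified here. As one possible sketch, the helper below builds a square crop box around a hand-related keypoint and clamps it to the image bounds. The function name, the choice of a single center keypoint, and the fixed side length are all assumptions for illustration, not part of BodyPoseNet or GestureNet.

```python
def square_crop_box(center_x, center_y, side, image_width, image_height):
    """Return (left, top, right, bottom) of a square crop kept inside the image.

    center_x, center_y: pixel position of a hand keypoint (hypothetical input).
    side: requested crop side length in pixels.
    """
    # The crop can never be larger than the image itself.
    side = min(side, image_width, image_height)
    half = side // 2
    # Shift the box back inside the image if it would cross an edge.
    left = min(max(center_x - half, 0), image_width - side)
    top = min(max(center_y - half, 0), image_height - side)
    return left, top, left + side, top + side

# Example: a keypoint near the top-left corner of a 640 x 480 frame.
box = square_crop_box(50, 40, 100, 640, 480)
```

The resulting region would then be resized to the input resolution the classifier expects before being passed to GestureNet.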

Input

RGB images of size 160 x 160 x 3

Output

Gesture category labels
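As an illustration of turning classifier output into a label, the sketch below picks the highest-scoring class from a list of six scores. It assumes the output is one score per class and that the class order matches the order of the list in the Model Overview; the actual index-to-label mapping of the released model is not stated here and should be checked against the model's own label file. The names LABELS and decode_gesture are hypothetical.

```python
# Assumed class order; verify against the deployed model's label mapping.
LABELS = ["thumbs up", "fist", "stop", "ok", "two", "random"]

def decode_gesture(scores, labels=LABELS):
    """Return (label, score) for the highest-scoring class."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per class")
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]

# Example with invented scores.
label, score = decode_gesture([0.05, 0.70, 0.05, 0.10, 0.05, 0.05])
```

An application could additionally ignore predictions below a confidence threshold or treat the random class as "no gesture", depending on its needs.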

The datasheet for the model is captured in its model card hosted on NGC.