Gesture Recognition
===================

Model Overview
--------------

The model described in this card is a classification network that classifies hand-crop images into one of six gesture classes:

* thumbs up
* fist
* stop
* ok
* two
* random

GestureNet is generally cascaded with BodyPoseNet for gesture-based applications.

Model Architecture
------------------

This is a classification model with a ResNet18 backbone.

Training Algorithm
------------------

The training algorithm optimizes the network to minimize the categorical cross-entropy loss over the classes. This model was trained using the :ref:`Gesture Recognition` training app in TLT v3.0.

Reference
---------

* He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).

Intended Use
------------

GestureNet is cascaded with a hand detection network or a body pose network. For example, BodyPoseNet detects human body joints, which are used to create hand crops, and GestureNet then classifies the gesture shown in each hand crop.

**Input**

- RGB images of 160 x 160 x 3

**Output**

- Gesture category labels

The datasheet for the model is captured in its model card hosted on `NGC`_.

.. _NGC: https://ngc.nvidia.com/catalog/models/nvidia:tlt_gesturenet
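
The training objective and the output decoding described in this card can be illustrated with a minimal, library-free sketch. It is not the TLT implementation. The class order in ``GESTURE_LABELS``, the helper names, and the example scores are all assumptions made for illustration; the actual index-to-label mapping is defined by the trained model.

```python
import math

# Assumed class order, for illustration only; the real index-to-label
# mapping comes with the trained model.
GESTURE_LABELS = ["thumbs_up", "fist", "stop", "ok", "two", "random"]


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index, eps=1e-12):
    """Loss for one sample with a one-hot target: -log p(true class)."""
    return -math.log(max(probs[true_index], eps))


def decode_gesture(probs, labels=GESTURE_LABELS):
    """Return the (label, probability) pair with the highest probability."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical raw scores for one hand crop (not real model output).
logits = [0.2, 0.1, 3.0, 0.4, 0.3, 0.0]
probs = softmax(logits)
label, confidence = decode_gesture(probs)
loss = categorical_cross_entropy(probs, GESTURE_LABELS.index("stop"))
print(label, round(confidence, 3), round(loss, 3))
```

During training, the loss is small when the probability assigned to the true class is high. At inference time, only the highest-probability class is reported as the gesture category label.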