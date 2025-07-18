PoseClassificationNet requires a sequence of skeletons (body poses) as input. The coordinates need to be normalized. For example, 3D joints are produced relative to the root keypoint (i.e., the pelvis) and normalized by the focal length (1200.0 for 1080p). The entrypoint for dataset conversion generates an array of spatio-temporal sequences based on the output JSON metadata from the deepstream-bodypose-3d app.

The input data for training or inference is formatted as a five-dimensional NumPy array of shape (N, C, T, V, M):

N : The number of sequences

C : The number of input channels, which is set to 3 in the NGC model

T : The maximum sequence length in frames, which is 300 (10 seconds for 30 FPS) in the NGC model

V : The number of joint points, set to 34 for the NVIDIA format

M : The number of persons. The pre-trained model assumes a single person, but the network can also support multiple people.
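The layout above can be illustrated with a minimal sketch that normalizes per-person joint sequences and packs them into an (N, C, T, V, M) array. The helper names `normalize_sequence` and `pack_sequences` are hypothetical, and several choices are assumptions rather than the documented behavior of the dataset-conversion entrypoint: the pelvis is taken to be joint index 0, each raw sequence is assumed to arrive as a (T, V, C) array for a single person, and shorter sequences are zero-padded to the maximum length.

```python
import numpy as np

FOCAL_LENGTH = 1200.0  # focal length the text gives for 1080p input
ROOT_JOINT = 0         # assumption: index of the pelvis keypoint


def normalize_sequence(joints, focal_length=FOCAL_LENGTH, root=ROOT_JOINT):
    """Center joints on the root keypoint and divide by the focal length.

    joints: array of shape (T, V, C) holding 3D joints for one person.
    """
    joints = np.asarray(joints, dtype=np.float32)
    centered = joints - joints[:, root:root + 1, :]  # subtract root per frame
    return centered / focal_length


def pack_sequences(sequences, max_frames=300, num_joints=34,
                   num_channels=3, num_persons=1):
    """Pack single-person sequences into an (N, C, T, V, M) array.

    Sequences longer than max_frames are truncated; shorter ones are
    zero-padded (an assumption made for this sketch).
    """
    data = np.zeros(
        (len(sequences), num_channels, max_frames, num_joints, num_persons),
        dtype=np.float32,
    )
    for n, seq in enumerate(sequences):
        seq = normalize_sequence(seq)[:max_frames]
        # Reorder (T, V, C) -> (C, T, V) and store as person 0.
        data[n, :, :seq.shape[0], :, 0] = seq.transpose(2, 0, 1)
    return data
```

With the defaults, two input sequences of any length produce an array of shape (2, 3, 300, 34, 1).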

The output of model inference is an array of N elements that gives the predicted action class for each sequence.

The labels used for training or evaluation are stored as a pickle file that consists of a list of two lists, each containing N elements. The first list contains the N sample names as strings. The second list contains the labeled action class ID of each sequence. The following is an example:

[["xl6vmD0XBS0.json", "OkLnSMGCWSw.json", "IBopZFDKfYk.json", "HpoFylcrYT4.json", "mlAtn_zi0bY.json", ...], [235, 388, 326, 306, 105, ...]]
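A label file in this two-list form could be written and read back with Python's standard `pickle` module, as in the sketch below. The helper names `save_labels` and `load_labels` and any file path passed to them are hypothetical; only the list-of-two-lists structure comes from the description above.

```python
import pickle


def save_labels(path, sample_names, class_ids):
    """Write labels as [[name, ...], [class_id, ...]] to a pickle file."""
    if len(sample_names) != len(class_ids):
        raise ValueError("sample_names and class_ids must have the same length")
    with open(path, "wb") as f:
        pickle.dump([list(sample_names), list(class_ids)], f)


def load_labels(path):
    """Read a label file and return (sample_names, class_ids)."""
    with open(path, "rb") as f:
        sample_names, class_ids = pickle.load(f)
    if len(sample_names) != len(class_ids):
        raise ValueError("label file lists have different lengths")
    return sample_names, class_ids
```

Checking that both lists have the same length on load catches a mismatch between sample names and class IDs before training starts.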

The graph to model skeletons is defined by two configuration parameters:

graph_layout (string): Must be one of the following candidates:

nvidia : 34 joints. For more information, refer to the AR SDK Programming Guide.

openpose : 18 joints. For more information, refer to OpenPose.

human3.6m : 17 joints. For more information, refer to Human3.6M.

ntu-rgb+d : 25 joints. For more information, refer to NTU RGB+D.

ntu_edge : 24 joints. For more information, refer to NTU RGB+D.

coco : 17 joints. For more information, refer to COCO.

graph_strategy (string): Must be one of the following candidates (for more information, refer to the “Partition Strategies” section in this paper):

uniform : Uniform Labeling

distance : Distance Partitioning

spatial : Spatial Configuration
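Because each graph layout fixes the number of joints, the V dimension of the input array has to agree with the chosen layout. The following sketch checks the two parameters before training; the function `check_graph_config` and the two lookup tables are hypothetical helpers built from the values listed above, not part of the toolkit's API.

```python
# Joint counts per layout, taken from the list of graph_layout candidates.
GRAPH_LAYOUT_JOINTS = {
    "nvidia": 34,
    "openpose": 18,
    "human3.6m": 17,
    "ntu-rgb+d": 25,
    "ntu_edge": 24,
    "coco": 17,
}

GRAPH_STRATEGIES = ("uniform", "distance", "spatial")


def check_graph_config(graph_layout, graph_strategy, data_shape=None):
    """Validate the graph parameters and return the expected joint count.

    data_shape: optional (N, C, T, V, M) shape tuple to check against.
    """
    if graph_layout not in GRAPH_LAYOUT_JOINTS:
        raise ValueError(f"unknown graph_layout: {graph_layout!r}")
    if graph_strategy not in GRAPH_STRATEGIES:
        raise ValueError(f"unknown graph_strategy: {graph_strategy!r}")
    num_joints = GRAPH_LAYOUT_JOINTS[graph_layout]
    if data_shape is not None:
        if len(data_shape) != 5:
            raise ValueError("data must have five dimensions (N, C, T, V, M)")
        if data_shape[3] != num_joints:
            raise ValueError(
                f"layout {graph_layout!r} expects V={num_joints}, "
                f"got V={data_shape[3]}"
            )
    return num_joints
```

For example, `check_graph_config("nvidia", "spatial", (2, 3, 300, 34, 1))` passes, while the same call with a `coco` layout raises a `ValueError` because that layout expects 17 joints.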

