CenterPose
CenterPose is a category-level object pose estimation model included in the TAO Toolkit. It supports the following tasks:
train
evaluate
inference
export
These tasks can be invoked from the TAO Toolkit Launcher using the following command-line convention:
tao model centerpose <sub_task> <args_per_subtask>
where args_per_subtask is the set of command-line arguments required for a given subtask. Each subtask is explained in detail in the following sections.
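As a rough illustration of this convention, the sketch below assembles a launcher command as an argument list. The helper name `build_tao_command` is hypothetical and not part of the TAO Toolkit; it only uses the `-e` and `-r` flags and the `key=value` overrides that appear in the commands on this page.

```python
import shlex


def build_tao_command(sub_task, spec_path, results_dir=None, overrides=None):
    """Assemble `tao model centerpose <sub_task> <args_per_subtask>` as a list.

    Hypothetical helper for illustration only; it does not run anything.
    """
    allowed = {"train", "evaluate", "inference", "export"}
    if sub_task not in allowed:
        raise ValueError(f"unsupported sub_task: {sub_task!r}")

    cmd = ["tao", "model", "centerpose", sub_task, "-e", spec_path]
    if results_dir is not None:
        cmd += ["-r", results_dir]
    # Spec overrides are appended as key=value tokens,
    # for example evaluate.checkpoint=/path/to/model.pth
    for key, value in (overrides or {}).items():
        cmd.append(f"{key}={value}")
    return cmd


cmd = build_tao_command(
    "evaluate",
    "/path/to/spec.yaml",
    results_dir="/path/to/results/",
    overrides={"evaluate.checkpoint": "/path/to/model.pth"},
)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` on a machine where the launcher is installed.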
CenterPose expects directories of images and annotated JSON files for training or validation. See the CenterPose Data Format page for more information about the input data format.
The training experiment spec file for CenterPose includes model, train, and dataset parameters.
Here is an example spec file for training a CenterPose model with a fan_small backbone on the bike category of the Google Objectron dataset.
dataset:
  train_data: /path/to/category/train/
  val_data: /path/to/category/val/
  num_classes: 1
  batch_size: 64
  workers: 4
  category: bike
  num_symmetry: 1
  max_objs: 10
train:
  num_gpus: 1
  validation_interval: 20
  checkpoint_interval: 20
  num_epochs: 140
  clip_grad_val: 100.0
  randomseed: 317
  resume_training_checkpoint_path: null
  precision: "fp32"
  optim:
    lr: 6e-05
    lr_steps: [90, 120]
model:
  down_ratio: 4
  use_pretrained: True
  backbone:
    model_type: fan_small
    pretrained_backbone_path: /path/to/your-fan-small-pretrained-model
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
model | dict config | – | The configuration of the model architecture | |
train | dict config | – | The configuration of the training task | |
dataset | dict config | – | The configuration of the dataset | |
evaluate | dict config | – | The configuration of the evaluation task | |
inference | dict config | – | The configuration of the inference task | |
export | dict config | – | The configuration of the ONNX export task | |
gen_trt_engine | dict config | – | The configuration of the TensorRT generation task. Only used in tao deploy | |
encryption_key | string | None | The encryption key to encrypt and decrypt model files | |
results_dir | string | None | The directory where experiment results are saved | |
model
The model parameter provides options to change the CenterPose architecture.
model:
  down_ratio: 4
  use_pretrained: False
  backbone:
    model_type: fan_small
    pretrained_backbone_path: /path/to/your-fan-small-pretrained-model
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
down_ratio | int | 4 | The downscale ratio of the network feature map. | 4 |
use_pretrained | bool | False | A flag specifying whether to initialize the backbone with the pretrained weights. | True, False |
backbone | dict config | | The config for the backbone model type and the path of the pretrained weights. | >0 |
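To make `down_ratio` concrete, the sketch below computes the spatial size of the output feature map for a given input size. It assumes the usual CenterNet-style relationship in which the output map is the input resolution divided by `down_ratio`; the function name `feature_map_size` is hypothetical, and the 512 x 512 input is only an example value.

```python
def feature_map_size(input_height, input_width, down_ratio=4):
    """Return (height, width) of the output feature map.

    Assumes output resolution = input resolution / down_ratio, which is an
    illustrative assumption and not taken from the toolkit source.
    """
    if input_height % down_ratio or input_width % down_ratio:
        raise ValueError("input size should be divisible by down_ratio")
    return input_height // down_ratio, input_width // down_ratio


print(feature_map_size(512, 512))  # (128, 128)
```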
backbone
The backbone parameter provides options to change the CenterPose backbone architecture.
backbone:
  model_type: fan_small
  pretrained_backbone_path: /path/to/your-fan-small-pretrained-model
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
pretrained_backbone_path | string | None | The optional path to the pretrained backbone file. Set this path when using a FAN backbone. | A path string |
model_type | string | DLA34 | The backbone name of the model. DLA34 and FAN backbones are supported. | DLA34, fan_small, |
train
The train parameter defines the hyperparameters of the training process.
train:
  num_gpus: 1
  validation_interval: 20
  checkpoint_interval: 20
  num_epochs: 140
  clip_grad_val: 100.0
  randomseed: 317
  resume_training_checkpoint_path: null
  precision: "fp32"
  optim:
    lr: 6e-05
    lr_steps: [90, 120]
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
num_gpus | unsigned int | 1 | The number of GPUs to use | >0 |
validation_interval | unsigned int | 20 | The epoch interval at which validation is run | >0 |
checkpoint_interval | unsigned int | 20 | The interval at which checkpoints are saved | >0 |
num_epochs | unsigned int | 140 | The total number of epochs to run the experiment | >0 |
clip_grad_val | float | 100.0 | Clips the gradients of the parameters at the specified value | >=0 |
randomseed | int | 317 | The random seed. Setting it to the same value across runs gives identical results | >0 |
resume_training_checkpoint_path | string | | The intermediate PyTorch Lightning checkpoint to resume training from | |
precision | string | fp32 | Specifying "fp16" enables mixed-precision training. Training with fp16 can help save GPU memory. | fp32, fp16 |
optim | dict config | | The config for the optimizer, including the learning rate and the learning rate scheduler | >0 |
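The effect of `clip_grad_val` can be pictured as an element-wise clamp of each gradient to the range [-value, value]. The sketch below shows that idea on a plain list of numbers. It is an illustration of clipping by value in general, not the toolkit's implementation, and it does not model any special meaning the toolkit may give to a value of 0.

```python
def clip_by_value(grads, clip_value):
    """Clamp every gradient element to [-clip_value, clip_value]."""
    if clip_value < 0:
        raise ValueError("clip_value must be >= 0")
    return [max(-clip_value, min(clip_value, g)) for g in grads]


# Hypothetical gradient values, clipped with the default of 100.0.
print(clip_by_value([250.0, -3.5, -1000.0], 100.0))  # [100.0, -3.5, -100.0]
```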
optim
The optim parameter defines the config for the optimizer in training, including the learning rate and learning rate steps.
optim:
  lr: 6e-05
  lr_steps: [90, 120]
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
lr | float | 6e-05 | The initial learning rate for training the model, excluding the backbone | >0.0 |
lr_steps | int list | [90, 120] | The steps at which the scheduler decreases the learning rate | int list |
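The interaction between `lr` and `lr_steps` can be sketched as a step schedule. The decay factor `gamma=0.1` below is an assumption made for illustration (this page does not state the factor), and `lr_at_epoch` is a hypothetical helper, not a toolkit function.

```python
def lr_at_epoch(epoch, base_lr=6e-05, lr_steps=(90, 120), gamma=0.1):
    """Learning rate under a step schedule.

    The rate is multiplied by `gamma` once for every entry of `lr_steps`
    that has been reached. `gamma` is an assumed value.
    """
    steps_passed = sum(1 for step in lr_steps if epoch >= step)
    return base_lr * gamma ** steps_passed


for epoch in (0, 89, 90, 120, 139):
    print(epoch, lr_at_epoch(epoch))
```

Under these assumptions the rate stays at 6e-05 until epoch 90, then drops by a factor of ten, and drops again at epoch 120.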
dataset
The dataset parameter defines the dataset source, training batch size, and dataset settings.
dataset:
  train_data: /path/to/category/train/
  val_data: /path/to/category/val/
  num_classes: 1
  batch_size: 64
  workers: 4
  category: bike
  num_symmetry: 1
  max_objs: 10
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
train_data | string | | The path to the training data | |
val_data | string | | The path to the validation data | |
test_data | string | | The path to the test data | |
inference_data | string | | The path to the inference data | |
num_classes | unsigned int | 1 | The number of categories in the training data. Because CenterPose is a category-level pose estimation method, it supports only 1 class. | 1 |
batch_size | unsigned int | 4 | The batch size for training and validation | >0 |
workers | unsigned int | 8 | The number of parallel workers processing data | >0 |
category | string | | The category name of the training dataset | |
num_symmetry | unsigned int | 1 | The number of symmetric rotations, that is, how many times the 3D bounding box is rotated about the y-axis | >0 |
max_objs | unsigned int | 10 | The maximum number of objects in a single image that are used for training | >0 |
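The supported values listed in the tables above can be checked before launching a run. The following is a minimal sketch that validates a spec held as a nested Python dictionary; `check_centerpose_spec` is a hypothetical helper, it covers only a few of the documented constraints, and the toolkit performs its own validation independently of it.

```python
def check_centerpose_spec(spec):
    """Return a list of problems found in a spec dict (empty if none)."""
    problems = []
    dataset = spec.get("dataset", {})
    train = spec.get("train", {})
    model = spec.get("model", {})

    # CenterPose is category-level, so only one class is supported.
    if dataset.get("num_classes", 1) != 1:
        problems.append("dataset.num_classes must be 1")
    # These dataset fields are documented with supported values > 0.
    for key in ("batch_size", "workers", "num_symmetry", "max_objs"):
        if key in dataset and dataset[key] <= 0:
            problems.append(f"dataset.{key} must be > 0")
    if train.get("precision", "fp32") not in ("fp32", "fp16"):
        problems.append("train.precision must be fp32 or fp16")
    if model.get("down_ratio", 4) != 4:
        problems.append("model.down_ratio must be 4")
    return problems


example_spec = {
    "dataset": {"num_classes": 1, "batch_size": 64, "workers": 4,
                "category": "bike", "num_symmetry": 1, "max_objs": 10},
    "train": {"precision": "fp32"},
    "model": {"down_ratio": 4},
}
print(check_centerpose_spec(example_spec))  # []
```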
To train a CenterPose model, use this command:
tao model centerpose train [-h] -e <experiment_spec>
                           [-r <results_dir>]
                           [-k <key>]
Required Arguments
-e, --experiment_spec: The experiment specification file to set up the training experiment.
Optional Arguments
-r, --results_dir: The path to the folder where the experiment outputs should be written. If this argument is not specified, the results_dir from the spec file is used.
-k, --key: A user-specific encoding key to save or load a .tlt model. If this argument is not specified, the model checkpoint isn't encrypted.
--gpus: The number of GPUs used to run training.
-h, --help: Show this help message and exit.
Sample Usage
Here is an example of the train command:
tao model centerpose train -e /path/to/spec.yaml
Optimizing Resources for Training CenterPose
Training CenterPose on a standard dataset, such as Objectron, requires substantial GPU memory (for example, on V100/A100 GPUs) and CPU memory. The following strategies can help you launch training with limited resources.
Optimize GPU Memory
There are various ways to optimize GPU memory usage. One option is to reduce dataset.batch_size, although this can make training take longer than usual.
Typically, the following options result in a more balanced performance optimization:
Set train.precision to fp16 to enable automatic mixed precision training. This can reduce GPU memory usage and speed up training, but it might affect accuracy.
Try using a more lightweight backbone, such as DLA34.
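The cost of a smaller batch size can be estimated by counting optimizer steps per epoch. The sketch below does that arithmetic; the dataset size of 10,000 images is a made-up number for illustration, and `steps_per_epoch` is a hypothetical helper.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    if batch_size <= 0:
        raise ValueError("batch_size must be > 0")
    return math.ceil(num_samples / batch_size)


# Hypothetical dataset of 10,000 training images.
for batch_size in (64, 16):
    print(batch_size, steps_per_epoch(10_000, batch_size))
```

With these numbers, going from a batch size of 64 to 16 raises the step count per epoch from 157 to 625, which is why a smaller batch size tends to lengthen training.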
evaluate
The evaluate parameter defines the hyperparameters of the evaluation process.
evaluate:
  num_gpus: 1
  checkpoint: /path/to/model.pth
  opencv: False
  eval_num_symmetry: 1
  results_dir: /path/to/saving/directory
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
num_gpus | unsigned int | 1 | The number of GPUs to use | >0 |
checkpoint | string | | Path to the PyTorch model to evaluate | |
opencv | bool | False | If | True, False |
eval_num_symmetry | unsigned int | 1 | For symmetric object categories (e.g. bottle), the estimated bounding box is rotated along the symmetry axis N times (N = 100) and the prediction is evaluated w.r.t. each rotated instance | >0 |
results_dir | string | | Path to the saved evaluation report. Make sure the calibration information is correct before running the evaluation | |
trt_engine | string | | Path to the TensorRT model to evaluate. Should only be used with tao deploy | |
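The rotation described for `eval_num_symmetry` can be sketched as follows. The example assumes that the symmetry axis is the y-axis and that the N rotations are evenly spaced over a full turn; both are illustrative assumptions, and the helper names are hypothetical.

```python
import math


def rotate_about_y(point, angle):
    """Rotate a 3D point about the y-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def symmetric_copies(corners, num_symmetry):
    """Return `num_symmetry` rotated copies of a list of 3D box corners.

    Copy k is rotated by 2*pi*k/num_symmetry (assumed even spacing).
    """
    copies = []
    for k in range(num_symmetry):
        angle = 2.0 * math.pi * k / num_symmetry
        copies.append([rotate_about_y(p, angle) for p in corners])
    return copies


# One corner on the x-axis, rotated into 4 evenly spaced positions.
for copy in symmetric_copies([(1.0, 0.5, 0.0)], 4):
    print([tuple(round(v, 6) for v in p) for p in copy])
```

An evaluation that accounts for symmetry could then score the prediction against each rotated copy and keep the best match.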
To run evaluation with a CenterPose model, use this command:
tao model centerpose evaluate [-h] -e <experiment_spec>
                              [-r <results_dir>]
                              [-k <key>]
                              evaluate.checkpoint=<model to be evaluated>
Required Arguments
-e, --experiment_spec: The experiment spec file to set up the evaluation experiment.
Optional Arguments
-k, --key: A user-specific encoding key to save or load a .tlt model. If this value is not specified, a .pth model must be used.
-r, --results_dir: The directory where the evaluation result is stored.
evaluate.checkpoint: The .tlt or .pth model to be evaluated.
Sample Usage
The following is an example of using the evaluate command:
tao model centerpose evaluate -e /path/to/spec.yaml -r /path/to/results/ evaluate.checkpoint=/path/to/model.pth
inference
The inference parameter defines the hyperparameters of the inference process.
inference:
  checkpoint: /path/to/model.pth
  visualization_threshold: 0.3
  principle_point_x: 300.7
  principle_point_y: 392.8
  focal_length_x: 615.0
  focal_length_y: 615.0
  skew: 0.0
  use_pnp: True
  save_json: True
  save_visualization: True
  opencv: True
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
checkpoint | string | | Path to the PyTorch model to run inference on | |
visualization_threshold | float | 0.3 | Confidence threshold to filter predictions | >=0 |
principle_point_x | float | 300.7 | The principal point x of the intrinsic matrix. Use the correct camera calibration matrix for your data | >0 |
principle_point_y | float | 392.8 | The principal point y of the intrinsic matrix. Use the correct camera calibration matrix for your data | >0 |
focal_length_x | float | 615.0 | The focal length x of the intrinsic matrix. Use the correct camera calibration matrix for your data | >0 |
focal_length_y | float | 615.0 | The focal length y of the intrinsic matrix. Use the correct camera calibration matrix for your data | >0 |
skew | float | 0.0 | The skew of the intrinsic matrix. Use the correct camera calibration matrix for your data | >=0 |
use_pnp | bool | True | Whether to use the PnP algorithm, which uses 2D-3D correspondences to solve the 6-DoF pose | True, False |
save_json | bool | True | Save all the results to a local JSON file, including 2D keypoints, 3D keypoints, location, quaternion, and relative scale | True, False |
save_visualization | bool | True | Save the visualization results to a local .jpg file, including the projected 2D bounding box along with the point order, relative scale, and object pose | True, False |
opencv | bool | False | If | True, False |
trt_engine | string | | Path to the TensorRT model to run inference on. Should only be used with tao deploy | |
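The five calibration values above are the entries of a standard pinhole intrinsic matrix. The sketch below builds that matrix and projects a 3D point in camera coordinates onto the image plane. It uses the textbook pinhole model and hypothetical helper names; it is not taken from the toolkit source, and the argument names simply mirror the spec fields.

```python
def intrinsic_matrix(focal_length_x, focal_length_y,
                     principle_point_x, principle_point_y, skew=0.0):
    """Build the 3x3 pinhole intrinsic matrix K as nested lists."""
    return [
        [focal_length_x, skew, principle_point_x],
        [0.0, focal_length_y, principle_point_y],
        [0.0, 0.0, 1.0],
    ]


def project(K, point):
    """Project a camera-frame 3D point (X, Y, Z) to pixel coordinates (u, v)."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    u = (K[0][0] * X + K[0][1] * Y + K[0][2] * Z) / Z
    v = (K[1][1] * Y + K[1][2] * Z) / Z
    return u, v


K = intrinsic_matrix(615.0, 615.0, 300.7, 392.8, skew=0.0)
# A point on the optical axis projects to the principal point.
print(project(K, (0.0, 0.0, 2.0)))  # (300.7, 392.8)
```

If these values do not match the camera that captured the images, the projected boxes will be offset or scaled incorrectly, which is why the table stresses using the correct calibration.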
The inference tool for CenterPose models can be used to visualize 3D bounding boxes in the 2D image plane, the order of the points, and the relative dimensions of the object. It also generates a frame-by-frame JSON file that records the results for each image.
To run inference with a CenterPose model, use this command:
tao model centerpose inference [-h] -e <experiment spec file>
                               [-r <results_dir>]
                               [-k <key>]
                               inference.checkpoint=<model to be inferenced>
Required Arguments
-e, --experiment_spec: The experiment spec file to set up the inference experiment.
Optional Arguments
-k, --key: A user-specific encoding key to save or load a .tlt model. If this value is not specified, a .pth model must be used.
-r, --results_dir: The directory where the inference result is stored.
inference.checkpoint: The .tlt or .pth model to run inference on.
Sample Usage
The following is an example of using the inference command:
tao model centerpose inference -e /path/to/spec.yaml -r /path/to/results/ inference.checkpoint=/path/to/model.pth
export
The export parameter defines the hyperparameters of the export process.
export:
  gpu_id: 0
  checkpoint: /path/to/model.pth
  onnx_file: /path/to/model.onnx
  input_channel: 3
  input_width: 512
  input_height: 512
  opset_version: 16
  do_constant_folding: True
Parameter | Datatype | Default | Description | Supported Values |
---|---|---|---|---|
gpu_id | unsigned int | 0 | The GPU ID used to convert the .pth model to an ONNX model | >=0 |
checkpoint | string | | The path to the PyTorch model to export | |
onnx_file | string | | The path to the .onnx file | |
input_channel | unsigned int | 3 | The input channel size. Only the value 3 is supported. | 3 |
input_width | unsigned int | 512 | The input width | >0 |
input_height | unsigned int | 512 | The input height | >0 |
opset_version | unsigned int | 16 | The opset version of the exported ONNX model | >0 |
do_constant_folding | bool | True | Whether to execute constant folding. Set to True when the TensorRT version is lower than 8.6 | True, False |
To export a CenterPose model, use this command:
tao model centerpose export [-h] -e <experiment spec file>
                            [-r <results_dir>]
                            [-k <key>]
                            export.checkpoint=<model to export>
                            export.onnx_file=<onnx path>
Required Arguments
-e, --experiment_spec: The path to an experiment spec file.
Optional Arguments
-k, --key: A user-specific encoding key to save or load a .tlt model. If this value is not specified, a .pth model must be used.
-r, --results_dir: The directory where the export result is stored.
export.checkpoint: The .tlt or .pth model to export.
export.onnx_file: The path where the .etlt or .onnx model is saved.
Sample Usage
The following is an example of using the export command:
tao model centerpose export -e /path/to/spec.yaml export.checkpoint=/path/to/model.pth export.onnx_file=/path/to/model.onnx
For deployment, refer to TAO Deploy documentation for CenterPose.