PointPillars

PointPillars is a model for 3D object detection on point cloud data. Unlike images, a point cloud is by nature a sparse collection of points in 3D space. Each point cloud sample (example) is called a scene (stored here as a file with the .bin extension). Each scene contains a variable number of points in 3D Euclidean space, so the shape of the data in a single scene is (N, K), where N is the number of points in the scene (generally a variable positive integer) and K is the number of features per point, which must be 4. The features of each point can therefore be represented as (x, y, z, r), where x, y, z, and r are the X coordinate, Y coordinate, Z coordinate, and reflectance (intensity), respectively. All of these values are floating-point numbers, and the reflectance r is a real number in the interval [0.0, 1.0] that represents the fraction of a laser beam's intensity reflected back to the LIDAR from a point in 3D space.
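For example, a scene stored in this format can be loaded with NumPy as an (N, 4) float32 array. The sketch below assumes the points are stored as contiguous little-endian float32 values; the file path and helper name are placeholders.

import numpy as np

def load_scene(bin_path):
    """Load one point cloud scene as an (N, 4) array of x, y, z, reflectance."""
    # Assumption: points are stored as contiguous float32 values, 4 per point.
    return np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)

points = load_scene("lidar/000000.bin")   # placeholder path
print(points.shape)                        # (N, 4); N varies from scene to scene
x, y, z, r = points.T                      # r (reflectance) lies in [0.0, 1.0]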

An object in 3D Euclidean space can be described by a 3D bounding box. Formally, a 3D bounding box is represented by the tuple (x, y, z, dx, dy, dz, yaw). The seven numbers represent the X coordinate of the object center, the Y coordinate of the object center, the Z coordinate of the object center, the length (in the X direction), the width (in the Y direction), the height (in the Z direction), and the orientation (yaw) in 3D Euclidean space, respectively.

To work with the coordinates of points and objects, a coordinate system is required. In TAO Toolkit PointPillars, the coordinate system is defined as follows:

  • The origin of the coordinate system is the center of the LIDAR

  • The X axis points to the front

  • The Y axis points to the left

  • The Z axis points up

  • yaw is the rotation in the horizontal (X-Y) plane, measured counter-clockwise; the X axis corresponds to yaw = 0, the Y axis corresponds to yaw = pi / 2, and so on.

An illustration of the coordinate system is shown below.

                up z    x front (yaw=0)
                   ^   ^
                   |  /
                   | /
(yaw=0.5*pi) left y <------ 0
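To make the bounding-box and yaw conventions above concrete, the sketch below converts a (x, y, z, dx, dy, dz, yaw) box into its 8 corner points by rotating the half-extents counter-clockwise around the Z axis. It is only an illustrative NumPy sketch: treating (x, y, z) as the geometric center of the box is an assumption made for illustration, not a statement about the internal TAO representation.

import numpy as np

def box_to_corners(box):
    """Return the 8 corners of a (x, y, z, dx, dy, dz, yaw) box in LIDAR coordinates."""
    x, y, z, dx, dy, dz, yaw = box
    # Half-extents along X (length), Y (width), Z (height), assuming a centered box.
    corners = np.array([[sx * dx / 2, sy * dy / 2, sz * dz / 2]
                        for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    # Counter-clockwise rotation around Z: yaw = 0 faces +X, yaw = pi/2 faces +Y.
    c, s = np.cos(yaw), np.sin(yaw)
    rot_z = np.array([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]])
    return corners @ rot_z.T + np.array([x, y, z])

# A car 10 m in front of the LIDAR, facing the +Y (left) direction.
print(box_to_corners((10.0, 0.0, -1.0, 3.9, 1.6, 1.56, np.pi / 2)))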

The dataset for PointPillars contains point cloud data and the corresponding annotations of 3D objects. The point cloud data is a directory of point cloud files (with the .bin extension) and the annotations are a directory of text files in KITTI format.

The directory structure should be organized as shown below, where the directory name for the point cloud files has to be lidar and the directory name for the annotations has to be label. The names of the files in the two directories can be arbitrary as long as each .bin file has a unique corresponding .txt file, and vice versa.

/lidar
    0.bin
    1.bin
    ...
/label
    0.txt
    1.txt
    ...

Finally, a train/val split has to be maintained for PointPillars as usual, so both the training dataset and the validation dataset must follow the structure described above. The overall structure should therefore look like the listing below (a pairing check is sketched after it). The exact names train and val are not required but are preferred by convention.

/train
    /lidar
        0.bin
        1.bin
        ...
    /label
        0.txt
        1.txt
        ...
/val
    /lidar
        0.bin
        1.bin
        ...
    /label
        0.txt
        1.txt
        ...
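Because each .bin file must have exactly one matching .txt label file and vice versa, a quick consistency check can catch unpaired files before running dataset_convert. The sketch below assumes the train/val layout shown above; the helper name is hypothetical.

from pathlib import Path

def check_split(split_dir):
    """Verify that every .bin in lidar/ has a matching .txt in label/ and vice versa."""
    lidar = {p.stem for p in (Path(split_dir) / "lidar").glob("*.bin")}
    label = {p.stem for p in (Path(split_dir) / "label").glob("*.txt")}
    unpaired = (lidar - label) | (label - lidar)
    if unpaired:
        raise ValueError(f"{split_dir}: unpaired files {sorted(unpaired)}")
    print(f"{split_dir}: {len(lidar)} paired scenes")

for split in ("train", "val"):
    check_split(split)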

Each .bin file should comply with the format described above. Each .txt label file should comply with the KITTI format. There is one exception in the PointPillars label format compared to the standard KITTI format: although the structure is the same as KITTI, the last field of each object has a different interpretation. In KITTI the last field is Rotation_y (rotation around the Y axis in the camera coordinate system), while in PointPillars it is Rotation_z (rotation around the Z axis in the LIDAR coordinate system).

Below is an example; the values -1.59, -2.35, and -0.03 in the last column should be interpreted differently from standard KITTI.

car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59
cyclist 0.00 0 -2.46 665.45 160.00 717.93 217.99 1.72 0.47 1.65 2.45 1.35 22.10 -2.35
pedestrian 0.00 2 0.21 423.17 173.67 433.17 224.03 1.60 0.38 0.30 -5.87 1.63 23.11 -0.03

Note

The interpretation of PointPillars labels is slightly different from the standard KITTI format. In PointPillars the yaw is the rotation around the Z axis in the LIDAR coordinate system, as defined above, while in the standard KITTI interpretation the yaw is the rotation around the Y axis in the camera coordinate system. In this way, a PointPillars dataset does not depend on camera information or camera calibration.
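As an illustration of this interpretation, the sketch below parses one label line using the standard KITTI field layout (type, truncation, occlusion, alpha, 2D bounding box, dimensions, location, rotation) and reads the last field as Rotation_z, as described above. The helper is hypothetical and only demonstrates the field positions.

def parse_label_line(line):
    """Parse one KITTI-style label line as used by TAO PointPillars."""
    fields = line.split()
    return {
        "type": fields[0],
        "truncated": float(fields[1]),
        "occluded": int(fields[2]),
        "alpha": float(fields[3]),
        "bbox_2d": [float(v) for v in fields[4:8]],      # left, top, right, bottom
        "dimensions": [float(v) for v in fields[8:11]],  # height, width, length (KITTI order)
        "location": [float(v) for v in fields[11:14]],
        # In standard KITTI this last field is rotation_y (camera frame);
        # here it is interpreted as rotation around Z in the LIDAR frame.
        "rotation_z": float(fields[14]),
    }

obj = parse_label_line("car 0.00 0 -1.58 587.01 173.33 614.12 200.12 "
                       "1.65 1.67 3.64 -0.65 1.71 46.70 -1.59")
print(obj["type"], obj["rotation_z"])   # car -1.59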

Once the dataset directory structure above is ready, copy the base names of the split directories into the spec file's dataset.data_split dict. For example,


{ 'train': train, 'test': val }

Also, set the names of the pickle info files in the dataset.info_path parameter. For example,


{ 'train': ['infos_train.pkl'], 'test': ['infos_val.pkl'], }

Once these are set, generate the statistics of the dataset with the dataset_convert command, which produces the pickle info files above. The pickle files are used for data augmentation during the training process.

Converting The Dataset

The pickle info files need to be generated from the original point cloud files and the KITTI text label files. This is accomplished with the following command.


tao model pointpillars dataset_convert -e $SPECS_DIR/pointpillars.yaml

The -e option provides the spec file for training; see below.

The spec file for PointPillars includes the dataset, model, train, evaluate, inference, export and prune parameters. Below is an example spec file for training on the KITTI dataset.


dataset:
  class_names: ['Car', 'Pedestrian', 'Cyclist']
  type: 'GeneralPCDataset'
  data_path: '/path/to/tao-experiments/data/pointpillars'
  data_split: {
      'train': train,
      'test': val
  }
  info_path: {
      'train': [infos_train.pkl],
      'test': [infos_val.pkl],
  }
  balanced_resampling: False
  point_feature_encoding: {
      encoding_type: absolute_coordinates_encoding,
      used_feature_list: ['x', 'y', 'z', 'intensity'],
      src_feature_list: ['x', 'y', 'z', 'intensity'],
  }
  point_cloud_range: [0, -39.68, -3, 69.12, 39.68, 1]
  data_augmentor:
    disable_aug_list: ['placeholder']
    aug_config_list:
      - name: gt_sampling
        db_info_path:
          - dbinfos_train.pkl
        preface: {
            filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
        }
        sample_groups: ['Car:15', 'Pedestrian:15', 'Cyclist:15']
        num_point_features: 4
        disable_with_fake_lidar: False
        remove_extra_width: [0.0, 0.0, 0.0]
        limit_whole_scene: False
      - name: random_world_flip
        along_axis_list: ['x']
      - name: random_world_rotation
        world_rot_angle: [-0.78539816, 0.78539816]
      - name: random_world_scaling
        world_scale_range: [0.95, 1.05]
  data_processor:
    - name: mask_points_and_boxes_outside_range
      remove_outside_boxes: True
    - name: shuffle_points
      shuffle: {
          'train': True,
          'test': False
      }
    - name: transform_points_to_voxels
      voxel_size: [0.16, 0.16, 4]
      max_points_per_voxel: 32
      max_number_of_voxels: {
          'train': 16000,
          'test': 10000
      }
  num_workers: 4
model:
  name: PointPillar
  pretrained_model_path: null
  vfe:
    name: PillarVFE
    with_distance: False
    use_absolue_xyz: True
    use_norm: True
    num_filters: [64]
  map_to_bev:
    name: PointPillarScatter
    num_bev_features: 64
  backbone_2d:
    name: BaseBEVBackbone
    layer_nums: [3, 5, 5]
    layer_strides: [2, 2, 2]
    num_filters: [64, 128, 256]
    upsample_strides: [1, 2, 4]
    num_upsample_filters: [128, 128, 128]
  dense_head:
    name: AnchorHeadSingle
    class_agnostic: False
    use_direction_classifier: True
    dir_offset: 0.78539
    dir_limit_offset: 0.0
    num_dir_bins: 2
    anchor_generator_config: [
        {
            'class_name': 'Car',
            'anchor_sizes': [[3.9, 1.6, 1.56]],
            'anchor_rotations': [0, 1.57],
            'anchor_bottom_heights': [-1.78],
            'align_center': False,
            'feature_map_stride': 2,
            'matched_threshold': 0.6,
            'unmatched_threshold': 0.45
        },
        {
            'class_name': 'Pedestrian',
            'anchor_sizes': [[0.8, 0.6, 1.73]],
            'anchor_rotations': [0, 1.57],
            'anchor_bottom_heights': [-0.6],
            'align_center': False,
            'feature_map_stride': 2,
            'matched_threshold': 0.5,
            'unmatched_threshold': 0.35
        },
        {
            'class_name': 'Cyclist',
            'anchor_sizes': [[1.76, 0.6, 1.73]],
            'anchor_rotations': [0, 1.57],
            'anchor_bottom_heights': [-0.6],
            'align_center': False,
            'feature_map_stride': 2,
            'matched_threshold': 0.5,
            'unmatched_threshold': 0.35
        }
    ]
    target_assigner_config:
      name: AxisAlignedTargetAssigner
      pos_fraction: -1.0
      sample_size: 512
      norm_by_num_examples: False
      match_height: False
      box_coder: ResidualCoder
    loss_config:
      loss_weights: {
          'cls_weight': 1.0,
          'loc_weight': 2.0,
          'dir_weight': 0.2,
          'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
      }
  post_processing:
    recall_thresh_list: [0.3, 0.5, 0.7]
    score_thresh: 0.1
    output_raw_score: False
    eval_metric: kitti
    nms_config:
      multi_classes_nms: False
      nms_type: nms_gpu
      nms_thresh: 0.01
      nms_pre_max_size: 4096
      nms_post_max_size: 500
  sync_bn: False
train:
  batch_size: 4
  num_epochs: 80
  optimizer: adam_onecycle
  lr: 0.003
  weight_decay: 0.01
  momentum: 0.9
  moms: [0.95, 0.85]
  pct_start: 0.4
  div_factor: 10
  decay_step_list: [35, 45]
  lr_decay: 0.1
  lr_clip: 0.0000001
  lr_warmup: False
  warmup_epoch: 1
  grad_norm_clip: 10
  resume_training_checkpoint_path: null
  pruned_model_path: "/path/to/pointpillar_workspace/33/pruned_0.5.tlt"
  tcp_port: 18888
  random_seed: null
  checkpoint_interval: 1
  max_checkpoint_save_num: 30
  merge_all_iters_to_one_epoch: False
evaluate:
  batch_size: 1
  checkpoint: "/path/to/pointpillar_workspace/33/ckpt/checkpoint_epoch_80.tlt"
inference:
  max_points_num: 25000
  batch_size: 1
  checkpoint: "/path/to/pointpillar_workspace/33/ckpt/checkpoint_epoch_80.tlt"
  viz_conf_thresh: 0.1
export:
  gpu_id: 0
  checkpoint: "/path/to/tao-experiments/ckpt/checkpoint_epoch_80.tlt"
  onnx_file: "/path/to/tao-experiments/ckpt/checkpoint_epoch_80.tlt.onnx"
prune:
  model: "/path/to/tlt-experiments/ckpt/checkpoint_epoch_80.tlt"

The top level description of the spec file is provided in the table below.

Parameter Data Type Default Description
dataset Collection The configuration of the dataset
model Collection The configuration of the PointPillars model
train Collection The configuration of the training process
inference Collection The configuration for the inference process
evaluate Collection The configuration for the evaluation process
export Collection The configuration for exporting the model
prune Collection The configuration for pruning the model

Class Names

The class_names parameter provides the list of object class names in the dataset. It is simply a list of strings.

Dataset

The dataset parameter defines the dataset for training and validation/evaluation of the PointPillars model, described below.

Parameter Data Type Default Description
class_names list of strings The list of class names in the dataset
data_path string The path to the dataset
data_split dict The dict that maps the train and test splits to actual directory names
info_path dict The dict that maps the train and test splits to actual pickle info file names
balanced_resampling bool False Whether or not to enable balanced resampling in the data loader
point_feature_encoding Collection The configuration for point feature encoding
point_cloud_range list of floats The point cloud coordinate range in [xmin, ymin, zmin, xmax, ymax, zmax] format
data_augmentor Collection The configuration for data augmentation
data_processor Collection The configuration for data processing
num_workers int 1 The number of workers used for the data loader

Point Feature Encoding

Point feature encoding defines how the features of each point are represented. This parameter is fixed for this version and has to be:

{
    encoding_type: absolute_coordinates_encoding,
    used_feature_list: ['x', 'y', 'z', 'intensity'],
    src_feature_list: ['x', 'y', 'z', 'intensity'],
}

Data Augmentations

Data augmentation pipelines are defined by the parameter data_augmentor. See table below.

Parameter Data Type Default Description
disable_aug_list List of strings ["placeholder"] The list of augmentations to be disabled
aug_config_list List of collections The list of augmentations, whose name should be gt_sampling, random_world_flip, random_world_rotation, random_world_scaling, in that order

The parameters for gt_sampling are provided below.

Parameter Data Type Default Description
name string gt_sampling The name, has to be gt_sampling
db_info_path List of strings dbinfos_train.pkl The list of db infos for sampling
preface dict Preface of the gt sampling
sample_groups List of strings The list of strings that provides per-class sample groups
num_point_features int 4 Number of features for each point
disable_with_fake_lidar bool False Whether the fake LIDAR is enabled
remove_extra_width list of floats Extra widths to remove per-class
limit_whole_scene bool False Whether or not to limit whole scene

The parameters for random_world_flip are described below.

Parameter Data Type Default Description
along_axis_list List of strings The axes along which to flip the coordinates

The parameters for random_world_rotation are described below.

Parameter Data Type Default Description
world_rot_angle List of floats The minimum and maximum world rotation angles

The parameters for random_world_scaling are described below.

Parameter Data Type Default Description
world_scale_range List of floats The minimum and maximum scaling factors

Data Processing

The dataset processing is defined by the data_processor parameter.

Parameter Data Type Default Description
data_processor List of collections The list of data processing steps, which should include mask_points_and_boxes_outside_range, shuffle_points, and transform_points_to_voxels, in that order

The parameters for mask_points_and_boxes_outside_range are described below.

Parameter Data Type Default Description
name string mask_points_and_boxes_outside_range The name, has to be mask_points_and_boxes_outside_range
remove_outside_boxes bool True Whether or not to remove outside boxes

The parameters for shuffle_points are described below.

Parameter Data Type Default Description
name string shuffle_points The name, has to be shuffle_points
shuffle dict {'train': True, 'test': False} Dict to enable/disable shuffling for the train/val datasets

The parameters for transform_points_to_voxels are described below.

Parameter Data Type Default Description
name string transform_points_to_voxels The name, has to be transform_points_to_voxels
voxel_size List of floats Voxel size in the format [dx, dy, dz]
max_points_per_voxel int 32 Maximum number of points per voxel
max_number_of_voxels dict Dict that provides the maximum number of voxels in training and test/validation mode
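To illustrate what transform_points_to_voxels does with these parameters, the sketch below maps each point to a pillar index on the X-Y grid implied by point_cloud_range and voxel_size. It is a simplified illustration of the idea rather than the TAO implementation; the max_points_per_voxel and max_number_of_voxels limits are omitted.

import numpy as np

point_cloud_range = np.array([0, -39.68, -3, 69.12, 39.68, 1])  # xmin..zmax
voxel_size = np.array([0.16, 0.16, 4])                          # dx, dy, dz

def pillar_indices(points):
    """Map (N, 4) points to integer pillar coordinates on the X-Y grid."""
    grid_size = ((point_cloud_range[3:] - point_cloud_range[:3]) / voxel_size).astype(int)
    # grid_size == [432, 496, 1]: one voxel spans the full Z range, so
    # voxelization effectively produces 2D pillars.
    idx = ((points[:, :3] - point_cloud_range[:3]) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < grid_size), axis=1)
    return idx[inside, :2], grid_size

# Random points inside the configured range, for illustration only.
points = np.random.rand(1000, 4) * [69.12, 79.36, 4, 1] + [0, -39.68, -3, 0]
idx, grid = pillar_indices(points.astype(np.float32))
print(grid, idx.shape)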

Model Architecture

The PointPillars model architecture is defined by the model parameter, detailed in the table below.

Parameter Data Type Default Description
name string PointPillar The name, has to be PointPillar
vfe Collection Definition of the voxel feature extractor
map_to_bev Collection Definition of the scatter module
backbone_2d Collection Definition of the 2D backbone
dense_head Collection Definition of the dense head
post_processing Collection Post-processing
sync_bn bool False Enable sync-BN or not

Voxel Feature extractor

The voxel feature extractor is configured by the parameter vfe, described below.

Parameter Data Type Default Description
name string PillarVFE The name, has to be PillarVFE
with_distance bool False With distance or not
use_absolue_xyz bool True Use absolute XYZ coordinates or not
use_norm bool True Use normalization or not
num_filters List of ints [64] The numbers of filters

Scatter

The scattering process is configured by the parameter map_to_bev, described below.

Parameter Data Type Default Description
name string PointPillarScatter The name, has to be PointPillarScatter
num_bev_features int 64 Number of features for bird’s eye view
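A simplified sketch of the scatter step is shown below: the feature vector of each non-empty pillar (64 values with the example spec above) is written back to its X-Y location, producing a dense pseudo-image of shape (num_bev_features, H, W) that the 2D backbone consumes. This is an illustrative NumPy sketch, not the TAO implementation.

import numpy as np

def scatter_to_bev(pillar_features, pillar_coords, grid_size, num_bev_features=64):
    """Scatter per-pillar features into a dense BEV canvas.

    pillar_features: (P, num_bev_features) features of the P non-empty pillars
    pillar_coords:   (P, 2) integer (x_idx, y_idx) of each pillar on the grid
    grid_size:       (nx, ny) size of the X-Y grid
    """
    canvas = np.zeros((num_bev_features, grid_size[1], grid_size[0]), dtype=np.float32)
    canvas[:, pillar_coords[:, 1], pillar_coords[:, 0]] = pillar_features.T
    return canvas  # (num_bev_features, ny, nx) pseudo-image

bev = scatter_to_bev(np.ones((5, 64), dtype=np.float32),
                     np.array([[0, 0], [10, 20], [431, 495], [100, 100], [7, 3]]),
                     grid_size=(432, 496))
print(bev.shape)  # (64, 496, 432)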

2D backbone

The 2D backbone is configured by the parameter backbone_2d, described below.

Parameter Data Type Default Description
name string BaseBEVBackbone The name, has to be BaseBEVBackbone
layer_nums List of ints [3, 5, 5] Numbers of layers
layer_strides List of ints [2, 2, 2] The strides of the layers
num_filters List of ints [64, 128, 256] The numbers of filters
upsample_strides List of ints [1, 2, 4] The upsampling strides
num_upsample_filters List of ints [128, 128, 128] The numbers of upsampling filters

Dense Head

The dense head is configured by the parameter dense_head, described below.

Parameter Data Type Default Description
name string AnchorHeadSingle The name, has to be AnchorHeadSingle
class_agnostic bool False Class agnostic or not
use_direction_classifier bool True Use direction classifier or not
dir_offset float 0.78539 Direction offset
dir_limit_offset float 0.0 Direction limit offset
num_dir_bins int 2 The number of direction bins
anchor_generator_config List of dict The config for per-class anchor generator
target_assigner_config Collection Config for target assigner
loss_config Collection Config for loss function

The anchor_generator_config parameter is a list of dicts. Each dict follows the same format, described below.

{
    'class_name': 'Car',
    'anchor_sizes': [[3.9, 1.6, 1.56]],
    'anchor_rotations': [0, 1.57],
    'anchor_bottom_heights': [-1.78],
    'align_center': False,
    'feature_map_stride': 2,
    'matched_threshold': 0.6,
    'unmatched_threshold': 0.45
}

The parameters of target_assigner_config are described below.

Parameter Data Type Default Description
name string AxisAlignedTargetAssigner The name, has to be AxisAlignedTargetAssigner
pos_fraction float -1.0 Positive fraction
sample_size int 512 Sample size
norm_by_num_examples bool False Normalize by number of examples or not
match_height bool False Match height or not
box_coder string ResidualCoder The name of the box coder

The parameters for loss_config are described below.

Parameter Data Type Default Description
loss_weights dict The dict to provide loss weighting factors

The loss_weights dict should be in the format below.

{
    'cls_weight': 1.0,
    'loc_weight': 2.0,
    'dir_weight': 0.2,
    'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
}
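These weights are typically combined into a single training loss as a weighted sum of the classification, localization, and direction terms, with code_weights scaling the seven box-regression targets individually. The sketch below illustrates that weighting under this assumption; it is not the exact TAO loss code.

import numpy as np

loss_weights = {
    "cls_weight": 1.0, "loc_weight": 2.0, "dir_weight": 0.2,
    "code_weights": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
}

def total_loss(cls_loss, box_reg_losses, dir_loss, w=loss_weights):
    """Combine per-term losses with the configured weights (illustrative only)."""
    loc_loss = float(np.dot(w["code_weights"], box_reg_losses))  # 7 regression targets
    return (w["cls_weight"] * cls_loss
            + w["loc_weight"] * loc_loss
            + w["dir_weight"] * dir_loss)

print(total_loss(0.5, [0.1] * 7, 0.3))  # 0.5*1.0 + 2.0*0.7 + 0.2*0.3 = 1.96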

Post Processing

The post-processing is defined in the parameter post_processing, described below.

Parameter Data Type Default Description
recall_thresh_list List of floats The list of recall thresholds
score_thresh float 0.1 The score threshold
output_raw_score bool False Output raw score or not
eval_metric string kitti The evaluation metric, only kitti is supported
nms_config Collection The NMS config

The Non-Maximum Suppression(NMS) is configured by the nms_config parameter, described below.

Parameter Data Type Default Description
multi_classes_nms bool False Multi-class NMS or not
nms_type string nms_gpu The NMS type
nms_thresh float 0.01 The NMS IoU threshold
nms_pre_max_size int The pre-NMS maximum number of boxes
nms_post_max_size int The post-NMS maximum number of boxes

Training Process

The train parameter defines the hyper-parameters of the training process.

Parameter Datatype Default Description Supported Values
batch_size_per_gpu int 4 The batch size per GPU >=1
num_epochs int 80 The number of epochs to train the model >=1
optimizer string adam_onecycle The optimizer name(type) adam_onecycle
lr float 0.003 The initial learning rate >0.0
weight_decay float 0.01 Weight decay >0.0
momentum float 0.9 Momentum for SGD optimizer >0, <1
moms List of floats [0.95, 0.85] Momentums for One Cycle learning rate scheduler [0.95, 0.85]
pct_start float 0.4 The percentage of the cycle spent increasing the learning rate 0.4
div_factor float 10.0 Division factor 10.0
decay_step_list list of ints [35, 45] The list of epoch numbers at which to decay the learning rate A list whose elements are < num_epochs
lr_decay float 0.1 The decay of learning rate >0, <1
lr_clip float 0.0000001 Minimum value of learning rate >0, <1
lr_warmup bool False Enable learning rate warm up or not True/False
warmup_epoch int 1 Number of epochs to warm up learning rate >=1
grad_norm_clip float 10.0 The limit to apply gradient norm clip >0
resume_model_path string The path of model to resume training Unix path
pretrained_model_path string The path to the pretrained model Unix path
pruned_model_path string The path to the pruned model for retrain Unix path
tcp_port int 18888 TCP port for multi-gpu training 18888
random_seed int Random seed integer
checkpoint_interval int 1 Interval of epochs to save checkpoints >=1
max_checkpoint_save_num int 1 The maximum number of checkpoints to save >=1
merge_all_iters_to_one_epoch bool False Merge all training steps in one epoch or not False
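The interplay of lr, div_factor, pct_start, and moms in the adam_onecycle schedule can be pictured with the rough sketch below: the learning rate climbs from lr / div_factor up to lr over the first pct_start fraction of training and then anneals back down, while the momentum moves in the opposite direction between moms[0] and moms[1]. This is an approximate illustration of a one-cycle policy, not the exact TAO scheduler.

import math

lr, div_factor, pct_start = 0.003, 10.0, 0.4
moms = [0.95, 0.85]

def one_cycle(progress):
    """Approximate learning rate and momentum at a training progress in [0, 1]."""
    if progress < pct_start:                      # warm-up phase
        t = progress / pct_start
        cur_lr = lr / div_factor + t * (lr - lr / div_factor)
        cur_mom = moms[0] + t * (moms[1] - moms[0])
    else:                                         # annealing phase
        t = (progress - pct_start) / (1.0 - pct_start)
        cos = 0.5 * (1.0 + math.cos(math.pi * t))
        cur_lr = lr * cos
        cur_mom = moms[1] + (moms[0] - moms[1]) * (1.0 - cos)
    return cur_lr, cur_mom

for p in (0.0, 0.2, 0.4, 0.7, 1.0):
    print(p, one_cycle(p))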

Evaluation

The evaluate parameter defines the hyper-parameters of the evaluation process. The evaluation metric is mAP (3D and BEV).

Parameter Datatype Default/Suggested value Description Supported Values
batch_size int 1 The batch size of evaluation >=1
checkpoint string The path to the model to run evaluation Unix path

Inference

The inference parameter defines the hyper-parameters of the inference process. Inference draws bounding boxes and visualizes them on images.

Parameter Datatype Default/Suggested value Description Supported Values
batch_size int 1 The batch size of inference >=1
checkpoint string The path to the model to run inference Unix path
max_points_num int Maximum number of points in a point cloud file >=1
viz_conf_thresh float 0.1 The visualization confidence threshold >0, <1

Export

The export parameter defines the hyper-parameters of the export process.

Parameter Datatype Default/Suggested value Description Supported Values
gpu_id int 0 The index of the GPU to be used >=0
checkpoint string The path to the model to run export Unix path
onnx_file string The output path to the exported model Unix path

Prune

The prune parameter defines the hyper-parameters of the pruning process.

Parameter Datatype Default/Suggested value Description Supported Values
model string The path to the model to be pruned Unix path

Use the following command to run PointPillars training:


tao model pointpillars train -e <experiment_spec_file> -r <results_dir> -k <key> [--gpus <num_gpus>] [-h, --help]

Required Arguments

  • -e, --experiment_spec_file: The path to the experiment spec file

  • -r, --results_dir: The path to a folder where the experiment outputs should be written.

  • -k, --key: The user-specific encoding key to save or load a .tlt model.

Optional Arguments

  • --gpus: The number of GPUs to be used in the training in a multi-GPU scenario (default: 1).

  • -h, --help: Show this help message and exit.

Here’s an example of using the PointPillars training command:


tao model pointpillars train -e $DEFAULT_SPEC -r $RESULTS_DIR -k $YOUR_KEY

The evaluation metric of PointPillars is mAP(BEV and 3D).

Use the following command to run PointPillars evaluation:


tao model pointpillars evaluate -e <experiment_spec_file> -k <key> -r <results_dir> [--trt_engine <trt_engine_file>] [-h, --help]

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as a training spec file.

  • -r, --results_dir: The path to a folder where the experiment outputs should be written.

  • -k, --key: The user-specific encoding key to save or load a .tlt model.

Optional Arguments

  • --trt_engine: Path to the TensorRT engine file to load for evaluation.

  • -h, --help: Show this help message and exit.

Here’s an example of using the PointPillars evaluation command:


tao model pointpillars evaluate -e $DEFAULT_SPEC -r $RESULTS_DIR -k $YOUR_KEY

Note

The evaluation metric in TAO PointPillars is different from the official KITTI point cloud detection metric. The KITTI metric considers easy/moderate/hard categories of objects and filters out objects smaller than a size threshold, which is only meaningful for the KITTI dataset. The TAO PointPillars metric, in contrast, does not classify objects into easy/moderate/hard categories and does not exclude objects when computing the metric, which makes it a general metric applicable to arbitrary datasets. The final result is still average precision (AP) and mean average precision (mAP), regardless of these computational details. Because of this, the TAO PointPillars metric is not directly comparable with the official KITTI metric on the KITTI dataset, although the two should be roughly similar.

Use the following command to run inference on PointPillars with a .tlt model or TensorRT engine:


tao model pointpillars inference -e <experiment_spec_file> -k <key> -r <results_dir> [--trt_engine <trt_engine_file>] [-h, --help]

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the inference experiment. This should be the same as a training spec file.

  • -r, --results_dir: The path to a folder where the experiment outputs should be written.

  • -k, --key: The user-specific encoding key to save or load a .tlt model.

Optional Arguments

  • --trt_engine: Path to the TensorRT engine file to load for inference.

  • -h, --help: Show this help message and exit.

Here’s an example of using the PointPillars inference command:


tao model pointpillars inference -e $DEFAULT_SPEC -r $RESULTS_DIR -k $YOUR_KEY

TAO PointPillars models support model pruning. Pruning reduces the number of model parameters and hence can improve inference frames per second (FPS) on NVIDIA GPUs while maintaining (almost) the same accuracy (mAP).

Pruning is applied to an already trained PointPillars model and outputs a new model with fewer parameters. Once the pruned model is available, it is necessary to finetune it on the same dataset to bring back the accuracy (mAP). Finetuning is simply running training again with the pruned model as the pretrained model.

Use the following command to run pruning on the PointPillars .tlt model.

tao model pointpillars prune -e <experiment_spec_file> \
                             -r <results_dir> \
                             -k <key> \
                             -m <path_to_tlt_model_to_prune> \
                             -pth <pruning_threshold>

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the inference experiment. This should be the same as a training spec file.

  • -r, --results_dir: The path to a folder where the experiment outputs should be written.

  • -k, --key: The user-specific encoding key to save or load a .tlt model.

  • -m, --model: The path to the .tlt model to prune.

Optional Arguments

  • -pth, --pruning_thresh: The pruning threshold, which should be a float between 0 and 1. The default is 0.1.

After pruning, the pruned model can be used for retraining (finetuning). To start the retraining, simply provide the path to the pruned model in the spec file via the train.pruned_model_path parameter and then run the training command as described above.

Use the following command to export PointPillars to .etlt format for deployment:


tao model pointpillars export -m <model> -k <key> -e <experiment_spec> [-o <output_file>] [--data_type {fp32,fp16}] [--workspace_size <workspace_size>] [--batch_size <batch_size>] [--save_engine <engine_file>] [-h, --help]

Required Arguments

  • -m, --model: The .tlt model to be exported.

  • -k, --key: The encoding key of the .tlt model.

  • -e, --experiment_spec: Experiment spec file to set up export. Can be the same as the training spec.

Optional Arguments

  • -o, --output_model: The path to save the exported model to. The default is ./<input_file>.etlt.

  • -h, --help: Show this help message and exit.

You can use the following optional arguments to save the TRT engine that is generated to verify export:

  • -b, --batch_size: The batch size of TensorRT engine. The default value is 1.

  • -w, --workspace_size: The workspace size of the TensorRT engine in MB. The default value is 1024, i.e., 1GB.

  • --save_engine: The path to the serialized TensorRT engine file. Note that this file is hardware specific and cannot be generalized across GPUs. Useful to quickly test your model accuracy using TensorRT on the host. As the TensorRT engine file is hardware specific, you cannot use this engine file for deployment unless the deployment GPU is identical to the training GPU.

  • -t, --data_type: The desired engine data type. The options are fp32 or fp16. The default value is fp32.

Here’s an example for using the PointPillars export command:


tao model pointpillars export -m $TRAINED_TAO_MODEL -e $DEFAULT_SPEC -k $YOUR_KEY

The PointPillars models that you trained can be deployed on edge devices, such as a Jetson Xavier, Jetson Nano, or Tesla, or in the cloud with NVIDIA GPUs.

The DeepStream SDK currently does not support deployment of PointPillars models. Instead, PointPillars models can only be deployed via a standalone TensorRT application. A TensorRT sample has been developed as a demo to show how to deploy PointPillars models trained in the TAO Toolkit.

Using trtexec

For instructions on generating a TensorRT engine using the trtexec command, refer to the trtexec guide for ReIdentificationNet.
