PointPillars#

PointPillars is a model for 3D object detection in point cloud data. Unlike images, point cloud data is in-nature a collection of sparse points in 3D space. Each point cloud sample(example) is called a scene(stored as a file with .bin extension here). For each scene, it contains generally a variable number of points in 3D Euclidean space. The shape of the data in a single scene is hence (N, K), where N, represents the number of points in this scene, is generally a variable positive integer; K is the number of features for each point, and should be 4. So the features of each point can be represented as: (x, y, z, r) , where x, y, z, r represents the X coordinate, Y coordinate, Z coordinate, and reflectance(intensity), respectively. Those numbers are all float-point numbers and reflectance(r) is a real number in the interval of [0.0, 1.0] that represents the intensity(fraction) perceived by LIDAR of a laser beam reflected back at some point in 3D space.

An object in 3D euclidean space can be described as a 3D bounding box. Formally, 3D bounding box can be represented by (x, y, z, dx, dy, dz, yaw). The 7 numbers in the tuple represents the X coordinate of object center, Y coordinate of object center, Z coordinate of object center, length (in X direction), width(in Y direction), height(in Z direction) and orientation in 3D Euclidean space , respectively.

To dealing with coordinates of points and objects, a coordinate system is required. In TAO PointPillars, the coordinate system is defined as below:

Origin of the coordinate system is the center of LIDAR
X axis is to the front
Y axis is to the left
Z axis is to the up
yaw is the rotation in the horizontal plane(X-Y plane), in counter-clockwise direction. So X axis corresponds to yaw = 0, and Y axis corresponds to yaw = pi / 2, and so on.

A illustration of the coordinate system is shown below.

                         up z    x front (yaw=0)
                            ^   ^
                            |  /
                            | /
(yaw=0.5*pi) left y <------ 0

Each task is explained in detail in the following sections.

Note

Throughout this documentation, you will see references to $EXPERIMENT_ID and $DATASET_ID in the FTMS Client sections.
- For instructions on creating a dataset using the remote client, see the Creating a dataset section in the Remote Client documentation.
- For instructions on creating an experiment using the remote client, see the Creating an experiment section in the Remote Client documentation.
The spec format is YAML for TAO Launcher and JSON for FTMS Client.
File-related parameters, such as dataset paths or pretrained model paths, are required only for TAO Launcher and not for FTMS Client.

Preparing the Dataset#

The dataset for PointPillars contains point cloud data and the corresponding annotations of 3D objects. The point cloud data is a directory of point cloud files(in .bin extension) and the annotations is a directory of text files in KITTI format.

The directory structure should be organized as below, where the directory name for point cloud files has to be lidar and the directory name for annotations has to be label. The names of the files in the 2 directory can be arbitrary as long as each .bin file has its unique corresponding .txt file and vice-versa.

/lidar
  0.bin
  1.bin
  ...
/label
  0.txt
  1.txt
  ...

Finally, train/val split has to be maintained for PointPillars as usual. So for both training dataset and validation set we have to ensure they follow the same structure described above. So the overall structure should look like below. The exact name train and val are not required but are preferred by convention.

/train
  /lidar
    0.bin
    1.bin
    ...
  /label
    0.txt
    1.txt
    ...
/val
  /lidar
    0.bin
    1.bin
    ...
  /label
    0.txt
    1.txt
    ...

Each .bin file should comply with the format described above. Each .txt label file should comply to the KITTI format. There is an exception for PointPillars label format compared to standard KITTI format. Although the structure is the same as KITTI, the last field for each object has different interpretation. In KITTI the last field is Rotation_y(rotation around Y-axis in Camera coordinate system), while in PointPillars they are Rotation_z(rotation around Z-axis in LIDAR coordinate system).

Below is an example, we should interpret -1.59, -2.35, -0.03 differently from standard KITTI.

car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59
cyclist 0.00 0 -2.46 665.45 160.00 717.93 217.99 1.72 0.47 1.65 2.45 1.35 22.10 -2.35
pedestrian 0.00 2 0.21 423.17 173.67 433.17 224.03 1.60 0.38 0.30 -5.87 1.63 23.11 -0.03

Note

The interpretation of the label of PointPillars is slightly different from standard KITTI format. In PointPillars the yaw is rotation around Z-axis in LIDAR coordinate system, as defined above, while in standard KITTI interpretation the yaw is rotation around Y-axis in Camera coordinate system. In this way, PointPillars dataset does not depend on Camera information and Camera calibration.

Once the above dataset directory structure is ready, copy and paste the base names to spec file ‘s dataset.data_split dict. For example,

{
  'train': train,
  'test': val
}

Also, set names to the pickle info files in dataset.info_path parameter. For example,

{
  'train': ['infos_train.pkl'],
  'test': ['infos_val.pkl'],
}

Once these are done, the statistics of the dataset should be generated via the dataset_convert command to generate the pickle files above. The pickle files will be used in the data augmentations during training process.

Converting The Dataset#

The pickle info files need to be generated based on the original point cloud files and KITTI text label files. This is accomplished by a command line.

DATASET_CONVERT_JOB_ID=$(tao-client pointpillars dataset-run-action --action dataset_convert --id $DATASET_ID --specs "$SPECS")

tao model pointpillars dataset_convert -e $SPECS_DIR/pointpillars.yaml

The -e provides spec file for training, see below.

Creating an Experiment Spec File#

The spec file for PointPillars includes the dataset, model, train, evaluate, inference, export and prune parameters. Below is an example spec file for training on the KITTI dataset.

SPECS=$(tao-client pointpillars get-spec --action train --job_type experiment --id $EXPERIMENT_ID)

dataset:
    class_names: ['Car', 'Pedestrian', 'Cyclist']
    type: 'GeneralPCDataset'
    data_path: '/path/to/tao-experiments/data/pointpillars'
    data_split: {
        'train': train,
        'test': val
    }
    info_path: {
        'train': [infos_train.pkl],
        'test': [infos_val.pkl],
    }
    balanced_resampling: False
    point_feature_encoding: {
        encoding_type: absolute_coordinates_encoding,
        used_feature_list: ['x', 'y', 'z', 'intensity'],
        src_feature_list: ['x', 'y', 'z', 'intensity'],
    }
    point_cloud_range: [0, -39.68, -3, 69.12, 39.68, 1]
    data_augmentor:
        disable_aug_list: ['placeholder']
        aug_config_list:
            - name: gt_sampling
              db_info_path:
                  - dbinfos_train.pkl
              preface: {
                filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
              }
              sample_groups: ['Car:15','Pedestrian:15', 'Cyclist:15']
              num_point_features: 4
              disable_with_fake_lidar: False
              remove_extra_width: [0.0, 0.0, 0.0]
              limit_whole_scene: False
            - name: random_world_flip
              along_axis_list: ['x']
            - name: random_world_rotation
              world_rot_angle: [-0.78539816, 0.78539816]
            - name: random_world_scaling
              world_scale_range: [0.95, 1.05]
    data_processor:
        - name: mask_points_and_boxes_outside_range
          remove_outside_boxes: True
        - name: shuffle_points
          shuffle: {
              'train': True,
              'test': False
          }
        - name: transform_points_to_voxels
          voxel_size: [0.16, 0.16, 4]
          max_points_per_voxel: 32
          max_number_of_voxels: {
              'train': 16000,
              'test': 10000
          }
    num_workers: 4

model:
    name: PointPillar
    pretrained_model_path: null
    vfe:
        name: PillarVFE
        with_distance: False
        use_absolue_xyz: True
        use_norm: True
        num_filters: [64]
    map_to_bev:
        name: PointPillarScatter
        num_bev_features: 64
    backbone_2d:
        name: BaseBEVBackbone
        layer_nums: [3, 5, 5]
        layer_strides: [2, 2, 2]
        num_filters: [64, 128, 256]
        upsample_strides: [1, 2, 4]
        num_upsample_filters: [128, 128, 128]
    dense_head:
        name: AnchorHeadSingle
        class_agnostic: False
        use_direction_classifier: True
        dir_offset: 0.78539
        dir_limit_offset: 0.0
        num_dir_bins: 2
        anchor_generator_config: [
            {
                'class_name': 'Car',
                'anchor_sizes': [[3.9, 1.6, 1.56]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-1.78],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.6,
                'unmatched_threshold': 0.45
            },
            {
                'class_name': 'Pedestrian',
                'anchor_sizes': [[0.8, 0.6, 1.73]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            },
            {
                'class_name': 'Cyclist',
                'anchor_sizes': [[1.76, 0.6, 1.73]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            }
        ]
        target_assigner_config:
            name: AxisAlignedTargetAssigner
            pos_fraction: -1.0
            sample_size: 512
            norm_by_num_examples: False
            match_height: False
            box_coder: ResidualCoder
        loss_config:
            loss_weights: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'dir_weight': 0.2,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }
    post_processing:
        recall_thresh_list: [0.3, 0.5, 0.7]
        score_thresh: 0.1
        output_raw_score: False
        eval_metric: kitti
        nms_config:
            multi_classes_nms: False
            nms_type: nms_gpu
            nms_thresh: 0.01
            nms_pre_max_size: 4096
            nms_post_max_size: 500
    sync_bn: False

train:
    batch_size: 4
    num_epochs: 80
    optimizer: adam_onecycle
    lr: 0.003
    weight_decay: 0.01
    momentum: 0.9
    moms: [0.95, 0.85]
    pct_start: 0.4
    div_factor: 10
    decay_step_list: [35, 45]
    lr_decay: 0.1
    lr_clip: 0.0000001
    lr_warmup: False
    warmup_epoch: 1
    grad_norm_clip: 10
    resume_training_checkpoint_path: null
    pruned_model_path: "/path/to/pointpillar_workspace/33/pruned_0.5.tlt"
    tcp_port: 18888
    random_seed: null
    checkpoint_interval: 1
    max_checkpoint_save_num: 30
    merge_all_iters_to_one_epoch: False

evaluate:
    batch_size: 1
    checkpoint: "/path/to/pointpillar_workspace/33/ckpt/checkpoint_epoch_80.tlt"

inference:
    max_points_num: 25000
    batch_size: 1
    checkpoint: "/path/to/pointpillar_workspace/33/ckpt/checkpoint_epoch_80.tlt"
    viz_conf_thresh: 0.1

export:
  gpu_id: 0
  checkpoint: "/path/to/tao-experiments/ckpt/checkpoint_epoch_80.tlt"
  onnx_file: "/path/to/tao-experiments/ckpt/checkpoint_epoch_80.tlt.onnx"

prune:
  model: "/path/to/tlt-experiments/ckpt/checkpoint_epoch_80.tlt"

The top level description of the spec file is provided in the table below.

Parameter	Data Type	Default	Description
`class_names`	list of strings	–	The list of class names in dataset
`data_path`	string	–	The path to the dataset
`data_split`	dict	–	The dict that maps `train` and `test` splits to actual directory name
`info_path`	dict	–	The dict that maps `train` and `test` splits to actual pickle info name
`balanced_resampling`	bool	False	Whether or not to enable balanced resampling in data loader
`point_feature_encoding`	Collection	–	The configuration for point feature encoding
`point_feature_encoding`	Collection	–	The configuration for point feature encoding
`point_cloud_range`	list of floats	–	The point cloud coordinates range in [xmin, ymin, zmin, xmax, ymax, zmax] format
`data_augmentor`	Collection	–	The configuration for data augmentation
`data_processor`	Collection	–	The configuration for data processing
`num_workers`	int	1	The number of workers used for data loader

Class Names#

The class_names parameter provides the list of object class names in the dataset. It is simply a list of strings.

Dataset#

The dataset parameter defines the dataset for training and validation/evaluation of the PointPillars model, described below.

Parameter	Data Type	Default	Description
`dataset`	Collection	–	The configuration of the dataset
`model`	Collection	–	The configuration of the PointPillars model
`train`	Collection	–	The configuration of the training process
`inference`	Collection	–	The configuration for the inference process
`evaluate`	Collection	–	The configuration for the evaluation process
`export`	Collection	–	The configuration for exporting the model
`prune`	Collection	–	The configuration for pruning the model

Point Feature Encoding#

Point feature encoding defines how the features of each point are represented. This parameter is fixed for this version and has to be:

{
  encoding_type: absolute_coordinates_encoding,
  used_feature_list: ['x', 'y', 'z', 'intensity'],
  src_feature_list: ['x', 'y', 'z', 'intensity'],
}

Data Augmentations#

Data augmentation pipelines are defined by the parameter data_augmentor. See table below.

Parameter	Data Type	Default	Description
`disable_aug_list`	List of strings	`["placeholder"]`	The list of augmentations to be disabled
`aug_config_list`	List of collections	–	The list of augmentations, whose name should be `gt_sampling, random_world_flip, random_world_rotation, random_world_scaling`, in that order

The parameters for gt_sampling is provided below.

Parameter	Data Type	Default	Description
`name`	string	`gt_sampling`	The name, has to be `gt_sampling`
`db_info_path`	List of strings	`dbinfos_train.pkl`	The list of db infos for sampling
`preface`	dict	–	Preface of the gt sampling
`sample_groups`	List of strings	–	list of strings to provide per-class sample groups
`num_point_features`	int	4	Number of features for each point
`disable_with_fake_lidar`	bool	False	Whether the fake LIDAR is enabled
`remove_extra_width`	list of floats	–	Extra widths to remove per-class
`limit_whole_scene`	bool	False	Whether or not to limit whole scene

The parameters for random_world_flip are described below.

Parameter	Data Type	Default	Description
`along_axis_list`	List of string	–	The axes along which to flip the coordinates

The parameters for random_world_rotation are described below.

Parameter	Data Type	Default	Description
`world_rot_angle`	List of floats	–	The maximum angles to rotate

The parameters for random_world_scaling are described below.

Parameter	Data Type	Default	Description
`world_scale_range`	List of floats	–	The minimum and maximum scaling factors

Data Processing#

The dataset processing is defined by the DATA_PROCESSOR parameter.

Parameter	Data Type	Default	Description
`data_processor`	List of collections	–	The list of data processing, should include `mask_points_and_boxes_outside_range, shuffle_points, transform_points_to_voxels`, in that order

The parameters for mask_points_and_boxes_outside_range are described below.

Parameter	Data Type	Default	Description
`name`	string	`mask_points_and_boxes_outside_range`	The name, has to be `mask_points_and_boxes_outside_range`
`remove_outside_boxes`	bool	True	Whether or not to remove outside boxes

The parameters for shuffle_points are described below.

Parameter	Data Type	Default	Description
`name`	string	`shuffle_points`	The name, has to be `shuffle_points`
`shuffle_enabled`	dict	`{'train': True, 'test': False}`	Dict to enable/disable shuffling for train/val datasets

The parameters for transform_points_to_voxels are described below.

Parameter	Data Type	Default	Description
`name`	string	`transform_points_to_voxels`	The name, has to be `transform_points_to_voxels`
`voxel_size`	List of floats	–	Voxel size in the format `[dx, dy, dz]`
`max_points_per_voxel`	int	32	Maximum number of points per voxel
`max_number_of_voxels`	dict	–	Dict that provides the maximum number of voxels in training and test/validation mode

Model Architecture#

The PointPillars model architecture is defines in the parameter model, detailed in table below.

Parameter	Data Type	Default	Description
`name`	string	`PointPillar`	The name, has to be `PointPillar`
`vfe`	Collection	–	Definition of the voxel feature extractor
`map_to_bev`	Collection	–	Definition of the scatter module
`backbone_2d`	Collection	–	Definition of the 2D backbone
`dense_head`	Collection	–	Definition of the dense head
`post_processing`	Collection	–	Post-processing
`sync_bn`	bool	False	Enable sync-BN or not

Voxel Feature extractor#

The voxel feature extractor is configured by the parameter vfe, described below.

Parameter	Data Type	Default	Description
`name`	string	`PillarVFE`	The name, has to be `PillarVFE`
`with_distance`	bool	False	With distance or not
`use_absolue_xyz`	bool	True	Use absolute XYZ coordinates or not
`use_norm`	bool	True	Use normalization or not
`num_filters`	List of int	64	Number of filters

Scatter#

The scattering process is configured by the parameter map_to_bev, described below.

Parameter	Data Type	Default	Description
`name`	string	`PointPillarScatter`	The name, has to be `PointPillarScatter`
`num_bev_features`	int	64	Number of features for bird’s eye view

2D backbone#

The 2D backbone is configured by the parameter backbone_2d, described below.

Parameter	Data Type	Default	Description
`name`	string	`BaseBEVBackbone`	The name, has to be `BaseBEVBackbone`
`layer_nums`	List of ints	[3, 5, 5]	Numbers of layers
`layer_strides`	List of ints	[2, 2, 2]	The number of strides
`num_filters`	List of ints	[64, 128, 256]	The numbers of filters
`upsample_strides`	List of ints	[1, 2, 4]	The upsampling strides
`num_upsample_filters`	List of ints	[128, 128, 128]	The numbers of upsampling filters

Dense Head#

The dense head is configured by the parameter dense_head, described below.

Parameter	Data Type	Default	Description
`name`	string	`AnchorHeadSingle`	The name, has to be `AnchorHeadSingle`
`class_agnostic`	bool	False	Class agnostic or not
`use_direction_classifier`	bool	True	Use direction classifier or not
`dir_offset`	float	0.78539	Direction offset
`dir_limit_offset`	float	0.0	Direction limit offset
`num_dir_bins`	int	2	The numbers of direction bins
`anchor_generator_config`	List of dict	–	The config for per-class anchor generator
`target_assigner_config`	Collection	–	Config for target assigner
`loss_config`	Collection	–	Config for loss function

The parameters of anchor_generator_config is a list of dicts. Each dict follows the same format, described below.

{
  'class_name': 'Car',
  'anchor_sizes': [[3.9, 1.6, 1.56]],
  'anchor_rotations': [0, 1.57],
  'anchor_bottom_heights': [-1.78],
  'align_center': False,
  'feature_map_stride': 2,
  'matched_threshold': 0.6,
  'unmatched_threshold': 0.45
}

The parameters of target_assigner_config are described below.

Parameter	Data Type	Default	Description
`name`	string	`AxisAlignedTargetAssigner`	The name, has to be `AxisAlignedTargetAssigner`
`pos_fraction`	float	-1.0	Positive fraction
`sample_size`	int	512	Sample size
`norm_by_num_examples`	bool	False	Normalize by number of examples or not
`match_height`	bool	False	Match height or not
`box_coder`	string	ResidualCoder	The name of the box coder

The parameters for loss_config are described below.

Parameter	Data Type	Default	Description
`loss_weights`	dict	–	The dict to provide loss weighting factors

The loss_weights dict should be in the format below.

{
  'cls_weight': 1.0,
  'loc_weight': 2.0,
  'dir_weight': 0.2,
  'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
}

Post Processing#

The post-processing is defined in the parameter post_processing, described below.

Parameter	Data Type	Default	Description
`recall_thresh_list`	List of floats	–	The dict to provide loss weighting factors
`score_thresh`	float	0.1	The score threshold
`output_raw_score`	bool	False	Output raw score or not
`eval_metric`	string	`kitti`	The evaluation metric, only `kitti` is supported
`nms_config`	Collection	–	The NMS config

The Non-Maximum Suppression(NMS) is configured by the nms_config parameter, described below.

Parameter	Data Type	Default	Description
`multi_classes_nms`	bool	False	Multi-class NMS or not
`nms_type`	string	`nms_gpu`	The NMS type
`nms_thresh`	float	0.01	The NMS IoU threshold
`nms_pre_maxsize`	int	–	Pre-NMS maximum number of boxes
`nms_post_maxsize`	int	–	Post-NMS maximum number of boxes

Training Process#

The train parameter defines the hyper-parameters of the training process.

Parameter	Datatype	Default	Description	Supported Values
`batch_size_per_gpu`	int	4	The batch size per GPU	>=1
`num_epochs`	int	80	The number of epochs to train the model	>=1
`optimizer`	string	`adam_onecycle`	The optimizer name(type)	`adam_onecycle`
`lr`	float	0.003	The initial learning rate	>0.0
`weight_decay`	float	0.01	Weight decay	>0.0
`momentum`	float	0.9	Momentum for SGD optimizer	>0, <1
`moms`	List of floats	[0.95, 0.85]	Momentums for One Cycle learning rate scheduler	[0.95, 0.85]
`pct_start`	float	0.4	The percentage of the cycle spent increasing the learning rate	0.4
`div_factor`	float	10.0	Division factor	10.0
`decay_step_list`	list of ints	[35, 45]	The list of epoch number on which to decay learning rate	list whose elements < NUM_EPOCHS
`lr_decay`	float	0.1	The decay of learning rate	>0, <1
`lr_clip`	float	0.0000001	Minimum value of learning rate	>0, <1
`lr_warmup`	bool	False	Enable learning rate warm up or not	True/False
`warmup_epoch`	int	1	Number of epochs to warm up learning rate	>=1
`grad_norm_clip`	float	10.0	The limit to apply gradient norm clip	>0
`resume_model_path`	string	–	The path of model to resume training	Unix path
`pretrained_model_path`	string	–	The path to the pretrained model	Unix path
`pruned_model_path`	string	–	The path to the pruned model for retrain	Unix path
`tcp_port`	int	18888	TCP port for multi-gpu training	18888
`random_seed`	int	–	Random seed	integer
`checkpoint_interval`	int	1	Interval of epochs to save checkpoints	>=1
`max_checkpoint_save_num`	int	1	The maximum number of checkpoints to save	>=1
`merge_all_iters_to_one_epoch`	bool	False	Merge all training steps in one epoch or not	False

Evaluation#

The evaluation parameter defines the hyper-parameters of the evaluation process. The metric of evaluation is mAP(3D and BEV).

Parameter	Datatype	Default/Suggested value	Description	Supported Values
`batch_size`	int	1	The batch size of evaluation	>=1
`checkpoint`	string	–	The path to the model to run evaluation	Unix path

Inference#

The inference parameter defines the hyper-parameters of the inference process. Inference will draw bounding boxes and visualize it on images.

Parameter	Datatype	Default/Suggested value	Description	Supported Values
`batch_size`	int	1	The batch size of inference	>=1
`checkpoint`	string	–	The path to the model to run inference	Unix path
`max_points_num`	int	–	Maximum number of points in a point cloud file	>=1
`vis_conf_thresh`	float	0.1	Visualization confidence threshold	>0, <1

Export#

The export parameter defines the hyper-parameters of the export process.

Parameter	Datatype	Default/Suggested value	Description	Supported Values
`gpu_id`	int	0	The index of the GPU to be used	>=0
`checkpoint`	string	–	The path to the model to run export	Unix path
`onnx_file`	string	–	The output path to the exported model	Unix path

Prune#

The prune parameter defines the hyper-parameters of the pruning process.

Parameter	Datatype	Default/Suggested value	Description	Supported Values
`model`	string	–	The path to the model to be pruned	Unix path

Training the Model#

Use the following command to run PointPillars training:

TRAIN_JOB_ID=$(tao-client pointpillars experiment-run-action --action train --id $EXPERIMENT_ID --specs "$SPECS" --parent_job_id $DATASET_CONVERT_JOB_ID)

tao model pointpillars train -e <experiment_spec_file>
                     [-h, --help]
                     [results_dir=<global_results_dir>]
                     [model.<model_option>=<model_option_value>]
                     [dataset.<dataset_option>=<dataset_option_value>]
                     [train.<train_option>=<train_option_value>]
                     [train.gpu_ids=<gpu indices>]
                     [train.num_gpus=<number of gpus>]

Required Arguments

The following arguments are required.

-e, --experiment_spec_file: The path to the experiment spec file

Optional Arguments

The following arguments are optional to run the command.

-h, --help: show this help message and exit.
model.<model_option>: the model options.
dataset.<dataset_option>: the dataset options.
train.<train_option>: the train options.

Note

For training, evaluation, and inference, we expose 2 variables for each respective task: num_gpus and gpu_ids, which default to 1 and [0], respectively. If both are passed, but inconsistent, for example num_gpus = 1, gpu_ids = [0, 1]`, then they are modified to follow the setting with more GPUs, for example num_gpus = 1 -> num_gpus = 2.

In some cases, you may encounter an issue with multi-GPU training resulting in a segmentation fault. You may circumvent this by setting the OMP_NUM_THREADS enviroment variable to 1. Depending upon your model of execution, you may use the following methods to set this variable

CLI Launcher

You may set this env variable by adding the following fields to the Envs field of your ~/.tao_mounts.json file as mentioned in bullet 3 in this section

{
    "Envs": [
        {
            "variable": "OMP_NUM_THREADSR",
            "value": "1"
        }
    ]
}

Docker

You may set environment variables in the docker by setting the -e flag in the docker command line.

docker run -it --rm --gpus all \
    -e OMP_NUM_THREADS=1 \
    -v /path/to/local/mount:/path/to/docker/mount nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt <model> train -e

Evaluating the model#

The evaluation metric of PointPillars is mAP(BEV and 3D).

Use the following command to run PointPillars evaluation:

EVALUATE_JOB_ID=$(tao-client pointpillars experiment-run-action --action evaluate --id $EXPERIMENT_ID --specs "$SPECS" --parent_job_id $TRAIN_JOB_ID)

tao model pointpillars evaluate -e <experiment_spec_file>
                   evaluate.checkpoint=<model to be evaluated>
                   results_dir=<results_dir>
                   [-h, --help]

Required Arguments

The following arguments are required.

-e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as a training spec file.
results_dir: The path to a folder where the experiment outputs should be written.
evaluate.checkpoint: The .pth model to be evaluated.

Optional Arguments

The following arguments are optional to run the command.

-h, --help: Show this help message and exit.
evaluate.<evaluate_option>: The evaluate options.

Note

The evaluation metric in TAO PointPillars is different from that in official metric of KITTI point cloud detection. While KITTI metric considers easy/moderate/hard categories of objects and filters small objects whose sizes are smaller than a threshold, it is only meaningful for KITTI dataset. Instead, TAO PointPillars metric is a general metric that does not classify objects into easy/moderate/hard categories and does not exclude objects in calculation of metric. This makes TAO PointPillars metric a general metric that is applicable to a general dataset. The final result is average precision(AP) and mean average precision(mAP) regardless of its details in computation. Due to this, the TAO PointPillars metric is not comparable with KITTI official metric on KITTI dataset, although they should be roughly the same.

Running Inference on the PointPillars Model#

Use the following command to run inference on PointPillars with .tlt model or TensorRT engine:

INFERENCE_JOB_ID=$(tao-client pointpillars experiment-run-action --action inference --id $EXPERIMENT_ID --specs "$SPECS" --parent_job_id $TRAIN_JOB_ID)

tao model pointpillars inference -e <experiment_spec_file>
                   results_dir=<results_dir>
                   inference.checkpoint=<inference model>
                   [-h, --help]

Required Arguments

The following arguments are required.

-e, --experiment_spec_file: Experiment spec file to set up the inference experiment. This should be the same as a training spec file.
results_dir: The path to a folder where the experiment outputs should be written.
inference.checkpoint: The .pth model to run inference on.

Optional Arguments

The following arguments are optional to run the command.

-h, --help: Show this help message and exit.
inference.<inference_option>: The inference options.

Pruning and Retrain a PointPillars Model#

TAO PointPillars models supports model pruning. Model pruning reduces model parameters and hence can improve inference frame per second(FPS) on NVIDIA GPUs while maintaining (almost) the same accuracy(mAP).

Pruning is applied to an already trained PointPillars model. The pruning will output a new model with fewer number of parameters in it. Once we have the pruned model, it is necessary to do finetune on the same dataset to bring back the accuracy(mAP). Finetune is simply running training again but with the pruned model as its pretrained model.

Use the following command to run pruning on the PointPillars .tlt model.

PRUNE_JOB_ID=$(tao-client pointpillars experiment-run-action --action prune --id $EXPERIMENT_ID --specs "$SPECS" --parent_job_id $TRAIN_JOB_ID)

tao model pointpillars prune -e <experiment_spec_file>
                   results_dir=<results_dir>
                   prune.model=<path_to_tlt_model_to_prune>
                   [prune.pruning_thresh=<pruning_threshold>]

Required Arguments

The following arguments are required.

-e, --experiment_spec_file: Experiment spec file to set up the inference experiment. This should be the same as a training spec file.
results_dir: The path to a folder where the experiment outputs should be written.
prune.model: The path to the model to prune.

Optional Arguments

The following arguments are optional to run the command.

prune.pruning_thresh: Pruning threshold, should be a float number between 0-1. Defaults to 0.1.

After pruning, the pruned model can be used for retrain(finetune). To start the retrain, we simply provide the path to the pruned model in config file as the parameter OPTIMIZATION.PRUNED_MODEL_PATH and then start the training command as mentioned above.

Exporting the Model#

Use the following command to export PointPillars to .onnx format for deployment:

EXPORT_JOB_ID=$(tao-client pointpillars experiment-run-action --action export --id $EXPERIMENT_ID --specs "$SPECS" --parent_job_id $TRAIN_JOB_ID)

tao model pointpillars export -m <model>
                   -e <experiment_spec>
                   export.checkpoint=<model to export>
                   export.onnx_file=<output_file>
                   [export.<export_option>=<export_option_value>]
                   [-h, --help]

Required Arguments

The following arguments are required to run the command.

-e, --experiment_spec: The path to an experiment spec file
export.checkpoint: The .pth model to export.
export.onnx_file: The path where the .etlt or .onnx model is saved.

Optional Arguments

The following arguments are optional to run the command.

-h, --help: Show this help message and exit.
export.<export_option>: The export options.

Deploying the Model#

The PointPillars models that you trained can be deployed on edge devices, such as a Jetson Xavier, Jetson Nano, or Tesla, or in the cloud with NVIDIA GPUs.

DeepStream SDK is currently does not support deployment of PointPillars models. Instead, the PointPillars models can only be deployed via a standalone TensorRT application. A TensorRT sample is developed as a demo to show how to deploy PointPillars models trained in TAO.

Using `trtexec`#

For instructions on generating a TensorRT engine using the trtexec command, refer to the trtexec guide for ReIdentificationNet.

PointPillars#

Preparing the Dataset#

Converting The Dataset#

Creating an Experiment Spec File#

Class Names#

Dataset#

Point Feature Encoding#

Data Augmentations#

Data Processing#

Model Architecture#

Voxel Feature extractor#

Scatter#

2D backbone#

Dense Head#

Post Processing#

Training Process#

Evaluation#

Inference#

Export#

Prune#

Training the Model#

Evaluating the model#

Running Inference on the PointPillars Model#

Pruning and Retrain a PointPillars Model#

Exporting the Model#

Deploying the Model#

Using trtexec#

Using `trtexec`#