The dataset for PointPillars contains point cloud data and the corresponding annotations of 3D objects. The point cloud data is a directory of point cloud files (with the .bin extension), and the annotations are a directory of text files in KITTI format.

The directory structure should be organized as below, where the directory name for point cloud files has to be lidar and the directory name for annotations has to be label. The names of the files in the two directories can be arbitrary, as long as each .bin file has a unique corresponding .txt file and vice versa.

/lidar
    0.bin
    1.bin
    ...
/label
    0.txt
    1.txt
    ...
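The one-to-one pairing requirement can be checked before any conversion is run. The following is a minimal sketch, not part of the PointPillars tooling: the helper name find_unpaired is hypothetical, and it assumes that a .bin file and its .txt file are paired by sharing the same base name (as 0.bin and 0.txt do in the layout above).

```python
from pathlib import Path


def find_unpaired(root):
    """Return (stems of .bin files with no label, stems of .txt files with no point cloud).

    Assumption: files are paired by base name, e.g. lidar/0.bin <-> label/0.txt.
    """
    root = Path(root)
    lidar_stems = {p.stem for p in (root / "lidar").glob("*.bin")}
    label_stems = {p.stem for p in (root / "label").glob("*.txt")}
    missing_labels = sorted(lidar_stems - label_stems)
    missing_points = sorted(label_stems - lidar_stems)
    return missing_labels, missing_points
```

Calling find_unpaired on a dataset directory returns two empty lists when every point cloud file has a label file and vice versa; any names returned identify the files that break the pairing.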

Finally, a train/val split has to be maintained for PointPillars as usual. Both the training dataset and the validation dataset must follow the structure described above, so the overall layout should look like the one below. The exact names train and val are not required but are preferred by convention.

/train
    /lidar
        0.bin
        1.bin
        ...
    /label
        0.txt
        1.txt
        ...
/val
    /lidar
        0.bin
        1.bin
        ...
    /label
        0.txt
        1.txt
        ...

Each .bin file should comply with the format described above. Each .txt label file should comply with the KITTI format, with one exception: although the structure is the same as KITTI, the last field of each object is interpreted differently. In KITTI, the last field is Rotation_y (rotation around the Y-axis in the camera coordinate system), while in PointPillars it is Rotation_z (rotation around the Z-axis in the LIDAR coordinate system).

Below is an example. The last values on each line (-1.59, -2.35, and -0.03) should be interpreted differently from standard KITTI.

car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59
cyclist 0.00 0 -2.46 665.45 160.00 717.93 217.99 1.72 0.47 1.65 2.45 1.35 22.10 -2.35
pedestrian 0.00 2 0.21 423.17 173.67 433.17 224.03 1.60 0.38 0.30 -5.87 1.63 23.11 -0.03
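To make the field layout concrete, here is a minimal parsing sketch for one label line. It is illustrative only: the PointPillarsLabel class and parse_label_line function are hypothetical names, not part of any library. The field order follows the standard 15-field KITTI object label (type, truncated, occluded, alpha, 2D box, dimensions, location, rotation), and the only PointPillars-specific choice is naming the last field rotation_z.

```python
from dataclasses import dataclass


@dataclass
class PointPillarsLabel:
    name: str            # object class, e.g. "car"
    truncated: float
    occluded: int
    alpha: float
    bbox: tuple          # 2D box: left, top, right, bottom
    dimensions: tuple    # three size values, in KITTI field order
    location: tuple      # three position values, in KITTI field order
    rotation_z: float    # yaw around the Z-axis in the LIDAR coordinate system


def parse_label_line(line):
    """Parse one KITTI-style label line with 15 whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    return PointPillarsLabel(
        name=fields[0],
        truncated=float(fields[1]),
        occluded=int(fields[2]),
        alpha=float(fields[3]),
        bbox=tuple(float(v) for v in fields[4:8]),
        dimensions=tuple(float(v) for v in fields[8:11]),
        location=tuple(float(v) for v in fields[11:14]),
        rotation_z=float(fields[14]),
    )
```

Applied to the first example line, the parsed rotation_z is -1.59, which is read as a rotation around the LIDAR Z-axis rather than around the camera Y-axis.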

Note: The interpretation of PointPillars labels differs slightly from the standard KITTI format. In PointPillars, the yaw is the rotation around the Z-axis in the LIDAR coordinate system, as defined above, while in the standard KITTI interpretation the yaw is the rotation around the Y-axis in the camera coordinate system. As a result, a PointPillars dataset does not depend on camera information or camera calibration.

Once the dataset directory structure above is ready, copy the base names of the split directories into the dataset.data_split dict of the spec file. For example,

{ 'train': train, 'test': val }

Also, set the names of the pickle info files in the dataset.info_path parameter. For example,

{ 'train': ['infos_train.pkl'], 'test': ['infos_val.pkl'], }

Once this is done, run the dataset_convert command to generate the dataset statistics, which are stored in the pickle files named above. The pickle files are used for data augmentation during training.

The pickle info files are generated from the original point cloud files and the KITTI text label files with the following command:

tao model pointpillars dataset_convert -e $SPECS_DIR/pointpillars.yaml