BEVFusion#

BEVFusion is a 3D object-detection model included in TAO. It supports the following tasks:

  • convert

  • train

  • evaluate

  • inference

A sample inference result is shown below.

[Figure: TAO BEVFusion Inference Image]

These tasks may be invoked from the TAO Launcher using the following convention on the command line:

tao model bevfusion <sub_task> <args_per_subtask>

where args_per_subtask are the command-line arguments required for a given subtask. Each of these subtasks is explained in the following sections.

Dataset Format#

The dataset for BEVFusion contains point cloud data, RGB images, and the corresponding 3D object annotations. The directory should be organized in the KITTI directory structure:

/kitti
    /training
        /calib
          000000.txt
          000001.txt
            ...
          N.txt
        /image_2
          000000.png
          000001.png
            ...
          N.png
        /label_2
          000000.txt
          000001.txt
            ...
          N.txt
        /velodyne
          000000.bin
          000001.bin
            ...
          N.bin
    /ImageSets
        train.txt
        val.txt
        test.txt

Each .bin file should contain the lidar points for one frame in the KITTI Velodyne format, and each .txt label file should comply with the KITTI label format.
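
The ImageSets text files define the dataset splits: each line is a frame ID (without extension) that exists across the calib, image_2, label_2, and velodyne directories. For example, a minimal train.txt might contain:

000000
000001
000003

Each .bin file stores the points as a flat array of float32 values, four per point (x, y, z, reflectance), which matches the default point_cloud_dim of 4 in the dataset config.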

Creating a Configuration File#

Below is a sample BEVFusion spec file. It has five components (model, inference, evaluate, dataset, and train) as well as several global parameters, which are described below. The spec file is in YAML format.

Here’s a sample of the BEVFusion spec file:

results_dir: /results/bevfusion
dataset:
  type: KittiPersonDataset
  root_dir: /data/
  gt_box_type: camera
  default_cam_key: CAM2
  train_dataset:
    repeat_time: 2
    ann_file: /data/kitti_person_infos_train.pkl
    data_prefix:
      pts: training/velodyne_reduced
      img: training/image_2
    batch_size: 4
    num_workers: 8
  val_dataset:
    ann_file: /data/kitti_person_infos_val.pkl
    data_prefix:
      pts: training/velodyne_reduced
      img: training/image_2
    batch_size: 2
    num_workers: 4
  test_dataset:
    ann_file: /data/kitti_person_infos_val.pkl
    data_prefix:
      pts: training/velodyne_reduced
      img: training/image_2
    batch_size: 4
    num_workers: 4
model:
  type: BEVFusion
  point_cloud_range: [0, -40, -3, 70.4, 40, 1]
  voxel_size: [0.05, 0.05, 0.1]
  grid_size: [1440, 1440, 41]
train:
  num_gpus: 1
  num_nodes: 1
  validation_interval: 1
  num_epochs: 5
  optimizer:
    type: AdamW
    lr:  0.0002
  lr_scheduler:
    - type: LinearLR
      start_factor: 0.33333333
      by_epoch: False
      begin: 0
      end: 500
    - type: CosineAnnealingLR
      T_max: 10
      begin: 0
      end: 10
      by_epoch: True
      eta_min_ratio: 1e-4
    - type: CosineAnnealingMomentum
      eta_min: 0.8947
      begin: 0
      end: 2.4
      by_epoch: True
    - type: CosineAnnealingMomentum
      eta_min: 1
      begin: 2.4
      end: 10
      by_epoch: True
inference:
  num_gpus: 1
  conf_threshold: 0.3
  checkpoint: /results/train/bevfusion_model.pth
evaluate:
  num_gpus: 1
  checkpoint: /results/train/bevfusion_model.pth

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| results_dir | string | Path to where all the assets generated from a task are stored. | /results | FALSE |
| default_scope | string | Default scope to use (mmdet3d). | mmdet3d | FALSE |
| default_hooks | collection | Default hooks for mmlab modules. | {'timer': {'type': 'IterTimerHook'}, 'logger': {'type': 'LoggerHook', 'interval': 1, 'log_metric_by_epoch': True}, 'param_scheduler': {'type': 'ParamSchedulerHook'}, 'checkpoint': {'type': 'CheckpointHook', 'by_epoch': True, 'interval': 1}, 'sampler_seed': {'type': 'DistSamplerSeedHook'}, 'visualization': {'type': 'Det3DVisualizationHook'}} | FALSE |
| logger_hook | string | Default logger hook type. | TAOBEVFusionLoggerHook | FALSE |
| manual_seed | int | Optional manual seed. The seed is set only when a value is given in the spec file. | | FALSE |
| input_modality | collection | Input modality for the model. Set True for each modality to use. | {'use_lidar': True, 'use_camera': True, 'use_radar': False, 'use_map': False, 'use_external': False} | FALSE |
| model | collection | Configurable parameters to construct the model for a BEVFusion experiment. | | FALSE |
| dataset | collection | Configurable parameters to construct the dataset for a BEVFusion experiment. | | FALSE |
| train | collection | Configurable parameters to construct the trainer for a BEVFusion experiment. | | FALSE |
| evaluate | collection | Configurable parameters to construct the evaluator for a BEVFusion experiment. | | FALSE |
| inference | collection | Configurable parameters to construct the inferencer for a BEVFusion experiment. | | FALSE |

Data Preprocessor Config#

The data preprocessor configuration (data_preprocessor) defines the pre-processing hyperparameters applied to the input images and point clouds.

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| type | string | Name of the data preprocessor for 3D fusion. | Det3DDataPreprocessor | FALSE |
| mean | list | The input mean for RGB frames. | [123.675, 116.28, 103.53] | FALSE |
| std | list | The input standard deviation per pixel for RGB frames. | [58.395, 57.12, 57.375] | FALSE |
| bgr_to_rgb | bool | Whether to convert images from BGR to RGB. | | FALSE |
| pad_size_divisor | int | The value the padded image size must be divisible by. | 32 | FALSE |
| voxelize_cfg | collection | The voxelization configuration. | {'max_num_points': 10, 'max_voxels': [120000, 160000], 'voxelize_reduce': True} | FALSE |
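
For illustration, a data_preprocessor block using the defaults above might look like the following sketch (it nests under model, as listed in the Model Config table):

model:
  data_preprocessor:
    type: Det3DDataPreprocessor
    mean: [123.675, 116.28, 103.53]
    std: [58.395, 57.12, 57.375]
    pad_size_divisor: 32
    voxelize_cfg:
      max_num_points: 10
      max_voxels: [120000, 160000]
      voxelize_reduce: True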

Dataset Config#

The dataset configuration (dataset) defines the dataset directories, annotation files, and batch sizes for the train, validation, and test splits.

| Field | value_type | description | default_value | valid_options | automl_enabled |
|---|---|---|---|---|---|
| type | string | Dataset type for 3D fusion. | KittiPersonDataset | TAO3DSyntheticDataset, TAO3DDataset, KittiPersonDataset | FALSE |
| root_dir | string | The path to the root directory of the dataset. | /data/ | | FALSE |
| classes | list | A list of the classes to be trained. | ['person'] | | FALSE |
| box_type_3d | string | The 3D bounding box type used during training. | lidar | lidar, camera | FALSE |
| gt_box_type | string | The 3D bounding box type in the ground truth. | camera | lidar, camera | FALSE |
| origin | list | The origin of the center point in the ground-truth 3D bounding boxes. | [0.5, 1.0, 0.5] | | FALSE |
| default_cam_key | string | The default camera name in the dataset. | CAM0 | | FALSE |
| per_sequence | bool | Whether to save results in per-sequence format. | False | | FALSE |
| num_views | int | The number of camera views in the dataset. | 1 | | FALSE |
| point_cloud_dim | int | The input lidar point cloud dimension. | 4 | | FALSE |
| train_dataset | collection | Configurable parameters to construct the train dataset. | | | FALSE |
| val_dataset | collection | Configurable parameters to construct the validation dataset. | | | FALSE |
| test_dataset | collection | Configurable parameters to construct the test dataset. | | | FALSE |
| img_file | string | The image file for single-file inference. | | | FALSE |
| pc_file | string | The point cloud file for single-file inference. | | | FALSE |
| cam2img | list | The camera intrinsic matrix for single-file inference. | | | FALSE |
| lidar2cam | list | The lidar-to-camera extrinsic matrix for single-file inference. | | | FALSE |
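
For single-file inference, the dataset section can point directly at one image/point-cloud pair. Here is a hedged sketch; the calibration lists are placeholders, so copy the actual values from the frame's calib file:

dataset:
  img_file: /data/training/image_2/000000.png
  pc_file: /data/training/velodyne_reduced/000000.bin
  cam2img: []    # placeholder: camera intrinsic matrix from the calib file
  lidar2cam: []  # placeholder: lidar-to-camera extrinsic matrix from the calib file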

Model Config#

The model configuration (model) defines the BEVFusion model structure. This model is used for training, evaluation, and inference. A detailed description is included in the table below.

| Field | value_type | description | default_value | valid_options | automl_enabled |
|---|---|---|---|---|---|
| type | string | Model name. | BEVFusion | BEVFusion | FALSE |
| point_cloud_range | list | The point cloud range. | [0, -40, -3, 70.4, 40, 1] | | FALSE |
| voxel_size | list | The voxel size used in voxelization. | [0.05, 0.05, 0.1] | | FALSE |
| post_center_range | list | The post-processing center filter range. | [-61.2, -61.2, -20.0, 61.2, 61.2, 20.0] | | FALSE |
| grid_size | list | The grid size for the BEVFusion model. | [1440, 1440, 41] | | FALSE |
| data_preprocessor | collection | Configurable parameters to construct the preprocessor for the BEVFusion model. | | | FALSE |
| img_backbone | collection | Configurable parameters to construct the camera image backbone for the BEVFusion model. | | | FALSE |
| img_neck | collection | Configurable parameters to construct the camera image neck for the BEVFusion model. | | | FALSE |
| view_transform | collection | Configurable parameters to construct the camera view transform for the BEVFusion model. | | | FALSE |
| pts_backbone | collection | Configurable parameters to construct the lidar point cloud backbone for the BEVFusion model. | | | FALSE |
| pts_voxel_encoder | collection | Configurable parameters to construct the lidar point cloud voxel encoder for the BEVFusion model. | {'type': 'HardSimpleVFE', 'num_features': 4} | | FALSE |
| pts_middle_encoder | collection | Configurable parameters to construct the lidar encoder for the BEVFusion model. | | | FALSE |
| pts_neck | collection | Configurable parameters to construct the lidar neck for the BEVFusion model. | | | FALSE |
| fusion_layer | collection | Configurable parameters to construct the fusion layer for the BEVFusion model. | | | FALSE |
| bbox_head | collection | Configurable parameters to construct the bounding box head for the BEVFusion model. | | | FALSE |

Image Backbone Config#

The backbone configuration (img_backbone) defines the image backbone structure. A detailed description is included in the table below. Currently, BEVFusion supports only the Swin Transformer and ResNet-50 image backbones.

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| type | string | Name of the image backbone for 3D fusion. | mmdet.SwinTransformer | FALSE |
| embed_dims | int | The number of input channels. | 96 | FALSE |
| depths | list | The depth of each Swin Transformer stage. | [2, 2, 6, 2] | FALSE |
| num_heads | list | The number of attention heads in each stage. | [3, 6, 12, 24] | FALSE |
| window_size | int | The window size for the Swin Transformer. | 7 | FALSE |
| mlp_ratio | int | The ratio of the MLP hidden dimension to the embedding dimension. | 4 | FALSE |
| qkv_bias | bool | If True, adds a learnable bias to the query, key, and value. | True | FALSE |
| qk_scale | string | Overrides the default qk scale of head_dim ** -0.5 if set. | | FALSE |
| drop_rate | float | The dropout rate. | 0.0 | FALSE |
| attn_drop_rate | float | The attention dropout rate. | 0.0 | FALSE |
| drop_path_rate | float | The stochastic depth drop rate. | 0.2 | FALSE |
| patch_norm | bool | If True, adds normalization after patch embedding. | True | FALSE |
| out_indices | list | The stages to output from. | [1, 2, 3] | FALSE |
| with_cp | bool | Whether to use checkpointing. Checkpointing saves some memory while slowing down the training speed. | False | FALSE |
| convert_weights | bool | Whether the pre-trained model is from the original repo. | True | FALSE |
| init_cfg | collection | Configuration for initialization. | | FALSE |
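
For illustration, an img_backbone block assembled from the defaults above might look like the following sketch. The init_cfg here uses the common mmengine Pretrained initializer, and the checkpoint path is a placeholder; point it at your own pre-trained Swin weights:

model:
  img_backbone:
    type: mmdet.SwinTransformer
    embed_dims: 96
    depths: [2, 2, 6, 2]
    num_heads: [3, 6, 12, 24]
    window_size: 7
    mlp_ratio: 4
    qkv_bias: True
    drop_rate: 0.0
    attn_drop_rate: 0.0
    drop_path_rate: 0.2
    patch_norm: True
    out_indices: [1, 2, 3]
    with_cp: False
    convert_weights: True
    init_cfg:
      type: Pretrained
      checkpoint: /path/to/swin_pretrained.pth  # placeholder path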

Image Neck Config#

The neck configuration (img_neck) defines the image neck structure. A detailed description is included in the table below. Currently, BEVFusion supports only the GeneralizedLSSFPN image neck.

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| type | string | The image neck name. | GeneralizedLSSFPN | FALSE |
| in_channels | list | The number of input channels for the image neck. | [192, 384, 768] | FALSE |
| out_channels | int | The number of output channels for the image neck. | 256 | FALSE |
| start_level | int | The starting level for the image neck. | 0 | FALSE |
| num_outs | int | The number of outputs for the image neck. | 0 | FALSE |
| norm_cfg | collection | The normalization configuration for the image neck. | {'type': 'BN2d', 'requires_grad': True} | FALSE |
| act_cfg | collection | The activation configuration for the image neck. | {'type': 'ReLU', 'inplace': True} | FALSE |
| upsample_cfg | collection | The upsampling configuration for the image neck. | {'mode': 'bilinear', 'align_corners': False} | FALSE |
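
For illustration, an img_neck block built from the defaults above might look like this sketch. The in_channels values (192, 384, 768) correspond to the Swin Transformer stage widths selected by out_indices [1, 2, 3] when embed_dims is 96:

model:
  img_neck:
    type: GeneralizedLSSFPN
    in_channels: [192, 384, 768]
    out_channels: 256
    start_level: 0
    norm_cfg:
      type: BN2d
      requires_grad: True
    act_cfg:
      type: ReLU
      inplace: True
    upsample_cfg:
      mode: bilinear
      align_corners: False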

View Transform Config#

The configuration (view_transform) defines the view transform structure for camera input. A detailed description is included in the table below. Currently, BEVFusion supports only the DepthLSSTransform and LSSTransform view transforms.

| Field | value_type | description | default_value | valid_options | automl_enabled |
|---|---|---|---|---|---|
| type | string | The image view transform name. | DepthLSSTransform | DepthLSSTransform, LSSTransform | FALSE |
| in_channels | int | The number of input channels for the view transform. | 256 | | FALSE |
| out_channels | int | The number of output channels for the view transform. | 80 | | FALSE |
| image_size | list | The image size for the view transform. | [256, 704] | | FALSE |
| feature_size | list | The feature size for the view transform. | [32, 88] | | FALSE |
| xbound | list | The grid range for the x-axis. | [-54.0, 54.0, 0.3] | | FALSE |
| ybound | list | The grid range for the y-axis. | [-54.0, 54.0, 0.3] | | FALSE |
| zbound | list | The grid range for the z-axis. | [-10.0, 10.0, 20.0] | | FALSE |
| dbound | list | The grid range for depth. | [1.0, 60.0, 0.5] | | FALSE |
| downsample | int | The downsampling ratio. | 2 | | FALSE |
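
For illustration, a view_transform block using the defaults above might look like the following sketch. Each bound is a [min, max, step] triple that defines the BEV/depth grid:

model:
  view_transform:
    type: DepthLSSTransform
    in_channels: 256
    out_channels: 80
    image_size: [256, 704]
    feature_size: [32, 88]
    xbound: [-54.0, 54.0, 0.3]   # [min, max, step] along x
    ybound: [-54.0, 54.0, 0.3]   # [min, max, step] along y
    zbound: [-10.0, 10.0, 20.0]  # [min, max, step] along z
    dbound: [1.0, 60.0, 0.5]     # [min, max, step] for the depth bins
    downsample: 2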

Lidar Backbone Config#

The backbone configuration (pts_backbone) defines the lidar backbone structure. A detailed description is included in the table below. Currently, BEVFusion supports only the SECOND lidar backbone.

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| type | string | The lidar backbone name. | SECOND | FALSE |
| in_channels | int | The number of input channels for the lidar backbone. | 256 | FALSE |
| out_channels | list | The number of output channels for the lidar backbone. | [128, 256] | FALSE |
| layer_nums | list | The number of layers in each stage of the lidar backbone. | [5, 5] | FALSE |
| layer_strides | list | The stride of each stage of the lidar backbone. | [1, 2] | FALSE |
| norm_cfg | collection | The normalization configuration for the lidar backbone. | {'type': 'BN', 'eps': 0.001, 'momentum': 0.01} | FALSE |
| conv_cfg | collection | The convolution layer configuration for the lidar backbone. | {'type': 'Conv2d', 'bias': False} | FALSE |
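
For illustration, a pts_backbone block using the defaults above might look like:

model:
  pts_backbone:
    type: SECOND
    in_channels: 256
    out_channels: [128, 256]
    layer_nums: [5, 5]
    layer_strides: [1, 2]
    norm_cfg:
      type: BN
      eps: 0.001
      momentum: 0.01
    conv_cfg:
      type: Conv2d
      bias: False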

Lidar Encoder Config#

The encoder configuration (pts_middle_encoder) defines the lidar encoder structure. A detailed description is included in the table below. Currently, BEVFusion supports only the BEVFusionSparseEncoder structure.

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| type | string | The lidar encoder name. | BEVFusionSparseEncoder | FALSE |
| in_channels | int | The number of input channels for the lidar encoder. | 4 | FALSE |
| sparse_shape | list | The sparse shape of the input tensor. | [1440, 1440, 41] | FALSE |
| order | list | The order of the conv module. | ['conv', 'norm', 'act'] | FALSE |
| norm_cfg | collection | The normalization configuration for the lidar encoder. | {'type': 'BN1d', 'eps': 0.001, 'momentum': 0.01} | FALSE |
| encoder_channels | list | The convolutional channels of each encoder block. | | FALSE |
| encoder_paddings | list | The paddings of each encoder block. | | FALSE |
| block_type | string | The type of block to use. | basicblock | FALSE |
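
For illustration, a pts_middle_encoder block using the defaults above might look like the following sketch. The encoder_channels and encoder_paddings collections are left out here; configure them to match your sparse-convolution architecture:

model:
  pts_middle_encoder:
    type: BEVFusionSparseEncoder
    in_channels: 4
    sparse_shape: [1440, 1440, 41]
    order: ['conv', 'norm', 'act']
    norm_cfg:
      type: BN1d
      eps: 0.001
      momentum: 0.01
    block_type: basicblock
    # encoder_channels and encoder_paddings are also configurable here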

Lidar Neck Config#

The configuration (pts_neck) defines the lidar neck structure. A detailed description is included in the table below. Currently, BEVFusion supports only the SECONDFPN structure.

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| type | string | The lidar neck name. | SECONDFPN | FALSE |
| in_channels | list | The number of input channels for the lidar neck. | [128, 256] | FALSE |
| out_channels | list | The number of output channels for the lidar neck. | [256, 256] | FALSE |
| upsample_strides | list | The strides used to upsample the feature map for the lidar neck. | [1, 2] | FALSE |
| norm_cfg | collection | The normalization configuration for the lidar neck. | {'type': 'BN', 'eps': 0.001, 'momentum': 0.01} | FALSE |
| upsample_cfg | collection | The upsample layer configuration for the lidar neck. | {'type': 'deconv', 'bias': False} | FALSE |
| use_conv_for_no_stride | bool | Whether to use conv when the stride is 1. | True | FALSE |
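
For illustration, a pts_neck block using the defaults above might look like:

model:
  pts_neck:
    type: SECONDFPN
    in_channels: [128, 256]
    out_channels: [256, 256]
    upsample_strides: [1, 2]
    norm_cfg:
      type: BN
      eps: 0.001
      momentum: 0.01
    upsample_cfg:
      type: deconv
      bias: False
    use_conv_for_no_stride: True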

Fusion Layer Config#

The configuration (fusion_layer) defines the fusion layer structure. A detailed description is included in the table below. Currently, BEVFusion supports only the ConvFuser structure.

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| type | string | The fusion layer name. | ConvFuser | FALSE |
| in_channels | list | The number of input channels for the fusion layer. | [80, 256] | FALSE |
| out_channels | int | The number of output channels for the fusion layer. | 256 | FALSE |
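
For illustration, a fusion_layer block using the defaults above might look like the sketch below. The two in_channels entries correspond to the camera branch (view_transform.out_channels, 80 by default) and the lidar branch feature channels (256):

model:
  fusion_layer:
    type: ConvFuser
    in_channels: [80, 256]
    out_channels: 256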

BBoxHead Config#

The configuration (bbox_head) defines the bbox prediction head structure. A detailed description is included in the table below. Currently, BEVFusion supports only the BEVFusionHead structure.

| Field | value_type | description | default_value | valid_options | automl_enabled |
|---|---|---|---|---|---|
| type | string | The prediction head name. | BEVFusionHead | BEVFusionHead | FALSE |
| num_proposals | int | The number of proposals. | 200 | | FALSE |
| auxiliary | bool | Whether to enable auxiliary training. | True | | FALSE |
| in_channels | int | The number of channels in the input feature map. | 512 | | FALSE |
| hidden_channel | int | The number of hidden channels. | 128 | | FALSE |
| num_classes | int | The number of classes. | 1 | | FALSE |
| nms_kernel_size | int | The NMS kernel size. | 3 | | FALSE |
| bn_momentum | float | The batch norm momentum. | 0.1 | | FALSE |
| num_decoder_layers | int | The number of decoder layers. | 1 | | FALSE |
| out_size_factor | int | The output size factor. | 8 | | FALSE |
| bbox_coder | collection | The configuration for the bounding box encoder. | | | FALSE |
| decoder_layer | collection | The configuration for the decoder layer. | | | FALSE |
| code_weights | list | The weights for the box encoder. | [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] | | FALSE |
| nms_type | string | The type of NMS. | | | FALSE |
| assigner | collection | The configuration for the assigner. | {'type': 'HungarianAssigner3D', 'iou_calculator': {'type': 'BboxOverlaps3D', 'coordinate': 'lidar'}, 'cls_cost': {'type': 'mmdet.FocalLossCost', 'gamma': 2.0, 'alpha': 0.25, 'weight': 0.15}, 'reg_cost': {'type': 'BBoxBEVL1Cost', 'weight': 0.25}, 'iou_cost': {'type': 'IoU3DCost', 'weight': 0.25}} | | FALSE |
| common_heads | collection | The configuration for the common heads. | {'center': [2, 2], 'height': [1, 2], 'dim': [3, 2], 'rot': [6, 2]} | | FALSE |
| loss_cls | collection | The configuration for the classification loss. | {'type': 'mmdet.FocalLoss', 'use_sigmoid': True, 'gamma': 2.0, 'alpha': 0.25, 'reduction': 'mean', 'loss_weight': 1.0} | | FALSE |
| loss_heatmap | collection | The configuration for the heatmap loss. | {'type': 'mmdet.GaussianFocalLoss', 'reduction': 'mean', 'loss_weight': 1.0} | | FALSE |
| loss_bbox | collection | The configuration for the bounding box loss. | {'type': 'mmdet.L1Loss', 'reduction': 'mean', 'loss_weight': 0.25} | | FALSE |
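
For illustration, a bbox_head block covering the scalar fields above might look like this sketch. The collection-valued fields (bbox_coder, decoder_layer, assigner, common_heads, and the loss_* blocks) would be configured alongside these, with the defaults shown in the table:

model:
  bbox_head:
    type: BEVFusionHead
    num_proposals: 200
    auxiliary: True
    in_channels: 512
    hidden_channel: 128
    num_classes: 1
    nms_kernel_size: 3
    bn_momentum: 0.1
    num_decoder_layers: 1
    out_size_factor: 8
    code_weights: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]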

Train Config#

The train configuration defines the hyperparameters of the training process.

train:
  precision: 'fp16'
  num_gpus: 1
  checkpoint_interval: 10
  validation_interval: 10
  num_epochs: 50
  optimizer:
    type: "AdamW"
    lr: 0.0001
    weight_decay: 0.05

| Field | value_type | description | default_value | valid_min | valid_max | automl_enabled |
|---|---|---|---|---|---|---|
| num_gpus | int | The number of GPUs to run the train job. | 1 | 1 | | FALSE |
| gpu_ids | list | The list of GPU IDs to run training on. The length of this list must equal the number of GPUs in train.num_gpus. | [0] | | | FALSE |
| num_nodes | int | The number of nodes to run training on. If > 1, multi-node training is enabled. | 1 | | | FALSE |
| seed | int | The seed for the initializer in PyTorch. If < 0, the fixed seed is disabled. | 1234 | -1 | inf | FALSE |
| cudnn | collection | The cuDNN configuration. | | | | FALSE |
| num_epochs | int | The number of epochs to run training. | 10 | 1 | inf | TRUE |
| checkpoint_interval | int | The interval (in epochs) at which a checkpoint is saved. Helps resume training. | 1 | 1 | | FALSE |
| validation_interval | int | The interval (in epochs) at which an evaluation is triggered on the validation dataset. | 1 | 1 | | FALSE |
| resume_training_checkpoint_path | string | The path to the checkpoint to resume training from. | | | | FALSE |
| results_dir | string | The path to where all the assets generated from a task are stored. | | | | FALSE |
| by_epoch | bool | Whether EpochBasedRunner is used. | True | | | FALSE |
| logging_interval | int | The logging interval, in iterations. | 1 | | | FALSE |
| resume | bool | Whether to resume training. | False | | | FALSE |
| pretrained_checkpoint | string | The path to a pre-trained BEVFusion model to initialize the current training from. | | | | FALSE |
| optimizer | collection | The hyperparameters to configure the optimizer. | | | | FALSE |
| lr_scheduler | list | The hyperparameters to configure the learning rate scheduler. | [{'type': 'LinearLR', 'start_factor': 0.33333333, 'by_epoch': False, 'begin': 0, 'end': 500}, {'type': 'CosineAnnealingLR', 'T_max': 10, 'eta_min_ratio': 0.0001, 'begin': 0, 'end': 10, 'by_epoch': True}, {'type': 'CosineAnnealingMomentum', 'eta_min': 0.8947, 'begin': 0, 'end': 2.4, 'by_epoch': True}, {'type': 'CosineAnnealingMomentum', 'eta_min': 1, 'begin': 2.4, 'end': 10, 'by_epoch': True}] | | | FALSE |

Optimizer config#

The optimizer parameter defines the configuration for the optimizer in training, including the optimizer type, learning rate, and weight decay; the learning rate schedule is configured separately through lr_scheduler.

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| type | string | The type of optimizer used to train the network. | AdamW | FALSE |
| lr | float | The initial learning rate for training the model. | 0.0002 | FALSE |
| weight_decay | float | The weight decay coefficient. | 0.01 | FALSE |
| betas | list | The moving-average parameters for the adaptive learning rate. | [0.9, 0.999] | FALSE |
| clip_grad | collection | Clips the gradient norm of an iterable of parameters. | {'max_norm': 35, 'norm_type': 2} | FALSE |
| wrapper_type | string | The optimizer wrapper in MMEngine. Use AmpOptimWrapper to enable mixed-precision training. | OptimWrapper | FALSE |
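
Putting these defaults together, a fuller optimizer block might look like the following sketch; the comment on wrapper_type illustrates the mixed-precision option:

train:
  optimizer:
    type: AdamW
    lr: 0.0002
    weight_decay: 0.01
    betas: [0.9, 0.999]
    clip_grad:
      max_norm: 35
      norm_type: 2
    wrapper_type: OptimWrapper  # set to AmpOptimWrapper to enable mixed-precision training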

Evaluation Config#

The evaluate parameter defines the hyperparameters of the evaluation process.

evaluate:
  checkpoint: /path/to/model.pth
  num_gpus: 1

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| num_gpus | int | The number of GPUs to run the evaluation job. | 1 | FALSE |
| gpu_ids | list | The list of GPU IDs to run evaluation on. | [0] | FALSE |
| num_nodes | int | The number of nodes to run evaluation on. | 1 | FALSE |
| checkpoint | string | The path to the checkpoint to evaluate. | ??? | FALSE |
| results_dir | string | The path to where the evaluation assets are stored. | | FALSE |

Inference Config#

The inference parameter defines the hyperparameters of the inference process.

inference:
  checkpoint: /path/to/model.pth
  num_gpus: 1

| Field | value_type | description | default_value | automl_enabled |
|---|---|---|---|---|
| num_gpus | int | The number of GPUs to run the inference job. | 1 | FALSE |
| gpu_ids | list | The list of GPU IDs to run inference on. | [0] | FALSE |
| num_nodes | int | The number of nodes to run inference on. | 1 | FALSE |
| checkpoint | string | The path to the checkpoint to run inference with. | ??? | FALSE |
| results_dir | string | The path to where the inference assets are stored. | | FALSE |
| conf_threshold | float | The confidence threshold for detections. | 0.5 | FALSE |
| show | bool | Whether to show the 3D visualization on screen. | False | FALSE |
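
For example, an inference block that also sets the confidence threshold and disables on-screen visualization might look like:

inference:
  checkpoint: /path/to/model.pth
  num_gpus: 1
  conf_threshold: 0.5
  show: False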

Training the Model#

To train a BEVFusion model, use this command:

tao model bevfusion train [-h] -e <experiment_spec>
                          [-r <results_dir>]

Required Arguments#

  • -e, --experiment_spec: The experiment specification file to set up the training experiment

Optional Arguments#

  • -r, --results_dir: The path to the folder where the experiment outputs should be written. If this argument is not specified, the results_dir from the spec file is used.

  • --gpus: The number of GPUs used to run training

  • --num_nodes: The number of nodes used to run training. If this value is larger than 1, distributed multi-node training is enabled.

  • -h, --help: Show this help message and exit.

Sample Usage#

Here’s an example of the train command:

tao model bevfusion train -e /path/to/spec.yaml

Evaluating the Model#

To run evaluation with a BEVFusion model, use this command:

tao model bevfusion evaluate [-h] -e <experiment_spec>
                             [-r <results_dir>]

Required Arguments#

  • -e, --experiment_spec: The experiment spec file to set up the evaluation experiment

Optional Arguments#

  • -r, --results_dir: The directory where the evaluation result is stored

Sample Usage#

Here’s an example of using the evaluate command:

tao model bevfusion evaluate -e /path/to/spec.yaml -r /path/to/results/ evaluate.checkpoint=/path/to/model.pth

Running Inference with BEVFusion Model#

Use the following command to run inference on BEVFusion with a .pth model:

tao model bevfusion inference [-h] -e <experiment spec file>
                              [-r <results_dir>]

Required Arguments#

  • -e, --experiment_spec: The experiment spec file to set up the inference experiment

Optional Arguments#

  • -r, --results_dir: The directory where the inference result is stored

Sample Usage#

Here’s an example of using the inference command:

tao model bevfusion inference -e /path/to/spec.yaml -r /path/to/results/ inference.checkpoint=/path/to/model.pth