Date: 2026-03-03
Visual ChangeNet-Classification#
Visual ChangeNet-Classification is an NVIDIA-developed change detection model for classification and is included in TAO. Visual ChangeNet supports the following tasks:
train
evaluate
inference
export
Each task is explained in detail in the following sections.
Data Input for VisualChangeNet#
Single Golden Data Format#
VisualChangeNet-Classification requires the data to be provided as image and CSV files. Refer to the Data Annotation Format page for more information about the input data format for VisualChangeNet-Classification, which follows the same input data format as Optical Inspection.
Multiple Golden Data Format#
To enable Multiple Golden mode, set num_golden > 1 in the Dataset Configuration.
This mode requires a different data format to support multiple golden reference images per sample.
Refer to the Data Annotation Format page for more information about the input data format for Multiple-Golden-VisualChangeNet-Classification.
Creating a Training Experiment Spec File#
Configuring a Custom Dataset#
This section provides an example configuration, along with the commands to retrieve the default configuration, for training VisualChangeNet-Classification using the dataset format described above.
BASE_EXPERIMENT_ID=$(tao visual_changenet list-base-experiments | jq -r '.[0].id')
TRAIN_SPECS=$(tao visual_changenet get-job-schema --action train --base-experiment-id $BASE_EXPERIMENT_ID | jq -r '.default')
Here is an example spec file for training a VisualChangeNet-Classification model with NVIDIA’s FAN Hybrid backbone using the Data Annotation Format.
encryption_key: tlt_encode
task: classify
train:
  pretrained_model_path: /path/to/pretrained/model.pth
  resume_training_checkpoint_path: null
  classify:
    loss: "ce"
    cls_weight: [1.0, 10.0]
  num_epochs: 10
  num_nodes: 1
  validation_interval: 5
  checkpoint_interval: 5
  seed: 1234
  optim:
    lr: 0.0001
    optim: "adamw"
    policy: "linear"
    momentum: 0.9
    weight_decay: 0.01
  results_dir: "${results_dir}/train"
  tensorboard:
    enabled: True
results_dir: /path/to/experiment_results
model:
  backbone:
    type: "fan_small_12_p4_hybrid"
    pretrained_backbone_path: null
    freeze_backbone: False
  decode_head:
    feature_strides: [4, 8, 16, 16]
    use_summary_token: True
  classify:
    train_margin_euclid: 2.0
    eval_margin: 0.005
    embedding_vectors: 5
    embed_dec: 30
    difference_module: 'learnable'
    learnable_difference_modules: 4
dataset:
  classify:
    train_dataset:
      csv_path: /path/to/train.csv
      images_dir: /path/to/img_dir
    validation_dataset:
      csv_path: /path/to/val.csv
      images_dir: /path/to/img_dir
    test_dataset:
      csv_path: /path/to/test.csv
      images_dir: /path/to/img_dir
    infer_dataset:
      csv_path: /path/to/infer.csv
      images_dir: /path/to/img_dir
    image_ext: .jpg
    batch_size: 16
    workers: 2
    fpratio_sampling: 0.2
    num_input: 4
    input_map:
      LowAngleLight: 0
      SolderLight: 1
      UniformLight: 2
      WhiteLight: 3
    concat_type: linear
    grid_map:
      x: 2
      y: 2
    image_width: 128
    image_height: 128
    augmentation_config:
      rgb_input_mean: [0.485, 0.456, 0.406]
      rgb_input_std: [0.229, 0.224, 0.225]
    num_classes: 2
    num_golden: 1
evaluate:
  checkpoint: "???"
inference:
  checkpoint: "???"
export:
  gpu_id: 0
  checkpoint: "???"
  onnx_file: "???"
  input_width: 128
  input_height: 512
| Parameter | Data Type | Default | Description | Supported Values |
|---|---|---|---|---|
| model | dict config | – | The configuration of the model architecture. | |
| dataset | dict config | – | The configuration of the dataset. | |
| train | dict config | – | The configuration of the training task. | |
| evaluate | dict config | – | The configuration of the evaluation task. | |
| inference | dict config | – | The configuration of the inference task. | |
| encryption_key | string | None | The encryption key to encrypt and decrypt model files. | |
| results_dir | string | /results | The directory where experiment results are saved. | |
| export | dict config | – | The configuration of the ONNX export task. | |
| task | str | classify | A flag to indicate the change detection task. Supports two tasks: ‘segment’ and ‘classify’ for segmentation and classification. | classify, segment |
train#
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| num_gpus | unsigned int | 1 | The number of GPUs to use for distributed training. | >0 |
| gpu_ids | List[int] | [0] | The indices of the GPUs to use for distributed training. | |
| seed | unsigned int | 1234 | The random seed for random, NumPy, and torch. | >0 |
| num_epochs | unsigned int | 10 | The total number of epochs to run the experiment. | >0 |
| checkpoint_interval | unsigned int | 1 | The epoch interval at which the checkpoints are saved. | >0 |
| validation_interval | unsigned int | 1 | The epoch interval at which the validation is run. | >0 |
| resume_training_checkpoint_path | string | | The intermediate PyTorch Lightning checkpoint from which to resume training. | |
| results_dir | string | /results/train | The directory in which to save training results. | |
| classify | Dict; str; list | None; ce | The classify dict contains configurable parameters for the VisualChangeNet Classification pipeline with the following parameters: * loss: The loss function used for classification training. * cls_weight: Weights for Cross-Entropy Loss for unbalanced dataset distributions. | |
| segment | Dict; str; list | None; ce; [0.5, 0.5, 0.5, 0.8, 1.0] | The segment dict contains configurable parameters for the VisualChangeNet Segmentation pipeline with the following parameters: * loss: The loss function used for segmentation training. | |
| num_nodes | unsigned int | 1 | The number of nodes. If larger than 1, multi-node is enabled. | |
| pretrained_model_path | string | – | The path to the pretrained model checkpoint to initialize the end-to-end model weights. | |
| optim | dict config | None | Contains the configurable parameters for the VisualChangeNet optimizer detailed in the optim section. | |
| tensorboard | dict config; bool | None; True | Enable TensorBoard visualization using a dict with configurable parameters: * enabled: If set to True, enables TensorBoard. | |
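To illustrate how cls_weight affects the loss on an unbalanced dataset, here is a minimal pure-Python sketch of class-weighted cross-entropy for a single sample. The function name and the per-sample formulation are illustrative assumptions; the actual TAO loss implementation is not shown in this document and may reduce over the batch differently.

```python
import math

def weighted_cross_entropy(logits, target, cls_weight):
    """Class-weighted cross-entropy for one sample (illustrative sketch)."""
    # Numerically stable log-softmax: subtract the max logit first.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_prob = logits[target] - log_sum_exp
    # Scale the negative log-likelihood by the weight of the true class.
    return -cls_weight[target] * log_prob

# Weights from the example spec: class 1 counts ten times as much as class 0.
weights = [1.0, 10.0]
loss_class0 = weighted_cross_entropy([0.0, 0.0], target=0, cls_weight=weights)
loss_class1 = weighted_cross_entropy([0.0, 0.0], target=1, cls_weight=weights)
print(loss_class0, loss_class1)
```

With identical logits, a misclassified sample of the heavier class contributes ten times the loss of the lighter class, which is the intended effect when one class is rare.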
optim#
optim:
  lr: 0.0001
  optim: "adamw"
  policy: "linear"
  momentum: 0.9
  weight_decay: 0.01
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| lr | float | 0.0005 | The learning rate. | >=0.0 |
| optim | str | adamw | The optimizer. | |
| policy | str | linear | The learning rate scheduler: * linear: LambdaLR decreases the lr by a multiplicative factor. * step: StepLR decreases the lr by 0.1 at every num_epochs // 3 steps. | linear/step |
| momentum | float | 0.9 | The momentum for the AdamW optimizer. | |
| weight_decay | float | 0.1 | The weight decay coefficient. | |
| monitor_name | str | val_loss | The name of the monitor used for saving the top-k checkpoints. | |
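The two scheduler policies can be sketched in plain Python. The step rule below follows the description in the table (multiply by 0.1 every num_epochs // 3 epochs). The linear rule is an assumption: the table only says that LambdaLR applies a multiplicative factor, so the linear decay used here is a common choice rather than the confirmed TAO formula. Both function names are hypothetical.

```python
def step_lr(base_lr, num_epochs, epoch, gamma=0.1):
    """StepLR-style schedule: scale lr by gamma every num_epochs // 3 epochs."""
    step_size = max(num_epochs // 3, 1)  # guard against a zero step size
    return base_lr * gamma ** (epoch // step_size)

def linear_lr(base_lr, num_epochs, epoch):
    """LambdaLR-style schedule with an ASSUMED linear decay factor."""
    factor = 1.0 - epoch / float(num_epochs + 1)
    return base_lr * factor

# Print both schedules for the 10-epoch example spec.
for epoch in range(10):
    print(epoch, step_lr(1e-4, 10, epoch), linear_lr(1e-4, 10, epoch))
```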
Model#
The following example model config provides options to change the VisualChangeNet-Classification architecture for training. VisualChangeNet-Classification supports two model architectures. Architecture 1 (difference_module = euclidean) leverages only the last feature maps from the FAN backbone using Euclidean difference to perform contrastive learning. Architecture 2 (difference_module = learnable) leverages the VisualChangeNet-Classification learnable difference modules for 4 different features at 3 feature resolutions to minimize Cross-Entropy loss.
model:
  backbone:
    type: "fan_small_12_p4_hybrid"
    pretrained_backbone_path: null
    freeze_backbone: False
  decode_head:
    feature_strides: [4, 8, 16, 16]
    align_corners: False
    use_summary_token: True
  classify:
    train_margin_euclid: 2.0
    eval_margin: 0.005
    embedding_vectors: 5
    embed_dec: 30
    difference_module: 'learnable'
    learnable_difference_modules: 4
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| backbone | Dict; string; bool; bool | None; None; False; False | A dictionary containing the following configurable parameters for VisualChangeNet-Classification backbone: * type: The name of the backbone to be used. * pretrained_backbone_path: The path to pre-trained backbone weights file. * freeze_backbone: If set to True, freezes the backbone weights during training. * feat_downsample: If set to True, downsamples the last feature map in FAN backbone configurations. This parameter is not propagated to other backbones. | fan_tiny_8_p4_hybrid, fan_large_16_p4_hybrid, fan_small_12_p4_hybrid, fan_base_16_p4_hybrid, vit_large_nvdinov2, c_radio_p1_vit_huge_patch16_224_mlpnorm, c_radio_p2_vit_huge_patch16_224_mlpnorm, c_radio_p3_vit_huge_patch16_224_mlpnorm, c_radio_v2_vit_huge_patch16_224, c_radio_v2_vit_large_patch16_224, c_radio_v2_vit_base_patch16_224 |
| decode_head | Dict; bool; bool; list; Dict; int | None; False; True; [4, 8, 16, 16]; 256 | A dictionary containing the following configurable parameters for the decoder: * align_corners: If set to True, the input and output tensors are aligned by the center points of their corner pixels, preserving the values at the corner pixels. * use_summary_token: If set to True, uses the summary token of the backbone. * feature_strides: The downsampling feature strides for different backbones. * decoder_params: Contains the following network parameters: embed_dims (the embedding dimensions). | True, False; True, False; >0 |
| classify | Dict; string | None; 2.0; 5; 30; learnable; 4 | A dictionary containing the following configurable parameters for VisualChangeNet-Classification model: * train_margin_euclid: The training margin threshold for contrastive learning (applicable for Architecture 1). * eval_margin: The evaluation margin threshold. * embedding_vectors: The output embedding dimension for each input image before computing Euclidean distance (applicable to Architecture 1). * embed_dec: The transformer decoder MLP embedding dimension (applicable to Architecture 2). * difference_module: The type of difference module used (applicable to both architectures). * learnable_difference_modules: The number of learnable difference modules (applicable to Architecture 2). | >0; >0; >0; >0; euclidean, learnable; <4 |
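Architecture 1 compares the embeddings of the test and golden images with a Euclidean distance. The following is a minimal sketch of how train_margin_euclid and eval_margin could enter that computation. It assumes the standard contrastive loss formulation and a label convention in which 1 means a changed pair; both are assumptions that this document does not confirm, and all function names are hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(distance, label, margin=2.0):
    """Standard contrastive loss; label 1 = changed pair, 0 = unchanged pair."""
    if label == 0:
        return distance ** 2                  # pull matching pairs together
    return max(0.0, margin - distance) ** 2   # push differing pairs past the margin

def predict_change(distance, eval_margin=0.005):
    """Flag a pair as changed when its distance exceeds the evaluation margin."""
    return distance > eval_margin

golden = [0.1, 0.2, 0.3, 0.4, 0.5]   # embedding_vectors = 5
test = [0.1, 0.2, 0.3, 0.4, 0.9]
d = euclidean_distance(golden, test)
print(d, contrastive_loss(d, label=1), predict_change(d))
```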
Dataset#
The dataset parameter defines the dataset source, training batch size, augmentation, and pre-processing. An example dataset is provided below.
dataset:
  classify:
    train_dataset:
      csv_path: /path/to/train.csv
      images_dir: /path/to/img_dir
    validation_dataset:
      csv_path: /path/to/val.csv
      images_dir: /path/to/img_dir
    test_dataset:
      csv_path: /path/to/test.csv
      images_dir: /path/to/img_dir
    infer_dataset:
      csv_path: /path/to/infer.csv
      images_dir: /path/to/img_dir
    image_ext: .jpg
    batch_size: 16
    workers: 2
    fpratio_sampling: 0.2
    num_input: 4
    input_map:
      LowAngleLight: 0
      SolderLight: 1
      UniformLight: 2
      WhiteLight: 3
    concat_type: linear
    grid_map:
      x: 2
      y: 2
    image_width: 128
    image_height: 128
    augmentation_config:
      rgb_input_mean: [0.485, 0.456, 0.406]
      rgb_input_std: [0.229, 0.224, 0.225]
    num_classes: 2
* Refer to the Dataset Annotation Format definition for more information about specifying lighting conditions.
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| segment | Dict | – | The | |
| classify | Dict | – | The | |
classify#
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| train_dataset | Dict | – | The paths to the image directory and CSV files for the training dataset. | |
| validation_dataset | Dict | – | The paths to the image directory and CSV files for the validation dataset. | |
| test_dataset | Dict | – | The paths to the image directory and CSV files for the test dataset. | |
| infer_dataset | Dict | – | The paths to the image directory and CSV files for the inference dataset. | |
| image_ext | str | .jpg | The file extension of the images in the dataset. | string |
| batch_size | int | 32 | The number of samples per batch. | >0 |
| workers | int | 8 | The number of worker processes for data loading. | |
| fpratio_sampling | float | 0.1 | The ratio of false-positive examples to sample. | >0 |
| num_input | int | 4 | The number of lighting conditions for each input image*. | >0 |
| input_map | Dict | – | The mapping of lighting conditions to indices specifying concatenation ordering*. | |
| concat_type | string | linear | Type of concatenation to use for different image lighting conditions. | linear, grid |
| grid_map | Dict; Dict; Dict | None; None; None | The parameters to define the grid dimensions to concatenate images as a grid: * x: The number of images along the x-axis. * y: The number of images along the y-axis. | Dict |
| input_width | int | 100 | The width of the input image. | >0 |
| input_height | int | 100 | The height of the input image. | >0 |
| num_classes | int | 2 | The number of classes in the dataset. | >1 |
| augmentation_config | Dict | None | Dictionary containing various data augmentation settings, which is detailed in the augmentation section. | |
| num_golden | int | 1 | Number of golden images to use per input image. Setting this value greater than 1 enables Multiple Golden mode. Multiple Golden mode is only supported with ViT backbones, using input_width = input_height = 224 and input_map = None. In Multiple Golden mode, the dataset must follow the multiple golden data format. | >0 |
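As a rough illustration of how concat_type, grid_map, and num_input combine, the hypothetical helper below computes the size of the concatenated input. It assumes that linear concatenation stacks the images along the height axis, which is consistent with the FAN example spec (four 128x128 images, exported with input_width 128 and input_height 512). It also assumes that grid concatenation tiles x images horizontally and y images vertically, which is consistent with the ViT example spec (a 2x2 grid of 112x112 images, exported at 224x224).

```python
def concatenated_size(image_width, image_height, num_input,
                      concat_type="linear", grid_x=1, grid_y=1):
    """Return (width, height) of the combined input (illustrative sketch)."""
    if concat_type == "linear":
        # Assumption: the lighting-condition images are stacked vertically.
        return image_width, image_height * num_input
    if concat_type == "grid":
        # Assumption: the grid must hold exactly num_input images.
        if grid_x * grid_y != num_input:
            raise ValueError("grid_map x * y must equal num_input")
        return image_width * grid_x, image_height * grid_y
    raise ValueError(f"unsupported concat_type: {concat_type}")

print(concatenated_size(128, 128, 4, "linear"))
print(concatenated_size(112, 112, 4, "grid", grid_x=2, grid_y=2))
```

A check like this can help confirm that export.input_width and export.input_height match the dataset settings before an export job is launched.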
augmentation_config#
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| random_flip | Dict; float; float; bool | None; 0.5; 0.5; True | Random vertical and horizontal flipping augmentation settings. * vflip_probability: Probability of vertical flipping. * hflip_probability: Probability of horizontal flipping. * enable: If set to True, enables random flipping augmentation. | >=0.0; >=0.0 |
| random_rotate | Dict; float; list; bool | None; 0.5; [90, 180, 270]; True | Random rotation augmentation settings. * rotate_probability: Probability of applying random rotation. * angle_list: List of rotation angles to choose from. * enable: If set to True, enables random rotation augmentation. | >=0.0; >=0.0 |
| random_color | Dict; float; float; float; float; bool; float | None; 0.3; 0.3; 0.3; 0.3; True; 0.5 | Random color augmentation settings. * brightness: Maximum brightness change factor. * contrast: Maximum contrast change factor. * saturation: Maximum saturation change factor. * hue: Maximum hue change factor. * enabled: If set to True, enables random color augmentation. * color_probability: Probability of applying color augmentation. | >=0.0; >=0.0; >=0.0; >=0.0; >=0.0 |
| with_random_crop | bool | True | If set to | True, False |
| with_random_blur | bool | True | If set to | True, False |
| rgb_input_mean | List[float] | [0.485, 0.456, 0.406] | The mean to be subtracted for pre-processing. | |
| rgb_input_std | List[float] | [0.229, 0.224, 0.225] | The standard deviation to divide the image by. | |
| augment | bool | False | If set to | True, False |
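The rgb_input_mean and rgb_input_std descriptions correspond to the usual per-channel normalization. A minimal sketch follows, assuming pixel values have already been scaled to the [0, 1] range; that scaling step and the function name are assumptions.

```python
RGB_INPUT_MEAN = [0.485, 0.456, 0.406]
RGB_INPUT_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb, mean=RGB_INPUT_MEAN, std=RGB_INPUT_STD):
    """Subtract the per-channel mean, then divide by the per-channel std."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]

print(normalize_pixel([0.485, 0.456, 0.406]))  # a pixel equal to the mean
print(normalize_pixel([1.0, 1.0, 1.0]))        # a pure white pixel
```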
Example spec File for ViT Backbones#
BASE_EXPERIMENT_ID=$(tao visual_changenet list-base-experiments | jq -r '.[0].id')
TRAIN_SPECS=$(tao visual_changenet get-job-schema --action train --base-experiment-id $BASE_EXPERIMENT_ID | jq -r '.default')
encryption_key: tlt_encode
task: classify
train:
  pretrained_model_path: /path/to/pretrained/model.pth
  resume_training_checkpoint_path: null
  classify:
    loss: "contrastive"
    cls_weight: [1.0, 10.0]
  num_epochs: 10
  num_nodes: 1
  validation_interval: 5
  checkpoint_interval: 5
  seed: 1234
  optim:
    lr: 0.0001
    optim: "adamw"
    policy: "linear"
    momentum: 0.9
    weight_decay: 0.01
  results_dir: "${results_dir}/train"
  tensorboard:
    enabled: True
results_dir: /path/to/experiment_results
model:
  backbone:
    type: "vit_large_nvdinov2"
    pretrained_backbone_path: /path/to/pretrained/backbone.pth
    freeze_backbone: False
  decode_head:
    feature_strides: [4, 8, 16, 32]
    use_summary_token: True
  classify:
    train_margin_euclid: 2.0
    eval_margin: 0.005
    embedding_vectors: 5
    embed_dec: 30
    difference_module: 'euclidean'
    learnable_difference_modules: 4
dataset:
  classify:
    train_dataset:
      csv_path: /path/to/train.csv
      images_dir: /path/to/img_dir
    validation_dataset:
      csv_path: /path/to/val.csv
      images_dir: /path/to/img_dir
    test_dataset:
      csv_path: /path/to/test.csv
      images_dir: /path/to/img_dir
    infer_dataset:
      csv_path: /path/to/infer.csv
      images_dir: /path/to/img_dir
    image_ext: .jpg
    batch_size: 16
    workers: 2
    fpratio_sampling: 0.2
    num_input: 4
    input_map:
      LowAngleLight: 0
      SolderLight: 1
      UniformLight: 2
      WhiteLight: 3
    concat_type: grid
    grid_map:
      x: 2
      y: 2
    image_width: 112
    image_height: 112
    augmentation_config:
      rgb_input_mean: [0.485, 0.456, 0.406]
      rgb_input_std: [0.229, 0.224, 0.225]
    num_classes: 2
    num_golden: 1
evaluate:
  checkpoint: "???"
inference:
  checkpoint: "???"
export:
  gpu_id: 0
  checkpoint: "???"
  onnx_file: "???"
  input_width: 224
  input_height: 224
Training the Model#
Use the following command to run VisualChangeNet-Classification training:
TRAIN_JOB_ID=$(tao visual_changenet create-job \
--kind experiment \
--name "visual_changenet_train" \
--action train \
--workspace-id $WORKSPACE_ID \
--specs "$TRAIN_SPECS" \
--train-datasets '["'$DATASET_ID'"]' \
--eval-dataset "$DATASET_ID" \
--base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
--encryption-key "nvidia_tlt" | jq -r '.id')
tao model visual_changenet train [-h] -e <experiment_spec>
    task=classify
    [results_dir=<global_results_dir>]
    [model.<model_option>=<model_option_value>]
    [dataset.<dataset_option>=<dataset_option_value>]
    [train.<train_option>=<train_option_value>]
    [train.gpu_ids=<gpu indices>]
    [train.num_gpus=<number of gpus>]
Required Arguments
The following arguments are required.
-e, --experiment_spec_file: The path to the experiment spec file.
task: The task (‘segment’ or ‘classify’) for the visual_changenet training. Default: segment.
Optional Arguments
You can set optional arguments to override the option values in the experiment spec file.
-h, --help: Show this help message and exit.
model.<model_option>: The model options.
dataset.<dataset_option>: The dataset options.
train.<train_option>: The train options.
train.optim.<optim_option>: The optimizer options.
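The key=value overrides address fields of the spec file by their dotted path. The sketch below mimics that behavior on a plain dictionary to show which field each override targets. It is a simplified stand-in written for illustration, not the parser that TAO actually uses, and apply_overrides is a hypothetical name.

```python
import ast

def apply_overrides(spec, overrides):
    """Apply 'a.b.c=value' overrides to a nested dict in place."""
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        try:
            value = ast.literal_eval(raw_value)  # numbers, lists, booleans
        except (ValueError, SyntaxError):
            value = raw_value                    # fall back to a plain string
        node = spec
        *parents, leaf = dotted_key.split(".")
        for key in parents:
            node = node.setdefault(key, {})      # create missing sections
        node[leaf] = value
    return spec

spec = {"task": "segment", "train": {"num_epochs": 10, "optim": {"lr": 0.0001}}}
apply_overrides(spec, ["task=classify", "train.num_epochs=20",
                       "train.optim.lr=0.0005", "train.gpu_ids=[0,1]"])
print(spec)
```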
Note
For training, evaluation, and inference, we expose two variables for each task: num_gpus and gpu_ids, which default to 1 and [0], respectively. If both are passed but are inconsistent, for example num_gpus = 1 and gpu_ids = [0, 1], they are modified to follow the setting that implies more GPUs; in the same example, num_gpus is modified from 1 to 2.
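The reconciliation rule described above can be sketched as follows. Only the case from the text (gpu_ids lists more devices than num_gpus) is documented; how gpu_ids is extended when num_gpus is the larger setting is an assumption in this sketch, as is the function name.

```python
def reconcile_gpus(num_gpus=1, gpu_ids=None):
    """Return a consistent (num_gpus, gpu_ids) pair, favoring more GPUs."""
    gpu_ids = list(gpu_ids) if gpu_ids is not None else [0]
    if len(gpu_ids) > num_gpus:
        # Documented case: num_gpus is raised to match gpu_ids.
        num_gpus = len(gpu_ids)
    elif num_gpus > len(gpu_ids):
        # Assumption: use the first num_gpus device indices.
        gpu_ids = list(range(num_gpus))
    return num_gpus, gpu_ids

print(reconcile_gpus(1, [0, 1]))
```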
In some cases, multi-GPU training may result in a segmentation fault. You can circumvent this by setting the environment variable OMP_NUM_THREADS to 1. Depending on your mode of execution, you may use the following methods to set this variable:
CLI Launcher:
You may set the environment variable by adding the following fields to the Envs field of your ~/.tao_mounts.json file, as mentioned in bullet 3 in the section Running the launcher.
{
    "Envs": [
        {
            "variable": "OMP_NUM_THREADS",
            "value": "1"
        }
    ]
}
Docker:
You may set environment variables in Docker by setting the -e flag in the Docker command line.
docker run -it --rm --gpus all \
    -e OMP_NUM_THREADS=1 \
    -v /path/to/local/mount:/path/to/docker/mount nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt <model> train -e
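If you prefer to script the CLI launcher change, the following sketch adds the variable to the Envs list of a mounts file using only the Python standard library. It assumes the file has the structure shown above, where Envs is a list of objects with variable and value keys. The helper name is hypothetical, and it is worth backing up the file before modifying it.

```python
import json
from pathlib import Path

def set_launcher_env(mounts_path, variable, value):
    """Add or update one entry in the "Envs" list of a mounts JSON file."""
    path = Path(mounts_path).expanduser()
    config = json.loads(path.read_text()) if path.exists() else {}
    envs = config.setdefault("Envs", [])
    for entry in envs:
        if entry.get("variable") == variable:
            entry["value"] = value   # update an existing entry
            break
    else:
        # No entry for this variable yet, so append a new one.
        envs.append({"variable": variable, "value": value})
    path.write_text(json.dumps(config, indent=4))
    return config

# Example call (not executed here):
# set_launcher_env("~/.tao_mounts.json", "OMP_NUM_THREADS", "1")
```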
Checkpointing and Resuming Training
Every train.checkpoint_interval epochs, a PyTorch Lightning checkpoint is saved with the name model_epoch_<epoch_num>.pth.
Checkpoints are saved in train.results_dir, like this:
$ ls /results/train
'model_epoch_000.pth'
'model_epoch_001.pth'
'model_epoch_002.pth'
'model_epoch_003.pth'
'model_epoch_004.pth'
The latest checkpoint is also saved as changenet_model_classify_latest.pth.
Training automatically resumes from changenet_model_classify_latest.pth, if it exists in train.results_dir.
This is superseded by train.resume_training_checkpoint_path, if it is provided.
The major implication of this logic is that, if you wish to trigger fresh training from scratch, you must do one of the following:
* Specify a new, empty results directory (Recommended)
* Remove the latest checkpoint from the results directory
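The resume logic described above amounts to a simple precedence rule. Here is a minimal sketch, with a hypothetical function name, that returns the checkpoint that training would resume from, or None for a fresh run:

```python
import os

LATEST_NAME = "changenet_model_classify_latest.pth"

def resolve_resume_checkpoint(results_dir, resume_training_checkpoint_path=None):
    """Pick the checkpoint to resume from, following the documented precedence."""
    # An explicit path supersedes the automatically detected latest checkpoint.
    if resume_training_checkpoint_path:
        return resume_training_checkpoint_path
    latest = os.path.join(results_dir, LATEST_NAME)
    if os.path.isfile(latest):
        return latest
    return None  # nothing to resume from: training starts from scratch
```

Calling this helper on a results directory before launching a job is one way to check whether a run will start fresh or continue from an earlier checkpoint.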
Creating a Testing Experiment Spec File#
Here is an example spec file for evaluating and running inference with a trained VisualChangeNet-Classification model.
BASE_EXPERIMENT_ID=$(tao visual_changenet list-base-experiments | jq -r '.[0].id')
EVALUATE_SPECS=$(tao visual_changenet get-job-schema --action evaluate --base-experiment-id $BASE_EXPERIMENT_ID | jq -r '.default')
results_dir: /path/to/experiment_results
task: classify
model:
  backbone:
    type: "fan_small_12_p4_hybrid"
  classify:
    eval_margin: 0.005
dataset:
  classify:
    test_dataset:
      csv_path: /path/to/test.csv
      images_dir: /path/to/img_dir
    infer_dataset:
      csv_path: /path/to/infer.csv
      images_dir: /path/to/img_dir
    image_ext: .jpg
    batch_size: 16
    workers: 2
    num_input: 4
    input_map:
      LowAngleLight: 0
      SolderLight: 1
      UniformLight: 2
      WhiteLight: 3
    concat_type: linear
    grid_map:
      x: 2
      y: 2
    output_shape:
      - 128
      - 128
    augmentation_config:
      rgb_input_mean: [0.485, 0.456, 0.406]
      rgb_input_std: [0.229, 0.224, 0.225]
    num_classes: 2
    num_golden: 1
evaluate:
  checkpoint: /path/to/checkpoint
  results_dir: /results/evaluate
inference:
  checkpoint: /path/to/checkpoint
  results_dir: /results/inference
Inference/Evaluate#
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| checkpoint | string | | The path to the PyTorch model to use for evaluation or inference. | |
| trt_engine | string | | The path to the TensorRT model to use for inference or evaluation. This should only be used with TAO Deploy. | |
| num_gpus | unsigned int | 1 | The number of GPUs to use. | >0 |
| gpu_ids | List[int] | [0] | The GPU IDs to use. | |
| results_dir | string | | The path to a folder where the experiment outputs should be written. | |
| vis_after_n_batches | unsigned int | 1 | The number of batches after which to save inference or evaluation visualization results. | >0 |
| batch_size | unsigned int | | The batch size for inference or evaluation. | |
Evaluating the Model#
Use the following command to run VisualChangeNet-Classification evaluation:
EVALUATE_JOB_ID=$(tao visual_changenet create-job \
--kind experiment \
--name "visual_changenet_evaluate" \
--action evaluate \
--workspace-id $WORKSPACE_ID \
--parent-job-id $TRAIN_JOB_ID \
--eval-dataset "$DATASET_ID" \
--specs "$EVALUATE_SPECS" \
--base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
--encryption-key "nvidia_tlt" | jq -r '.id')
tao model visual_changenet evaluate [-h] -e <experiment_spec_file>
    task=classify
    evaluate.checkpoint=<model to be evaluated>
    [evaluate.<evaluate_option>=<evaluate_option_value>]
    [evaluate.gpu_ids=<gpu indices>]
    [evaluate.num_gpus=<number of gpus>]
Required Arguments
The following arguments are required.
-e, --experiment_spec_file: The experiment spec file to set up the evaluation experiment.
evaluate.checkpoint: The .pth model to be evaluated.
Optional Arguments
The following arguments are optional to run the command.
evaluate.<evaluate_option>: The evaluate options.
Multi-GPU evaluation is currently not supported for Visual ChangeNet Classify.
Running Inference on the Model#
Use the following command to run inference on VisualChangeNet-Classification with the .pth model:
INFERENCE_JOB_ID=$(tao visual_changenet create-job \
--kind experiment \
--name "visual_changenet_inference" \
--action inference \
--workspace-id $WORKSPACE_ID \
--parent-job-id $TRAIN_JOB_ID \
--inference-dataset "$DATASET_ID" \
--specs "$INFERENCE_SPECS" \
--base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
--encryption-key "nvidia_tlt" | jq -r '.id')
tao model visual_changenet inference [-h] -e <experiment_spec_file>
    task=classify
    inference.checkpoint=<inference model>
    [inference.<inference_option>=<inference_option_value>]
    [inference.gpu_ids=<gpu indices>]
    [inference.num_gpus=<number of gpus>]
Required Arguments
The following arguments are required.
-e, --experiment_spec_file: The experiment spec file to set up the inference experiment.
inference.checkpoint: The .pth model to run inference on.
Optional Arguments
The following arguments are optional to run the command.
inference.<inference_option>: The inference options.
Exporting the Model#
Here is an example spec file for exporting the trained VisualChangeNet model:
BASE_EXPERIMENT_ID=$(tao visual_changenet list-base-experiments | jq -r '.[0].id')
EXPORT_SPECS=$(tao visual_changenet get-job-schema --action export --base-experiment-id $BASE_EXPERIMENT_ID | jq -r '.default')
export:
  checkpoint: /path/to/model.pth
  onnx_file: /path/to/model.onnx
  opset_version: 12
  input_channel: 3
  input_width: 128
  input_height: 512
  batch_size: -1
| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| checkpoint | string | | The path to the PyTorch model to export. | |
| onnx_file | string | | The path to the | |
| opset_version | unsigned int | 12 | The opset version of the exported ONNX. | >0 |
| input_channel | unsigned int | 3 | The input channel size. Only the value 3 is supported. | 3 |
| input_width | unsigned int | 128 | The input width. | >0 |
| input_height | unsigned int | 512 | The input height. | >0 |
| batch_size | int | -1 | The batch size of the ONNX model. If this value is set to -1, the export uses dynamic batch size. | >=-1 |
| gpu_id | unsigned int | 0 | The GPU ID to use. | |
| on_cpu | bool | False | If set to | |
| verbose | bool | False | If set to | |
Use the following command to export the model:
EXPORT_JOB_ID=$(tao visual_changenet create-job \
--kind experiment \
--name "visual_changenet_export" \
--action export \
--workspace-id $WORKSPACE_ID \
--parent-job-id $TRAIN_JOB_ID \
--specs "$EXPORT_SPECS" \
--base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
--encryption-key "nvidia_tlt" | jq -r '.id')
tao model visual_changenet export [-h] -e <experiment spec file>
    task=classify
    export.checkpoint=<model to export>
    export.onnx_file=<onnx path>
    [export.<export_option>=<export_option_value>]
Required Arguments
The following arguments are required to run the command.
-e, --experiment_spec: The path to an experiment spec file.
export.checkpoint: The .pth model to export.
export.onnx_file: The path where the .etlt or .onnx model is saved.
Optional Arguments
The following arguments are optional to run the command.
export.<export_option>: The export options.
For deployment, refer to the TAO Deploy Documentation for VisualChangeNet-Classification.