SiameseOI#

SiameseOI is an NVIDIA-developed optical inspection model for PCB data and is included in TAO. SiameseOI supports the following tasks:

  • train

  • evaluate

  • inference

  • export

These tasks can be invoked from the TAO Launcher using the following convention on the command line:

tao model optical_inspection <sub_task> <args_per_subtask>

Where args_per_subtask are the command-line arguments required for a given subtask. Each subtask is explained in detail in the following sections.
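The convention above can be sketched as a small helper that assembles the argument list for a subtask. This is a minimal illustration, not part of TAO: the function name build_tao_command and the spec path are hypothetical, and only the -e flag and the key=value override form shown on this page are used.

```python
import shlex

# Subtasks listed at the top of this page.
SUPPORTED_SUBTASKS = {"train", "evaluate", "inference", "export"}


def build_tao_command(sub_task, spec_path, overrides=None):
    """Return the argument list for one optical_inspection subtask (hypothetical helper)."""
    if sub_task not in SUPPORTED_SUBTASKS:
        raise ValueError(f"unsupported subtask: {sub_task!r}")
    cmd = ["tao", "model", "optical_inspection", sub_task, "-e", spec_path]
    # Overrides use the key=value form, for example train.num_epochs=20.
    for key, value in (overrides or {}).items():
        cmd.append(f"{key}={value}")
    return cmd


cmd = build_tao_command("train", "/path/to/spec.yaml", {"train.num_epochs": 20})
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a machine where the TAO Launcher is installed.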

Data Input for SiameseOI#

SiameseOI requires the data to be provided as image folders and CSV files. See the Data Annotation Format page for more information about the input data format for SiameseOI.

Creating a Training Experiment Spec File#

Configuring a Custom Dataset#

This section provides an example configuration and commands for training SiameseOI using the dataset format described above. You will need to configure the augmentation_config mean and standard deviation based on your input dataset.
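As a rough sketch of how the mean and standard deviation could be derived, the following computes per-channel statistics over pixel values. It assumes the statistics are expressed on a 0 to 1 scale (the example values in this page look like that, but the source does not state it), and it takes pixels as plain (r, g, b) tuples in the 0 to 255 range; loading real images needs an image library and is left out. The function name channel_stats is hypothetical.

```python
import math


def channel_stats(images):
    """Per-channel mean and std over images given as lists of (r, g, b) pixels in 0..255.

    Assumption: values are scaled to the 0..1 range before the statistics are taken.
    """
    sums = [0.0, 0.0, 0.0]
    sq_sums = [0.0, 0.0, 0.0]
    count = 0
    for pixels in images:
        for pixel in pixels:
            for c in range(3):
                v = pixel[c] / 255.0
                sums[c] += v
                sq_sums[c] += v * v
            count += 1
    if count == 0:
        raise ValueError("no pixels provided")
    means = [s / count for s in sums]
    # Population standard deviation; clamp tiny negative values caused by rounding.
    stds = [math.sqrt(max(sq / count - m * m, 0.0)) for sq, m in zip(sq_sums, means)]
    return means, stds


# Two tiny synthetic "images" of two pixels each, for illustration only.
images = [
    [(0, 128, 255), (255, 128, 0)],
    [(0, 128, 255), (255, 128, 0)],
]
mean, std = channel_stats(images)
print("rgb_input_mean:", [round(m, 3) for m in mean])
print("rgb_input_std:", [round(s, 3) for s in std])
```

The two printed lists would then be copied into rgb_input_mean and rgb_input_std under augmentation_config.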

Here is an example spec file for training a SiameseOI model with a custom backbone on a custom dataset using the Data Annotation Format.

results_dir: /path/to/experiment_results
model:
  model_type: Siamese_3
  model_backbone: custom
  embedding_vectors: 5
  margin: 2.0
dataset:
  train_dataset:
    csv_path: /path/to/split/train.csv
    images_dir: /path/to/images_dir/
  validation_dataset:
    csv_path: /path/to/split/val.csv
    images_dir: /path/to/images_dir/
  image_ext: .jpg
  batch_size: 32
  workers: 8
  fpratio_sampling: 0.1
  num_input: 4
  input_map:
    LowAngleLight: 0
    SolderLight: 1
    UniformLight: 2
    WhiteLight: 3
  concat_type: linear
  grid_map:
    x: 2
    y: 2
  image_width: 100
  image_height: 100
  augmentation_config:
    rgb_input_mean: [0.485, 0.456, 0.406]
    rgb_input_std: [0.229, 0.224, 0.225]
train:
  optim:
    type: Adam
    lr: 0.0005
  loss: contrastive
  num_epochs: 10
  checkpoint_interval: 5
  validation_interval: 5
  results_dir: "${results_dir}/train"
  seed: 1234

| Parameter | Data Type | Default | Description | Supported Values |
|---|---|---|---|---|
| model | dict config | | The configuration of the model architecture | |
| dataset | dict config | | The configuration of the dataset | |
| train | dict config | | The configuration of the training task | |
| evaluate | dict config | | The configuration of the evaluation task | |
| inference | dict config | | The configuration of the inference task | |
| encryption_key | string | None | The encryption key to encrypt and decrypt model files | |
| results_dir | string | /results | The directory where experiment results are saved | |
| export | dict config | | The configuration of the ONNX export task | |
| gen_trt_engine | dict config | | The configuration of the TensorRT generation task. Only used in TAO deploy | |

train#

| Parameter | Data Type | Default | Description | Supported Values |
|---|---|---|---|---|
| num_gpus | unsigned int | 1 | The number of GPUs to use for distributed training | >0 |
| gpu_ids | List[int] | [0] | The indices of the GPUs to use for distributed training | |
| seed | unsigned int | 1234 | The random seed for random, NumPy, and torch | >0 |
| num_epochs | unsigned int | 10 | The total number of epochs to run the experiment | >0 |
| checkpoint_interval | unsigned int | 1 | The epoch interval at which the checkpoints are saved | >0 |
| validation_interval | unsigned int | 1 | The epoch interval at which the validation is run | >0 |
| resume_training_checkpoint_path | string | | The intermediate PyTorch Lightning checkpoint to resume training from | |
| results_dir | string | /results/train | The directory to save training results | |
| optim | dict config | None | Contains the configurable parameters for the SiameseOI optimizer detailed in the optim section | |
| loss | str | contrastive | The loss function used during training | |

optim#

optim:
  lr: 0.0005

| Parameter | Data Type | Default | Description | Supported Values |
|---|---|---|---|---|
| lr | float | 0.0005 | The learning rate | >=0.0 |

Model#

The following example model config provides options to change the SiameseOI architecture for training.

model:
  model_type: Siamese_3
  model_backbone: custom
  embedding_vectors: 5
  margin: 2.0

The following example model is used during SiameseOI evaluation/inference.

| Parameter | Data Type | Default | Description | Supported Values |
|---|---|---|---|---|
| model_type | string | Siamese_3 | The default model architecture from the supported custom model architectures | Siamese_3, Siamese_1 |
| model_backbone | string | custom | The name of the backbone to use | custom |
| embedding_vectors | int | 5 | The embedding dimensions of the final output from the model before computing Euclidean distance | |
| margin | float | 2.0 | The threshold parameter that determines the minimum distance between embeddings of positive and negative pairs | |
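To make the role of margin and the Euclidean distance concrete, here is a minimal sketch of a commonly used contrastive loss form. This page does not give the exact formula that SiameseOI uses, so treat the formula, the label convention (0 for a similar pair, 1 for a dissimilar pair), and the function names as assumptions for illustration only.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, label, margin=2.0):
    """A common contrastive loss form (assumed, not confirmed for SiameseOI).

    label 0: similar pair, penalized by squared distance.
    label 1: dissimilar pair, penalized only while closer than the margin.
    """
    d = euclidean_distance(emb_a, emb_b)
    similar_term = (1 - label) * d ** 2
    dissimilar_term = label * max(margin - d, 0.0) ** 2
    return similar_term + dissimilar_term


# Five-dimensional embeddings, matching embedding_vectors: 5 in the example spec.
reference = [0.0, 0.0, 0.0, 0.0, 0.0]
near = [0.3, 0.4, 0.0, 0.0, 0.0]
far = [3.0, 4.0, 0.0, 0.0, 0.0]

print(contrastive_loss(reference, near, label=0))  # small: similar pair is close
print(contrastive_loss(reference, near, label=1))  # large: dissimilar pair inside the margin
print(contrastive_loss(reference, far, label=1))   # zero: dissimilar pair beyond the margin
```

Under this form, a larger margin pushes dissimilar pairs further apart before their contribution to the loss drops to zero.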

Dataset#

The dataset parameter defines the dataset source, training batch size, augmentation, and pre-processing. An example dataset is provided below.

dataset:
  train_dataset:
    csv_path: /path/to/split/train.csv
    images_dir: /path/to/images_dir/
  validation_dataset:
    csv_path: /path/to/split/val.csv
    images_dir: /path/to/images_dir/
  image_ext: .jpg
  batch_size: 32
  workers: 8
  fpratio_sampling: 0.1
  num_input: 4
  input_map:
    LowAngleLight: 0
    SolderLight: 1
    UniformLight: 2
    WhiteLight: 3
  concat_type: linear
  grid_map:
    x: 2
    y: 2
  image_width: 100
  image_height: 100
  augmentation_config:
    rgb_input_mean: [0.485, 0.456, 0.406]
    rgb_input_std: [0.229, 0.224, 0.225]

| Parameter | Data Type | Default | Description | Supported Values |
|---|---|---|---|---|
| train_dataset | Dict | | The paths to the image directory and CSV files for the training dataset | |
| validation_dataset | Dict | | The paths to the image directory and CSV files for the validation dataset | |
| image_ext | str | .jpg | The file extension of the images in the dataset | string |
| batch_size | int | 32 | The number of samples per batch | >0 |
| workers | int | 8 | The number of worker processes for data loading | |
| fpratio_sampling | float | 0.1 | The ratio of false-positive examples to sample | >0 |
| num_input | int | 4 | The number of lighting conditions for each input image* | >0 |
| input_map | Dict | | The mapping of lighting conditions to indices specifying concatenation ordering* | |
| concat_type | string | linear | The type of concatenation to use for different image lighting conditions | linear, grid |
| grid_map | Dict | None | The parameters that define the grid dimensions for concatenating images as a grid. x: the number of images along the x-axis. y: the number of images along the y-axis | |
| image_width | int | 100 | The width of the input image | >0 |
| image_height | int | 100 | The height of the input image | >0 |
| augmentation_config | Dict | None | The image normalization config, which contains the two parameters below | |
| augmentation_config.rgb_input_mean | List[float] | [0.485, 0.456, 0.406] | The mean to be subtracted for pre-processing | >=0.0 |
| augmentation_config.rgb_input_std | List[float] | [0.229, 0.224, 0.225] | The standard deviation to divide the image by | >=0.0 |

* See the Data Annotation Format page for more information about specifying lighting conditions.
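The interaction of input_map, concat_type, and grid_map can be pictured with a small layout calculation. The sketch below is only an illustration under stated assumptions: it assumes that linear places the lighting images side by side along the x-axis and that grid fills the grid row by row in input_map order. The page does not specify either detail, and the function name tile_offsets is hypothetical.

```python
def tile_offsets(input_map, concat_type, image_width, image_height, grid_map=None):
    """Return ({lighting_name: (x_offset, y_offset)}, (total_width, total_height)).

    Assumptions: "linear" tiles along the x-axis; "grid" fills rows first.
    """
    # Order the lighting conditions by their configured index.
    ordered = sorted(input_map, key=input_map.get)
    if concat_type == "linear":
        cols, rows = len(ordered), 1
    elif concat_type == "grid":
        if grid_map is None:
            raise ValueError("grid_map is required when concat_type is grid")
        cols, rows = grid_map["x"], grid_map["y"]
        if cols * rows < len(ordered):
            raise ValueError("grid_map is too small for the number of inputs")
    else:
        raise ValueError(f"unsupported concat_type: {concat_type!r}")
    offsets = {}
    for i, name in enumerate(ordered):
        col, row = i % cols, i // cols
        offsets[name] = (col * image_width, row * image_height)
    return offsets, (cols * image_width, rows * image_height)


input_map = {"LowAngleLight": 0, "SolderLight": 1, "UniformLight": 2, "WhiteLight": 3}
for concat_type in ("linear", "grid"):
    offsets, size = tile_offsets(input_map, concat_type, 100, 100, {"x": 2, "y": 2})
    print(concat_type, size, offsets)
```

With the example values, the grid layout requires grid_map x times y to cover all num_input lighting conditions, which is why the sketch rejects a grid that is too small.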

Training the Model#

Use the following command to run SiameseOI training:

tao model optical_inspection train [-h] -e <experiment_spec>
                             [results_dir=<global_results_dir>]
                             [model.<model_option>=<model_option_value>]
                             [dataset.<dataset_option>=<dataset_option_value>]
                             [train.<train_option>=<train_option_value>]
                             [train.gpu_ids=<gpu indices>]
                             [train.num_gpus=<number of gpus>]

Required Arguments#

The only required argument is the path to the experiment spec:

  • -e, --experiment_spec: The experiment specification file to set up the training experiment

Optional Arguments#

You can set optional arguments to override the option values in the experiment spec file.

Note

For training, evaluation, and inference, two variables are exposed for each task: num_gpus and gpu_ids, which default to 1 and [0], respectively. If both are passed but are inconsistent (for example, num_gpus = 1 and gpu_ids = [0, 1]), they are modified to follow the setting with more GPUs (in this example, num_gpus = 1 becomes num_gpus = 2).
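The reconciliation rule in the note can be sketched as follows. The first branch mirrors the example given in the note. The second branch, where num_gpus is larger than the gpu_ids list, is an assumption: the note does not say which indices are chosen, so the sketch simply takes the first num_gpus device indices. The function name reconcile_gpus is hypothetical.

```python
def reconcile_gpus(num_gpus=1, gpu_ids=None):
    """Make num_gpus and gpu_ids agree by following the setting with more GPUs."""
    gpu_ids = list(gpu_ids) if gpu_ids is not None else [0]
    if num_gpus < len(gpu_ids):
        # Case from the note: num_gpus = 1, gpu_ids = [0, 1] becomes num_gpus = 2.
        num_gpus = len(gpu_ids)
    elif num_gpus > len(gpu_ids):
        # Assumption: use the first num_gpus device indices.
        gpu_ids = list(range(num_gpus))
    return num_gpus, gpu_ids


print(reconcile_gpus(1, [0, 1]))
print(reconcile_gpus(2, [0]))
print(reconcile_gpus())
```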

Checkpointing and Resuming Training#

At every train.checkpoint_interval, a PyTorch Lightning checkpoint named model_epoch_<epoch_num>.pth is saved in train.results_dir, like so:

$ ls /results/train

'model_epoch_000.pth'
'model_epoch_001.pth'
'model_epoch_002.pth'
'model_epoch_003.pth'
'model_epoch_004.pth'

The latest checkpoint is also saved as oi_model_latest.pth. Training automatically resumes from oi_model_latest.pth if it exists in train.results_dir. This behavior is superseded by train.resume_training_checkpoint_path, if it is provided.

The major implication of this logic is that, to start a fresh training run from scratch, you must do one of the following:

  • Specify a new, empty results directory (Recommended)

  • Remove the latest checkpoint from the results directory
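The resume logic described above can be summarized in a few lines. This is a sketch of the decision order only, not TAO's implementation; the function name select_resume_checkpoint is hypothetical, and the example uses a temporary directory in place of a real train.results_dir.

```python
import os
import tempfile


def select_resume_checkpoint(results_dir, resume_training_checkpoint_path=None):
    """Return the checkpoint to resume from, or None for a fresh training run."""
    # An explicit resume path takes precedence over the automatic lookup.
    if resume_training_checkpoint_path:
        return resume_training_checkpoint_path
    latest = os.path.join(results_dir, "oi_model_latest.pth")
    if os.path.isfile(latest):
        return latest
    return None


with tempfile.TemporaryDirectory() as results_dir:
    # Empty results directory: training starts from scratch.
    print(select_resume_checkpoint(results_dir))
    # Once oi_model_latest.pth exists, it is picked up automatically.
    open(os.path.join(results_dir, "oi_model_latest.pth"), "w").close()
    print(select_resume_checkpoint(results_dir))
    # An explicit path overrides the latest checkpoint.
    print(select_resume_checkpoint(results_dir, "/path/to/other_checkpoint.pth"))
```

This is why pointing results_dir at a new, empty directory is the simplest way to avoid an unintended resume.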

Creating a Testing Experiment Spec File#

Here is an example spec file for evaluating and running inference with a trained SiameseOI model.

results_dir: /path/to/experiment_results
model:
  model_type: Siamese_3
  model_backbone: custom
  embedding_vectors: 5
  margin: 2.0
dataset:
  validation_dataset:
    csv_path: /path/to/split/val.csv
    images_dir: /path/to/images_dir/
  image_ext: .jpg
  batch_size: 32
  workers: 8
  num_input: 4
  input_map:
    LowAngleLight: 0
    SolderLight: 1
    UniformLight: 2
    WhiteLight: 3
  concat_type: linear
  grid_map:
    x: 2
    y: 2
  image_width: 100
  image_height: 100
  augmentation_config:
    rgb_input_mean: [0.485, 0.456, 0.406]
    rgb_input_std: [0.229, 0.224, 0.225]
evaluate:
  num_gpus: 1
  gpu_ids: [0]
  checkpoint: "${results_dir}/train/oi_model_latest.pth"
  results_dir: "${results_dir}/evaluate"
inference:
  num_gpus: 1
  gpu_ids: [0]
  checkpoint: "${results_dir}/train/oi_model_latest.pth"
  results_dir: "${results_dir}/inference"
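The ${...} references in the spec files point at other keys in the same file, for example ${results_dir} or the dotted form ${export.results_dir}. The sketch below shows how such references expand, using a plain dictionary and a regular expression. It is an illustration of the substitution only, not the mechanism TAO uses to load spec files, and the helper names lookup and resolve are hypothetical.

```python
import re

# Matches ${key} and ${section.key} references.
_REF = re.compile(r"\$\{([^}]+)\}")


def lookup(config, dotted_key):
    """Follow a dotted key such as "export.results_dir" from the top of the config."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(config, value, depth=0):
    """Expand ${...} references in a string value, including nested references."""
    if depth > 10:
        raise ValueError("interpolation nested too deeply")
    if not isinstance(value, str):
        return value

    def substitute(match):
        return str(resolve(config, lookup(config, match.group(1)), depth + 1))

    return _REF.sub(substitute, value)


config = {
    "results_dir": "/path/to/experiment_results",
    "evaluate": {
        "checkpoint": "${results_dir}/train/oi_model_latest.pth",
        "results_dir": "${results_dir}/evaluate",
    },
}
print(resolve(config, config["evaluate"]["checkpoint"]))
print(resolve(config, config["evaluate"]["results_dir"]))
```

Because every task directory is derived from the top-level results_dir, changing that single value relocates the train, evaluate, and inference outputs together.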

Evaluating the Model#

Use the following command to run SiameseOI evaluation:

tao model optical_inspection evaluate [-h] -e <experiment_spec>
                             evaluate.checkpoint=<model to be evaluated>
                             [evaluate.<evaluate_option>=<evaluate_option_value>]
                             [evaluate.gpu_ids=<gpu indices>]
                             [evaluate.num_gpus=<number of gpus>]

Multi-GPU evaluation is currently not supported for Optical Inspection.

Required Arguments#

  • -e, --experiment_spec: The experiment spec file to set up the evaluation experiment.

  • evaluate.checkpoint: The .pth model to be evaluated.

Optional Arguments#

Running Inference on the Model#

Use the following command to run inference on SiameseOI with the .pth model:

tao model optical_inspection inference [-h] -e <experiment spec file>
                             inference.checkpoint=<model to be inferenced>
                             [inference.<inference_option>=<inference_option_value>]
                             [inference.gpu_ids=<gpu indices>]
                             [inference.num_gpus=<number of gpus>]

Required Arguments#

  • -e, --experiment_spec: The experiment spec file to set up the inference experiment.

  • inference.checkpoint: The .pth model to run inference on.

Optional Arguments#

Exporting the Model#

Here is an example spec file for exporting the trained SiameseOI model:

export:
  checkpoint: "${results_dir}/train/oi_model_epoch=004.pth"
  results_dir: "${results_dir}/export"
  onnx_file: "${export.results_dir}/oi_model.onnx"
  batch_size: 32

Use the following command to export the model:

tao model optical_inspection export [-h] -e <experiment spec file>
                             export.checkpoint=<model to export>
                             export.onnx_file=<onnx path>
                             [export.<export_option>=<export_option_value>]

Required Arguments#

  • -e, --experiment_spec: The path to an experiment spec file.

  • export.checkpoint: The .pth model to export.

  • export.onnx_file: The path where the .etlt or .onnx model is saved.

Optional Arguments#

TensorRT Engine Generation, Validation, and int8 Calibration#

For deployment, refer to the TAO Deploy Documentation for SiameseOI.