Getting started with Clara
This chapter provides instructions on everything you need to get started with Clara, from preparing your data and training models to exporting, evaluating, and performing inference on the trained classification and segmentation models.
Clara uses a supervised training algorithm to find the best model based on training and validation datasets.
The training dataset contains pairs of data items used to minimize the loss; the validation dataset contains pairs of data items used to validate the model during training.
A single pass through a full dataset is referred to as an epoch. Since a full dataset cannot typically be processed in a single iteration, it is divided into batches of data items. For each batch, an optimizer minimizes a loss function and adjusts the weights of the model accordingly. Training metrics are collected and logged during this process.
Once all iterations for the epoch are completed, validation is performed if configured: the validation dataset is run through the current model and validation metrics are computed, which measure the quality of the current model. One important metric, called the key metric, is used to determine the quality of the model.
Validation is usually configured to be run every N epochs, where N is configurable. The result of the validation determines the best model. The algorithm keeps track of the current best key metric, which is initialized to a large negative number. Each time validation is done, the computed key metric is compared with the current best; if it is better, the current best is set to the new metric value, and the current model is written to disk as model.pt. The model.pt file represents the best model so far.
The more validations performed, the more likely you are to find the best model. However, finding the best model by performing validation after each iteration can take a long time, because validation has to go through the whole validation dataset. In practice, validation should be performed every N epochs, with N configured using the num_interval_per_valid parameter.
When the training is complete, final_model.pt is written to disk and used for fine-tuning. This general algorithm is used for all modes of training: train, fine-tune, multi-GPU train, and multi-GPU fine-tune.
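The following is a minimal sketch of this keep-best logic in PyTorch. It is illustrative only: the model, data, and accuracy metric are stand-ins, not the SDK's actual components, and the real key metric and file locations are defined by the MMAR configuration.

import torch
import torch.nn as nn

# Stand-ins for the MMAR-configured components (illustrative only).
model = nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
train_batches = [(torch.randn(4, 10), torch.randint(0, 2, (4,))) for _ in range(8)]
val_batches = [(torch.randn(4, 10), torch.randint(0, 2, (4,))) for _ in range(2)]

num_epochs = 4
num_interval_per_valid = 2            # validate every N epochs
best_key_metric = float("-inf")       # the "large negative number" from the text

for epoch in range(num_epochs):
    for x, y in train_batches:        # one epoch = one pass over all batches
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)   # the optimizer minimizes this loss
        loss.backward()
        optimizer.step()

    if (epoch + 1) % num_interval_per_valid == 0:
        with torch.no_grad():
            hits = sum((model(x).argmax(1) == y).sum().item() for x, y in val_batches)
            total = sum(len(y) for _, y in val_batches)
        key_metric = hits / total                 # accuracy as a stand-in key metric
        if key_metric > best_key_metric:          # keep the best model seen so far
            best_key_metric = key_metric
            torch.save(model.state_dict(), "model.pt")

torch.save(model.state_dict(), "final_model.pt")  # written when training completes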
If the data format is DICOM or the resolution is not isotropic, you can use the provided data converter tool to convert the data to isotropic NIfTI format. Furthermore, many pre-trained models were trained on 1x1x1 mm resolution images; to use those pre-trained models as a starting point, convert the data to 1x1x1 mm NIfTI format. (Note: if the dataset is already in NIfTI format but does not have 1x1x1 mm spacing, data conversion is still required.)
The medl-dataconvert command converts all DICOM volumes in your/data/directory to NIfTI format and optionally resamples them to the provided resolution. If the images to be converted are segmentation labels, the -l flag needs to be added so the resampler uses a nearest-neighbor interpolator (otherwise a linear interpolator is used).
medl-dataconvert -d your/data/directory -r 1 -s .dcm -e .nii.gz -o your/output/directory
If you need to convert both 3D volumetric images and their segmentation labels, put them into two different folders, and run the converter once for the images and once for the labels using the -l flag.
Supported options are:
Option | Description
---|---
-d | Input directory with subdirectories containing DICOM images.
-r | Output image resolution. If not provided, the DICOM resolution is preserved. If only a single value is provided, the target resolution is isotropic (e.g. -r 1 for 1x1x1 mm resolution).
-s | Input file format; can be .dcm, .nii, .nii.gz, .mha, or .mhd.
-e | Output file format; can be .nii, .nii.gz, .mha, or .mhd.
-o | Output directory.
-f | (Optional) Force overwriting existing files if the output directory already exists.
-l | (Optional) Flag indicating that the data is label/segmentation masks, so nearest-neighbor interpolation is used for resampling.
Classification models: Prepare the data
This section describes the data format expected when using transfer learning for 2D classification tasks.
Classification models: Data format
All input images and labels for the chest X-ray example MMAR are in PNG format. Additional steps may be needed if your data is in a different format or if you plan to resample images, e.g., to 256x256; it is usually best to do that as a pre-processing step rather than have the Clara Train SDK do it on the fly. You must also have ground truth labels available. These are often binary, i.e., {0,1}, or multi-class, i.e., {0,…,C-1} if there are C classes.
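If you do resample offline, a minimal pre-processing sketch follows, assuming Pillow is available; the folder names are illustrative placeholders, not part of the SDK:

from pathlib import Path
from PIL import Image

src = Path("png_files")        # illustrative input folder
dst = Path("png_files_256")    # illustrative output folder
dst.mkdir(exist_ok=True)

for f in src.glob("*.png"):
    # Bilinear resize to 256x256 as an offline pre-processing step.
    Image.open(f).resize((256, 256), Image.BILINEAR).save(dst / f.name)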
Classification models: Folder structure
The layout of data files can be arbitrary, but the JSON file describing the data list must contain relative paths to all image files:
|--dataset_root:
|--datalist.json
|--png_files
|--im1.png
|--im2.png
|--im3.png
Classification models: Datalist JSON file
The JSON file describing the data structure must include a label_format key. The corresponding value should be a list of natural numbers specifying the format of each label in the dataset. For instance, the PLCO dataset has 15 binary labels, so the value should be a list of 15 ones: [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1].
The data file should also have a training and a validation key. Each of these keys contains a list of dictionaries. In each dictionary, the value for the image key must be a relative path to the PNG file, and the value for the label key must be a list of natural numbers corresponding to the ground truth labels.
The labels for each image must match the label_format specified above:
{
    "label_format": [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],
    "training":
    [
        {
            "image" : "im1.png",
            "label" : [0,0,1,0,0,0,0,0,0,0,1,0,0,0,0]
        },
        ...
    ]
}
The validation key is optional and only needs to be specified if the main training config file specifies metrics to compute. If the validation key is provided, it specifies the corresponding images and labels used to compute the validation metrics at the end of each training epoch (or less/more frequently if specified in the main training config).
Training a classification model
Run train.sh to train the model.
cd path/to/mmar/commands/folder
./train.sh
To fine-tune based on the pre-trained model included in the MMAR, first change DATA_ROOT and DATASET_JSON to point to your dataset and data split configuration. Then run train_finetune.sh:
cd path/to/mmar/commands/folder
./train_finetune.sh
The resulting checkpoint files are stored in the models folder of the MMAR.
For more details about MMAR, see Medical Model Archive (MMAR).
Classification models: Multi-GPU training
To run multi-GPU training, run train_2gpu.sh. See Medical Model Archive (MMAR).
When training or fine-tuning the models in a multi-GPU setting on a small training dataset, it is recommended to adjust the learning rate provided in the configuration files, e.g., multiply the learning rate by the number of GPUs, as recommended in https://arxiv.org/pdf/1706.02677.pdf. For example, with 2 GPUs, a base learning rate of 1e-4 would become 2e-4.
Classification models: Tensorboard visualization
You can run the following command to use TensorBoard for visualization.
python3 -m tensorboard.main --logdir "${MODEL_DIR}"
Classification models: Exporting the model to a TorchScript model
After the model has been trained, run export.sh from the commands folder of the MMAR to export the checkpoint as a TorchScript model.
cd path/to/mmar/commands/folder
./export.sh
Two model files will be produced in the models folder of the MMAR:
model.pt - a regular PyTorch model
model.ts - a TorchScript model
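As a quick sanity check, the exported TorchScript file can be loaded and run directly with torch.jit.load. A minimal sketch follows; the input shape is illustrative, since the real shape depends on the MMAR's network configuration:

import torch

# Load the exported TorchScript model for standalone inference.
model = torch.jit.load("models/model.ts", map_location="cpu")
model.eval()

# Illustrative input: one single-channel 256x256 image (batch, channel, H, W).
x = torch.randn(1, 1, 256, 256)
with torch.no_grad():
    output = model(x)
print(output.shape)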
Classification model evaluation with ground truth
Run validate.sh from the MMAR.
cd path/to/mmar/commands/folder
./validate.sh
The validation result files are created in the eval folder of the MMAR.
Classification model inference
Run infer.sh from the MMAR.
cd path/to/mmar/commands/folder
./infer.sh
The inference result files are created in the eval folder of the MMAR.
Use the same configuration file for both validation and inference. For inference, the metric values specified in the configuration file won’t be computed, and no ground truth label is needed.
This section provides instructions on preparing your data, training models, exporting, evaluating and performing inference on the trained segmentation models using transfer learning.
Segmentation models: Prepare the data
Example MMARs on NGC have input images and labels in NIfTI format. Each input image and its corresponding label mask must have the same image dimensions. To visualize or save NIfTI images, you can use free viewers such as ITK-SNAP or MITK. In MONAI, the LoadImageD transform can load DICOM images or series, but it is an experimental feature, so custom transforms or other data preparation steps could be required depending on the data.
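A minimal sketch of dictionary-based loading with MONAI follows; the file paths are illustrative placeholders, and EnsureChannelFirstd is shown as one common companion transform, not a requirement:

from monai.transforms import Compose, EnsureChannelFirstd, LoadImaged

# Dictionary-based transforms: load an image/label pair (NIfTI or DICOM).
transform = Compose([
    LoadImaged(keys=["image", "label"]),
    EnsureChannelFirstd(keys=["image", "label"]),
])

# The paths below are illustrative placeholders.
data = transform({"image": "train/im1.nii.gz", "label": "train/lb1.nii.gz"})
print(data["image"].shape, data["label"].shape)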
If your native data format is different from NIfTI, or if you want to convert the image and label mask to isotropic resolution, you can use the provided data converter or other software of your choice, such as ITK-SNAP, or do the conversion directly in Python.
Segmentation models: Folder structure
The layout of data files can be arbitrary, but the JSON file describing the data list must contain the relative paths to all data files:
|--dataset_root:
|--datalist.json
|--train
|--im1.nii.gz
|--lb1.nii.gz
|--im2.nii.gz
|--lb2.nii.gz
|--im3.nii.gz
|--lb3.nii.gz
|--im4.nii.gz
|--lb4.nii.gz
|--val
|--im1.nii.gz
|--lb1.nii.gz
|--im2.nii.gz
|--lb2.nii.gz
For example, the datalist.json file looks similar to this, where all paths are relative to the datalist.json location:
{
    "training": [
        {
            "image" : "train/im1.nii.gz",
            "label" : "train/lb1.nii.gz"
        },
        {
            "image" : "train/im2.nii.gz",
            "label" : "train/lb2.nii.gz"
        },
        {
            "image" : "train/im3.nii.gz",
            "label" : "train/lb3.nii.gz"
        },
        {
            "image" : "train/im4.nii.gz",
            "label" : "train/lb4.nii.gz"
        }
    ],
    "validation": [
        {
            "image" : "val/im1.nii.gz",
            "label" : "val/lb1.nii.gz"
        },
        {
            "image" : "val/im2.nii.gz",
            "label" : "val/lb2.nii.gz"
        }
    ]
}
The training and validation lists contain the images to be used in the training and validation steps, respectively.
By default, all paths inside the datalist.json are assumed to be relative to the datalist.json file location. You can optionally set the ROOT base path of the datasets in the main config file (image_base_dir JSON key) or as a command line option (--file_root) to the train command.
Segmentation models: Datalist JSON file
The JSON file describing the data structure must include the training key with a list of items (each containing image and label keys).
The value for the image key can be a string containing the path to a single NIfTI file, or a list of strings that are paths to NIfTI files. If there are several channels, they are saved as separate files. Here is an example:
{
    "image" : [
        "train/im1_ch1.nii.gz",
        "train/im1_ch2.nii.gz",
        "train/im1_ch3.nii.gz",
        "train/im1_ch4.nii.gz"
    ],
    "label" : "train/lb1.nii.gz"
},
If image includes several files, they will be concatenated as separate channels of the network input. These images must already be spatially aligned.
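Conceptually, this channel concatenation is equivalent to the following sketch (using nibabel and NumPy as assumptions; the SDK's own data loader performs the actual stacking):

import nibabel as nib
import numpy as np

# Illustrative: four single-channel volumes become one 4-channel input.
paths = ["train/im1_ch1.nii.gz", "train/im1_ch2.nii.gz",
         "train/im1_ch3.nii.gz", "train/im1_ch4.nii.gz"]
volumes = [nib.load(p).get_fdata() for p in paths]  # each (H, W, D)
image = np.stack(volumes, axis=0)                   # (4, H, W, D), channels first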
The value for the label key must be a string containing the path to a single NIfTI file with dense segmentation masks. The label mask can define the segmentation using indices, where each integer index is a separate class, or as a multi-channel one-hot-encoded image, where each channel represents a separate class.
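The two encodings carry the same information. Here is a small sketch of the correspondence, with a hypothetical 2x2 mask and 3 classes:

import numpy as np

# Index-based mask: each pixel/voxel holds a class index (0 = background).
mask = np.array([[0, 1],
                 [2, 1]])

# Equivalent one-hot encoding: one channel per class.
num_classes = 3
one_hot = (np.arange(num_classes)[:, None, None] == mask).astype(np.uint8)
print(one_hot.shape)  # (3, 2, 2)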
The validation key is optional. If provided, the corresponding images/labels will be used to compute the validation metrics at the end of each training epoch in this release (or less/more frequently if specified in the main training config). The validation section does not need to include the label keys if the datalist.json is used for inference to compute the output segmentation masks.
Training a segmentation model
Segmentation training
Use train.sh to train the model:
cd path/to/mmar/commands/folder
./train.sh
Segmentation models: Fine tuning
To fine-tune based on the pre-trained model included in the MMAR, first change DATA_ROOT and DATASET_JSON to point to your dataset and data split configuration. Then run train_finetune.sh:
cd path/to/mmar/commands/folder
./train_finetune.sh
The resulting checkpoint files are stored in the models folder of the MMAR.
For more details see Medical Model Archive (MMAR).
Segmentation models: Multi-GPU training
To run 2-GPU training, run train_2gpu.sh:
cd path/to/mmar/commands/folder
./train_2gpu.sh
To fine-tune based on the pre-trained model included in the MMAR, first change DATA_ROOT and DATASET_JSON to point to your dataset and data split configuration. Then run train_2gpu_finetune.sh:
cd path/to/mmar/commands/folder
./train_2gpu_finetune.sh
The resulting checkpoint files are stored in the models folder of the MMAR.
See Medical Model Archive (MMAR) for more details.
When training or fine-tuning the models in a multi-GPU setting on a relatively small training dataset, it is recommended to adjust the learning rate provided in the configuration files, e.g., multiply the learning rate by the number of GPUs, as recommended in https://arxiv.org/pdf/1706.02677.pdf. You can create your own train_Ngpu.sh based on train_2gpu.sh; make sure to adjust the learning rate accordingly.
Segmentation model evaluation with ground truth
Run validate.sh from the MMAR.
cd path/to/mmar/commands/folder
./validate.sh
The validation result files are created in the eval folder of the MMAR.
See Model training and validation configurations for an example of a validation config for a classification model.
Segmentation model inference
Use infer.sh to run inference on the model from the Medical Model Archive.
cd path/to/mmar/commands/folder
./infer.sh
The inference result files are created in the eval folder of the MMAR.
The same configuration file is used for both validation and inference. For inference, the metric values specified in the configuration file won’t be computed, and no ground truth label is needed.