10.29. Clara Deploy AI COVID-19 Lesion Segmentation Operator
CAUTION: This is NOT for diagnostic use.
This asset requires the Clara Deploy SDK. Follow the instructions on the Clara Ansible page to install the Clara Deploy SDK.
This example is a containerized AI inference application, developed for use as one of the operators in the Clara Deploy pipelines. The application is built on the base AI application container, which provides the application framework to deploy models trained with Clara Train SDK. The same inference configuration, transform functions, and scanning window inference logic are used; however, inference is performed on the NVIDIA Triton Inference Server (Triton).
The application uses the pre-trained model segmentation_ct_covid_lesion_v1 for volumetric (3D) segmentation of ground-glass opacity (GGO) nodules in the lung from CT images. The model was trained with the pipeline that was awarded runner-up [1] in the “Medical Segmentation Decathlon Challenge 2018”, using the SegResNet architecture [2].
The training dataset is from NIH, with images already converted to a resolution of 0.8 mm x 0.8 mm x 5.0 mm before training. Training was performed on four GPUs with 16 GB of memory each.
The model’s input is a single-channel CT image, and the output is a segmentation image in which label value 1 marks ground-glass opacity (GGO) nodules and label value 0 marks everything else. The actual model input dimensions are 224 x 224 x 32.
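Because a resampled CT volume is usually larger than the 224 x 224 x 32 model input, scanning window inference has to tile the volume with windows of that size. The following is a minimal sketch of how such window positions could be computed. It is not the implementation used by this operator: the function names and the 25% overlap are assumptions made for illustration only.

```python
from itertools import product


def window_starts(length, window, stride):
    """Start offsets along one axis so that windows of size `window` cover `length`."""
    if length <= window:
        # One window is enough; the caller would pad the volume if it is smaller.
        return [0]
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        # Add a final window flush with the far edge so no voxels are skipped.
        starts.append(length - window)
    return starts


def scanning_windows(volume_shape, roi=(224, 224, 32), overlap=0.25):
    """Return the (x, y, z) start index of every window needed to cover the volume.

    `overlap` is a hypothetical parameter: the fraction of each window shared
    with its neighbour along an axis.
    """
    strides = [max(1, int(r * (1 - overlap))) for r in roi]
    per_axis = [window_starts(n, r, s) for n, r, s in zip(volume_shape, roi, strides)]
    return list(product(*per_axis))


if __name__ == "__main__":
    windows = scanning_windows((512, 512, 60))
    print(f"{len(windows)} windows, first at {windows[0]}, last at {windows[-1]}")
```

Each returned tuple is the corner of one 224 x 224 x 32 block that would be sent to the inference server. How overlapping predictions are merged is outside the scope of this sketch.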
10.29.2.1. References
[1] Xia, Yingda, et al. “3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training.” arXiv preprint arXiv:1811.12506 (2018). https://arxiv.org/abs/1811.12506.
[2] Myronenko, Andriy. “3D MRI brain tumor segmentation using autoencoder regularization.” In International MICCAI Brainlesion Workshop, pp. 311-320. Springer, Cham, 2018. https://arxiv.org/abs/1810.11654.
The application, in the form of a Docker container, expects an input folder (/input by default), which can be mapped to a host volume when the Docker container is started. This folder must contain a volume image file in the NIfTI or MetaImage format. Furthermore, the volume image must be constructed from a single series of a DICOM study, typically an axial series with the data type of the original primary. In this case, a CT series of the lung is expected.
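Before starting the container, it can help to confirm that the host folder mapped to /input really holds a volume image in one of the accepted formats. The sketch below is a hypothetical pre-flight check, not part of the application. It assumes the usual file suffixes for NIfTI (.nii, .nii.gz) and MetaImage (.mhd); the application may accept a different set.

```python
from pathlib import Path

# Suffixes assumed for this sketch: NIfTI (.nii, .nii.gz) and MetaImage (.mhd).
VOLUME_SUFFIXES = (".nii", ".nii.gz", ".mhd")


def is_volume_file(name):
    """Return True if the file name ends with one of the assumed volume suffixes."""
    return name.lower().endswith(VOLUME_SUFFIXES)


def find_volume_files(input_dir):
    """List candidate volume image files directly inside `input_dir`, sorted by name."""
    folder = Path(input_dir)
    if not folder.is_dir():
        return []
    return sorted(
        p.name for p in folder.iterdir() if p.is_file() and is_volume_file(p.name)
    )


if __name__ == "__main__":
    found = find_volume_files("input")
    print(found if found else "No NIfTI or MetaImage file found in ./input")
```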
The application also uses the same transforms library and configuration file format used during model training. The key model attributes (e.g. the model name) are saved in the inference configuration file, config_inference.json.
Note: The Clara Deploy pipeline interfaces with the DICOM network and ingests DICOM images of varying pixel spacings. The DICOM instances (images) of the same series are first converted to a volumetric image in MHD or NIfTI format with the original pixel spacing. The pre-transforms then resample this image to the pixel spacing required by the model, e.g. 0.8 x 0.8 x 5.0 in mm.
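The effect of this resampling on the volume size follows from keeping the physical extent constant: the voxel count along each axis scales by the ratio of the original spacing to the target spacing. The sketch below only illustrates that arithmetic. The function name is hypothetical, and the actual pre-transform may round or pad differently.

```python
def resampled_shape(shape, spacing, target_spacing=(0.8, 0.8, 5.0)):
    """Estimate the voxel dimensions after resampling to `target_spacing` (in mm).

    The physical extent (voxel count * spacing) is kept constant on each axis,
    so the new count is count * spacing / target, rounded to the nearest integer.
    """
    return tuple(
        max(1, round(n * s / t)) for n, s, t in zip(shape, spacing, target_spacing)
    )


if __name__ == "__main__":
    # Example: a 512 x 512 x 300 series acquired at 0.7 x 0.7 x 1.0 mm.
    print(resampled_shape((512, 512, 300), (0.7, 0.7, 1.0)))
```

For the example values in the script, the in-plane size shrinks from 512 to 448 voxels and the 300 thin slices collapse to 60 slices of 5.0 mm.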
The application saves the segmentation results to an output folder, /output by default, which can also be mapped to a folder on the host volume. After the successful completion of the application, a segmentation volume image in MetaImage format is saved in the output folder. The name of the output file is the same as that of the input file due to certain limitations of the downstream consumer.
The example container also publishes data for the Clara Deploy Render Server to the /publish folder by default. The original volume image and segmented volume image, along with a render configuration file, are saved in this folder.
10.29.4.1. NVIDIA Triton Inference Server (formerly known as TRTIS)
The application performs inference on the NVIDIA Triton Inference Server, which provides a cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server.
The application source code files are in the directory structure shown below.
/
└── app_base_inference_v2
    ├── ai4med
    ├── config
    │   ├── config_render.json
    │   ├── config_inference.json
    │   └── __init__.py
    ├── dlmed
    ├── inferers
    ├── model_loaders
    ├── ngc
    ├── public
    ├── utils
    ├── writers
    ├── app.py
    ├── Dockerfile
    ├── executor.py
    ├── logging_config.json
    ├── main.py
    └── requirements.txt
The following describes the directory contents:

- The ai4med and dlmed directories contain the library modules shared with the Clara Train SDK, mainly for its transform functions and base inference client classes.
- The config directory contains model-specific configuration files, which are needed when building a customized container for a specific model.

  - The config_inference.json file contains the configuration sections for pre- and post-transforms, as well as the model loader, inferer, and writer.
  - The config_render.json file contains the configuration for the Clara Deploy Render Server.

- The inferers directory contains the implementation of the simple and scanning window inference clients using the Triton API client library.
- The model_loaders directory contains the implementation of the model loader that gets model details from the Triton Inference Server.
- The ngc and public directories contain the user documentation.
- The utils directory contains utilities for loading modules and creating application objects.
- The writers directory contains the specialized output writer required by the Clara Deploy SDK, which saves the segmentation result to a volume image file as MetaImage.
10.29.6.1. Prerequisites

- Check whether the Docker image of Triton has been imported into the local Docker repository with the following command:

  .. code-block:: bash

     docker images | grep tritonserver

  Look for the image name tritonserver and the correct tag for the release, e.g. 20.07-v1-py3. If the image does not exist locally, it will be pulled from the NVIDIA Docker registry.
- Download both the input dataset and the trained model from the MODEL SCRIPTS section for this container on NGC, following the steps in the Setup section.
10.29.6.2. Step 1
Switch to your working directory (e.g. test_seg).
10.29.6.3. Step 2
Create, if they do not exist, the following directories under your working directory:

- input, containing the input image file
- output, for the segmentation output
- publish, for publishing data for the Render Server
- logs, for the log files
- models, containing models copied from the segmentation_ct_covid_lesion_v1 folder
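The directory layout above can also be created programmatically, which is convenient when the test is repeated in several working directories. The following is a minimal sketch under that assumption. The helper name is hypothetical, and it does not copy the input image or the model files. Note that the sample script in Step 3 mounts input/mhd rather than input, so the input image may need to be placed in that subfolder.

```python
from pathlib import Path

# Directory names taken from the list above.
WORK_SUBDIRS = ("input", "output", "publish", "logs", "models")


def create_work_dirs(workdir):
    """Create the expected subdirectories under `workdir` if they do not exist."""
    base = Path(workdir)
    created = []
    for name in WORK_SUBDIRS:
        path = base / name
        # exist_ok=True makes the call safe to repeat on an existing layout.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    for path in create_work_dirs("."):
        print(path)
```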
10.29.6.4. Step 3
In your working directory:

- Create a shell script (run_docker.sh, or another name if you prefer).
- Copy the sample content below and change APP_NAME to the full name of this Docker image, e.g. nvcr.io/ea-nvidia-clara/clara/ai-covid-lesion:0.5.0-2004.5.
- Save the file.
#!/bin/bash
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
# Clara Platform server would launch the container with the following environment variables internally,
# to provide runtime information.
export NVIDIA_CLARA_TRTISURI="localhost:8000"
# Default app name unless overwritten by command line argument
APP_NAME="app_covid_lesion"
# Default model name, used by the default app. If blank, all available models will be loaded.
MODEL_NAME="segmentation_ct_covid_lesion_v1"
INPUT_TYPE="mhd"
# Customize the app and model name
if [ "$1" != "" ]; then
    APP_NAME="$1"
    echo "Application name: $1"
    if [ "$2" != "" ]; then
        echo "Model name: $2"
    else
        echo "Model name is not entered. All models will be loaded"
    fi
    MODEL_NAME="$2"
else
    echo "Default app and its model are used: ${APP_NAME}, ${MODEL_NAME}"
fi
# Specific version of the Triton Inference Server image used in testing
TRITON_IMAGE="nvcr.io/nvidia/tritonserver:20.07-v1-py3"
# Docker network used by the app and Triton Docker container.
NETWORK_NAME="container-demo"
# Create network
docker network create ${NETWORK_NAME}
# Run Triton (name: triton), mapping ./models/${MODEL_NAME} to /models/${MODEL_NAME}
# (localhost:8000 will be used)
RUN_TRITON="nvidia-docker run --name triton --network ${NETWORK_NAME} -d --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    -p 8000:8000 \
    -v $(pwd)/models/${MODEL_NAME}:/models/${MODEL_NAME} ${TRITON_IMAGE} \
    tritonserver --model-repository=/models"
# Display the command
echo ${RUN_TRITON}
# Run the command to start the inference server Docker
eval ${RUN_TRITON}
# Wait until Triton is ready
triton_local_uri=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' triton)
echo -n "Wait until Triton ${triton_local_uri} is ready..."
while [ $(curl -s ${triton_local_uri}:8000/api/status | grep -c SERVER_READY) -eq 0 ]; do
    sleep 1
    echo -n "."
done
echo "done"
export NVIDIA_CLARA_TRTISURI="${triton_local_uri}:8000"
# Run ${APP_NAME} container.
# Launch the app container with the following environment variables internally
# to provide input/output path information.
docker run --name test_docker --network ${NETWORK_NAME} -t --rm \
-v $(pwd)/input/${INPUT_TYPE}/:/input \
-v $(pwd)/output:/output \
-v $(pwd)/logs:/logs \
-v $(pwd)/publish:/publish \
-e NVIDIA_CLARA_TRTISURI \
-e DEBUG_VSCODE \
-e DEBUG_VSCODE_PORT \
-e NVIDIA_CLARA_NOSYNCLOCK=TRUE \
${APP_NAME}
echo "${APP_NAME} has finished."
# Stop Triton container
echo "Stopping Triton inference server."
docker stop triton > /dev/null
# Remove network
docker network remove ${NETWORK_NAME} > /dev/null
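The script's readiness loop polls indefinitely, so it never returns if Triton fails to start. As an illustration only, the same check can be sketched in Python with an upper bound on the wait time. The /api/status path and the SERVER_READY marker are taken from the script above; the function names, the default timeout, and the injectable fetch argument are assumptions of this sketch.

```python
import http.client
import time
import urllib.request


def fetch_status(uri, timeout=2.0):
    """Return the body of http://<uri>/api/status, or "" if the request fails."""
    try:
        with urllib.request.urlopen(f"http://{uri}/api/status", timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed response.
        return ""


def wait_until_ready(uri, fetch=fetch_status, max_wait=60.0, interval=1.0):
    """Poll until the status text contains SERVER_READY or `max_wait` seconds pass.

    Returns True when the server reports ready, False on timeout.
    """
    deadline = time.monotonic() + max_wait
    while True:
        if "SERVER_READY" in fetch(uri):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    ready = wait_until_ready("localhost:8000", max_wait=10.0)
    print("Triton is ready" if ready else "Timed out waiting for Triton")
```

Passing the fetch function as an argument keeps the polling logic testable without a running server.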
10.29.6.5. Step 4
Execute the script as shown below and wait for the application container to finish:
./run_docker.sh
10.29.6.6. Step 5
Check for the following output files:

- Segmentation results in the output directory:

  - One file of the same name as your input file, with extension .mhd
  - One file of the same name, with extension .raw

- Published data in the publish directory:

  - Original volume image, in either MHD or NIfTI format
  - Segmentation volume image (<input file name only>.output.mhd and <input file name only>.output.raw)
  - Render Server config file (config_render.json)
  - Metadata file describing the above file (config.meta)
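The checklist above can be turned into a small script that reports which expected files are missing after a run. This is a sketch under stated assumptions: the helper names are hypothetical, the input is assumed to be a MetaImage file such as lung.mhd, and "<input file name only>" is interpreted as the file name without its extension. The copy of the original volume in the publish directory is not checked because its exact name is not stated here.

```python
from pathlib import Path


def expected_files(input_file, workdir="."):
    """Build the list of files the checklist expects for a given input file name."""
    stem = Path(input_file).stem  # e.g. "lung.mhd" gives "lung"
    base = Path(workdir)
    return [
        base / "output" / f"{stem}.mhd",
        base / "output" / f"{stem}.raw",
        base / "publish" / f"{stem}.output.mhd",
        base / "publish" / f"{stem}.output.raw",
        base / "publish" / "config_render.json",
        base / "publish" / "config.meta",
    ]


def missing_files(input_file, workdir="."):
    """Return the expected files that do not exist under `workdir`."""
    return [p for p in expected_files(input_file, workdir) if not p.is_file()]


if __name__ == "__main__":
    missing = missing_files("lung.mhd")
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("All expected files are present.")
```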
10.29.6.7. Step 6
To visualize the segmentation results, any tool that supports MHD or NIfTI can be used, e.g. 3D Slicer.
To see the internals of the container or to run the application within the container, follow these steps:

1. See the above section on how to run the container with the required environment variables and volume mapping, and start the container by replacing the docker run command with the following:

   .. code-block:: bash

      docker run -it --rm --entrypoint /bin/bash

2. Once in the Docker terminal, ensure the current directory is /.
3. Execute the following command:

   .. code-block:: bash

      python3 ./app_base_inference_v2/main.py

4. When finished, type exit.