9.27. Clara Deploy AI Lung Segmentation Operator

CAUTION: This is NOT for diagnostic use.

This asset requires the Clara Deploy SDK. Follow the instructions on the Clara Bootstrap page to install the Clara Deploy SDK.

9.27.1. Overview

This example is a containerized AI inference application, developed for use as one of the operators in the Clara Deploy pipelines. The application is built on the base AI application container, which provides the application framework for deploying models trained with Clara Train TLT. The same execution configuration file, set of transform functions, and scanning window inference logic are used; however, inference is performed on the NVIDIA Triton Inference Server (Triton), formerly known as TensorRT Inference Server (TRTIS).

9.27.2. Inputs

The application, in the form of a Docker container, expects an input folder (/input by default), which can be mapped to a host volume when the Docker container is started. This folder must contain a volume image file in NIfTI or MetaImage format. Furthermore, the volume image must be constructed from a single series of a DICOM study, typically an axial series with the original primary image type. For this operator, a CT series of the lungs is expected.
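As a minimal sketch, assuming the application image is available locally under the hypothetical name my-ai-lung and the current directory contains an input subfolder with the volume image, the folders can be mapped as shown below; the complete invocation used for testing appears in the run_docker.sh script later in this section.

# Map host folders to the /input and /output folders expected by the container.
docker run --rm \
    -v $(pwd)/input:/input \
    -v $(pwd)/output:/output \
    my-ai-lung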

9.27.3. Outputs

The application saves the segmentation results to an output folder, /output by default, which can also be mapped to a folder on the host volume. After the successful completion of the application, a segmentation volume image of format MetaImage is saved in the output folder. The name of the output file is the same as that of the input file due to certain limitations of the downstream consumer.

The example container also publishes data for the Clara Deploy Render Server to the /publish folder by default. The original volume image and segmented volume image, along with a render configuration file, are saved in this folder.

9.27.4. AI Model

The application uses the segmentation_ct_lung_v1 model, which was developed by NIH and NVIDIA for use in a COVID-19 detection pipeline. The model has not yet been published on ngc.nvidia.com. The input tensor is of size 320 x 320 x 64 with a single channel; the output is of the same shape with two channels.

The application also uses the same transform library and configuration file that are used for the validation/inference pipeline during TLT model training. The key model attributes (e.g. the model name) are saved in the file config_inference.json. Note that the input image is resampled in the pre-transforms to adjust the pixel spacing to 0.8 x 0.8 x 5.0 mm.
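To see the model attributes actually used at runtime, the configuration file can be pretty-printed from inside the application container; the exact JSON schema is not reproduced here.

# Inspect the inference configuration shipped in the container
# (path as shown in the directory structure below).
python3 -m json.tool /app_base_inference/config/config_inference.json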

9.27.4.1. NVIDIA Triton Inference Server (formerly known as TRTIS)

The application performs inference on the NVIDIA Triton Inference Server, which provides a cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server.
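For reference, the server's readiness can be checked over its HTTP status endpoint; the run_docker.sh script later in this section performs the same check against the server container's IP address.

# Query the Triton (TRTIS 19.08) status endpoint; SERVER_READY indicates that
# the models are loaded and the server is accepting inference requests.
curl -s localhost:8000/api/status | grep SERVER_READY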

9.27.5. Directory Structure

The directories in the container are shown below.

  • The app_base_inference directory is from the AI base inference container, except for the files in the config directory.

  • In the /opt/nvidia directory, trtis-clients contains the Triton API client library, whereas clara contains the Clara Deploy client library.

  • The sdk_dist directory contains the Clara Train V1.0 transforms library.

/app_base_inference
├── Dockerfile
├── app.py
├── build_fatbin.sh
├── config
│   ├── __init__.py
│   ├── config_inference.json
│   ├── config_render.json
│   └── model_config.json
├── executor.py
├── logging_config.json
├── main.py
├── medical
├── ngc
│   ├── metadata.json
│   └── overview.md
├── patched_medical_tlt2_src
├── public
│   └── docs
│       └── README.md
├── requirements.txt
└── writers
    ├── __init__.py
    ├── classification_result_writer.py
    ├── mhd_writer.py
    └── writer.py

/opt
└── nvidia
    ├── clara
    │   └── clara-0.1-py3-none-any.whl
    └── trtis-clients

/sdk_dist
/input
/output

9.27.6. Executing Operator Locally

To see the internals of the container, or to run the application within the container, follow these steps.

  1. See the next section on how to run the container with the required environment variables and volume mapping, then start the container by replacing the docker run command with the following (a full interactive invocation is sketched after this list):

    docker run -it --entrypoint /bin/bash

  2. Once in the Docker terminal, ensure the current directory is /.

  3. Execute the following command:

    python3 ./app_base_inference/main.py

  4. When finished, type exit.
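Putting these steps together, a complete interactive invocation might look like the sketch below. It reuses the volume mappings and environment variables from the run_docker.sh script in the next section; my-ai-lung is a hypothetical image name (use the actual name of this Docker image), and Triton must already be reachable at the address given in NVIDIA_CLARA_TRTISURI for inference to succeed.

docker run -it --rm --entrypoint /bin/bash \
    -v $(pwd)/input:/input \
    -v $(pwd)/output:/output \
    -v $(pwd)/logs:/logs \
    -v $(pwd)/publish:/publish \
    -e NVIDIA_CLARA_TRTISURI="localhost:8000" \
    -e NVIDIA_CLARA_NOSYNCLOCK=TRUE \
    my-ai-lung

# Inside the container shell:
cd /
python3 ./app_base_inference/main.py
exit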

9.27.7. Executing Operator in Docker

9.27.7.1. Prerequisites

  1. Check if the Docker image of Triton (formerly TRTIS) has been imported into the local Docker repository with the following command:

    docker images | grep tensorrtserver

  2. Look for the image name tensorrtserver with the correct tag for the release, e.g. 19.08-py3. If the image does not exist locally, it can be pulled from the NVIDIA Docker registry (see the pull command after this list).

  3. Download both the input dataset and the trained model from the MODEL SCRIPTS section for this container on NGC, following the steps in the Setup section.
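If the image is not present locally, it can be pulled explicitly; the tag below matches the version used by the run_docker.sh script later in this section.

# Pull the Triton (TRTIS) image used in testing from the NVIDIA registry.
docker pull nvcr.io/nvidia/tensorrtserver:19.08-py3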

9.27.7.2. Step 1

Switch to your working directory (e.g. test_seg).

9.27.7.3. Step 2

If they do not exist, create the following directories under your working directory (a sketch of the commands follows the list):

  • input containing the input image file

  • output for the segmentation output

  • publish for publishing data for the Render Server

  • logs for the log files

  • models containing the model copied from the segmentation_ct_lung_v1 folder
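A short sequence such as the following creates the empty directories; the model source path is illustrative and should point to the segmentation_ct_lung_v1 folder downloaded from NGC.

# Create the working directories expected by run_docker.sh.
mkdir -p input output publish logs models
# Copy the downloaded model into the models directory (source path is illustrative).
cp -r /path/to/segmentation_ct_lung_v1 models/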

9.27.7.4. Step 3

In your working directory,

  • Create a shell script (run_docker.sh, or another name if you prefer).

  • Copy the sample content below, changing APP_NAME to the full name of this Docker image, e.g. nvcr.io/ea-nvidia-clara/clara/ai-lung:0.5.0-2004.5.

  • Save the file.

#!/bin/bash

# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
TESTDATA_DIR=$(readlink -f "${SCRIPT_DIR}"/../test-data)

# Default app name. Change to the actual name, e.g. `nvcr.io/ea-nvidia-clara/clara/ai-lung:0.5.0-2004.5`
APP_NAME="app_lung"
# Default model name, used by the app. If blank, all available models will be loaded.
MODEL_NAME="segmentation_ct_lung_v1"
# Format of the input image used in testing
INPUT_TYPE="nii"

# When run in a pipeline, Clara Deploy launches the container with the following
# environment variable to provide runtime information. It is set here for local testing.
export NVIDIA_CLARA_TRTISURI="localhost:8000"

# Specific version of the Triton Inference Server image used in testing
TRTIS_IMAGE="nvcr.io/nvidia/tensorrtserver:19.08-py3"

# Docker network used by the app and TRTIS Docker container.
NETWORK_NAME="container-demo"

# Create network
docker network create ${NETWORK_NAME}

# Run TRTIS (container name: trtis), mapping ./models/${MODEL_NAME} to /models/${MODEL_NAME}
# (localhost:8000 will be used)
RUN_TRITON="nvidia-docker run --name trtis --network ${NETWORK_NAME} -d --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    -p 8000:8000 \
    -v $(pwd)/models/${MODEL_NAME}:/models/${MODEL_NAME} ${TRTIS_IMAGE} \
    trtserver --model-store=/models"
# Run the command to start the inference server Docker
eval ${RUN_TRITON}
# Display the command
echo ${RUN_TRITON}

# Wait until TRTIS is ready
trtis_local_uri=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' trtis)
echo -n "Wait until TRTIS ${trtis_local_uri} is ready..."
while [ $(curl -s ${trtis_local_uri}:8000/api/status | grep -c SERVER_READY) -eq 0 ]; do
    sleep 1
    echo -n "."
done
echo "done"

export NVIDIA_CLARA_TRTISURI="${trtis_local_uri}:8000"

# Run ${APP_NAME} container.
# Launch the app container with the following environment variables internally
# to provide input/output path information.
docker run --name test_docker --network ${NETWORK_NAME} -it --rm \
    -v $(pwd)/input/${INPUT_TYPE}/:/input \
    -v $(pwd)/output:/output \
    -v $(pwd)/logs:/logs \
    -v $(pwd)/publish:/publish \
    -e NVIDIA_CLARA_TRTISURI \
    -e DEBUG_VSCODE \
    -e DEBUG_VSCODE_PORT \
    -e NVIDIA_CLARA_NOSYNCLOCK=TRUE \
    ${APP_NAME}

echo "${APP_NAME} has finished."

# Stop TRTIS container
echo "Stopping Triton(TRTIS) inference server."
docker stop trtis > /dev/null

# Remove network
docker network remove ${NETWORK_NAME} > /dev/null

9.27.7.5. Step 4

Execute the script as shown below and wait for the application container to finish:

./run_docker.sh
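If the shell reports a permission error, make the script executable first and run it again:

chmod +x run_docker.sh
./run_docker.sh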

9.27.7.6. Step 5

Check for the following output files (a quick listing check is sketched after the list):

  1. Segmentation results in the output directory:

    • One file of the same name as your input file, with extension .mhd

    • One file of the same name, with extension .raw

  2. Published data in the publish directory:

    • Original volume image, in either MHD or NIfTI format

    • Segmentation volume image (<input file name only>.output.mhd and <input file name only>.output.raw)

    • Render Server config file (config_render.json)

    • Metadata file describing the above files (config.meta)
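A quick way to verify the results from the working directory is to list both folders, as sketched below with a hypothetical input file named lung_ct.nii.

# With an input named lung_ct.nii, expect lung_ct.mhd and lung_ct.raw in output/,
# and lung_ct.output.mhd, lung_ct.output.raw, config_render.json, and config.meta
# (plus the original volume image) in publish/.
ls output publish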

9.27.7.7. Step 6

To visualize the segmentation results, any tool that supports MHD or NIfTI can be used, e.g. 3D Slicer.

9.27.8. License

An End User License Agreement is included with the product. By pulling and using the Clara Deploy asset on NGC, you accept the terms and conditions of these licenses.

9.27.9. Suggested Reading

Release Notes, the Getting Started Guide, and the SDK itself are available at the NVIDIA Developer forum.

For answers to any questions you may have about this release, visit the NVIDIA Devtalk forum.