10.1. Clara Deploy Operator Development Guide

This section describes how to develop an application to run in a Docker container on the NVIDIA Clara Deploy SDK, with emphasis on AI inference applications. It also describes the steps to configure AI models and make them available to the NVIDIA TensorRT Inference Server (TRTIS).

10.1.1. Role of an Operator in Clara Deploy

The NVIDIA Clara Deploy SDK provides an open, scalable computing platform that enables development of medical imaging applications for hybrid (embedded, on-premise, or cloud) computing environments to create intelligent instruments and automated healthcare pipelines. Applications are containerized and are used as operators in pipelines. A containerized application can be used in more than one operator in a pipeline, and can also be used in different pipelines. Please refer to the Introduction and Core Concepts sections for additional details.

There is no specific library that an application needs to depend on in order to integrate with and deploy on the Clara Deploy platform. Containerized applications can be used in operators and deployed in Clara Deploy pipelines. Please see the section on Pipeline for details.

The primary I/O model for an operator, and its constituent application container, is through persistent volumes mounted by the Clara Deploy Core server. An operator can define any number of inputs, each with its own path on a volume or a rooted directory. Each input also specifies which other operator's output it receives data from; if no other operator is named, the initial payload of the pipeline becomes the input. This is how the nodes of a directed acyclic graph (DAG) are connected, and in Clara Deploy the pipeline definition is a DAG.
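As an illustration, the input/output wiring described above can be sketched in a pipeline definition similar to the following. The operator names, image names, and exact schema fields here are placeholders; consult the Pipeline section for the authoritative format.

```
api-version: "0.3.0"          # example version only
name: example-pipeline
operators:
- name: my-reader             # placeholder operator
  container:
    image: my-reader-image
  input:
  - path: /input              # no "from": receives the initial pipeline payload
  output:
  - path: /output
- name: my-ai-app             # placeholder operator
  container:
    image: my-ai-app-image
  input:
  - from: my-reader           # consumes my-reader's output
    path: /input
  output:
  - path: /output
```

The `from` field on an input is what connects the nodes of the DAG: `my-ai-app` runs only after `my-reader` has produced its output.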

Operators can also define environment variables that will be set by the Clara Deploy Core server at runtime, as well as dependent services. Examples are given in the specific operators, e.g., the AI operator.
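For example, an application's entry script can read such variables and fall back to development defaults when run outside the platform. The fallback value below is an assumption for local testing; only `NVIDIA_CLARA_TRTISURI` is a variable named in this guide.

```shell
#!/bin/sh
# NVIDIA_CLARA_TRTISURI is set by Clara Core at runtime; default to a local
# TRTIS endpoint when testing outside the platform (assumed development value).
TRTIS_URI="${NVIDIA_CLARA_TRTISURI:-localhost:8000}"
echo "Using TRTIS at ${TRTIS_URI}"
```

This keeps the container usable both under Clara Deploy and in a standalone development environment.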

10.1.2. Building and Testing Application Container Image

Applications used in Clara Deploy operators must run in a Docker container, and should be tested in a development environment before being deployed. The expected volumes and environment variables need to be set, simulating what will be set by the Clara Deploy platform. Dependent services also need to be started, e.g., via scripts.

The following is a sample script that builds the container image, starts the dependent service, mounts the volumes according to the operator definition, sets the environment variables, and runs the container on a named network.


# Copyright (c) 2019, NVIDIA CORPORATION.  All rights reserved.
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
#     ---- EXAMPLE ONLY ----

# Define the name of the app (aka operator), assumed the same as the project folder name
APP_NAME="my_app"  # example value

# Build the Docker image from the app's Dockerfile
docker build -t ${APP_NAME} -f ${APP_NAME}/Dockerfile .

# Define the TensorRT Inference Server Docker image. Example only.
TRTIS_IMAGE="nvcr.io/nvidia/tensorrtserver:19.08-py3"  # example tag

# Clara Core would launch the container with the following environment variables internally,
# to provide runtime information.
export NVIDIA_CLARA_TRTISURI="localhost:8000"

# Define the model name, for use when launching TRTIS with only that specific model
MODEL_NAME="my_model"  # example value

# Define a Docker network name so that the containers can communicate on this network
NETWORK_NAME="container-demo"  # example value

# Create the network
docker network create ${NETWORK_NAME}

# Run TRTIS (name: trtis), mapping ./sampleData/models/${MODEL_NAME} to /models/${MODEL_NAME}
# (localhost:8000 will be used)
cp -f $(pwd)/${APP_NAME}/config/config.pbtxt $(pwd)/sampleData/models/${MODEL_NAME}/
nvidia-docker run --name trtis --network ${NETWORK_NAME} -d --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    -p 8000:8000 \
    -v $(pwd)/sampleData/models/${MODEL_NAME}:/models/${MODEL_NAME} ${TRTIS_IMAGE} \
    trtserver --model-store=/models

# Wait until TRTIS is ready
trtis_local_uri=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' trtis)
echo -n "Wait until TRTIS ${trtis_local_uri} is ready..."
while [ $(curl -s ${trtis_local_uri}:8000/api/status | grep -c SERVER_READY) -eq 0 ]; do
    sleep 1
    echo -n "."
done
echo "done"

export NVIDIA_CLARA_TRTISURI="${trtis_local_uri}:8000"

# Run ${APP_NAME} container.
# Like below, Clara Core would launch the app container with the following environment variables internally,
# to provide input/output path information.
docker run --name ${APP_NAME} --network ${NETWORK_NAME} -it --rm \
    -v $(pwd)/input:/input \
    -v $(pwd)/output:/output \
    -v $(pwd)/logs:/logs \
    -v $(pwd)/publish:/publish \
    -e NVIDIA_CLARA_TRTISURI \
    ${APP_NAME}
echo "${APP_NAME} is done."

# Stop TRTIS container
echo "Stopping TRTIS"
docker stop trtis > /dev/null

# Remove network
docker network rm ${NETWORK_NAME} > /dev/null

10.1.3. Publishing an Application Docker Image

The Clara Deploy SDK currently does not provide an API to upload an application Docker image into its own registry. An application Docker image needs to be copied to the Clara Deploy host server, and then loaded into the host's local Docker image store.

Please see Docker documentation on how to save and load a Docker image.
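A minimal sketch of this workflow, assuming SSH access to the host; the image name, user, and host below are placeholders:

```shell
# Example only: "my_app", "user", and "clara-host" are placeholders.
docker save my_app:latest | gzip > my_app.tar.gz   # export the image to a compressed tarball
scp my_app.tar.gz user@clara-host:/tmp/            # copy it to the Clara Deploy host
ssh user@clara-host 'gunzip -c /tmp/my_app.tar.gz | docker load'  # load it on the host
```

After `docker load` completes, the image appears in the output of `docker images` on the host and can be referenced by pipeline definitions.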

10.1.4. Uploading AI Models

For AI inference applications, trained models need to be made available to the NVIDIA TensorRT Inference Server. For details on supported models, model configuration, and the TRTIS model repository, please refer to the official NVIDIA documentation at the following link: https://docs.nvidia.com/deeplearning/sdk/tensorrt-inference-server-guide/docs/.
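As an illustration, a model configuration file (config.pbtxt) for a single-input, single-output TensorFlow model might look like the following. The model name, tensor names, and dimensions are placeholders; the authoritative schema is in the TRTIS documentation linked above.

```
name: "my_model"                  # example model name; must match the model folder name
platform: "tensorflow_graphdef"
max_batch_size: 8
input [
  {
    name: "input_tensor"          # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 256, 256, 1 ]
  }
]
output [
  {
    name: "output_tensor"         # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 256, 256, 2 ]
  }
]
```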

After the model and configuration files are ready, they must be copied onto the Clara Deploy host server and saved in the shared storage folder, namely /clara/common/models, which is mapped to the models folder of the TRTIS container.
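For example, a single versioned model would be laid out as follows on the host. The model and file names are placeholders, and the model file itself depends on the framework (e.g., a TensorFlow GraphDef, a TensorRT plan, or an ONNX file):

```
/clara/common/models/
└── my_model/               # example model name; matches "name" in config.pbtxt
    ├── config.pbtxt        # model configuration
    └── 1/                  # version subdirectory
        └── model.graphdef  # framework-specific model file
```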

10.1.5. Creating and Testing Pipelines

Once the application Docker image is available on the Clara Deploy host, a pipeline can be created to utilize it. Please refer to the section on pipeline creation for details.