Clara Application Container Developer Guide
This section describes how to develop an application, specifically an AI inference application, to run in a Docker container on the NVIDIA Clara Deploy SDK. Also described are the steps to configure AI models and make them available to the NVIDIA TensorRT Inference Server (TRTIS).
Using the Clara Deploy SDK
The NVIDIA Clara Deploy SDK includes the Clara Workflow Driver library. The library is written in the C programming language and targets the Linux platform, but it also comes with bindings for C# and Python. In this section, the samples use the Python binding.
In this release, the Clara Deploy SDK Python binding is distributed as a Python wheel, namely sdk_dist/clara-0.1-py3-none-any.whl. It defines a set of callback functions that an application must implement in order to integrate with the Clara Deploy SDK. The startup and cleanup functions are called at the beginning and end of the application life cycle, whereas prepare is called when a specific job is ready to be started, and execute is called when data is available for the specific job. For more details, see the Workflow Driver library section.
The following are the event callbacks that must be implemented using the decorators:
@clara.startup_cb
@clara.prepare_cb
@clara.execute_cb
@clara.cleanup_cb
@clara.error_cb
The following sample code shows how these callback functions are implemented as global functions, though they can also be implemented as class functions.
main.py
import clara

@clara.startup_cb
def startup():
    print("[clara] startup() is called. Do app init as needed.")

@clara.prepare_cb
def prepare():
    print("[clara] prepare() is called. Prepare for a new job.")

@clara.execute_cb
def execute(payload):
    print("[clara] execute() is called. Execute on input data and save output.")

@clara.cleanup_cb
def cleanup():
    print("[clara] cleanup() is called. Clean up resources.")

@clara.error_cb
def error(result, message, entry_name, kind):
    print(result, message, entry_name, kind)

if __name__ == "__main__":
    clara.start()
    # Wait until all callbacks are called.
    clara.wait_for_completion()
The execute callback accepts a payload parameter. You can get input/output file paths using code similar to the following:
@clara.execute_cb
def execute(payload):
    print("Input file paths:")
    for input_stream in payload.inputs:
        print(" - {}".format(input_stream.path))
    print("Output file paths:")
    for output_stream in payload.outputs:
        print(" - {}".format(output_stream.path))
The Clara Deploy SDK Python wheel file, and the TRTIS client wheel file in the case of an AI inference application, need to be copied to your project, and installed in the Python virtual environment.
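For example, with pipenv (which the sample Dockerfile shown later in this section also uses), the wheels and the remaining dependencies can be installed into the project's virtual environment roughly as follows; the wheel file names match the sample package structure shown later in this guide:
#!/bin/bash
# Install the Clara Workflow Driver and TRTIS client wheels into the pipenv environment.
pipenv install --python 3.5 ./sdk_dist/clara-0.1-py3-none-any.whl \
    ./trtis_client/tensorrtserver-0.11.0-cp35-cp35m-linux_x86_64.whl --skip-lock
# Install the remaining application dependencies.
pipenv run pip install -r requirements.txt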
For samples of Python package structure, please refer to sections on reference workflow containers.
Testing in Development Environment
The Clara Deploy SDK Workflow Driver supports a testing mode enabled by the following environment variables:
#!/bin/bash
export NVIDIA_CLARA_RUNSIMPLE=TRUE
export NVIDIA_CLARA_JOBID=$(cat /proc/sys/kernel/random/uuid) # Random uuid
export NVIDIA_CLARA_JOBNAME="test-vnet"
export NVIDIA_CLARA_STAGENAME="ai"
export NVIDIA_CLARA_APPDIR=$(pwd)
export NVIDIA_CLARA_INPUTS="input/recon.mhd"
export NVIDIA_CLARA_OUTPUTS="output/recon.seg.mhd"
export NVIDIA_CLARA_TRTISURI="localhost:8000"
export NVIDIA_CLARA_PUBLISHPATH="publish"
With these environment variables set according to your local environment, the Workflow Driver invokes the callbacks once your application starts. For example scripts to run tests locally, please see sections on reference workflow containers.
Building and Testing Application Docker Image
Applications for the Clara Deploy SDK must run in a Docker container, and should be tested in a development environment before being deployed on the Clara Deploy SDK. For a sample script to test the Docker image in a development environment, please see the sections on reference workflow containers.
The following is a sample Dockerfile:
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
FROM ubuntu:16.04
ENV PYVER=3.5
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
# Use the same user name and home folder regardless of actual user.
ENV USER=root
ENV HOME=/root
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
python3.5 \
curl \
libcurl3 \
rsync \
ca-certificates \
&& curl -O https://bootstrap.pypa.io/get-pip.py \
&& rm -rf /var/lib/apt/lists/*
# Make /usr/bin/python point to the $PYVER version of python
RUN rm -f /usr/bin/python \
&& rm -f /usr/bin/python`echo $PYVER | cut -c1-1` \
&& ln -s /usr/bin/python$PYVER /usr/bin/python \
&& ln -s /usr/bin/python$PYVER /usr/bin/python`echo $PYVER | cut -c1-1` \
&& python$PYVER get-pip.py \
&& rm get-pip.py
WORKDIR /app
RUN mkdir -p input output
# Copy code, scripts and packages
COPY ./sdk_dist/*.whl ./sdk_dist/
COPY ./trtis_client/*.whl ./trtis_client/
COPY ./app_vnet ./app_vnet
COPY ./Pipfile ./
# Remove unnecessary files in 3rd-party package (python3.5/site-packages/future/backports/test/*.pem)
# Allow any users to access /root/.cache and /root/.local for pip/pipenv
RUN pip install --upgrade setuptools pipenv \
&& pipenv install --python 3.5 ./trtis_client/*.whl ./sdk_dist/*.whl --skip-lock \
&& pipenv run pip install -r ./app_vnet/requirements.txt \
&& rm -rf $(pipenv --venv)/lib/python3.5/site-packages/future/backports/test/*.pem \
&& rm -rf /root/.cache/* \
&& chmod -R 777 /root
ENTRYPOINT ["pipenv", "run", "python", "app_vnet/main.py"]
Publishing an Application Docker Image
The Clara Deploy SDK currently does not provide an API to upload an application Docker image into its own registry. An application Docker image needs to be copied to the Clara Deploy SDK host, and then imported into the host’s Docker.
Please see Docker documentation on how to save and import a Docker image.
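For example, assuming a hypothetical image name my_app:0.1 and a hypothetical host name clara-host, the image can be saved, copied, and imported roughly as follows:
#!/bin/bash
# Save the application image to a tar archive (image name is a placeholder).
docker save -o my_app_0.1.tar my_app:0.1
# Copy the archive to the Clara Deploy SDK host (host name is a placeholder).
scp my_app_0.1.tar user@clara-host:/tmp/
# On the Clara Deploy SDK host, import the image into the host's Docker.
docker load -i /tmp/my_app_0.1.tar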
Uploading AI Models
For an AI inference application, trained models need to be made available to the NVIDIA TensorRT Inference Server. For details on what models are supported, how to configure a model, and the TRTIS model repository, please refer to the official NVIDIA documentation at the following link: https://docs.nvidia.com/deeplearning/sdk/tensorrt-inference-server-guide/docs/.
After the model and configuration files are ready, they must be copied onto the Clara Deploy SDK, and saved in the shared storage folder that is mapped to the models folder of the TRTIS container, e.g. /clara-io/models.
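As an illustration, assuming a model laid out like the v_net example in the sample package structure, it could be copied into the shared storage folder roughly as follows; the destination path may differ in your deployment:
#!/bin/bash
# Copy the model repository folder (config.pbtxt plus versioned model folders)
# into the shared storage folder mapped to the TRTIS models folder.
cp -r ./sampleData/models/v_net /clara-io/models/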
Creating and Testing Workflow
Once the application Docker image is available on the Clara Deploy SDK, a workflow can be created to utilize it. Please refer to the section on workflow creation for details.
The NVIDIA Clara Pipeline Driver, or WFD, is an essential piece of pipeline orchestration. The WFD is provided as a library to be included as part of your pipeline stage's worker process.
Using the Library Shared Object
Bind the library to your source code. In C or C++ this is as easy as adding libnvclara.so to your MAKE file and including clara.h in your source code. If you are using a language like Python, Java, Node.js, or C#, use that language's method of binding compiled binaries: Python uses ctypes (for more information, see the Python Standard Library documentation). Additionally, NVIDIA has provided an already generated Python library (see the Python APIs section).
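As an illustration, a C application could be compiled and linked against the library roughly as follows; the source file name and the include/library paths are placeholders that depend on where the SDK is installed:
#!/bin/bash
# Compile a stage worker against the Clara Pipeline Driver shared object.
# Replace the include and library paths with the actual SDK install locations.
gcc -o my_stage my_stage.c -I/path/to/clara/include -L/path/to/clara/lib -lnvclara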
Once your process has started, call nvidia_clara_wfd__create with appropriate function pointers, and keep a handle on the wfd_out value.
Have your code do whatever it needs to do to get started, then have it wait on the callbacks from WFD.
WFD calls your provided functions in a specific event order: startup, prepare, execute, cleanup.
When the process has completed, call nvidia_clara_wfd__release with the reference you took from nvidia_clara_wfd__create.
Optionally, your process can block the current thread and wait for callback completion to occur by calling nvidia_clara_wfd__wait_for_completion and passing in the WFD pointer from nvidia_clara_wfd__create. This blocks the calling thread until all other life-cycle events have completed.
Pipeline Client Process Life-Cycle
WFD provides life-cycle orchestration for your process. Once the WFD instance has been created, your process receives event callbacks in the following order:
Start Up
This is an early initialization phase where your process can do any required early-stage setup or initialization. No pipeline specific data or information is available at this time.
Prepare
This is a pre-execute phase. By this point in the life-cycle all other pipeline stages have been initialized, and the creator of the job (pipeline instance) has supplied all required inputs.
Stage inputs are not yet available.
Execute
This is the execution phase where your process most likely does the majority of its work.
Stage inputs are available, and output streams are awaiting results. When Clara Info indicates that a render service is available, properly prepared study data can be published to the service for 3D visualization. To publish study data, see the Publishing Study Data for Visualization section.
Once this phase completes, the next stage is told to execute. If this is the last stage in the pipeline, the job creator is notified that its job has completed.
Clean Up
This phase occurs after the execute phase has completed and the job has been notified that this stage is complete. Clean up, free, or release any resources that are no longer needed.
Do not delete your results as other stages or the job creator may need those.
example:
/* The example provided is for the Clara Pipeline Driver C Client. */
// Execute callback provided to WFD for handling execution.
int my_execute_callback(nvidia_clara_payload *payload) {
    nvidia_clara_payload_stream **streams = NULL;
    int streams_count = 0;
    result r = 0;

    r = nvidia_clara_payload__get_inputs(payload, &streams, &streams_count);
    if (r != 0) {
        printf("error: Failed to read input streams from payload (%d)\n", r);
        return -1;
    }
    printf("Read %d input stream(s) from the payload.\n", streams_count);

    r = nvidia_clara_payload__get_outputs(payload, &streams, &streams_count);
    if (r != 0) {
        printf("error: Failed to read output streams from payload (%d)\n", r);
        return -1;
    }
    printf("Read %d output stream(s) from the payload.\n", streams_count);

    return 0;
}

void error_callback(int code, char *message, char *source, int is_fatal)
{
    const char _error[] = "error";
    const char _fatal[] = "fatal";
    const char _info[] = "info";
    const char *prefix = NULL;

    if (is_fatal) {
        prefix = _fatal;
    }
    else if (code == 0) {
        prefix = _info;
    }
    else {
        prefix = _error;
    }

    printf("%s: [%s (%d)] %s\n", prefix, source, code, message);
}

// Entry-point for the application.
int main(int argc, char *argv[]) {
    nvidia_clara_wfd *wfd;

    // Create the Pipeline driver instance, this will start the client life-cycle.
    if (nvidia_clara_wfd__create(NULL, NULL, my_execute_callback, NULL, error_callback, &wfd) == 0) {
        // Block the current thread and wait for the pipeline driver to complete.
        nvidia_clara_wfd__wait_for_completion(wfd);

        // Clean up our WFD allocation.
        nvidia_clara_wfd__release(wfd);

        // Return 0 (success!)
        return 0;
    }

    // Return -1 (error)
    return -1;
}
Publishing Study Data for Visualization
NVIDIA Clara supports rendering of specific kinds of study data as 3D visualizations. Accessing the visualizations is done through the web-based dashboard. While pipeline jobs do not have direct access to visualization services, they can publish study data to the Clara-provided Render Server.
Publishing study data is done as part of the Execute phase of a pipeline stage. If a pipeline stage has or produces content to be published for visualization, the nvidia_clara_study_data__publish API can make the study available to the Render Server for visualization.
example:
/* The example provided is for the Clara Pipeline Driver C Client. */
// Execute callback provided to WFD for handling execution.
int publish_stage_execute_callback(nvidia_clara_payload *payload) {
    nvidia_clara_wfd *wfd = NULL;
    nvidia_clara_study_data *study = NULL;
    nvidia_clara_payload_stream **streams = NULL;
    int streams_count = 0;
    int success = -1;

    // First get the WFD reference from the payload.
    if (nvidia_clara_payload__get_wfd(payload, &wfd) == 0) {
        // Next use the WFD reference to create a study.
        if (nvidia_clara_study_data__create(wfd, &study) == 0) {
            // Link all of the inputs to the study for publication.
            if (nvidia_clara_payload__get_inputs(payload, &streams, &streams_count) == 0) {
                int copied = 0;

                for (int i = 0; i < streams_count; i += 1) {
                    if (nvidia_clara_study_data__add_stream(study, streams[i]) == 0) {
                        copied += 1;
                    }
                    else {
                        printf("Failed to copy input stream %d to study.\n", i);
                    }
                }

                printf("Successfully copied %d input stream(s) to study for publication.\n", copied);

                // Publish the study for visualization by Clara Render Server.
                if (nvidia_clara_study_data__publish(study) == 0) {
                    success = 0;
                }
                else {
                    printf("Failed to publish study to visualization service.\n");
                }
            }
            else {
                printf("Failed to read input stream data from payload.\n");
            }

            // Finally release the study allocation.
            nvidia_clara_study_data__release(study);
        }
    }

    return success;
}
// Entry-point for the application.
int main(int argc, char *argv[]) {
    nvidia_clara_wfd *wfd;

    // Create the Pipeline driver instance, this will start the client life-cycle.
    if (nvidia_clara_wfd__create(NULL, NULL, publish_stage_execute_callback, NULL, NULL, &wfd) == 0) {
        // Wait for the pipeline driver to complete.
        nvidia_clara_wfd__wait_for_completion(wfd);

        // Clean up our WFD allocation.
        nvidia_clara_wfd__release(wfd);

        // Return 0 (success!)
        return 0;
    }

    // Return -1 (error)
    return -1;
}
Querying Clara Info
The NVIDIA Clara Pipeline Driver provides a query interface for discovering state about the current pipeline job and/or stage. The query interface is provided by nvidia_clara_info__read, with the nvidia_clara_info info_kind parameter determining the type of information returned.
The API provides the following information:
Job Identifier
This is the unique identifier for the currently running pipeline job. Unique identifiers are 32-character hexadecimal strings which represent 128-bit values.
Job Name
This is the human-readable name given to the currently running pipeline job. This value is intended to provide an easy reference point for humans when looking at user interfaces or log files.
Stage Name
This is the human-readable name given to the currently running pipeline stage. This value is intended to provide an easy reference point for humans when looking at user interfaces or log files.
Stage Timeout
This is a string representing the number of seconds the currently running pipeline stage has been allocated for completion. The stage can be terminated if it exceeds this value. If this value is null or empty, then there is no assigned timeout.
TensorRT Inference Server
This is the URL of the TensorRT Inference Server (TRTIS) associated with the currently running pipeline job. If the value is null or empty, then TRTIS is unavailable and no TRTIS instance is associated with the currently running pipeline job.
Render Service Availability
This returns a Boolean value which indicates if the currently running pipeline stage has access to publish study data to a visualization service.
NOTE: This API is in its pre-alpha stages, and is subject to changes in future releases of NVIDIA Clara Pipeline Driver.
example:
/* The example provided is for the Clara Pipeline Driver C Client. */
// Entry-point for the application.
int main(int argc, char *argv[]) {
    char allocation[sizeof(strbuf) + sizeof(char) * 4096]; // 4KiB allocation to use as a buffer.
    strbuf *buffer = (strbuf *)allocation; // Utilize our local buffer as a string buffer to be used to query information from Clara.
    result r = 0;

    buffer->size = 4096;

    /* Query Clara to discover the unique identifier for the current job. */
    if ((r = nvidia_clara_info__read(NVIDIA_CLARA_INFO_JOB_ID, buffer)) != 0) {
        printf("Querying Clara for Job ID failed with error: %d\n", r);
    }
    printf("The current Job ID is %s.\n", buffer);

    /* Query Clara to discover the name of the current job. */
    if ((r = nvidia_clara_info__read(NVIDIA_CLARA_INFO_JOB_NAME, buffer)) != 0) {
        printf("Querying Clara for Job Name failed with error: %d\n", r);
    }
    printf("The current Job Name is %s.\n", buffer);

    /* Query Clara to discover the name of the current job stage. */
    if ((r = nvidia_clara_info__read(NVIDIA_CLARA_INFO_STAGE_NAME, buffer)) != 0) {
        printf("Querying Clara for Stage Name failed with error: %d\n", r);
    }
    printf("The current Stage Name is %s.\n", buffer);

    /* Query Clara to discover how long the current job stage has to complete its work. */
    if ((r = nvidia_clara_info__read(NVIDIA_CLARA_INFO_STAGE_TIMEOUT, buffer)) != 0) {
        printf("Querying Clara for Stage Timeout failed with error: %d\n", r);
    }
    printf("The current Stage Timeout is %s seconds.\n", buffer);

    /* Query Clara to discover if TRTIS is available to the current job stage. */
    if ((r = nvidia_clara_info__read(NVIDIA_CLARA_INFO_TRTIS_SERVICE, buffer)) != 0) {
        printf("Querying Clara for TRTIS URL failed with error: %d\n", r);
    }
    printf("The current TensorRT Inference Server URL is %s.\n", buffer);

    /* Query Clara to discover if the current job stage supports study publication. */
    /* Publication is available when the query result is 0. */
    if ((r = nvidia_clara_info__read(NVIDIA_CLARA_INFO_RENDER_SERVICE, NULL)) == 0) {
        printf("Study publication is available.\n");
    }
    else {
        printf("Study publication is not available.\n");
    }

    return 0;
}
Model Used
The Transfer Learning Toolkit for Medical Imaging provides pre-trained models unique to medical imaging, plus additional capabilities such as integration with the AI-assisted Annotation SDK for speeding up annotation of medical images. This gives you access to AI-assisted labeling [Reference].
This example application uses a model provided by the NVIDIA Transfer Learning Toolkit for liver tumor segmentation. Only a binary of the TLT components is shared.
Input File Format
The input and output file format for this application is MetaImage. The expected input format is a study with a single series.
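For reference, a minimal MetaImage header (the .mhd file) for a volumetric input looks roughly like the following; the dimensions, spacing, and element type are illustrative only, and the raw pixel data is stored in the companion file named by ElementDataFile:
ObjectType = Image
NDims = 3
BinaryData = True
DimSize = 512 512 128
ElementSpacing = 0.7 0.7 2.5
ElementType = MET_SHORT
ElementDataFile = recon.raw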
Folder structure:
app_liver_tumor
├── config_validation.json
├── Dockerfile
├── main.py
├── README.md
├── requirements.txt
├── tlt-aiaa
│ └── medical
├── tlt_clara.py
└── transforms
    ├── clarawriters.py
    └── clarareaders.py
sampleData
└── models
├── segmentation_liver_v1
│ ├── 1
│ │ └── model.graphdef
│ └── config.pbtxt
└── v_net
├── 1
│ └── model.savedmodel
│ ├── saved_model.pb
│ └── variables
│ ├── variables.data-00000-of-00001
│ └── variables.index
└── config.pbtxt
Application Dependencies for Clara Integration
Clara Integration of the application has the following dependencies:
NVIDIA Clara Workflow Driver (WFD)
NVIDIA TensorRT Inference Server (TRTIS)
Workflow Driver
The NVIDIA Clara Workflow Driver (WFD) is an essential piece of workflow orchestration. The WFD is available as a library which is included as part of the worker process of a workflow stage.
NVIDIA TensorRT Inference Server (TRTIS)
The NVIDIA TensorRT Inference Server provides a cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server. Read more on TRTIS
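For example, assuming TRTIS is reachable at localhost:8000 (the address used by the local test scripts later in this guide), its readiness can be checked with a simple HTTP status request; this is the same check the sample run scripts perform while waiting for the server to start:
#!/bin/bash
# Query the TRTIS HTTP status endpoint and report whether the server is ready.
curl -s localhost:8000/api/status | grep -q SERVER_READY && echo "TRTIS is ready"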
Control Flow
+-----------------+
| Previous stage |
+-----------------+
|
|mhd
|
+-------------v----------------+ +------------------------+
| mhd to numpy (3D volume data)| | config validation.json |
+------------------------------+ +------------------------+
| |
|mhd json+--------------+
| |
+---v-------------------v-+ +---------------------------------+
| Segmentation/Inferencing+<---------->+Transfer Learning Toolkit Adapter|
+-------------------------+ numpy +---------------------------------+
| |
|numpy |
| +--v--+
+--------v---------+ |TRTIS|
|numpy to mhd image| +-----+
+------------------+
|
|mhd
|
+----v------+
|Next stage |
+-----------+
Executing Liver Segmentation Application Locally
When developing and testing the Liver Segmentation Example application, you can test it locally with the following command:
./run_liver_local.sh
This executable sets up the environment, installs the prerequisites, and runs the application locally. The input, output, and publish directories are set using NVIDIA_CLARA_INPUTS, NVIDIA_CLARA_OUTPUTS, and NVIDIA_CLARA_PUBLISHPATH, respectively, in run_liver_local.sh. The publish folder (in the sample app, the publish subfolder) is used to store published study data via the StudyData.publish() API, and the folder is consumed by the Render Server when the application is executed as part of a whole workflow in Clara.
Executing Liver Segmentation Application in Docker
Execute Liver Segmentation locally with the following command:
./run_liver_docker.sh
This executable sets up the environment, builds the Docker image, runs TRTIS, runs the Docker container, installs prerequisites, and runs the application inside the container. The input, output, and publish directories that get mounted to the Docker container are set using NVIDIA_CLARA_INPUTS, NVIDIA_CLARA_OUTPUTS, and NVIDIA_CLARA_PUBLISHPATH, respectively, in run_liver_docker.sh.
The publish folder (in the sample application, the publish subfolder) is used to store published study data via the StudyData.publish() API, and the folder is consumed by the Render Server when the application is executed as part of a whole workflow in Clara.
Overview
This sample AI inference application, which provides multi-organ segmentation on abdominal CT with dense v-networks, is not intended for diagnosis or any medical use. It only serves to demonstrate how a medical imaging AI inference application can be developed and run on the NVIDIA Clara Deploy SDK. This is a workflow compliant reference application that uses the Clara Workflow Driver Python API.
This application is designed to run on a Docker container on the NVIDIA Clara Deploy SDK, so it is dependent on the Clara Workflow Driver package and must implement the required callback methods. The application must also use the NVIDIA TensorRT Inference Server (TRTIS) which is hosted as a service on the Clara Deploy SDK. TRTIS Version 0.11.0 Beta is supported in this release. To use TRTIS inference API, the application is dependent on the TRTIS Python API client package.
This AI inference application uses the MetaIO image format for its input and output files. This is partly because the sample DICOM data ingestor shipped with the Clara Deploy SDK converts DICOM images into MetaIO format. Ingestors supporting other image data formats, e.g. the Nifti format, or other types of data, can be developed for specific use cases.
This application accepts a single volumetric image in the input file. For tests running the application in a development environment, a MetaIO image file (two physical files: a header and a raw pixel file) needs to be staged in the input folder. When running on the Clara Deploy SDK, DICOM images are sent to the Clara DICOM Adapter, which converts them into a MetaIO image file as input to this application. Only a single series of a DICOM study, typically the original primary axial series, should be exported to the Clara DICOM ingestor.
Package Structure
The application package structure is shown below. It serves as guidance for structuring source code to build and test a Clara container application; however, you are free to choose a structure based on your own preferences and best practices.
The Clara Workflow Driver API Python client package is distributed as a Python wheel, as is the NVIDIA TensorRT Inference Server Python API client package. Specific versions of these wheels are shown in the package structure, and they need to be installed in the Python virtual environment for the application.
.
├── app_vnet # V-net (dense v-net) application
│ ├── main.py # - include main method implementing callbacks
│ ├── app.py # - include inference call core method
│ ├── Dockerfile # - include Dockerfile for generating vnet app container
│ ├── requirements.txt # - List of dependent packages (except local .whl files)
│ ├── transforms.py # - include transformation method
│ └── util.py # - include utility methods for image conversions
├── envsetup.sh # Script for installing pre-requisite
├── input # Test input folder
│ ├── recon.mhd # Test data in MetaIO image format, aka, mhd.
│ └── recon.raw
├── output # Test output folder (output mhd files will be generated here)
├── Pipfile # pipenv's Pipfile to install necessary packages
├── Pipfile.lock # pipenv's lock file for necessary packages
├── README.md # The Readme file
├── requirements.txt # List of dependent packages (except local .whl files)
├── run_vnet_docker.sh # Script to run vnet app in docker container
├── run_vnet_local.sh # Script to run vnet app locally
├── sampleData # Holding vnet model and model configuration for organ segmentation
│ └── models
│ └── v_net
├── sdk_dist # Clara Python client SDKs
│ └── clara-0.1-py3-none-any.whl # Clara client SDK
└── trtis_client # TRTIS client SDK
└── tensorrtserver-0.11.0-cp35-cp35m-linux_x86_64.whl
Please look through the main.py in the app_vnet folder for reference.
Testing Locally
The Clara Workflow Driver supports a mode to run in a development environment, so you can perform functional testing without running the Clara Deploy SDK in the development environment. The following sections show sample scripts to do this locally as well as inside a Docker container. In both cases, the TRTIS container must be pulled from NGC and run locally, and the AI model files must be made accessible to the TRTIS container.
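For example, the TRTIS container image used by the sample scripts can be pulled from NGC as follows; the tag matches the TRTIS_IMAGE variable in those scripts:
#!/bin/bash
# Pull the TRTIS container image referenced by the sample test scripts.
docker pull nvcr.io/nvidia/tensorrtserver:19.02-py3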
Running the VNET Application Locally
Run the script below with the following command:
run_vnet_local.sh
The script is listed below:
#!/bin/bash
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
# Clara Core would launch the container with the following environment variables internally,
# to provide input/output path information.
# (They are subject to change. Do not use the environment variables directly in your application!)
export NVIDIA_CLARA_RUNSIMPLE=TRUE
export NVIDIA_CLARA_JOBID=$(cat /proc/sys/kernel/random/uuid) # Random uuid
export NVIDIA_CLARA_JOBNAME="test-vnet"
export NVIDIA_CLARA_STAGENAME="ai"
export NVIDIA_CLARA_APPDIR=$(pwd)
export NVIDIA_CLARA_INPUTS="input/recon.mhd"
export NVIDIA_CLARA_OUTPUTS="output/recon.seg.mhd"
export NVIDIA_CLARA_TRTISURI="localhost:8000"
export NVIDIA_CLARA_PUBLISHPATH="publish"
APP_NAME="app_vnet"
TRTIS_IMAGE="nvcr.io/nvidia/tensorrtserver:19.02-py3"
MODEL_NAME="v_net"
# Install prerequisites
. envsetup.sh
# Run TRTIS (name: trtis), mapping ./sampleData/models/${MODEL_NAME} to /models/${MODEL_NAME}
# (localhost:8000 will be used)
nvidia-docker run --name trtis -d --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
-p 8000:8000 \
-v $(pwd)/sampleData/models/${MODEL_NAME}:/models/${MODEL_NAME} ${TRTIS_IMAGE} \
trtserver --model-store=/models
# Install dependencies
pipenv run pip install -q -r ${APP_NAME}/requirements.txt
# Wait until TRTIS is ready
echo -n "Wait until TRTIS is ready..."
while [ $(curl -s ${NVIDIA_CLARA_TRTISURI}/api/status | grep -c SERVER_READY) -eq 0 ]; do
sleep 1
echo -n "."
done
echo "done"
# Run app
pipenv run python ${APP_NAME}/main.py
# Stop TRTIS container
docker stop trtis > /dev/null
Testing Liver Segmentation AI Model Containerized
Run the script below with the following command:
run_vnet_docker.sh
#!/bin/bash
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
# Clara Core would launch the container with the following environment variables internally,
# to provide input/output path information.
# (They are subject to change. Do not use the environment variables directly in your application!)
export NVIDIA_CLARA_RUNSIMPLE=TRUE
export NVIDIA_CLARA_JOBID=$(cat /proc/sys/kernel/random/uuid) # Random uuid
export NVIDIA_CLARA_JOBNAME="test-vnet"
export NVIDIA_CLARA_STAGENAME="ai"
export NVIDIA_CLARA_APPDIR="/app"
export NVIDIA_CLARA_INPUTS="input/recon.mhd"
export NVIDIA_CLARA_OUTPUTS="output/recon.seg.mhd"
export NVIDIA_CLARA_TRTISURI="trtis:8000"
export NVIDIA_CLARA_PUBLISHPATH="/publish"
APP_NAME="app_vnet"
TRTIS_IMAGE="nvcr.io/nvidia/tensorrtserver:19.02-py3"
MODEL_NAME="v_net"
# Install prerequisites
. envsetup.sh
# Create network
docker network create container-demo
# Run TRTIS (name: trtis), mapping ./sampleData/models/${MODEL_NAME} to /models/${MODEL_NAME}
# (localhost:8000 will be used)
nvidia-docker run --name trtis --network container-demo -d --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
-p 8000:8000 \
-v $(pwd)/sampleData/models/${MODEL_NAME}:/models/${MODEL_NAME} ${TRTIS_IMAGE} \
trtserver --model-store=/models
# Build Dockerfile
docker build -t ${APP_NAME} -f ${APP_NAME}/Dockerfile .
# Wait until TRTIS is ready
trtis_local_uri=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' trtis)
echo -n "Wait until TRTIS is ready..."
while [ $(curl -s ${trtis_local_uri}:8000/api/status | grep -c SERVER_READY) -eq 0 ]; do
sleep 1
echo -n "."
done
echo "done"
# Run ${APP_NAME} container.
# Like below, Clara Core would launch the app container with the following environment variables internally,
# to provide input/output path information.
# (They are subject to change. Do not use the environment variables directly in your application!)
# Parameters for vnet segmentation can be used as follows:
# --roi <string> --target_shape <string> --pre_interp_order <int> --post_interp_order <int>
# --pre_axcodes <string> --post_axcodes <string>
docker run --name ${APP_NAME} --network container-demo -it --rm \
-u $(id -u):$(id -g) \
-v $(pwd)/input:/app/input \
-v $(pwd)/output:/app/output \
-v $(pwd)/publish:${NVIDIA_CLARA_PUBLISHPATH} \
-e NVIDIA_CLARA_RUNSIMPLE \
-e NVIDIA_CLARA_JOBID \
-e NVIDIA_CLARA_JOBNAME \
-e NVIDIA_CLARA_STAGENAME \
-e NVIDIA_CLARA_APPDIR \
-e NVIDIA_CLARA_INPUTS \
-e NVIDIA_CLARA_OUTPUTS \
-e NVIDIA_CLARA_TRTISURI \
-e NVIDIA_CLARA_PUBLISHPATH \
${APP_NAME}
# Stop TRTIS container
docker stop trtis > /dev/null
# Remove network
docker network remove container-demo > /dev/null
The output volume image is located in the output folder.
Using Python Binding for Clara Workflow Driver (WFD)
This section describes an example of using sdk_dist/clara-0.1-py3-none-any.whl, which provides the Clara Workflow Driver APIs for Python, wrapping the C-based implementation that is covered in more detail in the section on the Clara Workflow Driver.
The following decorators are mandatory for a compliant Clara Python Application.
@clara.startup_cb
@clara.prepare_cb
@clara.execute_cb
@clara.cleanup_cb
@clara.error_cb
Below is a strawman implementation of a simple Python application which uses the Workflow Driver API.
main.py
import clara

@clara.startup_cb
def startup():
    print("[clara] startup() is called.")

@clara.prepare_cb
def prepare():
    print("[clara] prepare() is called.")

@clara.execute_cb
def execute(payload):
    print("[clara] execute() is called.")

@clara.cleanup_cb
def cleanup():
    print("[clara] cleanup() is called.")

@clara.error_cb
def error(result, message, entry_name, kind):
    print(result, message, entry_name, kind)

if __name__ == "__main__":
    clara.start()
    # Wait until all callbacks are called.
    clara.wait_for_completion()
The execute callback accepts the payload parameter. You can get input/output file paths as shown below:
@clara.execute_cb
def execute(payload):
    print("[clara] execute() is called.")
    print("Input file paths:")
    for input_stream in payload.inputs:
        print(" - {}".format(input_stream.path))
    print("Output file paths:")
    for output_stream in payload.outputs:
        print(" - {}".format(output_stream.path))