6. How to Run a Reference Pipeline

Clara Deploy SDK comes bundled with a set of reference pipelines. This section provides general guidelines on how to execute them on your local development machine. For information specific to a reference pipeline, please view the documentation of that pipeline.

6.1. Select a Reference Pipeline

  • Create a directory where pipelines will be stored and change into it, e.g.

    mkdir -p ~/.clara/pipelines
    cd ~/.clara/pipelines
    
  • Use clara pull pipeline to download a pipeline definition:

    clara pull pipeline <NGC ID for the pipeline>
    cd <NGC ID for the pipeline>
    

    For example:

    clara pull pipeline clara_ai_livertumor_pipeline
    cd clara_ai_livertumor_pipeline
    

    Available pipelines are listed in the table below.

Pipeline Name                            NGC ID
Advanced DICOM Integration               clara_ai_advanced_dicom_integration_pipeline
CT Reconstruction                        clara_nvrtk_recon_pipeline
AI Spleen                                clara_ai_spleen_pipeline
AI Pancreas Tumor                        clara_ai_pancreas_pipeline
AI Chest X-Ray                           clara_ai_chestxray_pipeline
AI Brain Tumor                           clara_ai_brain_tumor_pipeline
AI Lung Tumor                            clara_ai_lung_tumor_pipeline
AI vNet / CT Organ Segmentation          clara_ai_vnet_pipeline
AI Malaria                               clara_ai_malaria_pipeline
AI Hippocampus                           clara_ai_hippocampus_pipeline
AI Colon Tumor                           clara_ai_colontumor_pipeline
AI Liver Tumor                           clara_ai_livertumor_pipeline
AI Breast                                clara_ai_breast_pipeline
AI Prostate                              clara_ai_prostate_pipeline
Denovo Assembly                          clara_denovo_assembly_pipeline
3D Image Processing using FastIO         clara_fastio_itk_pipeline
DeepStream Batch                         clara_deepstream_batch_pipeline
Multi AI Segmentation                    clara_multiai_pipeline
Digital Pathology Image Processing       clara_dp_sample_pipeline
COVID-19 Classification                  clara_ai_covid19_pipeline
DICOM Series Selection                   clara_series_selection_pipeline
Digital Pathology Nuclei Segmentation    clara_dp_nuclei_seg_pipeline
Hardware Qualification Suite             clara_hqs_pipeline

  • Unzip test data - if test data is bundled with the pipeline, run:

    unzip <test-data-filename>.zip -d input
    
  • Unzip and Install model - if the model is bundled with the pipeline, run:

    sudo unzip <model-filename>.zip -d /clara/common/models/
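
The steps in this section can be collected into one helper. The following is a minimal sketch, not part of the SDK: the function names run and setup_pipeline, the DRY_RUN switch, and the PIPELINE_ROOT override are all hypothetical, and the archive file names must be supplied by you because they differ per pipeline. With DRY_RUN left at its default of 1, the commands are only printed.

```shell
# Sketch only: wraps the commands from this section in one function.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# setup_pipeline <NGC ID> [test-data zip] [model zip]
# PIPELINE_ROOT is a hypothetical override for ~/.clara/pipelines.
setup_pipeline() {
  local ngc_id="$1" data_zip="${2:-}" model_zip="${3:-}"
  local root="${PIPELINE_ROOT:-$HOME/.clara/pipelines}"

  run mkdir -p "$root" || return
  run cd "$root" || return
  run clara pull pipeline "$ngc_id" || return
  run cd "$ngc_id" || return
  if [ -n "$data_zip" ]; then
    run unzip "$data_zip" -d input || return
  fi
  if [ -n "$model_zip" ]; then
    run sudo unzip "$model_zip" -d /clara/common/models/ || return
  fi
}

# Print the commands for the Liver Tumor pipeline without running them.
setup_pipeline clara_ai_livertumor_pipeline
```

Set DRY_RUN=0 to execute the commands for real. Pass the archive names as the second and third arguments only if the pipeline actually bundles test data or a model.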
    

6.2. Run Reference Pipelines by Sending DICOM Instances Over Network

Many reference pipelines are triggered by the Clara DICOM Adapter after it has received DICOM instances from an external sender on one of its mapped AE Titles. This section explains the process using the Liver Segmentation Pipeline as an example.

  • Navigate to the pipeline directory mentioned above. The YAML file defines the reference pipeline. The pipeline definition can be customized to suit a specific deployment environment.

6.2.1. External DICOM Sender and DICOM Receiver

You need an external DICOM Service Class User (SCU) application to send images to the Clara DICOM Adapter (acting as a DICOM SCP). Similarly, when your pipeline finishes executing, you may want to send the output to an external DICOM receiver.

For this example, we will use the open-source DICOM toolkit DCMTK to provide the external DICOM sender and DICOM receiver functionality.

6.2.1.1. Install DCMTK

Install DCMTK utilities by issuing the following command:

sudo apt-get install dcmtk

6.2.1.2. Set Up an External DICOM Receiver

You can use the storescp utility as an external DICOM receiver. First, create a directory on your machine where DICOM images produced by Clara Deploy SDK will be stored. We will refer to that directory as “dicom-destination-directory”. Then open a terminal and run the following commands to start the DICOM receiver in that directory:

mkdir dicom-destination-directory
cd dicom-destination-directory
sudo storescp -v --fork -aet MYPACS 1004

MYPACS is the Application Entity (AE) title of the external DICOM Receiver.

1004 is the port on which storescp listens for incoming DICOM traffic. Because storescp runs on the local system, its IP address is the IP address of your local machine.
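
DICOM limits an AE title to 16 characters, excluding backslashes and control characters, and a title made only of spaces is not allowed. The following sketch checks a candidate title such as MYPACS before you start the receiver. The function name valid_ae_title is hypothetical and is not part of DCMTK or the Clara CLI.

```shell
# Sketch: return 0 when $1 is usable as a DICOM AE title.
valid_ae_title() {
  # Length must be between 1 and 16 characters.
  [ "${#1}" -ge 1 ] && [ "${#1}" -le 16 ] || return 1
  # Backslash is the DICOM value delimiter and is not allowed.
  case "$1" in
    *\\*) return 1 ;;
  esac
  # A title consisting only of spaces is not allowed.
  [ -n "$(printf '%s' "$1" | tr -d ' ')" ] || return 1
  # Reject anything outside printable ASCII (0x20 to 0x7E).
  ! printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

if valid_ae_title "MYPACS"; then
  echo "MYPACS is a usable AE title"
fi
```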

6.2.1.3. Set Up an External DICOM Sender

You can use the storescu utility from DCMTK as a DICOM sender.

6.2.2. Customize the Pipeline

Before publishing a pipeline, default settings in the pipeline definition YAML file can be modified to configure the resource request and orchestration mode. We will use the liver-tumor-pipeline.yaml to show the steps.

6.2.2.1. Customize GPU request

The number of GPUs requested for an operator or for a pipeline-dependent service, e.g. TensorRT Inference Server, can be customized if multiple GPUs are available.

The GPU request for TensorRT Inference Server is set to 1 by default, as shown below:

- name: trtis
  # TensorRT Inference Server, required by this AI application.
  container:
    image: nvcr.io/nvidia/tensorrtserver
    tag: 21.07-v1-py3
    command: ["trtserver", "--model-store=$(NVIDIA_CLARA_SERVICE_DATA_PATH)/models"]
  requests:
    gpu: 1

With this setting, the TensorRT Inference Server can access only one GPU. To use more GPUs and increase inference performance, change the GPU request to the desired number, e.g. 4, as shown below:

- name: trtis
  # TensorRT Inference Server, required by this AI application.
  container:
    image: nvcr.io/nvidia/tensorrtserver
    tag: 21.07-v1-py3
    command: ["trtserver", "--model-store=$(NVIDIA_CLARA_SERVICE_DATA_PATH)/models"]
  requests:
    gpu: 4

Note

If another pipeline that requests fewer GPUs for TensorRT Inference Server has already been published and triggered, the changes made here will not affect the running instance of TensorRT Inference Server.
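
If you prefer to script this change rather than edit the file by hand, the following sketch rewrites the gpu request only inside the entry named trtis and prints the result to standard output. The function name set_trtis_gpus is hypothetical. It assumes the entry starts with a "- name: trtis" line and ends at the next "- name:" line, so review the output before using it.

```shell
# Sketch: print a copy of the pipeline YAML with the trtis GPU request changed.
# $1: pipeline definition file, $2: number of GPUs to request.
set_trtis_gpus() {
  case "$2" in
    ''|*[!0-9]*) echo "GPU count must be a number" >&2; return 2 ;;
  esac
  awk -v n="$2" '
    # Each "- name:" line starts a new entry; remember whether it is trtis.
    /^[ \t]*-[ \t]*name:/ { in_trtis = ($0 ~ /name:[ \t]*trtis[ \t]*$/) }
    # Rewrite the gpu request only while inside the trtis entry.
    in_trtis && /^[ \t]*gpu:[ \t]*[0-9]+/ { sub(/gpu:[ \t]*[0-9]+/, "gpu: " n) }
    { print }
  ' "$1"
}

# Example (the output file name is arbitrary):
# set_trtis_gpus liver-tumor-pipeline.yaml 4 > liver-tumor-pipeline-4gpu.yaml
```

Writing to a new file keeps the original definition intact, so the two versions can be compared with diff before the pipeline is created.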

6.2.2.2. Customize the Orchestration Mode

In this release of the Clara Deploy Platform, Clara Pipeline Driver (CPD) orchestration is supported in addition to the Argo-based orchestration mode.

The mode can be customized with the api-version attribute in the pipeline definition YAML file. Setting api-version to 0.3.0 makes the pipeline run in Argo mode, whereas setting it to 0.4.0 selects Clara Pipeline Driver mode by default. For api-version 0.4.0 and above, the orchestration mode can also be set explicitly, as in the following example, which specifies Clara Pipeline Driver mode.

api-version: 0.4.0
orchestrator: Clara
name: liver-tumor-pipeline
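
To check which mode a given definition will use, the rules above can be applied mechanically. This is a minimal sketch under stated assumptions: the function name orchestration_mode is hypothetical, api-version and orchestrator are assumed to be unquoted top-level keys, and versions below 0.4.0 are treated as Argo mode.

```shell
# Sketch: print the orchestration mode implied by a pipeline definition.
# $1: pipeline definition YAML file.
orchestration_mode() {
  awk '
    /^api-version:/  { version = $2 }
    /^orchestrator:/ { explicit = $2 }
    END {
      if (version == "") { print "unknown"; exit }
      split(version, v, ".")
      # 0.4.0 and above: Clara by default, unless set explicitly.
      newer = (v[1] + 0 > 0) || (v[2] + 0 >= 4)
      if (!newer)              print "Argo"
      else if (explicit != "") print explicit
      else                     print "Clara"
    }
  ' "$1"
}

# Example:
# orchestration_mode liver-tumor-pipeline.yaml
```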

6.2.3. Create the Pipeline

Creating a pipeline registers it with the platform. Once a pipeline is registered, it is represented with an ID. Follow the steps below:

  • Open a terminal and navigate to the directory clara_ai_livertumor_pipeline, which contains the reference pipeline:

    cd ~/.clara/pipelines/clara_ai_livertumor_pipeline

  • Create the pipeline by issuing the following command:

    clara create pipeline -p liver-tumor-pipeline.yaml

    The Clara CLI should submit the pipeline and print a pipeline ID in return.

Note

Note down this pipeline ID, as it will be used in a later section to define a new pipeline mapping in the DICOM Adapter. The command clara list pipelines can also be used to list available pipelines.

  • Verify that the pipeline is correctly stored by issuing the following command:

    clara list pipelines

  • In the list of pipeline IDs, you should see the pipeline ID created in the previous step.

6.2.4. Configure the Clara DICOM Adapter

When you send images to the Clara Deploy SDK, the DICOM Adapter component acts as the internal DICOM Service Class Provider or a receiving agent.

Please refer to Clara DICOM Adapter for setup instructions.

6.2.5. Trigger Pipeline

A pipeline is triggered when a DICOM series is sent to the specific AE title of the Clara DICOM Adapter which is associated with that pipeline. To trigger this pipeline, you will need to have a set of relevant DICOM images that can be used as source input data.

For this pipeline, a sample test dataset in DICOM format is available within the package in input/dcm.

  • Open a terminal and go to the directory which contains all your DICOM images.

    cd ~/.clara/pipelines/clara_ai_livertumor_pipeline/input/dcm
    
  • Trigger the pipeline (push DICOM data to the proper AE title)

    storescu -v +sd +r -xb -aet "DCM4CHEE" -aec "LiverSeg" AA.BBB.CC.DD 104 ./
    

Replace “AA.BBB.CC.DD” with the IP address of your local machine.

When the above step is completed, you should see the following result in the terminal:

I: Received Store Response (Success)
I: Releasing Association
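
When many instances are sent, it can help to count the success responses instead of reading the log by eye. The sketch below counts the "Received Store Response (Success)" lines shown above. The function name count_stored is hypothetical, and redirecting standard error into the pipe is an assumption so that the log lines are captured whichever stream storescu writes them to.

```shell
# Sketch: count successful C-STORE responses in storescu -v output (stdin).
count_stored() {
  # grep -c exits non-zero when the count is 0; "|| true" hides that status.
  grep -cF 'Received Store Response (Success)' || true
}

# Example, using the same storescu command as above:
# storescu -v +sd +r -xb -aet "DCM4CHEE" -aec "LiverSeg" AA.BBB.CC.DD 104 ./ 2>&1 | count_stored
```

Comparing the printed number with the number of files in input/dcm gives a quick check that every instance was accepted.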

6.2.6. Check Job Status and Download/View Payloads in Clara Console/Dashboard

With Clara Orchestrator (API version >= 0.4.0)

  • Open the Clara Management Console in a web browser. The URL is <IP of the machine>:32002.
  • Go to the JOBS view. You should see a job with a name that includes the name of the job you specified.
  • You can inspect the status of each operator inside that job.

Please check Clara Management Console to see the web application in detail.

With Argo Orchestrator (API version <= 0.3.0)

Open the Clara dashboard UI in a web browser. The URL is <machine-ip>:8000.

  • You should see a job with a name that includes the name of the pipeline you created.
  • You can inspect the status of each operator inside that job

See: orchestrator

6.2.7. Verify That the External DICOM Receiver Received Your Images

If your pipeline outputs DICOM images and sends them to an external DICOM receiver, you may want to verify that the images were indeed received. If you used the storescp utility from DCMTK as mentioned earlier, please use a terminal to navigate to the “dicom-destination-directory” and verify whether the expected images are stored there. Also, you may want to use a third-party DICOM viewer to view those images.
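
A quick way to confirm that the received files are DICOM files is to look for the "DICM" marker that follows the 128-byte preamble of a DICOM Part 10 file. The following sketch assumes storescp wrote its output in that file format. The function names is_dicom_file and count_dicom_files are hypothetical.

```shell
# Sketch: succeed when $1 has the "DICM" prefix at byte offset 128.
is_dicom_file() {
  [ "$(dd if="$1" bs=1 skip=128 count=4 2>/dev/null | tr -d '\0')" = "DICM" ]
}

# Print the number of DICOM files found under directory $1 (recursive).
count_dicom_files() {
  find "$1" -type f | while IFS= read -r f; do
    if is_dicom_file "$f"; then
      echo "$f"
    fi
  done | wc -l | tr -d ' '
}

# Example:
# count_dicom_files dicom-destination-directory
```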

6.3. Run Reference Pipelines using Local Input Files

Some reference pipelines, such as the Malaria Microscopy or Chest X-ray Classification Pipeline, are triggered via the Clara CLI.

This section uses the Chest X-ray Classification Pipeline as an example.

  • Use clara_ai_chestxray_pipeline for <NGC ID for the pipeline> in the pipeline setup command:

    mkdir -p ~/.clara/pipelines
    cd ~/.clara/pipelines
    
    clara pull pipeline clara_ai_chestxray_pipeline
    cd clara_ai_chestxray_pipeline
    
    unzip app_chestxray-input_*.zip -d input
    sudo unzip app_chestxray-model_*.zip -d /clara/common/models
    

6.3.1. Publish the Pipeline

Publishing a pipeline registers it with the platform. Once a pipeline is registered, it is represented with an ID. Follow the steps below:

  • Open a terminal and navigate to the directory clara_ai_chestxray_pipeline which contains the reference pipeline

    cd ~/.clara/pipelines/clara_ai_chestxray_pipeline
    
  • Create the pipeline by issuing the following command

    clara create pipeline -p chestxray-pipeline.yaml
    

    Clara CLI should successfully submit the pipeline and in return should print a pipeline ID.

    Note

    Note down the pipeline ID, as it will be used in a later section to trigger a job with input files.

  • Verify that the pipeline is correctly stored by issuing the following command

    kubectl get pipeline
    # or
    clara list pipelines
    
  • In the list of pipeline IDs, you should be able to see the pipeline ID created in the previous step.

6.3.2. Trigger Pipeline

A pipeline can be triggered via Clara CLI. To trigger this pipeline, you will need to have a set of relevant images that can be used as source input data.

For this pipeline, a sample chest X-ray file in 16-bit PNG format is available within the SDK in ~/.clara/pipelines/clara_ai_chestxray_pipeline/input/png.

  • Open a terminal and go to the directory which contains the reference pipeline.

    cd ~/.clara/pipelines/clara_ai_chestxray_pipeline
    
  • Trigger the pipeline (push the input PNG file to the Clara API Server) using the pipeline ID from the previous section.

    clara create jobs -n chestxray-test -p <pipeline ID> -f input/png/
    

When the above step is completed, you should see output in the terminal similar to the following. This is an example only; your job and payload IDs will be different.

JOB_ID:f3170b457f5c41879951e6da8f37341e
PAYLOAD_ID:12bb398cb54844d5958d51e66e10713a

Note

Note down the job ID and payload ID, as they will be used in a later section to access output files.

Then, start the job:

clara start job -j <JOB ID>
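
Instead of copying the IDs by hand, a script can read them from output in the KEY:value format shown above. This is a minimal sketch that assumes the CLI prints exactly that format. The function name parse_id and the variable names are hypothetical.

```shell
# Sketch: print the value of the first "KEY:value" line for key $1 (stdin).
parse_id() {
  sed -n "s/^$1:[[:space:]]*//p" | head -n 1
}

# Example, with PIPELINE_ID set to the ID noted in the previous section:
# out="$(clara create jobs -n chestxray-test -p "$PIPELINE_ID" -f input/png/)"
# job_id="$(printf '%s\n' "$out" | parse_id JOB_ID)"
# payload_id="$(printf '%s\n' "$out" | parse_id PAYLOAD_ID)"
# clara start job -j "$job_id"
```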

6.3.3. Check Job Status and Download/View Payloads in Clara Console/Dashboard

With Clara Orchestrator (API version >= 0.4.0)

  • Open the Clara Management Console in a web browser. The URL is <IP of the machine>:32002.
  • Go to the JOBS view. You should see a job with a name that includes the name of the job you specified.
  • You can inspect the status of each operator inside that job.

Please check Clara Management Console to see the web application in detail.

With Argo Orchestrator (API version <= 0.3.0)

Open the Clara dashboard UI in a web browser. The URL is <machine-ip>:8000.

  • You should see a job with a name that includes the name of the pipeline you created.
  • You can inspect the status of each operator inside that job

See: orchestrator

6.3.4. Download and Verify the Outputs

Once the job has completed successfully, you can download the output payload with the clara download command.

Output payload files are located under /operators/<operator name defined in the pipeline definition>/ folder. Since we triggered Chest X-ray Pipeline whose AI operator’s name is ai-app-chestxray, we can use the following command to download output files:

clara download <Job ID>:/operators/ai-app-chestxray/*

Then, you can see the output files using the following command:

cat output-AA0007572607123433_v2.csv # View text output
eog output-AA0007572607123433_v2.png # View image output using `eog` which is an image viewer available in Ubuntu

If you are working directly on the server machine, the payload is also accessible in the default payload folder, /clara/payloads/<Payload ID>/.