11.23. Digital Pathology Nuclei Segmentation Pipeline

The Digital Pathology Nuclei Segmentation pipeline is one of the reference pipelines provided with Clara Deploy SDK. It accepts an image in any format readable by OpenSlide. The output is a color image in which the cell nuclei segmentation results are overlaid on the original image. The result image is published to the Render Server so that it can be viewed in a web browser.

11.23.1. Pipeline Definition

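The Digital Pathology Nuclei Segmentation pipeline is defined in the Clara Deploy pipeline definition language. This pipeline utilizes built-in reference containers to construct the following operators:

  • segmentation - Performs cell nuclei segmentation, using the Triton inference server as a service
  • register-images-for-rendering - Registers pyramid images in TIFF format for rendering

The following pipeline definition is available: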

11.23.1.1. dp-nuclei-segmentation-pipeline.yaml

api-version: 0.4.0
name: dp-nuclei-segmentation-pipeline
parameters:
  NUM_WORKERS: -1              # -1: use the number of CPUs
operators:
  - name: segmentation
    description: Do cell nuclei segmentation
    container:
      image: clara/dp-nuclei-seg
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py segmentation_skimage --num-workers=${{NUM_WORKERS}}"]
    requests:
      gpu: 1
    input:
    - path: /input
    - path: /config
    output:
    - path: /output
    services:
      - name: triton
        # TRITON inference server, required by this AI application.
        container:
          image: nvcr.io/nvidia/tritonserver
          tag: 20.03.1-py3
          command: ["trtserver", "--model-repository=$(NVIDIA_CLARA_SERVICE_DATA_PATH)/models"]
        # services::connections defines how the TRITON service is expected to
        # be accessed. Clara Platform supports network ("http") and
        # volume ("file") connections.
        connections:
          http:
          # The name of the connection is used to populate an environment
          # variable inside the operator's container during execution.
          # This AI application inside the container needs to read this variable to
          # know the IP and port of TRITON in order to connect to the service.
          - name: NVIDIA_TRITONURI
            port: 8000
          # Some services need a specialized or minimal set of hardware. In this case
          # NVIDIA TRITON inference server requires at least one GPU to function.
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
    - from: segmentation
      path: /input
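
The connection comments in the definition say that the application inside the operator container reads the NVIDIA_TRITONURI environment variable to find the IP and port of Triton. A minimal sketch of such a lookup is shown here in Python, since the operator's entry point is a Python script. The function name triton_endpoint, the "host:port" value format, and the fallback default are assumptions made for illustration; this is not the code in /app/main.py.

```python
import os


def triton_endpoint(env=None, default="localhost:8000"):
    """Return (host, port) parsed from NVIDIA_TRITONURI.

    Assumption: the variable holds a "host:port" string, optionally
    prefixed with a scheme such as "http://". The default is hypothetical.
    """
    env = os.environ if env is None else env
    uri = env.get("NVIDIA_TRITONURI", default).strip()
    # Drop an optional scheme so "http://10.0.0.5:8000" also parses.
    if "://" in uri:
        uri = uri.split("://", 1)[1]
    # Split on the last colon: everything before it is the host.
    host, _, port = uri.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"unexpected NVIDIA_TRITONURI value: {uri!r}")
    return host, int(port)
```

Passing a dictionary instead of the real environment, for example triton_endpoint({"NVIDIA_TRITONURI": "10.0.0.5:8000"}), makes the lookup easy to exercise outside a running pipeline.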

The NUM_WORKERS parameter sets the number of workers used by the segmentation operator. The default value of -1 uses the number of CPUs.
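
As an illustration of the -1 convention, the following sketch maps a NUM_WORKERS value to a concrete worker count. The function name resolve_num_workers and the rejection of other non-positive values are assumptions; the actual handling inside the segmentation container may differ.

```python
import os


def resolve_num_workers(requested):
    """Map the NUM_WORKERS pipeline parameter to a concrete worker count.

    Assumption: -1 means "one worker per CPU", following the comment in
    the pipeline definition; other non-positive values are rejected here.
    """
    requested = int(requested)  # command-line values arrive as strings
    if requested == -1:
        # os.cpu_count() can return None, so fall back to a single worker.
        return os.cpu_count() or 1
    if requested < 1:
        raise ValueError("NUM_WORKERS must be -1 or a positive integer")
    return requested
```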

11.23.2. Executing the Pipeline

Refer to Run Reference Pipelines using Local Input Files in the How to run a Reference Pipeline section to learn how to register a pipeline and execute it using local input files.

Example:

mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_nuclei_seg_pipeline
cd clara_dp_nuclei_seg_pipeline
# Unzip test input data
unzip app_dp_nuclei_seg-input_v1.zip -d input
# Unzip model data
sudo unzip app_dp_nuclei_seg-model_v1.zip -d /clara/common/models/
# Create a pipeline
clara create pipeline -p dp-nuclei-segmentation-pipeline.yaml

# Create a job
clara create jobs -p <PIPELINE ID> -n <JOB NAME> -f <INPUT PATH>

# Start a job (pass parameters such as `NUM_WORKERS` if needed, e.g. '-a NUM_WORKERS=8')
clara start job -j <JOB ID>

Creating a pipeline and a job, and then starting and monitoring the job, can also be done with the run_pipeline.sh script:

mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_nuclei_seg_pipeline
cd clara_dp_nuclei_seg_pipeline
# Unzip test input data
unzip app_dp_nuclei_seg-input_v1.zip -d input
# Unzip model data
sudo unzip app_dp_nuclei_seg-model_v1.zip -d /clara/common/models/
# Run script
./run_pipeline.sh

11.23.3. Data Input

The input is a folder containing the following files:

  • .tif or .svs - Input image file
  • config_render.json - Configuration for Render Server

The input data bundled with this pipeline is a breast cancer case from The Cancer Genome Atlas.
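
Before creating a job, it can be useful to confirm that an input folder matches the layout listed above. The following is a minimal sketch of such a check; the helper name check_input_folder is hypothetical and is not part of Clara Deploy SDK.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".tif", ".svs"}      # image types listed for this pipeline
RENDER_CONFIG = "config_render.json"   # Render Server configuration file


def check_input_folder(folder):
    """Return the image files in `folder`, or raise if the layout is wrong."""
    folder = Path(folder)
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not images:
        raise FileNotFoundError(f"no .tif or .svs image found in {folder}")
    if not (folder / RENDER_CONFIG).is_file():
        raise FileNotFoundError(f"{RENDER_CONFIG} is missing from {folder}")
    return images
```

For the bundled data, this would be called on the input folder created by the unzip step, for example check_input_folder("input").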

11.23.4. Data Output

An RGB image in which the segmentation result is overlaid on the original image, shown on Render Server.
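
To make the idea of an overlay concrete, the sketch below blends a highlight color into the pixels covered by a segmentation mask. This page does not state how the pipeline draws its overlay, so alpha blending, the function name overlay_mask, and the default color and opacity are all assumptions. Plain Python lists are used to keep the example self-contained; a real implementation would operate on image arrays.

```python
def overlay_mask(image, mask, color=(0, 255, 0), alpha=0.5):
    """Blend `color` into `image` wherever `mask` is true.

    image: rows of (r, g, b) tuples with 0-255 channel values.
    mask:  rows of booleans with the same shape as `image`.
    Returns a new image; the input is left unchanged.
    """
    blended = []
    for pixel_row, mask_row in zip(image, mask):
        out_row = []
        for pixel, is_nucleus in zip(pixel_row, mask_row):
            if is_nucleus:
                # Weighted average of the original pixel and the highlight.
                pixel = tuple(
                    round((1 - alpha) * p + alpha * c)
                    for p, c in zip(pixel, color)
                )
            out_row.append(pixel)
        blended.append(out_row)
    return blended
```

Pixels outside the mask are copied through unchanged, so the original tissue remains visible around each highlighted nucleus.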

11.23.5. Viewing the Output Volume Image

  • Open the Clara RenderServer UI in a web browser. The URL is <IP of the machine>:8080
  • You should see a dataset with a name that includes the name of the job you specified and the operator name (e.g., dp-nuclei-seg-segmentation).
  • Click the dataset to show the rendered image on the screen.
    • Zooming: mouse scroll.
    • Panning: mouse middle button. On a Mac, hold the ‘a’ key while dragging with the left mouse button.