12.24. Digital Pathology Nuclei Segmentation Pipeline

The Digital Pathology Nuclei Segmentation pipeline is one of the reference pipelines provided with Clara Deploy SDK. It accepts an image in any format readable by OpenSlide. The output is a color image in which the cell nuclei segmentation results are overlaid on the original image. The resulting image is published to the Render Server so that it can be viewed in a web browser.

12.24.1. Pipeline Definition

The Digital Pathology Nuclei Segmentation pipeline is defined in the Clara Deploy pipeline definition language. This pipeline uses built-in reference containers to construct the following operators:

- Digital Pathology Nuclei Segmentation Operator, which generates a segmentation image. It uses the following model and packages:
  - An open-source implementation of a nuclei segmentation model from the Kaggle 2018 Data Science Bowl (https://github.com/limingwu8/UNet-pytorch)
  - OpenSlide (https://openslide.org/)
  - Tifffile (https://github.com/cgohlke/tifffile)
  - Scikit-image (https://scikit-image.org/)
  - NumPy (https://numpy.org/)
  - OpenCV (https://opencv.org/)
- Register Results Operator, which publishes the image to Render Server.

The following pipeline definition is available:

12.24.1.1. dp-nuclei-segmentation-pipeline.yaml

api-version: 0.4.0
name: dp-nuclei-segmentation-pipeline
parameters:
  NUM_WORKERS: -1              # -1: # of cpus
operators:
  - name: segmentation
    description: Do cell nuclei segmentation
    container:
      image: clara/dp-nuclei-seg
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py segmentation_skimage --num-workers=${{NUM_WORKERS}}"]
    requests:
      memory: 8192
    input:
    - path: /input
    - path: /config
    output:
    - path: /output
    services:
      - name: triton
        # TRITON inference server, required by this AI application.
        container:
          image: nvcr.io/nvidia/tritonserver
          tag: 20.07-v1-py3
          command: ["tritonserver", "--model-repository=$(NVIDIA_CLARA_SERVICE_DATA_PATH)/models"]
        # services::connections defines how the TRITON service is expected to
        # be accessed. Clara Platform supports network ("http") and
        # volume ("file") connections.
        connections:
          http:
          # The name of the connection is used to populate an environment
          # variable inside the operator's container during execution.
          # This AI application inside the container needs to read this variable to
          # know the IP and port of TRITON in order to connect to the service.
          - name: NVIDIA_CLARA_TRTISURI
            port: 8000
          # Some services need a specialized or minimal set of hardware. In this case
          # NVIDIA TRITON inference server requires at least one GPU to function.
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
    - from: segmentation
      path: /input
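
The comments in the `connections` block state that the connection name becomes an environment variable that the application reads to find the IP and port of Triton. The following is a minimal sketch of such a lookup, not the operator's actual code. The helper name `triton_endpoint` is hypothetical, and the sketch assumes the value has the form `host:port`, optionally prefixed with a scheme such as `http://`.

```python
import os


def triton_endpoint(env=None, var_name="NVIDIA_CLARA_TRTISURI"):
    """Return (host, port) parsed from the connection variable.

    Assumption: the value looks like "host:port", optionally
    prefixed with a scheme such as "http://".
    """
    env = os.environ if env is None else env
    value = env.get(var_name)
    if not value:
        raise RuntimeError(f"{var_name} is not set; is the triton service declared?")
    # Drop an optional scheme prefix, then split on the last colon.
    _, _, rest = value.rpartition("://")
    host, sep, port = rest.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError(f"Expected host:port in {var_name}, got {value!r}")
    return host, int(port)
```

Passing a plain dictionary as `env` makes the helper easy to exercise outside a running pipeline; inside the operator container it would fall back to `os.environ`.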

The NUM_WORKERS parameter sets the number of workers used by the segmentation operator. The default value of -1 uses the number of CPUs.
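
As an illustration of the `-1` convention, the sketch below maps a NUM_WORKERS value to a concrete worker count. It is an assumption about how an application might interpret the parameter, not the operator's implementation. The function name `resolve_num_workers` is hypothetical, and rejecting zero or other negative values is a choice made here for the example.

```python
import os


def resolve_num_workers(requested):
    """Map the NUM_WORKERS parameter to a concrete worker count."""
    requested = int(requested)  # parameters may arrive as strings
    if requested == -1:
        # -1 means "number of CPUs", per the comment in the pipeline YAML.
        # os.cpu_count() can return None, so fall back to a single worker.
        return os.cpu_count() or 1
    if requested < 1:
        # Assumption: any other non-positive value is treated as an error.
        raise ValueError(f"NUM_WORKERS must be -1 or a positive integer, got {requested}")
    return requested
```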

12.24.2. Executing the Pipeline

Refer to Run Reference Pipelines using Local Input Files in the How to run a Reference Pipeline section to learn how to register a pipeline and execute it using local input files.

Example:

mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_nuclei_seg_pipeline
cd clara_dp_nuclei_seg_pipeline
# Unzip test input data
unzip app_dp_nuclei_seg-input_v1.zip -d input
# Unzip model data
sudo unzip app_dp_nuclei_seg-model_v1.zip -d /clara/common/models/
# Create a pipeline
clara create pipeline -p dp-nuclei-segmentation-pipeline.yaml

# Create a job
clara create jobs -p <PIPELINE ID> -n <JOB NAME> -f <INPUT PATH>

# Start a job (with parameters `NUM_WORKERS` if needed. e.g., '-a NUM_WORKERS=8')
clara start job -j <JOB ID>

Creating a pipeline and job, then starting and monitoring the job, can also be done with the run_pipeline.sh script:

mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_nuclei_seg_pipeline
cd clara_dp_nuclei_seg_pipeline
# Unzip test input data
unzip app_dp_nuclei_seg-input_v1.zip -d input
# Unzip model data
sudo unzip app_dp_nuclei_seg-model_v1.zip -d /clara/common/models/
# Run script
./run_pipeline.sh

12.24.3. Data Input

The input is a folder containing the following files:

- .tif or .svs: input image file
- config_render.json: configuration for Render Server
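
Before creating a job, it can help to confirm that the input folder matches this layout. The sketch below is a hypothetical pre-flight check and not part of the pipeline. The function name `check_input_folder` is illustrative, and matching file extensions case-insensitively is an assumption.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".tif", ".svs"}
RENDER_CONFIG = "config_render.json"


def check_input_folder(folder):
    """Return the image files found, or raise if the folder is incomplete."""
    folder = Path(folder)
    # Assumption: extensions are compared case-insensitively.
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not images:
        raise FileNotFoundError(f"No .tif or .svs image found in {folder}")
    if not (folder / RENDER_CONFIG).is_file():
        raise FileNotFoundError(f"{RENDER_CONFIG} is missing from {folder}")
    return images
```

For example, `check_input_folder("input")` could be run from the pipeline directory after unzipping the test input data.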

The input data bundled with this pipeline is a breast cancer case from The Cancer Genome Atlas.

12.24.4. Data Output

An RGB image in which the segmentation result is overlaid on the original image, displayed on Render Server.
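
To make the idea of an overlay concrete, the sketch below blends a highlight color into an RGB image wherever a binary mask is set, using NumPy. This section does not describe how the operator itself performs the blending, so this is only an illustration. The function name `overlay_mask`, the default color, and the default `alpha` are all hypothetical.

```python
import numpy as np


def overlay_mask(image, mask, color=(0, 255, 0), alpha=0.5):
    """Blend `color` into `image` wherever `mask` is nonzero.

    image: uint8 array of shape (H, W, 3)
    mask:  array of shape (H, W); nonzero marks segmented pixels
    """
    out = image.astype(np.float32)  # astype copies, so the input is untouched
    selected = mask.astype(bool)
    tint = np.asarray(color, dtype=np.float32)
    # Alpha-blend only the selected pixels; the rest keep their original values.
    out[selected] = (1.0 - alpha) * out[selected] + alpha * tint
    return np.clip(out.round(), 0, 255).astype(np.uint8)
```

With `alpha=0.5`, segmented pixels are an even mix of the original color and the highlight, so the underlying tissue remains visible beneath the segmentation.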

12.24.5. Viewing the Output Volume Image

Go to the Clara Render Server UI using a web browser. The URL is <IP of the machine>:8080.
You should see a dataset whose name includes the job name you specified and the operator name (e.g., dp-nuclei-seg-segmentation).
Click the dataset to show the rendered image on the screen. The following controls are available:
- Zooming: mouse scroll.
- Panning: mouse middle button. On Mac: Hold the key ‘a’ while dragging the mouse (drag with the left button).