12.24. Digital Pathology Nuclei Segmentation Pipeline
The Digital Pathology Nuclei Segmentation pipeline is one of the reference pipelines provided with the Clara Deploy SDK. It accepts an image in any format readable by OpenSlide. The output is a color image in which the cell nuclei segmentation results are overlaid on the original image. The result image is published to the Render Server so that it can be viewed in a web browser.
The Digital Pathology Nuclei Segmentation pipeline is defined in the Clara Deploy pipeline definition language. This pipeline uses built-in reference containers to construct the following operators:
- Digital Pathology Nuclei Segmentation Operator to generate a segmentation image (a minimal sketch of this processing step follows this list). It uses the following model and packages:
- An open-sourced implementation of nuclei segmentation model from the Kaggle 2018 Data Science Bowl (https://github.com/limingwu8/UNet-pytorch)
- OpenSlide (https://openslide.org/)
- Tifffile (https://github.com/cgohlke/tifffile)
- Scikit-image (https://scikit-image.org/)
- NumPy (https://numpy.org/)
- OpenCV (https://opencv.org/)
- Register Results Operator to publish the image to Render Server
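The sketch below is illustrative only: it shows how the packages listed above could be combined to read a slide region, derive a nuclei mask, and write an overlay image. The file paths and tile size are assumptions, and a simple Otsu threshold stands in for the UNet inference actually performed by the operator.

import numpy as np
import openslide
import tifffile
import cv2
from skimage.color import rgb2gray
from skimage.filters import threshold_otsu

# Read one tile from an OpenSlide-readable image (path and tile size are hypothetical).
slide = openslide.OpenSlide("/input/sample.svs")
rgb = np.array(slide.read_region((0, 0), 0, (1024, 1024)).convert("RGB"))

# Stand-in for the UNet inference step: nuclei are darker than the background.
gray = rgb2gray(rgb)
mask = gray < threshold_otsu(gray)

# Blend a green overlay of the mask onto the original tile and save it as TIFF.
overlay = rgb.copy()
overlay[mask] = (0, 255, 0)
result = cv2.addWeighted(rgb, 0.6, overlay, 0.4, 0)
tifffile.imwrite("/output/segmentation_overlay.tif", result)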
The following pipeline definition is available:
12.24.1.1. dp-nuclei-segmentation-pipeline.yaml
api-version: 0.4.0
name: dp-nuclei-segmentation-pipeline
parameters:
  NUM_WORKERS: -1  # -1: # of cpus
operators:
- name: segmentation
  description: Do cell nuclei segmentation
  container:
    image: clara/dp-nuclei-seg
    tag: latest
    command: ["/bin/bash", "-c", "python -u /app/main.py segmentation_skimage --num-workers=${{NUM_WORKERS}}"]
  requests:
    memory: 8192
  input:
  - path: /input
  - path: /config
  output:
  - path: /output
  services:
  - name: triton
    # TRITON inference server, required by this AI application.
    container:
      image: nvcr.io/nvidia/tritonserver
      tag: 20.07-v1-py3
      command: ["tritonserver", "--model-repository=$(NVIDIA_CLARA_SERVICE_DATA_PATH)/models"]
    # services::connections defines how the TRITON service is expected to
    # be accessed. Clara Platform supports network ("http") and
    # volume ("file") connections.
    connections:
      http:
      # The name of the connection is used to populate an environment
      # variable inside the operator's container during execution.
      # This AI application inside the container needs to read this variable to
      # know the IP and port of TRITON in order to connect to the service.
      - name: NVIDIA_CLARA_TRTISURI
        port: 8000
    # Some services need a specialized or minimal set of hardware. In this case
    # NVIDIA TRITON inference server requires at least one GPU to function.
- name: register-images-for-rendering
  description: Register pyramid images in tiff format for rendering.
  container:
    image: clara/register-results
    tag: latest
    command: ["python", "register.py", "--agent", "renderserver"]
  input:
  - from: segmentation
    path: /input
The NUM_WORKERS parameter sets the number of workers used by the segmentation operator; the default value of -1 uses the number of available CPUs.
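As noted in the services::connections comments in the pipeline definition, Clara injects the Triton endpoint into the segmentation container as the NVIDIA_CLARA_TRTISURI environment variable. The following is a minimal sketch of how an application could read that variable and check the server; it assumes the current tritonclient Python package, while the 20.07-v1 Triton container exposes the legacy (V1) API, so the actual application may use a different client library.

import os
import tritonclient.http as httpclient

# The platform populates this variable from the "http" connection named above.
triton_uri = os.environ.get("NVIDIA_CLARA_TRTISURI", "localhost:8000")

client = httpclient.InferenceServerClient(url=triton_uri)
print("Triton live:", client.is_server_live())
print("Triton ready:", client.is_server_ready())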
Please refer to Run Reference Pipelines using Local Input Files in the How to Run a Reference Pipeline section to learn how to register the pipeline and execute it with local input files.
Example:
mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_nuclei_seg_pipeline
cd clara_dp_nuclei_seg_pipeline
# Unzip test input data
unzip app_dp_nuclei_seg-input_v1.zip -d input
# Unzip model data
sudo unzip app_dp_nuclei_seg-model_v1.zip -d /clara/common/models/
# Create a pipeline
clara create pipeline -p dp-nuclei-segmentation-pipeline.yaml
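# Create a job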
clara create jobs -p <PIPELINE ID> -n <JOB NAME> -f <INPUT PATH>
# Start a job (with parameters `NUM_WORKERS` if needed. e.g., '-a NUM_WORKERS=8')
clara start job -j <JOB ID>
Creating the pipeline and job, then starting and monitoring the job, can also be done with the run_pipeline.sh script:
mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_nuclei_seg_pipeline
cd clara_dp_nuclei_seg_pipeline
# Unzip test input data
unzip app_dp_nuclei_seg-input_v1.zip -d input
# Unzip model data
sudo unzip app_dp_nuclei_seg-model_v1.zip -d /clara/common/models/
# Run script
./run_pipeline.sh
The input is a folder containing the following files:
- .tif or .svs - Input image file
- config_render.json - Configuration for Render Server
Bundled input data in this pipeline is a breast cancer case from The Cancer Genome Atlas.
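As a quick, purely illustrative sanity check, the snippet below verifies that an input folder contains the two expected files and that the image can be opened with OpenSlide; the folder name and script form are assumptions, not part of the pipeline.

import os
import sys
import openslide

# Folder to check; defaults to ./input (hypothetical).
input_dir = sys.argv[1] if len(sys.argv) > 1 else "input"
files = os.listdir(input_dir)

assert "config_render.json" in files, "missing config_render.json"
images = [f for f in files if f.lower().endswith((".tif", ".svs"))]
assert images, "missing .tif/.svs input image"

slide = openslide.OpenSlide(os.path.join(input_dir, images[0]))
print(images[0], "dimensions:", slide.dimensions, "levels:", slide.level_count)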
The output is an RGB image in which the nuclei segmentation results are overlaid on the original image; it can be viewed on the Render Server as follows:
- Go to the Clara Render Server UI using a web browser. The URL is <IP of the machine>:8080.
- You should see a dataset whose name includes the job name you specified and the operator name (e.g., dp-nuclei-seg-segmentation).
- Clicking the dataset shows the rendered image on the screen.
- Zooming: use the mouse scroll wheel.
- Panning: drag with the middle mouse button. On Mac, hold the 'a' key while dragging with the left mouse button.