12.20. Digital Pathology Image Processing Pipeline
The Digital Pathology Image Processing pipeline is one of the reference pipelines provided with Clara Deploy SDK. It accepts an image in any format readable by OpenSlide and optionally accepts parameters for the Canny edge detection filter. The output is the filtered image, which is published to the Render Server so that it can be viewed in a web browser.
The Digital Pathology Image Processing pipeline is defined in the Clara Deploy pipeline definition language. This pipeline utilizes built-in reference containers to construct the following operators:
- Digital Pathology Image Processing Operator to read/filter/write image using CPDriver, OpenSlide, ITK, and tifffile
  - dp-sample-pipeline-no-optimization.yaml
    - Commands used: process_image_no
  - dp-sample-pipeline.yaml
    - Commands used: tile_image_jpg, filter_image_jpg_multiprocessing, and stitch_image_jpg
- Register Results Operator to publish the image to Render Server
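The tile, filter, and stitch stages named above can be pictured with a small pure-Python sketch. This is only an illustration of the general pattern, not the operator's implementation: the function names are hypothetical, the image is a plain list of rows, and a simple pixel inversion stands in for the Canny filter.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values
Tiles = Dict[Tuple[int, int], Image]  # tiles keyed by their (top, left) origin


def tile_image(image: Image, tile_size: int) -> Tiles:
    """Split an image into tiles of at most tile_size x tile_size pixels."""
    height, width = len(image), len(image[0])
    tiles: Tiles = {}
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            tiles[(top, left)] = [
                row[left:left + tile_size] for row in image[top:top + tile_size]
            ]
    return tiles


def filter_tiles(tiles: Tiles, tile_filter: Callable[[Image], Image]) -> Tiles:
    """Apply the same filter to every tile independently."""
    return {origin: tile_filter(tile) for origin, tile in tiles.items()}


def stitch_tiles(tiles: Tiles, height: int, width: int) -> Image:
    """Paste the tiles back at their origins to rebuild a full image."""
    out = [[0] * width for _ in range(height)]
    for (top, left), tile in tiles.items():
        for dy, row in enumerate(tile):
            out[top + dy][left:left + len(row)] = row
    return out


def invert(tile: Image) -> Image:
    """Stand-in filter (NOT Canny): invert 8-bit pixel values."""
    return [[255 - p for p in row] for row in tile]


image = [[(x + y) % 256 for x in range(10)] for y in range(7)]
tiles = tile_image(image, tile_size=4)
result = stitch_tiles(filter_tiles(tiles, invert), height=7, width=10)
```

Because inversion only looks at one pixel at a time, the stitched result equals the filter applied to the whole image. Filters that read neighbouring pixels behave differently at tile edges.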
The following pipeline definitions are available:
12.20.1.1. dp-sample-pipeline-no-optimization.yaml
api-version: 0.4.0
name: dp-sample-pipeline-no-optimization
parameters:
  FILTER_METHOD: canny.canny_itk
  VARIANCE: 10
  LOWER_THRESHOLD: 0.01 # Use 0.1 for Arrayfire's canny filter
  UPPER_THRESHOLD: 0.03 # Use 0.3 for Arrayfire's canny filter
operators:
  - name: process-image
    description: Read an image file and apply Canny Edge Detection Filter, then write the filtered image.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py process_image_no --filter=${{FILTER_METHOD}} --variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} --upper-threshold=${{UPPER_THRESHOLD}}"]
    requests:
      gpu: 1
      memory: 10240
    input:
      - path: /input
    output:
      - path: /output
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
      - from: process-image
        path: /input
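The `${{NAME}}` placeholders in the operator command refer to entries in the `parameters` section. How the platform resolves them internally is not described here; the sketch below simply assumes a textual substitution, with a hypothetical `resolve` helper and the command written with spaces between its arguments.

```python
import re

# Matches ${{NAME}} placeholders as they appear in the pipeline definition.
PLACEHOLDER = re.compile(r"\$\{\{\s*(\w+)\s*\}\}")


def resolve(template: str, parameters: dict) -> str:
    """Replace each ${{NAME}} with the matching parameter value."""
    def lookup(match):
        name = match.group(1)
        if name not in parameters:
            raise KeyError(f"undefined pipeline parameter: {name}")
        return str(parameters[name])
    return PLACEHOLDER.sub(lookup, template)


# Default values copied from the `parameters` section of the definition.
defaults = {
    "FILTER_METHOD": "canny.canny_itk",
    "VARIANCE": 10,
    "LOWER_THRESHOLD": 0.01,
    "UPPER_THRESHOLD": 0.03,
}

template = (
    "python -u /app/main.py process_image_no --filter=${{FILTER_METHOD}} "
    "--variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} "
    "--upper-threshold=${{UPPER_THRESHOLD}}"
)

command = resolve(template, defaults)
print(command)
```

Overriding a value, for example `dict(defaults, VARIANCE=5)`, changes only the corresponding argument in the resolved command.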
12.20.1.2. dp-sample-pipeline.yaml
api-version: 0.4.0
name: dp-sample-pipeline
parameters:
  TILE_COMMAND: tile_image_jpg
  FILTER_COMMAND: filter_image_jpg_multiprocessing
  STITCH_COMMAND: stitch_image_jpg
  FILTER_METHOD: canny.canny_itk
  VARIANCE: 10
  LOWER_THRESHOLD: 0.01 # Use 0.1 for Arrayfire's canny filter
  UPPER_THRESHOLD: 0.03 # Use 0.3 for Arrayfire's canny filter
  HOST: 0.0.0.0
  NUM_WORKERS: -1 # -1: # of cpus
  BLOCK_SIZE_LIMIT: 100MB
  TILE_SIZE: 224
operators:
  - name: tile-image
    description: Read an image file and save it into multi-tiled images.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{TILE_COMMAND}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    input:
      - path: /input
    output:
      - path: /output
    requests:
      cpu: 4
      memory: 8192
  - name: filter-image
    description: Read tiled images and apply Canny Edge Detection Filter, then write the filtered-tiled images.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{FILTER_COMMAND}} --filter=${{FILTER_METHOD}} --variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} --upper-threshold=${{UPPER_THRESHOLD}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    requests:
      cpu: 4
      gpu: 1
      memory: 8192
    input:
      - from: tile-image
        path: /input
    output:
      - path: /output
  - name: stitch-image
    description: Read filtered-tiled images and stitch the images, then write a big tiff file (pyramid).
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{STITCH_COMMAND}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    input:
      - from: filter-image
        path: /input
      - path: /config
    output:
      - path: /output
    requests:
      memory: 8192
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
      - from: stitch-image
        path: /input
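The `from:` entries in the definition above chain the four operators together. As a rough illustration, and assuming an operator runs only after the operators it reads from have finished, the implied execution order can be derived with a topological sort (Python 3.9+ for `graphlib`).

```python
from graphlib import TopologicalSorter

# Operator name -> operators named in its `from:` input entries.
dependencies = {
    "tile-image": [],
    "filter-image": ["tile-image"],
    "stitch-image": ["filter-image"],
    "register-images-for-rendering": ["stitch-image"],
}

# static_order() yields each operator after all of its predecessors.
order = list(TopologicalSorter(dependencies).static_order())
for step, name in enumerate(order, start=1):
    print(step, name)
```

Since the dependencies form a single chain, only one order is possible: tile, filter, stitch, then register.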
The following configurations can be used for this pipeline:
| Configuration # | TILE_COMMAND | FILTER_COMMAND | STITCH_COMMAND |
|---|---|---|---|
| 1 | tile_image_jpg | filter_image_jpg_serial | stitch_image_jpg |
| 2 | tile_image_jpg | filter_image_jpg_multithreading | stitch_image_jpg |
| 3 | tile_image_jpg | filter_image_jpg_multiprocessing | stitch_image_jpg |
| 4 | tile_image_jpg | filter_image_jpg_dali | stitch_image_jpg |
| 5 | tile_image_jpg_chunk | filter_image_jpg_dali_chunk | stitch_image_jpg_chunk |
| 6 | tile_image_zarr | filter_image_zarr | stitch_image_zarr |
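One way to keep the three commands of a configuration consistent is to look them up together and turn them into `-a NAME=value` arguments, the form this section uses with `clara start job`. The helper below is hypothetical, and passing one `-a` flag per parameter is an assumption rather than documented CLI behavior.

```python
import shlex

# Configuration number -> (TILE_COMMAND, FILTER_COMMAND, STITCH_COMMAND),
# copied from the table above.
CONFIGURATIONS = {
    1: ("tile_image_jpg", "filter_image_jpg_serial", "stitch_image_jpg"),
    2: ("tile_image_jpg", "filter_image_jpg_multithreading", "stitch_image_jpg"),
    3: ("tile_image_jpg", "filter_image_jpg_multiprocessing", "stitch_image_jpg"),
    4: ("tile_image_jpg", "filter_image_jpg_dali", "stitch_image_jpg"),
    5: ("tile_image_jpg_chunk", "filter_image_jpg_dali_chunk", "stitch_image_jpg_chunk"),
    6: ("tile_image_zarr", "filter_image_zarr", "stitch_image_zarr"),
}


def job_arguments(config_number: int) -> list:
    """Build '-a NAME=value' argument pairs for one configuration."""
    tile, filt, stitch = CONFIGURATIONS[config_number]
    params = {
        "TILE_COMMAND": tile,
        "FILTER_COMMAND": filt,
        "STITCH_COMMAND": stitch,
    }
    args = []
    for name, value in params.items():
        args += ["-a", f"{name}={value}"]  # assumption: one -a per parameter
    return args


# "<JOB ID>" is a placeholder, exactly as in the CLI example in this section.
command = ["clara", "start", "job", "-j", "<JOB ID>", *job_arguments(6)]
print(shlex.join(command))
```

Looking the commands up as a triple avoids mixing, say, a zarr tiling step with a JPEG stitching step.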
Note that ArrayFire's Canny edge filter (selected by setting the FILTER_METHOD parameter to 'canny.canny_af') does not handle tile borders well, so the stitched image shows visible border lines for each tile. 'canny.canny_itk' is recommended, although ITK's Canny edge filter is slower than ArrayFire's.
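The general effect behind such border lines can be reproduced without any imaging library. The sketch below is not ArrayFire's or ITK's algorithm: it uses a 3x3 neighbourhood sum with zero padding as a stand-in for any filter that reads neighbouring pixels, and compares filtering the whole image against filtering each tile separately with no overlap.

```python
def box_sum(image):
    """3x3 neighbourhood sum with zero padding (stand-in, NOT Canny)."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(
                image[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            )
    return out


def filter_per_tile(image, tile_size):
    """Filter each tile in isolation, then paste the results back."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for top in range(0, h, tile_size):
        for left in range(0, w, tile_size):
            tile = [row[left:left + tile_size] for row in image[top:top + tile_size]]
            for dy, row in enumerate(box_sum(tile)):
                out[top + dy][left:left + len(row)] = row
    return out


image = [[1] * 8 for _ in range(8)]  # uniform 8x8 test image
whole = box_sum(image)
tiled = filter_per_tile(image, tile_size=4)

# Pixels where the two results disagree.
mismatches = {
    (y, x) for y in range(8) for x in range(8) if whole[y][x] != tiled[y][x]
}
print(sorted(mismatches))
```

With 4x4 tiles, the disagreeing pixels lie only in the rows and columns adjacent to the tile seams, because a tile filtered in isolation cannot see its neighbours' pixels.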
Please refer to Run Reference Pipelines using Local Input Files in the How to run a Reference Pipeline section to learn how to register a pipeline and execute it using local input files.
Example:
mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_sample_pipeline
cd clara_dp_sample_pipeline
# Unzip test input data
unzip app_dp_sample-input_v1.zip -d input
# Create a pipeline
clara create pipeline -p dp-sample-pipeline.yaml # or `dp-sample-pipeline-no-optimization.yaml`
# Create a job
clara create jobs -p <PIPELINE ID> -n <JOB NAME> -f <INPUT PATH>
# Start a job. The parameters `TILE_COMMAND`, `FILTER_COMMAND`, `STITCH_COMMAND`, `VARIANCE`, `LOWER_THRESHOLD`, and `UPPER_THRESHOLD` can be overridden if needed, e.g., '-a VARIANCE=10.0'
clara start job -j <JOB ID>
Creating a pipeline and job, then starting and monitoring the job, can also be done with the run_pipeline.sh script:
mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_sample_pipeline
cd clara_dp_sample_pipeline
# Unzip test input data
unzip app_dp_sample-input_v1.zip -d input
# Run script
# ./run_pipeline.sh [method]
# [method] can be one of ['no-optimization', ''] (default: '')
#
./run_pipeline.sh
The input must be a folder containing the following files:
- .tif or .svs - Input image file
- config_render.json - Configuration for Render Server
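Before creating a job, it can be handy to confirm that the input folder has this layout. The following check is a minimal sketch with a hypothetical helper name; it looks only at file names and does not verify that the image is readable or that the JSON is valid.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".tif", ".svs"}


def check_input_folder(folder) -> list:
    """Return a list of problems; an empty list means the layout looks right."""
    folder = Path(folder)
    if not folder.is_dir():
        return [f"{folder} is not a directory"]
    problems = []
    has_image = any(
        p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        for p in folder.iterdir()
    )
    if not has_image:
        problems.append("no .tif or .svs image file found")
    if not (folder / "config_render.json").is_file():
        problems.append("config_render.json is missing")
    return problems


# Demonstration on a temporary folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    before = check_input_folder(tmp)
    (Path(tmp) / "slide.svs").write_bytes(b"")
    (Path(tmp) / "config_render.json").write_text("{}")
    after = check_input_folder(tmp)

print(before)
print(after)
```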
Bundled input data in this pipeline is a breast cancer case from The Cancer Genome Atlas.
Filtered image on Render Server
- Go to the Clara Render Server UI using a web browser. The URL is: <IP of the machine>:8080
- You should see a dataset whose name includes the job name you specified and the operator name (e.g., dp-sample-stitch-image).
- Clicking the dataset shows the rendered image on the screen.
  - Zooming: mouse scroll.
  - Panning: middle mouse button. On Mac: hold the 'a' key while dragging with the left mouse button.