11.20. Digital Pathology Image Processing Pipeline
The Digital Pathology Image Processing pipeline is one of the reference pipelines provided with Clara Deploy SDK. It accepts an image in any format readable by OpenSlide and optionally accepts parameters for the Canny edge detection filter. The output is the filtered image, which is published to the Render Server so that it can be viewed in a web browser.
11.20.1. Pipeline Definition
The Digital Pathology Image Processing pipeline is defined in the Clara Deploy pipeline definition language. This pipeline utilizes built-in reference containers to construct the following operators:
- Digital Pathology Image Processing Operator to read, filter, and write images using CPDriver, OpenSlide, ITK, and tifffile
  - dp-sample-pipeline-no-optimization.yaml
    - Commands used: process_image_no
  - dp-sample-pipeline.yaml
    - Commands used: tile_image_jpg, filter_image_jpg_multiprocessing, and stitch_image_jpg
- Register Results Operator to publish the image to Render Server
The following pipeline definitions are available:
11.20.1.1. dp-sample-pipeline-no-optimization.yaml
api-version: 0.4.0
name: dp-sample-pipeline-no-optimization
parameters:
  FILTER_METHOD: canny.canny_itk
  VARIANCE: 10
  LOWER_THRESHOLD: 0.01 # Use 0.1 for Arrayfire's canny filter
  UPPER_THRESHOLD: 0.03 # Use 0.3 for Arrayfire's canny filter
operators:
  - name: process-image
    description: Read a image file and apply Canny Edge Detection Filter, then write the filtered image.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py process_image_no --filter=${{FILTER_METHOD}} --variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} --upper-threshold=${{UPPER_THRESHOLD}}"]
    requests:
      gpu: 1
    input:
      - path: /input
    output:
      - path: /output
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
      - from: process-image
        path: /input
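The `${{NAME}}` placeholders in each `command` refer to keys under `parameters`. The sketch below models that substitution in Python to show how a default and a job-time override combine into the final command line. It is an illustration only: `resolve_placeholders` is a hypothetical helper, not part of Clara Deploy, and the platform's own resolution logic may differ.

```python
import re

# Default parameter values copied from dp-sample-pipeline-no-optimization.yaml.
DEFAULTS = {
    "FILTER_METHOD": "canny.canny_itk",
    "VARIANCE": "10",
    "LOWER_THRESHOLD": "0.01",
    "UPPER_THRESHOLD": "0.03",
}

# Matches ${{NAME}} and captures NAME.
PLACEHOLDER = re.compile(r"\$\{\{\s*(\w+)\s*\}\}")


def resolve_placeholders(template, defaults, overrides=None):
    """Hypothetical helper: replace each ${{NAME}} with its parameter value."""
    values = {**defaults, **(overrides or {})}

    def lookup(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for parameter {name}")
        return str(values[name])

    return PLACEHOLDER.sub(lookup, template)


template = (
    "python -u /app/main.py process_image_no --filter=${{FILTER_METHOD}} "
    "--variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} "
    "--upper-threshold=${{UPPER_THRESHOLD}}"
)

# Override VARIANCE, as '-a VARIANCE=10.0' would when starting a job.
command = resolve_placeholders(template, DEFAULTS, {"VARIANCE": "10.0"})
print(command)
```

Parameters that are not overridden keep the defaults declared in the pipeline definition.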
11.20.1.2. dp-sample-pipeline.yaml
api-version: 0.4.0
name: dp-sample-pipeline
parameters:
  TILE_COMMAND: tile_image_jpg
  FILTER_COMMAND: filter_image_jpg_multiprocessing
  STITCH_COMMAND: stitch_image_jpg
  FILTER_METHOD: canny.canny_itk
  VARIANCE: 10
  LOWER_THRESHOLD: 0.01 # Use 0.1 for Arrayfire's canny filter
  UPPER_THRESHOLD: 0.03 # Use 0.3 for Arrayfire's canny filter
  HOST: 0.0.0.0
  NUM_WORKERS: -1 # -1: # of cpus
  BLOCK_SIZE_LIMIT: 100MB
  TILE_SIZE: 224
operators:
  - name: tile-image
    description: Read a image file and save it into multi-tiled images.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{TILE_COMMAND}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    input:
      - path: /input
    output:
      - path: /output
  - name: filter-image
    description: Read tiled images and apply Canny Edge Detection Filter, then write the filtered-tiled images.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{FILTER_COMMAND}} --filter=${{FILTER_METHOD}} --variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} --upper-threshold=${{UPPER_THRESHOLD}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    requests:
      gpu: 1
    input:
      - from: tile-image
        path: /input
    output:
      - path: /output
  - name: stitch-image
    description: Read filtered-tiled images and stitch the images, then write a big tiff file (pyramid).
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{STITCH_COMMAND}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    input:
      - from: filter-image
        path: /input
      - path: /config
    output:
      - path: /output
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
      - from: stitch-image
        path: /input
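The tile, filter, and stitch stages depend on every tile having a known position in the full image. As a rough illustration, the sketch below computes the tile grid for a given image size and `TILE_SIZE`. The function name `tile_grid` and the choice to clip edge tiles rather than pad them are assumptions made for this example; the actual operator may lay out and handle border tiles differently.

```python
def tile_grid(width, height, tile_size=224):
    """Return (x, y, w, h) boxes that cover a width x height image.

    Tiles on the right and bottom edges are clipped to the image bounds
    (an assumption of this sketch, not confirmed operator behavior).
    """
    if width <= 0 or height <= 0 or tile_size <= 0:
        raise ValueError("width, height, and tile_size must be positive")
    boxes = []
    for y in range(0, height, tile_size):      # rows, top to bottom
        for x in range(0, width, tile_size):   # columns, left to right
            w = min(tile_size, width - x)
            h = min(tile_size, height - y)
            boxes.append((x, y, w, h))
    return boxes


# Example: a 500 x 300 image with the pipeline's default TILE_SIZE of 224.
boxes = tile_grid(500, 300, tile_size=224)
for box in boxes:
    print(box)
```

Because each box records its own offset, a stitch step can place filtered tiles back into the output image without relying on processing order.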
The following configurations can be used for this pipeline:
Configuration # | TILE_COMMAND | FILTER_COMMAND | STITCH_COMMAND |
---|---|---|---|
1 | tile_image_jpg | filter_image_jpg_serial | stitch_image_jpg |
2 | tile_image_jpg | filter_image_jpg_multithreading | stitch_image_jpg |
3 | tile_image_jpg | filter_image_jpg_multiprocessing | stitch_image_jpg |
4 | tile_image_jpg | filter_image_jpg_dali | stitch_image_jpg |
5 | tile_image_jpg_chunk | filter_image_jpg_dali_chunk | stitch_image_jpg_chunk |
6 | tile_image_zarr | filter_image_zarr | stitch_image_zarr |
Note that ArrayFire's Canny edge filter (selected by setting the FILTER_METHOD parameter to canny.canny_af) does not handle tile borders well, so the stitched image shows visible border lines around each tile. Using canny.canny_itk is recommended, although ITK's Canny edge filter is slower than ArrayFire's.
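A configuration from the table can be selected at job start by overriding the three command parameters. The sketch below builds the corresponding `-a NAME=VALUE` arguments. It assumes that `-a` can be repeated once per parameter, in the same form as the `-a VARIANCE=10.0` example used when starting a job; `job_arguments` is a hypothetical helper, not part of the Clara CLI.

```python
# Rows of the configuration table: number -> (tile, filter, stitch) commands.
CONFIGURATIONS = {
    1: ("tile_image_jpg", "filter_image_jpg_serial", "stitch_image_jpg"),
    2: ("tile_image_jpg", "filter_image_jpg_multithreading", "stitch_image_jpg"),
    3: ("tile_image_jpg", "filter_image_jpg_multiprocessing", "stitch_image_jpg"),
    4: ("tile_image_jpg", "filter_image_jpg_dali", "stitch_image_jpg"),
    5: ("tile_image_jpg_chunk", "filter_image_jpg_dali_chunk", "stitch_image_jpg_chunk"),
    6: ("tile_image_zarr", "filter_image_zarr", "stitch_image_zarr"),
}


def job_arguments(config_number):
    """Hypothetical helper: build '-a NAME=VALUE' pairs for one table row."""
    tile, filt, stitch = CONFIGURATIONS[config_number]
    params = {
        "TILE_COMMAND": tile,
        "FILTER_COMMAND": filt,
        "STITCH_COMMAND": stitch,
    }
    args = []
    for name, value in params.items():
        # Assumption: one '-a' flag per overridden parameter.
        args += ["-a", f"{name}={value}"]
    return args


# Print a command line for configuration 6; <JOB ID> is left as a placeholder.
print(" ".join(["clara", "start", "job", "-j", "<JOB ID>"] + job_arguments(6)))
```

The three commands in a row belong together (for example, the zarr tile command with the zarr filter and stitch commands), which is why the helper only accepts a whole row.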
11.20.2. Executing the Pipeline
Refer to "Run Reference Pipelines using Local Input Files" in the "How to run a Reference Pipeline" section to learn how to register a pipeline and execute it using local input files.
Example:
mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_sample_pipeline
cd clara_dp_sample_pipeline
# Unzip test input data
unzip app_dp_sample-input_v1.zip -d input
# Create a pipeline
clara create pipeline -p dp-sample-pipeline.yaml # or `dp-sample-pipeline-no-optimization.yaml`
# Create a job
clara create jobs -p <PIPELINE ID> -n <JOB NAME> -f <INPUT PATH>
# Start a job. If needed, pass the parameters `TILE_COMMAND`, `FILTER_COMMAND`, `STITCH_COMMAND`, `VARIANCE`, `LOWER_THRESHOLD`, and `UPPER_THRESHOLD` (e.g., '-a VARIANCE=10.0')
clara start job -j <JOB ID>
Creating a pipeline and a job, then starting and monitoring the job, can also be done with the run_pipeline.sh script:
mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_sample_pipeline
cd clara_dp_sample_pipeline
# Unzip test input data
unzip app_dp_sample-input_v1.zip -d input
# Run script
# ./run_pipeline.sh [method]
# [method] can be one of ['no-optimization', ''] (default: '')
#
./run_pipeline.sh
11.20.3. Data Input
The pipeline requires an input folder containing the following files:
- .tif or .svs - Input image file
- config_render.json - Configuration for Render Server
The input data bundled with this pipeline is a breast cancer case from The Cancer Genome Atlas.
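Before creating a job, it can help to confirm that the input folder has the expected contents. The following is a minimal sketch of such a check; `check_input_folder` is a hypothetical helper written for this example, and it only tests file names, not whether the image can actually be opened.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".tif", ".svs"}
CONFIG_NAME = "config_render.json"


def check_input_folder(folder):
    """Return a list of problems found in the input folder (empty if none)."""
    folder = Path(folder)
    if not folder.is_dir():
        return [f"{folder} is not a directory"]
    names = [p.name for p in folder.iterdir() if p.is_file()]
    problems = []
    # At least one image file with a supported extension is expected.
    if not any(Path(n).suffix.lower() in IMAGE_SUFFIXES for n in names):
        problems.append("no .tif or .svs image file found")
    # The Render Server configuration must be present alongside the image.
    if CONFIG_NAME not in names:
        problems.append(f"{CONFIG_NAME} not found")
    return problems


if __name__ == "__main__":
    # 'input' is the folder created by the unzip step in the example above.
    for problem in check_input_folder("input"):
        print(problem)
```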
11.20.4. Data Output
The output is the filtered image, published to the Render Server.
11.20.5. Viewing the Output Volume Image
- Go to the Clara RenderServer UI using a web browser. The URL is: <IP of the machine>:8080
- You should see a dataset with a name that includes the name of the job you specified and the operator name (e.g., dp-sample-stitch-image).
- Clicking the dataset shows the rendered image on the screen.
  - Zooming: mouse scroll.
  - Panning: middle mouse button. On Mac: hold the 'a' key while dragging with the left mouse button.