11.20. Digital Pathology Image Processing Pipeline

The Digital Pathology Image Processing pipeline is one of the reference pipelines provided with Clara Deploy SDK. It accepts an image in any format readable by OpenSlide and, optionally, parameters for the Canny edge detection filter. The output is the filtered image, which is published to the Render Server so that it can be viewed in a web browser.

11.20.1. Pipeline Definition

The Digital Pathology Image Processing pipeline is defined in the Clara Deploy pipeline definition language. It uses built-in reference containers to construct the operators shown in the definitions below.

The following pipeline definitions are available:

11.20.1.1. dp-sample-pipeline-no-optimization.yaml

api-version: 0.4.0
name: dp-sample-pipeline-no-optimization
parameters:
  FILTER_METHOD: canny.canny_itk
  VARIANCE: 10
  LOWER_THRESHOLD: 0.01  # Use 0.1 for Arrayfire's canny filter
  UPPER_THRESHOLD: 0.03  # Use 0.3 for Arrayfire's canny filter
operators:
  - name: process-image
    description: Read an image file and apply Canny Edge Detection Filter, then write the filtered image.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py process_image_no --filter=${{FILTER_METHOD}} --variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} --upper-threshold=${{UPPER_THRESHOLD}}"]
    requests:
      gpu: 1
    input:
    - path: /input
    output:
    - path: /output
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
    - from: process-image
      path: /input
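
The LOWER_THRESHOLD and UPPER_THRESHOLD parameters above correspond to the two thresholds of the hysteresis step in Canny edge detection: samples at or above the upper threshold are kept as strong edges, and samples between the two thresholds are kept only if they connect to a strong edge. The following is a minimal one-dimensional sketch of that rule, not the code used by the pipeline. The function name `hysteresis_1d` and the sample values are hypothetical, and the thresholds are applied to plain numbers rather than to a real gradient image.

```python
def hysteresis_1d(magnitudes, lower, upper):
    """Keep strong samples, plus weak samples connected to a strong one."""
    n = len(magnitudes)
    # Strong samples are at or above the upper threshold.
    keep = [m >= upper for m in magnitudes]
    # Grow each strong region outward through neighbouring weak samples.
    stack = [i for i, is_strong in enumerate(keep) if is_strong]
    while stack:
        i = stack.pop()
        for j in (i - 1, i + 1):
            if 0 <= j < n and not keep[j] and magnitudes[j] >= lower:
                keep[j] = True
                stack.append(j)
    return keep


# Hypothetical gradient magnitudes; thresholds match the pipeline defaults.
gradient = [0.00, 0.02, 0.04, 0.02, 0.00, 0.02, 0.00]
edges = hysteresis_1d(gradient, lower=0.01, upper=0.03)
print(edges)
```

In this example the weak samples next to the strong peak are kept, while the isolated weak sample near the end is discarded. Raising LOWER_THRESHOLD removes more of the weak responses; raising UPPER_THRESHOLD requires stronger evidence before any edge is started.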

11.20.1.2. dp-sample-pipeline.yaml

api-version: 0.4.0
name: dp-sample-pipeline
parameters:
  TILE_COMMAND: tile_image_jpg
  FILTER_COMMAND: filter_image_jpg_multiprocessing
  STITCH_COMMAND: stitch_image_jpg
  FILTER_METHOD: canny.canny_itk
  VARIANCE: 10
  LOWER_THRESHOLD: 0.01         # Use 0.1 for Arrayfire's canny filter
  UPPER_THRESHOLD: 0.03         # Use 0.3 for Arrayfire's canny filter
  HOST: 0.0.0.0
  NUM_WORKERS: -1              # -1: use the number of CPUs
  BLOCK_SIZE_LIMIT: 100MB
  TILE_SIZE: 224
operators:
  - name: tile-image
    description: Read an image file and save it into multi-tiled images.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{TILE_COMMAND}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    input:
    - path: /input
    output:
    - path: /output
  - name: filter-image
    description: Read tiled images and apply Canny Edge Detection Filter, then write the filtered-tiled images.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{FILTER_COMMAND}} --filter=${{FILTER_METHOD}} --variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} --upper-threshold=${{UPPER_THRESHOLD}}  --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    requests:
      gpu: 1
    input:
    - from: tile-image
      path: /input
    output:
    - path: /output
  - name: stitch-image
    description: Read filtered-tiled images and stitch the images, then write a big tiff file (pyramid).
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{STITCH_COMMAND}}  --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    input:
    - from: filter-image
      path: /input
    - path: /config
    output:
    - path: /output
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
    - from: stitch-image
      path: /input
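
The tile-image, filter-image, and stitch-image operators above follow a tile, process, and stitch pattern controlled by TILE_SIZE. The sketch below shows the tiling arithmetic only. It assumes non-overlapping square tiles with smaller partial tiles at the right and bottom edges; the layout actually used by /app/main.py is not shown in this document, and the function name `tile_grid` is hypothetical.

```python
def tile_grid(width, height, tile_size=224):
    """Yield (x, y, w, h) boxes that cover a width x height image."""
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            # Tiles on the right and bottom edges may be smaller than tile_size.
            yield (x, y, min(tile_size, width - x), min(tile_size, height - y))


# A hypothetical 500 x 300 image with the default TILE_SIZE of 224.
tiles = list(tile_grid(500, 300))
print(len(tiles), tiles[-1])
```

Because the boxes do not overlap and together cover the whole image, a stitching step can place each filtered tile back at its (x, y) offset to rebuild the full image.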

The following configurations can be used for this pipeline:

Configuration #  TILE_COMMAND          FILTER_COMMAND                    STITCH_COMMAND
1                tile_image_jpg        filter_image_jpg_serial           stitch_image_jpg
2                tile_image_jpg        filter_image_jpg_multithreading   stitch_image_jpg
3                tile_image_jpg        filter_image_jpg_multiprocessing  stitch_image_jpg
4                tile_image_jpg        filter_image_jpg_dali             stitch_image_jpg
5                tile_image_jpg_chunk  filter_image_jpg_dali_chunk       stitch_image_jpg_chunk
6                tile_image_zarr       filter_image_zarr                 stitch_image_zarr
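
Each configuration is a consistent set of three commands, so the three parameters should be overridden together. The sketch below encodes the configurations listed above and builds parameter overrides in the `-a NAME=VALUE` form that the Executing the Pipeline section shows for `clara start job`. Repeating `-a` once per parameter is an assumption, and the helper name `job_arguments` is hypothetical.

```python
# Command triples copied from the configurations listed above:
# (TILE_COMMAND, FILTER_COMMAND, STITCH_COMMAND)
CONFIGURATIONS = {
    1: ("tile_image_jpg", "filter_image_jpg_serial", "stitch_image_jpg"),
    2: ("tile_image_jpg", "filter_image_jpg_multithreading", "stitch_image_jpg"),
    3: ("tile_image_jpg", "filter_image_jpg_multiprocessing", "stitch_image_jpg"),
    4: ("tile_image_jpg", "filter_image_jpg_dali", "stitch_image_jpg"),
    5: ("tile_image_jpg_chunk", "filter_image_jpg_dali_chunk", "stitch_image_jpg_chunk"),
    6: ("tile_image_zarr", "filter_image_zarr", "stitch_image_zarr"),
}


def job_arguments(config_number):
    """Return a list of '-a NAME=VALUE' arguments for one configuration."""
    tile, filt, stitch = CONFIGURATIONS[config_number]
    overrides = {
        "TILE_COMMAND": tile,
        "FILTER_COMMAND": filt,
        "STITCH_COMMAND": stitch,
    }
    args = []
    for name, value in overrides.items():
        # Assumption: one '-a' flag per overridden parameter.
        args += ["-a", f"{name}={value}"]
    return args


print(" ".join(job_arguments(5)))
```

Keeping the triples in one table avoids mixing, for example, a chunked tile command with a non-chunked stitch command.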

Note that ArrayFire’s Canny edge filter (selected by setting the FILTER_METHOD parameter to ‘canny.canny_af’) does not handle tile borders well, so the stitched image shows visible border lines around each tile. ‘canny.canny_itk’ is recommended, although ITK’s Canny edge filter is slower than ArrayFire’s.
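
The border lines mentioned above are a general risk whenever a neighbourhood filter runs on each tile independently: the filter has to invent values beyond the tile edge, and a poor choice creates false gradients there. The sketch below illustrates the effect in one dimension with two padding rules. It is an illustration of the general mechanism only and makes no claim about how ArrayFire or ITK treat borders internally; the function name `central_difference` is hypothetical.

```python
def central_difference(signal, pad):
    """Central-difference gradient of `signal` using the given padding rule."""
    if pad == "zero":
        # Pretend the signal is 0 outside the tile.
        padded = [0.0] + list(signal) + [0.0]
    else:
        # "replicate": repeat the edge value outside the tile.
        padded = [signal[0]] + list(signal) + [signal[-1]]
    return [(padded[i + 2] - padded[i]) / 2 for i in range(len(signal))]


row = [5.0] * 8                 # a flat region with no real edges
tiles = [row[:4], row[4:]]      # processed as two independent tiles

zero_padded = [g for t in tiles for g in central_difference(t, "zero")]
replicated = [g for t in tiles for g in central_difference(t, "replicate")]
print(zero_padded)
print(replicated)
```

With zero padding, every tile boundary produces a nonzero gradient even though the underlying row is flat, which an edge detector would report as a line at the seam. With replicated edges the gradient stays zero everywhere.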

11.20.2. Executing the Pipeline

Refer to Run Reference Pipelines using Local Input Files in the How to run a Reference Pipeline section to learn how to register a pipeline and execute it using local input files.

Example:

mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_sample_pipeline
cd clara_dp_sample_pipeline
# Unzip test input data
unzip app_dp_sample-input_v1.zip -d input
# Create a pipeline
clara create pipeline -p dp-sample-pipeline.yaml # or `dp-sample-pipeline-no-optimization.yaml`

# Create a job
clara create jobs -p <PIPELINE ID> -n <JOB NAME> -f <INPUT PATH>

# Start a job. If needed, override parameters such as `TILE_COMMAND`, `FILTER_COMMAND`, `STITCH_COMMAND`, `VARIANCE`, `LOWER_THRESHOLD`, and `UPPER_THRESHOLD` with `-a`, e.g., '-a VARIANCE=10.0'
clara start job -j <JOB ID>

Creating a pipeline and a job, then starting and monitoring the job, can also be done with the run_pipeline.sh script:

mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_sample_pipeline
cd clara_dp_sample_pipeline
# Unzip test input data
unzip app_dp_sample-input_v1.zip -d input

# Run script
#   ./run_pipeline.sh [method]
#       [method] can be one of ['no-optimization', ''] (default: '')
#
./run_pipeline.sh

11.20.3. Data Input

The input is a folder containing the following files:

  • .tif or .svs - Input image file

  • config_render.json - Configuration for Render Server
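
The input folder can be checked before a job is created. The following is a minimal sketch that tests only for the two kinds of files listed above; it does not verify that the image is actually readable by OpenSlide or that config_render.json is valid. The function name `check_input_folder` and the folder name `input` are hypothetical.

```python
from pathlib import Path

# Image extensions listed in this section.
IMAGE_SUFFIXES = {".tif", ".svs"}


def check_input_folder(folder):
    """Return a list of problems found in the input folder (empty if none)."""
    folder = Path(folder)
    if not folder.is_dir():
        return [f"{folder} is not a directory"]
    problems = []
    files = [p for p in folder.iterdir() if p.is_file()]
    # Compare suffixes case-insensitively so that e.g. 'slide.SVS' is accepted.
    if not any(p.suffix.lower() in IMAGE_SUFFIXES for p in files):
        problems.append("no .tif or .svs image file found")
    if not (folder / "config_render.json").is_file():
        problems.append("config_render.json is missing")
    return problems


print(check_input_folder("input"))
```

An empty list means both requirements are met; otherwise each entry names one missing item.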

The input data bundled with this pipeline is a breast cancer case from The Cancer Genome Atlas.

11.20.4. Data Output

The output is the filtered image, published to the Render Server.

11.20.5. Viewing the Output Volume Image

  • Open the Clara Render Server UI in a web browser. The URL is <IP of the machine>:8080.

  • You should see a dataset whose name includes the job name you specified and the operator name (e.g., dp-sample-stitch-image).

  • Click the dataset to show the rendered image on the screen.

    • Zooming: mouse scroll.

    • Panning: mouse middle button. On Mac, hold the ‘a’ key while dragging with the left mouse button.