12.20. Digital Pathology Image Processing Pipeline

The Digital Pathology Image Processing pipeline is one of the reference pipelines provided with Clara Deploy SDK. It accepts an image in any format readable by OpenSlide and, optionally, parameters for the Canny Edge Detection Filter. The output is the filtered image, which is published to the Render Server so that it can be viewed in a web browser.

12.20.1. Pipeline Definition

The Digital Pathology Image Processing pipeline is defined in the Clara Deploy pipeline definition language. It uses built-in reference containers to construct the following operators:

Digital Pathology Image Processing Operator to read/filter/write image using CPDriver, OpenSlide, ITK, and tifffile
- dp-sample-pipeline-no-optimization.yaml
  - Commands used: process_image_no
- dp-sample-pipeline.yaml
  - Commands used: tile_image_jpg, filter_image_jpg_multiprocessing, and stitch_image_jpg
Register Results Operator to publish the image to Render Server

The following pipeline definitions are available:

12.20.1.1. dp-sample-pipeline-no-optimization.yaml

api-version: 0.4.0
name: dp-sample-pipeline-no-optimization
parameters:
  FILTER_METHOD: canny.canny_itk
  VARIANCE: 10
  LOWER_THRESHOLD: 0.01  # Use 0.1 for Arrayfire's canny filter
  UPPER_THRESHOLD: 0.03  # Use 0.3 for Arrayfire's canny filter
operators:
  - name: process-image
    description: Read an image file and apply Canny Edge Detection Filter, then write the filtered image.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py process_image_no --filter=${{FILTER_METHOD}} --variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} --upper-threshold=${{UPPER_THRESHOLD}}"]
    requests:
      gpu: 1
      memory: 10240
    input:
    - path: /input
    output:
    - path: /output
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
    - from: process-image
      path: /input
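
To make the role of the `${{...}}` placeholders in the operator command concrete, the following Python sketch renders the command string from the pipeline parameters. It assumes plain textual substitution of each `${{NAME}}` with the parameter value; the helper `render_command` is hypothetical and is not part of Clara Deploy, whose actual substitution logic is not shown here.

```python
import re


def render_command(template, parameters):
    """Replace each ${{NAME}} placeholder with the matching parameter value.

    Hypothetical helper: assumes simple textual substitution.
    """
    def lookup(match):
        return str(parameters[match.group(1)])

    return re.sub(r"\$\{\{(\w+)\}\}", lookup, template)


# Default parameter values from the pipeline definition.
parameters = {
    "FILTER_METHOD": "canny.canny_itk",
    "VARIANCE": 10,
    "LOWER_THRESHOLD": 0.01,
    "UPPER_THRESHOLD": 0.03,
}

template = (
    "python -u /app/main.py process_image_no "
    "--filter=${{FILTER_METHOD}} --variance=${{VARIANCE}} "
    "--lower-threshold=${{LOWER_THRESHOLD}} "
    "--upper-threshold=${{UPPER_THRESHOLD}}"
)

command = render_command(template, parameters)
print(command)
```

Under that assumption, overriding a parameter when starting a job only changes the corresponding flag value in the rendered command.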

12.20.1.2. dp-sample-pipeline.yaml

api-version: 0.4.0
name: dp-sample-pipeline
parameters:
  TILE_COMMAND: tile_image_jpg
  FILTER_COMMAND: filter_image_jpg_multiprocessing
  STITCH_COMMAND: stitch_image_jpg
  FILTER_METHOD: canny.canny_itk
  VARIANCE: 10
  LOWER_THRESHOLD: 0.01         # Use 0.1 for Arrayfire's canny filter
  UPPER_THRESHOLD: 0.03         # Use 0.3 for Arrayfire's canny filter
  HOST: 0.0.0.0
  NUM_WORKERS: -1              # -1: # of cpus
  BLOCK_SIZE_LIMIT: 100MB
  TILE_SIZE: 224
operators:
  - name: tile-image
    description: Read an image file and save it into multi-tiled images.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{TILE_COMMAND}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    input:
    - path: /input
    output:
    - path: /output
    requests:
      cpu: 4
      memory: 8192
  - name: filter-image
    description: Read tiled images and apply Canny Edge Detection Filter, then write the filtered-tiled images.
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{FILTER_COMMAND}} --filter=${{FILTER_METHOD}} --variance=${{VARIANCE}} --lower-threshold=${{LOWER_THRESHOLD}} --upper-threshold=${{UPPER_THRESHOLD}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    requests:
      cpu: 4
      gpu: 1
      memory: 8192
    input:
    - from: tile-image
      path: /input
    output:
    - path: /output
  - name: stitch-image
    description: Read filtered-tiled images and stitch the images, then write a big tiff file (pyramid).
    container:
      image: clara/dp-sample
      tag: latest
      command: ["/bin/bash", "-c", "python -u /app/main.py ${{STITCH_COMMAND}} --host=${{HOST}} --num-workers=${{NUM_WORKERS}} --block-size-limit=${{BLOCK_SIZE_LIMIT}} --tile-size=${{TILE_SIZE}}"]
    input:
    - from: filter-image
      path: /input
    - path: /config
    output:
    - path: /output
    requests:
      memory: 8192
  - name: register-images-for-rendering
    description: Register pyramid images in tiff format for rendering.
    container:
      image: clara/register-results
      tag: latest
      command: ["python", "register.py", "--agent", "renderserver"]
    input:
    - from: stitch-image
      path: /input
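
As a rough illustration of what the `TILE_SIZE` and `NUM_WORKERS` parameters imply, the Python sketch below enumerates a tile grid for an image and resolves a worker count. It assumes non-overlapping tiles with smaller tiles at the right and bottom edges, and it reads `-1` as "use the number of CPUs" per the comment in the definition. The functions `tile_grid` and `resolve_num_workers` are hypothetical; the real tiling code in the `clara/dp-sample` container may differ.

```python
import os


def tile_grid(width, height, tile_size=224):
    """Yield (x, y, w, h) for non-overlapping tiles covering the image.

    Assumption: edge tiles are simply clipped to the image bounds.
    """
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            yield x, y, min(tile_size, width - x), min(tile_size, height - y)


def resolve_num_workers(num_workers=-1):
    """Map -1 to the number of CPUs; return any other value unchanged."""
    if num_workers == -1:
        return os.cpu_count() or 1
    return num_workers


# Example: a 1000 x 500 pixel image with the default TILE_SIZE of 224.
tiles = list(tile_grid(1000, 500))
print(len(tiles), "tiles; last tile:", tiles[-1])
print("workers:", resolve_num_workers(-1))
```

With these assumptions, the number of tiles grows with the image area divided by the square of `TILE_SIZE`, which is why the filter step is the one that benefits from multiple workers.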

The following configurations can be used for this pipeline:

Configuration #	TILE_COMMAND	FILTER_COMMAND	STITCH_COMMAND
1	tile_image_jpg	filter_image_jpg_serial	stitch_image_jpg
2	tile_image_jpg	filter_image_jpg_multithreading	stitch_image_jpg
3	tile_image_jpg	filter_image_jpg_multiprocessing	stitch_image_jpg
4	tile_image_jpg	filter_image_jpg_dali	stitch_image_jpg
5	tile_image_jpg_chunk	filter_image_jpg_dali_chunk	stitch_image_jpg_chunk
6	tile_image_zarr	filter_image_zarr	stitch_image_zarr
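
The table above can be turned into job arguments with a small helper. The Python sketch below encodes the six configurations and builds `-a NAME=VALUE` overrides in the style of the `-a VARIANCE=10.0` example used elsewhere in this section. Passing `-a` once per parameter is an assumption, and `job_arguments` is a hypothetical helper, not part of the Clara CLI.

```python
# Configurations copied from the table:
# (TILE_COMMAND, FILTER_COMMAND, STITCH_COMMAND)
CONFIGURATIONS = {
    1: ("tile_image_jpg", "filter_image_jpg_serial", "stitch_image_jpg"),
    2: ("tile_image_jpg", "filter_image_jpg_multithreading", "stitch_image_jpg"),
    3: ("tile_image_jpg", "filter_image_jpg_multiprocessing", "stitch_image_jpg"),
    4: ("tile_image_jpg", "filter_image_jpg_dali", "stitch_image_jpg"),
    5: ("tile_image_jpg_chunk", "filter_image_jpg_dali_chunk", "stitch_image_jpg_chunk"),
    6: ("tile_image_zarr", "filter_image_zarr", "stitch_image_zarr"),
}

PARAMETER_NAMES = ("TILE_COMMAND", "FILTER_COMMAND", "STITCH_COMMAND")


def job_arguments(configuration):
    """Return a flat list of '-a NAME=VALUE' arguments for one configuration."""
    args = []
    for name, value in zip(PARAMETER_NAMES, CONFIGURATIONS[configuration]):
        args += ["-a", f"{name}={value}"]
    return args


# Example: the command line for configuration 5 (job ID left as a placeholder).
print(" ".join(["clara", "start", "job", "-j", "<JOB ID>"] + job_arguments(5)))
```

Keeping the three commands together in one entry avoids mixing a tile command from one configuration with a filter or stitch command from another.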

Note that ArrayFire’s Canny edge filter (selected by setting the FILTER_METHOD parameter to ‘canny.canny_af’) does not handle border cases well, so the overall image shows visible border lines around each tile. ‘canny.canny_itk’ is therefore recommended, even though ITK’s Canny edge filter is slower than ArrayFire’s.
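
One general way in which tile-wise filtering can produce lines at tile borders is sketched below in Python. It applies a simple central-difference gradient to a constant 1-D signal, once as a whole and once tile by tile with zero padding at each tile edge. This is only an illustration of the mechanism under that padding assumption; it does not reproduce the boundary handling of ArrayFire or ITK, which is not described here.

```python
def gradient_magnitude(signal):
    """Central-difference gradient magnitude with zero padding at both ends."""
    padded = [0.0] + list(signal) + [0.0]
    return [abs(padded[i + 2] - padded[i]) / 2 for i in range(len(signal))]


def filter_tiled(signal, tile_size):
    """Filter each tile independently, then concatenate the results."""
    out = []
    for start in range(0, len(signal), tile_size):
        out.extend(gradient_magnitude(signal[start:start + tile_size]))
    return out


signal = [1.0] * 8                   # a constant row: no real edges inside it
whole = gradient_magnitude(signal)   # filtered in one piece
tiled = filter_tiled(signal, 4)      # filtered as two tiles of four samples

# Positions where the tiled result differs from the whole-signal result.
seams = [i for i, (a, b) in enumerate(zip(whole, tiled)) if a != b]
print(seams)
```

In this toy case the two results differ only at the samples adjacent to the internal tile boundary, which is where a stitched image would show a spurious line.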

12.20.2. Executing the Pipeline

Refer to Run Reference Pipelines using Local Input Files in the How to run a Reference Pipeline section to learn how to register a pipeline and execute it using local input files.

Example:

mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_sample_pipeline
cd clara_dp_sample_pipeline
# Unzip test input data
unzip app_dp_sample-input_v1.zip -d input
# Create a pipeline
clara create pipeline -p dp-sample-pipeline.yaml # or `dp-sample-pipeline-no-optimization.yaml`

# Create a job
clara create jobs -p <PIPELINE ID> -n <JOB NAME> -f <INPUT PATH>

# Start a job. If needed, pass the parameters `TILE_COMMAND`, `FILTER_COMMAND`, `STITCH_COMMAND`, `VARIANCE`, `LOWER_THRESHOLD`, and `UPPER_THRESHOLD` (e.g., '-a VARIANCE=10.0').
clara start job -j <JOB ID>

Creating a pipeline and a job, and starting and monitoring the job, can also be done with the run_pipeline.sh script:

mkdir -p ~/.clara/pipelines
cd ~/.clara/pipelines
# Download pipeline
clara pull pipeline clara_dp_sample_pipeline
cd clara_dp_sample_pipeline
# Unzip test input data
unzip app_dp_sample-input_v1.zip -d input

# Run script
#   ./run_pipeline.sh [method]
#       [method] can be one of ['no-optimization', ''] (default: '')
#
./run_pipeline.sh

12.20.3. Data Input

Input requires a folder containing the following files:

.tif or .svs - Input image file
config_render.json - Configuration for Render Server
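
Before creating a job, the input folder can be checked against the requirements listed above. The following Python sketch is a minimal, hypothetical pre-flight check (`check_input_folder` is not part of Clara Deploy); it only looks at file names and does not verify that the image is readable by OpenSlide.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".tif", ".svs"}    # accepted input image extensions
CONFIG_NAME = "config_render.json"   # Render Server configuration file


def check_input_folder(folder):
    """Return a list of problems found in the input folder (empty if none)."""
    folder = Path(folder)
    if not folder.is_dir():
        return [f"{folder} is not a directory"]

    files = [p for p in folder.iterdir() if p.is_file()]
    problems = []
    if not any(p.suffix.lower() in IMAGE_SUFFIXES for p in files):
        problems.append("no .tif or .svs image file found")
    if not any(p.name == CONFIG_NAME for p in files):
        problems.append(f"{CONFIG_NAME} not found")
    return problems


# Example: check the folder created by unzipping the test input data.
for problem in check_input_folder("input"):
    print("input check:", problem)
```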

The input data bundled with this pipeline is a breast cancer case from The Cancer Genome Atlas.

12.20.4. Data Output

Filtered image on Render Server

12.20.5. Viewing the Output Volume Image

Go to the Clara Render Server UI in a web browser. The URL is <IP of the machine>:8080.
You should see a dataset whose name includes the name of the job you specified and the operator name (e.g., dp-sample-stitch-image).
Click the dataset to show the rendered image on the screen.
- Zooming: mouse scroll.
- Panning: mouse middle button. On Mac: Hold the key ‘a’ while dragging the mouse (drag with the left button).