10.32. Digital Pathology Nuclei Segmentation Operator
The Digital Pathology Nuclei Segmentation Operator is a reference application that uses the Clara Pipeline Driver and OpenSlide for digital pathology image segmentation (cell nuclei segmentation).
This application, in the form of a Docker container, is expected to work with the Clara (CPDriver) orchestrator engine to use FastIO features, but it can work as a standalone application with Docker if the environment variable NVIDIA_CLARA_NOSYNCLOCK is set to TRUE.
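As a rough illustration of standalone use, the sketch below assembles a `docker run` argument list that sets NVIDIA_CLARA_NOSYNCLOCK to TRUE and mounts host folders at /input and /output. The function name, the image name, and the host paths are hypothetical, and the sketch assumes the image accepts the `/bin/bash -c` invocation as its command; adjust it to the actual image.

```python
import shlex


def build_standalone_command(image, input_dir, output_dir,
                             command="segmentation", extra_env=None):
    """Return a `docker run` argument list for standalone use (hypothetical helper)."""
    args = [
        "docker", "run", "--rm",
        # Required for standalone use, per the paragraph above.
        "-e", "NVIDIA_CLARA_NOSYNCLOCK=TRUE",
    ]
    # Optional extra environment variables, passed through as-is.
    for key, value in (extra_env or {}).items():
        args += ["-e", f"{key}={value}"]
    args += [
        "-v", f"{input_dir}:/input",    # default input folder of main.py
        "-v", f"{output_dir}:/output",  # default output folder of main.py
        image,
        "/bin/bash", "-c", f"python -u /app/main.py {command}",
    ]
    return args


if __name__ == "__main__":
    # "dp-nuclei-segmentation:latest" is a placeholder image name.
    cmd = build_standalone_command(
        "dp-nuclei-segmentation:latest", "/data/in", "/data/out"
    )
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems if it is later passed to `subprocess.run`.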
The Digital Pathology Nuclei Segmentation Operator uses the following model/packages:
- An open-source implementation of the nuclei segmentation model from the Kaggle 2018 Data Science Bowl (https://github.com/limingwu8/UNet-pytorch)
- OpenSlide (https://openslide.org/)
- Tifffile (https://github.com/cgohlke/tifffile)
- Scikit-image (https://scikit-image.org/)
- NumPy (https://numpy.org/)
- OpenCV (https://opencv.org/)
The main code is available at /app/main.py, and it is executed with parameters inside the container, as shown below:
/bin/bash -c 'python -u /app/main.py <command name>'
usage: main.py [-h] [-d DEBUG_LEVEL] [--input-path INPUT_PATH]
               [--output-path OUTPUT_PATH] [--config-path CONFIG_PATH]
               [-w NUM_WORKERS]
               [--mask-pixel-count-limit MASK_PIXEL_COUNT_LIMIT]
               [-t TILE_SIZE] [-m MODEL_NAME] [-o OVERLAP]
               command

positional arguments:
  command               Command to execute

optional arguments:
  -h, --help            show this help message and exit
  -d DEBUG_LEVEL, --debug-level DEBUG_LEVEL
                        Set debug level (e.g., 'INFO', 'DEBUG')
  --input-path INPUT_PATH
                        Input folder path. Default is '/input'
  --output-path OUTPUT_PATH
                        Output folder path. Default is '/output'
  --config-path CONFIG_PATH
                        Config folder path. Default is '/config'
  -w NUM_WORKERS, --num-workers NUM_WORKERS
                        Number of workers. Default is (# of cpus)
  --mask-pixel-count-limit MASK_PIXEL_COUNT_LIMIT
                        Mask pixel count limit. Default is 1024 * 1024
  -t TILE_SIZE, --tile-size TILE_SIZE
                        Tile size. Default is 256
  -m MODEL_NAME, --model-name MODEL_NAME
                        Model name. Default is 'segmentation_unet_nuclei'
  -o OVERLAP, --overlap OVERLAP
                        Overlap size. Default is 0. Not used for now
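The help text above has the shape of Python `argparse` output. The following is a minimal sketch of a parser that would accept the same options with the documented defaults; it is a reconstruction from the help text, not the code in /app/main.py, and the `build_parser` name and the integer argument types are assumptions.

```python
import argparse
import os


def build_parser():
    """Reconstruct the documented CLI (a sketch; the real main.py may differ)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("command", help="Command to execute")
    # The help text does not state a default debug level, so none is set here.
    parser.add_argument("-d", "--debug-level",
                        help="Set debug level (e.g., 'INFO', 'DEBUG')")
    parser.add_argument("--input-path", default="/input",
                        help="Input folder path")
    parser.add_argument("--output-path", default="/output",
                        help="Output folder path")
    parser.add_argument("--config-path", default="/config",
                        help="Config folder path")
    parser.add_argument("-w", "--num-workers", type=int,
                        default=os.cpu_count(), help="Number of workers")
    parser.add_argument("--mask-pixel-count-limit", type=int,
                        default=1024 * 1024, help="Mask pixel count limit")
    parser.add_argument("-t", "--tile-size", type=int, default=256,
                        help="Tile size")
    parser.add_argument("-m", "--model-name",
                        default="segmentation_unet_nuclei", help="Model name")
    parser.add_argument("-o", "--overlap", type=int, default=0,
                        help="Overlap size (not used for now)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-t", "512", "segmentation"])
    print(args.command, args.tile_size, args.model_name)
```

Note that `argparse` exposes dashed option names with underscores, so `--mask-pixel-count-limit` is read as `args.mask_pixel_count_limit`.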
The application performs a different job depending on the <command name>, and each command acts as a stage in the pipeline.
10.32.2.1. segmentation
This executes all the operations (load/filter/stitch) at once.
This command loads a multi-resolution SVS file, tiles it, runs inference with TRITON, and then writes the resulting multi-resolution, tiled image to the file system. This process consists of the following three stages:
- Pre-processing: It loads a whole slide image at low resolution to generate a mask. The generated mask is used to skip inference on background tiles. For each remaining tile, some filters (color conversion, normalization, and so on) are applied before inference.
- Inferencing: For each tile (256x256x3, uint8), it uses TRITON-based inference to segment nuclei in the tile.
- Post-processing: For each tile, the segmentation result is overlaid on top of the original image. Each post-processed tile is saved into a single multi-resolution/tiled TIFF file using the Tifffile library.
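The mask-based tile skipping and the overlay step described above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the operator's implementation: the function names, the boolean mask convention (True means tissue), and the overlay color and blending factor are all hypothetical, and the sketch does not cover how the mask itself is generated.

```python
import numpy as np


def iter_foreground_tiles(image_shape, mask, tile_size=256):
    """Yield (y, x) origins of tiles that overlap foreground in a low-res mask.

    `mask` is a 2D boolean array covering the whole slide at lower resolution
    (assumed convention: True marks tissue, False marks background).
    """
    height, width = image_shape[:2]
    # Scale factors from full-resolution pixels to mask pixels.
    scale_y = mask.shape[0] / height
    scale_x = mask.shape[1] / width
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            y0 = int(y * scale_y)
            x0 = int(x * scale_x)
            # Cover at least one mask pixel per tile.
            y1 = max(y0 + 1, int(np.ceil(min(y + tile_size, height) * scale_y)))
            x1 = max(x0 + 1, int(np.ceil(min(x + tile_size, width) * scale_x)))
            if mask[y0:y1, x0:x1].any():
                yield y, x  # background-only tiles are skipped


def overlay_segmentation(tile, seg_mask, color=(0, 255, 0), alpha=0.5):
    """Blend `color` over the pixels of an RGB uint8 tile marked in `seg_mask`."""
    out = tile.astype(np.float32)
    color_arr = np.array(color, dtype=np.float32)
    out[seg_mask] = (1.0 - alpha) * out[seg_mask] + alpha * color_arr
    return out.astype(np.uint8)


if __name__ == "__main__":
    mask = np.zeros((4, 4), dtype=bool)
    mask[0, 0] = True  # only the top-left region contains tissue
    print(list(iter_foreground_tiles((512, 512, 3), mask)))
```

In the real pipeline, the inference call would sit between these two functions: each yielded tile is sent to the model, and the returned segmentation is passed to the overlay step.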
10.32.2.1.1. Input
Input requires a folder (mounted at the /input folder inside the container) containing the following files:
- .tif or .svs: Input image file
- config_render.json: Configuration for Render Server
This command expects that a TRITON server is running and the server HTTP API address is available through the environment variable NVIDIA_CLARA_TRTISURI (e.g., '172.24.0.2:8000') so that inference calls are done on a model specified by the model name parameter (-m or --model-name).
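A minimal sketch of reading and validating this environment variable is shown below. The `get_triton_uri` helper is hypothetical and only assumes the 'host:port' form given in the example above; it does not contact the server.

```python
import os


def get_triton_uri(environ=None):
    """Return (host, port) parsed from NVIDIA_CLARA_TRTISURI (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    uri = environ.get("NVIDIA_CLARA_TRTISURI")
    if not uri:
        raise RuntimeError(
            "NVIDIA_CLARA_TRTISURI is not set; expected something like "
            "'172.24.0.2:8000'"
        )
    # Split on the last colon so the port is always the final component.
    host, _, port = uri.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"Expected 'host:port', got {uri!r}")
    return host, int(port)


if __name__ == "__main__":
    print(get_triton_uri({"NVIDIA_CLARA_TRTISURI": "172.24.0.2:8000"}))
```

Failing early with a clear message is useful here because a missing address would otherwise surface only when the first inference call is made.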
10.32.2.1.2. Output
The following files are stored in the /output folder inside the container:
- image.tif: Output image file
- config.meta: Metadata for Render Server
- config_render.json: Configuration for Render Server