10.3. Clara Deploy DICOM Parser Operator
CAUTION: This is NOT for diagnostics use.
This asset requires the Clara Deploy SDK. Follow the instructions on the Clara Ansible page to install the Clara Deploy SDK.
This example operator, in the form of a Docker container, parses DICOM instances in DICOM Part 10 file format to extract and save key DICOM metadata in a JSON file. It also converts DICOM pixel data, for applicable series such as CT and MR axial, into volume images, and saves them in a configurable format, e.g. MetaImage or NIfTI. The mappings of series instance UID to converted image file are also saved in a JSON file.
This operator can also select the relevant series using configurable series matching rules, and save the mapping of each selected series’ instance UID to its image file in a JSON file. By default, the series selection rules are loaded from an empty JSON file within the container. They can be overridden with a user’s rules file, whose path is set via a well-known environment variable, or with Base64-encoded rules set via another well-known environment variable. More details are provided in the following section.
The output of this operator is expected to be consumed by the downstream operators in the pipeline to deterministically select and arrange image files as input to their processing step, e.g. a multi-channel AI inference operator.
This operator expects by default an input folder, /input, which contains DICOM instance files (extension dcm) and subfolders of such files.
This operator also loads the series selection rules from a JSON configuration file, by default ./config/selection-rules.json, but the user’s rules file path can be provided via the well-known environment variable NVIDIA_CLARA_SELECTION_RULES_FILE. Alternatively, and with the highest precedence, the rules can be Base64 encoded and set via another environment variable, NVIDIA_CLARA_SELECTION_RULES_BASE64. An empty string as rules selects all converted series and their images, with the series instance UID as the selection name.
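The precedence described above can be sketched in Python as follows. This is a minimal illustration, not the operator's actual code: the helper name load_selection_rules is hypothetical, and treating a missing rules file as empty rules is an assumption. Only the environment variable names and the default path come from the text.

```python
import base64
import json
import os

# Names taken from the text above.
ENV_RULES_BASE64 = "NVIDIA_CLARA_SELECTION_RULES_BASE64"
ENV_RULES_FILE = "NVIDIA_CLARA_SELECTION_RULES_FILE"
DEFAULT_RULES_FILE = "./config/selection-rules.json"


def load_selection_rules(env=None):
    """Return the selection rules as a dict (hypothetical helper).

    Precedence: Base64 variable, then user rules file, then default file.
    Empty rules are returned as {}, meaning "select every converted series".
    """
    env = os.environ if env is None else env

    encoded = env.get(ENV_RULES_BASE64, "").strip()
    if encoded:
        # Highest precedence: rules passed inline as Base64-encoded JSON.
        text = base64.b64decode(encoded).decode("utf-8")
    else:
        path = env.get(ENV_RULES_FILE) or DEFAULT_RULES_FILE
        try:
            with open(path, encoding="utf-8") as f:
                text = f.read()
        except FileNotFoundError:
            # Assumption: a missing file is treated like empty rules.
            text = ""

    return json.loads(text) if text.strip() else {}
```

A Base64 value suitable for the environment variable can be produced with base64.b64encode(json.dumps(rules).encode()).decode().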
Multiple selections can be specified in the selection rules, each with a unique name. For each selection, multiple conditions can be specified; they are combined with a logical AND. Each condition’s attribute name corresponds to the DICOM series attribute JSON name, and the value is expected to match the actual DICOM series attribute, with regular expressions supported for string types. Below is an example of the selection rules:
{
"selections": [
{
"name": "T1C",
"conditions": {
"Modality": "(?i)^MR",
"StudyDescription": "(?i)Brain",
"ImageType": [
"ORIGINAL",
"PRIMARY"
],
"SeriesDescription": "^(?=.*T1c)(?!.*(ref|rfmt|[0-9]x[0-9])).*"
}
}
]
}
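To make the matching semantics concrete, here is a rough Python sketch of how such rules could be evaluated. It is not the operator's implementation. It assumes each series is a flat dict of attribute names to values with a SeriesInstanceUID key, that string conditions are applied with re.search, and that a list condition such as ImageType requires every listed value to be present in the series attribute. The function names are hypothetical.

```python
import re


def condition_matches(expected, actual):
    """Check one condition value against one series attribute value."""
    if actual is None:
        return False
    if isinstance(expected, str):
        # Strings are treated as regular expressions (re.search is an assumption).
        return re.search(expected, str(actual)) is not None
    if isinstance(expected, list):
        # Assumption: every listed value must appear in the series attribute.
        actual_values = actual if isinstance(actual, list) else [actual]
        return all(item in actual_values for item in expected)
    # Numbers and other types: plain equality.
    return expected == actual


def select_series(rules, series_list):
    """Return {selection name: [matching SeriesInstanceUID, ...]}."""
    selected = {}
    for selection in rules.get("selections", []):
        conditions = selection.get("conditions", {})
        uids = [
            series["SeriesInstanceUID"]
            for series in series_list
            # Conditions within one selection are combined with logical AND.
            if all(condition_matches(v, series.get(k)) for k, v in conditions.items())
        ]
        if uids:
            selected[selection["name"]] = uids
    return selected
```

With the example rules, a series described as "AX T1c POST" in an MR brain study would be selected under T1C, while "AX T1c rfmt" would be rejected by the negative lookahead in the SeriesDescription pattern.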
This application saves the metadata in JSON format, as well as the image files, to the output folder, /output by default, as follows:
- dicom-metadata.json contains the extracted DICOM metadata in JSON format, organized in a study, series, and instance hierarchy.
- series-images.json contains the series to converted images mapping in JSON format.
- Image files, whose format depends on the runtime configuration of the supported formats: mhd, nii, and nii.gz.
- selected-images.json contains the matched selection’s series instance UID to image file name mapping. For each matched selection, the selection name is the attribute, and the value is a dictionary with the matched series instance UID as attribute and the image file name as value.
Below is an example of series-images.json:
{
"1.2.826.0.1.3680043.2.1125.1.29610378020801235314660915067543534":"1.2.826.0.1.3680043.2.1125.1.29610378020801235314660915067543534.nii"
}
Below is an example of selected-images.json:
{
"T1C":{
"1.2.826.0.1.3680043.2.1125.1.29610378020801235314660915067543534":"1.2.826.0.1.3680043.2.1125.1.29610378020801235314660915067543534.nii"
}
}
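A downstream operator could consume selected-images.json roughly as sketched below. The helper name resolve_selected_images is hypothetical, and ordering the images by series instance UID is only one possible way to obtain a stable arrangement; the source does not prescribe an order.

```python
import json
from pathlib import Path


def resolve_selected_images(output_dir, selection_name):
    """Return image paths for one named selection (hypothetical helper)."""
    output_dir = Path(output_dir)
    with open(output_dir / "selected-images.json", encoding="utf-8") as f:
        selected = json.load(f)
    # Each selection maps series instance UID -> image file name.
    series_to_image = selected.get(selection_name, {})
    # Sorting by UID gives a stable order (an assumption, not a documented rule).
    return [output_dir / series_to_image[uid] for uid in sorted(series_to_image)]
```

For the example above, resolve_selected_images("/output", "T1C") would return a single path ending in .nii under /output.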
Logs generated by the application are saved in a folder, /logs by default, which similarly must be mapped to a host folder.
The directories in the container are shown below.
The core of the application code is under the folder parser.
.
├── buildContainers.sh
├── config
│   └── selection-rules.json
├── Dockerfile
├── __init__.py
├── logging_config.json
├── main.py
├── ngc
│   ├── metadata.json
│   └── overview.md
├── parser
│   ├── app.py
│   ├── __init__.py
│   ├── metadata_parser.py
│   ├── series_converter.py
│   ├── series_selector.py
│   └── runtime_envs.py
├── public
│   └── docs
│       └── README.md
├── requirements.txt
├── run_app_docker.sh
└── test-data
    └── dcm
        ├── study1
        │   ├── series1
        │   │   ├── IMG0001.dcm
        │   │   ├── IMG0002.dcm
        │   │   ...
        │   │   ├── IMG0018.dcm
        │   │   └── IMG0019.dcm
        │   └── series2
        │       └── RT000000.dcm
        └── study2
            └── series1
                ├── IMG0001.dcm
                ├── IMG0002.dcm
                ...
                ├── IMG0020.dcm
                └── IMG0021.dcm
To control the format of converted image files, the operator uses a runtime environment variable to specify the extension of the expected file format; the default is mhd:
NVIDIA_CLARA_IMAGE_FORMAT=['mhd' | 'nii' | 'nii.gz']
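As an illustration of how this setting might be read at runtime, here is a small Python sketch. The function name get_image_format is hypothetical, and rejecting unsupported values with an error is an assumption; the source does not say how the operator reacts to an unknown format.

```python
import os

# Values and default taken from the text above.
SUPPORTED_FORMATS = ("mhd", "nii", "nii.gz")
DEFAULT_FORMAT = "mhd"


def get_image_format(env=None):
    """Return the configured image file extension (hypothetical helper)."""
    env = os.environ if env is None else env
    value = env.get("NVIDIA_CLARA_IMAGE_FORMAT", DEFAULT_FORMAT).strip().lower()
    if value not in SUPPORTED_FORMATS:
        # Assumption: fail early rather than silently fall back to the default.
        raise ValueError(
            f"Unsupported image format {value!r}; expected one of {SUPPORTED_FORMATS}"
        )
    return value
```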
If you want to see the internals of the container and manually run the application, follow these steps.
- Start the container in interactive mode. See the next section on how to run the container, and replace the docker run command with docker run --entrypoint /bin/bash
- Once in the Docker terminal, ensure the current directory is dicom-parser.
- Copy test-data/dcm/* to /input
- Create folders /output and /logs
- Type in the command python ./main.py
- Once finished, type exit.
10.3.7.1. Prerequisites
- Ensure there are DICOM instance files, with extension dcm, from a single or multiple DICOM studies or series.
10.3.7.2. Step 1
Change to your working directory, e.g. my_test.
10.3.7.3. Step 2
Create, if they do not exist, the following directories under your working directory:
- input, and copy over the DICOM instance files or folders.
- output for the generated metadata files and image files.
- logs for log files.
10.3.7.4. Step 3
In your working directory, create a shell script, e.g. run_app_docker.sh or another name if you prefer, copy the sample content below, save it, and make sure the variable TAG has the same value as the actual container tag, e.g. 0.5.0-2003.6:
#!/bin/bash
# Resolve the script location and the test data folder next to it.
SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
TESTDATA_DIR=$(readlink -f "${SCRIPT_DIR}"/test-data)
APP_NAME="dicom-parser"
OUTPUT_IMAGE_FORMAT='mhd'
# Pull Docker image if not already, example below
# docker pull nvcr.io/ea-nvidia-clara/clara/dicom-parser:
# docker tag dicom-parser: dicom-parser:latest
# Run ${APP_NAME} container, mapping the input, output, and logs folders.
docker run --name "${APP_NAME}" -t --rm \
    -v "${TESTDATA_DIR}":/input \
    -v "${SCRIPT_DIR}/output":/output \
    -v "${SCRIPT_DIR}/logs":/logs \
    -e NVIDIA_CLARA_IMAGE_FORMAT="${OUTPUT_IMAGE_FORMAT}" \
    -e DEBUG_VSCODE \
    -e DEBUG_VSCODE_PORT \
    -e NVIDIA_CLARA_NOSYNCLOCK=TRUE \
    "${APP_NAME}"
echo "${APP_NAME} has finished."
10.3.7.5. Step 4
Execute the script below and wait for the application container to finish:
./run_app_docker.sh
10.3.7.6. Step 5
Check for the following output files:
- Metadata files in the output directory: dicom-metadata.json, series-images.json, and selected-images.json.
- Image files, whose names are also in series-images.json.
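The check above can also be scripted. The Python sketch below reports which of the expected files are missing from an output directory. The helper name find_missing_outputs is hypothetical, and it only checks the file names listed in series-images.json, not any companion files a format may produce.

```python
import json
from pathlib import Path

# Metadata file names taken from the list above.
EXPECTED_JSON = ("dicom-metadata.json", "series-images.json", "selected-images.json")


def find_missing_outputs(output_dir):
    """Return the names of expected output files that are absent."""
    output_dir = Path(output_dir)
    missing = [name for name in EXPECTED_JSON if not (output_dir / name).is_file()]

    mapping_file = output_dir / "series-images.json"
    if mapping_file.is_file():
        with open(mapping_file, encoding="utf-8") as f:
            series_to_image = json.load(f)
        # Every image named in the mapping should exist next to it.
        missing += [
            name for name in series_to_image.values()
            if not (output_dir / name).is_file()
        ]
    return missing
```

An empty list from find_missing_outputs("output") indicates that all metadata files and all mapped image files are present.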
An End User License Agreement is included with the product. By pulling and using the Clara Deploy asset on NGC, you accept the terms and conditions of these licenses.
Release Notes, the Getting Started Guide, and the SDK itself are available at the NVIDIA Developer forum.
For answers to any questions you may have about this release, visit the NVIDIA Devtalk forum.