NVIDIA Holoscan is the AI sensor processing platform that combines hardware systems for low-latency sensor and network connectivity, optimized libraries for data processing and AI, and core microservices to run streaming, imaging, and other applications, from embedded to edge to cloud. It can be used to build streaming AI pipelines for a variety of domains, including Medical Devices, High Performance Computing at the Edge, Industrial Inspection and more.
In previous releases, the prefix Clara was used to identify Holoscan as a platform initially designed for medical devices. As Holoscan has grown, its potential to serve other areas has become apparent. With version 0.4.0, we're proud to announce that the Holoscan SDK is now officially domain-agnostic and can be used to build sensor AI applications in multiple domains. Note that some content of the SDK (sample applications) or the documentation might still appear healthcare-specific pending additional updates. Going forward, domain-specific content will be hosted on the HoloHub repository.
Visit the NGC demo website for a live demonstration of some of Holoscan's capabilities.
Many aspects of the Holoscan SDK are currently implemented within extensions. The extensions packaged in the SDK cover tasks such as IO, machine learning inference, image processing, and visualization. They rely on a set of Core Technologies.
This guide will provide more information on the existing extensions, and how to create your own.
This SDK includes multiple sample applications that show how users can implement their own end-to-end inference pipelines for streaming use cases, as well as "bring your own model" (BYOM) capabilities that are modality-agnostic. This guide provides detailed information on the inner workings of those applications, and how to create your own.
See below for some information regarding the sample applications:
Endoscopy Tool Tracking
Based on an LSTM (long short-term memory) stateful model, these applications demonstrate the use of custom components for tool tracking, including composition and rendering of text, tool position, and mask (as a heatmap) combined with the original video stream.
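The defining property of such a stateful model is that the hidden state produced for one frame is fed back in for the next, so the output depends on temporal context rather than on a single frame. The dependency-free sketch below illustrates that pattern only; the `infer` function is a trivial stand-in, not the actual tool-tracking network:

```python
# Conceptual sketch of stateful (LSTM-style) per-frame inference:
# the state returned for frame t is passed back in for frame t+1.
# infer() is a hypothetical placeholder, not the real tracking model.

def infer(frame, state):
    """Hypothetical stateful model: returns (output, new_state)."""
    new_state = state + [frame]               # accumulate temporal context
    output = sum(new_state) / len(new_state)  # e.g. a smoothed estimate
    return output, new_state

def run_pipeline(frames):
    state = []       # initial (empty) hidden state
    outputs = []
    for frame in frames:
        out, state = infer(frame, state)  # state carried across frames
        outputs.append(out)
    return outputs

print(run_pipeline([1.0, 2.0, 3.0]))  # → [1.0, 1.5, 2.0]
```

A stateless model, by contrast, would compute each output from the current frame alone, which is what makes the segmentation workflows below freely retargetable to other models.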
Ultrasound Bone Scoliosis Segmentation
Full workflow including a generic visualization of segmentation results from a spinal scoliosis segmentation model applied to ultrasound videos. The model used is stateless, so this workflow can be configured to adapt to any vanilla DNN model. These applications come with support for an AJA capture card or replay from a video file included in the sample application container.
Colonoscopy Polyp Segmentation
As an example of the BYOM capability mentioned above, we show how the same code used for ultrasound segmentation can be used for a polyp segmentation application.
This model was trained on the Kvasir-SEG dataset [1], using the ColonSegNet model architecture [2].
Refer to the sample data resource on NGC for more information related to the model and video.
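In practice, retargeting this style of application to a new model largely comes down to pointing the application's configuration at a different model file and matching the tensor bindings. The fragment below is an illustrative sketch only; the paths, keys, and tensor names are hypothetical and do not reflect the exact schema used by the sample applications:

```yaml
# Hypothetical configuration sketch: retargeting a stateless segmentation
# app to a BYOM model by editing the model path and tensor bindings only.
inference:
  model_file: model/colon_seg_net.onnx   # swap in the new model here
  input_tensor: INPUT__0                 # names must match the new model
  output_tensor: OUTPUT__0
```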
Hi-Speed Endoscopy
The example app showcases how high-resolution cameras can be used to capture the scene, post-process the data on the GPU, and display it at high frame rate. This app requires an Emergent Vision Technologies camera and a display with a high refresh rate to keep up with the camera's framerate.
Multi AI Ultrasound
Demonstrates how to run multiple inference pipelines in a single application by leveraging the Holoscan Inference module, a framework that facilitates designing and executing inference applications in the Holoscan SDK. The Multi AI operators (inference and postprocessor) use APIs from the Holoscan Inference module to extract data, initialize and execute the inference workflow, and process and transmit data for visualization. The application uses models and echocardiogram data from iCardio.ai. The models include:
a Plax chamber model, which identifies four critical linear measurements of the heart
a Viewpoint Classifier model, which determines the confidence of each frame belonging to one of 28 known cardiac anatomical views as defined by the guidelines of the American Society of Echocardiography
an Aortic Stenosis Classification model, which provides a score indicating the likelihood of the presence of aortic stenosis
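The Holoscan Inference module is typically driven by a configuration that maps each model to its own inputs and outputs, so all three models above can run side by side in one application. The sketch below is hypothetical; the key names are illustrative of the general shape and should be checked against the SDK documentation for the exact schema:

```yaml
# Hypothetical sketch of a multi-model inference configuration:
# several models run in one application, each bound to its own tensors.
multi_ai_inference:
  backend: trt                      # e.g. a TensorRT inference backend
  model_path_map:
    plax_chamber: models/plax_chamber.onnx
    viewpoint_classifier: models/viewpoint.onnx
    aortic_stenosis: models/aortic_stenosis.onnx
  inference_map:
    plax_chamber: plax_out          # output tensor per model
    viewpoint_classifier: view_out
    aortic_stenosis: as_out
```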
C++ And Python APIs
The Holoscan SDK also includes C++ and Python APIs for the creation of applications. These APIs are designed to be user-friendly and flexible.
Please see the Using the SDK section for more information.
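The general flavor of these APIs is: define operators, compose them into an application, and connect them with flows. The dependency-free mock below imitates that pattern so the shape is visible without installing the SDK; the class names echo the SDK's style, but this is not the real `holoscan` package, and the actual API should be taken from the Using the SDK section:

```python
# Dependency-free mock of the Holoscan application pattern:
# operators are composed into a graph and connected with add_flow().
# This imitates the API's shape only; it is not the real SDK.

class Operator:
    def __init__(self, name):
        self.name = name

    def compute(self, value):
        raise NotImplementedError

class SourceOp(Operator):
    def compute(self, value):
        return [1, 2, 3]                   # stand-in for a sensor/video source

class DoubleOp(Operator):
    def compute(self, value):
        return [v * 2 for v in value]      # stand-in for a processing step

class Application:
    def __init__(self):
        self.flows = []

    def add_flow(self, upstream, downstream):
        self.flows.append((upstream, downstream))

    def run(self):
        # Trivial linear scheduler: push data along each flow in order.
        data = None
        for up, down in self.flows:
            data = down.compute(up.compute(data))
        return data

app = Application()
app.add_flow(SourceOp("source"), DoubleOp("double"))
print(app.run())  # → [2, 4, 6]
```

The real SDK handles scheduling, typed ports, and GPU memory for you; the point here is only the compose-and-connect programming model.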
Video Pipeline Latency Tool
To help developers make sense of the overall end-to-end latency that may be added to a video stream when augmenting it through a GPU-powered Holoscan platform such as the NVIDIA IGX Orin Developer Kit, the Holoscan SDK includes a Video Pipeline Latency Measurement Tool. This tool measures and estimates the total end-to-end latency of a video streaming application, including video capture, processing, and output, using the various hardware and software components supported by the Holoscan Developer Kits. The measurements taken by this tool can then be displayed in a comprehensive, easy-to-read visualization.
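The core idea behind such a measurement is simple: timestamp each frame when it enters the pipeline and again when it reaches the output, then aggregate the differences. The sketch below demonstrates that principle with a simulated processing step; it is a conceptual illustration, not the SDK's latency tool, which measures real capture and display hardware:

```python
import time

# Conceptual sketch of end-to-end video latency measurement:
# timestamp each frame when "captured" and again when "displayed",
# then report the difference. The processing step is simulated.

def process(frame):
    time.sleep(0.005)  # simulated GPU processing (~5 ms)
    return frame

latencies_ms = []
for frame_id in range(10):
    t_capture = time.perf_counter()    # frame enters the pipeline
    frame = process(frame_id)
    t_display = time.perf_counter()    # frame reaches the output
    latencies_ms.append((t_display - t_capture) * 1000.0)

mean_ms = sum(latencies_ms) / len(latencies_ms)
print(f"mean end-to-end latency: {mean_ms:.1f} ms")
```

The real tool additionally accounts for capture-card and display timing, which cannot be observed from software timestamps alone.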
High-level changes are described in the Using the SDK section. More detailed changes can be found in the Holoscan SDK release notes on GitHub.
[1] Jha, Debesh, Pia H. Smedsrud, Michael A. Riegler, Pål Halvorsen, Thomas de Lange, Dag Johansen, and Håvard D. Johansen, "Kvasir-SEG: A Segmented Polyp Dataset," Proceedings of the International Conference on Multimedia Modeling, pp. 451-462, 2020.
[2] Jha, Debesh, Sharib Ali, Nikhil K. Tomar, Håvard D. Johansen, Dag Johansen, Jens Rittscher, Michael A. Riegler, and Pål Halvorsen, "Real-Time Polyp Detection, Localization and Segmentation in Colonoscopy Using Deep Learning," IEEE Access, vol. 9, pp. 40496-40510, 2021. doi: 10.1109/ACCESS.2021.3063716. PMID: 33747684; PMCID: PMC7968127.