Class HolovizOp
Defined in File holoviz.hpp
Base Type
public holoscan::Operator (Class Operator)
-
class HolovizOp : public holoscan::Operator
Operator class for data visualization.
This high-speed viewer handles compositing, blending, and visualization of RGB or RGBA images, masks, geometric primitives, text and depth maps. The operator can auto-detect the format of the input tensors acquired at the receivers port. Otherwise, the input specification can be set at creation time using the tensors parameter or at runtime by passing input specifications to the input_specs port.
Depth maps and 3D geometry are rendered in 3D and support camera movement. The camera is controlled using the mouse:
Orbit (LMB)
Pan (LMB + CTRL | MMB)
Dolly (LMB + SHIFT | RMB | Mouse wheel)
Look Around (LMB + ALT | LMB + CTRL + SHIFT)
Zoom (Mouse wheel + SHIFT)
The camera can also be controlled by providing new values at the camera_eye_input, camera_look_at_input or camera_up_input input ports. The camera pose can be output at the camera_pose_output port when enable_camera_pose_output is set to true.
Callbacks can be used to receive updates on key presses, mouse position and buttons, and window size.
==Named Inputs==
receivers : multi-receiver accepting nvidia::gxf::Tensor and/or nvidia::gxf::VideoBuffer
Any number of upstream ports may be connected to this receivers port. This port can accept either VideoBuffers or Tensors. These inputs can be in either host or device memory. Each tensor or video buffer will result in a layer. The operator autodetects the layer type for certain input types (e.g. a video buffer will result in an image layer). For other input types or more complex use cases, input specifications can be provided either at initialization time as a parameter or dynamically at run time (via input_specs). On each call to compute, tensors corresponding to all names specified in the tensors parameter must be found or an exception will be raised. Any extra, named tensors not present in the tensors parameter specification (or optional, dynamic input_specs input) will be ignored.
-
input_specs : std::vector<holoscan::ops::HolovizOp::InputSpec> (optional)
A list of InputSpec objects. This port can be used to dynamically update the overlay specification at run time. No inputs are required on this port in order for the operator to compute.
-
render_buffer_input : nvidia::gxf::VideoBuffer (optional)
An empty render buffer can optionally be provided. The video buffer must have format GXF_VIDEO_FORMAT_RGBA and be in device memory. This input port only exists if enable_render_buffer_input was set to true, in which case compute will only be called when a message arrives on this input.
-
depth_buffer_input : nvidia::gxf::VideoBuffer (optional)
An empty depth buffer can optionally be provided. The video buffer must have format GXF_VIDEO_FORMAT_D32F and be in device memory. This input port only exists if enable_depth_buffer_input was set to true, in which case compute will only be called when a message arrives on this input.
-
camera_eye_input : std::array<float, 3> (optional)
Camera eye position. The camera is animated to reach the new position.
-
camera_look_at_input : std::array<float, 3> (optional)
Camera look at position. The camera is animated to reach the new position.
-
camera_up_input : std::array<float, 3> (optional)
Camera up vector. The camera is animated to reach the new vector.
-
==Named Outputs==
render_buffer_output : nvidia::gxf::VideoBuffer (optional)
Output for a filled render buffer. If an input render buffer is specified, that buffer is used; otherwise a new buffer is allocated. The video buffer will have format GXF_VIDEO_FORMAT_RGBA and will be in device memory. This output is useful for offline rendering or headless mode. This output port only exists if enable_render_buffer_output was set to true.
-
depth_buffer_output : nvidia::gxf::VideoBuffer (optional)
Output for a filled depth buffer. If an input depth buffer is specified, that buffer is used; otherwise a new buffer is allocated. The video buffer will have format GXF_VIDEO_FORMAT_D32F and will be in device memory. This output is useful for offline rendering or headless mode. This output port only exists if enable_depth_buffer_output was set to true.
-
camera_pose_output : std::array<float, 16> or nvidia::gxf::Pose3D (optional)
Output the camera pose. Depending on the value of camera_pose_output_type this outputs a 4x4 row-major projection matrix (type std::array<float, 16>) or the camera extrinsics model (type nvidia::gxf::Pose3D). This output port only exists if enable_camera_pose_output was set to true.
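As an illustration of the row-major layout, the following minimal sketch shows how a single element could be read from the 16-float array. The helper name `matrix_at` is hypothetical and not part of the Holoscan API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the Holoscan API): read element (row, col)
// of a 4x4 matrix stored in row-major order in a flat 16-element array.
inline float matrix_at(const std::array<float, 16>& m, std::size_t row, std::size_t col) {
  assert(row < 4 && col < 4);
  return m[row * 4 + col];  // row-major: the four elements of a row are contiguous
}
```

With this layout, the element in row 1, column 2 is found at flat index 6.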
-
==Parameters==
receivers: List of input queues to component accepting gxf::Tensor or gxf::VideoBuffer.
type:
std::vector<gxf::Handle<gxf::Receiver>>
-
enable_render_buffer_input: Enable render_buffer_input (default: false)
type:
bool
-
enable_render_buffer_output: Enable render_buffer_output (default: false)
type:
bool
-
enable_depth_buffer_input: Enable depth_buffer_input (default: false)
type:
bool
-
enable_depth_buffer_output: Enable depth_buffer_output (default: false)
type:
bool
-
enable_camera_pose_output: Enable camera_pose_output (default: false)
type:
bool
-
tensors: List of input tensor specifications (default:
[])
type:
std::vector<InputSpec>
name: name of the tensor containing the input data to display
type:
std::string
-
type: input type (default
"unknown")
type:
std::string
possible values:
unknown: unknown type, the operator tries to guess the type by inspecting the tensor.
color: RGB or RGBA color 2d image.
color_lut: single channel 2d image, color is looked up.
points: point primitives, one coordinate (x, y) per primitive.
lines: line primitives, two coordinates (x0, y0) and (x1, y1) per primitive.
line_strip: line strip primitive, a line primitive i is defined by each coordinate (xi, yi) and the following (xi+1, yi+1).
triangles: triangle primitive, three coordinates (x0, y0), (x1, y1) and (x2, y2) per primitive.
crosses: cross primitive, a cross is defined by the center coordinate and the size (xi, yi, si).
rectangles: axis aligned rectangle primitive, each rectangle is defined by two coordinates (xi, yi) and (xi+1, yi+1).
ovals: oval primitive, an oval primitive is defined by the center coordinate and the axis sizes (xi, yi, sxi, syi).
text: text is defined by the top left coordinate and the size (x, y, s) per string, text strings are defined by InputSpec member text.
depth_map: single channel 2d array where each element represents a depth value. The data is rendered as a 3d object using points, lines or triangles. The color for the elements can be specified through
depth_map_color. Supported formats for the depth map:
8-bit unsigned normalized format that has a single 8-bit depth component
32-bit signed float format that has a single 32-bit depth component
-
depth_map_color: RGBA 2d image, same size as the depth map. One color value for each element of the depth map grid. Supported format: 32-bit unsigned normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3.
-
-
opacity: layer opacity, 1.0 is fully opaque, 0.0 is fully transparent (default:
1.0)
type:
float
-
priority: layer priority, determines the render order, layers with higher priority values are rendered on top of layers with lower priority values (default:
0)
type:
int32_t
-
image_format: color image format, used if type is color, color_lut or depth_map_color. (default: auto_detect).
type:
std::string
-
color: RGBA color of rendered geometry (default:
[1.F, 1.F, 1.F, 1.F])
type:
std::vector<float>
-
line_width: line width for geometry made of lines (default:
1.0)
type:
float
-
point_size: point size for geometry made of points (default:
1.0)
type:
float
-
text: array of text strings, used when type is text. (default: [])
type:
std::vector<std::string>
-
depth_map_render_mode: depth map render mode (default:
points)
type:
std::string
possible values:
points: render as points
lines: render as lines
triangles: render as triangles
-
-
-
-
color_lut: Color lookup table for tensors of type ‘color_lut’, vector of four float RGBA values
type:
std::vector<std::vector<float>>
-
window_title: Title on window canvas (default:
"Holoviz")
type:
std::string
-
display_name: In exclusive display or fullscreen mode, name of display to use as shown with xrandr or hwinfo --monitor (default: "")
type:
std::string
-
width: Window width or display resolution width if in exclusive display or fullscreen mode (default:
1920)
type:
uint32_t
-
height: Window height or display resolution height if in exclusive display or fullscreen mode (default:
1080)
type:
uint32_t
-
framerate: Display framerate if in exclusive display mode (default:
60)
type:
uint32_t
-
use_exclusive_display: Enable exclusive display mode (default:
false)
type:
bool
-
fullscreen: Enable fullscreen window (default:
false)
type:
bool
-
headless: Enable headless mode. No window is opened, the render buffer can be output to render_buffer_output and/or depth_buffer_output if enabled. (default: false)
type:
bool
-
framebuffer_srgb: Enable sRGB framebuffer. If set to true, the operator will use an sRGB framebuffer for rendering. If set to false, the operator will use a linear framebuffer. (default:
false)
type:
bool
-
vsync: Enable vertical sync. If set to true the operator waits for the next vertical blanking period of the display to update the current image. (default:
false)
type:
bool
-
display_color_space: Set the display color space. Supported color spaces depend on the display setup. ‘ColorSpace::SRGB_NONLINEAR’ is always supported. In headless mode, only ‘ColorSpace::PASS_THROUGH’ is supported since there is no display. For other color spaces the display needs to be configured for HDR (default:
ColorSpace::AUTO)
type:
std::string
-
window_close_condition: BooleanCondition on the operator that will cause it to stop executing if the display window is closed. By default, this condition is created automatically during HolovizOp::initialize. The user may want to provide it if, for example, there are multiple HolovizOp operators that should share the same window close condition. When the condition is shared, closing one of the display windows also closes the others.
window_close_scheduling_term: This is a deprecated parameter name for window_close_condition. Please use window_close_condition instead, as window_close_scheduling_term will be removed in a future release.
type:
gxf::Handle<gxf::BooleanSchedulingTerm>
-
allocator: Allocator used to allocate memory for render_buffer_output and depth_buffer_output
type:
gxf::Handle<gxf::Allocator>
-
font_path: File path for the font used for rendering text (default:
"")
type:
std::string
-
cuda_stream_pool: Instance of gxf::CudaStreamPool
type:
gxf::Handle<gxf::CudaStreamPool>
-
camera_pose_output_type: Type of data output at camera_pose_output. Supported values are projection_matrix and extrinsics_model. Default value is projection_matrix.
type:
std::string
-
camera_eye: Initial camera eye position.
type:
std::array<float, 3>
-
camera_look_at: Initial camera look at position.
type:
std::array<float, 3>
-
camera_up: Initial camera up vector.
type:
std::array<float, 3>
-
key_callback: The callback function is called when a key is pressed, released or repeated.
type:
KeyCallbackFunction
-
unicode_char_callback: The callback function is called when a Unicode character is input.
type:
UnicodeCharCallbackFunction
-
mouse_button_callback: The callback function is called when a mouse button is pressed or released.
type:
MouseButtonCallbackFunction
-
scroll_callback: The callback function is called when a scrolling device is used, such as a mouse scroll wheel or the scroll area of a touch pad.
type:
ScrollCallbackFunction
-
cursor_pos_callback: The callback function is called when the cursor position changes. Coordinates are provided in screen coordinates, relative to the upper left edge of the content area.
type:
CursorPosCallbackFunction
-
framebuffer_size_callback: The callback function is called when the framebuffer is resized.
type:
FramebufferSizeCallbackFunction
-
window_size_callback: The callback function is called when the window is resized.
type:
WindowSizeCallbackFunction
-
window_close_callback: The callback function is called when the window is closed.
type:
WindowCloseCallbackFunction
-
layer_callback: The callback function is called when HolovizOp processed all layers defined by the input specification. It can be used to add extra layers.
type:
LayerCallbackFunction
-
==Device Memory Requirements==
If render_buffer_input or depth_buffer_input is enabled, the provided buffer is used and no memory block will be allocated. Otherwise, when using this operator with a BlockMemoryPool, a single device memory block is needed (storage_type = 1). The size of this memory block can be determined by rounding the width and height up to the nearest even size and then padding the rows as needed so that the row stride is a multiple of 256 bytes. C++ code to calculate the block size is as follows:
    #include <cstdint>

    int64_t get_block_size(int32_t height, int32_t width) {
      int32_t height_even = height + (height & 1);
      int32_t width_even = width + (width & 1);
      int64_t row_bytes = width_even * 4;  // 4 bytes per pixel for 8-bit RGBA or 32-bit depth
      int64_t row_stride = (row_bytes % 256 == 0) ? row_bytes : ((row_bytes / 256 + 1) * 256);
      return height_even * row_stride;
    }
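As a quick check of the calculation, the sketch below restates the same rounding rules and can be evaluated for concrete resolutions. The function name `block_size_bytes` and the example resolutions are illustrative choices only.

```cpp
#include <cassert>
#include <cstdint>

// Restatement of the block size rules described above:
// round width and height up to even values, use 4 bytes per pixel,
// and pad each row to a multiple of 256 bytes.
inline int64_t block_size_bytes(int32_t height, int32_t width) {
  assert(height > 0 && width > 0);
  const int64_t height_even = height + (height & 1);
  const int64_t width_even = width + (width & 1);
  const int64_t row_bytes = width_even * 4;
  const int64_t row_stride = (row_bytes + 255) / 256 * 256;  // round up to a multiple of 256
  return height_even * row_stride;
}
```

For a 1920x1080 buffer the rows are 7680 bytes, which is already a multiple of 256, so the block is 1080 * 7680 = 8294400 bytes. For an 853x479 buffer the size is first rounded up to 854x480; the 3416-byte rows are padded to 3584 bytes, giving 480 * 3584 = 1720320 bytes.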
==Notes==
Displaying Color Images
Image data can either be on host or device (GPU). Multiple image formats are supported:
R 8 bit unsigned
R 16 bit unsigned
R 16 bit float
R 32 bit unsigned
R 32 bit float
RGB 8 bit unsigned
BGR 8 bit unsigned
RGBA 8 bit unsigned
BGRA 8 bit unsigned
RGBA 16 bit unsigned
RGBA 16 bit float
RGBA 32 bit float
When the type parameter is set to color_lut the final color is looked up using the values from the color_lut parameter. For color lookups these image formats are supported:
R 8 bit unsigned
R 16 bit unsigned
R 32 bit unsigned
-
Drawing Geometry
In all cases, x and y are normalized coordinates in the range [0, 1]. The x and y correspond to the horizontal and vertical axes of the display, respectively. The origin (0, 0) is at the top left of the display. Geometric primitives outside of the visible area are clipped. Coordinate arrays are expected to have the shape (N, C) where N is the coordinate count and C is the component count for each coordinate.
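Because coordinates are normalized, positions expressed in pixels need to be scaled by the image size before they are passed to the operator. The following is a minimal sketch of that conversion for a single rectangle. `PixelBox` and `to_normalized_rect` are hypothetical names, and the result is a flat buffer holding two (x, y) pairs, that is, an array of shape (2, 2).

```cpp
#include <array>
#include <cassert>

// Hypothetical pixel-space box: upper-left (x1, y1) and lower-right (x2, y2).
struct PixelBox {
  float x1, y1, x2, y2;
};

// Convert a pixel-space box into normalized [0, 1] coordinates.
// The result holds two (x, y) pairs laid out as a (2, 2) array in row-major order.
inline std::array<float, 4> to_normalized_rect(const PixelBox& box, float image_width,
                                               float image_height) {
  assert(image_width > 0.0f && image_height > 0.0f);
  return {box.x1 / image_width, box.y1 / image_height,
          box.x2 / image_width, box.y2 / image_height};
}
```

For example, a box spanning pixels (480, 270) to (1440, 810) in a 1920x1080 image maps to (0.25, 0.25) and (0.75, 0.75).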
Points are defined by a (x, y) coordinate pair.
Lines are defined by a set of two (x, y) coordinate pairs.
Line strips are defined by a sequence of (x, y) coordinate pairs. The first two coordinates define the first line, each additional coordinate adds a line connecting to the previous coordinate.
Triangles are defined by a set of three (x, y) coordinate pairs.
Crosses are defined by (x, y, size) tuples. size specifies the size of the cross in the x direction and is optional; if omitted, it is set to 0.05. The size in the y direction is calculated using the aspect ratio of the window to make the crosses square.
Rectangles (bounding boxes) are defined by a pair of 2-tuples defining the upper-left and lower-right coordinates of a box: (x1, y1), (x2, y2).
Ovals are defined by (x, y, size_x, size_y) tuples. size_x and size_y are optional; if omitted, they are set to 0.05.
Texts are defined by (x, y, size) tuples. size specifies the size of the text in the y direction and is optional; if omitted, it is set to 0.05. The size in the x direction is calculated using the aspect ratio of the window. The index of each coordinate references a text string from the text parameter and the index is clamped to the size of the text array. For example, if there is one item set for the text parameter, e.g. text=["my_text"], and three coordinates, then my_text is rendered three times. If text=["first text", "second text"] and three coordinates are specified, then first text is rendered at the first coordinate, second text at the second coordinate and then second text again at the third coordinate. The text string array is fixed and can’t be changed after initialization. To hide text which should not be displayed, specify coordinates greater than (1.0, 1.0) for the text item; the text is then clipped away.
3D Points are defined by a (x, y, z) coordinate tuple.
3D Lines are defined by a set of two (x, y, z) coordinate tuples.
3D Line strips are defined by a sequence of (x, y, z) coordinate tuples. The first two coordinates define the first line, each additional coordinate adds a line connecting to the previous coordinate.
3D Triangles are defined by a set of three (x, y, z) coordinate tuples.
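The index clamping described for text primitives can be sketched as follows. This is an illustration of the documented behavior, not the operator's actual implementation, and `text_for_coordinate` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sketch of the documented clamping: coordinate i uses text[i],
// or the last string if i is past the end of the text array.
inline const std::string& text_for_coordinate(const std::vector<std::string>& text,
                                              std::size_t coordinate_index) {
  assert(!text.empty());
  return text[std::min(coordinate_index, text.size() - 1)];
}
```

With text=["first text", "second text"], coordinate indices 0, 1 and 2 select "first text", "second text" and "second text" respectively, matching the example in the text primitive description.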
-
Displaying Depth Maps
When type is depth_map the provided data is interpreted as a rectangular array of depth values. Additionally a 2d array with a color value for each point in the grid can be specified by setting type to depth_map_color.
The type of geometry drawn can be selected by setting depth_map_render_mode.
Depth maps are rendered in 3D and support camera movement.
Output
By default a window is opened to display the rendering, but the extension can also be run in headless mode with the headless parameter.
Using a display in exclusive mode is also supported with the use_exclusive_display parameter. This reduces the latency by avoiding the desktop compositor.
The rendered framebuffer can be output to render_buffer_output or depth_buffer_output if enabled.
==Notes==
When render_buffer_output or depth_buffer_output is enabled, this operator may launch CUDA kernels that execute asynchronously on a CUDA stream. As a result, the compute method may return before all GPU work has completed. Downstream operators that receive data from this operator should call op_input.receive_cuda_stream(<port_name>) to synchronize the CUDA stream with the downstream operator’s dedicated internal stream. This ensures proper synchronization before accessing the data. For more details on CUDA stream handling in Holoscan, see: https://docs.nvidia.com/holoscan/sdk-user-guide/holoscan_cuda_stream_handling.html
Public Types
-
enum class InputType
Input type.
All geometric primitives expect a 1d array of coordinates. Coordinates range from 0.0 (left, top) to 1.0 (right, bottom).
Values:
-
enumerator UNKNOWN
unknown type, the operator tries to guess the type by inspecting the tensor
-
enumerator COLOR
GRAY, RGB or RGBA 2d color image.
-
enumerator COLOR_LUT
single channel 2d image, color is looked up
-
enumerator POINTS
point primitives, one coordinate (x, y) per primitive
-
enumerator LINES
line primitives, two coordinates (x0, y0) and (x1, y1) per primitive
-
enumerator LINE_STRIP
line strip primitive, a line primitive i is defined by each coordinate (xi, yi) and the following (xi+1, yi+1)
-
enumerator TRIANGLES
triangle primitive, three coordinates (x0, y0), (x1, y1) and (x2, y2) per primitive
-
enumerator CROSSES
cross primitive, a cross is defined by the center coordinate and the size (xi, yi, si)
-
enumerator RECTANGLES
axis aligned rectangle primitive, each rectangle is defined by two coordinates (xi, yi) and (xi+1, yi+1)
-
enumerator OVALS
oval primitive, an oval primitive is defined by the center coordinate and the axis sizes (xi, yi, sxi, syi)
-
enumerator TEXT
text is defined by the top left coordinate and the size (x, y, s) per string, text strings are defined by InputSpec::text_
-
enumerator DEPTH_MAP
single channel 2d array where each element represents a depth value. The data is rendered as a 3d object using points, lines or triangles. The color for the elements can be specified through
DEPTH_MAP_COLOR. Supported format: 8-bit unsigned normalized format that has a single 8-bit depth component
-
enumerator DEPTH_MAP_COLOR
RGBA 2d image, same size as the depth map. One color value for each element of the depth map grid. Supported format: 32-bit unsigned normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3
-
enumerator POINTS_3D
3D point primitives, one coordinate (x, y, z) per primitive
-
enumerator LINES_3D
3D line primitives, two coordinates (x0, y0, z0) and (x1, y1, z1) per primitive
-
enumerator LINE_STRIP_3D
3D line strip primitive, a line primitive i is defined by each coordinate (xi, yi, zi) and the following (xi+1, yi+1, zi+1)
-
enumerator TRIANGLES_3D
3D triangle primitive, three coordinates (x0, y0, z0), (x1, y1, z1) and (x2, y2, z2) per primitive
-
enum class ImageFormat
Image formats.
{component format}_{numeric format}
component format
indicates the size in bits of the R, G, B, A or Y, U, V components if present
-
numeric format
UNORM - unsigned normalized values, range [0, 1]
SNORM - signed normalized values, range [-1, 1]
UINT - unsigned integer values, range [0, 2^n - 1]
SINT - signed integer values, range [-2^(n-1), 2^(n-1) - 1]
SFLOAT - signed floating-point numbers
SRGB - the R, G, and B components are unsigned normalized values that represent values using sRGB nonlinear encoding, while the A component (if one exists) is a regular unsigned normalized value
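To make the normalized numeric formats concrete, the sketch below converts 8-bit UNORM and SNORM values to floating point. It follows the conventional conversion used by graphics APIs such as Vulkan and is meant as an illustration only; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// 8-bit UNORM: map [0, 255] linearly onto [0, 1].
inline float unorm8_to_float(uint8_t v) {
  return static_cast<float>(v) / 255.0f;
}

// 8-bit SNORM: map [-127, 127] linearly onto [-1, 1]; -128 is clamped to -1.
inline float snorm8_to_float(int8_t v) {
  return std::max(static_cast<float>(v) / 127.0f, -1.0f);
}
```

The UINT and SINT formats, in contrast, are read as plain integers without any scaling.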
-
multi-planar formats
2PLANE - data is stored in two separate memory planes
3PLANE - data is stored in three separate memory planes
-
YUV formats
420 - the horizontal and vertical resolution of the chroma (UV) planes is halved
422 - the horizontal resolution of the chroma (UV) planes is halved
-
Note: this needs to match the viz::ImageFormat enum (except the AUTO_DETECT value).
Values:
-
enumerator R8_UINT
specifies a one-component, 8-bit unsigned integer format that has a single 8-bit R component
-
enumerator R8_SINT
specifies a one-component, 8-bit signed integer format that has a single 8-bit R component
-
enumerator R8_UNORM
specifies a one-component, 8-bit unsigned normalized format that has a single 8-bit R component
-
enumerator R8_SNORM
specifies a one-component, 8-bit signed normalized format that has a single 8-bit R component
-
enumerator R8_SRGB
specifies a one-component, 8-bit unsigned normalized format that has a single 8-bit R component stored with sRGB nonlinear encoding
-
enumerator R16_UINT
specifies a one-component, 16-bit unsigned integer format that has a single 16-bit R component
-
enumerator R16_SINT
specifies a one-component, 16-bit signed integer format that has a single 16-bit R component
-
enumerator R16_UNORM
specifies a one-component, 16-bit unsigned normalized format that has a single 16-bit R component
-
enumerator R16_SNORM
specifies a one-component, 16-bit signed normalized format that has a single 16-bit R component
-
enumerator R16_SFLOAT
specifies a one-component, 16-bit signed floating-point format that has a single 16-bit R component
-
enumerator R32_UINT
specifies a one-component, 32-bit unsigned integer format that has a single 32-bit R component
-
enumerator R32_SINT
specifies a one-component, 32-bit signed integer format that has a single 32-bit R component
-
enumerator R32_SFLOAT
specifies a one-component, 32-bit signed floating-point format that has a single 32-bit R component
-
enumerator R8G8B8_UNORM
specifies a three-component, 24-bit unsigned normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, and an 8-bit B component in byte 2
-
enumerator R8G8B8_SNORM
specifies a three-component, 24-bit signed normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, and an 8-bit B component in byte 2
-
enumerator R8G8B8_SRGB
specifies a three-component, 24-bit unsigned normalized format that has an 8-bit R component stored with sRGB nonlinear encoding in byte 0, an 8-bit G component stored with sRGB nonlinear encoding in byte 1, and an 8-bit B component stored with sRGB nonlinear encoding in byte 2
-
enumerator R8G8B8A8_UNORM
specifies a four-component, 32-bit unsigned normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3
-
enumerator R8G8B8A8_SNORM
specifies a four-component, 32-bit signed normalized format that has an 8-bit R component in byte 0, an 8-bit G component in byte 1, an 8-bit B component in byte 2, and an 8-bit A component in byte 3
-
enumerator R8G8B8A8_SRGB
specifies a four-component, 32-bit unsigned normalized format that has an 8-bit R component stored with sRGB nonlinear encoding in byte 0, an 8-bit G component stored with sRGB nonlinear encoding in byte 1, an 8-bit B component stored with sRGB nonlinear encoding in byte 2, and an 8-bit A component in byte 3
-
enumerator R16G16B16A16_UNORM
specifies a four-component, 64-bit unsigned normalized format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7
-
enumerator R16G16B16A16_SNORM
specifies a four-component, 64-bit signed normalized format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7
-
enumerator R16G16B16A16_SFLOAT
specifies a four-component, 64-bit signed floating-point format that has a 16-bit R component in bytes 0..1, a 16-bit G component in bytes 2..3, a 16-bit B component in bytes 4..5, and a 16-bit A component in bytes 6..7
-
enumerator R32G32B32A32_SFLOAT
specifies a four-component, 128-bit signed floating-point format that has a 32-bit R component in bytes 0..3, a 32-bit G component in bytes 4..7, a 32-bit B component in bytes 8..11, and a 32-bit A component in bytes 12..15
-
enumerator D16_UNORM
specifies a one-component, 16-bit unsigned normalized format that has a single 16-bit depth component
-
enumerator X8_D24_UNORM
specifies a two-component, 32-bit format that has 24 unsigned normalized bits in the depth component, and, optionally, 8 bits that are unused
-
enumerator D32_SFLOAT
specifies a one-component, 32-bit signed floating-point format that has 32 bits in the depth component
-
enumerator A2B10G10R10_UNORM_PACK32
specifies a four-component, 32-bit packed unsigned normalized format that has a 2-bit A component in bits 30..31, a 10-bit B component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit R component in bits 0..9.
-
enumerator A2R10G10B10_UNORM_PACK32
specifies a four-component, 32-bit packed unsigned normalized format that has a 2-bit A component in bits 30..31, a 10-bit R component in bits 20..29, a 10-bit G component in bits 10..19, and a 10-bit B component in bits 0..9.
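The bit layout of the packed 10-bit formats can be illustrated with a small unpacking routine. This is a minimal sketch for A2B10G10R10_UNORM_PACK32 based on the bit ranges listed above; `Rgba` and `unpack_a2b10g10r10` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct Rgba {
  float r, g, b, a;
};

// Unpack one A2B10G10R10_UNORM_PACK32 texel: R in bits 0..9, G in bits 10..19,
// B in bits 20..29, A in bits 30..31. Each field is normalized to [0, 1].
inline Rgba unpack_a2b10g10r10(uint32_t texel) {
  const float r = static_cast<float>(texel & 0x3FFu) / 1023.0f;
  const float g = static_cast<float>((texel >> 10) & 0x3FFu) / 1023.0f;
  const float b = static_cast<float>((texel >> 20) & 0x3FFu) / 1023.0f;
  const float a = static_cast<float>((texel >> 30) & 0x3u) / 3.0f;
  return Rgba{r, g, b, a};
}
```

For A2R10G10B10_UNORM_PACK32 the same shifts apply, with the R and B fields swapped.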
-
enumerator B8G8R8A8_UNORM
specifies a four-component, 32-bit unsigned normalized format that has an 8-bit B component in byte 0, an 8-bit G component in byte 1, an 8-bit R component in byte 2, and an 8-bit A component in byte 3
-
enumerator B8G8R8A8_SRGB
specifies a four-component, 32-bit unsigned normalized format that has an 8-bit B component stored with sRGB nonlinear encoding in byte 0, an 8-bit G component stored with sRGB nonlinear encoding in byte 1, an 8-bit R component stored with sRGB nonlinear encoding in byte 2, and an 8-bit A component in byte 3
-
enumerator A8B8G8R8_UNORM_PACK32
specifies a four-component, 32-bit packed unsigned normalized format that has an 8-bit A component in bits 24..31, an 8-bit B component in bits 16..23, an 8-bit G component in bits 8..15, and an 8-bit R component in bits 0..7.
-
enumerator A8B8G8R8_SRGB_PACK32
specifies a four-component, 32-bit packed unsigned normalized format that has an 8-bit A component in bits 24..31, an 8-bit B component stored with sRGB nonlinear encoding in bits 16..23, an 8-bit G component stored with sRGB nonlinear encoding in bits 8..15, and an 8-bit R component stored with sRGB nonlinear encoding in bits 0..7.
-
enumerator Y8U8Y8V8_422_UNORM
specifies a four-component, 32-bit format containing a pair of Y components, a V component, and a U component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One Y value is present at each i coordinate, with the U and V values shared across both Y values and thus recorded at half the horizontal resolution of the image. This format has an 8-bit Y component for the even i coordinate in byte 0, an 8-bit U component in byte 1, an 8-bit Y component for the odd i coordinate in byte 2, and an 8-bit V component in byte 3. This format only supports images with a width that is a multiple of two.
-
enumerator U8Y8V8Y8_422_UNORM
specifies a four-component, 32-bit format containing a pair of Y components, a V component, and a U component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One Y value is present at each i coordinate, with the U and V values shared across both Y values and thus recorded at half the horizontal resolution of the image. This format has an 8-bit U component in byte 0, an 8-bit Y component for the even i coordinate in byte 1, an 8-bit V component in byte 2, and an 8-bit Y component for the odd i coordinate in byte 3. This format only supports images with a width that is a multiple of two.
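The sharing of chroma samples in the packed 4:2:2 formats can be sketched as follows for Y8U8Y8V8_422_UNORM. The helper reads the raw 8-bit samples for one pixel of a row; `Yuv` and `sample_y8u8y8v8` are hypothetical names, and no color conversion is performed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Yuv {
  uint8_t y, u, v;
};

// Fetch the Y, U and V samples for pixel x of one Y8U8Y8V8_422_UNORM row.
// Each 4-byte group [Y0, U, Y1, V] covers two horizontally adjacent pixels,
// which share the same U and V values.
inline Yuv sample_y8u8y8v8(const uint8_t* row, std::size_t x) {
  assert(row != nullptr);
  const uint8_t* group = row + (x / 2) * 4;
  const uint8_t y = (x % 2 == 0) ? group[0] : group[2];
  return Yuv{y, group[1], group[3]};
}
```

For U8Y8V8Y8_422_UNORM the byte order within each group is [U, Y0, V, Y1] instead.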
-
enumerator Y8_U8V8_2PLANE_420_UNORM
specifies an unsigned normalized multi-planar format that has an 8-bit Y component in plane 0, and a two-component, 16-bit UV plane 1 consisting of an 8-bit U component in byte 0 and an 8-bit V component in byte 1. The horizontal and vertical dimensions of the UV plane are halved relative to the image dimensions. This format only supports images with a width and height that are a multiple of two.
-
enumerator Y8_U8V8_2PLANE_422_UNORM
specifies an unsigned normalized multi-planar format that has an 8-bit Y component in plane 0, and a two-component, 16-bit UV plane 1 consisting of an 8-bit U component in byte 0 and an 8-bit V component in byte 1. The horizontal dimension of the UV plane is halved relative to the image dimensions. This format only supports images with a width that is a multiple of two.
-
enumerator Y8_U8_V8_3PLANE_420_UNORM
specifies an unsigned normalized multi-planar format that has an 8-bit Y component in plane 0, an 8-bit U component in plane 1, and an 8-bit V component in plane 2. The horizontal and vertical dimensions of the V and U planes are halved relative to the image dimensions. This format only supports images with a width and height that are a multiple of two.
-
enumerator Y8_U8_V8_3PLANE_422_UNORM
specifies an unsigned normalized multi-planar format that has an 8-bit Y component in plane 0, an 8-bit U component in plane 1, and an 8-bit V component in plane 2. The horizontal dimension of the V and U plane is halved relative to the image dimensions. This format only supports images with a width that is a multiple of two.
-
enumerator Y16_U16V16_2PLANE_420_UNORM
specifies an unsigned normalized multi-planar format that has a 16-bit Y component in each 16-bit word of plane 0, and a two-component, 32-bit UV plane 1 consisting of a 16-bit U component in the word in bytes 0..1, and a 16-bit V component in the word in bytes 2..3. The horizontal and vertical dimensions of the UV plane are halved relative to the image dimensions. This format only supports images with a width and height that are a multiple of two.
-
enumerator Y16_U16V16_2PLANE_422_UNORM
specifies an unsigned normalized multi-planar format that has a 16-bit Y component in each 16-bit word of plane 0, and a two-component, 32-bit UV plane 1 consisting of a 16-bit U component in the word in bytes 0..1, and a 16-bit V component in the word in bytes 2..3. The horizontal dimension of the UV plane is halved relative to the image dimensions. This format only supports images with a width that is a multiple of two.
-
enumerator Y16_U16_V16_3PLANE_420_UNORM
specifies an unsigned normalized multi-planar format that has a 16-bit Y component in each 16-bit word of plane 0, a 16-bit U component in each 16-bit word of plane 1, and a 16-bit V component in each 16-bit word of plane 2. The horizontal and vertical dimensions of the V and U planes are halved relative to the image dimensions. This format only supports images with a width and height that are a multiple of two.
-
enumerator Y16_U16_V16_3PLANE_422_UNORM
specifies an unsigned normalized multi-planar format that has a 16-bit Y component in each 16-bit word of plane 0, a 16-bit U component in each 16-bit word of plane 1, and a 16-bit V component in each 16-bit word of plane 2. The horizontal dimension of the V and U plane is halved relative to the image dimensions. This format only supports images with a width that is a multiple of two.
-
enumerator AUTO_DETECT
Auto detect the image format. If the input is a video buffer, the format of the video buffer is used. If the input is a tensor, the format depends on the component type and on the component count:
one component : gray level image
three components : RGB image
four components : RGBA image
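The component-count rule for tensor inputs can be sketched as a small lookup. This is only an illustration of the rule stated above, not the operator's code: `DetectedLayout` and `detect_layout` are hypothetical names, and the component type (which also influences the chosen format) is ignored here.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the layout picked for a tensor input;
// not part of the Holoscan API.
enum class DetectedLayout { GRAY, RGB, RGBA };

// Map the component count of a tensor to a layout following the rule
// documented for AUTO_DETECT. Counts other than 1, 3 or 4 are reported
// as "not detectable" (an assumption of this sketch).
std::optional<DetectedLayout> detect_layout(int32_t component_count) {
  switch (component_count) {
    case 1:
      return DetectedLayout::GRAY;
    case 3:
      return DetectedLayout::RGB;
    case 4:
      return DetectedLayout::RGBA;
    default:
      return std::nullopt;
  }
}
```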
-
-
-
enum class YuvModelConversion
Defines the conversion from the source color model to the shader color model.
Values:
-
enumerator YUV_601
specifies the color model conversion from YUV to RGB defined in BT.601
-
enumerator YUV_709
specifies the color model conversion from YUV to RGB defined in BT.709
-
enumerator YUV_2020
specifies the color model conversion from YUV to RGB defined in BT.2020
-
-
enum class YuvRange
Specifies the YUV range
Values:
-
enumerator ITU_FULL
specifies that the full range of the encoded values are valid and interpreted according to the ITU “full range” quantization rules
-
enumerator ITU_NARROW
specifies that headroom and foot room are reserved in the numerical range of encoded values, and the remaining values are expanded according to the ITU “narrow range” quantization rules
-
-
enum class ChromaLocation
Defines the location of downsampled chroma component samples relative to the luma samples.
Values:
-
enumerator COSITED_EVEN
specifies that downsampled chroma samples are aligned with luma samples with even coordinates
-
enumerator MIDPOINT
specifies that downsampled chroma samples are located half way between each even luma sample and the nearest higher odd luma sample.
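The following sketch shows how a downsampled chroma axis relates to the luma grid: the size of the chroma plane for the 420 and 422 formats listed under ImageFormat, and the position of a chroma sample for the two ChromaLocation values. All names (`ChromaSubsampling`, `ChromaLocationSketch`, `Extent`, `chroma_plane_extent`, `chroma_sample_position`) are hypothetical and only restate the descriptions above.

```cpp
#include <optional>

enum class ChromaSubsampling { S420, S422 };
enum class ChromaLocationSketch { COSITED_EVEN, MIDPOINT };

struct Extent {
  int width;
  int height;
};

// Size of a chroma plane. 420 halves width and height, 422 halves only
// the width. Returns no value if the image size violates the
// "multiple of two" requirement of the format.
std::optional<Extent> chroma_plane_extent(Extent image,
                                          ChromaSubsampling subsampling) {
  if (image.width % 2 != 0) { return std::nullopt; }
  if (subsampling == ChromaSubsampling::S420) {
    if (image.height % 2 != 0) { return std::nullopt; }
    return Extent{image.width / 2, image.height / 2};
  }
  return Extent{image.width / 2, image.height};
}

// Position of chroma sample `chroma_index` in luma sample coordinates
// along one axis that is downsampled by two.
double chroma_sample_position(int chroma_index, ChromaLocationSketch location) {
  const double even_luma = 2.0 * chroma_index;
  // MIDPOINT lies half way between the even luma sample and the next odd one.
  return location == ChromaLocationSketch::MIDPOINT ? even_luma + 0.5 : even_luma;
}
```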
-
-
enum class DepthMapRenderMode
Depth map render mode.
Values:
-
enumerator POINTS
render points
-
enumerator LINES
render lines
-
enumerator TRIANGLES
render triangles
-
-
enum class ColorSpace
The color space specifies how the surface data is interpreted when presented on screen.
Note: this needs to match the viz::ColorSpace enum (except the AUTO value).
Values:
-
enumerator SRGB_NONLINEAR
sRGB color space
-
enumerator EXTENDED_SRGB_LINEAR
extended sRGB color space to be displayed using a linear EOTF
-
enumerator BT2020_LINEAR
BT2020 color space to be displayed using a linear EOTF.
-
enumerator HDR10_ST2084
HDR10 (BT2020 color) space to be displayed using the SMPTE ST2084 Perceptual Quantizer (PQ) EOTF
-
enumerator PASS_THROUGH
color components are used “as is”
-
enumerator BT709_LINEAR
BT709 color space to be displayed using a linear EOTF.
-
enumerator AUTO
Auto select the color format. If a display is connected, SRGB_NONLINEAR is used; in headless mode, PASS_THROUGH is used.
-
-
using Key = viz::Key
export the types used by the callbacks directly from Holoviz module
-
using KeyAndButtonAction = viz::KeyAndButtonAction
-
using KeyModifiers = viz::KeyModifiers
-
using MouseButton = viz::MouseButton
-
using KeyCallbackFunction = std::function<void(Key key, KeyAndButtonAction action, KeyModifiers modifiers)>
Function pointer type for key callbacks.
The callback function receives:
key: the key that was pressed
action: key action (PRESS, RELEASE, REPEAT)
modifiers: bit field describing which modifiers were held down
-
-
using UnicodeCharCallbackFunction = std::function<void(uint32_t code_point)>
Function pointer type for Unicode character callbacks.
The callback function receives:
code_point: Unicode code point of the character
-
-
using MouseButtonCallbackFunction = std::function<void(MouseButton button, KeyAndButtonAction action, KeyModifiers modifiers)>
Function pointer type for mouse button callbacks.
The callback function receives:
button: the mouse button that was pressed
action: button action (PRESS, RELEASE)
modifiers: bit field describing which modifiers were held down
-
-
using ScrollCallbackFunction = std::function<void(double x_offset, double y_offset)>
Function pointer type for scroll callbacks.
The callback function receives:
x_offset: scroll offset along the x-axis
y_offset: scroll offset along the y-axis
-
-
using CursorPosCallbackFunction = std::function<void(double x_pos, double y_pos)>
Function pointer type for cursor position callbacks.
The callback function receives:
x_pos: new cursor x-coordinate in screen coordinates, relative to the left edge of the content area
y_pos: new cursor y-coordinate in screen coordinates, relative to the top edge of the content area
-
-
using FramebufferSizeCallbackFunction = std::function<void(int width, int height)>
Function pointer type for framebuffer size callbacks.
The callback function receives:
width: new width of the framebuffer in pixels
height: new height of the framebuffer in pixels
-
-
using WindowSizeCallbackFunction = std::function<void(int width, int height)>
Function pointer type for window size callbacks.
The callback function receives:
width: new width of the window in screen coordinates
height: new height of the window in screen coordinates
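The cursor position and window size are reported in screen coordinates, while the framebuffer size is reported in pixels. The sketch below combines the three callback types to convert a cursor position to framebuffer pixels. The alias declarations repeat the signatures documented above; `CursorTracker` is a hypothetical helper, and how the callbacks are handed to the operator is not shown.

```cpp
#include <functional>

// Same signatures as the callback types documented above.
using CursorPosCallbackFunction = std::function<void(double x_pos, double y_pos)>;
using FramebufferSizeCallbackFunction = std::function<void(int width, int height)>;
using WindowSizeCallbackFunction = std::function<void(int width, int height)>;

// Hypothetical helper (not part of HolovizOp). The callbacks capture `this`,
// so the tracker must outlive them.
struct CursorTracker {
  int window_width = 1;
  int window_height = 1;
  int framebuffer_width = 1;
  int framebuffer_height = 1;
  double pixel_x = 0.0;
  double pixel_y = 0.0;

  WindowSizeCallbackFunction window_size_callback() {
    return [this](int width, int height) {
      window_width = width;
      window_height = height;
    };
  }

  FramebufferSizeCallbackFunction framebuffer_size_callback() {
    return [this](int width, int height) {
      framebuffer_width = width;
      framebuffer_height = height;
    };
  }

  CursorPosCallbackFunction cursor_pos_callback() {
    return [this](double x_pos, double y_pos) {
      // Skip the update while the window has no area (e.g. minimized).
      if (window_width <= 0 || window_height <= 0) { return; }
      // Scale screen coordinates by the framebuffer-to-window size ratio.
      pixel_x = x_pos * framebuffer_width / window_width;
      pixel_y = y_pos * framebuffer_height / window_height;
    };
  }
};
```

On displays where one screen coordinate covers several pixels, the two sizes differ and the scaling matters; when they are equal, the conversion leaves the position unchanged.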
-
-
using WindowCloseCallbackFunction = std::function<void()>
Function pointer type for window close callbacks.
The callback function receives no parameter.
-
using LayerCallbackFunction = std::function<void(const std::vector<holoscan::gxf::Entity> &inputs)>
Function pointer type for layer callbacks. This function is called after HolovizOp has processed all layers defined by the input specification. It can be used to add extra layers.
The callback function receives:
inputs: the entities received from the ‘receivers’ input port
-
Public Functions
-
HOLOSCAN_OPERATOR_FORWARD_ARGS(HolovizOp)
-
HolovizOp() = default
-
virtual void setup(OperatorSpec &spec) override
Define the operator specification.
- Parameters
spec – The reference to the operator specification.
-
virtual void initialize() override
Initialize the operator.
This function is called when the fragment is initialized by Executor::initialize_fragment().
-
virtual void start() override
Implement the startup logic of the operator.
This method is called multiple times over the lifecycle of the operator according to the order defined in the lifecycle, and used for heavy initialization tasks such as allocating memory resources.
-
virtual void compute(InputContext &op_input, OutputContext &op_output, ExecutionContext &context) override
Implement the compute method.
This method is called by the runtime multiple times. The runtime calls this method until the operator is stopped.
- Parameters
op_input – The input context of the operator.
op_output – The output context of the operator.
context – The execution context of the operator.
-
-
virtual void stop() override
Implement the shutdown logic of the operator.
This method is called multiple times over the lifecycle of the operator according to the order defined in the lifecycle, and used for heavy deinitialization tasks such as deallocation of all resources previously assigned in start.
-
void default_window_close_callback()
Default window-close behavior executed by Holoviz.
This helper performs Holoviz’s built-in window-close handling. When the Holoviz window is requested to close:
If the application is running in a distributed configuration, this method initiates a distributed application shutdown via Application::initiate_distributed_app_shutdown().
Otherwise, it is a no-op (normal single-fragment shutdown proceeds via the associated window_close_condition).
This method is also exposed to Python so user-provided callbacks can easily preserve the default shutdown semantics by calling HolovizOp.default_window_close_callback() inside their custom window_close_callback.
-
Public Static Functions
-
static nvidia::gxf::Expected<holoscan::ops::HolovizOp::InputType> inputTypeFromString(const std::string &string)
Convert a string to an input type enum
- Parameters
string – input type string
- Returns
input type enum
-
static std::string inputTypeToString(holoscan::ops::HolovizOp::InputType input_type)
Convert an input type enum to a string
- Parameters
input_type – input type enum
- Returns
input type string
-
static nvidia::gxf::Expected<holoscan::ops::HolovizOp::ImageFormat> imageFormatFromString(const std::string &string)
Convert a string to an image format enum
- Parameters
string – image format string
- Returns
image format enum
-
static std::string imageFormatToString(holoscan::ops::HolovizOp::ImageFormat image_format)
Convert an image format enum to a string
- Parameters
image_format – image format enum
- Returns
image format string
-
static nvidia::gxf::Expected<holoscan::ops::HolovizOp::DepthMapRenderMode> depthMapRenderModeFromString(const std::string &string)
Convert a string to a depth map render mode enum
- Parameters
string – depth map render mode string
- Returns
depth map render mode enum
-
static std::string depthMapRenderModeToString(holoscan::ops::HolovizOp::DepthMapRenderMode depth_map_render_mode)
Convert a depth map render mode enum to a string
- Parameters
depth_map_render_mode – depth map render mode enum
- Returns
depth map render mode string
-
static nvidia::gxf::Expected<holoscan::ops::HolovizOp::YuvModelConversion> yuvModelConversionFromString(const std::string &string)
Convert a string to a yuv model conversion enum
- Parameters
string – yuv model conversion string
- Returns
yuv model conversion enum
-
static std::string yuvModelConversionToString(holoscan::ops::HolovizOp::YuvModelConversion yuv_model_conversion)
Convert a yuv model conversion enum to a string
- Parameters
yuv_model_conversion – yuv model conversion enum
- Returns
yuv model conversion string
-
static nvidia::gxf::Expected<holoscan::ops::HolovizOp::YuvRange> yuvRangeFromString(const std::string &string)
Convert a string to a yuv range enum
- Parameters
string – yuv range string
- Returns
yuv range enum
-
static std::string yuvRangeToString(holoscan::ops::HolovizOp::YuvRange yuv_range)
Convert a yuv range enum to a string
- Parameters
yuv_range – yuv range enum
- Returns
yuv range string
-
static nvidia::gxf::Expected<holoscan::ops::HolovizOp::ChromaLocation> chromaLocationFromString(const std::string &string)
Convert a string to a chroma location enum
- Parameters
string – chroma location string
- Returns
chroma location enum
-
static std::string chromaLocationToString(holoscan::ops::HolovizOp::ChromaLocation chroma_location)
Convert a chroma location enum to a string
- Parameters
chroma_location – chroma location enum
- Returns
chroma location string
-
static nvidia::gxf::Expected<holoscan::ops::HolovizOp::ColorSpace> colorSpaceFromString(const std::string &string)
Convert a string to a color space enum
- Parameters
string – color space string
- Returns
color space enum
-
static std::string colorSpaceToString(holoscan::ops::HolovizOp::ColorSpace color_space)
Convert a color space enum to a string
- Parameters
color_space – color space enum
- Returns
color space string
Public Static Attributes
-
static const std::array<std::pair<InputType, std::string>, 17> kInputTypeToStr
table to convert input type to string
-
static const std::array<std::pair<holoscan::ops::HolovizOp::ImageFormat, std::string>, 41> kImageFormatToStr
table to convert image format to string
-
static const std::array<std::pair<holoscan::ops::HolovizOp::DepthMapRenderMode, std::string>, 3> kDepthMapRenderModeToStr
table to convert depth map render mode to string
-
static const std::array<std::pair<holoscan::ops::HolovizOp::YuvModelConversion, std::string>, 3> kYuvModelConversionToStr
table to convert yuv model conversion enum to string
-
static const std::array<std::pair<holoscan::ops::HolovizOp::YuvRange, std::string>, 2> kYuvRangeToStr
table to convert yuv range enum to string
-
static const std::array<std::pair<holoscan::ops::HolovizOp::ChromaLocation, std::string>, 2> kChromaLoactionToStr
table to convert chroma location enum to string
-
static const std::array<std::pair<holoscan::ops::HolovizOp::ColorSpace, std::string>, 7> kColorSpaceToStr
table to convert color space enum to string
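The conversion functions and tables above follow one pattern: a fixed-size array of enum/string pairs that is searched in either direction. A minimal sketch of that pattern is given below, under stated assumptions: `RenderMode`, `kRenderModeToStr` and the two functions are stand-ins, the string values are illustrative, and `std::optional` replaces `nvidia::gxf::Expected` so that the example only needs the standard library.

```cpp
#include <array>
#include <optional>
#include <string>
#include <utility>

// Stand-in enum for this sketch (mirrors DepthMapRenderMode).
enum class RenderMode { POINTS, LINES, TRIANGLES };

// Table of enum/string pairs; the strings are illustrative assumptions.
static const std::array<std::pair<RenderMode, std::string>, 3> kRenderModeToStr{
    {{RenderMode::POINTS, "points"},
     {RenderMode::LINES, "lines"},
     {RenderMode::TRIANGLES, "triangles"}}};

// String to enum: a failed lookup is reported as an empty optional.
std::optional<RenderMode> render_mode_from_string(const std::string& text) {
  for (const auto& [mode, name] : kRenderModeToStr) {
    if (name == text) { return mode; }
  }
  return std::nullopt;
}

// Enum to string: falls back to "invalid" for values missing from the table.
std::string render_mode_to_string(RenderMode mode) {
  for (const auto& [table_mode, name] : kRenderModeToStr) {
    if (table_mode == mode) { return name; }
  }
  return "invalid";
}
```

Keeping a single table per enum means both directions stay consistent when an enumerator is added.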
Protected Functions
-
void disable_via_window_close()
Friends
- friend class ::holoscan::FirstPixelOutCondition
- friend class ::holoscan::PresentDoneCondition
-
struct InputSpec
Input specification
Public Functions
-
InputSpec() = default
-
inline InputSpec(const std::string &tensor_name, InputType type)
-
InputSpec(const std::string &tensor_name, const std::string &type_str)
-
explicit InputSpec(const std::string &yaml_description)
- Returns
an InputSpec from the YAML form output by description()
-
inline explicit operator bool() const noexcept
- Returns
true if the input spec is valid
-
std::string description() const
- Returns
a YAML string representation of the InputSpec
Public Members
-
std::string tensor_name_
name of the tensor/video buffer containing the input data
-
float opacity_ = 1.F
layer opacity, 1.0 is fully opaque, 0.0 is fully transparent
-
int32_t priority_ = 0
layer priority, determines the render order, layers with higher priority values are rendered on top of layers with lower priority values
-
ImageFormat image_format_ = ImageFormat::AUTO_DETECT
image format
-
YuvModelConversion yuv_model_conversion_ = YuvModelConversion::YUV_601
YUV model conversion.
-
ChromaLocation x_chroma_location_ = ChromaLocation::COSITED_EVEN
chroma location in x direction for formats which are chroma downsampled in width (420 and 422)
-
ChromaLocation y_chroma_location_ = ChromaLocation::COSITED_EVEN
chroma location in y direction for formats which are chroma downsampled in height (420)
-
std::vector<float> color_ = {1.F, 1.F, 1.F, 1.F}
color of rendered geometry
-
float line_width_ = 1.F
line width for geometry made of lines
-
float point_size_ = 1.F
point size for geometry made of points
-
std::vector<std::string> text_
array of text strings, used when type_ is TEXT.
-
DepthMapRenderMode depth_map_render_mode_ = DepthMapRenderMode::POINTS
depth map render mode, used if type_ is DEPTH_MAP or DEPTH_MAP_COLOR.
-
std::vector<View> views_
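The effect of priority_ can be sketched as a sort: layers are drawn in ascending priority, so the layer with the highest value ends up on top. `LayerSketch` and `render_order` are hypothetical names. Keeping the input order for layers with equal priority is an assumption of this sketch, since the documentation above does not specify how ties are resolved.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical subset of the InputSpec members relevant for ordering.
struct LayerSketch {
  std::string tensor_name;
  int32_t priority = 0;
  float opacity = 1.F;
};

// Return the layers in draw order: lowest priority first, highest last,
// so higher priority layers are rendered on top of lower priority ones.
std::vector<LayerSketch> render_order(std::vector<LayerSketch> layers) {
  // stable_sort keeps the input order of layers with equal priority
  // (an assumption, see the lead-in).
  std::stable_sort(layers.begin(), layers.end(),
                   [](const LayerSketch& a, const LayerSketch& b) {
                     return a.priority < b.priority;
                   });
  return layers;
}
```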
-
struct View
Layer view.
By default a layer will fill the whole window. When using a view the layer can be placed freely within the window.
Layers can also be placed in 3D space by specifying a 3D transformation matrix. Note that for geometry layers there is a default matrix which allows coordinates in the range of [0 … 1] instead of the Vulkan [-1 … 1] range. When specifying a matrix for a geometry layer, this default matrix is overwritten.
When multiple views are specified the layer is drawn multiple times using the specified layer views.
It’s possible to specify a negative term for height, which flips the image. When using a negative height, one should also adjust the y value to point to the lower left corner of the viewport instead of the upper left corner.
Public Members
-
float offset_x_ = 0.F
-
float offset_y_ = 0.F
offset of the top-left corner of the view. The top-left coordinate of the window area is (0, 0), the bottom-right coordinate is (1, 1).
-
float width_ = 1.F
-
float height_ = 1.F
width and height of the view in normalized range. 1.0 is full size.
-
std::optional<std::array<float, 16>> matrix_
row major 4x4 transform matrix (optional, can be std::nullopt)
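As an illustration of the normalized view coordinates and of the negative-height flip described above, the sketch below converts a view to a pixel rectangle and builds the flipped variant of a view. `ViewSketch`, `PixelRect`, `to_pixels` and `flipped` are hypothetical; the struct only mirrors the offset and size members of View, and the matrix is left out.

```cpp
// Hypothetical mirror of the offset and size members of View.
struct ViewSketch {
  float offset_x = 0.F;
  float offset_y = 0.F;
  float width = 1.F;
  float height = 1.F;
};

struct PixelRect {
  float x;
  float y;
  float width;
  float height;
};

// Scale the normalized view ((0, 0) top left, (1, 1) bottom right)
// to a window of the given size in pixels.
PixelRect to_pixels(const ViewSketch& view, int window_width, int window_height) {
  return {view.offset_x * window_width, view.offset_y * window_height,
          view.width * window_width, view.height * window_height};
}

// Flip a view vertically: negate the height and move the y offset from
// the upper edge to the lower edge of the same area.
ViewSketch flipped(const ViewSketch& view) {
  return {view.offset_x, view.offset_y + view.height, view.width, -view.height};
}
```

For the default full-window view this gives an offset of (0, 1) with a height of -1, which covers the same area with the image upside down.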
-
-
-