Copyright © 2026, NVIDIA Corporation.

holoscan::utils::TensorTransmitCache

Beta

Persistent cache for output message and tensor allocations used by the cached variant of transmit_data_per_model.

Holds a reusable GXF entity and per-tensor metadata to minimise per-frame overhead. Three cases are handled each frame:

  1. First use or buffer too small: full reshape (reallocate + update shape).
  2. Incoming data fits in the existing allocation but dims changed: wrapMemory updates the shape metadata only, with no free/alloc of the underlying buffer.
  3. Same dims as last frame: fast path, no tensor mutation at all.
#include <holoscan/utils/holoinfer_utils.hpp>
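The three per-frame cases described above can be sketched as a small decision function. This is a minimal, standalone illustration, not the Holoscan implementation: `CacheAction`, `CacheBookkeeping`, `element_count`, and `classify_and_update` are hypothetical names, and the sketch only models the bookkeeping (mirroring `allocated_sizes` and `last_dims`) without touching GXF entities or tensors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical names throughout: a sketch of the decision logic only.
enum class CacheAction {
  kFullReshape,  // case 1: first use, or existing buffer too small
  kRewrapOnly,   // case 2: buffer is large enough, but dims changed
  kFastPath      // case 3: same dims as the previous frame
};

// Stand-in for the two bookkeeping maps described for TensorTransmitCache.
struct CacheBookkeeping {
  std::map<std::string, std::size_t> allocated_sizes;
  std::map<std::string, std::vector<std::int64_t>> last_dims;
};

// Number of elements implied by a dimension vector (1 for an empty vector).
inline std::size_t element_count(const std::vector<std::int64_t>& dims) {
  std::size_t count = 1;
  for (std::int64_t d : dims) {
    assert(d >= 0);  // this sketch assumes non-negative dimensions
    count *= static_cast<std::size_t>(d);
  }
  return count;
}

// Decide which of the three cases applies to one output tensor and
// update the bookkeeping to reflect the new frame.
inline CacheAction classify_and_update(CacheBookkeeping& cache,
                                       const std::string& name,
                                       const std::vector<std::int64_t>& dims) {
  const std::size_t needed = element_count(dims);

  auto size_it = cache.allocated_sizes.find(name);
  if (size_it == cache.allocated_sizes.end() || needed > size_it->second) {
    // Case 1: nothing allocated yet, or the allocation is too small.
    cache.allocated_sizes[name] = needed;
    cache.last_dims[name] = dims;
    return CacheAction::kFullReshape;
  }

  auto dims_it = cache.last_dims.find(name);
  if (dims_it == cache.last_dims.end() || dims_it->second != dims) {
    // Case 2: the data fits, so only the recorded shape changes.
    cache.last_dims[name] = dims;
    return CacheAction::kRewrapOnly;
  }

  // Case 3: identical dims, nothing to update.
  return CacheAction::kFastPath;
}
```

In this sketch, a first frame with dims {1, 3, 224, 224} takes the full-reshape path, an identical second frame takes the fast path, and a smaller {1, 3, 112, 112} frame takes the rewrap-only path while the recorded allocation size stays at the earlier maximum.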

Member variables

| Name | Type | Description |
| --- | --- | --- |
| out_message | nvidia::gxf::Expected< nvidia::gxf::Entity > | Persistent output entity. Invalid (falsy) until the first call to transmit_data_per_model. |
| allocated_sizes | std::map< std::string, size_t > | Maximum element count that has been allocated for each output tensor. |
| last_dims | std::map< std::string, std::vector< int64_t > > | Dimension vector from the most recent frame for each output tensor, used to detect shape changes that require a wrapMemory call even when the element count has not grown. |