Data Flow Tracking
Data Flow Tracking is currently not supported between multiple fragments in a distributed application.
The Holoscan SDK provides the Data Flow Tracking APIs as a mechanism to profile your application and analyze the fine-grained timing properties and data flow between operators in the graph of a fragment.
Currently, data flow tracking is only supported between the root operators and leaf operators of a graph and in simple cycles in a graph (support for tracking data flow between any pair of operators in a graph is planned for the future).
A root operator is an operator without any predecessor nodes.
A leaf operator (also known as a sink operator) is an operator without any successor nodes.
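The two definitions above can be illustrated with a small standalone sketch. It does not use the Holoscan SDK: the graph is a hypothetical dictionary mapping each operator name to its successors, and the helper names (find_roots, find_leaves, root_to_leaf_paths) are made up for illustration. Cycles are not handled here.

```python
from typing import Dict, List

# Hypothetical operator graph: operator name -> names of its successors.
Graph = Dict[str, List[str]]


def find_roots(graph: Graph) -> List[str]:
    """Operators that never appear as a successor (no predecessor nodes)."""
    successors = {s for succs in graph.values() for s in succs}
    return [op for op in graph if op not in successors]


def find_leaves(graph: Graph) -> List[str]:
    """Operators with an empty successor list (no successor nodes)."""
    return [op for op, succs in graph.items() if not succs]


def root_to_leaf_paths(graph: Graph) -> List[str]:
    """Enumerate root-to-leaf paths as comma-separated operator names.

    Assumes an acyclic graph; cycles are not handled in this sketch.
    """
    paths: List[str] = []

    def walk(op: str, prefix: List[str]) -> None:
        prefix = prefix + [op]
        if not graph[op]:  # reached a leaf operator
            paths.append(",".join(prefix))
            return
        for nxt in graph[op]:
            walk(nxt, prefix)

    for root in find_roots(graph):
        walk(root, [])
    return paths


# Made-up example graph: one root, two parallel branches, one leaf.
graph: Graph = {
    "tx": ["mx1", "mx2"],
    "mx1": ["rx"],
    "mx2": ["rx"],
    "rx": [],
}
print(find_roots(graph))
print(find_leaves(graph))
print(root_to_leaf_paths(graph))
```

In this example graph there is one root (tx), one leaf (rx), and two root-to-leaf paths, one through each middle operator.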
When data flow tracking is enabled, every message is tracked from the root operators to the leaf operators and in cycles. Then, the maximum (worst-case), average and minimum end-to-end latencies of one or more paths can be retrieved using the Data Flow Tracking APIs.
The end-to-end latency between a root operator and a leaf operator is the time elapsed between the start of the root operator and the end of the leaf operator. Data Flow Tracking records this latency for every message passed between a root operator and a leaf operator.
The reported end-to-end latency for a cyclic path is the time taken between the start of the first operator of a cycle and the time when a message is again received by the first operator of the cycle.
The API also provides the ability to retrieve the number of messages sent from the root operators.
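As a rough illustration of how these quantities relate, the following standalone sketch computes the minimum, average, and maximum end-to-end latency from per-message timestamps. It does not call the SDK; the record layout, the timestamp values, and the microsecond unit are assumptions made only for this example.

```python
from statistics import mean
from typing import Dict, List, Tuple

# One record per message: (root operator start time, leaf operator end time).
# Values are hypothetical and assumed to be in microseconds.
Message = Tuple[int, int]


def e2e_latency(msg: Message) -> int:
    """End-to-end latency: end of the leaf minus start of the root."""
    root_start, leaf_end = msg
    return leaf_end - root_start


def summarize(messages: List[Message]) -> Dict[str, float]:
    """Minimum, average, and maximum latency plus the message count."""
    latencies = [e2e_latency(m) for m in messages]
    return {
        "min": min(latencies),
        "avg": mean(latencies),
        "max": max(latencies),
        "num_messages": len(latencies),
    }


messages = [(0, 120), (100, 260), (200, 310)]
stats = summarize(messages)
print(stats)
```

For a cyclic path the same subtraction applies, with the second timestamp taken when the message is received again by the first operator of the cycle.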
The Data Flow Tracking feature is also illustrated in the flow_tracker example.
See the <a href="api/cpp/classholoscan_1_1DataFlowTracker.html#_CPPv4N8holoscan15DataFlowTrackerE">C++</a> and <a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowTracker">Python</a> API documentation for exhaustive definitions.
Before an application (<a href="api/cpp/classholoscan_1_1Application.html#_CPPv4N8holoscan11ApplicationE">C++</a>/<a href="api/python/holoscan_python_api_core.html#holoscan.core.Application">Python</a>) is run with the run() method, data flow tracking can be enabled by calling the track() method in <a href="api/cpp/classholoscan_1_1Fragment.html#_CPPv4N8holoscan8Fragment5trackE8uint64_t8uint64_ti">C++</a> or by using the Tracker class in <a href="api/python/holoscan_python_api_core.html#holoscan.core.Tracker">Python</a>.
auto app = holoscan::make_application<MyPingApp>();
auto& tracker = app->track(); // Enable Data Flow Tracking
// Change tracker and application configurations
...
app->run();
from holoscan.core import Tracker
...
app = MyPingApp()
with Tracker(app) as tracker:
    # Change tracker and application configurations
    ...
    app.run()
After an application has been run, data flow tracking results can be accessed by various functions:
print() (<a href="api/cpp/classholoscan_1_1DataFlowTracker.html#_CPPv4NK8holoscan15DataFlowTracker5printEv">C++</a>/<a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowTracker.print">Python</a>): Prints all data flow tracking results, including end-to-end latencies and the number of source messages, to the standard output.
get_num_paths() (<a href="api/cpp/classholoscan_1_1DataFlowTracker.html#_CPPv4N8holoscan15DataFlowTracker13get_num_pathsEv">C++</a>/<a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowTracker.get_num_paths">Python</a>): Returns the number of paths between the root operators and the leaf operators.
get_path_strings() (<a href="api/cpp/classholoscan_1_1DataFlowTracker.html#_CPPv4N8holoscan15DataFlowTracker16get_path_stringsEv">C++</a>/<a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowTracker.get_path_strings">Python</a>): Returns a vector of strings, where each string represents a path between the root operators and the leaf operators. A path is a comma-separated list of operator names.
get_metric() (<a href="api/cpp/classholoscan_1_1DataFlowTracker.html#_CPPv4N8holoscan15DataFlowTracker10get_metricENSt6stringEN8holoscan14DataFlowMetricE">C++</a>/<a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowTracker.get_metric">Python</a>): Returns the value of different metrics based on the arguments.
get_metric(std::string pathstring, holoscan::DataFlowMetric metric) returns the value of the metric given by metric for the path given by pathstring. The metric can be one of the following:
holoscan::DataFlowMetric::kMaxE2ELatency (<a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowMetric.MAX_E2E_LATENCY">Python</a>): the maximum end-to-end latency in the path
holoscan::DataFlowMetric::kAvgE2ELatency (<a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowMetric.AVG_E2E_LATENCY">Python</a>): the average end-to-end latency in the path
holoscan::DataFlowMetric::kMinE2ELatency (<a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowMetric.MIN_E2E_LATENCY">Python</a>): the minimum end-to-end latency in the path
holoscan::DataFlowMetric::kMaxMessageID (<a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowMetric.MAX_MESSAGE_ID">Python</a>): the message number or ID that resulted in the maximum end-to-end latency
holoscan::DataFlowMetric::kMinMessageID (<a href="api/python/holoscan_python_api_core.html#holoscan.core.DataFlowMetric.MIN_MESSAGE_ID">Python</a>): the message number or ID that resulted in the minimum end-to-end latency
get_metric(holoscan::DataFlowMetric metric = DataFlowMetric::kNumSrcMessages) returns a map from each source operator and its edge to the number of messages sent from the source operator to that edge.
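To show how these accessors might be combined, here is a hedged sketch that builds a per-path latency table. The helper latency_table is hypothetical, and StubTracker and StubMetric are stand-ins written for this example so that it runs without the SDK. It assumes that the Python get_metric takes the path string and the metric in the same order as the C++ signature above; check the Python API documentation before relying on that.

```python
from enum import Enum, auto


class StubMetric(Enum):
    """Stand-in for the SDK's DataFlowMetric enum (member names follow the Python docs)."""
    MAX_E2E_LATENCY = auto()
    AVG_E2E_LATENCY = auto()
    MIN_E2E_LATENCY = auto()


class StubTracker:
    """Stand-in exposing only the two tracker methods used below."""

    def __init__(self, data):
        self._data = data  # {path string: {metric: value}}, made-up values

    def get_path_strings(self):
        return list(self._data)

    def get_metric(self, pathstring, metric):
        return self._data[pathstring][metric]


def latency_table(tracker, metric_enum):
    """Collect min/avg/max end-to-end latency for every tracked path."""
    table = {}
    for path in tracker.get_path_strings():
        table[path] = {
            "min": tracker.get_metric(path, metric_enum.MIN_E2E_LATENCY),
            "avg": tracker.get_metric(path, metric_enum.AVG_E2E_LATENCY),
            "max": tracker.get_metric(path, metric_enum.MAX_E2E_LATENCY),
        }
    return table


stub = StubTracker({
    "tx,mx,rx": {
        StubMetric.MIN_E2E_LATENCY: 110.0,
        StubMetric.AVG_E2E_LATENCY: 130.0,
        StubMetric.MAX_E2E_LATENCY: 160.0,
    }
})
table = latency_table(stub, StubMetric)
print(table)
```

With the SDK installed, the same helper would presumably be called with the tracker obtained from Tracker(app) and with holoscan.core.DataFlowMetric in place of the stand-ins, after the application has run.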
In the above example, the data flow tracking results can be printed to the standard output like the following:
auto app = holoscan::make_application<MyPingApp>();
auto& tracker = app->track(); // Enable Data Flow Tracking
// Change application configurations
...
app->run();
tracker.print();
from holoscan.core import Tracker
...
app = MyPingApp()
with Tracker(app) as tracker:
    # Change tracker and application configurations
    ...
    app.run()
    tracker.print()
Data flow tracking can be customized using a few optional configuration parameters. The track() method (<a href="api/cpp/classholoscan_1_1Fragment.html#_CPPv4N8holoscan8Fragment5trackE8uint64_t8uint64_ti">C++</a>/<a href="api/python/holoscan_python_api_core.html#holoscan.core.Tracker">Tracker class in Python</a>) can be configured to skip a few messages at the beginning of an application’s execution as a warm-up period. It is also possible to discard a few messages at the end of an application’s run as a wrap-up period. Additionally, outlier end-to-end latencies can be ignored by setting a latency threshold: observed latencies below this minimum value are ignored.
For effective benchmarking, it is common practice to include warm-up and cool-down periods by skipping the initial and final messages.
Listing 37 Optional parameters to track()
Fragment::track(uint64_t num_start_messages_to_skip = kDefaultNumStartMessagesToSkip,
                uint64_t num_last_messages_to_discard = kDefaultNumLastMessagesToDiscard,
                int latency_threshold = kDefaultLatencyThreshold);
Listing 38 Optional parameters to Tracker
Tracker(num_start_messages_to_skip=num_start_messages_to_skip,
        num_last_messages_to_discard=num_last_messages_to_discard,
        latency_threshold=latency_threshold)
The default values of these parameters of track() are as follows:
kDefaultNumStartMessagesToSkip: 10
kDefaultNumLastMessagesToDiscard: 10
kDefaultLatencyThreshold: 0 (do not filter out any latency values)
These parameters can also be configured using the helper functions <a href="api/cpp/classholoscan_1_1DataFlowTracker.html#_CPPv4N8holoscan15DataFlowTracker26set_skip_starting_messagesE8uint64_t">set_skip_starting_messages</a>, <a href="api/cpp/classholoscan_1_1DataFlowTracker.html#_CPPv4N8holoscan15DataFlowTracker25set_discard_last_messagesE8uint64_t">set_discard_last_messages</a>, and <a href="api/cpp/classholoscan_1_1DataFlowTracker.html#_CPPv4N8holoscan15DataFlowTracker18set_skip_latenciesEi">set_skip_latencies</a>.
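The combined effect of the three parameters can be pictured with a small standalone sketch. This is not the SDK's implementation: filter_latencies is a hypothetical helper, and the order of operations (trim by message position first, then apply the threshold) and the treatment of a latency exactly equal to the threshold (kept) are assumptions made for illustration.

```python
from typing import List


def filter_latencies(
    latencies: List[float],
    num_start_messages_to_skip: int = 10,
    num_last_messages_to_discard: int = 10,
    latency_threshold: float = 0,
) -> List[float]:
    """Drop warm-up and wrap-up messages, then latencies below the threshold.

    `latencies` holds one end-to-end latency per message, in message order.
    """
    end = len(latencies) - num_last_messages_to_discard
    # Guard against runs too short to leave anything after trimming.
    end = max(end, num_start_messages_to_skip)
    kept = latencies[num_start_messages_to_skip:end]
    return [lat for lat in kept if lat >= latency_threshold]


# 30 hypothetical latencies: 0.0, 1.0, ..., 29.0
latencies = [float(i) for i in range(30)]
with_defaults = filter_latencies(latencies)
with_threshold = filter_latencies(latencies, latency_threshold=15)
print(with_defaults)
print(with_threshold)
```

With the default values, a run of 30 messages leaves only the middle 10 for the statistics, so short test runs may need smaller skip and discard counts.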
The Data Flow Tracking API provides the ability to log every message’s graph-traversal information to a file. This enables developers to analyze the data flow at a granular level. When logging is enabled, every message’s received and sent timestamps at every operator between the root and the leaf operators are logged after a message has been processed at the leaf operator.
Logging is enabled by calling the enable_logging method in <a href="api/cpp/classholoscan_1_1DataFlowTracker.html#_CPPv4N8holoscan15DataFlowTracker14enable_loggingENSt6stringE8uint64_t">C++</a> and by providing the filename parameter to Tracker in <a href="api/python/holoscan_python_api_core.html#holoscan.core.Tracker">Python</a>.
auto app = holoscan::make_application<MyPingApp>();
auto& tracker = app->track(); // Enable Data Flow Tracking
tracker.enable_logging("logging_file_name.log");
...
app->run();
from holoscan.core import Tracker
...
app = MyPingApp()
with Tracker(app, filename="logger.log") as tracker:
    ...
    app.run()
The log file records the path of each message after a leaf operator has finished its compute method.
Every path in the logfile includes an array of tuples of the form:
“(root operator name, message receive timestamp, message publish timestamp) -> … -> (leaf operator name, message receive timestamp, message publish timestamp)”.
This log file can further be analyzed to understand latency distributions, bottlenecks, data flow and other characteristics of an application.
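As one possible starting point for such analysis, the sketch below parses a single path entry and derives per-operator and end-to-end times. The exact on-disk format is an assumption based on the tuple description above (parenthesized triples joined by "->", with integer timestamps); the sample line, operator names, and helper names are hypothetical, so adjust the pattern to match a real log file.

```python
import re
from typing import List, Tuple

Hop = Tuple[str, int, int]  # (operator name, receive timestamp, publish timestamp)

# Matches one "(name, receive, publish)" tuple; integer timestamps are assumed.
HOP_PATTERN = re.compile(r"\(\s*([^,()]+?)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)")


def parse_path(line: str) -> List[Hop]:
    """Extract every operator tuple from one logged path, in order."""
    return [(name, int(rx), int(tx)) for name, rx, tx in HOP_PATTERN.findall(line)]


def per_operator_time(hops: List[Hop]) -> List[Tuple[str, int]]:
    """Time between receive and publish at each operator along the path."""
    return [(name, tx - rx) for name, rx, tx in hops]


def end_to_end(hops: List[Hop]) -> int:
    """Leaf publish timestamp minus root receive timestamp."""
    return hops[-1][2] - hops[0][1]


# Made-up log entry following the assumed format.
line = "(tx, 1000, 1040) -> (mx, 1050, 1110) -> (rx, 1120, 1150)"
hops = parse_path(line)
print(per_operator_time(hops))
print(end_to_end(hops))
```

Aggregating the per-operator times over all entries in a file gives a simple view of which operator dominates the end-to-end latency.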