Flow Analysis

Use the flow analysis tool to sample data from TCP and UDP flows in your environment and to review latency and buffer utilization statistics across network paths.

Flow analysis is supported on NVIDIA Spectrum 2 and 3 platforms, and requires a switch fabric running Cumulus Linux version 5.0 or later. You must enable Lifecycle Management (LCM) to use the flow analysis tool. If LCM is disabled, the flow analysis menu does not appear in the UI. LCM is enabled by default for on-premises deployments and disabled by default for cloud deployments. Contact your local NVIDIA sales representative or submit a support ticket to activate LCM on cloud deployments.

Create New Flow Analysis

To start a new flow analysis, click the Flow Analysis menu and select Create new flow analysis.

flow analysis menu with options to create a new flow analysis or view a previous analysis

Flow Analysis Settings

The flow analysis wizard prompts you to enter the source IP address, destination IP address, source port, and destination port of the flow you wish to analyze. Select the respective menus to choose the protocol and VRF for the flow.

flow analysis wizard prompting user to enter application parameters
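The settings the wizard collects can be pictured as a small validated record. The following Python sketch is illustrative only: the class name FlowSettings and its field names are hypothetical and are not part of NetQ. It assumes only what this section states, namely that a flow is identified by source and destination IP address, optional source and destination ports, a protocol (TCP or UDP), and a VRF.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Optional


@dataclass(frozen=True)
class FlowSettings:
    """Hypothetical container for the values the wizard asks for."""
    src_ip: str
    dst_ip: str
    protocol: str                    # "TCP" or "UDP"
    vrf: str = "default"
    src_port: Optional[int] = None   # None stands for a port that was not set
    dst_port: Optional[int] = None

    def __post_init__(self) -> None:
        # ip_address() raises ValueError for malformed addresses.
        ip_address(self.src_ip)
        ip_address(self.dst_ip)
        if self.protocol.upper() not in {"TCP", "UDP"}:
            raise ValueError(f"unsupported protocol: {self.protocol}")
        for port in (self.src_port, self.dst_port):
            if port is not None and not 0 <= port <= 65535:
                raise ValueError(f"port out of range: {port}")


# Values taken from the example dashboard described later in this section.
flow = FlowSettings(
    src_ip="10.1.100.125",
    dst_ip="10.1.10.105",
    protocol="UDP",
    dst_port=2222,
)
```

Leaving src_port unset corresponds to the N/A value that the dashboard header shows for a port that was not entered.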

Flow Monitor Settings

After you enter the settings, the flow analysis wizard prompts you to enter monitor settings, where you schedule the flow analysis and select sampling parameters.

flow analysis wizard prompting user to enter sampling and scheduling information

Running a flow analysis will affect switch CPU performance. For high-volume flows, set a lower sampling rate to limit switch CPU impact.

View Flow Analysis Data

After you start the flow analysis, a flow analysis card appears on the NetQ Workbench.

flow analysis card showing that a flow analysis is in progress

View a previous flow analysis by selecting Flow Analysis and View previous flow analysis.

flow analysis menu with the option to view previous flow analysis highlighted

Select View details next to the name of the flow analysis to display the analysis dashboard. You can use this dashboard to view latency and buffer statistics for the monitored flow. If bi-directional monitoring was enabled, you can view the reverse direction of the flow by selecting the icon. The following example shows flow data across a single path:

flow analysis dashboard displaying flow data across a single path

The dashboard header shows the monitored flow settings:

dashboard header displaying settings and parameters selected with the flow analysis wizard

Flow Settings | Description
Lifetime | The lifetime of the flow analysis. This example completed in 11 minutes.
Source IP | The source IP address of the flow. In this example it is 10.1.100.125.
Destination IP | The destination IP address of the flow. In this example it is 10.1.10.105.
Source Port | The source port of the flow. In this example it displays N/A because it was not set.
Destination Port | The destination port of the flow. In this example it is 2222.
Protocol | The protocol of the monitored flow. In this example it is UDP.
Sampling Rate | The sampling rate of the flow. In this example it is low.
VRF | The VRF the flow is present in. In this example it is the default VRF.
Bi-directional Monitoring | Determines whether the flow is monitored in both directions between the source IP address and the destination IP address. In this example it is enabled. Select the icon to change the direction that is displayed.

Understanding the Flow Analysis Graph

The flow analysis graph is color coded relative to the values measured across devices. Lower values are displayed in green, and higher values are displayed in orange. The color gradient is displayed below the graph along with the low and high values from the collected flow data. Each hop in the path is represented in the graph with a vertical, gray-striped line labeled by hostname. The following example shows a single path:

single-path flow analysis with five hops ranging from low to high values
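One way to picture the color coding is as an interpolation between the low and high values in the collected data. The sketch below is an assumption for illustration: the documentation does not state how NetQ computes the gradient, and the RGB values and the function name gradient_color are hypothetical.

```python
from typing import Tuple

GREEN = (0, 160, 0)     # assumed RGB for the low end of the gradient
ORANGE = (255, 140, 0)  # assumed RGB for the high end of the gradient


def gradient_color(value: float, low: float, high: float) -> Tuple[int, ...]:
    """Map a measured value onto a green-to-orange gradient (linear, assumed)."""
    if high <= low:
        # All devices measured the same value; nothing to differentiate.
        return GREEN
    # Position of the value between low and high, clamped to [0, 1].
    t = min(max((value - low) / (high - low), 0.0), 1.0)
    return tuple(round(g + t * (o - g)) for g, o in zip(GREEN, ORANGE))
```

With this mapping, the device with the lowest measured value is drawn fully green and the device with the highest value fully orange, which matches the relative coloring described above.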

The flow graph panel on the right side of the dashboard displays the devices along the selected path.

flow graph panel showing the five devices associated with the flow analysis graph

View Flow Latency

The latency measured by the flow analysis is the total transit time of the sampled packets through individual devices. A summary of measured latency for each device is displayed above the main flow analysis graph.

three devices displaying their average latencies, including minimum, maximum and P95 value.

The average latency for packets in the flow is displayed under the hostname of each device, along with the minimum and maximum latencies observed during the analysis lifetime. The 95th percentile (P95) latency value for sampled packets is also displayed. A P95 value means that 95% of the sampled packets have a latency less than or equal to that value.
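To make the P95 definition concrete, the following Python sketch computes a 95th percentile with the nearest-rank method. This is one common percentile definition and is used here as an assumption. The documentation does not say which method NetQ uses, and the sample values are hypothetical.

```python
from typing import Sequence


def p95(latencies: Sequence[float]) -> float:
    """Nearest-rank 95th percentile: the smallest sample such that at
    least 95% of all samples are less than or equal to it."""
    if not latencies:
        raise ValueError("at least one sample is required")
    ordered = sorted(latencies)
    # ceil(0.95 * n) in integer arithmetic to avoid floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical latency samples: 1, 2, ..., 100 (any time unit).
samples = [float(n) for n in range(1, 101)]
value = p95(samples)
share = sum(1 for s in samples if s <= value) / len(samples)
print(value, share)
```

For these 100 samples the result is 95.0, and exactly 95% of the samples are less than or equal to it.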

Use your cursor to hover over sections of the main analysis graph to view average latency values for each device in a path.

cursor hovering over a device to show latency values

The left panel of the flow analysis dashboard also displays a timeline of measured latency for each device on that path. Use your cursor to hover over the plotted data points on the timeline for each device to view the latency measured at each time interval.

a cursor hovering over a device's timeline showing maximum, minimum, and average latency at 6:15 AM on November 24th 2021

View Buffer Occupancy

The main flow analysis dashboard also displays the buffer occupancy of each device along the path. To change the graph view to display buffer occupancy for the flow, click next to Avg. flow latency and select Avg. buffer occupancy. You can view an overview graph of buffer occupancy or select each device to see the buffer occupancy for the analyzed flow:

overview graph displaying average buffer occupancy between 8 total devices

The percentages represent the amount of buffer space on the switch that the analyzed flow occupied while the analysis was running.

buffer occupancy displaying percentages at 0

View Multiple Paths

When packets matching the flow settings traverse multiple paths in the topology, the flow graph displays latency and buffer occupancy for each path:

flow graph displaying multiple paths along with latency and buffer-occupancy data along those paths

You can switch between paths by clicking on an alternate path in the Flow Graph panel, or by clicking on an unselected path on the main analysis graph:

flow graph panel highlighting a selected path with several unselected paths also displayed

In the detail panel on the left side of the dashboard, you can select a path to view the percentage of packets distributed over that path.

a selected path showing that 50.1% of packets are distributed over that path
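The per-path percentage can be read as each path's share of the sampled packets. A minimal sketch, assuming the share is the sampled packet count on a path divided by the total across all paths; the path names, counts, and the function name path_distribution are hypothetical:

```python
from typing import Dict


def path_distribution(packets_per_path: Dict[str, int]) -> Dict[str, float]:
    """Return each path's share of sampled packets as a percentage."""
    total = sum(packets_per_path.values())
    if total == 0:
        # No sampled packets yet; report 0% everywhere instead of dividing by zero.
        return {path: 0.0 for path in packets_per_path}
    return {
        path: round(100 * count / total, 1)
        for path, count in packets_per_path.items()
    }


# Hypothetical sampled packet counts for a flow balanced over two paths.
counts = {"path-1": 501, "path-2": 499}
shares = path_distribution(counts)
```

In this example path-1 carries 50.1% of the sampled packets and path-2 carries 49.9%.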

Partial Path Support

Some flows can still be analyzed if they traverse a network path that includes switches lacking flow analysis support. Partial-path flow analysis is supported under the following conditions:

  • The unsupported device cannot be the initial ingress or terminating egress device in the path of the analyzed flow.
  • The path cannot contain more than one consecutive transit device that lacks flow analysis support. Otherwise, path discovery terminates at that point in the topology and some devices are not displayed in the flow graph.
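The two conditions above can be expressed as a simple check over the hops in a path. The sketch below is illustrative only: the function name classify_path and its return strings are hypothetical, and it models a path as an ordered list of booleans, where True means the hop supports flow analysis.

```python
from typing import Sequence


def classify_path(supported: Sequence[bool]) -> str:
    """Classify a path according to the partial-path conditions.

    supported[0] is the initial ingress device and supported[-1] is the
    terminating egress device; everything in between is a transit device.
    """
    if not supported or not supported[0] or not supported[-1]:
        # The first and last devices in the path must support flow analysis.
        return "not analyzable"
    for previous, current in zip(supported, supported[1:]):
        if not previous and not current:
            # Two unsupported transit devices in a row stop path discovery.
            return "discovery terminates early"
    return "analyzable"


print(classify_path([True, False, True]))         # single unsupported transit hop
print(classify_path([True, False, False, True]))  # consecutive unsupported hops
```

A single unsupported transit device leaves the path analyzable, while two or more in a row cause discovery to stop before the end of the path.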

An unsupported device is represented in the flow analysis graph as a black bar lined with red x’s. Flow statistics are not displayed for that device.

flow analysis graph showing an unsupported switch

Unsupported devices are also designated in the flow graph panel:

flow graph panel with an unsupported switch

Selecting the unsupported device shows device statistics in the left panel if available to NetQ. Otherwise, the display will indicate why the device is not supported:

a panel showing an unsupported device. The device is not supported because the CL version is not supported for flow analysis

Path discovery will terminate if multiple consecutive switches do not support flow analysis. When additional data is available from switches outside of discovered paths, you can view data from those devices from the menu at the top of the page:

menu displaying three unsupported devices

The left panel displays the data, along with ingress and egress ports.

View Device Statistics

You can view latency, buffer occupancy, interface statistics, resource utilization, and What Just Happened (WJH) events for each device by clicking on a device in the Flow Graph panel, or by clicking on the line associated with a device in the main flow analysis graph. The left panel then updates to reflect statistics for the respective device.

panel displaying statistics of a selected device

After selecting a device, click the icon to expand the statistics chart:

a cursor hovering over an icon that, when selected, expands the chart

In this view, you can select additional categories to add to the chart:

expanded chart displaying latency and WJH data, with buffer occupancy and total packet unselected and therefore not displayed

The Flow Graph panel allows you to access the topology view, where you can also click the paths and devices to view statistics. Click to switch to the topology view:

topology view showing both selected and unselected devices and their paths

View WJH Events

Flow analysis monitors the path for WJH events and records any drops for the flow. Switches with WJH events recorded are represented in the flow analysis graph as a red bar with white stripes. Hover over the device to see a WJH event summary.

a user hovering over a device in the main flow analysis graph with a WJH event summary showing 94,300 total packet drops

You can also view devices with WJH events in the flow graph panel:

a user hovering over a device in the flow graph panel with a WJH event summary showing 94,300 total packet drops

Click on a device with WJH events to see the statistics in the left panel. Hover over the data to reveal the type of drops over time:

individual device WJH statistics showing 2673 router drops

You can also view WJH drops from the expanded device chart by selecting the WJH category:

expanded device chart showing WJH data of 24 total router drops

Select Show all drops to display a list of all WJH drops for the device:

WJH statistics for all drops, including tabular information on count, drop type, drop reason, severity, and corrective action
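If you copy the drop list out of the UI for further analysis, totals per drop type are easy to compute. The following sketch assumes each row has been reduced to a (drop type, count) pair. The records shown are hypothetical sample data, not output from NetQ, and the function name drops_by_type is an illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def drops_by_type(records: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum drop counts per drop type."""
    totals: Counter = Counter()
    for drop_type, count in records:
        totals[drop_type] += count
    return dict(totals)


# Hypothetical rows: (drop type, count).
records = [("router", 10), ("router", 14), ("l2", 3)]
summary = drops_by_type(records)
```

Here the two router rows add up to 24 drops, and the single l2 row contributes 3.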