DOCA Ngauge
This document provides instructions on the usage of the ngauge tool.
The ngauge tool is used to analyze, visualize, and debug network performance on a single node. It probes NIC hardware counters and stores the collected data in HDF5 format, along with relevant metadata, for subsequent processing. The tool also displays graphical progress updates and a measurement summary directly on the CLI, giving real-time insight into the measurement process.
Supported hardware: NVIDIA® BlueField®-3, NVIDIA® ConnectX®-7, and above.
ngauge relies on the fwctl driver and therefore cannot be run simultaneously with other tools or services that also use this driver.
BlueField-3 or ConnectX-7 and above with firmware version xx.43.1000 or higher
fwctl driver installed on the host:
OS: Deb-based [1]
Search for the package: apt-cache search fwctl
Install the package: sudo apt install <package-name>
OS: RPM-based
Search for the package: dnf search fwctl
Install the package: sudo dnf install <package-name>
[1] On Ubuntu 20.04, the fwctl driver is not loaded automatically upon system startup. To load it, run the command modprobe mlx5_fwctl after every reboot.
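Whether the driver module is currently loaded can be checked before a run. The following is a minimal sketch, assuming the driver is built as a loadable module named mlx5_fwctl (the name used in the modprobe command above) and therefore appears in /proc/modules; a driver compiled into the kernel would not be listed there. The function names are illustrative and not part of ngauge.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return the module names found in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def fwctl_loaded(proc_modules_text: str, module: str = "mlx5_fwctl") -> bool:
    """True if the given module name appears in the module list."""
    return module in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    text = proc_modules.read_text() if proc_modules.exists() else ""
    if fwctl_loaded(text):
        print("mlx5_fwctl is loaded")
    else:
        print("mlx5_fwctl is not loaded; run: modprobe mlx5_fwctl")
```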
Installing Ngauge
Install ngauge by running sudo apt-get install ngauge or sudo dnf install ngauge (on x86 or Arm 64 hosts).
On the DPU, the ngauge package is pre-installed, so the above step is not necessary.
All configuration for ngauge is defined in an input YAML file.
Copy a sample configuration file from /usr/share/doc/ngauge/examples/settings.
Info: Use the one that best fits your scenario (single port, dual port, etc.).
Specify the device to run on using its PCIe address. For example:
device: "0000:03:00.0"
Configure the output path and file prefix (both are mandatory):
output:
  path: /path/to/output/directory
  prefix: "ngauge_data_"
  silent: false
The output file is saved in the format /path/to/output/directory/ngauge_data_<DATE>_<TIME>.h5. The exact file name is printed after each run.
If the silent option is set to true, progress indications on the command line are suppressed (default: false).
Configure parameters for the application's runtime behavior:
params:
  mode: repetitive   # [repetitive, single]
  period_us: 1e2     # Sampling period in microseconds (e.g., "1e2" = 100 μs)
Info: Numbers in decimal or scientific notation are accepted. In the example, 1e2 means 100 μs.
Define the counters to measure. The id (data ID) is the only mandatory field. Additional fields are optional:
counters:
  - id: 0x1020000100000000   # Data ID (mandatory)
    desc: RX bytes port0     # Description (optional)
    unit: RX port            # Unit type (optional)
    accumulating: false      # Whether the counter accumulates values (optional)
    normalizer: time         # Normalizer, if present, must be either 'time', or a non-zero number, or one of the configured Data IDs with an 'id/' prefix (optional)
    cutoff_min: 1            # Cutoff minimum: data below this value will not be recorded (optional)
    cutoff_max: 3e10         # Cutoff maximum: data above this value will not be recorded (optional)
Info: All supported performance counters may be found under section "Supported Data IDs".
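The rules stated in the configuration steps can be checked on an already-parsed configuration before starting a run. The sketch below is illustrative only and is not part of ngauge: check_config, normalizer_ok, and to_int are hypothetical helpers, the PCIe pattern assumes the domain:bus:device.function form shown in the example, the requirement that period_us be positive is an assumption, and the 'id/' normalizer is assumed to be written as id/ followed directly by a configured Data ID.

```python
import re

# Assumption: a full PCIe address has the form domain:bus:device.function (hex digits).
PCI_ADDR = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[01][0-9a-fA-F]\.[0-7]$")


def to_int(value):
    """Accept a Data ID given as an int or as a string such as '0x10'."""
    return value if isinstance(value, int) else int(str(value), 0)


def normalizer_ok(value, known_ids):
    """Allow 'time', a non-zero number, or 'id/<configured Data ID>'."""
    if value == "time":
        return True
    if isinstance(value, str) and value.startswith("id/"):
        try:
            return to_int(value[3:]) in known_ids
        except ValueError:
            return False
    try:
        return float(value) != 0
    except (TypeError, ValueError):
        return False


def check_config(cfg):
    """Return a list of problems found in a parsed configuration dict."""
    problems = []
    device = cfg.get("device")
    if device is not None and not PCI_ADDR.match(str(device)):
        problems.append("device must look like 0000:03:00.0")
    output = cfg.get("output") or {}
    for key in ("path", "prefix"):
        if not output.get(key):
            problems.append(f"output.{key} is mandatory")
    params = cfg.get("params") or {}
    if "mode" in params and params["mode"] not in ("repetitive", "single"):
        problems.append("params.mode must be repetitive or single")
    if "period_us" in params:
        try:
            # float() accepts numbers and strings such as "1e2" (= 100.0)
            period_ok = float(params["period_us"]) > 0
        except (TypeError, ValueError):
            period_ok = False
        if not period_ok:
            problems.append("params.period_us must be a positive number")
    counters = cfg.get("counters") or []
    known_ids = set()
    for counter in counters:
        try:
            known_ids.add(to_int(counter["id"]))
        except (KeyError, ValueError):
            problems.append("every counter needs a numeric id")
    for counter in counters:
        if "normalizer" in counter and not normalizer_ok(counter["normalizer"], known_ids):
            problems.append(f"invalid normalizer: {counter['normalizer']!r}")
    return problems
```

An empty list means none of these checks failed; it does not guarantee that ngauge itself accepts the file.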
Parsing Output
A sample plugin named simple-plot is provided and installed under /usr/share/doc/ngauge/examples/plugins.
This plugin demonstrates how to open the output HDF5 file generated by ngauge and plot the data. While it focuses on plotting, the data can also be used for other types of analysis. This plugin is a basic demonstration and is not intended for advanced use.
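For analysis other than plotting, a useful first step is to list what an output file contains. The sketch below makes no assumption about the internal layout of the ngauge file; it only walks whatever groups and datasets exist, using h5py's File.visititems. It requires the third-party h5py package, and describe and list_contents are illustrative names.

```python
def describe(name, obj) -> str:
    """Format one HDF5 entry: datasets expose shape and dtype, groups do not."""
    shape = getattr(obj, "shape", None)
    if shape is None:
        return f"{name}/ (group)"
    return f"{name}: shape={shape}, dtype={obj.dtype}"


def list_contents(h5_path: str) -> list[str]:
    """Return one descriptive line per group or dataset in the file."""
    import h5py  # third-party dependency, imported only when needed

    lines = []
    with h5py.File(h5_path, "r") as f:
        # visititems calls the callback with (name, object) for every entry
        f.visititems(lambda name, obj: lines.append(describe(name, obj)))
    return lines


if __name__ == "__main__":
    import sys

    for line in list_contents(sys.argv[1]):
        print(line)
```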
To plot the data from an ngauge output file, use the following command:
/usr/share/doc/ngauge/examples/plugins/simple_plot.py <ngauge output .h5 file> <counter ID> [<counter ID> ...]
If your output directory is /tmp (the default), you can always reference the most recent results without manually copying the file name by using the expression "$(ls -1 /tmp/ngauge_data_*.h5 | tail -n1)".
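The same lookup can be done from a script. This is a small sketch that mirrors the shell expression: it sorts the matching file names and takes the last one, so, like the shell version, it assumes that the date and time in the file name sort chronologically. latest_output is an illustrative name.

```python
from pathlib import Path


def latest_output(directory: str = "/tmp", prefix: str = "ngauge_data_"):
    """Return the last matching .h5 file by name, or None if there is none."""
    files = sorted(Path(directory).glob(f"{prefix}*.h5"))
    return files[-1] if files else None


if __name__ == "__main__":
    print(latest_output())
```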
The following is a simple plot example. The lighter area around the samples represents the worst-case measurement error.

The GUI includes a zoom-in function that allows you to drill down for higher resolution.
An alternative plugin, simple_text_plot.py, produces text-based plots in the terminal. While the resolution is much lower, this method is useful when graphical output is unavailable or when the network connection to the server is slow.
The usage syntax is identical to the graphical plotting plugin:
/usr/share/doc/ngauge/examples/plugins/simple_text_plot.py <ngauge output .h5 file> <counter ID> [<counter ID> ...]
Simple text plot example:
                                                 RX bytes port 0
             ┌─────────────────────────────────────────────────────────────────────────────────────┐
24864860578.2┤                                   ▗▐██▄▄▙▙▙█▄▙▄▄██▄▟██▄▟█▟▙▄█▟▄▙█▄▙▄▄▟▄▄█▟▄▄▄▟▙▄▄▟▟▙│
             │                                   ▐█▛                                               │
             │                                   ▐█                                                │
             │                                   ▝                                                 │
20720717148.5┤                                                                                     │
             │                                                                                     │
             │                                                                                     │
             │                                                                                     │
16576573718.8┤                                   ▝                                                 │
             │                                                                                     │
             │                                                                                     │
             │                                                                                     │
12432430289.1┤                                   ▝                                                 │
             │                                                                                     │
             │                                   ▝                                                 │
             │                                                                                     │
 8288286859.4┤                                                                                     │
             │                                                                                     │
             │                                                                                     │
             │                                   ▝                                                 │
 4144143429.7┤                                                                                     │
             │                                                                                     │
             │                                                                                     │
             │                                   ▗                                                 │
          0.0┤▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟                                                 │
             └┬────────────────────┬────────────────────┬────────────────────┬────────────────────┬┘
             0.0                  4.2                  8.4                  12.6                16.8
                                                Approx. time (s)
The sample plugins are provided as examples and are not integral parts of the ngauge tool. Dependencies such as NumPy, h5py, Matplotlib, plotext, and others may need to be installed separately to run these plugins.
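Before running a plugin, the presence of these dependencies can be checked. The sketch below only tests whether each module can be found by the current Python interpreter; the import names listed are the usual ones for the packages named above, and the plugins may need others. missing_modules is an illustrative name.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names that the current interpreter cannot find."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["numpy", "h5py", "matplotlib", "plotext"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are available")
```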
To run ngauge:
ngauge <configuration YAML file>
The output is saved as an HDF5 file (.h5) in the path specified in the configuration YAML.
To end a run before the full dataset is collected, press Ctrl+C (SIGINT). This is a normal and supported way to end the run, and all results collected up to that point are saved as usual.
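Because SIGINT is a supported way to end a run, a script can bound the measurement time by sending that signal itself. The following is a minimal sketch for POSIX systems using the standard subprocess module; run_with_limit is an illustrative helper and config.yaml is a hypothetical file name.

```python
import signal
import subprocess


def run_with_limit(cmd, limit_s):
    """Run cmd; if it is still running after limit_s seconds, send SIGINT
    (the same signal as Ctrl+C) and wait for it to exit. Returns the exit code."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=limit_s)
    except subprocess.TimeoutExpired:
        proc.send_signal(signal.SIGINT)
        return proc.wait()


# Example (hypothetical configuration file name):
# run_with_limit(["ngauge", "config.yaml"], limit_s=10)
```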
During the run, progress bars for each counter will be displayed. These bars provide visual feedback on the counter activity, with color coding to indicate value levels:
Blue – Represents low values relative to other values of the same counter
Red – Represents high values relative to other values of the same counter
Intermediate colors (gradient) – Values between low and high, transitioning from blue to red
Solid gray bars – Indicate no changes in the values of this counter during the run
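The idea of coloring a value relative to other values of the same counter can be sketched as follows. This is only one possible reading of the scheme: it assumes linear scaling between the lowest and highest values seen, and the RGB values are placeholders, not the colors ngauge actually uses.

```python
def relative_level(value, lowest, highest):
    """0.0 for the lowest value seen, 1.0 for the highest, None if nothing changed."""
    if highest == lowest:
        return None
    return (value - lowest) / (highest - lowest)


def blend_blue_red(level):
    """Map a relative level to an (R, G, B) tuple: blue (low) to red (high), gray if None."""
    if level is None:
        return (128, 128, 128)
    red = round(255 * level)
    return (red, 0, 255 - red)
```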
Data is updated once per second, showing the mean, standard deviation (σ), minimum, and maximum values observed during the last second.
This visual representation helps track counter activity in real time, offering immediate insights into system behavior.
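The per-second figures can be reproduced offline from recorded samples. The sketch below computes the same four statistics over one window of values; it assumes the population standard deviation, since the text does not say which variant the tool reports, and summarize is an illustrative name. If each sampling period yields one sample, a period_us of 100 gives roughly 10,000 samples per one-second window.

```python
import statistics


def summarize(samples):
    """Mean, standard deviation, minimum, and maximum of one window of samples."""
    return {
        "mean": statistics.fmean(samples),
        "sigma": statistics.pstdev(samples),  # assumption: population std. deviation
        "min": min(samples),
        "max": max(samples),
    }
```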
