NVLink Plugin
The NVLink plugin enables centralized monitoring and management of multiple NVLink domains through both the UFM UI and REST APIs. At its core is the NMX Aggregator (NMXAGGR), which connects to multiple NVLink domains, gathers data from their NMX Controllers (NMX-C), and consolidates information about monitored components. By default, the plugin includes a built-in NMXAGGR, but it can also be configured to connect to an external NMXAGGR instance—either on the same host or a different system. Communication with NVLink domains is performed via the NMX-C using a gRPC-based API.
Download the Plugin Image
Run the following command to download the NVLink plugin image:
docker pull mellanox/ufm-plugin-nvlink
Load the Plugin into UFM
After downloading, you can load the plugin into UFM using one of the following methods:
Via UFM UI:
Navigate to Settings → Plugins Management in the UFM web interface.
Via Command Line:
Execute the following command on the UFM server terminal:
/opt/ufm/scripts/manage_ufm_plugins.sh add -p nvlink
Container Volume Mapping
The UFM plugin management system creates the following mappings between the file system of the plugin's Docker container and that of the host machine:
Container Directory | Host Directory |
|
|
|
|
Any file system path mentioned in this document refers to the container's file system, unless stated otherwise.
NVLink Domains Connection Security
The plugin, specifically its NMXAGGR component, interacts with NVLink domains over a gRPC connection. In this setup, the domain controller (NMX-C) acts as the server, while NMXAGGR functions as the client.
NMXAGGR supports three modes of gRPC communication:
Insecure – No encryption is used. This is the default mode.
Server-side TLS – Communication is encrypted. Only the server needs to present a certificate to the client. This mode is enabled by setting the cacert option (refer to the Configuration section).
Mutual TLS (mTLS) – Communication is encrypted, and both the client and server must authenticate each other using certificates. This mode requires setting the cacert, cert, and key options (refer to the Configuration section).
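The relationship between the three options and the resulting mode can be sketched as a small decision function. This is an illustrative sketch only, not NMXAGGR's implementation: the function name and the returned mode labels are hypothetical, and raising an error when only one of cert and key is set is an assumption about how such a misconfiguration might be handled.

```python
from typing import Optional


def select_grpc_mode(cacert: Optional[str] = None,
                     cert: Optional[str] = None,
                     key: Optional[str] = None) -> str:
    """Return the gRPC mode implied by the cacert/cert/key options.

    Hypothetical helper: the mode labels are illustrative names,
    not values used by NMXAGGR itself.
    """
    if not cacert:
        # Without cacert the connection is insecure (the default mode);
        # any cert/key values are not used in that case.
        return "insecure"
    if cert and key:
        # All three options set: client and server authenticate each other.
        return "mtls"
    if cert or key:
        # Assumption: a client certificate without its key (or the
        # reverse) is treated as a configuration error.
        raise ValueError("cert and key must be set together")
    # Only cacert set: the client verifies the server certificate.
    return "server-side-tls"
```

For example, `select_grpc_mode(cacert="/config/ca.pem")` yields the server-side TLS label; the path is a placeholder.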
Managed Domains List
All NVLink domains that are managed or monitored by NMXAGGR are recorded in a list stored in the file <data_dir>/domains.txt (see the Configuration section for the definition of data_dir).
This file serves as an alternative method—alongside the Web UI and REST API—for adding or removing managed domains. Each line in the file represents the gRPC endpoint address of a domain controller (NMX-C), including its port number.
Addresses may include hostnames or IP addresses, and both can incorporate numerical ranges to define multiple addresses in one line.
Examples:
10.222.16.333:9370
nv-dmn-01:6666
10.222.[16,17,20-28].[330-350]:9346
nv-dmn-[01-8]:9371
For any changes made directly to the file to take effect, the plugin must be restarted.
When NMXAGGR writes the file (as a result of changes to the managed domains list performed via UI or REST API), it expands addresses containing ranges and writes one address per line.
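To illustrate how a line with ranges corresponds to one address per line, the following sketch expands bracketed ranges in the syntax shown in the examples. It is not NMXAGGR's own parser: the function names are hypothetical, and the zero-padding rule (keeping the width of a lower bound that starts with 0, so that [01-8] yields 01 through 08) is an assumption.

```python
import itertools
import re
from typing import List

# Matches one bracketed group such as [16,17,20-28].
_BRACKET = re.compile(r"\[([^\[\]]+)\]")


def _expand_group(spec: str) -> List[str]:
    """Expand one bracket body such as '16,17,20-28' into its values."""
    values = []
    for item in spec.split(","):
        item = item.strip()
        if "-" in item:
            start, end = item.split("-", 1)
            # Assumption: keep the zero-padding width of the lower bound.
            width = len(start) if start.startswith("0") else 0
            for n in range(int(start), int(end) + 1):
                values.append(str(n).zfill(width))
        else:
            values.append(item)
    return values


def expand_address(line: str) -> List[str]:
    """Expand every bracketed range in an address into explicit addresses."""
    # Splitting on a pattern with one capture group alternates literal
    # text (even indexes) with bracket bodies (odd indexes).
    parts = _BRACKET.split(line.strip())
    choices = [
        _expand_group(part) if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]
```

Under these assumptions, the third example line above expands to 11 × 21 = 231 addresses, and a line without brackets is returned unchanged.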
Configuration
The plugin can be configured by editing the config file /config/nvlink_plugin.conf.
There are two sections in the config file:
nmxaggr
Option | Description | Default |
| If |
|
| The address of the NMXAGGR REST API server. |
|
| The path to a file containing trusted root certificates for verifying NMX-C servers. If not set, insecure gRPC connections will be used. |
|
| The path to a file containing client certificate to present to NMX-C servers. Must be used with |
|
| The path to a file containing client private key to present to NMX-C servers. Must be used with |
|
| The path to a directory where the internal NMXAGGR will store its persistent data. | /config |
| If the plugin fails to subscribe to domain change notifications, it falls back to fetching data from the domain periodically. This option specifies the delay between those periodic fetches as a duration string¹. |
|
| Normally, after the initial data fetch, data is fetched from a domain only when a change notification arrives from the domain controller. A supplementary fetch is also initiated if too much time has passed since the last fetch. This option specifies that delay as a duration string¹. |
|
¹ A duration string is a sequence of decimal numbers, each with an optional fraction and a unit suffix, such as 300ms, 1.5h, or 2h45m. Valid time units are ns, us, ms, s, m, and h.
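As an illustration of the duration string format, the sketch below converts such a string into whole nanoseconds. The format resembles the one accepted by Go's time.ParseDuration, but this is not the plugin's parser: the function name is hypothetical, only the units listed in the footnote are handled, and signs are not supported.

```python
import re
from decimal import Decimal

# Nanoseconds per unit, for the units listed in the footnote.
_UNIT_NS = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3600 * 1_000_000_000,
}
# A decimal number with an optional fraction, followed by a unit suffix.
_TOKEN = re.compile(r"(\d+(?:\.\d*)?|\.\d+)(ns|us|ms|s|m|h)")


def parse_duration_ns(text: str) -> int:
    """Parse a duration string such as '2h45m' into whole nanoseconds."""
    pos, total = 0, Decimal(0)
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration string: {text!r}")
        number, unit = match.groups()
        # Decimal avoids binary floating-point rounding for fractions.
        total += Decimal(number) * _UNIT_NS[unit]
        pos = match.end()
    if pos == 0:
        raise ValueError("empty duration string")
    return int(total)
```

With this sketch, 2h45m is read as 2 hours plus 45 minutes, and a bare number without a unit is rejected.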
logging
Option | Description | Default |
| The path to the plugin log file. |
|
| The path to the internal NMXAGGR log file. |
|
| The log level. Possible values: |
|
| The maximum size of a log file; once this size is reached, the file is rotated. |
|
| The number of rotated log files to retain. |
|
1 10 MB
After the plugin is activated, an "NVLink" section becomes available in the dashboards.
NVLink Dashboard View
This view presents an overview of inventory elements—such as domains, switches, and GPUs—along with a filter for their health status.
Users can drill down from overall status indicators to specific elements, and further into the individual ports or links associated with each selected element.

The user can select a specific domain, upon which a list of associated switches and GPUs will be displayed, as illustrated in the example below.
If the selected domain has any health issues, a detailed breakdown of the affected devices will also be presented.

When an unhealthy device is selected, a list of all its ports and links will be displayed.

Additionally, the "Recent Events" notification panel on the right side of the screen is updated with the most recent health status changes of the devices.

Managed Elements View
The Managed Elements view is a tree-tabular display that shows all inventory elements, allowing users to browse through them. It also provides the option to add or remove domains.
Domains View

Add New Domain Modal
Click the + icon in the upper dashboard to add a new domain.

Available Actions for the Selected Domain
The following actions are available when you right-click on the selected domain's row.
Action | Description |
Remove | Removes the selected domain and its elements from the inventory. |
Go To Switches | Redirects you to the switches of the selected domain. |
Go To GPUs | Redirects you to the GPUs of the selected domain. |
Go To Ports | Redirects you to all Ports of the selected domain. |
Go To Links | Redirects you to all Links of the selected domain. |
Switches View
This screen presents a table listing all the switches, including key details.

Available Actions for the Selected Switch
The following actions are available when you right-click on the selected switch's row.
Action | Description |
Go To Domain | Redirects you to the parent domain of the selected switch. |
Go To Ports | Redirects you to the ports of the selected switch. |
GPUs View
This screen presents a table listing all the GPUs, including key details.

Available Actions for the Selected GPU
The following actions are available when you right-click on the selected GPU's row.
Action | Description |
Go To Domain | Redirects you to the parent domain of the selected GPU. |
Go To Ports | Redirects you to the ports of the selected GPU. |
Ports View
This screen presents a table listing all the ports, including key details.

Available Actions for the Selected Port
The following actions are available when you right-click on the selected port's row.
Action | Description |
Go To Domain | Redirects you to the parent domain of the selected port. |
Links View
This screen presents a table listing all the links, including key details.

Available Actions for the Selected Link
The following actions are available when you right-click on the selected link's row.
Action | Description |
Go To Domain | Redirects you to the parent domain of the selected link. |
This section contains short descriptions of the available REST API endpoints. More detailed documentation is available at the /ufmRestV2/plugin/nmxaggr/v1/app/swagger endpoint of the running plugin.
When operating with a standalone NMXAGGR instance, the /ufmRestV2/plugin/nmxaggr prefix is not required. In contrast, when operating with the NVLink plugin, the prefix must be used.
Adjust the API endpoint URLs to match your deployment scenario (plugin mode or standalone).
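The prefix rule can be captured in a small helper that builds the full URL for either deployment. This is a sketch under stated assumptions: the function name, the host names, and the port are placeholders, and only the prefix and the /v1/app/swagger path are taken from this document.

```python
# Prefix required when the API is reached through the NVLink plugin.
PLUGIN_PREFIX = "/ufmRestV2/plugin/nmxaggr"


def api_url(base: str, path: str, plugin_mode: bool = True) -> str:
    """Join a base URL and an API path, adding the plugin prefix if needed.

    Hypothetical helper; 'base' is the scheme and host of the UFM server
    (plugin mode) or of the standalone NMXAGGR instance.
    """
    if not path.startswith("/"):
        path = "/" + path
    prefix = PLUGIN_PREFIX if plugin_mode else ""
    return base.rstrip("/") + prefix + path


# Plugin mode: the prefix is inserted before the API path.
plugin_url = api_url("https://ufm.example.com", "/v1/app/swagger")
# Standalone mode: the same path is used without the prefix.
standalone_url = api_url("http://nmxaggr.example.com:8080",
                         "/v1/app/swagger", plugin_mode=False)
```

Keeping the prefix in one place means client code can switch between plugin and standalone deployments by changing a single flag.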
App Operations
Endpoint | Method | Description |
| GET | Gets the version of the NMXAGGR component |
| GET | Gets the Swagger UI for browsing API documentation |
Managed Domains Operations
Endpoint | Method | Description |
| GET | Gets a list of managed domains |
| POST | Adds a new managed domain |
| POST | Removes an existing managed domain |
Inventory Operations
Endpoint | Method | Description |
| GET | Gets a list of domains |
| GET | Gets a list of GPUs |
| GET | Gets a list of switches |
| GET | Gets a list of ports |
| GET | Gets a list of links |
| GET | Gets statistics about inventory elements |