NVIDIA UFM Enterprise User Manual v6.18.0

ClusterMinder Plugin

Note

This plugin is supported on UFM Enterprise Appliance only.

The ClusterMinder plugin collects telemetry data from multiple data sources, then aggregates, streams, and visualizes it. The plugin can cluster (group) aggregated Redfish data from multiple machines, which allows detection of operational anomalies and misconfigurations. It also provides cluster-wide histograms of hardware telemetry detailing compute node configuration and inventory, the PCIe bus, hardware information (serial number and firmware version), and health alerts for all relevant devices in each Redfish category.
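
To illustrate the idea behind a cluster-wide histogram, the following minimal sketch counts how many nodes report each firmware version. It is not the plugin's implementation; the function name `fw_histogram` and the two-column input format (hostname, firmware version) are assumptions made for this example.

```shell
# Hypothetical illustration only, not part of the ClusterMinder plugin.
# Input (stdin): one "hostname firmware_version" pair per line (assumed format).
# Output: "count version" lines, most common version first.
fw_histogram() {
    awk '{ print $2 }' |          # keep only the firmware version column
        sort |                    # group identical versions together
        uniq -c |                 # count each distinct version
        sort -rn |                # highest count first
        awk '{ print $1, $2 }'    # normalize whitespace to "count version"
}
```

A version reported by only a few nodes ends up at the bottom of the output, which is the kind of outlier such a histogram is meant to surface.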

The plugin can be deployed as a container and supports multiple data sources, including:

  • Redfish on Host

  • Redfish on DPU

  • MLNX Switch Data

  • DOCA Telemetry Service on DPU (BlueField)

  • DOCA Telemetry Service on Host

  • Unmanaged InfiniBand Switches

The plugin can be deployed using the following methods:

  1. On the UFM Appliance

  2. On the UFM Software

To deploy the plugin, follow these steps:

  • Download the default plugin bundle, which includes the plugin, from NVIDIA's Licensing Portal.

  • Load the downloaded image onto the UFM server, either through the UFM GUI (Settings -> Plugins Management tab) or from the terminal as follows:

    • Log in to the UFM server terminal.

    • Run:

      docker load < <path_to_image>

    • After the plugin image loads successfully, the plugin appears in the plugins management table in the UFM GUI. To start the plugin, right-click its entry in the table.

      image-2024-8-8_15-6-41-version-1-modificationdate-1724059745440-api-v2.png
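
The terminal-based loading step can be sketched as a small shell helper. The function name `load_plugin_image` is hypothetical, and `<path_to_image>` remains a placeholder for the archive taken from the plugin bundle; `docker load -i` and `docker images` are standard Docker CLI commands.

```shell
# Hypothetical helper: load a plugin image archive, then list local images
# so the newly loaded plugin image can be confirmed by eye.
load_plugin_image() {
    image_path="$1"
    # Fail early with a clear message if the archive is missing.
    if [ ! -f "$image_path" ]; then
        echo "image archive not found: $image_path" >&2
        return 1
    fi
    # 'docker load -i FILE' reads the archive from FILE,
    # equivalent to 'docker load < FILE'.
    docker load -i "$image_path" || return 1
    docker images
}
```

It would be called as `load_plugin_image <path_to_image>` from the UFM server terminal.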

After the plugin is deployed successfully, a new ClusterMinder item appears in the UFM side menu.

Example of Adding Data Source

image-2024-8-8_15-15-59-version-1-modificationdate-1724059744963-api-v2.png


Example of Adding the Redfish Host

After entering the "BMC IP", "Protocol", "Username", and "Password", press the button to test the connection. If the test succeeds, the host can be added.

image-2024-8-8_15-15-12-version-1-modificationdate-1724059745210-api-v2.png
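
Before adding a host, it can be useful to confirm from a terminal that the BMC answers Redfish requests with the intended credentials. The sketch below is an assumption about how such a manual check might look; it does not describe what the plugin's connection test does internally. The function names are hypothetical. `/redfish/v1/Systems` is the standard Redfish systems collection, queried here with HTTP basic authentication.

```shell
# Hypothetical pre-check, not part of the ClusterMinder plugin.

# Build the URL of the standard Redfish systems collection.
# $1: protocol ("http" or "https"), $2: BMC IP address
redfish_systems_url() {
    printf '%s://%s/redfish/v1/Systems\n' "$1" "$2"
}

# Return success if the BMC answers the request with HTTP 200.
# $1: protocol, $2: BMC IP, $3: username, $4: password
check_redfish() {
    url="$(redfish_systems_url "$1" "$2")"
    # -k  skip TLS certificate verification (BMCs often use self-signed certs)
    # -s  silent; -o /dev/null discards the response body
    # -u  HTTP basic authentication; -w prints only the HTTP status code
    status="$(curl -k -s -o /dev/null --max-time 10 \
        -u "$3:$4" -w '%{http_code}' "$url")"
    [ "$status" = "200" ]
}
```

A call such as `check_redfish https <bmc_ip> <username> <password>` returns a zero exit status only when the BMC replies with HTTP 200. Note that a password passed on the command line may be visible to other users in the process list.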


Example of Removing Data Source

Hosts are removed through the "Data Sources" section: right-click any available host and select the remove option.

image-2024-8-8_15-18-39-version-1-modificationdate-1724059744380-api-v2.png


© Copyright 2024, NVIDIA. Last updated on Aug 27, 2024.