NVIDIA UFM Enterprise Appliance Software User Manual v2.4.1

Appendix - UFM Clustered Telemetry

UFM Clustered Telemetry is an advanced feature that enables distributed telemetry data collection across multiple network adapters (HCAs) in your InfiniBand fabric. By distributing the collection workload, it improves performance and scalability for large-scale deployments.

Key Benefits

Better Performance: Workload distribution across multiple instances reduces collection bottlenecks
HCA Utilization: Leverages multiple network adapters for parallel data collection
Scalability: Handles larger fabric deployments more efficiently
Flexibility: Customizable instance distribution based on your infrastructure

Configuring Cluster Telemetry in Multi-Node Mode

To set up cluster telemetry in multi-node mode, run the following CLI commands on the master and standby nodes, as indicated for each step:

1. Configure HA (Active-Active) on Both Nodes

Run the appropriate command on each machine:

On the Standby (slave) Node:

            # ufm ha configure standby 3.3.3.2 3.3.3.1 10.236.17.102 10.236.17.101 10.236.17.103 123456 multi-node

On the Master Node:

            # ufm ha configure master 3.3.3.1 3.3.3.2 10.236.17.101 10.236.17.102 10.236.17.103 123456 multi-node
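
The two commands above mirror each other: the first two address pairs are swapped between the master and the standby, while the last three arguments are identical. The following Python sketch illustrates that pattern by building both command strings from one set of values. It only formats text and does not run anything on the appliance. The names `Node`, `net_a_ip`, `net_b_ip`, `shared_ip`, and `token` are placeholders chosen for this example; this appendix does not state what each positional argument means, so check the CLI reference before relying on the labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Placeholder field names: the appendix does not say what each address is for.
    net_a_ip: str  # e.g. 3.3.3.1 in the example above
    net_b_ip: str  # e.g. 10.236.17.101 in the example above


def ha_configure_command(role: str, local: Node, peer: Node,
                         shared_ip: str, token: str,
                         mode: str = "multi-node") -> str:
    """Build the 'ufm ha configure' line for one node, local addresses first."""
    if role not in ("master", "standby"):
        raise ValueError(f"unexpected role: {role!r}")
    return " ".join([
        "ufm ha configure", role,
        local.net_a_ip, peer.net_a_ip,   # first pair: local, then peer
        local.net_b_ip, peer.net_b_ip,   # second pair: local, then peer
        shared_ip, token, mode,          # identical on both nodes
    ])


# Values copied from the example commands in step 1.
master = Node("3.3.3.1", "10.236.17.101")
standby = Node("3.3.3.2", "10.236.17.102")

master_cmd = ha_configure_command("master", master, standby, "10.236.17.103", "123456")
standby_cmd = ha_configure_command("standby", standby, master, "10.236.17.103", "123456")

print(standby_cmd)
print(master_cmd)
```

Generating both lines from the same values is one way to avoid swapping an address by hand when preparing the two nodes.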

2. Enable Infrastructure Mode (Run on Both Nodes)

            # ufm infra-mode --enable

3. Enable Cluster Telemetry Mode (Master Node Only)

            # ufm telemetry utm-mode --enable

4. Start UFM (Run on Both Nodes)

            # ufm start
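
As a quick check of which node runs which step, the four steps can be written out as a per-node command list. This is a minimal Python sketch: the commands are copied from steps 1 to 4, while the helper name `commands_for` and the `HA_ARGS` table are made up for this example. It does not model cross-node ordering or wait conditions, which this appendix does not specify.

```python
# Arguments for step 1, copied verbatim from the example commands.
HA_ARGS = {
    "master": "3.3.3.1 3.3.3.2 10.236.17.101 10.236.17.102 10.236.17.103 123456 multi-node",
    "standby": "3.3.3.2 3.3.3.1 10.236.17.102 10.236.17.101 10.236.17.103 123456 multi-node",
}


def commands_for(role: str) -> list[str]:
    """Return the ordered commands for one node ('master' or 'standby')."""
    if role not in HA_ARGS:
        raise ValueError(f"unexpected role: {role!r}")
    plan = [
        f"ufm ha configure {role} {HA_ARGS[role]}",  # step 1: both nodes
        "ufm infra-mode --enable",                   # step 2: both nodes
    ]
    if role == "master":
        plan.append("ufm telemetry utm-mode --enable")  # step 3: master only
    plan.append("ufm start")                         # step 4: both nodes
    return plan


for role in ("standby", "master"):
    print(f"--- {role} ---")
    for command in commands_for(role):
        print(command)
```

The master list has four commands and the standby list has three, because the cluster telemetry mode is enabled on the master node only.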

For more information, refer to Appendix - UFM Clustered Telemetry.
