NVIDIA UFM Enterprise Appliance Software User Manual v2.4.1

Appendix - UFM Clustered Telemetry

UFM Clustered Telemetry is an advanced feature that distributes telemetry data collection across multiple network adapters (HCAs) in your InfiniBand fabric. Distributing the workload improves performance and scalability in large-scale deployments.

  • Better Performance: Workload distribution across multiple instances reduces collection bottlenecks

  • HCA Utilization: Leverages multiple network adapters for parallel data collection

  • Scalability: Handles larger fabric deployments more efficiently

  • Flexibility: Customizable instance distribution based on your infrastructure

To set up Clustered Telemetry in multi-node mode, run the following CLI commands on the master and standby nodes, as indicated for each step:

1. Configure HA (Active-Active) on Both Nodes

Run the appropriate command on each machine:

On the Standby (slave) Node:

# ufm ha configure standby 3.3.3.2 3.3.3.1 10.236.17.102 10.236.17.101 10.236.17.103 123456 multi-node

On the Master Node:

# ufm ha configure master 3.3.3.1 3.3.3.2 10.236.17.101 10.236.17.102 10.236.17.103 123456 multi-node
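
The two commands in this step carry the same addresses with each pair swapped, and the last three arguments are identical. The Python sketch below reproduces that pattern so that both command lines can be generated from a single set of values. This section does not state the role of each positional argument, so the "local first, then peer" ordering is inferred only from the two examples, and the names `HaNode`, `first_ip`, `second_ip`, `shared_ip`, and `secret` are hypothetical labels, not UFM terminology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HaNode:
    """Addresses of one HA node (field names are hypothetical labels)."""
    first_ip: str   # address from the first pair in the example, e.g. 3.3.3.1
    second_ip: str  # address from the second pair, e.g. 10.236.17.101


def ha_configure_command(role, local, peer, shared_ip, secret, mode="multi-node"):
    """Build the 'ufm ha configure' line for one node.

    Assumption: each address pair is ordered local-then-peer, as the two
    example commands in this step suggest.
    """
    if role not in ("master", "standby"):
        raise ValueError("role must be 'master' or 'standby'")
    parts = [
        "ufm ha configure", role,
        local.first_ip, peer.first_ip,
        local.second_ip, peer.second_ip,
        shared_ip, secret, mode,
    ]
    return " ".join(parts)


if __name__ == "__main__":
    # Values copied from the example commands in this step.
    master = HaNode("3.3.3.1", "10.236.17.101")
    standby = HaNode("3.3.3.2", "10.236.17.102")
    print(ha_configure_command("standby", standby, master, "10.236.17.103", "123456"))
    print(ha_configure_command("master", master, standby, "10.236.17.103", "123456"))
```

Generating both lines from one set of values helps keep the two nodes consistent, since a mismatch between the master and standby arguments is easy to introduce when typing the commands by hand.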

2. Enable Infrastructure Mode (Run on Both Nodes)

# ufm infra-mode --enable

3. Enable Cluster Telemetry Mode (Master Node Only)

# ufm telemetry utm-mode --enable

4. Start UFM (Run on Both Nodes)

# ufm start
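
Because step 3 runs on the master node only, the two nodes end up with different command sequences. The following Python sketch tabulates the four steps above and lists the commands for each node. The `STEPS` table and the `commands_for` helper are hypothetical names, and the step 1 entry abbreviates the address arguments with a placeholder rather than repeating them.

```python
# (step number, command, nodes that run it), transcribed from the steps above.
# The step 1 entry is abbreviated: "<role> <addresses...>" is a placeholder.
STEPS = [
    (1, "ufm ha configure <role> <addresses...> multi-node", ("standby", "master")),
    (2, "ufm infra-mode --enable", ("standby", "master")),
    (3, "ufm telemetry utm-mode --enable", ("master",)),
    (4, "ufm start", ("standby", "master")),
]


def commands_for(node):
    """Return the commands for one node, in step order."""
    if node not in ("master", "standby"):
        raise ValueError("node must be 'master' or 'standby'")
    return [command for _, command, nodes in STEPS if node in nodes]


if __name__ == "__main__":
    for node in ("standby", "master"):
        print(node)
        for command in commands_for(node):
            print("  # " + command)
```

The sketch only prints the commands; it does not execute them or connect to either node.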

For more information, refer to Appendix - UFM Clustered Telemetry.

© Copyright 2026, NVIDIA. Last updated on Feb 20, 2026