NVIDIA UFM Telemetry Documentation v1.17.0

Settings and Configuration

Inside the container, the directory /config contains the configuration files for the NVIDIA® UFM® Telemetry application. The file launch_ibdiagnet_config.ini is the main configuration file.

The basic configurations of launch_ibdiagnet_config.ini are listed in the following table.

| Section | Key | Type | Default Value | Description |
|---|---|---|---|---|
| ibdiagnet | ibdiagnet_enabled | bool | true | Enable/disable running the ibdiagnet process |
| ibdiagnet | data_dir | string | /data | Directory in which UFM Telemetry data is placed |
| ibdiagnet | ibdiag_output_dir | string | /tmp/ibd | Directory in which ibdiagnet places files |
| ibdiagnet | sample_rate | int | - | Frequency of collecting port counter data |
| ibdiagnet | hca | string | mlx5_2 | Card to use. A comma-separated list of cards can be provided for local high availability |
| ibdiagnet | force_hca | bool | false | Skip the HCA state check |
| ibdiagnet | app_name | string | /opt/collectx/bin/ibdiagnet | Full path of the ibdiagnet application, if it needs to be specified |
| ibdiagnet | topology_mode | string | discover | Topology policy |
| ibdiagnet | topology_discovery_factor | int | 0 | Every "n" iterations, run discovery; otherwise, use the result from the last run if 0 or 1 |
| ibdiagnet | m_key | int | - | Set the m_key used by ibdiagnet for data collection |
| Retention | retention_enabled | bool | true | Enable/disable the retention service |
| Retention | retention_interval | time | 1d | Interval to wait before running the retention process |
| Retention | retention_age | time | 100d | Period for which the collected data is retained |
| compression | compression_enable | bool | true | Enable/disable the compression service |
| compression | compression_interval | time | 6h | Interval to wait before running the compression service |
| compression | compression_age | time | 12h | Period for which the compressed data is retained |
| cable_info | cable_info_schedule | CSV | - | Time to collect cable info data, in the format weekday/hr:min,hr:min |
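As an illustration of how these keys might be read, here is a minimal Python sketch. It assumes the file uses standard INI syntax with section headers named as in the Section column, and that time values are an integer followed by a unit letter (only h and d appear in the defaults above; s and m are added as an assumption). The sample text and the helper parse_duration are hypothetical and are not part of UFM Telemetry.

```python
import configparser
import re
from datetime import timedelta

# Hypothetical excerpt; the real file is /config/launch_ibdiagnet_config.ini
# and its exact layout is an assumption here.
SAMPLE = """
[ibdiagnet]
ibdiagnet_enabled=true
hca=mlx5_2
force_hca=false

[Retention]
retention_enabled=true
retention_interval=1d
retention_age=100d

[compression]
compression_enable=true
compression_interval=6h
compression_age=12h
"""

# Assumed unit letters; only "h" and "d" are shown in the documented defaults.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Convert a value such as '6h' or '100d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


config = configparser.ConfigParser()
config.read_string(SAMPLE)

# Booleans, comma-separated lists, and time values each need their own handling.
enabled = config.getboolean("ibdiagnet", "ibdiagnet_enabled")
hca_list = [name.strip() for name in config.get("ibdiagnet", "hca").split(",")]
retention_age = parse_duration(config.get("Retention", "retention_age"))
compression_interval = parse_duration(config.get("compression", "compression_interval"))
```

To read the real file rather than the sample, config.read("/config/launch_ibdiagnet_config.ini") could replace the read_string call, provided the file follows the assumed layout.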

The size of the UFM Telemetry log file “ibdiagnet2_port_counters.log” is monitored by a log rotation mechanism. This matters most for long execution times and/or high verbosity, where the volume of logs can grow excessively large.

To disable log rotation, set the following flag to 0 (default is 1):

plugin_env_CLX_LOG_ROTATE_ENABLED

To change the number of rotated files, set the following flag (default is 3):

plugin_env_CLX_LOG_ROTATE_NUM_FILES

To change the rotation size threshold, set the following flag (default is 100M), using [K|M|G] as units:

plugin_env_CLX_LOG_ROTATE_SIZE
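To make the threshold format concrete, the following Python sketch converts a value such as 100M into bytes and compares it with a file size. It is only an illustration: the helper names are hypothetical, the sketch requires a unit suffix, and it assumes K, M, and G are 1024-based multiples, which the source does not state.

```python
import re

# Assumption: K, M, and G are binary (1024-based) multiples.
_MULTIPLIERS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_rotate_size(text):
    """Convert a threshold such as '100M' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size threshold: {text!r}")
    number, unit = match.groups()
    return int(number) * _MULTIPLIERS[unit]


def needs_rotation(current_size_bytes, threshold="100M"):
    """Return True once the log file has reached the threshold."""
    return current_size_bytes >= parse_rotate_size(threshold)
```

For example, needs_rotation(os.path.getsize(path)) could be used to check an existing log file against the default threshold.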

There are three optional rotation methods, used in the following order:

  1. rotatelogs - If this executable exists, it is used for log rotation, and the rotated file names differ by an index suffix.

  2. logrotate - If this executable exists, it is used for log rotation, and the rotated file names differ by a timestamp suffix.

  3. manual rotation - If neither executable is available, UFM Telemetry manually rotates 2 log files. The older log file will have “.bck

To skip options, use the following flag to set the executables to use (default is “rotatelogs,logrotate”):

plugin_env_CLX_LOG_ROTATE_APP
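The fallback order described above can be sketched in Python as follows. This is not the UFM Telemetry implementation: the function name and the use of a PATH lookup to decide whether an executable "exists" are assumptions made for illustration.

```python
import shutil

# Default value of plugin_env_CLX_LOG_ROTATE_APP, as documented above.
DEFAULT_ROTATE_APPS = "rotatelogs,logrotate"


def choose_rotation_method(rotate_apps=DEFAULT_ROTATE_APPS, which=shutil.which):
    """Return the first listed executable that is found, else 'manual'.

    `which` is injectable so the lookup can be replaced in tests.
    """
    for name in (item.strip() for item in rotate_apps.split(",")):
        if name and which(name) is not None:
            return name
    # Neither executable is available: fall back to manual rotation.
    return "manual"
```

Passing "logrotate" as rotate_apps skips rotatelogs even when it is installed, which mirrors how the flag is described as a way to skip options.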

© Copyright 2024, NVIDIA. Last updated on May 6, 2024.