NVIDIA UFM Cyber-AI Documentation v2.13.0

Settings and Configuration

The directory /opt/ufm/cyber-ai/conf contains the configuration files for the UFM Cyber-AI application. The main configuration files are cyberai.cfg and scheduler_settings.cfg.

The basic configuration options of cyberai.cfg are listed in the following table:

| Section | Key | Type | Default | Description |
|---|---|---|---|---|
| JobSettings | retries | Integer | 2 | Number of times an analytics job is retried when it fails or times out |
| JobSettings | check_interval | Interval | 120 (seconds) | Interval at which the health check runs to detect jobs that have timed out |
| GRPC | host | String | cyberai-plm | Name of the Docker network used for communication between the PLM and the worker |
| GRPC | grpc_port | Integer | 50051 | Port used for this communication |
| GRPC | max_workers | Integer | 10 | Maximum number of workers that can run in parallel |
| Cleanup | cleanup_run_interval | Integer | 24 | How often, in hours, the cleanup process runs. By default, cleanup is triggered automatically every 24 hours to remove old files and data |
| Cleanup | files_days_to_keep | Integer | 28 | Retention period, in days, for hourly files. By default, these files are kept for 28 days before being deleted automatically |
| Cleanup | files_hours_to_keep | Integer | 24 | Retention period, in hours, for topology hourly files. By default, these files are kept for only 24 hours before being cleaned up |
| SecondaryParams | ip | String | 172.17.0.1 | IP address of the secondary endpoint (server) from which the telemetry data is fetched |
| SecondaryParams | port | Integer | 9002 | Port number on the secondary endpoint where the service is listening |
| SecondaryParams | url | String | csv/xcset/cyberai_telemetry | URL path of the endpoint |
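As an illustration of how these options fit together, the following minimal Python sketch loads cyberai.cfg content and falls back to the defaults from the table when a key is missing. It rests on two assumptions that this page does not confirm: that the file uses an INI-style layout readable by Python's configparser, and that the SecondaryParams values combine into a URL of the form http://<ip>:<port>/<url>. The function names load_cyberai_config and secondary_telemetry_url are hypothetical and not part of UFM Cyber-AI.

```python
import configparser

# Defaults copied from the table above; used when a key is absent from the file.
DEFAULTS = {
    "JobSettings": {"retries": "2", "check_interval": "120"},
    "GRPC": {"host": "cyberai-plm", "grpc_port": "50051", "max_workers": "10"},
    "Cleanup": {
        "cleanup_run_interval": "24",
        "files_days_to_keep": "28",
        "files_hours_to_keep": "24",
    },
    "SecondaryParams": {
        "ip": "172.17.0.1",
        "port": "9002",
        "url": "csv/xcset/cyberai_telemetry",
    },
}


def load_cyberai_config(text: str) -> configparser.ConfigParser:
    """Parse cyberai.cfg content (assumed INI-style) on top of the defaults."""
    parser = configparser.ConfigParser()
    parser.read_dict(DEFAULTS)  # seed the documented defaults first
    parser.read_string(text)    # values from the file override the defaults
    return parser


def secondary_telemetry_url(cfg: configparser.ConfigParser) -> str:
    """Build the telemetry URL. The http://ip:port/url form is an assumption."""
    sec = cfg["SecondaryParams"]
    return f"http://{sec['ip']}:{sec.getint('port')}/{sec['url'].lstrip('/')}"


# Hypothetical file content that overrides two of the defaults.
sample = """
[JobSettings]
retries = 3

[SecondaryParams]
ip = 10.0.0.5
"""

cfg = load_cyberai_config(sample)
print(cfg.getint("JobSettings", "retries"))  # overridden in the sample
print(cfg.getint("GRPC", "grpc_port"))       # falls back to the default
print(secondary_telemetry_url(cfg))
```

Seeding the parser with the defaults before reading the file keeps the lookup code free of per-key fallbacks. To read the real file, replace read_string with parser.read("/opt/ufm/cyber-ai/conf/cyberai.cfg").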

The scheduler_settings.cfg file defines configuration parameters for three categories of tasks: UFM data preparation tasks, telemetry collectors, and analytics jobs.

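| Section | Key | Description |
|---|---|---|
| data_prep_ufm::<task_name> | interval | Interval between task runs |
| data_prep_ufm::<task_name> | delay | Delay before the first run |
| data_prep_ufm::<task_name> | skip_collection | Enable/Disable task |
| data_prep_ufm::<task_name> | json_collection | Collect data from a JSON file or from a REST API |
| data_prep_telemetry::<task_name> | interval | Interval between task runs |
| data_prep_telemetry::<task_name> | delay | Delay before the first run |
| data_prep_telemetry::<task_name> | timeout | Maximum execution time before the job times out |
| data_prep_telemetry::<task_name> | skip_collection | Enable/Disable task |
| analytics_job::<job_name> | interval | Interval between job runs |
| analytics_job::<job_name> | delay | Delay before the first run |
| analytics_job::<job_name> | max_input | Maximum number of input files to process in a single run |
| analytics_job::<job_name> | standard_timeout | Maximum execution time before the job times out |
| analytics_job::<job_name> | enabled | Enable/Disable task |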
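The section naming scheme above (a category prefix, then ::, then a task or job name) can be handled programmatically. The sketch below groups sections by category, assuming that scheduler_settings.cfg is INI-style and readable by Python's configparser. The task and job names, the values in the sample, and the helper name group_scheduler_sections are hypothetical. Values are kept as raw strings because this page does not specify their formats.

```python
import configparser
from collections import defaultdict

# Section prefixes taken from the table above.
CATEGORIES = ("data_prep_ufm", "data_prep_telemetry", "analytics_job")


def group_scheduler_sections(text: str) -> dict:
    """Return {category: {name: {key: raw_value}}} for recognized sections."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    grouped = defaultdict(dict)
    for section in parser.sections():
        category, sep, name = section.partition("::")
        if not sep or category not in CATEGORIES:
            continue  # skip sections that do not follow the naming pattern
        grouped[category][name] = dict(parser[section])
    return dict(grouped)


# Hypothetical content; names and values are for illustration only.
sample = """
[data_prep_telemetry::example_collector]
interval = 300
timeout = 120

[analytics_job::example_job]
interval = 3600
max_input = 24
enabled = true

[unrelated]
key = value
"""

result = group_scheduler_sections(sample)
for category, entries in result.items():
    for name, options in entries.items():
        print(category, name, options)
```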

Inside the ufm-telemetry container, the /config directory contains all configuration files for UFM-Telemetry. The primary configuration file is launch_ibdiagnet_config.ini.

The table below lists the basic configuration options available in launch_ibdiagnet_config.ini:

| Section | Key | Type | Default | Description |
|---|---|---|---|---|
| ibdiagnet | ibdiagnet_enabled | Boolean | true | Enable or disable the ibdiagnet process |
| ibdiagnet | data_dir | String | /data | Directory where UFM Cyber-AI data is stored |
| ibdiagnet | ibdiag_output_dir | String | /tmp/ibd | Directory where ibdiagnet stores output files |
| ibdiagnet | sample_rate | Integer | | Frequency for collecting port counter data |
| ibdiagnet | hca | String | mlx5_2 | Host Channel Adapter (HCA) to use |
| ibdiagnet | app_name | String | /opt/collectx/bin/ibdiagnet | Full path to the ibdiagnet application |
| ibdiagnet | topology_mode | String | discover | Topology discovery mode |
| ibdiagnet | topology_discovery_factor | Integer | 0 | Run topology discovery every “n” iterations; if set to 0 or 1, use the last run’s results |
| retention | retention_enabled | Boolean | true | Enable or disable the retention service |
| retention | retention_interval | Time | 1d | Time interval between retention operations |
| retention | retention_age | Time | 100d | Retention period for collected data |
| compression | compression_enabled | Boolean | true | Enable or disable the compression service |
| compression | compression_interval | Time | 6h | Time interval between compression operations |
| compression | compression_age | Time | 12h | Retention period for compressed data |
| cable_info | cable_info_schedule | csv | | Schedule for collecting cable information (format: weekday/hr:min,hr:min) |
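The Time values in this table (1d, 100d, 6h, 12h) combine a number with a unit suffix. The sketch below shows one way a script might convert them into durations. It accepts only the h and d suffixes that appear in the defaults; whether UFM-Telemetry accepts other suffixes is not stated here, so anything else is rejected. The helper name parse_time_value is hypothetical.

```python
import re
from datetime import timedelta

# Only the suffixes seen in the documented defaults are supported.
_UNITS = {"h": "hours", "d": "days"}
_PATTERN = re.compile(r"^(\d+)([hd])$")


def parse_time_value(value: str) -> timedelta:
    """Convert a value such as '6h' or '100d' into a timedelta."""
    match = _PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


# Defaults copied from the table above.
time_defaults = {
    "retention_interval": "1d",
    "retention_age": "100d",
    "compression_interval": "6h",
    "compression_age": "12h",
}

for key, raw in time_defaults.items():
    print(key, parse_time_value(raw).total_seconds())
```

Rejecting unknown suffixes with a ValueError, rather than guessing a unit, keeps a typo in the configuration from silently changing a retention period.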

© Copyright 2025, NVIDIA. Last updated on Aug 7, 2025.