Date: 2024-07-08

The UFM Server Health Monitoring module is a standalone module that monitors UFM resources and processes according to the settings in the /opt/ufm/files/conf/UFMHealthConfiguration.xml file.

Each monitored resource or process has its own failure condition (number of retries and/or timeout), which you can configure.

If a test fails, UFM performs the corrective operation defined for the process, if any, for example, restarting the process. You can change the configured corrective operation. If the corrective operation is set to "None", the give-up operation is performed after the defined number of failures.

If a test reaches the configured retry threshold, the health monitoring module initiates the give-up operation defined for the process, for example, UFM failover or stop.
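
The failure handling described above can be sketched in Python. This is a minimal illustration, not UFM's implementation: the MonitoredTest class, its fields, and the run_test function are hypothetical names, timeouts are left out, and skipping the corrective operation on the final failure is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MonitoredTest:
    """Hypothetical model of one monitored resource or process."""
    name: str
    check: Callable[[], bool]                 # returns True when the test passes
    max_retries: int                          # failure threshold before giving up
    corrective: Optional[Callable[[], None]]  # None mirrors a "None" corrective operation
    give_up: Callable[[], None]               # for example, failover or stop


def run_test(test: MonitoredTest) -> str:
    """Repeat the check until it passes or the failure threshold is reached."""
    failures = 0
    while failures < test.max_retries:
        if test.check():
            return "passed"
        failures += 1
        # Assumption: try the corrective operation only while retries remain.
        if test.corrective is not None and failures < test.max_retries:
            test.corrective()
    # Threshold reached: perform the give-up operation.
    test.give_up()
    return "gave_up"
```

With max_retries set to 3 and a check that always fails, this sketch runs the corrective operation twice and then performs the give-up operation once.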

By default, events and alarms are sent when a process fails, and they are also recorded in the internal log file.

Each process runs according to its own defined schedule, which you can change in the configuration file.

Changes to the configuration file take effect only after the UFM Server is restarted. Alternatively, you can kill the health monitoring process and start it again in the background: nohup python /opt/ufm/ufmhealth/UfmHealthRunner.pyo &
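
Because edits are picked up only when the process starts again, it can help to confirm first that the edited file is still well-formed XML. The sketch below uses Python's standard xml.etree.ElementTree module. The helper names check_well_formed and check_file are hypothetical, and the check covers XML syntax only, not UFM's option names or values.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def check_well_formed(xml_text: str) -> Tuple[bool, str]:
    """Return (True, root tag) if the text parses as XML, else (False, error)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, root.tag


def check_file(path: str) -> Tuple[bool, str]:
    """Read a file and run the same well-formedness check on its contents."""
    with open(path, "r", encoding="utf-8") as handle:
        return check_well_formed(handle.read())
```

For example, check_file("/opt/ufm/files/conf/UFMHealthConfiguration.xml") could be run after editing and before restarting. A False result means the file has a syntax problem, such as an unclosed tag.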

You can also use the configuration file to improve disk space management by configuring:

How often to purge MySQL binary log files.

When to delete compressed UFM log files (according to free disk space).
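
The idea of deleting compressed logs according to free disk space can be illustrated with a short Python sketch. It is not UFM's logic: the function names, the "*.gz" pattern, the threshold expressed as a fraction, and the oldest-first ordering are all assumptions. The sketch only lists candidate files and deletes nothing.

```python
import shutil
from pathlib import Path
from typing import List


def free_fraction(path: str) -> float:
    """Fraction of the filesystem containing `path` that is currently free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def logs_to_delete(log_dir: str, free_now: float, min_free: float,
                   pattern: str = "*.gz") -> List[Path]:
    """List compressed logs, oldest first, when free space is below the threshold."""
    if free_now >= min_free:
        return []  # enough free space, nothing to delete
    candidates = Path(log_dir).glob(pattern)
    return sorted(candidates, key=lambda p: p.stat().st_mtime)
```

A caller could pass free_fraction(log_dir) as free_now and, for instance, 0.10 as min_free, then review the returned list before removing any files.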

The settings in the /opt/ufm/files/conf/UFMHealthConfiguration.xml file are also used to generate the UFM Health Report.

The following section describes the configuration file options for UFM server monitoring.