Appendix - Chassis Health Monitoring

NVIDIA UFM Enterprise Appliance Software User Manual v1.4.1

Chassis Health Monitoring enables monitoring hardware alerts via rsyslog and generating external events in UFM. The alerts are written to /var/log/syslog.

Monitoring hardware health status is essential for failure prevention and maintenance. The Chassis Health Monitoring service runs as a Docker container.

  1. Generate a UFM authentication token by sending the following REST request:

    POST https://<UFM server IP>/ufmRest/app/tokens

  2. Set the UFM server hostname and authentication token in /opt/ufm/chassis_health/chassis_health.conf:

    [connection]
    # UFM server hostname. In case of HA, it should be the VIP
    hostname =

    [authentication]
    # UFM server user credentials
    token =

  3. Restart the Chassis Health Monitoring service for changes to take effect. Run:

    systemctl restart ufm-chassis-health.service
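
The three steps above can be sketched as a pair of shell helpers. This is a minimal sketch, not part of the product: the function names, the use of curl, and HTTP basic authentication for the token request are assumptions, and the response format of the token endpoint is not described here, so the token value has to be copied from the response by hand.

```shell
#!/bin/sh
# Hypothetical helpers; names and the curl-based approach are assumptions.

# Build the token endpoint URL given in step 1 for a UFM server address.
ufm_token_url() {
    printf 'https://%s/ufmRest/app/tokens\n' "$1"
}

# Send the POST request from step 1.
# Assumption: the endpoint accepts HTTP basic authentication.
# When only a user name is passed to --user, curl prompts for the password.
request_ufm_token() {
    server=$1
    user=$2
    curl --fail --request POST --user "$user" "$(ufm_token_url "$server")"
}

# Print the two settings from step 2 in the layout of chassis_health.conf.
# The output goes to stdout only, so the real file is never overwritten.
render_chassis_health_conf() {
    hostname_value=$1
    token_value=$2
    printf '[connection]\n'
    printf 'hostname = %s\n' "$hostname_value"
    printf '\n'
    printf '[authentication]\n'
    printf 'token = %s\n' "$token_value"
}
```

A possible session would call request_ufm_token with the server address and a user name, copy the token from the response, compare the output of render_chassis_health_conf with /opt/ufm/chassis_health/chassis_health.conf, edit that file accordingly, and then run the restart command from step 3. Printing the settings instead of writing the file keeps any other content of the existing configuration intact.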

Once the service is running, its status can be checked with systemctl (systemctl status ufm-chassis-health.service) and in the /var/log/chassis_health_fluentd_console.log file.
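
Both checks can be combined in one small helper, sketched below. The function name and the CHASSIS_HEALTH_LOG override variable are hypothetical and exist only for illustration; the unit name and the default log path are the ones given above.

```shell
#!/bin/sh
# Hypothetical helper: report the service state and show recent log lines.
chassis_health_status() {
    unit=ufm-chassis-health.service
    # CHASSIS_HEALTH_LOG is an illustrative override, not a product setting.
    log=${CHASSIS_HEALTH_LOG:-/var/log/chassis_health_fluentd_console.log}

    # "systemctl is-active --quiet" exits with 0 only when the unit is active.
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active"
    else
        echo "$unit is not active"
    fi

    # Show the last 20 lines of the console log if it is readable.
    if [ -r "$log" ]; then
        tail -n 20 "$log"
    else
        echo "cannot read $log"
    fi
}
```

Calling chassis_health_status after a restart gives a quick confirmation that the service came up; systemctl status ufm-chassis-health.service remains the more detailed view.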

© Copyright 2023, NVIDIA. Last updated on Sep 5, 2023.