Appendix - Chassis Health Monitoring

On This Page

Chassis Health Monitoring enables monitoring hardware alerts via rsyslog and generating external events in UFM. The alerts are written to /var/log/syslog.

Monitoring hardware health status is essential for failure prevention and maintenance. The Chassis Health Monitoring service is run as a Docker container.

  1. Generate UFM token authentication. Run:

    Copy
    Copied!
                

    POST https://<UFM server IP>/ufmRest/app/tokens

  2. Set the UFM server hostname and authentication token in /opt/ufm/chassis_health/chassis_health.conf:

    Copy
    Copied!
                

    [connection] # UFM server hostname. In case of HA, it should be the VIP hostname =   [authentication] # UFM server user credentials token =

  3. Restart the Chassis Health Monitoring service for changes to take effect. Run:

    Copy
    Copied!
                

    systemctl restart ufm-chassis-health.service

Once the service runs, the status can be viewed via systemctl (systemctl status ufm-chassis-health.service) and /var/log/chassis_health_fluentd_console.log file.

© Copyright 2023, NVIDIA. Last updated on Feb 20, 2024.