System Monitoring REST API

  • Description – Retrieves system monitoring metrics in Prometheus format, including CPU utilization percentage, memory usage percentage, I/O operation statistics, and additional metrics for UFM REST API calls and UFM events.

  • Request URL – GET ufmRest/system_monitoring/metrics

  • Response – Text in Prometheus format

  • Status Code:

    • 200 – OK
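As an illustration of how a client might consume this endpoint, the following minimal Python sketch fetches the metrics text and parses it into a dictionary. Only the request path comes from this page; the `https` scheme, the use of HTTP basic authentication, and the helper names `build_metrics_url`, `fetch_metrics`, and `parse_prometheus_text` are assumptions made for the example. The parser handles only plain sample lines of the Prometheus text format and is not a full implementation of it.

```python
import base64
import urllib.request


def build_metrics_url(host: str) -> str:
    """Build the metrics URL. The https scheme is an assumption."""
    return f"https://{host}/ufmRest/system_monitoring/metrics"


def fetch_metrics(host: str, username: str, password: str) -> str:
    """Fetch the raw Prometheus text. Basic authentication is an assumption."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        build_metrics_url(host),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request) as response:
        if response.status != 200:
            raise RuntimeError(f"unexpected status code: {response.status}")
        return response.read().decode("utf-8")


def parse_prometheus_text(text: str) -> dict:
    """Map each sample name (including its label set, if any) to a float value.

    Comment lines (# HELP, # TYPE) and blank lines are skipped, and an
    optional trailing timestamp is ignored.
    """
    samples = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Keep the label set as part of the key; it may contain spaces.
            end = line.rfind("}")
            if end == -1:
                continue  # malformed line
            name, rest = line[: end + 1], line[end + 1 :].split()
        else:
            parts = line.split()
            name, rest = parts[0], parts[1:]
        if not rest:
            continue
        try:
            samples[name] = float(rest[0])
        except ValueError:
            continue  # value is not numeric; skip the line
    return samples
```

A call such as `parse_prometheus_text(fetch_metrics("ufm.example.com", "user", "password"))` would then return the current samples. The host and credentials here are placeholders, and a server that uses a self-signed certificate would need additional TLS configuration.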

  • Description – Returns event history counters for topology changes: node, switch, director switch, and link status changes (up/down). These events are collected through the Prometheus endpoint.

  • Request URL – GET ufmRest/system_monitoring/events_counters

  • Request Content Type – application/json

  • Response

    {
        "12h": {
            "Director Switch is Down": 0,
            "Director Switch is Up": 0,
            "Link is Down": 0,
            "Link is Up": 0,
            "Node is Down": 0,
            "Node is Up": 6,
            "Switch is Down": 0,
            "Switch is Up": 0
        },
        "1h": {
            "Director Switch is Down": 0,
            "Director Switch is Up": 0,
            "Link is Down": 0,
            "Link is Up": 0,
            "Node is Down": 0,
            "Node is Up": 0,
            "Switch is Down": 0,
            "Switch is Up": 0
        },
        ……
        ……
    }

  • Status Code:

    • 200 – OK
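The response maps each time window to a set of per-event counters, so a client can summarize it with a few dictionary operations. The Python sketch below is one possible approach. The helper names `total_events_per_window` and `nonzero_events` are hypothetical, and the sample payload is shortened to a few of the keys from the two windows shown above; any further windows that the real response contains are not assumed here.

```python
import json


def total_events_per_window(counters: dict) -> dict:
    """Sum all event counters within each time window."""
    return {window: sum(events.values()) for window, events in counters.items()}


def nonzero_events(counters: dict, window: str) -> dict:
    """Return only the events that occurred at least once in the given window."""
    return {
        name: count
        for name, count in counters.get(window, {}).items()
        if count > 0
    }


# Shortened sample based on the response shown on this page.
SAMPLE_RESPONSE = """
{
    "12h": {"Link is Down": 0, "Link is Up": 0, "Node is Down": 0, "Node is Up": 6},
    "1h": {"Link is Down": 0, "Link is Up": 0, "Node is Down": 0, "Node is Up": 0}
}
"""

counters = json.loads(SAMPLE_RESPONSE)
totals = total_events_per_window(counters)
recent_activity = nonzero_events(counters, "12h")
```

With this sample, `totals` is `{"12h": 6, "1h": 0}` and `recent_activity` is `{"Node is Up": 6}`. Filtering out zero counters keeps the output short when most event types have not occurred in the window.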

© Copyright 2023, NVIDIA. Last updated on Mar 4, 2024.