Monitoring REST API
Description – APIs for managing monitoring session data and monitoring template data
Request URL – /ufmRest/monitoring
Main Operations
Monitoring sessions:
Create a monitoring session
Delete a monitoring session
Get data of a monitoring session
Monitoring session snapshot
Request data of a monitoring session’s attributes
Get all monitoring available attributes
Get traffic/congestion map
Monitoring templates:
Create a monitoring template
Update a monitoring template
Get a monitoring template
Get all monitoring templates
Delete a monitoring template
The available values for the monitoring attributes are listed below.
Monitor Class – the selected object type for monitoring
Monitor Attributes – the selected attributes (counters) for monitoring the monitored objects
Monitor Functions – list of optional functions to apply for the monitored objects data
Attribute | Value | Description
Monitoring class | "Device" | General device in the fabric (can be a switch, host, bridge, etc.)
Monitoring class | "Port" | Represents a physical port in the fabric
Monitor attributes | "Infiniband_MBOut", "Infiniband_MBOutRate"* | Total number of data octets, divided by 4, transmitted on all VLs from the port, including all octets between (and not including) the start of packet delimiter and the VCRC. May include packets containing errors. All link packets are excluded. Results are reported as a multiple of four octets
Monitor attributes | "Infiniband_MBIn", "Infiniband_MBInRate"* | Total number of data octets, divided by 4, received on all VLs at the port, including all octets between (and not including) the start of packet delimiter and the VCRC. May include packets containing errors. All link packets are excluded. When the received packet length exceeds the maximum allowed packet length specified in C7-45, the counter may include all data octets exceeding this limit. Results are reported as a multiple of four octets
Monitor attributes | "Infiniband_PckOut", "Infiniband_PckOutRate"* | Total number of packets transmitted on all VLs from the port, including packets with errors, and excluding link packets
Monitor attributes | "Infiniband_PckIn", "Infiniband_PckInRate"* | Total number of packets, including packets containing errors and excluding link packets, received from all VLs on the port
Monitor attributes | "Infiniband_RcvErrors", "Infiniband_RcvErrors_Delta"** | Total number of packets containing errors that were received on the port including:
Monitor attributes | "Infiniband_XmtDiscards", "Infiniband_XmtDiscards_Delta"** | Total number of outbound packets discarded by the port when the port is down or congested for the following reasons:
Monitor attributes | "Infiniband_SymbolErrors", "Infiniband_SymbolErrors_Delta"** | Total number of minor link errors detected on one or more physical lanes
Monitor attributes | "Infiniband_LinkRecovers", "Infiniband_LinkRecovers_Delta"** | Total number of times the Port Training state machine has successfully completed the link error recovery process
Monitor attributes | "Infiniband_LinkDowned", "Infiniband_LinkDowned_Delta"** | Total number of times the Port Training state machine has failed the link error recovery process and downed the link
Monitor attributes | "Infiniband_LinkIntegrityErrors", "Infiniband_LinkIntegrityErrors_Delta"** | The number of times that the count of local physical errors exceeded the threshold specified by LocalPhyErrors
Monitor attributes | "Infiniband_RcvRemotePhysErrors", "Infiniband_RcvRemotePhysErrors_Delta"** | Total number of packets marked with the EBP delimiter received on the port
Monitor attributes | "Infiniband_XmtConstraintErrors", "Infiniband_XmtConstraintErrors_Delta"** | Total number of packets not transmitted from the switch physical port for the following reasons:
Monitor attributes | "Infiniband_RcvConstraintErrors", "Infiniband_RcvConstraintErrors_Delta"** | Total number of packets received on the switch physical port that are discarded for the following reasons:
Monitor attributes | "Infiniband_ExcBufOverrunErrors", "Infiniband_ExcBufOverrunErrors_Delta"** | The number of times that OverrunErrors consecutive flow control update periods occurred, each having at least one overrun error
Monitor attributes | "Infiniband_RcvSwRelayErrors", "Infiniband_RcvSwRelayErrors_Delta"** | Total number of packets received on the port that were discarded when they could not be forwarded by the switch relay for the following reasons:
Monitor attributes | "Infiniband_VL15Dropped", "Infiniband_VL15Dropped_Delta"** | Number of incoming VL15 packets dropped because of resource limitations (e.g., lack of buffers) in the port
Monitor attributes | "Infiniband_XmitWait" | The number of ticks during which the port selected by PortSelect had data to transmit but no data was sent during the entire tick because of insufficient credits or lack of arbitration
Monitor attributes | "Infiniband_CumulativeErrors" | The sum of several error counters indicating link integrity issues
Monitor attributes | "Infiniband_CBW" | Congestion bandwidth rate: the rate of congestion as measured by the XmitWait counter
Monitor attributes | "Infiniband_Normalized_MBOut" | Effective port bandwidth utilization in %: XmitData incremental / Link Capacity
Monitor attributes | "Infiniband_Normalized_CBW" | Amount of bandwidth that was suppressed due to congestion: (XmitWait incremental / Time) * Link Capacity. Separate counters are used for Tier 4 ports and for the rest of the ports
Monitor attributes | "Infiniband_NormalizedXW" | Congestion in relation to packets transmitted over the link: XmitWait incremental / XmitPackets incremental. This event is calculated only for the port directly connected to receiving hosts. Separate counters are used for Tier 4 ports and for the rest of the ports
Monitor functions | "RAW" | Raw data values of selected monitoring objects
Monitor functions | "AVG" | Average value of all selected monitoring objects
Monitor functions | "SUM" | Sum value of all selected monitoring objects
Monitor functions | "MIN" | Minimum value of all selected monitoring objects
Monitor functions | "MAX" | Maximum value of all selected monitoring objects
* Rate Counter – counter value calculated as the delta from the previous sampled value divided by the time elapsed since the previous sample (the rate of change between two sequential samples).
** Delta Counter – counter value calculated as the delta from the previous counter value.
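The two footnote definitions above can be illustrated with a minimal Python sketch. The `Sample` type and both function names are hypothetical helpers introduced only for this illustration; they are not part of the UFM API, and the sketch ignores counter wrap-around and resets.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of a cumulative counter (hypothetical helper type)."""
    timestamp: float  # seconds
    value: int        # cumulative counter value


def delta_counter(prev: Sample, curr: Sample) -> int:
    """Delta counter: change relative to the previous counter value."""
    return curr.value - prev.value


def rate_counter(prev: Sample, curr: Sample) -> float:
    """Rate counter: delta divided by the time elapsed between the samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return delta_counter(prev, curr) / elapsed


# Illustrative values: the counter grew by 60 over a 30-second interval.
prev = Sample(timestamp=0.0, value=1_000)
curr = Sample(timestamp=30.0, value=1_060)
print(delta_counter(prev, curr))  # 60
print(rate_counter(prev, curr))   # 2.0
```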
Create Monitoring Session
Description – creates and starts a monitoring session
Request URL – POST /ufmRest/monitoring/start
Request Content Type – application/json
Request Data Format
{ "scope_object": MonitorClass, "monitor_object": MonitorClass, "objects": [ "object_id" ], "attributes": [ MonitorAttributes ], "functions": [ MonitorFunctions ], "interval": 2 }
Note – Refer to the table in "Possible Attribute Values" for possible values for monitor class, monitor attributes, and monitor functions.
Request Data Example
{ "attributes": ["Infiniband_MBOut","Infiniband_MBIn"], "functions": ["RAW"], "scope_object": "Site", "interval":2, "monitor_object": "Device", "objects": ["Grid.default"] }
Request Data Example – creates and starts a monitoring session on a group of devices
{ "interval": 15, "functions": ["RAW"], "scope_object": "Group", "monitor_object": "Device", "attributes": ["Infiniband_MBOutRate"], "objects": ["Grid.default.groups.<group_name>"] }
Response Format
/ufmRest/monitoring/session/<session_id>
Response Example
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <title>Redirecting...</title> <h1>Redirecting...</h1> <p>You should be redirected automatically to target URL: <a href="/ufmRest/monitoring/session/3">/ufmRest/monitoring/session/3</a>. If not click the link.
Note – the resource ID can be found by parsing the Location header.
Status Codes
201 – created
400 – bad request
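A minimal sketch of how a client might prepare the create request and recover the session ID from the Location header, using only the Python standard library. The host name in `BASE_URL` and both function names are assumptions made for this illustration; authentication is not covered in this section and is left out. The request is built but not sent.

```python
import json
import re
import urllib.request

BASE_URL = "https://ufm.example.com"  # assumption: placeholder host name


def build_start_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a session."""
    return urllib.request.Request(
        url=f"{BASE_URL}/ufmRest/monitoring/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def session_id_from_location(location: str) -> str:
    """Extract <session_id> from a Location value such as
    /ufmRest/monitoring/session/3."""
    match = re.search(r"/ufmRest/monitoring/session/([^/]+)/?$", location)
    if match is None:
        raise ValueError(f"unexpected Location header: {location!r}")
    return match.group(1)


# Payload copied from the first request data example above.
payload = {
    "attributes": ["Infiniband_MBOut", "Infiniband_MBIn"],
    "functions": ["RAW"],
    "scope_object": "Site",
    "interval": 2,
    "monitor_object": "Device",
    "objects": ["Grid.default"],
}
request = build_start_request(payload)
print(request.get_method(), request.full_url)
print(session_id_from_location("/ufmRest/monitoring/session/3"))  # 3
```

Sending the request (for example with `urllib.request.urlopen`) would additionally need credentials and TLS settings appropriate to the deployment.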
Delete Monitoring Session
Description – deletes a monitoring session
Request URL – DELETE /ufmRest/monitoring/session/<session_id>
Request Data Format – not required
Response – N/A
Status Codes
202 – accepted
Get Monitoring Session Data
Description – returns monitoring session data
Request URL – GET /ufmRest/monitoring/session/<session_id>/data
Note – http://localhost:4300/ufmRestV2/telemetry?type=history&membersType=Device&attributes=[Infiniband_PckInRate]&function=RAW&result_format=Port&members=[ec0d9a03007d7f0a]&start_time=-5min&end_time=-0min
Request Data Format – not required
Response Format
{ timestamp: { monitor_object: { name: { "statistics": { counter1...counter2......... }, dname: ..., last_updated: ... } } } }
Response Example
{ "2020-10-27 18:52:42": { "Device": { "98039b030000e456": { "dname": "r-dmz-ufm128", "last_updated": "2020-10-27 18:52:14", "statistics": { "Infiniband_PckIn": 22035108156, "Infiniband_PckOut": 330352264, "Infiniband_PckOutRate": 0.06599832808486128, "Infiniband_PckInRate": 0.06599832808486128 } }, "0c42a103008b3bd0": { "dname": "r-dmz-ufm131", "last_updated": "2020-10-27 18:52:14", "statistics": { "Infiniband_PckIn": 1297449, "Infiniband_PckOut": 1286924, "Infiniband_PckOutRate": 0.13199665616972256, "Infiniband_PckInRate": 0.13199665616972256 } }, "0c42a103008b40d0": { "dname": "r-dmz-ufm134", "last_updated": "2020-10-27 18:52:14", "statistics": { "Infiniband_PckIn": 4681865, "Infiniband_PckOut": 3223445, "Infiniband_PckOutRate": 2.2109439908428525, "Infiniband_PckInRate": 2.309941482970145 } }, "248a0703002e61da": { "dname": "r-dmz-ufm137", "last_updated": "2020-10-27 18:52:14", "statistics": { "Infiniband_PckIn": 333267757, "Infiniband_PckOut": 22034474531, "Infiniband_PckOutRate": 0.13199665616972256, "Infiniband_PckInRate": 0.13199665616972256 } }, "0002c903007b78b0": { "dname": "r-dmz-ufm-sw49", "last_updated": "2020-10-27 18:52:14", "statistics": { "Infiniband_PckIn": 22374553061, "Infiniband_PckOut": 22380620225, "Infiniband_PckOutRate": 3.5309105525400772, "Infiniband_PckInRate": 3.6299080446673697 } } } } }
Note – the UFM default session, which collects all port statistics every 30 seconds (by default), can be queried using session ID 0 (zero): GET /ufmRest/monitoring/session/0/data
Status Codes
200 – OK
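The nested response (timestamp, then monitor object, then name, then statistics) is often easier to process as flat rows. The following is a sketch only; `iter_statistics` is a hypothetical helper, and the sample is a shortened copy of the response example above.

```python
from typing import Iterator, Tuple


def iter_statistics(data: dict) -> Iterator[Tuple[str, str, str, str, float]]:
    """Yield (timestamp, monitor_object, name, counter, value) rows."""
    for timestamp, by_class in data.items():
        for monitor_object, by_name in by_class.items():
            for name, entry in by_name.items():
                for counter, value in entry.get("statistics", {}).items():
                    yield timestamp, monitor_object, name, counter, value


# Shortened copy of the response example (one device, two counters).
sample = {
    "2020-10-27 18:52:42": {
        "Device": {
            "98039b030000e456": {
                "dname": "r-dmz-ufm128",
                "last_updated": "2020-10-27 18:52:14",
                "statistics": {
                    "Infiniband_PckIn": 22035108156,
                    "Infiniband_PckOut": 330352264,
                },
            }
        }
    }
}

rows = list(iter_statistics(sample))
for row in rows:
    print(row)
```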
Get Default Monitoring Session Data by PKey Filtering
Description – returns default monitoring session data filtered by PKey
Request URL – GET /ufmRest/monitoring/session/<session_id>/data?pkey=<pkey name>
Request Data Format – not required
Response Format
{ timestamp: { monitor_object: { name: { "statistics": { counter1...counter2......... }, dname: ..., last_updated: ... } } } }
Response Example
{ "2022-10-19 13:23:11": { "Ports": { "b8599f03000a7768_1": { "dname": "default / Computer: r-ufm77 / HCA-1/1", "last_updated": "2022-10-19 13:23:11", "statistics": { "raw_ber": 0, "dev_temperature": 0, "Infiniband_PckOutRate": 1.1333333333333333, "Infiniband_PckInRate": 1.1333333333333333, "Infiniband_MBInRate": 0.0, "Infiniband_MBOutRate": 0.03333333333333333, "Infiniband_MBOut": 26165, "Infiniband_MBIn": 26126, "Infiniband_PckOut": 95263867, "Infiniband_PckIn": 95123933, "Infiniband_SymbolErrors": 0, "Infiniband_LinkRecovers": 0, "Infiniband_LinkDowned": 12, "Infiniband_RcvErrors": 0, "Infiniband_RcvRemotePhysErrors": 0, "Infiniband_RcvSwRelayErrors": 0, "Infiniband_XmtDiscards": 440, "Infiniband_XmtConstraintErrors": 0, "Infiniband_RcvConstraintErrors": 0, "Infiniband_LinkIntegrityErrors": 0, "Infiniband_ExcBufOverrunErrors": 0, "Infiniband_VL15Dropped": 0, "Infiniband_XmitWait": 0, "Infiniband_CBW": 0, "Infiniband_Normalized_CBW": 0, "Infiniband_Normalized_MBOut": 2.61104e-6 } }, "b8599f03000a7769_2": { "dname": "default / Computer: r-ufm77 / HCA-1/2", "last_updated": "2022-10-19 13:23:11", "statistics": { "raw_ber": 0, "dev_temperature": 0, "Infiniband_PckOutRate": 0.13333333333333333, "Infiniband_PckInRate": 0.13333333333333333, "Infiniband_MBInRate": 0.0, "Infiniband_MBOutRate": 0.0, "Infiniband_MBOut": 3197, "Infiniband_MBIn": 3197, "Infiniband_PckOut": 11642100, "Infiniband_PckIn": 11642006, "Infiniband_SymbolErrors": 0, "Infiniband_LinkRecovers": 0, "Infiniband_LinkDowned": 2, "Infiniband_RcvErrors": 0, "Infiniband_RcvRemotePhysErrors": 0, "Infiniband_RcvSwRelayErrors": 7, "Infiniband_XmtDiscards": 80, "Infiniband_XmtConstraintErrors": 0, "Infiniband_RcvConstraintErrors": 0, "Infiniband_LinkIntegrityErrors": 0, "Infiniband_ExcBufOverrunErrors": 0, "Infiniband_VL15Dropped": 0, "Infiniband_XmitWait": 0, "Infiniband_CBW": 0, "Infiniband_Normalized_CBW": 0, "Infiniband_Normalized_MBOut": 3.07182e-7 } }, "f452140300383a01_1": { "dname": "default / Computer: r-ufm51 / HCA-1/1", "last_updated": "2022-10-19 13:23:11", "statistics": { "raw_ber": 0, "dev_temperature": 0, "Infiniband_PckOutRate": 0.06666666666666667, "Infiniband_PckInRate": 0.06666666666666667, "Infiniband_MBInRate": 0, "Infiniband_MBOutRate": 0, "Infiniband_MBOut": 3050, "Infiniband_MBIn": 3050, "Infiniband_PckOut": 11106861, "Infiniband_PckIn": 11106856, "Infiniband_SymbolErrors": 0, "Infiniband_LinkRecovers": 0, "Infiniband_LinkDowned": 0, "Infiniband_RcvErrors": 0, "Infiniband_RcvRemotePhysErrors": 0, "Infiniband_RcvSwRelayErrors": 0, "Infiniband_XmtDiscards": 0, "Infiniband_XmtConstraintErrors": 0, "Infiniband_RcvConstraintErrors": 0, "Infiniband_LinkIntegrityErrors": 0, "Infiniband_ExcBufOverrunErrors": 0, "Infiniband_VL15Dropped": 0, "Infiniband_XmitWait": 0, "Infiniband_CBW": 0, "Infiniband_Normalized_CBW": 0, "Infiniband_Normalized_MBOut": 2.74269e-7 } }, "f452140300383a02_2": { "dname": "default / Computer: r-ufm51 / HCA-1/2", "last_updated": "2022-10-19 13:23:11", "statistics": { "raw_ber": 0, "dev_temperature": 0, "Infiniband_PckOutRate": 0.06666666666666667, "Infiniband_PckInRate": 0.06666666666666667, "Infiniband_MBInRate": 0.0, "Infiniband_MBOutRate": 0.0, "Infiniband_MBOut": 3064, "Infiniband_MBIn": 3064, "Infiniband_PckOut": 11156319, "Infiniband_PckIn": 11156290, "Infiniband_SymbolErrors": 0, "Infiniband_LinkRecovers": 0, "Infiniband_LinkDowned": 0, "Infiniband_RcvErrors": 0, "Infiniband_RcvRemotePhysErrors": 0, "Infiniband_RcvSwRelayErrors": 0, "Infiniband_XmtDiscards": 0, "Infiniband_XmtConstraintErrors": 0, "Infiniband_RcvConstraintErrors": 0, "Infiniband_LinkIntegrityErrors": 0, "Infiniband_ExcBufOverrunErrors": 0, "Infiniband_VL15Dropped": 0, "Infiniband_XmitWait": 0, "Infiniband_CBW": 0, "Infiniband_Normalized_CBW": 0, "Infiniband_Normalized_MBOut": 2.74269e-7 } } } } }
Status Codes
200 – OK
400 – PKey not found
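As a sketch of how the PKey filter might be attached to the request URL, the helper below builds the path and URL-encodes the `pkey` query parameter. The function name is hypothetical, and the value "0x7fff" is only an illustrative PKey, not one taken from this document.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def session_data_path(session_id: int = 0, pkey: Optional[str] = None) -> str:
    """Return the request path for session data, optionally filtered by PKey.

    Session ID 0 is the UFM default session.
    """
    path = f"/ufmRest/monitoring/session/{quote(str(session_id), safe='')}/data"
    if pkey is not None:
        # urlencode escapes any characters that are unsafe in a query string.
        path += "?" + urlencode({"pkey": pkey})
    return path


print(session_data_path())                  # /ufmRest/monitoring/session/0/data
print(session_data_path(0, pkey="0x7fff"))  # /ufmRest/monitoring/session/0/data?pkey=0x7fff
```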
Monitoring Session Snapshot
Description – creates a one-time monitoring session and returns its data
Request URL – POST /ufmRest/monitoring/snapshot
Request Content Type – application/json
Request Data Format
{ "scope_object": MonitorClass, "monitor_object": MonitorClass, "objects": [ "object_id" ], "attributes": [ MonitorAttributes ], "functions": [ MonitorFunctions ], "interval": 2 }
Note – Refer to the table in "Possible Attribute Values" for possible values for monitor class, monitor attributes, and monitor functions.
Request Data Example
{ "attributes": ["Infiniband_MBOut","Infiniband_MBIn"], "functions": ["RAW"], "scope_object": "Site", "interval":2, "monitor_object": "Device", "objects": ["Grid.default"] }
Response Format
{ timestamp: { monitor_object: { name: { "statistics": { counter1... counter2... ... ... }, dname: ... } } } }
Response Example
{ "2017-01-17 13:41:29": { "Device": { "0002c903001c6740": { "dname": "l-qa-150 HCA-3", "statistics": { "Infiniband_MBIn": 0, "Infiniband_MBOut": 0 } }, "f45214030042ccd0": { "dname": "MTX6000-Interop", "statistics": { "Infiniband_MBIn": 0, "Infiniband_MBOut": 0 } }, "0002c90300b71030": { "dname": "MT4113 ConnectIB Mellanox Technologies", "statistics": { "Infiniband_MBIn": 0, "Infiniband_MBOut": 0 } }, "f452140300289f80": { "dname": "sqadell49 HCA-3", "statistics": { "Infiniband_MBIn": 0, "Infiniband_MBOut": 0 } }, "f452140300188900": { "dname": "sqadell47 HCA-6", "statistics": { "Infiniband_MBIn": 0, "Infiniband_MBOut": 0 } }, "f452140300188840": { "dname": "sqadell49 HCA-6", "statistics": { "Infiniband_MBIn": 0, "Infiniband_MBOut": 0 } }, "f45214030028a020": { "dname": "l-qa-150 HCA-2", "statistics": { "Infiniband_MBIn": 0, "Infiniband_MBOut": 0 } } } } }
Status Codes
200 – OK
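Before posting a snapshot (or start) request, a client could sanity-check the payload locally. The sketch below is hypothetical: the key names follow the request data example, the function list comes from the monitor functions in the attribute table, and treating every key as required is an assumption, since this section does not state which fields are optional.

```python
# Monitor functions listed in the attribute values table.
MONITOR_FUNCTIONS = {"RAW", "AVG", "SUM", "MIN", "MAX"}

# Assumption: all keys shown in the request data example are required.
REQUIRED_KEYS = {
    "scope_object", "monitor_object", "objects",
    "attributes", "functions", "interval",
}


def validate_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    missing = REQUIRED_KEYS - set(payload)
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    unknown = set(payload.get("functions", [])) - MONITOR_FUNCTIONS
    if unknown:
        problems.append("unknown functions: " + ", ".join(sorted(unknown)))
    if not payload.get("objects"):
        problems.append("objects must list at least one object ID")
    return problems


snapshot_payload = {
    "attributes": ["Infiniband_MBOut", "Infiniband_MBIn"],
    "functions": ["RAW"],
    "scope_object": "Site",
    "interval": 2,
    "monitor_object": "Device",
    "objects": ["Grid.default"],
}
print(validate_payload(snapshot_payload))  # []
```

A local check like this only catches obvious mistakes; the server remains the authority on what it accepts.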
Request Monitoring Session Attributes Data
Description – requests the data that was used to create the monitoring session
Request URL – GET /ufmRest/monitoring/session/<session_id>
Request Data Format – not required
Response Format
{ "scope_object": MonitorClass, "monitor_object": MonitorClass, "objects": [ "object_id" ], "attributes": [ MonitorAttributes ], "functions": [ MonitorFunctions ], "interval": 2 }
Note – Refer to the table in "Possible Attribute Values" for possible values for monitor class, monitor attributes, and monitor functions.
Response Example
{ "attributes": [ "Infiniband_PckIn", "Infiniband_PckOutRate", "Infiniband_PckInRate" ], "functions": [ "RAW" ], "scope_object": "Device", "interval": 2, "monitor_object": "Device", "objects": [ "Grid.default.ec0d9a03007d7d0a", "Grid.default.98039b030000e456", "Grid.default.0c42a103008b3bd0", "Grid.default.0c42a103008b40d0" ] }
Status Codes
200 – OK
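In the examples in this section, the object IDs of a session ("Grid.default.<guid>") end with the same identifiers that key the per-device entries in the session data. Assuming that pattern holds, a small hypothetical helper can map a session definition to the keys to look for in the data response:

```python
def object_guid(object_id: str) -> str:
    """Return the last dot-separated component of an object ID.

    Assumption: IDs look like "Grid.default.<guid>", as in the examples.
    """
    return object_id.rsplit(".", 1)[-1]


# Shortened copy of the response example above.
definition = {
    "attributes": ["Infiniband_PckIn", "Infiniband_PckOutRate", "Infiniband_PckInRate"],
    "functions": ["RAW"],
    "scope_object": "Device",
    "interval": 2,
    "monitor_object": "Device",
    "objects": [
        "Grid.default.ec0d9a03007d7d0a",
        "Grid.default.98039b030000e456",
    ],
}

guids = [object_guid(obj) for obj in definition["objects"]]
print(guids)  # ['ec0d9a03007d7d0a', '98039b030000e456']
```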
Get All Monitoring Available Attributes
Description – returns all possible values of monitoring metadata (counters, classes, and functions)
Request URL – GET /ufmRest/monitoring/attributes
Request Data – not required
Response Format
{ "functions":[ MonitorFunctions], "classes":[ MonitorClass], "counters":[ MonitorAttributes] }
Note – Refer to the table in "Possible Attribute Values" for possible values for monitor class, monitor attributes, and monitor functions.
Response Example
{ "functions": [ "RAW", "AVG", "SUM", "MIN", "MAX" ], "classes": [ "Port", "Device", "Switch", "Bridge", "Computer", "LogicalServer", "Site", "PortsGroup" ], "counters": [ "Infiniband_MBIn", "Infiniband_PckIn", "Infiniband_MBOut", "Infiniband_PckOut", "Infiniband_MBInRate", "Infiniband_PckInRate", "Infiniband_MBOutRate", "Infiniband_SymbolErrors", "Infiniband_LinkRecovers", "Infiniband_LinkDowned", "Infiniband_RcvErrors", "Infiniband_RcvRemotePhysErrors", "Infiniband_RcvSwRelayErrors", "Infiniband_XmtDiscards", "Infiniband_XmtConstraintErrors", "Infiniband_RcvConstraintErrors", "Infiniband_LinkIntegrityErrors", "Infiniband_ExcBufOverrunErrors", "Infiniband_VL15Dropped", "Infiniband_SymbolErrors_Delta", "Infiniband_LinkRecovers_Delta", "Infiniband_LinkDowned_Delta", "Infiniband_RcvErrors_Delta", "Infiniband_RcvRemotePhysErrors_Delta", "Infiniband_RcvSwRelayErrors_Delta", "Infiniband_XmtDiscards_Delta", "Infiniband_XmtConstraintErrors_Delta", "Infiniband_RcvConstraintErrors_Delta", "Infiniband_LinkIntegrityErrors_Delta", "Infiniband_ExcBufOverrunErrors_Delta", "Infiniband_VL15Dropped_Delta", "Infiniband_CBW", "Infiniband_Normalized_CBW", "Infiniband_Normalized_MBOut", "Infiniband_XmitWait", "Infiniband_NormalizedXW", "Infiniband_CumulativeErrors" ] }
Status Codes
200 – OK
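The returned counter names follow the naming pattern used in the attribute table: rate counters end in "Rate" and delta counters end in "_Delta". A minimal sketch that groups a counter list on that basis is shown below; the function name is hypothetical, and the grouping relies only on the name suffix.

```python
def classify_counters(counters: list) -> dict:
    """Group counter names into rate, delta, and other, by name suffix."""
    groups = {"rate": [], "delta": [], "other": []}
    for name in counters:
        if name.endswith("Rate"):
            groups["rate"].append(name)
        elif name.endswith("_Delta"):
            groups["delta"].append(name)
        else:
            groups["other"].append(name)
    return groups


# A few names taken from the response example above.
counters = [
    "Infiniband_MBIn",
    "Infiniband_MBInRate",
    "Infiniband_RcvErrors_Delta",
    "Infiniband_XmitWait",
]
groups = classify_counters(counters)
print(groups["rate"])   # ['Infiniband_MBInRate']
print(groups["delta"])  # ['Infiniband_RcvErrors_Delta']
print(groups["other"])  # ['Infiniband_MBIn', 'Infiniband_XmitWait']
```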
Get Traffic/Congestion Map
Description – returns traffic and congestion information on the different tiers in the fabric.
Request URL – GET /ufmRest/monitoring/congestion
Content Type – application/json
Response
{ "1": { "traffic": { "max": 0, "avg": 0, "min": 0 }, "cong": { "max": 0, "avg": 0, "min": 0 } }, "3": { "traffic": { "max": 0, "avg": 0, "min": 0 }, "cong": { "max": 0, "avg": 0, "min": 0 } }, "2": { "traffic": { "max": 0, "avg": 0, "min": 0 }, "cong": { "max": 0, "avg": 0, "min": 0 } }, "4": { "traffic": { "max": 0, "avg": 0, "min": 0 }, "cong": { "max": 0, "avg": 0, "min": 0 } } }
Status Codes
200 – OK
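The response keys are tier numbers encoded as strings and are not guaranteed to arrive in order (the example lists "1", "3", "2", "4"). The sketch below, with a hypothetical function name and illustrative non-zero values that are not real output, ranks tiers by average congestion.

```python
def tiers_by_congestion(cong_map: dict) -> list:
    """Return (tier, avg congestion) pairs, most congested first.

    Ties are broken by ascending tier number.
    """
    pairs = [(int(tier), info["cong"]["avg"]) for tier, info in cong_map.items()]
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Illustrative values only; the response example above contains zeros.
cong_map = {
    "1": {"traffic": {"max": 0, "avg": 0, "min": 0}, "cong": {"max": 0, "avg": 0.0, "min": 0}},
    "3": {"traffic": {"max": 0, "avg": 0, "min": 0}, "cong": {"max": 1, "avg": 0.2, "min": 0}},
    "2": {"traffic": {"max": 0, "avg": 0, "min": 0}, "cong": {"max": 4, "avg": 1.5, "min": 0}},
    "4": {"traffic": {"max": 0, "avg": 0, "min": 0}, "cong": {"max": 3, "avg": 1.5, "min": 0}},
}
ranking = tiers_by_congestion(cong_map)
print(ranking)  # [(2, 1.5), (4, 1.5), (3, 0.2), (1, 0.0)]
```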
Get Port Groups Traffic/Congestion Map
Description – returns traffic and congestion information for all port groups.
Request URL – GET /ufmRest/monitoring/port_groups
Content type – application/json
Response
{ <group_name>: { "traffic": { "max": 0, "avg": 0, "min": 0 }, "cong": { "max": 0, "avg": 0, "min": 0 } } }
Status Codes
200 – OK
Create Monitoring Template
Description – creates and starts a new monitoring template
Request URL – POST /ufmRest/app/monitoring
Content type – application/json
Request Data Format
{ "interval": 5, "functions": [ "RAW" ], "scope_object": "Device", "monitor_object": "Device", "attributes": [ "attribute" ], "objects": [ "object_id" ], "name": "template", "description": "", "view_type": "Line" }
Note – Refer to the table in "Possible Attribute Values" for the list of attributes.
Request Data Example
{ "interval": 5, "functions": [ "RAW" ], "scope_object": "Device", "monitor_object": "Device", "attributes": [ "Infiniband_XmtConstraintErrors" ], "objects": [ "Grid.default.e41d2d0300167ee0" ], "name": "template", "description": "", "view_type": "Line" }
Response
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <title>Redirecting...</title> <h1>Redirecting...</h1> <p>You should be redirected automatically to target URL: <a href="/ufmRest/app/monitoring/template">/ufmRest/app/monitoring/template</a>. If not click the link.
Status Codes
201 – created successfully
403 – bad request
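One way to assemble the template request body is a small dataclass whose defaults mirror the request data example. This is a sketch under that assumption; the `MonitoringTemplate` class is hypothetical, and the defaults are taken from the example rather than from any documented server-side defaults.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MonitoringTemplate:
    """Fields of the template request body (hypothetical helper type)."""
    name: str
    objects: list
    attributes: list
    functions: list = field(default_factory=lambda: ["RAW"])
    scope_object: str = "Device"
    monitor_object: str = "Device"
    interval: int = 5
    description: str = ""
    view_type: str = "Line"


template = MonitoringTemplate(
    name="template",
    objects=["Grid.default.e41d2d0300167ee0"],
    attributes=["Infiniband_XmtConstraintErrors"],
)
# Serialize to the JSON body that would be sent with Content-Type application/json.
body = json.dumps(asdict(template))
print(body)
```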
Update Monitoring Template
Description – updates an existing monitoring template
Request URL – PUT /ufmRest/app/monitoring
Content type – application/json
Request Data
{ "interval": <interval>, "functions": [ "<function>" ], "scope_object": "Device", "monitor_object": "Device", "attributes": [ "attribute" ], "objects": [ "object_id" ], "description": "", "view_type": "<view_type>" }
Note – Refer to the table in "Possible Attribute Values" for the list of attributes.
Status Codes
201 – updated successfully
403 – bad request
Get Monitoring Template
Description – retrieves information on an existing monitoring template
Request URL – GET /ufmRest/app/monitoring/<template_name>
Content type – application/json
Response Example
{ "functions": [ "RAW" ], "description": "N/A", "view_type": "Line", "template_name": "jhgljlj", "interval": 5, "objects": [ "Grid.default.e41d2d0300167ee0" ], "scope_object": "Device", "attributes": [ "Infiniband_XmtDiscards", "Infiniband_RcvErrors", "Infiniband_RcvRemotePhysErrors", "Infiniband_RcvConstraintErrors" ], "monitor_object": "Device", "name": "admin_jhgljlj" }
Note – Refer to the table in "Possible Attribute Values" for the list of attributes.
Status Codes
200 – OK
404 – not found
Get All Monitoring Templates
Description – returns a list of all existing monitoring templates
Request URL – GET /ufmRest/app/monitoring
Content type – application/json
Response
[ "template_name1", "template_name2" ]
Status Codes
200 – OK
Delete Monitoring Template
Description – removes an existing monitoring template
Request URL – DELETE /ufmRest/app/monitoring/<template_name>
Content type – application/json
Status Codes
200 – OK
404 – not found
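The five template operations share one base path and differ only in HTTP method and whether a template name is appended. The following sketch summarizes that mapping; the function and operation names are hypothetical, and the template name "my template" is illustrative, chosen to show URL-escaping.

```python
from urllib.parse import quote

TEMPLATES_PATH = "/ufmRest/app/monitoring"


def template_request(operation: str, template_name: str = "") -> tuple:
    """Return (HTTP method, path) for a template operation."""
    # Escape the name so spaces and slashes cannot alter the path.
    named = f"{TEMPLATES_PATH}/{quote(template_name, safe='')}"
    routes = {
        "create": ("POST", TEMPLATES_PATH),
        "update": ("PUT", TEMPLATES_PATH),
        "list": ("GET", TEMPLATES_PATH),
        "get": ("GET", named),
        "delete": ("DELETE", named),
    }
    if operation not in routes:
        raise ValueError(f"unknown operation: {operation!r}")
    if operation in ("get", "delete") and not template_name:
        raise ValueError(f"{operation} requires a template name")
    return routes[operation]


print(template_request("list"))                   # ('GET', '/ufmRest/app/monitoring')
print(template_request("delete", "my template"))  # ('DELETE', '/ufmRest/app/monitoring/my%20template')
```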