High-Frequency (Primary) Telemetry Fields
The following counters are available. They cover timestamps, port and node information, error statistics, firmware versions, temperatures, cable details, power levels, and other telemetry data.
Field Name | Description |
timestamp | |
source_id | |
tag | |
node_guid | Node GUID |
port_guid | Port GUID |
port_num | Port number |
PortXmitDataExtended | Amount of data transmitted through the egress port, in bytes, during the sample period |
PortRcvDataExtended | Amount of data received on the ingress port, in bytes, during the sample period |
PortXmitPktsExtended | Total number of packets transmitted on the port |
PortRcvPktsExtended | Total number of packets received on the port |
SymbolErrorCounterExtended | Number of error bits that were not corrected by the PHY correction mechanisms |
LinkErrorRecoveryCounterExtended | Total number of times the Port Training state machine has successfully completed the link error recovery process |
LinkDownedCounterExtended | Perf.PortCounters |
PortRcvErrorsExtended | Total number of packets containing an error that were received on the port |
PortRcvRemotePhysicalErrorsExtended | Total number of packets marked with the EBP delimiter received on the port |
PortRcvSwitchRelayErrorsExtended | Total number of packets received on the port that were discarded because they could not be forwarded by the switch relay |
PortXmitDiscardsExtended | Total number of outbound packets discarded by the port because the port is down or congested |
PortXmitConstraintErrorsExtended | Total number of packets not transmitted from the switch physical port |
PortRcvConstraintErrorsExtended | Total number of packets received on the switch physical port that are discarded |
LocalLinkIntegrityErrorsExtended | Number of times that the count of local physical errors exceeded the threshold specified by LocalPhyErrors |
ExcessiveBufferOverrunErrorsExtended | Number of times that OverrunErrors consecutive flow control update periods occurred, each having at least one overrun error |
VL15DroppedExtended | Number of incoming VL15 packets dropped due to resource limitations (e.g., lack of buffers) in the port |
PortXmitWaitExtended | Time, in ticks within the sample-time window, during which the egress port had data to send but could not send it due to lack of credits or arbitration |
hist[0-4] | hist[i] gives the number of FEC blocks that had RS-FEC symbol errors of value i or within a range of errors |
infiniband_CBW | |
Normalized_CBW | |
NormalizedXW | |
Normalized_XmitData | |
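As an illustration of how the data counters above might be consumed, the following minimal sketch converts the per-sample byte counts into throughput. It assumes that PortXmitDataExtended and PortRcvDataExtended arrive as byte counts for one sample period, as described in the table, and that the sample period is known to the caller. The `PortSample` class, the `from_record` and `throughput_gbps` helpers, the record layout, and the example values are all hypothetical and are not part of the telemetry product.

```python
from dataclasses import dataclass


@dataclass
class PortSample:
    """Subset of one telemetry record for a single port (hypothetical container)."""
    node_guid: str
    port_num: int
    xmit_bytes: int  # PortXmitDataExtended for this sample period
    rcv_bytes: int   # PortRcvDataExtended for this sample period


def from_record(record: dict) -> PortSample:
    """Build a PortSample from a mapping keyed by the field names in the table."""
    return PortSample(
        node_guid=str(record["node_guid"]),
        port_num=int(record["port_num"]),
        xmit_bytes=int(record["PortXmitDataExtended"]),
        rcv_bytes=int(record["PortRcvDataExtended"]),
    )


def throughput_gbps(byte_count: int, sample_period_s: float) -> float:
    """Convert a per-sample byte count into gigabits per second."""
    if sample_period_s <= 0:
        raise ValueError("sample_period_s must be positive")
    return byte_count * 8 / sample_period_s / 1e9


# Made-up record; real values come from the telemetry stream.
record = {
    "node_guid": "0x0000000000000001",
    "port_num": "1",
    "PortXmitDataExtended": "1250000000",
    "PortRcvDataExtended": "625000000",
}
sample = from_record(record)
tx = throughput_gbps(sample.xmit_bytes, sample_period_s=1.0)
rx = throughput_gbps(sample.rcv_bytes, sample_period_s=1.0)
print(f"port {sample.port_num}: tx={tx:.2f} Gb/s rx={rx:.2f} Gb/s")
```

If the deployment reports these counters in a different unit or as running totals rather than per-sample values, the conversion would need to be adjusted accordingly.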