Device Proprietary Counters
Device proprietary counters are reported per device, not per port.
These counters are intended for advanced debugging of performance issues and may be used by Mellanox support to identify the root cause in such cases. A non-zero value does not necessarily indicate a problem, but the counters often provide useful additional information when debugging performance issues.
Name | Description |
PCI Back-pressure/sec | Device core clocks without PCIe read/write credits. The value grows as the host’s ability to receive data from the NIC drops. Possible causes: the accessed memory is not cached or not properly aligned, or the CPU frequency is low or throttled by power management. |
No-WQE drops/sec | Number of times per second that a receive queue from the device to the host has no software buffers (WQEs, Work Queue Entries) allocated for the adapter's incoming traffic. This counter indicates that the NIC hardware could not post received data to the host because no software-allocated buffers were available. Possible causes: slow or overloaded CPU cores. Possible fix: increase the number of receive buffers in the driver's advanced properties tab. This counter is summed into OID_GEN_STATISTICS.ifInDiscards and is not counted in Packets Received. |
Scatter Back-pressure/sec | Device core clocks where Scatter delays Rx packet processing. Supported only on ConnectX-3 Pro. |
WQE fetch/Atomic Back-pressure/sec | Device core clocks where a Work-Queue-Element fetch or an Atomic operation delays Rx packet processing. Supported only on ConnectX-3 Pro. |
Steering/QPC Back-pressure/sec | Device core clocks where packet steering or queue-context handling delays Rx packet processing. Supported only on ConnectX-3 Pro. |
SQ Miss/sec | Transmit-queue/Requester-QP context cache miss. |
RQ Miss/sec | Receive-queue/Responder-QP context cache miss. |
CQ Miss/sec | Completion-Queue (CQ) context cache miss. |
EQ Miss/sec | Event-Queue (EQ) context cache miss. |
MTT Miss/sec | Address translation page table (MTT) cache miss. |
MPT Miss/sec | Address translation region table (MPT) cache miss. |
External Blueflame hit/sec | Latency-critical work-queue-element (BlueFlame) read from NIC buffer. |
External Blueflame replace/sec | Latency-critical work-queue-element (BlueFlame) swapped out of NIC buffer. |
External Doorbell push/sec | Number of doorbells received. |
External Doorbell drop/sec | Number of doorbells dropped. |
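Because No-WQE drops are not counted in Packets Received, the share of arriving packets lost to missing receive buffers can be estimated by adding the two rates back together. The Python sketch below shows only that arithmetic. The function name is hypothetical, both inputs are assumed to be per-second values already sampled over the same interval, and other discard causes are ignored.

```python
def no_wqe_drop_fraction(no_wqe_drops_per_sec: float,
                         packets_received_per_sec: float) -> float:
    """Estimate the fraction of arriving packets dropped for lack of WQEs.

    Assumption: arrivals = Packets Received + No-WQE drops, since the
    drops are not included in Packets Received. Other discard causes
    are not modelled here.
    """
    arrived = no_wqe_drops_per_sec + packets_received_per_sec
    if arrived <= 0:
        # No traffic in the sampled interval, so nothing was dropped.
        return 0.0
    return no_wqe_drops_per_sec / arrived


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    fraction = no_wqe_drop_fraction(50.0, 950.0)
    print(f"No-WQE drop fraction: {fraction:.1%}")
```

A fraction that stays above zero under load would be consistent with the slow or overloaded CPU cores described for this counter, and increasing the number of receive buffers is the fix the table suggests trying first.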
This counter set contains the device’s low-level counters, which are used for debugging and behavior analysis.
Mellanox WinOF Bus Counters | Description |
PCI Back-pressure/sec | Device core clocks without PCIe read/write credits. |
No-WQE Drops/sec | The number of packets dropped because no receive buffers were available in the host. |
Scatter Back-pressure/sec | Device core clocks where Scatter delays Rx packet processing. Supported only on ConnectX-3 Pro. |
WQE fetch/Atomic Back-pressure/sec | Device core clocks where a Work-Queue-Element fetch or an Atomic operation delays Rx packet processing. Supported only on ConnectX-3 Pro. |
Steering/QPC Back-pressure/sec | Device core clocks where packet steering or queue-context handling delays Rx packet processing. Supported only on ConnectX-3 Pro. |
Receive WQE cache hit/sec | The number of receive WQE cache lookups that resulted in a hit. |
Receive WQE cache lookup/sec | The number of receive WQE cache lookups. |
SQ Miss/sec | Transmit-queue/Requester-QP context cache miss. |
RQ Miss/sec | Receive-queue/Responder-QP context cache miss. |
CQ Miss/sec | Completion-Queue (CQ) context cache miss. |
EQ Miss/sec | Event-Queue (EQ) context cache miss. |
MTT Miss/sec | Address translation page table (MTT) cache miss. |
MPT Miss/sec | Address translation region table (MPT) cache miss. |
External Blueflame hit/sec | Latency-critical work-queue-element (BlueFlame) read from NIC buffer. |
External Blueflame Replace/sec | Latency-critical work-queue-element (BlueFlame) swapped out of NIC buffer. |
External Doorbell Push/sec | Number of doorbells received. |
External Doorbell Drop/sec | Number of doorbells dropped. |
Internal Processor0 Maximum Latency | The longest internal processor[0] processing cycle, in microseconds. |
Internal Processor1 Maximum Latency | The longest internal processor[1] processing cycle, in microseconds. |
Internal Processor2 Maximum Latency | The longest internal processor[2] processing cycle, in microseconds. |
Internal Processor3 Maximum Latency | The longest internal processor[3] processing cycle, in microseconds. |
Internal processor executed commands | The number of commands executed by the internal processor in response to driver requests via the HCR command interface. |
Last Retransmitted QP | The last QP that performed a retransmission (RC QPs only). |
Current QPS in error state | The number of QPs in the error state due to an asynchronous error (e.g. retry exceeded) or due to a command with errors (e.g. the 2err_qp command). |
QP priority update flow events | The number of QP priority/SL update events. |
Transmission engine hang events | The number of SX execution engine hang events. |
Current QPS in limited state | The number of QPs that are currently in a limited state. |
Total QPS in limited state | The total number of QPs that have been in a limited state. |
Maximum QPS in limited state | The maximum number of QPs that were in a limited state at the same time. |
MPT entries used for QP | The number of Memory Protection Table (MPT) entries used for QPs. |
MPT entries used for CQ | The number of Memory Protection Table (MPT) entries used for CQs. |
MPT entries used for EQ | The number of Memory Protection Table (MPT) entries used for EQs. |
MPT entries used for MR | The number of Memory Protection Table (MPT) entries used for MRs. |
MTT entries used for QP | The number of Memory Translation Table (MTT) entries used for QPs. |
MTT entries used for CQ | The number of Memory Translation Table (MTT) entries used for CQs. |
MTT entries used for EQ | The number of Memory Translation Table (MTT) entries used for EQs. |
MTT entries used for MR | The number of Memory Translation Table (MTT) entries used for MRs. |
CPU MEM-pages (4K) mapped by TPT for QP | The total number of CPU memory pages (4K) mapped by TPT for QPs. |
CPU MEM-pages (4K) mapped by TPT for CQ | The total number of CPU memory pages (4K) mapped by TPT for CQs. |
CPU MEM-pages (4K) mapped by TPT for EQ | The total number of CPU memory pages (4K) mapped by TPT for EQs. |
CPU MEM-pages (4K) mapped by TPT for MR | The total number of CPU memory pages (4K) mapped by TPT for MRs. |
Arrived RDMA CNPs | The total number of Congestion Notification Packets (CNPs) received on both ports. |
Packets discarded due to invalid QP | The number of packets discarded due to an invalid QP. |
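Two of the counter groups above are easier to read after a small calculation: the receive WQE cache hit and lookup rates combine into a hit ratio, and the "CPU MEM-pages (4K)" counts convert to bytes. The Python sketch below illustrates both. The function names and the dictionary layout are hypothetical, the values are assumed to have been sampled already, and "4K" is assumed to mean 4096 bytes.

```python
from typing import Dict, Optional

PAGE_SIZE_BYTES = 4096  # assumption: "4K" in the counter names means 4 KiB


def wqe_cache_hit_ratio(hits_per_sec: float,
                        lookups_per_sec: float) -> Optional[float]:
    """Return hits / lookups, or None when no lookups were sampled."""
    if lookups_per_sec <= 0:
        return None
    return hits_per_sec / lookups_per_sec


def tpt_mapped_bytes(pages_by_object: Dict[str, int]) -> Dict[str, int]:
    """Convert per-object 4K page counts (QP, CQ, EQ, MR) to bytes."""
    return {name: pages * PAGE_SIZE_BYTES
            for name, pages in pages_by_object.items()}


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    print(wqe_cache_hit_ratio(900.0, 1000.0))
    print(tpt_mapped_bytes({"QP": 2, "MR": 256}))
```

Returning None instead of 0 for an idle interval keeps "no lookups" distinguishable from "every lookup missed".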