The xmit_wait increment behavior on NVIDIA hardware may vary depending on the generation of the devices.
As per the InfiniBand specification, the xmit_wait counter represents the number of ticks in which the selected port had data to transmit, but no data was sent during the entire tick. This can happen due to insufficient credits or a lack of arbitration.
A tick is defined as a multiple of the time required to transfer one byte on an IBA lane, also known as the symbol time. For instance, for links operating at IBA SDR (Single Data Rate), the symbol time is 4 nsec regardless of the link width. Implementers can use a variety of multipliers of the basic tick interval.
The tick encoding for the PortSamplesControl attribute is:
0x00 = 1x symbol time (4 nanoseconds for SDR)
0x01 = 2x symbol time
...
0xFF = 256x symbol time
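The encoding above can be sketched as a small conversion helper. This is a minimal illustration, not part of any NVIDIA tool: the function names are hypothetical, and the default symbol time is the 4 ns SDR value quoted above, so other link speeds need their own symbol time passed in.

```python
# Symbol time for IBA SDR links, as stated in the text (4 ns).
SDR_SYMBOL_TIME_NS = 4.0


def tick_multiplier(tick_code: int) -> int:
    """Return the symbol-time multiplier for a PortSamplesControl tick code.

    Per the encoding above: 0x00 -> 1x, 0x01 -> 2x, ..., 0xFF -> 256x.
    """
    if not 0x00 <= tick_code <= 0xFF:
        raise ValueError(f"tick code out of range: {tick_code:#x}")
    return tick_code + 1


def tick_duration_ns(tick_code: int,
                     symbol_time_ns: float = SDR_SYMBOL_TIME_NS) -> float:
    """Return the duration of one tick in nanoseconds.

    symbol_time_ns defaults to the SDR value; this default is an
    assumption made for illustration only.
    """
    return tick_multiplier(tick_code) * symbol_time_ns
```

For example, a tick code of 0x07 corresponds to 8 symbol times, which is 32 ns on an SDR link.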
To confirm the tick setting of a device, users can run the perfquery tool, which displays the PortSamplesControl parameters for the desired port.
For example, querying the device with LID 32, port 5:
$ perfquery -c 32 5
# PortSamplesControl: Lid 32 port 5
OpCode:..........................0xff
PortSelect:......................0
Tick:............................0x07
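When the tick value is needed in a script, the Tick line can be pulled out of captured perfquery output. The sketch below only parses text that has already been collected. The function name is hypothetical, and the dotted "Tick:" layout is assumed to match the sample output shown above.

```python
import re

# Matches a line such as "Tick:............................0x07".
_TICK_RE = re.compile(r"^Tick:\.*\s*(0x[0-9A-Fa-f]+|\d+)\s*$", re.MULTILINE)


def parse_tick(perfquery_output: str) -> int:
    """Extract the Tick value from PortSamplesControl text output.

    Assumes the output format shown in the example above; raises
    ValueError if no Tick line is found.
    """
    match = _TICK_RE.search(perfquery_output)
    if match is None:
        raise ValueError("no Tick line found in perfquery output")
    value = match.group(1)
    # Hex values carry a 0x prefix; anything else is treated as decimal.
    base = 16 if value.lower().startswith("0x") else 10
    return int(value, base)


sample = """# PortSamplesControl: Lid 32 port 5
OpCode:..........................0xff
PortSelect:......................0
Tick:............................0x07
"""
tick = parse_tick(sample)  # tick code 7, i.e. 8x symbol time
```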
The table below shows how the xmit_wait counter increments on different device hardware and how to convert a port's xmit_wait counter to bandwidth loss. This lets users evaluate congestion on the port.
ASIC | HW Behavior | PortSamplesControl indication | Congestion Calculation |
Quantum, Quantum-2, Quantum-3 | Increments xmit_wait every 8 bytes – regardless of the port width. | HW counts 8B per port hence port will report: | Lost bytes per port = Counter*(tick+1) |
SwitchIB-2, Switch-IB, Switch-X | Increments xmit_wait every 32 bytes per lane | HW counts 32B per lane hence port will report: | Lost bytes per port = Counter*(tick+1)*#lanes |
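The two congestion calculations in the table can be sketched as one helper. The function name, the ASIC name strings, and the default lane count of 4 are illustrative assumptions; only the two formulas come from the table.

```python
# ASIC families as grouped in the table above.
PER_PORT_ASICS = {"Quantum", "Quantum-2", "Quantum-3"}
PER_LANE_ASICS = {"SwitchIB-2", "Switch-IB", "Switch-X"}


def lost_bytes_per_port(asic: str, counter: int, tick: int,
                        lanes: int = 4) -> int:
    """Convert an xmit_wait counter value to lost bytes per port.

    counter: the port's xmit_wait value (or the delta between two reads).
    tick:    the Tick value reported by PortSamplesControl.
    lanes:   link width in lanes; only used for the per-lane ASICs.
             The default of 4 is an assumption for illustration.
    """
    if asic in PER_PORT_ASICS:
        # Table: Lost bytes per port = Counter*(tick+1)
        return counter * (tick + 1)
    if asic in PER_LANE_ASICS:
        # Table: Lost bytes per port = Counter*(tick+1)*#lanes
        return counter * (tick + 1) * lanes
    raise ValueError(f"unknown ASIC family: {asic}")
```

With Tick 0x07, an xmit_wait delta of 1000 on a Quantum-2 port corresponds to 8000 lost bytes, while the same delta on a 4-lane Switch-IB port corresponds to 32000 lost bytes.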
To calculate the lost bandwidth rate in Gb/s per port, one can use the following formula: