Mellanox WinOF VPI Documentation v5.50.52000

Tunable Performance Parameters

The following is a list of key parameters for performance tuning.

  • Jumbo Packet: The maximum size of the transfer unit, also known as the Maximum Transmission Unit (MTU). For IPoIB, the MTU should not include the size of the IPoIB header (4B). For example, if the network adapter card supports a 4K MTU, the upper threshold for the payload MTU is 4092B, not 4096B. The MTU of a network can have a substantial impact on performance. A 4K MTU improves performance for short messages, since it allows the OS to coalesce many small messages into a large one.

    • The valid MTU range for an Ethernet driver is 614 to 9614.

    • The valid MTU range for an IPoIB driver is 1500 to 4092.

      Warning

      All devices on the same physical network, or on the same logical network, must have the same MTU.
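The MTU rules above can be checked with a few lines of code. The following is a minimal sketch, not part of the driver: the function names are hypothetical, and it assumes the documented ranges are inclusive at both ends.

```python
# Illustrative helpers only; names are hypothetical and not a driver API.
IPOIB_HEADER_BYTES = 4  # IPoIB header size stated in the text

# Documented MTU ranges, assumed here to be inclusive at both ends.
MTU_RANGES = {
    "ethernet": (614, 9614),
    "ipoib": (1500, 4092),
}


def ipoib_payload_mtu(link_mtu: int) -> int:
    """Payload MTU for IPoIB: the link MTU minus the IPoIB header."""
    return link_mtu - IPOIB_HEADER_BYTES


def is_valid_mtu(driver: str, mtu: int) -> bool:
    """Check an MTU against the documented range for the given driver."""
    low, high = MTU_RANGES[driver]
    return low <= mtu <= high


def mtus_consistent(mtus: list[int]) -> bool:
    """True if every device on the network reports the same MTU."""
    return len(set(mtus)) <= 1


if __name__ == "__main__":
    print(ipoib_payload_mtu(4096))            # 4096 - 4 = 4092
    print(is_valid_mtu("ipoib", 4096))        # above the IPoIB range
    print(mtus_consistent([4092, 4092, 2044]))  # mismatched devices
```

A check like `mtus_consistent` reflects the warning above: one device with a different MTU is enough to make the configuration invalid.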

  • Receive Buffers: The number of receive buffers.

  • Send Buffers: The number of send buffers.

  • Performance Options: Configures parameters that can improve adapter performance.

    • Interrupt Moderation: Moderates or delays the generation of interrupts, which optimizes network throughput and CPU utilization (default Enabled).

      • When interrupt moderation is enabled, the system accumulates interrupts and sends a single interrupt rather than a series of interrupts. With the default settings, an interrupt is generated after 5 packets are received or after 10 usec have elapsed since the first packet was received. This improves performance and reduces CPU load; however, it increases latency.

      • When interrupt moderation is disabled, the system generates an interrupt each time a packet is received or sent. In this mode, CPU utilization increases at higher data rates, as the system handles a larger number of interrupts. However, latency decreases, as each packet is handled faster.
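To make the trade-off concrete, the following is a toy model of receive-side moderation, not the adapter's actual logic. It assumes the documented RX defaults (a packet count of 5 and a packet time of 10 usec) and that the timer starts at the first packet of each batch; the function name and its parameters are hypothetical.

```python
# Toy model of RX interrupt moderation; not the adapter's implementation.
def count_interrupts(arrivals_us, packet_count=5, packet_time_us=10.0,
                     moderation=True):
    """Count interrupts raised for packets arriving at the given times (usec)."""
    if not moderation:
        # Moderation disabled: one interrupt per packet.
        return len(arrivals_us)

    interrupts = 0
    pending = 0          # packets accumulated since the last interrupt
    first_ts = None      # arrival time of the first pending packet
    for ts in sorted(arrivals_us):
        # The timer expired before this packet arrived: flush the batch.
        if pending and ts - first_ts >= packet_time_us:
            interrupts += 1
            pending = 0
        if pending == 0:
            first_ts = ts
        pending += 1
        # The packet-count threshold was reached: flush the batch.
        if pending >= packet_count:
            interrupts += 1
            pending = 0
    if pending:
        # Remaining packets are flushed when their timer eventually expires.
        interrupts += 1
    return interrupts


if __name__ == "__main__":
    burst = [float(t) for t in range(10)]   # 10 packets, 1 usec apart
    print(count_interrupts(burst))                    # moderated
    print(count_interrupts(burst, moderation=False))  # one per packet
```

In this model a dense burst produces far fewer interrupts when moderation is on, while widely spaced packets each wait for the timer, which is where the added latency comes from.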

  • Receive Side Scaling (RSS Mode): Improves incoming packet processing performance. RSS enables the adapter port to utilize the multiple CPUs in a multi-core system for receiving incoming packets and steering them to the designated destination. RSS can significantly improve the number of transactions, the number of connections per second, and the network throughput.
    This parameter can be set to one of the following values:

    • Enabled (default): Sets RSS mode.

    • Disabled: The hardware is configured once to use the Toeplitz hash function, and the indirection table is never changed.

      Warning

      IOAT is not used while in RSS mode.
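The Toeplitz hash and indirection table mentioned above can be sketched in software. The example below is illustrative only: the key, the table size of 128 entries, and the four-CPU layout are arbitrary assumptions, and the helper names are hypothetical. It follows the general RSS scheme of hashing packet header fields and using the low-order bits of the hash to index the indirection table.

```python
# Software sketch of Toeplitz hashing for RSS; key and table are made up.
def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    key_bits = len(key) * 8
    data_bits = len(data) * 8
    if key_bits < data_bits + 31:
        raise ValueError("key is too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for bit_index in range(data_bits):
        byte = data[bit_index // 8]
        if byte & (0x80 >> (bit_index % 8)):
            # XOR in the 32-bit window of the key starting at this bit.
            shift = key_bits - 32 - bit_index
            result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def select_cpu(hash_value: int, indirection_table: list[int]) -> int:
    """Pick a CPU using the low-order bits of the hash as a table index."""
    size = len(indirection_table)
    if size == 0 or size & (size - 1):
        raise ValueError("table size must be a power of two")
    return indirection_table[hash_value & (size - 1)]


# Arbitrary 40-byte example key and a 128-entry table spread over 4 CPUs.
EXAMPLE_KEY = bytes(range(1, 41))
INDIRECTION_TABLE = [i % 4 for i in range(128)]

if __name__ == "__main__":
    # Source IP, destination IP, source port, destination port (12 bytes).
    flow = bytes([66, 9, 149, 187, 161, 142, 100, 80]) \
        + (2794).to_bytes(2, "big") + (1766).to_bytes(2, "big")
    h = toeplitz_hash(EXAMPLE_KEY, flow)
    print(hex(h), select_cpu(h, INDIRECTION_TABLE))
```

Because the hash depends only on the header fields, all packets of one flow land on the same CPU, which preserves in-order processing per connection.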

    • Receive Completion Method: Sets the completion method for received packets, which can affect network throughput and CPU utilization.

      • Polling Method: Increases CPU utilization, as the system polls the receive rings for incoming packets. However, it may increase network performance, as incoming packets are handled faster.

      • Interrupt Method: Optimizes CPU utilization, as it uses interrupts for handling incoming messages. However, in certain scenarios it can decrease network throughput.

      • Adaptive (Default Settings): Dynamically combines the interrupt and polling methods, depending on traffic type and network usage. Choosing a different setting may improve network and/or system performance in certain configurations.

  • Interrupt Moderation RX Packet Count: Number of packets that need to be received before an interrupt is generated on the receive side (default 5).

  • Interrupt Moderation RX Packet Time: Maximum elapsed time (in usec) between the receiving of a packet and the generation of an interrupt, even if the moderation count has not been reached (default 10).

  • Rx Interrupt Moderation Type: Sets the rate at which the controller moderates or delays the generation of interrupts, making it possible to optimize network throughput and CPU utilization. The default setting (Adaptive) adjusts the interrupt rates dynamically depending on the traffic type and network usage. Choosing a different setting may improve network and system performance in certain configurations.

  • Send Completion Method: Sets the completion method for sent packets, which may affect network throughput and CPU utilization.

  • Interrupt Moderation TX Packet Count: Number of packets that need to be sent before an interrupt is generated on the send side (default 0).

  • Interrupt Moderation TX Packet Time: Maximum elapsed time (in usec) between the sending of a packet and the generation of an interrupt even if the moderation count has not been reached (default 0).

  • Offload Options: Allows you to specify which TCP/IP offload settings are handled by the adapter rather than the operating system.
    Enabling offloading services increases transmission performance, as the offload tasks are performed by the adapter hardware rather than the operating system, thus freeing CPU resources to work on other tasks.

    • IPv4 Checksums Offload: Enables the adapter to compute IPv4 checksum upon transmit and/or receive instead of the CPU (default Enabled).

    • TCP/UDP Checksum Offload for IPv4 packets: Enables the adapter to compute TCP/UDP checksum over IPv4 packets upon transmit and/or receive instead of the CPU (default Enabled).

    • TCP/UDP Checksum Offload for IPv6 packets: Enables the adapter to compute TCP/UDP checksum over IPv6 packets upon transmit and/or receive instead of the CPU (default Enabled).

    • Large Send Offload (LSO): Allows the TCP stack to build a TCP message up to 64KB long and send it in one call down the stack. The adapter then re-segments the message into multiple TCP packets for transmission on the wire, with each packet sized according to the MTU. This option offloads a large amount of kernel processing time from the host CPU to the adapter.
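The re-segmentation step of LSO can be illustrated with simple arithmetic. The sketch below is not adapter code: the function name is hypothetical, `mtu` is treated as the IP-layer MTU, and the 40-byte default assumes IPv4 and TCP headers without options.

```python
# Illustrative arithmetic only; the adapter performs this in hardware.
def lso_segment_sizes(message_len: int, mtu: int, header_len: int = 40) -> list[int]:
    """Split a large TCP message into per-packet payload sizes.

    header_len defaults to 40 bytes (IPv4 + TCP, no options), an assumption.
    """
    mss = mtu - header_len  # payload bytes that fit in one packet
    if mss <= 0:
        raise ValueError("MTU is too small for the given header length")
    full_segments, remainder = divmod(message_len, mss)
    sizes = [mss] * full_segments
    if remainder:
        sizes.append(remainder)  # final, shorter segment
    return sizes


if __name__ == "__main__":
    sizes = lso_segment_sizes(64 * 1024, mtu=1500)
    print(len(sizes), sizes[0], sizes[-1])
```

With a 64KB message and a 1500-byte MTU, the host hands down one buffer and the adapter emits dozens of packets, which is the per-packet work that LSO removes from the CPU.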

  • IB Options: Configures parameters related to InfiniBand functionality.

    • SA Query Retry Count: Sets the number of SA query retries once a query fails. The valid values are 1 - 64 (default 10).

    • SA Query Timeout: Sets the waiting timeout (in milliseconds) of an SA query completion. The valid values are 500 - 60000 (default 1000 ms).
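How the two SA query parameters interact can be sketched as a retry loop. This is a hypothetical model, not the driver's implementation: `query` stands in for any callable that raises `TimeoutError` when no response arrives in time, and the sketch assumes one initial attempt followed by up to `retry_count` retries.

```python
# Hypothetical retry model for SA queries; not the driver's code.
from typing import Callable, TypeVar

T = TypeVar("T")

RETRY_COUNT_RANGE = (1, 64)       # documented valid values
TIMEOUT_MS_RANGE = (500, 60000)   # documented valid values


def sa_query_with_retries(query: Callable[[int], T],
                          retry_count: int = 10,
                          timeout_ms: int = 1000) -> T:
    """Run query(timeout_ms), retrying on timeout up to retry_count times."""
    if not RETRY_COUNT_RANGE[0] <= retry_count <= RETRY_COUNT_RANGE[1]:
        raise ValueError("retry_count must be between 1 and 64")
    if not TIMEOUT_MS_RANGE[0] <= timeout_ms <= TIMEOUT_MS_RANGE[1]:
        raise ValueError("timeout_ms must be between 500 and 60000")

    last_error = None
    # Assumption: one initial attempt plus retry_count retries.
    for _ in range(1 + retry_count):
        try:
            return query(timeout_ms)
        except TimeoutError as exc:
            last_error = exc
    raise TimeoutError(
        f"SA query failed after {1 + retry_count} attempts"
    ) from last_error
```

Under these assumptions, the worst-case wait before a query is reported as failed is roughly `(1 + retry_count) * timeout_ms`, or about 11 seconds with the defaults.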

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.