Date: 2023-10-23

The IP over IB (IPoIB) ULP driver is a network interface implementation over InfiniBand. IPoIB encapsulates IP datagrams over an InfiniBand Connected or Datagram transport service. The IPoIB driver, ib_ipoib, exploits the following capabilities:

VLAN simulation over an InfiniBand network via child interfaces

High Availability via Bonding

Various MTU values: up to 4K in Datagram mode and up to 64K in Connected mode

Uses any ConnectX® IB ports (one or two)

Inserts IP/UDP/TCP checksum on outgoing packets

Calculates checksum on received packets

Supports net device TSO through the ConnectX® LSO capability to segment large datagrams into MTU-sized quanta

Dual operation mode - datagram and connected

Large MTU support through connected mode

IPoIB also supports the following software based enhancements:

Generic Receive Offload (GRO)

NAPI

Ethtool support

Warning This feature is supported only on ConnectX-4 adapter cards and in IPoIB ULP mode.

The eth_ipoib driver provides a standard Ethernet interface to be used as a Physical Interface (PIF) into the Hypervisor virtual network, and serves one or more Virtual Interfaces (VIF). This driver supports L2 Switching (Direct Bridging) as well as other L3 Switching modes (e.g. NAT). This document explains the configuration and driver behavior when configured in Bridging mode.

In a virtualization environment, a virtual machine can be exposed to the physical network by performing the following steps:

Create a virtual bridge

Attach the para-virtualized interface created by the eth_ipoib driver to the bridge

Attach the Ethernet interface in the Virtual Machine to that bridge

The diagram below describes the topology that was created after these steps:



The diagram shows how traffic from the Virtual Machine goes to the virtual bridge in the Hypervisor, and from the bridge to the eIPoIB interface. The eIPoIB interface is the Ethernet interface that enslaves the IPoIB interfaces in order to pass packets between the Ethernet interface in the Virtual Machine and the IB fabric beneath.

Warning You must switch to ULP mode in order to operate the eIPoIB driver. This can be done by setting the ib_ipoib module parameter "ipoib_enhanced" to 0.

Once the mlnx_ofed driver installation is completed, perform the following:

Open the /etc/infiniband/openib.conf file and include:

    LOAD_EIPOIB=yes

Restart the InfiniBand drivers:

    /etc/init.d/openibd restart
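The configuration edit above can also be scripted. The following is a minimal sketch, assuming the file uses plain KEY=value lines; enable_eipoib is a hypothetical helper name, not something shipped with the driver package. It removes any existing LOAD_EIPOIB assignment and appends the desired one, so running it twice leaves a single entry.

```shell
# Sketch only: enable_eipoib is a hypothetical helper, not part of MLNX_OFED.
# It assumes the config file consists of plain KEY=value lines.
enable_eipoib() {
    conf="$1"
    tmp="${conf}.tmp.$$"
    # Copy every line except an existing LOAD_EIPOIB assignment.
    # grep exits non-zero when nothing is copied, which is not an error here.
    grep -v '^[[:space:]]*LOAD_EIPOIB=' "$conf" > "$tmp" || true
    # Append the setting described in the text.
    printf 'LOAD_EIPOIB=yes\n' >> "$tmp"
    # Write back in place so the original file's ownership and mode are kept.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Run as root, this would be called as `enable_eipoib /etc/infiniband/openib.conf`, followed by the driver restart.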

When eth_ipoib is loaded, a number of eIPoIB interfaces are created, with the following default naming scheme: ethX, where X represents the ETH port available on the system.

To check which eIPoIB interfaces were created:

    cat /sys/class/net/eth_ipoib_interfaces

For example, on a system with a dual-port HCA, the following two interfaces might be created: eth4 and eth5.

    cat /sys/class/net/eth_ipoib_interfaces
    eth4 over IB port: ib0
    eth5 over IB port: ib1

These interfaces can be used to configure the network for the guest. For example, if the guest has a VIF that is connected to the Virtual Bridge br0, then enslave the eIPoIB interface to br0 by running:

    brctl addif br0 ethX

Warning In a RHEL KVM environment, there are other methods to create/configure your virtual network (e.g. macvtap). For additional information, please refer to the Red Hat User Manual.

The IPoIB daemon (ipoibd) detects the new virtual interface that is attached to the same bridge as the eIPoIB interface and creates a new IPoIB instance for it in order to send/receive data. As a result, a number of IPoIB interfaces (ibX.Y) are shown as being created/destroyed, and are enslaved to the corresponding ethX interface to serve any active VIF in the system according to the set configuration. This process is done automatically by the ipoibd service.

To see the list of IPoIB interfaces enslaved under an eth_ipoib interface:

    cat /sys/class/net/ethX/eth/vifs

For example:

    # cat /sys/class/net/eth5/eth/vifs
    SLAVE=ib0.1 MAC=9a:c2:1f:d7:3b:63 VLAN=N/A
    SLAVE=ib0.2 MAC=52:54:00:60:55:88 VLAN=N/A
    SLAVE=ib0.3 MAC=52:54:00:60:55:89 VLAN=N/A

Each ethX interface has at least one ibX.Y slave to serve the PIF itself. In the VIFs list of ethX you will notice that ibX.1 is always created to serve applications running from the Hypervisor on top of the ethX interface directly.

For InfiniBand applications that require native IPoIB interfaces (e.g. CMA), the original IPoIB interfaces ibX can still be used. For example, CMA and ethX drivers can co-exist and make use of IPoIB ports: CMA can use ib0, while the eth0.ipoib interface uses ibX.Y interfaces.

To see the list of eIPoIB interfaces:

    cat /sys/class/net/eth_ipoib_interfaces

For example:

    # cat /sys/class/net/eth_ipoib_interfaces
    eth4 over IB port: ib0
    eth5 over IB port: ib1

The example above shows two eIPoIB interfaces, where eth4 runs traffic over ib0, and eth5 runs traffic over ib1.
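When scripting against this file, it can be handy to look up which eIPoIB interface sits on a given IB port. The sketch below assumes every line has exactly the "ethX over IB port: ibY" form shown in the example; eth_for_ib_port is a hypothetical helper name used only for illustration.

```shell
# Sketch only: eth_for_ib_port is a hypothetical helper.
# Reads lines of the form "ethX over IB port: ibY" on stdin and prints
# the ethX name whose IB port equals the first argument.
eth_for_ib_port() {
    want="$1"
    # Fields: <eth> over IB port: <ib>
    while read -r eth w1 w2 w3 ib; do
        if [ "$ib" = "$want" ]; then
            printf '%s\n' "$eth"
        fi
    done
}
```

On a live system this would be used as `eth_for_ib_port ib1 < /sys/class/net/eth_ipoib_interfaces`. It prints nothing when no interface is bound to the requested port.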

Virtual Network Example

This example shows a few IPoIB instances that serve the virtual interfaces of the Virtual Machines.

To display the services provided to the Virtual Machine interfaces:

    cat /sys/class/net/eth0/eth/vifs

Example:

    cat /sys/class/net/eth0/eth/vifs
    SLAVE=ib0.2 MAC=52:54:00:60:55:88 VLAN=N/A

In the example above the ib0.2 IPoIB interface serves the MAC 52:54:00:60:55:88 with no VLAN tag for that interface.
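The same lookup can be done from a script. The following sketch assumes each line of the vifs file carries three whitespace-separated KEY=value fields in the order SLAVE, MAC, VLAN, as in the example; slave_for_mac is a hypothetical helper name. The MAC comparison is case-sensitive.

```shell
# Sketch only: slave_for_mac is a hypothetical helper.
# Reads "SLAVE=... MAC=... VLAN=..." lines on stdin and prints the slave
# interface and VLAN value for the MAC address given as the first argument.
slave_for_mac() {
    want="$1"
    while read -r f_slave f_mac f_vlan; do
        # Strip the "KEY=" prefix from each field before comparing.
        if [ "${f_mac#MAC=}" = "$want" ]; then
            printf '%s vlan=%s\n' "${f_slave#SLAVE=}" "${f_vlan#VLAN=}"
        fi
    done
}
```

For example, `slave_for_mac 52:54:00:60:55:88 < /sys/class/net/eth0/eth/vifs` would report the ib0.2 slave from the listing above.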

The eIPoIB driver supports VLAN Switch Tagging (VST) mode, in which the virtual machine interface carries no VLAN tag and VLAN tagging is handled by the Hypervisor.

To attach a Virtual Machine interface to a specific isolated tag:

Verify the VLAN tag to be used has the same pkey value that is already configured on that ib port.

    cat /sys/class/infiniband/mlx4_0/ports/<ib port>/pkeys/*

Create a VLAN interface in the Hypervisor, over the eIPoIB interface.

    vconfig add <eIPoIB interface> <vlan tag>

Attach the new VLAN interface to the same bridge that the virtual machine interface is already attached to.

    brctl addif <br-name> <interface-name>

For example, to create the VLAN tag 3 with pkey 0x8003 over that port in the eIPoIB interface eth4, run:

    # vconfig add eth4 3
    # brctl addif br2 eth4.3
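The pkey verification step can be sketched as a small check. This assumes the mapping suggested by the example above (VLAN tag 3 corresponds to pkey 0x8003, that is, the tag with the full-membership bit 0x8000 set); that mapping is inferred from the example rather than stated as a general rule, and both function names are hypothetical.

```shell
# Sketch only: both helpers are hypothetical.
# ASSUMPTION: pkey = 0x8000 | VLAN tag, inferred from the example
# "VLAN tag 3 with pkey 0x8003".
vlan_to_pkey() {
    printf '0x%04x\n' $(( 0x8000 | $1 ))
}

# Succeeds if the pkey derived from the VLAN tag ($1) appears, one value
# per line, in the pkey list read from stdin.
pkey_configured() {
    want=$(vlan_to_pkey "$1")
    # -x matches whole lines only, -i ignores hex digit case.
    grep -qix "$want"
}
```

With the pkey files from the first step, the check would read `cat /sys/class/infiniband/mlx4_0/ports/<ib port>/pkeys/* | pkey_configured 3`, with the exit status indicating whether the pkey was found.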





Use 4K MTU over OpenSM.

    Default=0xffff, ipoib, mtu=5 : ALL=full;

Use an MTU of 4K (4092 bytes). In UD mode, the maximum MTU value is 4092 bytes.

Make sure all interfaces (including the guest interface and its virtual bridge) have the same MTU value (MTU 4092 Bytes).
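A quick way to confirm that every interface in the path reports the same MTU is to read the per-interface mtu file under /sys/class/net. The sketch below is one possible check. check_mtu is a hypothetical helper name, and the SYSFS_NET variable is an assumption introduced here so the function can be pointed at a different directory.

```shell
# Sketch only: check_mtu is a hypothetical helper.
# Usage: check_mtu <expected-mtu> <interface>...
# SYSFS_NET is an assumed override for the sysfs network directory.
check_mtu() {
    expected="$1"; shift
    root="${SYSFS_NET:-/sys/class/net}"
    rc=0
    for ifc in "$@"; do
        # A missing interface is reported as "unknown" instead of aborting.
        mtu=$(cat "$root/$ifc/mtu" 2>/dev/null) || mtu="unknown"
        if [ "$mtu" != "$expected" ]; then
            printf '%s: MTU %s, expected %s\n' "$ifc" "$mtu" "$expected" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For instance, `check_mtu 4092 eth4 br0` returns non-zero and names the offending interface if either one differs; the interface names are only examples.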

For further information on MTU settings, please refer to the Hypervisor User Manual.



Tune the TCP/IP stack using sysctl (dom0/domu):

    /sbin/sysctl_perf_tuning