NVIDIA MLNX_EN Documentation Rev 4.9-5.1.0.0 LTS

Introduction

This manual is intended for system administrators responsible for the installation, configuration, management and maintenance of the software and hardware of Ethernet adapter cards. It is also intended for application developers.

This document provides instructions on how to install the driver on NVIDIA ConnectX® network adapter solutions supporting the following uplinks to servers.

| Uplink/NICs | Driver Name | Uplink Speed |
|---|---|---|
| ConnectX-3/ConnectX-3 Pro | mlx4 | 10GbE, 40GbE and 56GbE (1) |
| ConnectX-4 | mlx5 | Ethernet: 1GbE, 10GbE, 25GbE, 40GbE, 50GbE, 56GbE (1), and 100GbE |
| ConnectX-4 Lx | mlx5 | Ethernet: 1GbE, 10GbE, 25GbE, 40GbE, and 50GbE |
| ConnectX-5/ConnectX-5 Ex | mlx5 | Ethernet: 1GbE, 10GbE, 25GbE, 40GbE, 50GbE, and 100GbE |
| ConnectX-6 | mlx5 | Ethernet: 10GbE, 25GbE, 40GbE, 50GbE (2), 100GbE (2), 200GbE (2) |
| ConnectX-6 Dx | mlx5 | Ethernet: 1GbE, 10GbE, 25GbE, 40GbE, 50GbE (1), 100GbE (1), 200GbE (2) |
| Innova™ IPsec EN | mlx5 | Ethernet: 10GbE, 40GbE |

  1. 56GbE is an NVIDIA proprietary link speed. It can be achieved when connecting an NVIDIA adapter card to an NVIDIA SX10XX series switch, or when connecting an NVIDIA adapter card to another NVIDIA adapter card.

  2. Supports both NRZ and PAM4 modes.

The MLNX_EN driver release exposes the following capabilities:

  • Single/Dual port

  • Multiple Rx and Tx queues

  • Rx steering mode: Receive Core Affinity (RCA)

  • MSI-X or INTx

  • Adaptive interrupt moderation

  • HW Tx/Rx checksum calculation

  • Large Send Offload (i.e., TCP Segmentation Offload)

  • Large Receive Offload

  • Multi-core NAPI support

  • VLAN Tx/Rx acceleration (HW VLAN stripping/insertion)

  • Ethtool support

  • Net device statistics

  • SR-IOV support

  • Flow steering

  • Ethernet Time Stamping

Package Images

MLNX_EN is provided as an ISO image or as a tarball per Linux distribution and CPU architecture that includes source code and binary RPMs, firmware and utilities. The ISO image contains an installation script (called install) that performs the necessary steps to accomplish the following:

  • Discover the currently installed kernel

  • Uninstall any previously installed MLNX_OFED/MLNX_EN packages

  • Install the MLNX_EN binary RPMs (if they are available for the current kernel)

  • Identify the currently installed HCAs and perform the required firmware updates

Software Components

MLNX_EN contains the following software components:

| Components | Description |
|---|---|
| mlx5 driver | mlx5 is the low level driver implementation for the ConnectX®-4 adapters designed by Mellanox Technologies. ConnectX®-4 operates as a VPI adapter. |
| mlx5_core | Acts as a library of common functions (e.g. initializing the device after reset) required by the ConnectX®-4 adapter cards. |
| mlx4 driver | mlx4 is the low level driver implementation for the ConnectX adapters designed by Mellanox Technologies. The ConnectX can operate as an InfiniBand adapter and as an Ethernet NIC. To accommodate the two flavors, the driver is split into modules: mlx4_core, mlx4_en, and mlx4_ib. Note: mlx4_ib is not part of this package. |
| mlx4_core | Handles low-level functions like device initialization and firmware commands processing. Also controls resource allocation so that the InfiniBand, Ethernet and FC functions can share a device without interfering with each other. |
| mlx4_en | Handles Ethernet specific functions and plugs into the netdev mid-layer. |
| mstflint | An application to burn a firmware binary image. |
| Software modules | Source code for all software modules (for use under conditions mentioned in the modules' LICENSE files) |

Firmware

The image includes the following firmware item:

  • Firmware images (.bin format wrapped in the mlxfwmanager tool) for ConnectX®-2/ConnectX®-3/ConnectX®-3 Pro/ConnectX®-4 and ConnectX®-4 Lx network adapters

Directory Structure

The tarball image of MLNX_EN contains the following files and directories:

  • install - the MLNX_EN installation script

  • uninstall.sh - the MLNX_EN un-installation script

  • RPMS/ - directory of binary RPMs for a specific CPU architecture

  • src/ - directory of the OFED source tarball

  • mlnx_add_kernel_support.sh - a script required to rebuild MLNX_EN for customized kernel version on supported Linux distribution

mlx4 VPI Driver

mlx4 is the low-level driver implementation for the ConnectX® family adapters designed by Mellanox Technologies. ConnectX®-3 adapters can operate as an InfiniBand adapter or as an Ethernet NIC. To accommodate the supported configurations, the driver is split into the following modules:

mlx4_core

Handles low-level functions like device initialization and firmware commands processing. Also controls resource allocation so that the Ethernet functions can share the device without interfering with each other.

mlx4_en

A 10/40GigE driver under drivers/net/ethernet/mellanox/mlx4 that handles Ethernet specific functions and plugs into the netdev mid layer.

mlx5 Driver

mlx5 is the low-level driver implementation for the ConnectX®-4 adapters designed by Mellanox Technologies. ConnectX®-4 operates as a VPI adapter. The mlx5 driver is comprised of the following kernel module:

mlx5_core

Acts as a library of common functions (e.g. initializing the device after reset) required by ConnectX®-4 adapter cards. The mlx5_core driver also implements the Ethernet interfaces for ConnectX®-4. Unlike mlx4_en/core, mlx5 drivers do not require a separate mlx5_en module, as the Ethernet functionality is built into the mlx5_core module.

Unsupported Features in MLNX_EN

  • InfiniBand protocol

  • Remote Direct Memory Access (RDMA)

  • Storage protocols that use RDMA, such as:

    • iSCSI Extensions for RDMA (iSER)

    • SCSI RDMA Protocol (SRP)

    • Sockets Direct Protocol (SDP)

mlx4 Module Parameters

In order to set mlx4 parameters, add the following line(s) to /etc/modprobe.d/mlx4.conf:

options mlx4_core parameter=<value>

and/or

options mlx4_en parameter=<value>
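As a rough illustration of the syntax above, the following Python sketch renders `options` lines for a modprobe.d file. The helper name `render_modprobe_options` and the chosen values are hypothetical; only the parameter names (debug_level, msi_x, inline_thold) come from this document.

```python
def render_modprobe_options(module, params):
    """Return one 'options' line for a modprobe.d file."""
    if not params:
        raise ValueError("at least one parameter is required")
    # Several parameters may share a single options line.
    pairs = " ".join(f"{name}={value}" for name, value in params.items())
    return f"options {module} {pairs}"


# Illustrative values only; choose settings that suit your system.
lines = [
    render_modprobe_options("mlx4_core", {"debug_level": 1, "msi_x": 1}),
    render_modprobe_options("mlx4_en", {"inline_thold": 104}),
]
print("\n".join(lines))
```

The printed lines could then be placed in /etc/modprobe.d/mlx4.conf. Module parameters are read when the module loads, so a driver reload is normally needed before new values apply.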

The following sections list the available mlx4 parameters.

mlx4_core Parameters

debug_level

Enable debug tracing if > 0 (int)

msi_x

0 - don't use MSI-X,

1 - use MSI-X,

>1 - limit number of MSI-X irqs to msi_x (non-SRIOV only) (int)

enable_sys_tune

Tune the CPUs for better performance (default 0) (int)

block_loopback

Block multicast loopback packets if > 0 (default: 1) (int)

num_vfs

Either a single value (e.g. '5') to define uniform num_vfs value for all devices functions or a string to map device function numbers to their num_vfs values (e.g. '0000:04:00.0-5,002b:1c:0b.a-15').

Hexadecimal digits for the device function (e.g. 002b:1c:0b.a) and decimal for num_vfs value (e.g. 15). (string)

probe_vf

Either a single value (e.g. '3') to indicate that the Hypervisor driver itself should activate this number of VFs for each HCA on the host, or a string to map device function numbers to their probe_vf values (e.g. '0000:04:00.0-3,002b:1c:0b.a-13').

Hexadecimal digits for the device function (e.g. 002b:1c:0b.a) and decimal for probe_vf value (e.g. 13). (string)
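The two string forms accepted by num_vfs and probe_vf can be illustrated with a small parser. This is a sketch, not the driver's own parsing logic: the function name `parse_vf_map` is hypothetical, and it handles only the uniform form and the per-device form described above.

```python
def parse_vf_map(value):
    """Parse a num_vfs or probe_vf string.

    Returns an int for the uniform form (e.g. '5'), or a dict that maps
    device functions to counts for the per-device form.
    """
    if "-" not in value:
        return int(value)
    result = {}
    for entry in value.split(","):
        # The device function is hexadecimal text; the count is decimal.
        function, count = entry.rsplit("-", 1)
        result[function] = int(count)
    return result


print(parse_vf_map("5"))
print(parse_vf_map("0000:04:00.0-5,002b:1c:0b.a-15"))
```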

log_num_mgm_entry_size

Log MGM size, which defines the number of QPs per MCG. For example, 10 gives 248. Range: 7 <= log_num_mgm_entry_size <= 12. To activate device-managed flow steering when available, set to -1 (int)
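The "10 gives 248" example can be reproduced with the sketch below. The formula is an assumption based on the MGM entry layout in the upstream mlx4 driver (a 32-byte header followed by 4-byte QP entries); it is not stated in this document. It covers only the 7 to 12 range, not the negative values used for flow steering modes.

```python
def qps_per_mcg(log_num_mgm_entry_size):
    """Estimate the number of QPs per multicast group."""
    if not 7 <= log_num_mgm_entry_size <= 12:
        raise ValueError("log_num_mgm_entry_size must be in the range 7..12")
    entry_bytes = 1 << log_num_mgm_entry_size
    # Assumed layout: the entry is counted in 16-byte units, two units
    # hold the header, and each remaining unit holds four QP numbers.
    return 4 * (entry_bytes // 16 - 2)


for log_size in range(7, 13):
    print(log_size, qps_per_mcg(log_size))
```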

high_rate_steer

Enable steering mode for higher packet rate (obsolete, set "Enable optimized steering" option in log_num_mgm_entry_size to use this mode). (int)

fast_drop

Enable fast packet drop when no receive WQEs are posted (int)

enable_64b_cqe_eqe

Enable 64 byte CQEs/EQEs when the FW supports this if non-zero (default: 1) (int)

log_num_mac

Log2 max number of MACs per ETH port (1-7) (int)

log_num_vlan

(Obsolete) Log2 max number of VLANs per ETH port (0-7) (int)

log_mtts_per_seg

Log2 number of MTT entries per segment (0-7) (default: 0) (int)

port_type_array

Either pair of values (e.g. '1,2') to define uniform port1/port2 types configuration for all devices functions or a string to map device function numbers to their pair of port types values (e.g. '0000:04:00.0-1;2,002b:1c:0b.a-1;1').

Valid port types: 1-ib, 2-eth, 3-auto, 4-N/A

If only a single port is available, use the N/A port type for port2 (e.g '1,4').
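A short sketch of how the two port_type_array forms could be decoded. The function names are hypothetical and the parsing is a simplification for illustration; the port type codes are the ones listed above.

```python
PORT_TYPES = {1: "ib", 2: "eth", 3: "auto", 4: "N/A"}


def _decode_pair(text, separator):
    """Turn '1,2' or '1;2' into a (port1, port2) tuple of type names."""
    port1, port2 = (int(part) for part in text.split(separator))
    return PORT_TYPES[port1], PORT_TYPES[port2]


def parse_port_type_array(value):
    """Return a tuple for the uniform form or a dict for the per-device form."""
    if "-" not in value:
        return _decode_pair(value, ",")
    result = {}
    for entry in value.split(","):
        # In the per-device form the two port types are separated by ';'.
        function, types = entry.rsplit("-", 1)
        result[function] = _decode_pair(types, ";")
    return result


print(parse_port_type_array("1,4"))
print(parse_port_type_array("0000:04:00.0-1;2,002b:1c:0b.a-1;1"))
```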

log_num_qp

log maximum number of QPs per HCA (default: 19) (int)

log_num_srq

log maximum number of SRQs per HCA (default: 16) (int)

log_rdmarc_per_qp

log number of RDMARC buffers per QP (default: 4) (int)

log_num_cq

log maximum number of CQs per HCA (default: 16) (int)

log_num_mcg

log maximum number of multicast groups per HCA (default: 13) (int)

log_num_mpt

log maximum number of memory protection table entries per HCA (default: 19) (int)

log_num_mtt

log maximum number of memory translation table segments per HCA (default: max(20, 2 * the number of MTTs needed to register all of the host memory), limited to 30) (int)

enable_qos

Enable Quality of Service support in the HCA (default: off) (bool)

internal_err_reset

Reset device on internal errors if non-zero (default is 1) (int)

ingress_parser_mode

Mode of ingress parser for ConnectX-3 Pro. 0 - standard. 1 - checksum for non TCP/UDP. (default: standard) (int)

roce_mode

Set RoCE modes supported by the port

ud_gid_type

Set gid type for UD QPs

log_num_mgm_entry_size

Log MGM size, which defines the number of QPs per MCG. For example, 10 gives 248. Range: 7 <= log_num_mgm_entry_size <= 12 (default = -10).

use_prio

Enable steering by VLAN priority on ETH ports (deprecated) (bool)

enable_vfs_qos

Enable Virtual VFs QoS (default: off) (bool)

mlx4_en_only_mode

Load in Ethernet only mode (int)

enable_4k_uar

Enable using 4K UAR. Should not be enabled if there are VFs that do not support 4K UARs (default: true) (bool)

rr_proto

IP next protocol for RoCEv1.5 or destination port for RoCEv2. Setting 0 means using driver default values (deprecated) (int)

mlx4_en Parameters

inline_thold

The threshold for using inline data (int)

The default and maximum value is 104 bytes. Packets smaller than the threshold are copied directly to the hardware buffer, which saves a PCI read transaction. (range: 17-104)

udp_rss

Enable RSS for incoming UDP traffic (uint)

On by default. Once disabled no RSS for incoming UDP traffic will be done.

pfctx

Priority-based Flow Control policy on TX[7:0]. Per priority bit mask (uint)

pfcrx

Priority-based Flow Control policy on RX[7:0]. Per priority bit mask (uint)
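The per-priority bit mask can be built as in the sketch below. It assumes that bit N of the mask corresponds to priority N, which is how the [7:0] notation above is read here; the helper name `pfc_mask` is hypothetical.

```python
def pfc_mask(priorities):
    """Build an 8-bit mask with one bit set per enabled priority (0-7)."""
    mask = 0
    for priority in priorities:
        if not 0 <= priority <= 7:
            raise ValueError(f"priority out of range: {priority}")
        mask |= 1 << priority
    return mask


# Example: enable PFC on priorities 3 and 4 in both directions.
mask = pfc_mask([3, 4])
print(f"options mlx4_en pfctx={mask:#x} pfcrx={mask:#x}")
```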

udev_dev_port_dev_id

Work with dev_id or dev_port when supported by the kernel. Range: 0 <= udev_dev_port_dev_id <= 2 (default = 0).

• 0: Work with dev_port if supported by the kernel, otherwise work with dev_id.

• 1: Work only with dev_id regardless of dev_port support.

• 2: Work with both of dev_id and dev_port (if dev_port is supported by the kernel). (int)

mlx5_core Module Parameters

The mlx5_core module supports the following parameters. The first, prof_sel, selects the profile that defines the number of resources supported.

prof_sel

The parameter name for selecting the profile. The supported values for profiles are:

  • 0 - for medium resources, medium performance

  • 1 - for low resources

  • 2 - for high performance (default) (int)

guids

(charp)

node_guid

guids configuration. This module parameter will be obsolete!

debug_mask

debug_mask: 1 = dump cmd data, 2 = dump cmd exec time, 3 = both. Default=0 (uint)

probe_vf

probe VFs or not, 0 = not probe, 1 = probe. Default = 1 (bool)

num_of_groups

Controls the number of large groups in the FDB flow table.

Default=4; Range=1-1024
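To check which values a loaded module is actually using, the parameters can usually be read back from sysfs. The sketch below assumes the standard Linux layout /sys/module/<module>/parameters/; not every parameter is necessarily exposed there, and the function name `read_module_params` is hypothetical.

```python
from pathlib import Path


def read_module_params(module, sysfs_root="/sys/module"):
    """Return {parameter: value} for a loaded kernel module."""
    params_dir = Path(sysfs_root) / module / "parameters"
    values = {}
    if not params_dir.is_dir():
        # Module not loaded, or it exposes no parameters.
        return values
    for entry in sorted(params_dir.iterdir()):
        try:
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters are not readable by unprivileged users.
            continue
    return values


for name, value in read_module_params("mlx5_core").items():
    print(f"{name} = {value}")
```

The same call with "mlx4_core" or "mlx4_en" would list the mlx4 parameters described earlier.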

The following parameters, supported in mlx4 driver only, can be changed using the Devlink user interface:

| Parameter | Description | Parameter Type |
|---|---|---|
| internal_error_reset | Enables resetting the device on internal errors | Generic |
| max_macs | Max number of MACs per ETH port | Generic |
| region_snapshot_enable | Enables capturing region snapshots | Generic |
| enable_64b_cqe_eqe | Enables 64 byte CQEs/EQEs when supported by FW | Driver-specific |
| enable_4k_uar | Enables using 4K UAR | Driver-specific |
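As an illustration, the sketch below assembles a `devlink dev param set` command line for one of these parameters. The PCI address and the value 64 are placeholders, and the default `driverinit` configuration mode is an assumption; check `devlink dev param show` on the target system for the modes each parameter actually supports.

```python
import shlex


def devlink_param_set(device, name, value, cmode="driverinit"):
    """Build the argument list for 'devlink dev param set'."""
    return [
        "devlink", "dev", "param", "set", device,
        "name", name, "value", str(value), "cmode", cmode,
    ]


# Hypothetical device address; replace it with the adapter's own.
cmd = devlink_param_set("pci/0000:04:00.0", "max_macs", 64)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a host where the devlink tool is installed and the caller has sufficient privileges.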

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.