40.44.0208

General This is the initial firmware release of NVIDIA® ConnectX®-8 SuperNIC. ConnectX-8 has the same feature set as the ConnectX-7 adapter card. For the list of ConnectX-7 firmware features, see the ConnectX-7 Firmware Release Notes. The features described here are new and are in addition to the ConnectX-7 set.

Link Speed NVIDIA® ConnectX®-8 SuperNIC supports 800Gb/s or XDR IB or 2 x 400GbE link speeds. Note: 800GbE link speed is not supported on a single port.

Planarized Topology Network ConnectX®-8 SuperNIC uses a planarized topology network to reach Extended Data Rate (XDR) performance.

Direct NIC-GPU Datapath A direct NIC-GPU datapath is required to read/write data directly from the GPU and to overcome the Grace CPU PCIe bandwidth issue. To provide it, the HCA exposes a side DMA engine as an additional PCIe function called “Data Direct”. This additional DMA engine allows a vHCA to access data buffers through it using an MKEY, providing multiple PCIe data path interfaces. Such behavior is needed when different memory regions require different PCIe data paths, e.g., in NUMA (Non-Uniform Memory Access) systems. A vHCA is allowed to use a Data Direct function if HCA_CAP.data_direct is set. To use the Data Direct interface, the vHCA should create an MKEY with the data_direct bit set. The returned MKEY enables access through the side DMA engine. The MKEY access mode must be PA. It supports only the following fields: a, rw, rr, lw, lr, relaxed_ordering_write, relaxed_ordering_read, mkey[7:0], length64, pd, start_addr, len. All other fields are reserved.
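
The Data Direct rules above (capability bit set, data_direct bit set, PA access mode, only the listed fields in use) can be expressed as a small pre-flight check. The following is a minimal sketch, not the PRM interface: the field names are taken from this note, while the dictionary layout, the "access_mode" key, and the function name check_data_direct_mkey are assumptions made only for illustration.

```python
# Illustrative only: the dict-based "context" and the function name are
# hypothetical; only the field names come from the release note.
DATA_DIRECT_FIELDS = frozenset({
    "a", "rw", "rr", "lw", "lr",
    "relaxed_ordering_write", "relaxed_ordering_read",
    "mkey[7:0]", "length64", "pd", "start_addr", "len",
})

# Keys this sketch uses to carry the mode selection itself (assumed names).
CONTROL_KEYS = frozenset({"data_direct", "access_mode"})


def check_data_direct_mkey(hca_cap, mkey_ctx):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if not hca_cap.get("data_direct"):
        problems.append("HCA_CAP.data_direct is not set")
    if not mkey_ctx.get("data_direct"):
        problems.append("data_direct bit is not set in the MKEY")
    if mkey_ctx.get("access_mode") != "PA":
        problems.append("access mode must be PA")
    # Any other field is reserved for Data Direct MKEYs and must stay zero.
    for name, value in sorted(mkey_ctx.items()):
        if name in CONTROL_KEYS or name in DATA_DIRECT_FIELDS:
            continue
        if value:
            problems.append(f"reserved field is non-zero: {name}")
    return problems
```

A caller would run such a check before issuing the actual MKEY creation command, and treat a non-empty result as a reason not to send it.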

Congestion Control Congestion Control provides performance isolation when multiple applications are running on the same cluster. Additionally, it prevents congestion spreading when there is a slow receiver, reduces latency in the cluster, improves fairness, and prevents parking-lot effects and packet drops in lossy networks.

Multiple Encapsulation/Decapsulation Operation on a Packet This capability enables the encapsulation table to be opened on both the FDB and the NIC tables together.

Crypto Algorithms Extended the role-based authentication to cover all crypto algorithms. TLS, IPsec, MACsec, GCM, mem2mem, and NISP now work when nv_crypto_conf.crypto_policy = CRYPTO_POLICY_FIPS_LEVEL_2, meaning all cryptographic engines can also work in wrapped mode and not only in plaintext mode.

RoCE: Adaptive Timer Enabled the adaptive (ADP) timer to allow the user to configure RC or DC qp_timeout values lower than 16.
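
To give a sense of what qp_timeout values below 16 mean in time units, the sketch below converts a timeout value to seconds. It assumes the standard InfiniBand Local ACK Timeout encoding (4.096 µs × 2^value, with 0 meaning no timeout). This note does not state how the adaptive timer adjusts the effective timeout at run time, so the result should be read as the nominal configured value only. The function name is hypothetical.

```python
def ack_timeout_seconds(qp_timeout):
    """Nominal ACK timeout for a 5-bit qp_timeout value.

    Assumes the standard InfiniBand encoding: 4.096 us * 2**qp_timeout,
    where 0 disables the timeout (returned here as None).
    """
    if not 0 <= qp_timeout <= 31:
        raise ValueError("qp_timeout must be in the range 0..31")
    if qp_timeout == 0:
        return None
    return 4.096e-6 * (2 ** qp_timeout)


if __name__ == "__main__":
    # Compare the previous lower bound (16) with a few smaller values.
    for value in (16, 14, 12, 10):
        print(f"qp_timeout={value}: {ack_timeout_seconds(value) * 1e3:.3f} ms")
```

Under this encoding, each step down halves the nominal timeout, so the values below 16 correspond to timeouts shorter than roughly 268 ms.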

Multiple-Window in DPA Mode Multi-window capability is now supported in DPA mode.

Doorbell Less QP The new capability enables the user to use a send queue without a doorbell record. To create a doorbell-less QP/SQ, set send_dbr_mode = 1 in the qp/sq ctx as defined in the PRM.

Packet's Flow Label Fields The flow_label fields can be set, added or copied from the packet.

ODP Event The following prefetch fields are available in the ODP event: pre_demand_fault_pages, post_demand_fault_pages.