NVIDIA ConnectX-8 SuperNIC User Manual

Introduction

The NVIDIA® ConnectX®-8 SuperNIC™ is optimized to supercharge hyperscale AI computing workloads. With support for both InfiniBand and Ethernet networking at up to 800 gigabits per second (Gb/s), ConnectX-8 SuperNIC delivers high-speed, efficient network connectivity, significantly enhancing system performance for AI factories and cloud data center environments.

Powerful Networking for Generative AI

Central to NVIDIA’s AI networking portfolio, ConnectX-8 SuperNICs fuel the next wave of innovation in forming accelerated, massive-scale AI compute fabrics. They seamlessly integrate with next-generation NVIDIA networking platforms, providing end-to-end 800Gb/s connectivity. These platforms offer the robustness, feature sets, and scalability required for trillion-parameter GPU computing and generative AI applications.

With enhanced power efficiency, ConnectX-8 SuperNICs support the creation of sustainable AI data centers operating hundreds of thousands of GPUs, ensuring a future-ready infrastructure for AI advancements.

ConnectX-8 SuperNICs enable advanced routing and telemetry-based congestion control capabilities, achieving the highest network performance and peak AI workload efficiency. Additionally, ConnectX-8 InfiniBand SuperNICs extend the capabilities of NVIDIA® Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™ to boost in-network computing in high-performance computing environments, further enhancing overall efficiency and performance.

There are two available extension options:

  1. For 900-9X81Q-00CN-ST0 and 900-9X81E-00EX-ST0: utilizing the Socket Direct/Multi-Host capability, where the PCIe extension card is connected to the SuperNIC and is used as an endpoint.

  2. For 900-9X81E-00EX-DT0: utilizing the Downstream Port (DSP) option, where the MCIO connector is used as a root complex for downstream devices (GPUs or SSDs).

Socket Direct SuperNICs

Socket Direct™ technology improves the performance of dual-socket servers by enabling each CPU to access the network directly through its own dedicated PCIe interface. Utilizing the Socket Direct or Multi-Host capability, the PCIe extension card is connected to the SuperNIC and is used as an endpoint extension.

NVIDIA offers ConnectX-8 Socket Direct, which enables 800Gb/s or 400Gb/s connectivity for servers with PCIe Gen5 or Gen4 capability, respectively. The SuperNIC's 32-lane PCIe bus is split into two 16-lane buses, with one bus accessible through a PCIe x16 edge connector and the other bus through an x16 Auxiliary PCIe Connection card. The two cards should be installed into two PCIe x16 slots and connected using an MCIO harness.
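
In a Socket Direct installation, the two x16 halves typically enumerate as NVIDIA/Mellanox PCIe functions on two different buses, one per physical slot. The following Python sketch is one hedged way to list those functions; it assumes a Linux host with pciutils (lspci) installed and uses the Mellanox/NVIDIA networking vendor ID 15b3. Slot addresses and device names will differ per system.

    # Minimal sketch: list NVIDIA/Mellanox (vendor ID 15b3) PCIe network functions.
    # In a Socket Direct setup, functions on two different PCIe buses are expected:
    # one reached through the edge connector and one through the auxiliary card.
    import subprocess

    def list_connectx_functions():
        # "-d 15b3:" filters by vendor ID; "-vmm" prints machine-readable records.
        out = subprocess.run(["lspci", "-d", "15b3:", "-vmm"],
                             capture_output=True, text=True, check=True).stdout
        return [rec for rec in out.strip().split("\n\n") if rec]

    if __name__ == "__main__":
        for rec in list_connectx_functions():
            print(rec, end="\n\n")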

To use the SuperNIC in the Socket Direct configuration, please order the additional PCIe Auxiliary Card kit. SuperNICs that support Socket Direct can function as separate x16 PCIe cards.

For more information, please refer to the PCIe Auxiliary Card Kit.

Downstream Port (DSP)

The ConnectX-8 SuperNIC with downstream port extension option provides connectivity to the server backplane or PCIe switch through the MCIO connector.

The default PCIe configuration is four x4 interfaces (4 x4), used to manage up to four SSD devices.
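
To see which devices sit behind the SuperNIC's downstream ports, the PCIe topology can be inspected from the host. The sketch below simply prints the lspci tree view; it assumes a Linux host with pciutils installed, and the exact bus/slot numbers are system-specific.

    # Minimal sketch: print the PCIe topology tree so devices attached behind the
    # SuperNIC's downstream ports (for example SSDs) can be located under it.
    import subprocess

    print(subprocess.run(["lspci", "-tv"], capture_output=True, text=True,
                         check=True).stdout)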

PCI Express Slot

  • In PCIe x16 configuration: PCIe Gen6 @ 64GT/s through the x16 edge connector

  • In Socket Direct/Multi-Host configuration (2x PCIe x16):

    • PCIe Gen5 SERDES @ 32GT/s through the x16 edge connector

    • PCIe Gen5 SERDES @ 32GT/s through the PCIe Auxiliary Connection Card or SFF-TA-1016 MCIO

  • In PCIe x16 extension option, Downstream Port (DSP): PCIe Gen6 (64GT/s) through the x16 edge connector

System Power Supply

  Refer to the Specifications chapter.

Operating System

  • In-box drivers for major operating systems:

    • Linux: RHEL, Ubuntu

    • Windows

  • DOCA Host

  • OpenFabrics Windows Distribution (WinOF-2)

Connectivity

  • Interoperable with 25/100/200/400 Gb/s Ethernet switches and SDR/EDR/HDR100/HDR/NDR/XDR InfiniBand switches

  • Passive copper cable with ESD protection

  • Powered connectors for optical and active cable support

Cards

  • 1x ConnectX-8 SuperNIC

Accessories

  • 1x Short bracket

  • 1x Tall bracket (shipped assembled on the SuperNIC)

Optional accessories (not included in the package) can be purchased separately:

  • 930-9XAX6-0025-000: NVIDIA SocketDirect/MultiHost Auxiliary Kit for Additional PCIe Gen6x16 Connection, 250mm MCIO Harness

  • 930-9XCBL-000A-000: NVIDIA ConnectX-8 200mm Cable Extender for Low-Speed Signals Over 30p Debug Connector

Note

Make sure to use a PCIe slot capable of supplying the required power and airflow to the ConnectX-8 SuperNICs as stated in the Specifications chapter.

Note

This section describes hardware features and capabilities. Please refer to the relevant driver and firmware release notes for feature availability.

PCI Express (PCIe)

Depending on the OPN you have purchased, the SuperNIC uses one of the following PCI Express interfaces:

  • PCIe x16 configuration:

    PCIe Gen6 (64GT/s) through the x16 edge connector

  • 2x PCIe x16 configuration (Socket Direct/Multi-Host):

    PCIe Gen6/5 (SERDES @ 64GT/s / 32GT/s) through the x16 edge connector

    PCIe Gen5 SERDES @ 32GT/s through the PCIe Auxiliary Connection Card

  • 2x PCIe x16 configuration (PCIe Downstream Port extension option):

    PCIe Gen6/5 (SERDES @ 64GT/s / 32GT/s) through the x16 edge connector

A rough estimate of the raw link bandwidth for these configurations is sketched below.
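
The sketch below estimates the raw (pre-overhead) per-direction link bandwidth from the transfer rate and lane count alone; it ignores encoding, FLIT/packet framing, and protocol overhead, so delivered throughput is lower than these figures.

    # Back-of-the-envelope raw link bandwidth per direction, ignoring encoding and
    # protocol overhead: each GT/s on a lane carries one raw gigabit per second.
    def raw_gbytes_per_s(gt_per_s: float, lanes: int) -> float:
        return gt_per_s * lanes / 8  # bits -> bytes

    print("PCIe Gen6 x16:", raw_gbytes_per_s(64, 16), "GB/s")                    # 128.0
    print("PCIe Gen5 x16:", raw_gbytes_per_s(32, 16), "GB/s")                    # 64.0
    print("Socket Direct, 2x Gen5 x16:", 2 * raw_gbytes_per_s(32, 16), "GB/s")   # 128.0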

InfiniBand Architecture Specification v1.7 compliant

ConnectX-8 delivers low latency, high bandwidth, and computing efficiency for high-performance computing (HPC), artificial intelligence (AI), and hyperscale cloud data center applications. ConnectX-8 is InfiniBand Architecture Specification v1.7 compliant.

InfiniBand Network Protocols and Rates (rates in Gb/s)

  Protocol        Standard         4x (4-lane) Port    2x (2-lane) Port    1x (1-lane) Port    Encoding
  800G XDR        IBTA Vol1 1.7    --                  425                 212.5               PAM4
  NDR             IBTA Vol2 1.5    425                 212.5               106.25              PAM4
  HDR / HDR100    IBTA Vol2 1.4    212.5               106.25              106.25              PAM4
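
The aggregate port rates in the table are simply the per-lane signaling rate multiplied by the lane count, as the short check below illustrates for the NDR row (values in Gb/s).

    # Arithmetic check against the NDR row above: port rate = per-lane rate x lanes.
    ndr_lane_gbps = 106.25                       # NDR 1x (1-lane) signaling rate
    print("NDR 2x port:", 2 * ndr_lane_gbps)     # 212.5, matches the table
    print("NDR 4x port:", 4 * ndr_lane_gbps)     # 425.0, matches the table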

Up to 400 Gigabit Ethernet

ConnectX-8 SuperNICs comply with the following IEEE 802.3 standards:

400GbE / 200GbE / 100GbE / 25GbE / 10GbE

  • IEEE 802.3ck: 100/200/400 Gigabit Ethernet (includes ETC enhancement)

  • IEEE 802.3cd, IEEE 802.3bs, IEEE 802.3cm, IEEE 802.3cn, IEEE 802.3cu: 50/100/200/400 Gigabit Ethernet (includes ETC enhancement)

  • IEEE 802.3bj, IEEE 802.3bm: 100 Gigabit Ethernet

  • IEEE 802.3by, Ethernet Technology Consortium: 25/50 Gigabit Ethernet

  • IEEE 802.3ba: 40 Gigabit Ethernet

  • IEEE 802.3ae: 10 Gigabit Ethernet

  • IEEE 802.3cb: 2.5/5 Gigabit Ethernet (for 2.5 GbE: supports only 2.5 x1000BASE-X)

  • IEEE 802.3ap: based on auto-negotiation and KR startup

  • IEEE 802.3ad, IEEE 802.1AX: Link Aggregation

  • IEEE 802.1Q, IEEE 802.1P: VLAN tags and priority

  • IEEE 802.1Qau (QCN): Congestion Notification

  • IEEE 802.1Qaz (ETS)

  • IEEE 802.1Qbb (PFC)

  • IEEE 802.1Qbg

  • IEEE 1588v2

  • IEEE 802.1AE (MACSec)

  • Jumbo frame support (9.6KB)

Memory Components

  • SPI Flash - includes a 512Mbit Quad SPI Flash device.

  • FRU EEPROM - stores the parameters and personality of the SuperNIC. The EEPROM capacity is 128Kbit. The FRU I2C address is 0x50, and it is accessible through the PCIe SMBus (address 0x58 is reserved); an illustrative read sketch follows this list.
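
The following Python sketch shows what a raw read of the FRU common header could look like through the smbus2 package. It is illustrative only: the I2C bus number is a placeholder, and on most servers this SMBus segment is owned by the BMC or platform firmware, so direct host access may not be possible or advisable.

    # Illustrative FRU EEPROM read at I2C address 0x50 (0x58 is reserved).
    # SMBus(0) / /dev/i2c-0 is an assumption; the actual bus number is platform-specific.
    from smbus2 import SMBus

    FRU_ADDR = 0x50

    with SMBus(0) as bus:
        header = bus.read_i2c_block_data(FRU_ADDR, 0x00, 8)
        print("FRU common header bytes:", [hex(b) for b in header])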

Overlay Networks

In order to better scale their networks, data center operators often create overlay networks that carry traffic from individual virtual machines over logical tunnels in encapsulated formats such as NVGRE and VXLAN. While this solves network scalability issues, it hides the TCP packet from the hardware offloading engines, placing higher loads on the host CPU. ConnectX-8 effectively addresses this by providing advanced NVGRE and VXLAN hardware offloading engines that encapsulate and de-capsulate the overlay protocol.
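
As a minimal illustration of the overlay model, the sketch below creates a VXLAN tunnel endpoint on top of a SuperNIC port using the standard Linux iproute2 tooling; whether encapsulation/decapsulation is actually offloaded depends on driver and firmware support. The interface name, VNI, and UDP port are placeholders.

    # Minimal VXLAN endpoint on top of the adapter port (run as root on Linux).
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    UPLINK = "eth0"   # SuperNIC port netdev (placeholder)

    run(["ip", "link", "add", "vxlan42", "type", "vxlan",
         "id", "42", "dev", UPLINK, "dstport", "4789"])
    run(["ip", "link", "set", "vxlan42", "up"])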

Quality of Service (QoS)

Support for port-based Quality of Service enabling various application requirements for latency and SLA.

Hardware-based I/O Virtualization

ConnectX-8 provides dedicated adapter resources and guaranteed isolation and protection for virtual machines within the server.

SR-IOV

ConnectX-8 SR-IOV technology provides dedicated adapter resources and guaranteed isolation and protection for virtual machines (VM) within the server.
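
On Linux, virtual functions are typically created through the kernel's standard sysfs interface once SR-IOV has been enabled in firmware/BIOS. The sketch below is a minimal example; the netdev name and VF count are placeholders, and it must be run as root.

    # Minimal SR-IOV example: request 4 VFs on the physical function "eth0".
    PF = "eth0"       # physical function netdev of the SuperNIC (placeholder)
    NUM_VFS = 4

    path = f"/sys/class/net/{PF}/device/sriov_numvfs"
    with open(path, "w") as f:
        f.write(str(NUM_VFS))
    print(f"Requested {NUM_VFS} VFs via {path}")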

High-Performance Accelerations

  • Vector collective operations offload

  • MPI_Alltoall offloads

  • Rendezvous protocol offload

Secure Boot

The secure boot process assures the booting of authentic firmware/software that is intended to run on ConnectX-8. This is achieved using cryptographic primitives based on asymmetric cryptography. ConnectX-8 supports several cryptographic functions in its hardware Root-of-Trust (RoT), whose key is stored in on-chip FUSES.

Secure Firmware Update

The secure firmware update feature enables a device to verify digital signatures of new firmware binaries to ensure that only officially approved versions can be installed from the host, the network, or a Board Management Controller (BMC). The firmware of devices with “secure firmware update” functionality (secure FW) restricts access to specific commands and registers that can be used to modify the firmware binary image on the flash, as well as to commands that can jeopardize security in general.

For further information, refer to the MFT User Manual.
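
As a hedged example of the update workflow, the installed firmware can be queried with the MFT tools before applying a signed image; mlxfwmanager discovers supported devices and reports the current firmware version. The sketch assumes MFT is installed on a Linux host.

    # Query installed firmware with the NVIDIA firmware tools (MFT must be installed).
    import subprocess

    result = subprocess.run(["mlxfwmanager", "--query"],
                            capture_output=True, text=True)
    print(result.stdout or result.stderr)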

Host Management

ConnectX-8 technology maintains support for host manageability through a BMC. The ConnectX-8 PCIe stand-up adapter can be connected to a BMC using MCTP over SMBus or MCTP over PCIe protocols, as with any standard NVIDIA PCIe stand-up SuperNIC. To configure the adapter for the specific manageability solution in use by the server, please contact NVIDIA Support.

  • Protocols: PLDM, NCSI

  • Transport layer: RBT, MCTP over SMBus, and MCTP over PCIe

  • Physical layer: SMBus 2.0 / I2C interface for device control and configuration, PCIe

  • PLDM for Monitor and Control DSP0248

  • PLDM for Firmware Update DSP0267

  • IEEE 1149.6

  • Secured FW update

  • FW Recovery

  • NIC reset

  • Monitoring and control

  • Network port settings

  • Boot setting

RDMA and RDMA over Converged Ethernet (RoCE)

ConnectX-8, utilizing IBTA RDMA (Remote Direct Memory Access) and RoCE (RDMA over Converged Ethernet) technology, delivers low latency and high performance over InfiniBand and Ethernet networks. Leveraging data center bridging (DCB) capabilities as well as ConnectX-8 advanced congestion control hardware mechanisms, RoCE provides efficient low-latency RDMA services over Layer 2 and Layer 3 networks.
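
To confirm that the RDMA stack sees the adapter and which link layer each port uses (InfiniBand, or Ethernet for RoCE), the rdma-core utilities can be queried from the host. The sketch below assumes a Linux host with rdma-core installed.

    # List RDMA devices and their ports/link layers using rdma-core's ibv_devinfo.
    import subprocess

    print(subprocess.run(["ibv_devinfo"], capture_output=True, text=True).stdout)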

NVIDIA PeerDirect™

PeerDirect™ communication provides high-efficiency RDMA access by eliminating unnecessary internal data copies between components on the PCIe bus (for example, from GPU to CPU), and therefore significantly reduces application run time. ConnectX-8 advanced acceleration technology enables higher cluster efficiency and scalability to tens of thousands of nodes.

CPU Offload

Adapter functionality enables reduced CPU overhead, leaving more CPU cycles available for computation tasks. A minimal offload-configuration sketch follows the list below.

  • Flexible match-action flow tables

  • Open VSwitch (OVS) offload using ASAP2®

  • Tunneling encapsulation/decapsulation
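
A common Linux flow for OVS hardware offload is to place the eswitch in switchdev mode with devlink and enable TC flower offload on the uplink, as sketched below. The PCI address and interface name are placeholders, and the exact steps depend on driver and firmware support, so treat this as an outline rather than a definitive procedure.

    # Outline of enabling hardware offload on Linux (run as root).
    import subprocess

    PCI_ADDR = "0000:03:00.0"   # SuperNIC PCI address (placeholder)
    UPLINK = "eth0"             # uplink netdev (placeholder)

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    run(["devlink", "dev", "eswitch", "set", f"pci/{PCI_ADDR}", "mode", "switchdev"])
    run(["ethtool", "-K", UPLINK, "hw-tc-offload", "on"])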

Cryptography Accelerations

ConnectX-8 supports IPSec, MACSec, and PSP cryptography acceleration. ConnectX-8 SuperNIC hardware-based accelerations offload the crypto operations and free up the CPU, reducing latency and enabling scalable crypto solutions.
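
As one hedged illustration on the MACSec side, the standard Linux iproute2 tooling can create a MACSec interface on top of a SuperNIC port and install a transmit security association; whether the cipher operations are offloaded to the adapter depends on driver and firmware support. The interface name and the test key below are placeholders only.

    # Minimal MACsec setup with iproute2 (run as root); offload depends on the driver.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    UPLINK = "eth0"                                 # SuperNIC port (placeholder)
    TX_KEY = "00112233445566778899aabbccddeeff"     # 128-bit test key (placeholder)

    run(["ip", "link", "add", "link", UPLINK, "macsec0", "type", "macsec", "encrypt", "on"])
    run(["ip", "macsec", "add", "macsec0", "tx", "sa", "0", "pn", "1", "on", "key", "01", TX_KEY])
    run(["ip", "link", "set", "macsec0", "up"])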

© Copyright 2025, NVIDIA. Last updated on Jan 23, 2025.