NVIDIA Skyway InfiniBand-to-Ethernet Gateway User Manual

Introduction

This is the user guide for the NVIDIA Skyway InfiniBand-to-Ethernet gateway. This document contains the complete product overview, installation and initialization instructions, and product specifications.

Note

This document is preliminary and subject to change.

NVIDIA Skyway GA100 is an appliance-based InfiniBand-to-Ethernet gateway, enabling Ethernet storage or other Ethernet-based communications to access the InfiniBand datacenter, and vice versa. The solution, leveraging ConnectX’s hardware-based forwarding of IP packets and standard IP-routing protocols, supports 200Gb/s HDR connectivity today, and is future-ready to support higher speeds.

NVIDIA Skyway Highlights

  • Component: MGA100-HS2
  • Form Factor: 2U rackmount, 19″
  • Weight: NVIDIA Skyway gateway, 21kg; gateway with ACC and package, 32kg
  • PCIe Cards: 8x NVIDIA® ConnectX®-6 VPI dual-port network interface cards
  • InfiniBand/Ethernet Ports: 8x InfiniBand ports, 8x Ethernet ports
  • Connectivity Speed: InfiniBand SDR/EDR/HDR100/HDR; Ethernet 25/50/100/200 Gb/s
  • Bandwidth: Up to 100Gb/s bi-directional per port
  • Power Supplies: 2x AC power supplies


The rear panel of the NVIDIA Skyway system houses eight ConnectX-6 InfiniBand/VPI adapter cards, fans, and two PSUs, as shown in the figure below.

Network Interface Cards

NVIDIA Skyway is shipped populated with eight ConnectX-6 dual-port network interface cards (NICs), which enable the hardware-based forwarding of IP packets from InfiniBand to Ethernet, and vice versa.

Power Supply Units

NVIDIA Skyway is equipped with two redundant, load-sharing power supply units (PSUs) at the rear of the system. The PSUs are housed in a 2U canister. Each PSU has an extraction handle, a PSU status LED, and a power socket.

For PSU LED behavior, refer to the System Monitoring section.

The system supports hot-swapping, which allows components to be exchanged while the system is online without affecting operational integrity.

Warning

Remove a power supply unit from the system only when you are replacing it.


Fans

Power Supply Fans

NVIDIA Skyway is equipped with one fan per power supply unit on the rear panel of the appliance.

Internal Fans

NVIDIA Skyway is equipped with six internal fans for cooling the CPU and expansion cards. Under normal operation, the cooling fans operate at a constant speed. If the system module fails or one of the temperature thresholds is exceeded, the cooling fans automatically raise their rotation speeds to draw more airflow.

Check the package contents list to see that all the parts have been sent. Check the parts for visible damage that may have occurred during shipping. Please note that the product must be placed on an antistatic surface.

  • Systems: 1x NVIDIA Skyway 2U system
  • Slide Rail Kit: 1x 1U/2U 36" slide kit pair for NVIDIA Skyway
  • Power Cables: 2x 250V 10A 1830MM C14 to C13 power cable; 2x cable retainers
  • Harness: 1x RS232 2M harness cable, DB9 to RJ-45 (do not connect to the COM port)
  • Documentation: 1x Quick Installation Guide

Rail Kit Package Contents

  • Slides: qty. 1 (2 sets of slides)
  • Screw M5*15L: qty. 2 (8 pcs)

Management Interfaces, PSUs, and Fans

Processor System

  • Chipset: Intel 4209T, 2.2GHz, 11M, 8 Cores
  • CPU Type: Dual Intel LGA3647 Xeon Scalable processor (up to 140W TDP)
  • Memory Type: Supports DDR4 2133/2400/2666 MHz ECC-REG modules
  • Memory Size: 4x 16GB DDR4 2666MHz
  • Memory Voltage: 1.2V
  • Error Detection: Corrects single-bit errors; detects double-bit errors (using ECC memory)

Rear I/O Panel

  • USB: 4x USB 3.0
  • RJ-45/LAN: 4x RJ-45 LAN ports (2x 10GbE, 2x 1GbE/IPMI-LAN)

On-board Devices

  • EC: TE 8528E chip provides motherboard, RS-232, and hardware monitor functions
  • BMC: Sharing with the LAN 1/4

Expansion Slots

  • PCI-Express: 8x network interface cards

Cooling

  • Chassis Fan: 2x 4-pin 80x38 high-speed fans for CPU; 4x 4-pin 80x38 high-speed fans for expansion cards
  • PSU Fans: One fan per power supply unit

PC Health Monitoring

  • Voltage: Monitors for CPU Cores, +3.3V, +5V, +12V, +5V standby, VBAT
  • Temperature: Monitoring for CPU0 & CPU1 (PECI); monitoring for system (HWM)

Other Features

  • (Case Open): Chassis intrusion detection


For a full list of features, please refer to the system’s product brief at www.nvidia.com/en-us/networking. In the main menu, click on PRODUCTS → INFINIBAND → GATEWAY & ROUTERS SYSTEMS → select the desired product page.

InfiniBand-to-Ethernet Gateway Operational Description

NVIDIA Skyway GA100 is an appliance-based InfiniBand-to-Ethernet gateway, enabling Ethernet storage or other Ethernet-based communications to access the InfiniBand datacenter, and vice versa. The solution, leveraging ConnectX’s hardware-based forwarding of IP packets and standard IP-routing protocols, supports 200Gb/s HDR connectivity today and is future-ready to support higher speeds.

NVIDIA Skyway contains 8 ConnectX VPI dual-port adapter cards which enable the hardware-based forwarding of IP packets from InfiniBand to Ethernet, and vice versa. NVIDIA Skyway also includes the NVIDIA Gateway Operating System, MLNX-GW, which manages the appliance and handles the high availability and load balancing between the ConnectX cards and gateway appliances.

A single NVIDIA Skyway supports a maximum bandwidth of 1.6Tb/s across its 16 ports, each carrying up to 100Gb/s of traffic. The InfiniBand ports can connect to the InfiniBand network at HDR, HDR100, or EDR speeds, while the Ethernet ports can connect to the Ethernet network at 200Gb/s or 100Gb/s.
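The 1.6Tb/s figure is the sum of the per-port rates. A minimal Python check of that arithmetic, using only the numbers quoted above (the function and constant names are illustrative, not part of any NVIDIA tool):

```python
# Figures taken from the paragraph above.
PORTS = 16            # 8 InfiniBand ports + 8 Ethernet ports
PER_PORT_GBPS = 100   # traffic each port reaches, in Gb/s


def aggregate_tbps(ports: int = PORTS, per_port_gbps: int = PER_PORT_GBPS) -> float:
    """Return the summed per-port bandwidth in Tb/s."""
    return ports * per_port_gbps / 1000


print(aggregate_tbps())  # 1.6
```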

Load Balancing and High Availability Operational Description

On the Ethernet side, the load balancing and high availability functions are achieved by leveraging Ethernet LAG (Link Aggregation). LACP (Link Aggregation Control Protocol) is used to establish the LAG and to verify connectivity. On the InfiniBand side, these functions are achieved by ensuring that different flows go through different ConnectX HCAs, so that if one HCA fails, another HCA continues passing its flows.
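To illustrate how a LAG can provide both properties, the sketch below uses a common approach: hash each flow onto one of the currently active member links. This is an assumption for illustration only. The actual hash policy depends on the devices at each end of the LAG and is not described in this manual, and the port names and `pick_member` function are hypothetical.

```python
import zlib


def pick_member(flow: tuple, active_members: list[str]) -> str:
    """Pick a LAG member for a flow by hashing the flow's identifying fields.

    Illustrative policy only: CRC32 over the flow tuple, modulo the number
    of active member links.
    """
    if not active_members:
        raise ValueError("LAG has no active members")
    key = "|".join(str(field) for field in flow).encode()
    return active_members[zlib.crc32(key) % len(active_members)]


# Hypothetical member links and flow (src IP, dst IP, protocol, src port, dst port).
members = ["eth1", "eth2", "eth3", "eth4"]
flow = ("10.0.0.37", "192.168.1.5", 6, 49152, 2049)

first = pick_member(flow, members)
# The same flow with the same membership always maps to the same link.
assert pick_member(flow, members) == first

# If that link drops out of the LAG, the flow is rehashed onto a survivor.
survivors = [m for m in members if m != first]
print(first, "->", pick_member(flow, survivors))
```

Keeping a flow on one link preserves packet order within that flow, while different flows spread across the links.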

At initialization, 256 gateway GIDs are spread evenly among all InfiniBand ports of the gateway appliances. When an InfiniBand node initiates a traffic flow through the gateway, it first sends a broadcast ARP request with the default gateway IP address to determine the gateway’s GID. All ConnectX cards receive the request, but only one sends the ARP response: the card that was assigned to handle the range of GIDs corresponding to the sending node’s IP address. When the originating node receives the gateway GID, it sends a path query to the subnet manager (SM) to determine the gateway LID, and the communication flow proceeds as usual.
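The selection of the responding port can be sketched as follows. The manual does not specify how a sender's IP address maps to a GID, so the mapping used here (the IPv4 address modulo 256) is purely an assumption, as are the port names and function names. Only the 256 GIDs and their even spread come from the description above.

```python
from collections import Counter
from ipaddress import ip_address

NUM_GIDS = 256  # from the description above


def spread_gids(ports: list[str], num_gids: int = NUM_GIDS) -> dict[int, str]:
    """Assign contiguous, near-equal ranges of GID indices to the given ports."""
    return {gid: ports[gid * len(ports) // num_gids] for gid in range(num_gids)}


def gid_index_for(sender_ip: str, num_gids: int = NUM_GIDS) -> int:
    """Hypothetical mapping from a sender's IPv4 address to a GID index."""
    return int(ip_address(sender_ip)) % num_gids


def responder(sender_ip: str, table: dict[int, str]) -> str:
    """Return the single port that answers this sender's ARP request."""
    return table[gid_index_for(sender_ip)]


# Hypothetical names for eight InfiniBand ports.
ports = [f"ib-port-{i}" for i in range(8)]
table = spread_gids(ports)

print(Counter(table.values())["ib-port-0"])  # 32 GIDs per port
print(responder("10.0.0.37", table))         # ib-port-1
```

Every port can evaluate the same table, so exactly one of them concludes that it owns the GID and replies.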

The dynamic assignment of the 256 gateway GIDs is the basic element of the load balancing and high availability operations. On any change in the gateway configuration (e.g., a cable is disconnected, an Ethernet link is disabled, or an appliance is powered off), MLNX-GW reassigns the affected gateway GIDs to other ConnectX cards. From the end-node perspective, nothing changes: the same GID and LID remain, even when they are handled by a different ConnectX HCA.
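A minimal sketch of such a reassignment, under stated assumptions: the port names are hypothetical, and moving only the failed port's GIDs round-robin onto the survivors is one possible policy. The manual does not say whether MLNX-GW moves only the orphaned GIDs or re-spreads all of them.

```python
from collections import Counter


def reassign(table: dict[int, str], failed_port: str) -> dict[int, str]:
    """Move the GIDs owned by failed_port onto the surviving ports.

    GIDs owned by healthy ports are left untouched, and the set of GIDs
    itself never changes, so end nodes keep using the same gateway GID.
    """
    survivors = sorted({port for port in table.values() if port != failed_port})
    if not survivors:
        raise RuntimeError("no surviving port to take over the GIDs")
    new_table = dict(table)
    orphaned = [gid for gid in sorted(table) if table[gid] == failed_port]
    for i, gid in enumerate(orphaned):
        new_table[gid] = survivors[i % len(survivors)]
    return new_table


# Initial even spread: 256 GIDs over eight hypothetical ports, 32 each.
initial = {gid: f"ib-port-{gid * 8 // 256}" for gid in range(256)}
after = reassign(initial, "ib-port-3")

print(sorted(after) == sorted(initial))    # True: same 256 GIDs
print("ib-port-3" in after.values())       # False: failed port owns nothing
print(sorted(set(Counter(after.values()).values())))  # [36, 37]
```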

For a detailed description of MLNX-GW, see the NVIDIA MLNX-GW User Manual for NVIDIA Skyway or contact your NVIDIA representative.

The list of certifications per system for different regions of the world (such as EMC, safety, and others) is located on the NVIDIA Networking website at http://www.mellanox.com/page/environmental_compliance.

© Copyright 2024, NVIDIA. Last updated on Apr 17, 2024.