Introduction
This is the user guide for the NVIDIA Skyway InfiniBand-to-Ethernet gateway. This document contains the complete product overview, installation and initialization instructions, and product specifications.
This document is preliminary and subject to change.
NVIDIA Skyway GA100 is an appliance-based InfiniBand-to-Ethernet gateway, enabling Ethernet storage or other Ethernet-based communications to access the InfiniBand datacenter, and vice versa. The solution, leveraging ConnectX’s hardware-based forwarding of IP packets and standard IP-routing protocols, supports 200Gb/s HDR connectivity today, and is future-ready to support higher speeds.
NVIDIA Skyway Highlights
| Component | MGA100-HS2 |
|---|---|
| Form Factor | 2U rackmount, 19″ |
| Weight | NVIDIA Skyway gateway: 21 kg; gateway with ACC and package: 32 kg |
| PCIe Cards | 8x NVIDIA® ConnectX®-6 VPI dual-port network interface cards |
| InfiniBand/Ethernet Ports | 8x InfiniBand ports, 8x Ethernet ports |
| Connectivity Speed | InfiniBand: SDR/EDR/HDR100/HDR; Ethernet: 25/50/100/200 Gb/s |
| Bandwidth | Up to 100Gb/s bi-directional per port |
| Power Supplies | 2x AC power supplies |
The rear panel of the NVIDIA Skyway system is populated with eight ConnectX-6 InfiniBand/VPI adapter cards, fans, and two PSUs, as shown in the figure below.
Network Interface Cards
NVIDIA Skyway is shipped populated with eight ConnectX-6 dual-port network interface cards (NICs), which enable the hardware-based forwarding of IP packets from InfiniBand to Ethernet, and vice versa.
Power Supply Units
NVIDIA Skyway is equipped with two redundant, load-sharing power supply units at the rear of the system. The PSUs are housed in a 2U canister; each PSU has an extraction handle, a PSU status LED, and a power socket.
For power supply unit LED behavior, please refer to the System Monitoring section.
The PSUs are hot-swappable: a unit can be exchanged while the system is online without affecting operational integrity.
Remove a power supply unit from the system only when replacing it.
Fans
Power Supply Fans
NVIDIA Skyway is equipped with one fan per power supply unit on the rear panel of the appliance.
Internal Fans
NVIDIA Skyway is equipped with six internal fans that cool the CPUs and expansion cards. Under normal operation, the cooling fans run at a constant speed. If a system module fails or one of the temperature thresholds is exceeded, the cooling fans automatically increase their rotation speed to draw more airflow.
Check the package contents against the list below to verify that all parts have been shipped, and inspect the parts for visible damage that may have occurred during shipping. Note that the product must be placed on an antistatic surface.
| Category | Qty. | Item |
|---|---|---|
| Systems | 1 | NVIDIA Skyway 2U system |
| Slide Rail Kit | 1 | 1U/2U 36" slide kit pair for NVIDIA Skyway |
| Power Cables | 2 | 250V, 10A, 1830 mm, C14-to-C13 power cable |
| | 2 | Cable retainers |
| Harness | 1 | RS232 harness cable, 2 m, DB9 to RJ-45 (do not connect to the COM port) |
| Documentation | 1 | Quick Installation Guide |
Rail Kit Package Contents
| Category | Qty. | Item |
|---|---|---|
| Slides | 1 | 2 sets of slides |
| Screws M5*15L | 2 | 8 pcs |
Management Interfaces, PSUs, and Fans
| Category | Item | Description |
|---|---|---|
| Processor System | CPU | Intel 4209T, 2.2 GHz, 11M cache, 8 cores |
| | CPU Type | Dual Intel LGA3647 Xeon Scalable processors (up to 140W TDP) |
| | Memory Type | Supports DDR4 2133/2400/2666 MHz ECC-REG modules |
| | Memory Size | 4x 16GB DDR4 2666 MHz |
| | Memory Voltage | 1.2V |
| | Error Detection | |
| Rear I/O Panel | USB | 4x USB 3.0 |
| | RJ-45/LAN | 4x RJ-45 LAN ports |
| On-board Devices | EC | ITE 8528E chip provides motherboard, RS-232, and hardware monitoring functions |
| | BMC | Shared with LAN 1/4 |
| Expansion Slots | PCI-Express | 8x network interface cards |
| Cooling | Chassis Fans | 2x 4-pin 80x38 high-speed fans for the CPUs; 4x 4-pin 80x38 high-speed fans for the expansion cards |
| | PSU Fans | One fan per power supply unit |
| PC Health Monitoring | Voltage | Monitors CPU cores, +3.3V, +5V, +12V, +5V standby, and VBAT |
| | Temperature | Monitors CPU0 and CPU1 (PECI); monitors the system (HWM) |
| Other Features | Case Open | Chassis intrusion detection |
For a full list of features, please refer to the system’s product brief at www.nvidia.com/en-us/networking. In the main menu, click on PRODUCTS → INFINIBAND → GATEWAY & ROUTERS SYSTEMS → select the desired product page.
InfiniBand-to-Ethernet Gateway Operational Description
NVIDIA Skyway GA100 is an appliance-based InfiniBand-to-Ethernet gateway, enabling Ethernet storage or other Ethernet-based communications to access the InfiniBand datacenter, and vice versa. The solution, leveraging ConnectX’s hardware-based forwarding of IP packets and standard IP-routing protocols, supports 200Gb/s HDR connectivity today, and is future-ready to support higher speeds.
NVIDIA Skyway contains 8 ConnectX VPI dual-port adapter cards which enable the hardware-based forwarding of IP packets from InfiniBand to Ethernet, and vice versa. NVIDIA Skyway also includes the NVIDIA Gateway Operating System, MLNX-GW, which manages the appliance and handles the high availability and load balancing between the ConnectX cards and gateway appliances.
A single NVIDIA Skyway supports a maximum bandwidth of 1.6Tb/s across 16 ports, each carrying up to 100Gb/s of traffic. In terms of connectivity, the InfiniBand ports can be connected to the InfiniBand network at HDR/HDR100 or EDR speeds, while the Ethernet ports can be connected to the Ethernet network at 200Gb/s or 100Gb/s.
Load Balancing and High Availability Operational Description
On the Ethernet side, the load balancing and high availability functions are achieved by leveraging Ethernet LAG (Link Aggregation); LACP (Link Aggregation Control Protocol) is used to establish the LAG and to verify connectivity. On the InfiniBand side, these functions are achieved by ensuring that different flows go through different ConnectX HCAs, so that if an HCA fails, another HCA continues passing its flows.
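The Ethernet-side behavior can be pictured with a short sketch. The following Python snippet is a minimal, generic illustration of LAG flow hashing, not MLNX-GW code: each flow's 5-tuple is hashed to one LAG member port, and a flow on a failed member is rehashed onto the surviving ports.

```python
# Minimal, illustrative model of LAG load balancing (not MLNX-GW code):
# each flow's 5-tuple hashes to one member port; flows on a failed member
# are rehashed onto the remaining ports.
import hashlib

def lag_member(flow_tuple: tuple, members: list) -> str:
    """Pick a LAG member port for a flow by hashing its 5-tuple."""
    digest = hashlib.sha256(repr(flow_tuple).encode()).digest()
    return members[digest[0] % len(members)]

members = [f"eth{i}" for i in range(8)]                 # the 8 Ethernet ports
flow = ("192.0.2.10", "198.51.100.7", 6, 49152, 4791)   # src, dst, proto, sport, dport

port = lag_member(flow, members)
print("flow carried by:", port)

members.remove(port)                                    # that member link fails
print("after failover: ", lag_member(flow, members))    # flow moves to a survivor
```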
At initialization, 256 gateway GIDs are spread evenly among all InfiniBand ports of the gateway appliances. When an InfiniBand node initiates a traffic flow through the gateway, it first sends a broadcast ARP request for the default gateway IP address to determine the gateway’s GID. All ConnectX cards receive the request, but only one sends the ARP response: the card assigned to handle the GID range corresponding to the sending node’s IP address. Once the originating node receives the gateway GID, it sends a path query to the subnet manager (SM) to determine the gateway LID, and the communication flow proceeds as usual.
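The discovery sequence can be sketched as follows. This Python snippet is a hypothetical model for explanation only; names such as gid_owner and gid_for_ip are invented here and are not MLNX-GW internals.

```python
# Hypothetical sketch of the gateway-discovery sequence described above.
import ipaddress

NUM_GIDS = 256    # gateway GIDs spread evenly at initialization
NUM_CARDS = 8     # ConnectX cards in the appliance

# Each card owns an equal slice of the 256 gateway GIDs.
gid_owner = {gid: gid % NUM_CARDS for gid in range(NUM_GIDS)}

def gid_for_ip(ip: str) -> int:
    """Map a node's IP address onto one of the 256 gateway GIDs."""
    return int(ipaddress.ip_address(ip)) % NUM_GIDS

node_ip = "192.0.2.42"       # InfiniBand node initiating a flow
gid = gid_for_ip(node_ip)    # GID returned in the ARP response
card = gid_owner[gid]        # the only card that answers the broadcast ARP
print(f"node {node_ip} -> gateway GID {gid}, answered by ConnectX card {card}")
# The node then queries the subnet manager (SM) for the path/LID of this GID
# and sends its traffic through that card.
```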
The dynamic assignment of the 256 gateway GIDs is the basic element of the load balancing and high availability operations. Upon any change in the gateway configuration (e.g., a cable is unplugged, an Ethernet link is disabled, or an appliance is powered off), MLNX-GW reassigns the affected gateway GIDs to other ConnectX cards. From the end-node perspective, nothing changes: the same GID and LID remain valid, even when handled by a different ConnectX HCA.
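Continuing the hypothetical model from the previous sketch, the snippet below illustrates the reassignment step: when a card fails, only the GIDs it owned move to the surviving cards, and every GID keeps its value, so end nodes are unaffected.

```python
# Hypothetical sketch of GID reassignment on failure (not MLNX-GW code):
# only the failed card's GIDs move; every GID keeps its value, so end
# nodes see the same gateway GID throughout.
NUM_GIDS = 256
cards = list(range(8))
gid_owner = {gid: cards[gid % len(cards)] for gid in range(NUM_GIDS)}

def handle_card_failure(failed: int) -> None:
    """Round-robin the failed card's GIDs over the surviving cards."""
    survivors = [c for c in cards if c != failed]
    orphaned = [gid for gid, owner in gid_owner.items() if owner == failed]
    for i, gid in enumerate(orphaned):
        gid_owner[gid] = survivors[i % len(survivors)]

handle_card_failure(3)   # e.g., the cable on card 3 is unplugged
print("GIDs still owned by card 3:",
      sum(1 for owner in gid_owner.values() if owner == 3))   # prints 0
```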
NVIDIA Skyway includes the NVIDIA Gateway operating system, MLNX-GW, which manages the appliance and handles the high availability and load balancing between the ConnectX cards and between gateway appliances. For a detailed description of MLNX-GW, please see the NVIDIA MLNX-GW User Manual for NVIDIA Skyway or contact your NVIDIA representative.
The list of certifications per system for different regions of the world (such as EMC, safety, and others) is located on the NVIDIA Networking website at http://www.mellanox.com/page/environmental_compliance.