Overview
The NVIDIA® Skyway™ operating system, MLNX-GW, enables the management and configuration of NVIDIA's InfiniBand-to-Ethernet gateway system. It controls the ConnectX® adapters, as well as the high availability and load balancing between cards and between gateway appliances.
MLNX-GW provides a full suite of management options, including a familiar industry-standard CLI that enables administrators to easily configure and manage the system.
This manual provides information about the scope, organization, and command line interface of MLNX-GW as well as configuration examples.
Skyway GA100 is an appliance-based InfiniBand-to-Ethernet gateway, enabling Ethernet storage or other Ethernet-based communications to access the InfiniBand datacenter, and vice versa. The solution, leveraging ConnectX’s hardware-based forwarding of IP packets and standard IP-routing protocols, supports 200Gb/s HDR connectivity today, and is future-ready to support higher speeds.
Skyway contains 8 ConnectX VPI dual-port adapter cards, which enable hardware-based forwarding of IP packets between InfiniBand and Ethernet systems.
A single Skyway module supports a maximum bandwidth of 1.6Tb/s, utilizing 16 ports that each carry up to 100Gb/s of traffic. Connectivity-wise, the InfiniBand ports can be connected to the InfiniBand network via HDR/HDR100 or EDR, and the Ethernet ports via 200Gb/s or 100Gb/s Ethernet.
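The bandwidth figure follows directly from the port count. The short sketch below only restates the arithmetic from the text; the constant names are illustrative and are not part of MLNX-GW.

```python
# Figures taken from the text: 8 dual-port cards, 100 Gb/s of traffic per port.
ADAPTER_CARDS = 8
PORTS_PER_CARD = 2
GBPS_PER_PORT = 100

total_ports = ADAPTER_CARDS * PORTS_PER_CARD      # ports in one Skyway module
aggregate_gbps = total_ports * GBPS_PER_PORT      # aggregate bandwidth in Gb/s
aggregate_tbps = aggregate_gbps / 1000            # same figure in Tb/s

print(f"{total_ports} ports, {aggregate_tbps} Tb/s aggregate")
```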
These pages are intended for network administrators who are responsible for configuring and managing Skyway gateway platforms.
The following table lists the documents referenced in this User Manual.
Document Name | Description |
Skyway System Hardware User Manual | This document contains hardware descriptions, LED assignments, and hardware specifications among other things |
Skyway Product Release Notes | Please look up the relevant gateway release notes file |
For a list of the changes made to this document in the current version, see the Document Revision History v6.8-3.10.1000 section.
Term | Description |
AAA | Authentication, Authorization, and Accounting |
ARP | Address Resolution Protocol. A protocol that translates IP addresses into MAC addresses for communication over a local area network (LAN). |
CLI | Command Line Interface. A user interface in which you type commands at the prompt |
DHCP | The Dynamic Host Configuration Protocol (DHCP) is an automatic configuration protocol used on IP networks. |
DNS | Domain Name System. A hierarchical naming system for devices in a computer network. |
Fabric management | The use of a set of tools (APIs) to configure, discover, and manage a group of devices organized as a connected fabric. |
FTP/TFTP/sFTP | File Transfer Protocol (FTP) is a standard network protocol used to transfer files from one host to another over a TCP-based network, such as the Internet. |
Gateway | A network node that interfaces with both InfiniBand and Ethernet, using different network protocols |
GID | Global Identifier. A 128-bit number used to identify a Port on a network adapter (see below), a port on a Router, or a Multicast Group. |
GUID | Globally Unique Identifier. A 64-bit number that uniquely identifies a device or component in a subnet. |
HA | High Availability. A system design protocol that provides redundancy of system components, enabling the system to overcome single or multiple failures with minimal downtime. |
Host | A computer platform executing an Operating System which may control one or more network adapters |
IB | InfiniBand |
LDAP | The Lightweight Directory Access Protocol is an industry standard application protocol for accessing and maintaining distributed directory information services over an IP network. |
LID | Local Identifier. A 16-bit address assigned to end nodes by the subnet manager. Each LID is unique within its subnet. |
MAC | A Media Access Control address (MAC address) is a unique identifier assigned to network interfaces for communications on the physical network segment. MAC addresses are used for numerous network technologies and most IEEE 802 network technologies including Ethernet. |
MTU | Maximum Transfer Unit. The maximum size of a packet payload (not including headers) that can be sent to or received from a port. |
Network Adapter | A hardware device that allows for communication between computers in a network. |
RADIUS | Remote Authentication Dial In User Service. A networking protocol that enables AAA centralized management for computers to connect and use a network service. |
SA | Subnet Administrator (SA) is the interface for querying and manipulating subnet management data. |
SCP | Secure Copy or SCP is a means of securely transferring computer files between a local and a remote host or between two remote hosts. It is based on the Secure Shell (SSH) protocol. |
SNMP | Simple Network Management Protocol. A network protocol for the management of a network and the monitoring of network devices and their functions. |
SOL | Serial Over LAN |
NTP | Network Time Protocol. A protocol for synchronizing computer clocks in a network. |
SSH | Secure Shell. A protocol (program) for securely logging in to and running programs on remote machines across a network. The program authenticates access to the remote machine and encrypts the transferred information through the connection. |
syslog | A standard for forwarding log messages in an IP network. |
TACACS+ | Terminal Access Controller Access-Control System Plus. A networking protocol that enables access to a network of devices via one or more centralized servers. TACACS+ provides separate AAA services. |
NVIDIA Skyway enables establishing a High Availability (HA) environment that shares resources among multiple Skyway appliances (comprising a Skyway domain). HA minimizes downtime when any system or connectivity failure occurs. Skyway leverages its load balancing capabilities to distribute the workload to optimize the aggregate domain performance for traffic.
On the Ethernet side, Skyway load balancing and HA functions are achieved by leveraging Ethernet Link Aggregation (LAG) support. Link Aggregation Control Protocol (LACP) is used to establish LAG and to verify connectivity. On the InfiniBand side, these functions are achieved through guaranteed availability of fallback network adapters (HCAs) of the Skyway appliances that will execute the traffic flows if an HCA drops.
At initialization, up to 254 gateway group identifiers (GIDs) are spread evenly among all InfiniBand ports of the Skyway gateway appliance. When an InfiniBand node initiates a traffic flow through a gateway, it first sends a broadcast ARP request with the default Gateway IP Address to determine the gateway GID. All HCAs receive the request, but only the adapter assigned to handle the relevant range of GIDs corresponding to the sending node IP address will send back a response to the ARP request. When the originating node receives the gateway GID, it sends a path query to the subnet manager (SM) to determine the gateway local identifier (LID), and the communication flow is then performed as usual.
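The even spread of gateway GIDs and the choice of which port answers an ARP request can be sketched as follows. This is a minimal illustration, not the MLNX-GW implementation: the text does not say how a host IP address maps to a GID, so the modulo mapping in `gid_index_for_host`, the contiguous ranges, and the port names are all assumptions made for the example.

```python
from ipaddress import IPv4Address

NUM_GATEWAY_GIDS = 254  # figure taken from the text


def spread_gids(ports, num_gids=NUM_GATEWAY_GIDS):
    """Split GID indices 0..num_gids-1 into near-equal contiguous ranges, one per port."""
    if not ports:
        raise ValueError("at least one InfiniBand port is required")
    base, extra = divmod(num_gids, len(ports))
    assignment = {}
    start = 0
    for i, port in enumerate(ports):
        count = base + (1 if i < extra else 0)  # the first `extra` ports get one more GID
        assignment[port] = range(start, start + count)
        start += count
    return assignment


def gid_index_for_host(ip, num_gids=NUM_GATEWAY_GIDS):
    """ASSUMPTION: map a host IPv4 address to a GID index with a simple modulo."""
    return int(IPv4Address(ip)) % num_gids


def responding_port(assignment, ip):
    """Return the only port that should answer the ARP request sent by `ip`."""
    idx = gid_index_for_host(ip)
    for port, gid_range in assignment.items():
        if idx in gid_range:
            return port
    raise LookupError(f"no port owns GID index {idx}")


# Hypothetical port names for a 16-port appliance.
ports = [f"ib{i}" for i in range(16)]
assignment = spread_gids(ports)
owner = responding_port(assignment, "10.0.0.1")
```

With 16 ports, 254 GIDs do not divide evenly, so in this sketch 14 ports own 16 GIDs each and 2 ports own 15.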
The dynamic assignment of the 254 gateway GIDs is the basic element for the load balancing and high availability of the entire system. If a change to the gateway(s) configuration occurs—for example, if a cable is dropped, an Ethernet link is disabled, or an appliance is powered off—then the gateway GIDs are reassigned by the MLNX-GW operating system to other HCAs to be handled. From the end-node point of view, nothing has changed—the same GID and LID remain valid even when handled by a different HCA (on the same or different Skyway appliance).
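The reassignment step can be illustrated with a small sketch. The real redistribution policy used by MLNX-GW is not described in the text; the version below assumes a simple "give each orphaned GID to the least-loaded surviving HCA" rule, and the HCA names are hypothetical.

```python
def reassign_gids(assignment, failed_hca):
    """Return a new HCA -> GID list mapping with the failed HCA's GIDs moved to survivors.

    ASSUMPTION: each orphaned GID goes to whichever surviving HCA currently
    handles the fewest GIDs. GIDs already held by survivors are not moved.
    """
    survivors = {hca: list(gids) for hca, gids in assignment.items() if hca != failed_hca}
    if not survivors:
        raise RuntimeError("no surviving HCA can take over the gateway GIDs")
    for gid in assignment.get(failed_hca, []):
        target = min(survivors, key=lambda hca: len(survivors[hca]))
        survivors[target].append(gid)
    return survivors


# Hypothetical three-HCA example with nine GIDs.
before = {"hca0": [0, 1, 2], "hca1": [3, 4, 5], "hca2": [6, 7, 8]}
after = reassign_gids(before, "hca1")
```

Every GID that existed before the failure is still served afterwards, which is why the end node sees no change.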
High Availability (HA) Details
A gateway domain is a set of Skyway appliances sharing the same InfiniBand subnet.
The HA protocol runs individually on each Skyway appliance in the domain.
Skyway appliances which belong to the same domain share the same domain ID.
Possible gateway domain roles are as follows:
Master Gateway
Active Backup Gateway(s)
Non-Active Backup Gateway(s)
In each gateway domain there is a single Master Gateway.
The domain's Master Gateway is responsible for GID assignment, which is the basis of HA and load balancing.
Based on the GID assignment, each HCA is configured to know which ARP requests it should respond to and which host IP addresses it should pass traffic to.
Every domain member distributes its InfiniBand host list to the rest of the domain members.
To monitor the health of the Skyway’s domain members, each member sends unicast UDP "keepalive" messages to the Master, containing, among other things, the number of its active ports in the domain (that is, the number of active HCAs that can pass traffic). Skyway HA information (including keepalive statistics) will be reflected in the CLI.
If an Active Backup Gateway does not receive, within a predefined timeout, an advertisement confirming that the Master Gateway is functioning properly, it takes over as Master Gateway and informs the rest of the domain members of the role change.
To determine which gateway becomes Master, each gateway appliance uses a priority value. The value 0 (zero) is reserved for a gateway appliance to indicate that it is releasing its responsibility as the gateway Master. The range 1-255 is available to the gateway appliance. Higher values indicate higher priorities. The default value is 100 (decimal).
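The timeout check on the backup side can be sketched as a small monitor object. The class name, the timeout value, and the injectable clock are assumptions made for illustration; the text does not specify the actual timeout or message handling in MLNX-GW.

```python
import time


class MasterMonitor:
    """Tracks when the Master Gateway was last heard from (illustrative sketch)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s   # hypothetical timeout; the real value is not given
        self._clock = clock          # injectable so the logic can be tested without sleeping
        self._last_seen = clock()

    def on_advertisement(self):
        """Call whenever an advertisement from the Master Gateway arrives."""
        self._last_seen = self._clock()

    def master_expired(self):
        """True when no advertisement has arrived within the timeout."""
        return self._clock() - self._last_seen > self.timeout_s


# Usage with a fake clock, so the example runs instantly.
now = [0.0]
monitor = MasterMonitor(timeout_s=3.0, clock=lambda: now[0])
now[0] = 2.0
monitor.on_advertisement()       # Master heard from at t = 2.0
now[0] = 4.5
still_alive = not monitor.master_expired()
now[0] = 5.1
should_take_over = monitor.master_expired()
```

An Active Backup Gateway would take over as Master once `master_expired()` returns true, and then announce the role change to the other domain members.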
If two gateway appliances share the same priority, the one with the higher system GUID (Globally Unique Identifier) is considered to have the higher priority and becomes the new domain Master Gateway.
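The selection rules above (highest priority wins, priority 0 means the appliance is releasing the Master role, ties are broken by the higher system GUID) can be expressed compactly. The `Gateway` type, the appliance names, and the GUID values are hypothetical; this is a sketch of the stated rules, not the MLNX-GW code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

DEFAULT_PRIORITY = 100  # default stated in the text


@dataclass(frozen=True)
class Gateway:
    name: str
    system_guid: int
    priority: int = DEFAULT_PRIORITY


def elect_master(gateways: Iterable[Gateway]) -> Optional[Gateway]:
    """Pick the Master: highest priority in 1-255, ties broken by higher system GUID."""
    candidates = [g for g in gateways if 1 <= g.priority <= 255]  # 0 = releasing Master role
    if not candidates:
        return None
    return max(candidates, key=lambda g: (g.priority, g.system_guid))


# Hypothetical domain: two appliances at the default priority, one releasing the role.
domain = [
    Gateway("gw-a", system_guid=0x0002C90300A1B2C0),
    Gateway("gw-b", system_guid=0x0002C90300A1B2D0),
    Gateway("gw-c", system_guid=0x0002C90300FFFFFF, priority=0),
]
master = elect_master(domain)
```

Here `gw-b` is selected: `gw-c` is excluded because its priority is 0, and `gw-b` has a higher system GUID than `gw-a` at equal priority.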