
Scope

This Reference Deployment Guide (RDG) describes a practical, scalable Red Hat OpenStack deployment suitable for high-performance workloads.

The deployment utilizes a single physical network fabric based on NVIDIA high-speed NICs and switches.

Abbreviations and Acronyms

| Term | Definition | Term | Definition |
|------|------------|------|------------|
| AI | Artificial Intelligence | MLAG | Multi-Chassis Link Aggregation |
| ASAP2 | Accelerated Switching and Packet Processing® | MLNX_OFED | NVIDIA OpenFabrics Enterprise Distribution for Linux (network driver) |
| BGP | Border Gateway Protocol | NFV | Network Functions Virtualization |
| BOM | Bill of Materials | NIC | Network Interface Card |
| CPU | Central Processing Unit | OS | Operating System |
| CUDA | Compute Unified Device Architecture | OVS | Open vSwitch |
| DHCP | Dynamic Host Configuration Protocol | RDG | Reference Deployment Guide |
| DPDK | Data Plane Development Kit | RDMA | Remote Direct Memory Access |
| DVR | Distributed Virtual Routing | RHEL | Red Hat Enterprise Linux |
| ECMP | Equal Cost Multi-Pathing | RH-OSP | Red Hat OpenStack Platform |
| FW | Firmware | RoCE | RDMA over Converged Ethernet |
| GPU | Graphics Processing Unit | SDN | Software Defined Networking |
| HA | High Availability | SR-IOV | Single Root Input/Output Virtualization |
| IP | Internet Protocol | VF | Virtual Function |
| IPMI | Intelligent Platform Management Interface | VF-LAG | Virtual Function Link Aggregation |
| L3 | IP Network Layer 3 | VLAN | Virtual LAN |
| LACP | Link Aggregation Control Protocol | VM | Virtual Machine |
| MGMT | Management | VNF | Virtualized Network Function |
| ML2 | Modular Layer 2 OpenStack Plugin | | |

Introduction

This document demonstrates the deployment of a large-scale OpenStack cloud over a single high-speed fabric.

The fabric provides the cloud with a mix of L3-routed networks and L2-stretched EVPN networks:

L2-stretched networks are used for the "Deployment/Provisioning" network (as they greatly simplify DHCP and PXE operations) and for the "External" network (as they allow attaching a single external network segment to the cluster, typically a subnet that has real Internet addressing).

Red Hat OpenStack Platform (RH-OSP) is a cloud computing solution that enables the creation, deployment, scaling, and management of a secure and reliable public or private OpenStack-based cloud. This production-ready platform offers tight integration with NVIDIA networking and data processing technologies.

The solution demonstrated in this article can be easily applied to diverse use cases, such as core or edge computing, with hardware accelerated packet and data processing for NFV, Big Data, and AI workloads over IP, DPDK, and RoCE stacks.


RH-OSP16.1 Release Notes

Red Hat OpenStack Platform 16.1 supports offloading the OVS switching function to SmartNIC hardware. This enhancement reduces the processing resources required and accelerates the data path. In Red Hat OpenStack Platform 16.1, this feature has graduated from Technology Preview and is now fully supported.
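For reference, the OVS setting underlying this feature can be sketched as follows. This is an illustrative sketch only; in an RH-OSP deployment the setting is applied through director/TripleO templates rather than by hand on the compute node.

```shell
# Illustrative sketch - in RH-OSP this is normally rendered by director
# templates, not configured manually on the node.
# Enable OVS hardware offload on a compute node:
sudo ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
sudo systemctl restart openvswitch

# Confirm the setting took effect (expected value: "true"):
sudo ovs-vsctl get Open_vSwitch . other_config:hw-offload
```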

Downloadable Content

All configuration files used in this article can be found here: 47036708-1.0.0.zip

References

Red Hat OpenStack Platform 16.1 Installation Guide

Red Hat Openstack Platform 16.1 Spine Leaf Networking

QSG: High Availability with ASAP2 Enhanced SR-IOV (VF-LAG).

Solution Architecture

Key Components and Technologies

  • ConnectX®-6 Dx is a member of the world-class, award-winning ConnectX series of network adapters. ConnectX-6 Dx delivers two ports of 10/25/40/50/100Gb/s or a single port of 200Gb/s Ethernet connectivity, paired with best-in-class hardware capabilities that accelerate and secure cloud and data center workloads.
  • NVIDIA Spectrum™ Ethernet Switch product family includes a broad portfolio of top-of-rack and aggregation switches that can be deployed in layer-2 and layer-3 cloud designs, in overlay-based virtualized networks, or as part of high-performance, mission-critical Ethernet storage fabrics.
  • LinkX® product family of cables and transceivers provides the industry’s most complete line of 10, 25, 40, 50, 100, 200, and 400GbE Ethernet and EDR, HDR, and NDR InfiniBand products for cloud, HPC, Web 2.0, enterprise, telco, storage, and artificial intelligence applications. LinkX cables and transceivers are often used to link top-of-rack switches downward to network adapters in GPU and CPU servers and storage appliances, and upward in switch-to-switch applications throughout the network infrastructure.
  • Cumulus Linux is the world’s most robust open networking operating system. It includes a comprehensive list of advanced, modern networking features and is built for scale.
  • Red Hat OpenStack Platform is a cloud computing platform that virtualizes resources from industry-standard hardware, organizes those resources into clouds, and manages them so users can access what they need, when they need it.

Logical Design

A typical OpenStack deployment uses several infrastructure networks, as portrayed in the following diagram:


A straightforward approach would be to use a separate physical network for each infrastructure network above, but this is not practical. Most deployments converge several networks onto one physical fabric, typically using one or more 1/10GbE fabrics and sometimes an additional high-speed fabric (25/100/200GbE).

In this document, we demonstrate a deployment that converges all of the networks below onto a single 100/200GbE high-speed fabric.

Network Fabric Design


Note

  • In this network design, Compute Nodes are connected to the External network as required for Distributed Virtual Routing (DVR) configuration. For more information, please refer to Red Hat Openstack Platform DVR.
  • Routed Spine-Leaf networking architecture is used in this RDG for high speed Control and Data networks. However, the provisioning and External network use L2-Stretched EVPN networks. For more information, please see Red Hat Openstack Platform Spine Leaf Networking.

The deployment includes two separate networking layers: 

  1. High-speed Ethernet fabric
  2. IPMI and switch management (not covered in this document)

The high-speed fabric described in this document contains the following building blocks:

  • Routed Spine-Leaf networking architecture
  • 2 x MSN3700C Spine switches
  • L3 BGP unnumbered with ECMP is configured on the Leaf and Spine switches to allow multipath routing between racks located on different L3 network segments
  • 2 x MSN2100 Leaf switches per rack in MLAG topology
  • 2 x 100GbE ports for MLAG peerlink between Leaf pairs
  • L3 VRR VLAN interfaces on a Leaf pair that is used as the default gateway for the host servers VLAN interfaces
  • VXLAN VTEPs with BGP EVPN control plane for creating stretch L2 networks for provisioning and for the external network
  • Host servers with 2 x 100GbE ports configured with LACP Active-Active bonding with multiple VLAN interfaces
  • The entire fabric is configured to support Jumbo Frames (MTU=9000)
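On the host side of this design, the LACP bond with VLAN interfaces can be sketched with iproute2 as shown below. The interface names, VLAN ID, and address are illustrative assumptions; in the actual deployment this configuration is rendered by the overcloud network templates, not created manually.

```shell
# Illustrative sketch - interface names, VLAN ID, and IP address are
# assumptions; the real host configuration is generated by the overcloud
# network templates.

# Create an 802.3ad (LACP) bond from the two 100GbE ports:
ip link add bond0 type bond mode 802.3ad miimon 100 lacp_rate fast
ip link set enp57s0f0 down && ip link set enp57s0f0 master bond0
ip link set enp57s0f1 down && ip link set enp57s0f1 master bond0
ip link set bond0 up mtu 9000   # fabric supports Jumbo Frames

# Add a VLAN interface on top of the bond (e.g., VLAN 10, Internal_API):
ip link add link bond0 name bond0.10 type vlan id 10
ip addr add 172.16.0.10/24 dev bond0.10
ip link set bond0.10 up
```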


This document demonstrates a minimal scale of two racks with one compute node each. Using the same design, however, the fabric can be scaled to accommodate up to 236 compute nodes with 2x100GbE connectivity and a non-blocking fabric.

This is a diagram demonstrating the maximum possible scale for a non-blocking deployment that uses 2x100GbE to the hosts (16 racks, 15 servers each using 15 spines and 32 leafs):

Host Accelerated Bonding Logical Design

To use MLAG-based network high availability, the hosts must have two high-speed interfaces bonded together with an LACP bond.

In the solution described in this article, enhanced SR-IOV with bonding support (ASAP2 VF-LAG) is used to offload network processing from the host and VM into the network adapter hardware, while providing fast data plane with high availability functionality. 

Two Virtual Functions, each on a different physical port, are bonded and allocated to the VM as a single LAGed VF. The bonded interface is connected to a single or multiple ToR switches, using Active-Standby or Active-Active bond modes.

For additional information, refer to QSG: High Availability with ASAP2 Enhanced SR-IOV (VF-LAG).
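The host-side prerequisites for VF-LAG can be sketched as follows. The PCI addresses, interface names, and VF count are illustrative assumptions; in RH-OSP the equivalent settings are applied through director templates during overcloud deployment.

```shell
# Illustrative sketch of VF-LAG prerequisites - PCI addresses, interface
# names, and VF counts are assumptions; RH-OSP applies the equivalent
# configuration via director templates.

# Create SR-IOV Virtual Functions on both physical ports:
echo 8 > /sys/class/net/enp57s0f0/device/sriov_numvfs
echo 8 > /sys/class/net/enp57s0f1/device/sriov_numvfs

# Move both ports' eSwitch into switchdev mode (required for OVS offload):
devlink dev eswitch set pci/0000:39:00.0 mode switchdev
devlink dev eswitch set pci/0000:39:00.1 mode switchdev

# With both ports in switchdev mode and enslaved to the same LACP bond,
# the driver activates VF-LAG, so a VF bonded across both ports survives
# a single link or ToR failure.
```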


Host and Application Logical Design

Compute host components:

  • ConnectX-6 Dx high-speed NIC with dual physical ports, configured with LACP bonding in an MLAG topology and providing VF-LAG redundancy to the VM
  • Storage Drives for local OS usage
  • RHEL as a base OS
  • Red Hat OpenStack Platform containerized software stack with the following:
    • KVM-based hypervisor
    • Openvswitch (OVS) with hardware offload support
    • Distributed Virtual Routing (DVR) configuration

Virtual Machine components:

  • CentOS 7.x/8.x as a base OS
  • SR-IOV VF allocated using PCI passthrough, which bypasses the compute server's hypervisor
  • perftest-tools: a performance and benchmark testing tool set


Software Stack Components

Bill of Materials

Deployment and Configuration

Spines

| Hostname | Router ID | Autonomous System | Downlinks |
|----------|-----------|-------------------|-----------|
| spine1 | 10.10.10.101/32 | 65199 | swp1-4 |
| spine2 | 10.10.10.102/32 | 65199 | swp1-4 |

Leafs

| Rack | Hostname | Router ID | Autonomous System | Uplinks | ISL Ports | CLAG System MAC | CLAG Priority | VXLAN Anycast IP |
|------|----------|-----------|-------------------|---------|-----------|-----------------|---------------|------------------|
| 1 | leaf1a | 10.10.10.1/32 | 65100 | swp13-14 | swp15-16 | 44:38:39:BE:EF:AA | 1000 | 10.0.0.10 |
| 1 | leaf1b | 10.10.10.2/32 | 65100 | swp13-14 | swp15-16 | 44:38:39:BE:EF:AA | 32768 | 10.0.0.10 |
| 2 | leaf2a | 10.10.10.3/32 | 65101 | swp13-14 | swp15-16 | 44:38:39:BE:EF:BB | 1000 | 10.0.0.20 |
| 2 | leaf2b | 10.10.10.4/32 | 65101 | swp13-14 | swp15-16 | 44:38:39:BE:EF:BB | 32768 | 10.0.0.20 |
L3-Routed VLANs

| VLAN ID | Virtual MAC | Virtual IP | Primary Router IP | Secondary Router IP | Purpose |
|---------|-------------|------------|-------------------|---------------------|---------|
| 10 | 00:00:5E:00:01:00 | 172.16.0.254/24 | 172.16.0.252/24 | 172.16.0.253/24 | Internal_API |
| 20 | 00:00:5E:00:01:00 | 172.17.0.254/24 | 172.17.0.252/24 | 172.17.0.253/24 | Storage |
| 30 | 00:00:5E:00:01:00 | 172.18.0.254/24 | 172.18.0.252/24 | 172.18.0.253/24 | Storage Mgmt |
| 40 | 00:00:5E:00:01:00 | 172.19.0.254/24 | 172.19.0.252/24 | 172.19.0.253/24 | Tenant |
| 11 | 00:00:5E:00:01:01 | 172.16.1.254/24 | 172.16.1.252/24 | 172.16.1.253/24 | Internal_API |
| 21 | 00:00:5E:00:01:01 | 172.17.1.254/24 | 172.17.1.252/24 | 172.17.1.253/24 | Storage |
| 31 | 00:00:5E:00:01:01 | 172.18.1.254/24 | 172.18.1.252/24 | 172.18.1.253/24 | Storage Mgmt |
| 41 | 00:00:5E:00:01:01 | 172.19.1.254/24 | 172.19.1.252/24 | 172.19.1.253/24 | Tenant |
L2-Stretched VLANs (EVPN)

| VLAN ID | VNI | Used Subnet | Purpose |
|---------|-----|-------------|---------|
| 50 | 10050 | 192.168.24.0/24 | Provisioning (subnet generated by the undercloud node) |
| 60 | 10060 | 172.60.0.0/24 | External/Internet access. The undercloud node has an address in this subnet (172.60.0.1), and its default gateway is 172.60.0.254 |

Server Ports

| Rack | VLAN ID | Access Ports | Trunk Ports | Network Purpose |
|------|---------|--------------|-------------|-----------------|
| 1 | 10 | | swp1-5 | Internal API |
| 1 | 20 | | swp1-5 | Storage |
| 1 | 30 | | swp1-5 | Storage Mgmt |
| 1 | 40 | | swp1-5 | Tenant |
| 1 | 50 | swp1-5 | | Provisioning/Mgmt (Undercloud-Overcloud) |
| 1 | 60 | | swp1-5 | External |
| 2 | 11 | | swp1 | Internal API |
| 2 | 21 | | swp1 | Storage |
| 2 | 31 | | swp1 | Storage Mgmt |
| 2 | 41 | | swp1 | Tenant |
| 2 | 50 | swp1 | | Provisioning/Mgmt (Undercloud-Overcloud) |
| 2 | 60 | | swp1 | External |



Server Wiring

Rack1

  • Director Node (Undercloud): ens2f0 → Leaf1A, swp1; ens2f1 → Leaf1B, swp1
  • Controller-1: ens2f0 → Leaf1A, swp2; ens2f1 → Leaf1B, swp2
  • Controller-2: ens2f0 → Leaf1A, swp3; ens2f1 → Leaf1B, swp3
  • Controller-3: ens2f0 → Leaf1A, swp4; ens2f1 → Leaf1B, swp4
  • Compute-1: enp57s0f0 → Leaf1A, swp5; enp57s0f1 → Leaf1B, swp5

Rack2

  • Compute-2: enp57s0f0 → Leaf2A, swp1; enp57s0f1 → Leaf2B, swp1

Wiring

The wiring principle for the high-speed Ethernet fabric is as follows:

  • Each server in the racks is wired to two leaf (or "ToR") switches
  • Leaf switches are interconnected using two ports (to create an MLAG)
  • Every leaf is wired to all the spines



Below is the full wiring diagram for the demonstrated fabric:

Network/Fabric

Updating Cumulus Linux

As a best practice, make sure to use the latest released Cumulus Linux NOS version.

Please see this guide on how to upgrade Cumulus Linux.
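One common upgrade path is a package upgrade from the switch shell, sketched below; a full binary image install is the alternative, and the upgrade guide linked above explains which of the two applies to your situation.

```shell
# Package-upgrade path for Cumulus Linux (run on the switch).
# A binary image install may be required instead - consult the upgrade guide.
sudo apt-get update
sudo apt-get upgrade
sudo reboot   # reboot to load the upgraded components
```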

Configuring the Cumulus Linux switch

Make sure your Cumulus Linux switch has passed its initial configuration stages (for additional information, see the Quick-Start Guide for version 4.4):

  1. License installation
  2. Creation of switch interfaces (e.g., swp1-32)

The configuration of the spine switches includes the following:

  1. Routing & BGP
  2. EVPN
  3. Downlinks configuration

The configuration of the leaf switches includes the following:

  1. Routing & BGP
  2. EVPN
  3. Uplinks configuration
  4. MLAG
  5. L3-Routed VLANs
  6. L2-Stretched VLANs
  7. Bonds configuration

Following are the configuration commands:

  • The configuration below does not include the connection of an external router/gateway for external/Internet connectivity
  • The "external" network (stretched L2 over VLAN 60) is assumed to have a gateway (at 172.60.0.254) and to provide Internet connectivity for the cluster
  • The director (undercloud) node has an interface on VLAN 60 (bond0.60 with address 172.60.0.1)
Spine1 Console
net add loopback lo ip address 10.10.10.101/32    
net add routing defaults datacenter
net add routing log syslog informational
net add routing service integrated-vtysh-config
net add bgp autonomous-system 65199
net add bgp router-id 10.10.10.101   
net add bgp neighbor underlay peer-group
net add bgp neighbor underlay remote-as external
net add interface swp1 mtu 9216
net add bgp neighbor swp1 interface peer-group underlay
net add interface swp2 mtu 9216
net add bgp neighbor swp2 interface peer-group underlay
net add interface swp3 mtu 9216
net add bgp neighbor swp3 interface peer-group underlay
net add interface swp4 mtu 9216
net add bgp neighbor swp4 interface peer-group underlay
net add bgp ipv4 unicast redistribute connected
net add bgp ipv6 unicast neighbor underlay activate
net add bgp l2vpn evpn  neighbor underlay activate
net add bgp l2vpn evpn  advertise-all-vni
net commit
Spine2 Console
net add loopback lo ip address 10.10.10.102/32    
net add routing defaults datacenter
net add routing log syslog informational
net add routing service integrated-vtysh-config
net add bgp autonomous-system 65199
net add bgp router-id 10.10.10.102   
net add bgp neighbor underlay peer-group
net add bgp neighbor underlay remote-as external
net add interface swp1 mtu 9216
net add bgp neighbor swp1 interface peer-group underlay
net add interface swp2 mtu 9216
net add bgp neighbor swp2 interface peer-group underlay
net add interface swp3 mtu 9216
net add bgp neighbor swp3 interface peer-group underlay
net add interface swp4 mtu 9216
net add bgp neighbor swp4 interface peer-group underlay
net add bgp ipv4 unicast redistribute connected
net add bgp ipv6 unicast neighbor underlay activate
net add bgp l2vpn evpn  neighbor underlay activate
net add bgp l2vpn evpn  advertise-all-vni
net commit
Leaf1A Console
net add interface swp15,swp16 mtu 9216
net add bond peerlink bond slaves swp15,swp16
net add interface swp1 mtu 9216
net add bond bond1 bond slaves swp1
net add bond bond1 clag id 1
net add interface swp2 mtu 9216
net add bond bond2 bond slaves swp2
net add bond bond2 clag id 2
net add interface swp3 mtu 9216
net add bond bond3 bond slaves swp3
net add bond bond3 clag id 3
net add interface swp4 mtu 9216
net add bond bond4 bond slaves swp4
net add bond bond4 clag id 4
net add interface swp5 mtu 9216
net add bond bond5 bond slaves swp5
net add bond bond5 clag id 5
net add bond bond1,bond2,bond3,bond4,bond5 bond lacp-bypass-allow
net add bond bond1,bond2,bond3,bond4,bond5 stp bpduguard
net add bond bond1,bond2,bond3,bond4,bond5 stp portadminedge
net add bridge bridge ports bond1,bond2,bond3,bond4,bond5,peerlink
net add bridge bridge vlan-aware
net add bgp autonomous-system 65100    
net add routing defaults datacenter
net add routing log syslog informational
net add routing service integrated-vtysh-config
net add bgp router-id 10.10.10.1
net add bgp bestpath as-path multipath-relax
net add bgp neighbor underlay peer-group
net add bgp neighbor underlay remote-as external
net add bgp neighbor peerlink.4094 interface remote-as internal
net add interface swp13 mtu 9216
net add bgp neighbor swp13 interface peer-group underlay
net add interface swp14 mtu 9216
net add bgp neighbor swp14 interface peer-group underlay
net add bgp ipv4 unicast redistribute connected
net add loopback lo ip address 10.10.10.1/32   
net add interface peerlink.4094 clag backup-ip 10.10.10.2  
net add interface peerlink.4094 clag priority 1000  
net add interface peerlink.4094 clag sys-mac 44:38:39:BE:EF:AA   
net add interface peerlink.4094 clag peer-ip linklocal
net add interface peerlink.4094 clag args --initDelay 10
net add bridge bridge vids 10  
net add vlan 10 ip address 172.16.0.252/24
net add vlan 10 ip address-virtual 00:00:5E:00:01:00 172.16.0.254/24    
net add vlan 10 vlan-id 10    
net add vlan 10 vlan-raw-device bridge
net add bridge bridge vids 20  
net add vlan 20 ip address 172.17.0.252/24
net add vlan 20 ip address-virtual 00:00:5E:00:01:00 172.17.0.254/24    
net add vlan 20 vlan-id 20    
net add vlan 20 vlan-raw-device bridge
net add bridge bridge vids 30  
net add vlan 30 ip address 172.18.0.252/24
net add vlan 30 ip address-virtual 00:00:5E:00:01:00 172.18.0.254/24    
net add vlan 30 vlan-id 30    
net add vlan 30 vlan-raw-device bridge
net add bridge bridge vids 40  
net add vlan 40 ip address 172.19.0.252/24
net add vlan 40 ip address-virtual 00:00:5E:00:01:00 172.19.0.254/24    
net add vlan 40 vlan-id 40    
net add vlan 40 vlan-raw-device bridge
net add bgp l2vpn evpn  neighbor underlay activate
net add bgp l2vpn evpn  neighbor peerlink.4094 activate
net add bgp l2vpn evpn  advertise-all-vni
net add bgp l2vpn evpn  advertise-default-gw
net add bgp l2vpn evpn  advertise ipv4 unicast
net add bgp ipv6 unicast neighbor underlay activate
net add loopback lo clag vxlan-anycast-ip 10.0.0.10
net add loopback lo vxlan local-tunnelip 10.10.10.1
net add vlan 60 vlan-id 60
net add vlan 60 vlan-raw-device bridge
net add vlan 50 vlan-id 50
net add vlan 50 vlan-raw-device bridge
net add vxlan vtep60 vxlan id 10060
net add vxlan vtep50 vxlan id 10050
net add vxlan vtep60,50 bridge arp-nd-suppress on
net add vxlan vtep60,50 bridge learning off
net add vxlan vtep60,50 stp bpduguard
net add vxlan vtep60,50 stp portbpdufilter
net add vxlan vtep60,50 vxlan local-tunnelip 10.10.10.1
net add vxlan vtep60 bridge access 60
net add vxlan vtep50 bridge access 50
net add bridge bridge ports vtep60,vtep50
net add bridge bridge vids 60,50
net add bond bond1-5 bridge pvid 50
net commit
Leaf1B Console
net add interface swp15,swp16 mtu 9216
net add bond peerlink bond slaves swp15,swp16
net add interface swp1 mtu 9216
net add bond bond1 bond slaves swp1
net add bond bond1 clag id 1
net add interface swp2 mtu 9216
net add bond bond2 bond slaves swp2
net add bond bond2 clag id 2
net add interface swp3 mtu 9216
net add bond bond3 bond slaves swp3
net add bond bond3 clag id 3
net add interface swp4 mtu 9216
net add bond bond4 bond slaves swp4
net add bond bond4 clag id 4
net add interface swp5 mtu 9216
net add bond bond5 bond slaves swp5
net add bond bond5 clag id 5
net add bond bond1,bond2,bond3,bond4,bond5 bond lacp-bypass-allow
net add bond bond1,bond2,bond3,bond4,bond5 stp bpduguard
net add bond bond1,bond2,bond3,bond4,bond5 stp portadminedge
net add bridge bridge ports bond1,bond2,bond3,bond4,bond5,peerlink
net add bridge bridge vlan-aware
net add bgp autonomous-system 65100    
net add routing defaults datacenter
net add routing log syslog informational
net add routing service integrated-vtysh-config
net add bgp router-id 10.10.10.2
net add bgp bestpath as-path multipath-relax
net add bgp neighbor underlay peer-group
net add bgp neighbor underlay remote-as external
net add bgp neighbor peerlink.4094 interface remote-as internal
net add interface swp13 mtu 9216
net add bgp neighbor swp13 interface peer-group underlay
net add interface swp14 mtu 9216
net add bgp neighbor swp14 interface peer-group underlay
net add bgp ipv4 unicast redistribute connected
net add loopback lo ip address 10.10.10.2/32   
net add interface peerlink.4094 clag backup-ip 10.10.10.1  
net add interface peerlink.4094 clag priority 32768  
net add interface peerlink.4094 clag sys-mac 44:38:39:BE:EF:AA   
net add interface peerlink.4094 clag peer-ip linklocal
net add interface peerlink.4094 clag args --initDelay 10
net add bridge bridge vids 10  
net add vlan 10 ip address 172.16.0.253/24
net add vlan 10 ip address-virtual 00:00:5E:00:01:00 172.16.0.254/24    
net add vlan 10 vlan-id 10    
net add vlan 10 vlan-raw-device bridge
net add bridge bridge vids 20  
net add vlan 20 ip address 172.17.0.253/24
net add vlan 20 ip address-virtual 00:00:5E:00:01:00 172.17.0.254/24    
net add vlan 20 vlan-id 20    
net add vlan 20 vlan-raw-device bridge
net add bridge bridge vids 30  
net add vlan 30 ip address 172.18.0.253/24
net add vlan 30 ip address-virtual 00:00:5E:00:01:00 172.18.0.254/24    
net add vlan 30 vlan-id 30    
net add vlan 30 vlan-raw-device bridge
net add bridge bridge vids 40  
net add vlan 40 ip address 172.19.0.253/24
net add vlan 40 ip address-virtual 00:00:5E:00:01:00 172.19.0.254/24    
net add vlan 40 vlan-id 40    
net add vlan 40 vlan-raw-device bridge
net add bgp l2vpn evpn  neighbor underlay activate
net add bgp l2vpn evpn  neighbor peerlink.4094 activate
net add bgp l2vpn evpn  advertise-all-vni
net add bgp l2vpn evpn  advertise-default-gw
net add bgp l2vpn evpn  advertise ipv4 unicast
net add bgp ipv6 unicast neighbor underlay activate
net add loopback lo clag vxlan-anycast-ip 10.0.0.10
net add loopback lo vxlan local-tunnelip 10.10.10.2
net add vlan 60 vlan-id 60
net add vlan 60 vlan-raw-device bridge
net add vlan 50 vlan-id 50
net add vlan 50 vlan-raw-device bridge
net add vxlan vtep60 vxlan id 10060
net add vxlan vtep50 vxlan id 10050
net add vxlan vtep60,50 bridge arp-nd-suppress on
net add vxlan vtep60,50 bridge learning off
net add vxlan vtep60,50 stp bpduguard
net add vxlan vtep60,50 stp portbpdufilter
net add vxlan vtep60,50 vxlan local-tunnelip 10.10.10.2
net add vxlan vtep60 bridge access 60
net add vxlan vtep50 bridge access 50
net add bridge bridge ports vtep60,vtep50
net add bridge bridge vids 60,50
net add bond bond1-5 bridge pvid 50
net commit
Leaf2A Console
net add interface swp15,swp16 mtu 9216
net add bond peerlink bond slaves swp15,swp16
net add interface swp1 mtu 9216
net add bond bond1 bond slaves swp1
net add bond bond1 clag id 1
net add bond bond1 bond lacp-bypass-allow
net add bond bond1 stp bpduguard
net add bond bond1 stp portadminedge
net add bridge bridge ports bond1,peerlink
net add bridge bridge vlan-aware
net add bgp autonomous-system 65101    
net add routing defaults datacenter
net add routing log syslog informational
net add routing service integrated-vtysh-config
net add bgp router-id 10.10.10.3
net add bgp bestpath as-path multipath-relax
net add bgp neighbor underlay peer-group
net add bgp neighbor underlay remote-as external
net add bgp neighbor peerlink.4094 interface remote-as internal
net add interface swp13 mtu 9216
net add bgp neighbor swp13 interface peer-group underlay
net add interface swp14 mtu 9216
net add bgp neighbor swp14 interface peer-group underlay
net add bgp ipv4 unicast redistribute connected
net add loopback lo ip address 10.10.10.3/32   
net add interface peerlink.4094 clag backup-ip 10.10.10.4  
net add interface peerlink.4094 clag priority 1000  
net add interface peerlink.4094 clag sys-mac 44:38:39:BE:EF:BB   
net add interface peerlink.4094 clag peer-ip linklocal
net add interface peerlink.4094 clag args --initDelay 10
net add bridge bridge vids 11  
net add vlan 11 ip address 172.16.1.252/24
net add vlan 11 ip address-virtual 00:00:5E:00:01:01 172.16.1.254/24    
net add vlan 11 vlan-id 11    
net add vlan 11 vlan-raw-device bridge
net add bridge bridge vids 21  
net add vlan 21 ip address 172.17.1.252/24
net add vlan 21 ip address-virtual 00:00:5E:00:01:01 172.17.1.254/24    
net add vlan 21 vlan-id 21    
net add vlan 21 vlan-raw-device bridge
net add bridge bridge vids 31  
net add vlan 31 ip address 172.18.1.252/24
net add vlan 31 ip address-virtual 00:00:5E:00:01:01 172.18.1.254/24    
net add vlan 31 vlan-id 31    
net add vlan 31 vlan-raw-device bridge
net add bridge bridge vids 41  
net add vlan 41 ip address 172.19.1.252/24
net add vlan 41 ip address-virtual 00:00:5E:00:01:01 172.19.1.254/24    
net add vlan 41 vlan-id 41    
net add vlan 41 vlan-raw-device bridge
net add bgp l2vpn evpn  neighbor underlay activate
net add bgp l2vpn evpn  neighbor peerlink.4094 activate
net add bgp l2vpn evpn  advertise-all-vni
net add bgp l2vpn evpn  advertise-default-gw
net add bgp l2vpn evpn  advertise ipv4 unicast
net add bgp ipv6 unicast neighbor underlay activate
net add loopback lo clag vxlan-anycast-ip 10.0.0.20
net add loopback lo vxlan local-tunnelip 10.10.10.3
net add vlan 60 vlan-id 60
net add vlan 60 vlan-raw-device bridge
net add vlan 50 vlan-id 50
net add vlan 50 vlan-raw-device bridge
net add vxlan vtep60 vxlan id 10060
net add vxlan vtep50 vxlan id 10050
net add vxlan vtep60,50 bridge arp-nd-suppress on
net add vxlan vtep60,50 bridge learning off
net add vxlan vtep60,50 stp bpduguard
net add vxlan vtep60,50 stp portbpdufilter
net add vxlan vtep60,50 vxlan local-tunnelip 10.10.10.3
net add vxlan vtep60 bridge access 60
net add vxlan vtep50 bridge access 50
net add bridge bridge ports vtep60,vtep50
net add bridge bridge vids 60,50
net add bond bond1 bridge pvid 50
net commit
Leaf2B Console
net add interface swp15,swp16 mtu 9216
net add bond peerlink bond slaves swp15,swp16
net add interface swp1 mtu 9216
net add bond bond1 bond slaves swp1
net add bond bond1 clag id 1
net add bond bond1 bond lacp-bypass-allow
net add bond bond1 stp bpduguard
net add bond bond1 stp portadminedge
net add bridge bridge ports bond1,peerlink
net add bridge bridge vlan-aware
net add bgp autonomous-system 65101    
net add routing defaults datacenter
net add routing log syslog informational
net add routing service integrated-vtysh-config
net add bgp router-id 10.10.10.4
net add bgp bestpath as-path multipath-relax
net add bgp neighbor underlay peer-group
net add bgp neighbor underlay remote-as external
net add bgp neighbor peerlink.4094 interface remote-as internal
net add interface swp13 mtu 9216
net add bgp neighbor swp13 interface peer-group underlay
net add interface swp14 mtu 9216
net add bgp neighbor swp14 interface peer-group underlay
net add bgp ipv4 unicast redistribute connected
net add loopback lo ip address 10.10.10.4/32   
net add interface peerlink.4094 clag backup-ip 10.10.10.3  
net add interface peerlink.4094 clag priority 32768  
net add interface peerlink.4094 clag sys-mac 44:38:39:BE:EF:BB   
net add interface peerlink.4094 clag peer-ip linklocal
net add interface peerlink.4094 clag args --initDelay 10
net add bridge bridge vids 11  
net add vlan 11 ip address 172.16.1.253/24
net add vlan 11 ip address-virtual 00:00:5E:00:01:01 172.16.1.254/24    
net add vlan 11 vlan-id 11    
net add vlan 11 vlan-raw-device bridge
net add bridge bridge vids 21  
net add vlan 21 ip address 172.17.1.253/24
net add vlan 21 ip address-virtual 00:00:5E:00:01:01 172.17.1.254/24    
net add vlan 21 vlan-id 21    
net add vlan 21 vlan-raw-device bridge
net add bridge bridge vids 31  
net add vlan 31 ip address 172.18.1.253/24
net add vlan 31 ip address-virtual 00:00:5E:00:01:01 172.18.1.254/24    
net add vlan 31 vlan-id 31    
net add vlan 31 vlan-raw-device bridge
net add bridge bridge vids 41  
net add vlan 41 ip address 172.19.1.253/24
net add vlan 41 ip address-virtual 00:00:5E:00:01:01 172.19.1.254/24    
net add vlan 41 vlan-id 41    
net add vlan 41 vlan-raw-device bridge
net add bgp l2vpn evpn  neighbor underlay activate
net add bgp l2vpn evpn  neighbor peerlink.4094 activate
net add bgp l2vpn evpn  advertise-all-vni
net add bgp l2vpn evpn  advertise-default-gw
net add bgp l2vpn evpn  advertise ipv4 unicast
net add bgp ipv6 unicast neighbor underlay activate
net add loopback lo clag vxlan-anycast-ip 10.0.0.20
net add loopback lo vxlan local-tunnelip 10.10.10.4
net add vlan 60 vlan-id 60
net add vlan 60 vlan-raw-device bridge
net add vlan 50 vlan-id 50
net add vlan 50 vlan-raw-device bridge
net add vxlan vtep60 vxlan id 10060
net add vxlan vtep50 vxlan id 10050
net add vxlan vtep60,50 bridge arp-nd-suppress on
net add vxlan vtep60,50 bridge learning off
net add vxlan vtep60,50 stp bpduguard
net add vxlan vtep60,50 stp portbpdufilter
net add vxlan vtep60,50 vxlan local-tunnelip 10.10.10.4
net add vxlan vtep60 bridge access 60
net add vxlan vtep50 bridge access 50
net add bridge bridge ports vtep60,vtep50
net add bridge bridge vids 60,50
net add bond bond1 bridge pvid 50
net commit

Run the following verification commands on all the leafs:

Leaf Switch Console
cumulus@leaf1a:mgmt:~$ net show clag
The peer is alive
     Our Priority, ID, and Role: 1000 1c:34:da:b4:09:40 primary
    Peer Priority, ID, and Role: 32768 1c:34:da:b4:06:40 secondary
          Peer Interface and IP: peerlink.4094 fe80::1e34:daff:feb4:640 (linklocal)
               VxLAN Anycast IP: 10.0.0.10
                      Backup IP: 10.10.10.2 (active)
                     System MAC: 44:38:39:be:ef:aa

CLAG Interfaces
Our Interface      Peer Interface     CLAG Id   Conflicts              Proto-Down Reason
----------------   ----------------   -------   --------------------   -----------------
           bond1   bond1              1         -                      -              
           bond2   bond2              2         -                      -              
           bond3   bond3              3         -                      -              
           bond4   bond4              4         -                      -              
           bond5   bond5              5         -                      -              
          vtep50   vtep50             -         -                      -              
          vtep60   vtep60             -         -                      -     

cumulus@leaf1a:mgmt:~$ net show route
show ip route
=============
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure
C>* 10.0.0.10/32 is directly connected, lo, 01w5d21h
B>* 10.0.0.20/32 [20/0] via fe80::1e34:daff:feb3:ff6c, swp30, weight 1, 01w5d21h
  *                     via fe80::1e34:daff:feb4:6c, swp29, weight 1, 01w5d21h
C>* 10.10.10.1/32 is directly connected, lo, 01w6d19h
B>* 10.10.10.2/32 [200/0] via fe80::1e34:daff:feb4:640, peerlink.4094, weight 1, 01w5d21h
B>* 10.10.10.3/32 [20/0] via fe80::1e34:daff:feb3:ff6c, swp30, weight 1, 01w5d21h
  *                      via fe80::1e34:daff:feb4:6c, swp29, weight 1, 01w5d21h
B>* 10.10.10.4/32 [20/0] via fe80::1e34:daff:feb3:ff6c, swp30, weight 1, 01w5d21h
  *                      via fe80::1e34:daff:feb4:6c, swp29, weight 1, 01w5d21h
B>* 10.10.10.101/32 [20/0] via fe80::1e34:daff:feb4:6c, swp29, weight 1, 01w5d21h
B>* 10.10.10.102/32 [20/0] via fe80::1e34:daff:feb3:ff6c, swp30, weight 1, 01w5d21h
C * 172.16.0.0/24 [0/1024] is directly connected, vlan10-v0, 01w5d21h
C>* 172.16.0.0/24 is directly connected, vlan10, 01w5d21h
B>* 172.16.1.0/24 [20/0] via fe80::1e34:daff:feb3:ff6c, swp30, weight 1, 01w5d21h
  *                      via fe80::1e34:daff:feb4:6c, swp29, weight 1, 01w5d21h
C * 172.17.0.0/24 [0/1024] is directly connected, vlan20-v0, 01w5d21h
C>* 172.17.0.0/24 is directly connected, vlan20, 01w5d21h
B>* 172.17.1.0/24 [20/0] via fe80::1e34:daff:feb3:ff6c, swp30, weight 1, 01w5d21h
  *                      via fe80::1e34:daff:feb4:6c, swp29, weight 1, 01w5d21h
C * 172.18.0.0/24 [0/1024] is directly connected, vlan30-v0, 01w5d21h
C>* 172.18.0.0/24 is directly connected, vlan30, 01w5d21h
B>* 172.18.1.0/24 [20/0] via fe80::1e34:daff:feb3:ff6c, swp30, weight 1, 01w5d21h
  *                      via fe80::1e34:daff:feb4:6c, swp29, weight 1, 01w5d21h
C * 172.19.0.0/24 [0/1024] is directly connected, vlan40-v0, 01w5d21h
C>* 172.19.0.0/24 is directly connected, vlan40, 01w5d21h
B>* 172.19.1.0/24 [20/0] via fe80::1e34:daff:feb3:ff6c, swp30, weight 1, 01w5d21h
  *                      via fe80::1e34:daff:feb4:6c, swp29, weight 1, 01w5d21h



show ipv6 route
===============
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure
C * fe80::/64 is directly connected, vlan50, 01w5d21h
C * fe80::/64 is directly connected, vlan60, 01w5d21h
C * fe80::/64 is directly connected, vlan40, 01w5d21h
C * fe80::/64 is directly connected, vlan40-v0, 01w5d21h
C * fe80::/64 is directly connected, vlan30-v0, 01w6d19h
C * fe80::/64 is directly connected, vlan30, 01w6d19h
C * fe80::/64 is directly connected, vlan20-v0, 01w6d19h
C * fe80::/64 is directly connected, vlan20, 01w6d19h
C * fe80::/64 is directly connected, vlan10-v0, 01w6d19h
C * fe80::/64 is directly connected, vlan10, 01w6d19h
C * fe80::/64 is directly connected, bridge, 01w6d19h
C * fe80::/64 is directly connected, peerlink.4094, 01w6d19h
C * fe80::/64 is directly connected, swid0_eth, 01w6d19h
C * fe80::/64 is directly connected, swp30, 01w6d19h
C>* fe80::/64 is directly connected, swp29, 01w6d19h 

Host Preparation

In order to achieve optimal results in DPDK use cases, the compute nodes must be properly tuned for DPDK performance.

The tuning might require specific BIOS and NIC firmware settings. Please refer to the official DPDK performance document provided by your CPU vendor.

For our deployment we used the following document from AMD: https://www.amd.com/system/files/documents/epyc-7Fx2-processors-dpdk-nw-performance-brief.pdf

Please note that the host boot settings (Linux GRUB command line) are applied later on as part of the overcloud image configuration, and that the hugepages allocation is done on the actual VM used for testing.
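As an illustration of the boot settings mentioned above, a DPDK-oriented GRUB kernel command line typically enables IOMMU passthrough, 1 GB hugepages and CPU isolation. The values below are hypothetical and must be adjusted to your CPU count and NUMA topology:

```text
# Example /etc/default/grub kernel arguments (hypothetical core lists and hugepage count)
# On Intel systems, add intel_iommu=on as well
GRUB_CMDLINE_LINUX="... iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=2-19 nohz_full=2-19 rcu_nocbs=2-19"
```

In this RDG these parameters are applied through the overcloud image configuration rather than edited by hand on each node.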

OpenStack Deployment

The Red Hat OpenStack Platform (RHOSP) version 16.1 is used.

The deployment is divided into three major steps:

  1. Installing the director node
  2. Deploying the undercloud on the director node
  3. Deploying the overcloud using the undercloud

Make sure that the BIOS settings on the worker node servers have SR-IOV enabled and that the servers are tuned for maximum performance.

All nodes that belong to the same profile (e.g., controller, compute) must have the same PCIe placement for the NIC and must expose the same interface name.

The director node needs to have the following network interfaces:

  1. Untagged over bond0—used for provisioning the overcloud nodes (connected to stretched VLAN 50 on the switch)
  2. Tagged VLAN 60 over bond0.60—used for Internet access over the external network segment (connected to stretched VLAN 60 on the switch)
  3. An interface on the IPMI network—used for accessing the bare metal nodes' BMCs
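As a sketch, the first two director interfaces could be configured with NetworkManager as follows. The member NIC names (ens1f0/ens1f1) and the addresses are hypothetical placeholders for your environment:

```shell
# Hypothetical member NICs bonded with LACP (802.3ad)
nmcli con add type bond ifname bond0 con-name bond0 bond.options "mode=802.3ad"
nmcli con add type ethernet ifname ens1f0 master bond0
nmcli con add type ethernet ifname ens1f1 master bond0
# Untagged provisioning address on the bond (stretched VLAN 50 on the switch side)
nmcli con mod bond0 ipv4.method manual ipv4.addresses 192.168.24.1/24
# Tagged VLAN 60 for the external network segment
nmcli con add type vlan ifname bond0.60 dev bond0 id 60 ipv4.method manual ipv4.addresses 172.60.0.10/24
```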

Undercloud Director Installation 

  • Follow RH-OSP Preparing for Director Installation up to Preparing Container Images.

  • Use the following environment file for OVS-based RH-OSP 16.1 container preparation. Remember to update your Red Hat registry credentials:

/home/stack/containers-prepare-parameter.yaml
#global
parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    excludes:
      - ceph
      - prometheus
    set:
      name_prefix: openstack-
      name_suffix: ''
      namespace: registry.redhat.io/rhosp-rhel8
      neutron_driver: null
      rhel_containers: false
      tag: '16.1'
    tag_from_label: '{version}-{release}'
  ContainerImageRegistryCredentials:
    registry.redhat.io:
      '<username>': '<password>'
  • Use the following undercloud configuration file, adjusting addresses and interface names to your environment:

/home/stack/undercloud.conf
[DEFAULT]
undercloud_hostname = rhosp-director.localdomain
local_ip = 192.168.24.1/24
network_gateway = 192.168.24.1
undercloud_public_host = 192.168.24.2
undercloud_admin_host = 192.168.24.3
undercloud_nameservers = 8.8.8.8,8.8.4.4
undercloud_ntp_servers = 10.211.0.134,10.211.0.124
subnets = ctlplane-subnet
local_subnet = ctlplane-subnet
generate_service_certificate = True
certificate_generation_ca = local
local_interface = bond0
inspection_interface = br-ctlplane
undercloud_debug = true
enable_tempest = false
enable_telemetry = false
enable_validations = true
enable_novajoin = false
clean_nodes = true
custom_env_files = /home/stack/custom-undercloud-params.yaml,/home/stack/ironic_ipmitool.yaml
container_images_file = /home/stack/containers-prepare-parameter.yaml

[auth]

[ctlplane-subnet]
cidr = 192.168.24.0/24
dhcp_start = 192.168.24.5
dhcp_end = 192.168.24.30
inspection_iprange = 192.168.24.100,192.168.24.120
gateway = 192.168.24.1
masquerade = true
  • Follow the instructions in RH-OSP Obtain Images for Overcloud Nodes (section 4.9.1, steps 1–3), without importing the images into the director yet.
  • Once obtained, follow the customization process described in RH-OSP Working with Overcloud Images:
    1. Start from 3.3. QCOW: Installing virt-customize to director
    2. Skip 3.4
    3. Run 3.5. QCOW: Setting the root password (optional)
    4. Run 3.6. QCOW: Registering the image
    5. Run the following command to locate your subscription pool ID:

      $ sudo subscription-manager list --available --all --matches="Red Hat OpenStack"
    6. Replace [subscription-pool] in the below command with your relevant subscription pool ID:

      $ virt-customize --selinux-relabel -a overcloud-full.qcow2 --run-command 'subscription-manager attach --pool [subscription-pool]'
    7. Skip 3.7 and 3.8
    8. Run the following command to add mstflint to the overcloud image, allowing NIC firmware provisioning during overcloud deployment (similar to 3.9).

      $ virt-customize --selinux-relabel -a overcloud-full.qcow2 --install mstflint

      mstflint is required for the overcloud nodes to support the automatic NIC firmware upgrade by the cloud orchestration system during deployment.

    9. Run 3.10. QCOW: Cleaning the subscription pool
    10. Run 3.11. QCOW: Unregistering the image
    11. Run 3.12. QCOW: Reset the machine ID
    12. Run 3.13. Uploading the images to director

Undercloud Director Preparation for Automatic NIC Firmware Provisioning

  1. Download the latest ConnectX NIC firmware binary file (fw-<NIC-Model>.bin) from NVIDIA Networking Firmware Download Site.
  2. Create a directory named mlnx_fw under /var/lib/ironic/httpboot/ in the Director node, and place the firmware binary file in it.
  3. Extract the connectx_first_boot.yaml file from the configuration files attached to this guide, and place it in the /home/stack/templates/ directory on the Director node.

The connectx_first_boot.yaml file is called by another deployment configuration file (env-ovs-dvr.yaml), so please use the instructed location, or adjust the configuration files accordingly.

Overcloud Nodes Introspection

A full overcloud introspection procedure is described in RH-OSP Configuring a Basic Overcloud. In this RDG, the following configuration steps were used to introspect the overcloud bare metal nodes, which are later deployed over two routed Spine-Leaf racks:

  • Prepare a bare metal inventory file, instackenv.json, with the overcloud nodes' information. In this case, the inventory file lists 5 bare metal nodes to be deployed as overcloud nodes: 3 controller nodes and 2 compute nodes (1 in each routed rack). Make sure to update the file with the servers' IPMI addresses and credentials, and with the MAC address of one of the physical ports, to avoid issues in the introspection process:

    instackenv.json
    {
        "nodes": [
            {
                "name": "controller-1",
                "pm_type":"ipmi",
                "pm_user":"root",
                "pm_password":"******",
                "pm_addr":"10.7.214.1",
    			"ports":[
                    {
                        "address":"<MAC_ADDRESS>", 
                        "pxe_enabled":true
                    }
                ]
            },
            {
                "name": "controller-2",
                "pm_type":"ipmi",
                "pm_user":"root",
                "pm_password":"******",
                "pm_addr":"10.7.214.2",
    			"ports":[
                    {
                        "address":"<MAC_ADDRESS>", 
                        "pxe_enabled":true
                    }
                ]
            },
            {
                "name": "controller-3",
                "pm_type":"ipmi",
                "pm_user":"root",
                "pm_password":"******",
                "pm_addr":"10.7.214.3",
    			"ports":[
                    {
                        "address":"<MAC_ADDRESS>", 
                        "pxe_enabled":true
                    }
                ]
            },
            {
                "name": "compute-1",
                "pm_type":"ipmi",
                "pm_user":"root",
                "pm_password":"******",
                "pm_addr":"10.7.214.4",
    			"ports":[
                    {
                        "address":"<MAC_ADDRESS>", 
                        "pxe_enabled":true
                    }
                ]
            },
            {
                "name": "compute-2",
                "pm_type":"ipmi",
                "pm_user":"root",
                "pm_password":"******",
                "pm_addr":"10.7.214.5",
    			"ports":[
                    {
                        "address":"<MAC_ADDRESS>", 
                        "pxe_enabled":true
                    }
                ]
            }
        ]
    }
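Because a malformed inventory typically surfaces only at import time, a quick sanity check of the file can be sketched in Python. The required keys below mirror the fields used in this example, and the helper is hypothetical — it is not part of any OpenStack tooling:

```python
import json

REQUIRED_NODE_KEYS = {"name", "pm_type", "pm_user", "pm_password", "pm_addr"}

def validate_inventory(text: str) -> list:
    """Return a list of problems found in an instackenv.json document."""
    problems = []
    nodes = json.loads(text).get("nodes", [])
    if not nodes:
        problems.append("no nodes defined")
    for node in nodes:
        missing = REQUIRED_NODE_KEYS - node.keys()
        if missing:
            problems.append(f"{node.get('name', '?')}: missing {sorted(missing)}")
        for port in node.get("ports", []):
            if port.get("address") in (None, "", "<MAC_ADDRESS>"):
                problems.append(f"{node.get('name', '?')}: placeholder or empty port MAC")
    return problems

# A node that still carries the <MAC_ADDRESS> placeholder is flagged
doc = '{"nodes": [{"name": "controller-1", "pm_type": "ipmi", "pm_user": "root", "pm_password": "x", "pm_addr": "10.7.214.1", "ports": [{"address": "<MAC_ADDRESS>", "pxe_enabled": true}]}]}'
print(validate_inventory(doc))  # → ['controller-1: placeholder or empty port MAC']
```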
  • Import the overcloud bare metal node inventory, and wait until all nodes are listed in the "manageable" state.

    [stack@rhosp-director ~]$ source ~/stackrc
    (undercloud) [stack@rhosp-director ~]$ openstack overcloud node import /home/stack/instackenv.json
    $ openstack baremetal node list
    +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
    | UUID                                 | Name         | Instance UUID | Power State | Provisioning State | Maintenance |
    +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
    | 476c7659-abc2-4d8c-9532-1756abbfd18a | controller-1 | None          | power off   | manageable         | False       |
    | 3cbb74e5-6508-4ec8-91a8-870dbf28baed | controller-2 | None          | power off   | manageable         | False       |
    | 457b329e-f1bc-476a-996d-eb82a56998e8 | controller-3 | None          | power off   | manageable         | False       |
    | 870445b7-650f-40fc-8ac2-5c3df700ccdc | compute-1    | None          | power off   | manageable         | False       |
    | baa7356b-11ca-4cb0-b58c-16c110bbbea0 | compute-2    | None          | power off   | manageable         | False       |
    +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
  • Start the baremetal nodes introspection:

    $ openstack overcloud node introspect --all-manageable
  • Set the root device for deployment, and provide all bare metal nodes so they reach the "available" state:

    $ openstack overcloud node configure --all-manageable --instance-boot-option local --root-device largest
    $ openstack overcloud node provide --all-manageable
  • Tag the controller nodes into the "control" profile, which is later mapped to the overcloud controller role: 

    $ openstack baremetal node set --property capabilities='profile:control,boot_option:local' controller-1
    $ openstack baremetal node set --property capabilities='profile:control,boot_option:local' controller-2
    $ openstack baremetal node set --property capabilities='profile:control,boot_option:local' controller-3

    Note: The Role to Profile mapping is specified in the node-info.yaml file used during the overcloud deployment.

  • Create a new compute flavor, and tag one compute node into the "compute-r0" profile, which is later mapped to the overcloud "compute in rack 0" role:

    $ openstack flavor create --id auto --ram 4096 --disk 40 --vcpus 1 compute-r0
    $ openstack flavor set --property "capabilities:boot_option"="local" --property "capabilities:profile"="compute-r0" --property "resources:CUSTOM_BAREMETAL"="1" --property "resources:DISK_GB"="0" --property "resources:MEMORY_MB"="0" --property "resources:VCPU"="0" compute-r0
     
    $ openstack baremetal node set --property capabilities='profile:compute-r0,boot_option:local' compute-1
  • Create a new compute flavor, and tag the last compute node into the "compute-r1" profile, which is later mapped to the overcloud "compute in rack 1" role: 

    $ openstack flavor create --id auto --ram 4096 --disk 40 --vcpus 1 compute-r1
    $ openstack flavor set --property "capabilities:boot_option"="local" --property "capabilities:profile"="compute-r1" --property "resources:CUSTOM_BAREMETAL"="1" --property "resources:DISK_GB"="0" --property "resources:MEMORY_MB"="0" --property "resources:VCPU"="0" compute-r1
     
    $ openstack baremetal node set --property capabilities='profile:compute-r1,boot_option:local' compute-2
  • Verify the overcloud nodes profiles allocation:

    $ openstack overcloud profiles list
    +--------------------------------------+--------------+-----------------+-----------------+-------------------+
    | Node UUID                            | Node Name    | Provision State | Current Profile | Possible Profiles |
    +--------------------------------------+--------------+-----------------+-----------------+-------------------+
    | 476c7659-abc2-4d8c-9532-1756abbfd18a | controller-1 | available       | control         |                   |
    | 3cbb74e5-6508-4ec8-91a8-870dbf28baed | controller-2 | available       | control         |                   |
    | 457b329e-f1bc-476a-996d-eb82a56998e8 | controller-3 | available       | control         |                   |
    | 870445b7-650f-40fc-8ac2-5c3df700ccdc | compute-1    | available       | compute-r0      |                   |
    | baa7356b-11ca-4cb0-b58c-16c110bbbea0 | compute-2    | available       | compute-r1      |                   |
    +--------------------------------------+--------------+-----------------+-----------------+-------------------+

Overcloud Deployment Configuration Files 

Prepare the following cloud deployment configuration files, and place them under the /home/stack/templates/dvr directory.

Note: The full files are attached to this article, and can be found here: 47036708-1.0.0.zip

Some configuration files are customized specifically for the /home/stack/templates/dvr location. If you place the template files in a different location, adjust them accordingly.

  • containers-prepare-parameter.yaml
  • network-environment-dvr.yaml
  • controller.yaml

    This template file contains the network settings for the controller nodes, including large MTU and bonding configuration.

  • computesriov-dvr.yaml

    This template file contains the network settings for the compute nodes, including SR-IOV VFs, large MTU and accelerated bonding (VF-LAG) configuration for the data path.

  • node-info.yaml

    This environment file contains the count of nodes per role and the role-to-baremetal-profile mapping.

  • roles_data_dvr.yaml

    This environment file contains the services enabled on each cloud role and the networks associated with its rack location.

  • network_data.yaml

    This environment file contains a cloud network configuration for routed Spine-Leaf topology with large MTU. Rack0 and Rack1 L3 segments are listed as subnets of each cloud network. For further information refer to RH-OSP Configuring the Overcloud Leaf Networks.

  • env-ovs-dvr.yaml

    • This environment file contains the following settings:

      • Overcloud nodes time settings
      • ConnectX first boot parameters (by calling the /home/stack/templates/connectx_first_boot.yaml file)
      • Neutron Jumbo Frame MTU and DVR mode
      • Compute nodes CPU partitioning and isolation adjusted to the NUMA topology
      • Nova PCI passthrough settings adjusted to VXLAN hardware offload


Overcloud Deployment

  • Issue the overcloud deploy command to start cloud deployment with the prepared configuration files.

    $ openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates \
    --libvirt-type kvm \
    -n /home/stack/templates/dvr/network_data.yaml \
    -r /home/stack/templates/dvr/roles_data_dvr.yaml \
    --validation-warnings-fatal \
    -e /home/stack/templates/dvr/node-info.yaml \
    -e /home/stack/templates/dvr/containers-prepare-parameter.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dvr.yaml \
    -e /home/stack/templates/dvr/network-environment-dvr.yaml \
    -e /home/stack/templates/dvr/env-ovs-dvr.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml
  • Once deployed, load the necessary environment variables to interact with your overcloud:

    $ source ~/overcloudrc

Applications and Use Cases

Accelerated Packet Processing (SDN Acceleration) 

Note: The following use case demonstrates SDN-layer acceleration using hardware offload capabilities.
The appendix below includes benchmarks that demonstrate SDN offload performance and usability.


VM Image 

  • Build a VM cloud image (qcow2) with packet processing performance tools and cloud-init elements as described in How-to: Create OpenStack Cloud Image with Performance Tools.
  • Upload the image to the overcloud image store:

    $ openstack image create perf --public --disk-format qcow2 --container-format bare --file /home/stack/images/guest/centos8-perf.qcow2

VM Flavor 

  • Create a flavor: 

    $ openstack flavor create m1.packet --id auto --ram 8192 --disk 20 --vcpus 10
  • Set hugepages and cpu-pinning parameters:

    $ openstack flavor set m1.packet --property hw:mem_page_size=large
    $ openstack flavor set m1.packet --property hw:cpu_policy=dedicated

VM Networks and Ports 

  • Create a VXLAN network with normal ports to be used for instance management and access:

    $ openstack network create vx_mgmt --provider-network-type vxlan --share
    $ openstack subnet create vx_mgmt_subnet --dhcp --network vx_mgmt --subnet-range 22.22.22.0/24 --dns-nameserver 8.8.8.8
    $ openstack port create normal1 --network vx_mgmt --no-security-group --disable-port-security
    $ openstack port create normal2 --network vx_mgmt --no-security-group --disable-port-security
  • Create a VXLAN network to be used for accelerated data traffic between the VM instances with Jumbo Frames support:

    $ openstack network create vx_data --provider-network-type vxlan --share --mtu 8950
    $ openstack subnet create vx_data_subnet --dhcp --network vx_data --subnet-range 33.33.33.0/24 --gateway none
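The 8950-byte MTU used for the vx_data network leaves room for the VXLAN encapsulation overhead on a 9000-byte jumbo-frame underlay; the arithmetic is:

```python
# VXLAN-over-IPv4 encapsulation overhead per packet:
# outer Ethernet (14) + outer IPv4 (20) + outer UDP (8) + VXLAN header (8)
VXLAN_OVERHEAD = 14 + 20 + 8 + 8  # 50 bytes

def max_overlay_mtu(underlay_mtu: int) -> int:
    """Largest tenant-network MTU that fits in the given underlay MTU."""
    return underlay_mtu - VXLAN_OVERHEAD

print(max_overlay_mtu(9000))  # → 8950
```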
  • Create 2 x SR-IOV direct ports with hardware offload capabilities: 

    $ openstack port create direct1 --vnic-type=direct --network vx_data --binding-profile '{"capabilities":["switchdev"]}'
    $ openstack port create direct2 --vnic-type=direct --network vx_data --binding-profile '{"capabilities":["switchdev"]}'
  • Create an external network for public access: 

    $ openstack network create public --provider-physical-network datacentre --provider-network-type vlan --provider-segment 60 --share --mtu 8950 --external
    $ openstack subnet create public_subnet --dhcp --network public --subnet-range 172.60.0.0/24 --allocation-pool start=172.60.0.100,end=172.60.0.150 --gateway 172.60.0.254
  • Create a public router, and add the management network subnet: 

    $ openstack router create public_router
    $ openstack router set public_router --external-gateway public
    $ openstack router add subnet public_router vx_mgmt_subnet

VM Instances 

  • Create a VM instance with a management port and a direct port on the compute node located on the Rack0 L3 network segment: 

    $ openstack server create --flavor m1.packet --image perf --port normal1 --port direct1 vm1 --availability-zone nova:overcloud-computesriov-rack0-0.localdomain
  • Create a VM instance with a management port and a direct port on the compute node located on the Rack1 L3 network segment: 

    $ openstack server create --flavor m1.packet --image perf --port normal2 --port direct2 vm2 --availability-zone nova:overcloud-computesriov-rack1-0.localdomain
  • Wait until the status of the VM instances changes to ACTIVE:

$ openstack server list
+--------------------------------------+---------+--------+---------------------------------------------------------+-------+--------+
| ID                                   | Name    | Status | Networks                                                | Image | Flavor |
+--------------------------------------+---------+--------+---------------------------------------------------------+-------+--------+
| 000d3e17-9583-4885-aa2d-79c3b5df3cc8 | vm2     | ACTIVE | vx_data=33.33.33.244; vx_mgmt=22.22.22.151              | perf  |        |
| 150013ad-170d-4587-850e-70af692aa74c | vm1     | ACTIVE | vx_data=33.33.33.186; vx_mgmt=22.22.22.59               | perf  |        |
+--------------------------------------+---------+--------+---------------------------------------------------------+-------+--------+
  • To verify that external/public network access is functioning properly, create and assign external floating IP addresses for the VMs:

    $ openstack floating ip create --floating-ip-address 172.60.0.151 public
    $ openstack server add floating ip vm1 172.60.0.151
    $ openstack floating ip create --floating-ip-address 172.60.0.152 public
    $ openstack server add floating ip vm2 172.60.0.152

Appendix

Performance Testing


The performance results listed in this document are indicative and should not be considered as formal performance targets for NVIDIA products.

Note: The tools used in the below tests are included in the VM image built with the perf-tools element, as instructed in previous steps.

iperf TCP Test

  • On the Transmitter VM, disable XPS (Transmit Packet Steering):

    # for i in {0..7}; do echo 0 > /sys/class/net/eth1/queues/tx-$i/xps_cpus;  done;
    • This step is required to allow distribution of the iperf traffic across the bond using ConnectX VF-LAG.
    • Use interface statistics/counters on the Leaf switches' bond interfaces to confirm that traffic is distributed by the Transmitter VM to both switches using VF-LAG (Cumulus "net show counters" command).
  • On the Receiver VM, start multiple iperf3 servers:

    # iperf3 -s -p 5101&
    # iperf3 -s -p 5102&
    # iperf3 -s -p 5103&
    # iperf3 -s -p 5104&
  • On the Transmitter VM, start multiple iperf3 clients for multi-core parallel streams:

    # iperf3 -c 33.33.33.180 -T s1 -p 5101 -t 15 &
    # iperf3 -c 33.33.33.180 -T s2 -p 5102 -t 15 &
    # iperf3 -c 33.33.33.180 -T s3 -p 5103 -t 15 &
    # iperf3 -c 33.33.33.180 -T s4 -p 5104 -t 15 &
  • Check the results:

    s1:  - - - - - - - - - - - - - - - - - - - - - - - - -
    s1:  [ ID] Interval           Transfer     Bitrate         Retr
    s1:  [  5]   0.00-15.00  sec  38.4 GBytes  22.0 Gbits/sec  18239             sender
    s1:  [  5]   0.00-15.04  sec  38.4 GBytes  22.0 Gbits/sec                  receiver
    s1:  
    s1:  iperf Done.
    s2:  [  5]  14.00-15.00  sec  3.24 GBytes  27.8 Gbits/sec  997   1008 KBytes       
    s2:  - - - - - - - - - - - - - - - - - - - - - - - - -
    s2:  [ ID] Interval           Transfer     Bitrate         Retr
    s2:  [  5]   0.00-15.00  sec  45.1 GBytes  25.8 Gbits/sec  22465             sender
    s2:  [  5]   0.00-15.04  sec  45.1 GBytes  25.7 Gbits/sec                  receiver
    s2:  
    s2:  iperf Done.
    s4:  [  5]  14.00-15.00  sec  2.26 GBytes  19.4 Gbits/sec  1435    443 KBytes       
    s4:  - - - - - - - - - - - - - - - - - - - - - - - - -
    s4:  [ ID] Interval           Transfer     Bitrate         Retr
    s4:  [  5]   0.00-15.00  sec  46.5 GBytes  26.6 Gbits/sec  19964             sender
    s4:  [  5]   0.00-15.04  sec  46.5 GBytes  26.6 Gbits/sec                  receiver
    s4:  
    s4:  iperf Done.
    s3:  [  5]  14.00-15.00  sec  2.70 GBytes  23.2 Gbits/sec  1537    643 KBytes       
    s3:  - - - - - - - - - - - - - - - - - - - - - - - - -
    s3:  [ ID] Interval           Transfer     Bitrate         Retr
    s3:  [  5]   0.00-15.00  sec  39.5 GBytes  22.6 Gbits/sec  18447             sender
    s3:  [  5]   0.00-15.04  sec  39.5 GBytes  22.6 Gbits/sec                  receiver
    s3:  
    s3:  iperf Done.

    The test results above demonstrate a total of around 100Gbps line rate for IP TCP traffic.
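As a quick check, summing the four sender bitrates reported above gives the aggregate throughput:

```python
# Sender bitrates of the four parallel iperf3 streams (Gbit/s), from the output above
stream_gbps = {"s1": 22.0, "s2": 25.8, "s3": 22.6, "s4": 26.6}

total = sum(stream_gbps.values())
print(f"aggregate: {total:.1f} Gbit/s")  # → aggregate: 97.0 Gbit/s
```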

Before proceeding to the next test, stop all iperf servers on the Receiver VM:

# killall iperf3
iperf3: interrupt - the server has terminated
iperf3: interrupt - the server has terminated
iperf3: interrupt - the server has terminated
iperf3: interrupt - the server has terminated

RoCE Bandwidth Test over the bond

  • On the Receiver VM, start an ib_write_bw server with 2 QPs in order to utilize the VF-LAG infrastructure:

    # ib_write_bw -R --report_gbits --qp 2
  • On the Transmitter VM, start an ib_write_bw client with 2 QPs and a duration of 10 seconds:

    # ib_write_bw -R --report_gbits --qp 2 -D 10 33.33.33.180
  • Check the test results:

    ---------------------------------------------------------------------------------------
                        RDMA_Write BW Test
     Dual-port       : OFF          Device         : mlx5_0
     Number of qps   : 2            Transport type : IB
     Connection type : RC           Using SRQ      : OFF
     PCIe relax order: ON
     ibv_wr* API     : ON
     TX depth        : 128
     CQ Moderation   : 1
     Mtu             : 4096[B]
     Link type       : Ethernet
     GID index       : 3
     Max inline data : 0[B]
     rdma_cm QPs     : ON
     Data ex. method : rdma_cm
    ---------------------------------------------------------------------------------------
     local address: LID 0000 QPN 0x016e PSN 0x3ab6d6
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:33:33:33:173
     local address: LID 0000 QPN 0x016f PSN 0xc9fa3f
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:33:33:33:173
     remote address: LID 0000 QPN 0x016e PSN 0x8d928c
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:33:33:33:180
     remote address: LID 0000 QPN 0x016f PSN 0xc89786
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:33:33:33:180
    ---------------------------------------------------------------------------------------
     #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
     65536      22065492         0.00               192.81             0.367756
    ---------------------------------------------------------------------------------------
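The GID values in the output above are IPv4-mapped IPv6 addresses — the representation RoCEv2 uses for IPv4 endpoints. This can be reproduced with Python's ipaddress module:

```python
import ipaddress

# RoCEv2 represents an IPv4 endpoint as the IPv4-mapped IPv6 address ::ffff:a.b.c.d
gid = ipaddress.IPv6Address("::ffff:33.33.33.180")

print(gid.ipv4_mapped)  # → 33.33.33.180
print(gid.exploded)     # → 0000:0000:0000:0000:0000:ffff:2121:21b4
```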

Single-core DPDK packet-rate test over a single port

In this test, we used two VMs - one running the TRex traffic generator and the other running the dpdk-testpmd tool.

The traffic generator sends 64-byte packets to the testpmd tool, and the tool sends them back (thus exercising the DPDK packet-processing pipeline).

We tested the single-core packet processing rate between two VMs using 2 Tx and Rx queues.

Please note that this test requires specific versions of DPDK and of the VM image kernel; compatibility issues might exist when using the latest releases.

For this test we used a CentOS 7.8 image with kernel 3.10.0-1160.42.2 and DPDK 20.11.


On one VM we activated the dpdk-testpmd utility. The following command line was used to run testpmd (it also displays the MAC address of the port, to be used later with TRex as the destination MAC):

testpmd VM console
# ./dpdk-testpmd -c 0x1ff -n 4 -m 1024 -a 00:05.0 -- --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=4 --txq=4 --nb-cores=1 --rss-udp --forward-mode=macswap -a -i
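For reference, the -c 0x1ff coremask passed to testpmd selects cores 0-8. A small helper (hypothetical, not part of DPDK) shows how such masks are derived:

```python
def coremask(cores):
    """Build a DPDK-style hexadecimal coremask from an iterable of CPU core IDs."""
    mask = 0
    for core in cores:
        mask |= 1 << core  # set the bit for each selected core
    return hex(mask)

print(coremask(range(9)))  # cores 0-8 → 0x1ff
```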

Next, we created a second VM with two direct ports (on the vx_data network) and used it for running the TRex traffic generator (version 2.82).

We ran the DPDK port setup interactive wizard and used the MAC address of the TestPMD VM collected beforehand:

trex VM console
# cd /root/trex/v2.82
# ./dpdk_setup_ports.py -i
By default, IP based configuration file will be created. Do you want to use MAC based config? (y/N)y
+----+------+---------+-------------------+----------------------------------------------+------------+----------+----------+
| ID | NUMA |   PCI   |        MAC        |                     Name                     |   Driver   | Linux IF |  Active  |
+====+======+=========+===================+==============================================+============+==========+==========+
| 0  | -1   | 00:03.0 | fa:16:3e:11:3e:64 | Virtio network device                        | virtio-pci | eth0     | *Active* |
+----+------+---------+-------------------+----------------------------------------------+------------+----------+----------+
| 1  | -1   | 00:05.0 | fa:16:3e:cb:a4:82 | ConnectX Family mlx5Gen Virtual Function     | mlx5_core  | eth1     |          |
+----+------+---------+-------------------+----------------------------------------------+------------+----------+----------+
| 2  | -1   | 00:06.0 | fa:16:3e:58:79:e2 | ConnectX Family mlx5Gen Virtual Function     | mlx5_core  | eth2     |          |
+----+------+---------+-------------------+----------------------------------------------+------------+----------+----------+
Please choose an even number of interfaces from the list above, either by ID, PCI or Linux IF
Stateful will use order of interfaces: Client1 Server1 Client2 Server2 etc. for flows.
Stateless can be in any order.
Enter list of interfaces separated by space (for example: 1 3) : 1 2
 
For interface 1, assuming loopback to its dual interface 2.
Destination MAC is fa:16:3e:58:79:e2. Change it to MAC of DUT? (y/N).y
Please enter a new destination MAC of interface 1: FA:16:3E:32:5C:A4
For interface 2, assuming loopback to its dual interface 1.
Destination MAC is fa:16:3e:cb:a4:82. Change it to MAC of DUT? (y/N).y
Please enter a new destination MAC of interface 2: FA:16:3E:32:5C:A4
Print preview of generated config? (Y/n)
### Config file generated by dpdk_setup_ports.py ###
 
- version: 2
  interfaces: ['00:05.0', '00:06.0']
  port_info:
      - dest_mac: fa:16:3e:32:5c:a4
        src_mac:  fa:16:3e:cb:a4:82
      - dest_mac: fa:16:3e:32:5c:a4
        src_mac:  fa:16:3e:58:79:e2
 
  platform:
      master_thread_id: 0
      latency_thread_id: 1
      dual_if:
        - socket: 0
          threads: [2,3,4,5,6,7,8,9]
 
 
Save the config to file? (Y/n)y
Default filename is /etc/trex_cfg.yaml
Press ENTER to confirm or enter new file:
Saved to /etc/trex_cfg.yaml.

We created the following UDP packet stream configuration file under the /root/trex/v2.82 directory:

udp_rss.py
from trex_stl_lib.api import *
 
class STLS1(object):
 
    def create_stream (self):
        
        pkt = Ether()/IP(src="16.0.0.1",dst="48.0.0.1")/UDP(dport=12)/(22*'x')
                  
        vm = STLScVmRaw([
            STLVmFlowVar(name="v_port",
                         min_value=4337,
                         max_value=5337,
                         size=2, op="inc"),
            STLVmWrFlowVar(fv_name="v_port",
                           pkt_offset="UDP.sport"),
            STLVmFixChecksumHw(l3_offset="IP",
                               l4_offset="UDP",
                               l4_type=CTRexVmInsFixHwCs.L4_TYPE_UDP),
        ])

        return STLStream(packet=STLPktBuilder(pkt=pkt, vm=vm),
                         mode=STLTXCont(pps=8000000))
 
 
    def get_streams (self, direction = 0, **kwargs):
        # create 1 stream
        return [ self.create_stream() ]
 
 
# dynamic load - used for trex console or simulator
def register():
    return STLS1()
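The reason the script sweeps the UDP source port (and the reason for the udp_rss.py name) is to generate many distinct flows so the NIC's RSS hash can spread them across receive queues and testpmd cores. The toy hash below is only an illustration of that idea, not the NIC's actual Toeplitz RSS function, and the 8-queue count is an assumption matching the 8 forwarding cores used in this setup:

```python
# Illustration only: a toy stand-in for the NIC's RSS (Toeplitz) hash.
# Sweeping v_port over 4337..5337 as the STLVmFlowVar above does yields
# 1001 distinct 5-tuples, which a receive-side hash can spread across
# multiple queues (and hence across testpmd forwarding cores).
from collections import Counter

SRC, DST, DPORT = "16.0.0.1", "48.0.0.1", 12
queues = Counter()
for sport in range(4337, 5338):              # STLVmFlowVar range, inclusive
    q = hash((SRC, DST, sport, DPORT)) % 8   # assumed 8 queues, one per core
    queues[q] += 1

print(sum(queues.values()))  # 1001 flows in total
```

With a single fixed 5-tuple, all traffic would land on one queue and a single core would become the bottleneck long before line rate.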


Next, we ran the TRex application in the background on 8 of the VM's 10 cores:

trex VM console
# nohup ./t-rex-64 --no-ofed-check -i -c 8 &

And finally ran the TRex Console:

trex VM console
# ./trex-console 

Using 'python' as Python interpeter


Connecting to RPC server on localhost:4501                   [SUCCESS]


Connecting to publisher server on localhost:4500             [SUCCESS]


Acquiring ports [0, 1]:                                      [SUCCESS]


Server Info:

Server version:   v2.82 @ STL
Server mode:      Stateless
Server CPU:       8 x AMD EPYC Processor (with IBPB)
Ports count:      2 x 100Gbps @ ConnectX Family mlx5Gen Virtual Function

-=TRex Console v3.0=-

Type 'help' or '?' for supported actions

trex>

In the TRex console, we entered the textual UI (TUI):

trex VM console
trex>tui

We then started a 35 Mpps stream on port 0, using the stream configuration file created in the previous steps:

trex VM console
tui>start -f udp_rss.py -m 35mpps -p 0
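Note that the stream in udp_rss.py defines a base rate of 8 Mpps (STLTXCont(pps=8000000)); giving -m a value with units, such as 35mpps, requests an absolute rate, which TRex applies as a multiplier on the base stream:

```python
# The -m 35mpps argument rescales the 8 Mpps base stream to an
# absolute 35 Mpps, i.e. an effective multiplier of 35/8 = 4.375.
base_pps = 8_000_000      # STLTXCont(pps=8000000) in udp_rss.py
target_pps = 35_000_000   # -m 35mpps on the start command
print(target_pps / base_pps)  # 4.375
```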

These are the results measured on the testpmd VM (~32.5 Mpps):

testpmd tool console
testpmd> show port stats all

######################## NIC statistics for port 0 ########################
RX-packets: 486825081 RX-missed: 14768224 RX-bytes: 29209504860
RX-errors: 0
RX-nombuf: 0 
TX-packets: 486712558 TX-errors: 0 TX-bytes: 29202753480

Throughput (since last show)
Rx-pps: 32542689 Rx-bps: 15620491128
Tx-pps: 32542685 Tx-bps: 15620488872
############################################################################
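As a quick sanity check on the counters above, the ratio of Rx-bps to Rx-pps gives the average per-packet byte count as testpmd sees it, which comes out to ~60 bytes and is consistent with the minimal UDP frames generated by udp_rss.py (exact counter semantics, such as FCS exclusion, vary by driver):

```python
# Cross-check of the testpmd throughput figures shown above.
rx_pps = 32_542_689
rx_bps = 15_620_491_128
avg_bytes = rx_bps / 8 / rx_pps
print(round(avg_bytes))  # 60 bytes per packet, as counted by testpmd
```

The gap between the offered 35 Mpps and the received ~32.5 Mpps shows up in the RX-missed counter, indicating the VM's forwarding cores are the saturation point in this test.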

Troubleshooting the Nodes When the Network Is Not Functioning

If the overcloud nodes need to be debugged due to deployment issues, the only way to connect to them is with the SSH key stored on the director node.

Connecting through a BMC console is not helpful by default, since there is no way to log in to the nodes with a username and password.

Since the nodes are connected over a single network, a misconfiguration (for example, in the network YAML files) can leave them completely inaccessible for troubleshooting.

The solution is to add the following lines to the env-ovs-dvr.yaml file (under the sections shown), which set a root password on the nodes so they become accessible from the BMC console:

resource_registry:
   OS::TripleO::NodeUserData: /usr/share/openstack-tripleo-heat-templates/firstboot/userdata_root_password.yaml
 
parameter_defaults:
  NodeRootPassword: "******" 

The overcloud must be deleted and redeployed for the above changes to take effect.

Note: This feature and the "Automatic NIC Firmware Provisioning" feature are mutually exclusive and cannot be used in parallel, as they both use the "NodeUserData" variable.

Note: Make sure to remove the above lines in a production cloud, as they pose a security hazard.

Once the nodes are accessible, an additional helpful step for troubleshooting network issues is to run the following command on the problematic nodes:

# os-net-config -c /etc/os-net-config/config.json -d 

This command attempts to reconfigure the network from the stored configuration and reports any relevant issues.
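Before re-applying the configuration, it can also help to confirm that the JSON input itself is well-formed and lists the expected interfaces. The sketch below shows one way to do that; the sample string is hypothetical, and on a real node you would read /etc/os-net-config/config.json instead:

```python
import json

# Minimal pre-check of an os-net-config input file: parse it and list the
# configured interface entries ("network_config" is the file's top-level key).
def summarize(text):
    cfg = json.loads(text)  # raises json.JSONDecodeError on malformed input
    return [(i.get("type"), i.get("name")) for i in cfg.get("network_config", [])]

# Hypothetical sample; on a node, use open("/etc/os-net-config/config.json").read()
sample = '{"network_config": [{"type": "interface", "name": "nic1"}]}'
print(summarize(sample))  # [('interface', 'nic1')]
```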

Authors

Itai Levy

Over the past few years, Itai Levy has worked as a Solutions Architect and member of the NVIDIA Networking “Solutions Labs” team. Itai designs and executes cutting-edge solutions around Cloud Computing, SDN, SDS and Security. His main areas of expertise include NVIDIA BlueField Data Processing Unit (DPU) solutions and accelerated OpenStack/K8s platforms.


Shachar Dor

Shachar Dor joined the Solutions Lab team after working more than ten years as a software architect at NVIDIA Networking (previously Mellanox Technologies), where he was responsible for the architecture of network management products and solutions. Shachar's focus is on networking technologies, especially around fabric bring-up, configuration, monitoring, and life-cycle management. 

Shachar has a strong background in software architecture, design, and programming through his work on multiple projects and technologies also prior to joining the company. 


Notice

This document is provided for information purposes only and shall not be regarded as a warranty of a certain functionality, condition, or quality of a product. Neither NVIDIA Corporation nor any of its direct or indirect subsidiaries and affiliates (collectively: “NVIDIA”) make any representations or warranties, expressed or implied, as to the accuracy or completeness of the information contained in this document and assumes no responsibility for any errors contained herein. NVIDIA shall have no liability for the consequences or use of such information or for any infringement of patents or other rights of third parties that may result from its use. This document is not a commitment to develop, release, or deliver any Material (defined below), code, or functionality.
NVIDIA reserves the right to make corrections, modifications, enhancements, improvements, and any other changes to this document, at any time without notice.
Customer should obtain the latest relevant information before placing orders and should verify that such information is current and complete.
NVIDIA products are sold subject to the NVIDIA standard terms and conditions of sale supplied at the time of order acknowledgement, unless otherwise agreed in an individual sales agreement signed by authorized representatives of NVIDIA and customer (“Terms of Sale”). NVIDIA hereby expressly objects to applying any customer general terms and conditions with regards to the purchase of the NVIDIA product referenced in this document. No contractual obligations are formed either directly or indirectly by this document.
NVIDIA products are not designed, authorized, or warranted to be suitable for use in medical, military, aircraft, space, or life support equipment, nor in applications where failure or malfunction of the NVIDIA product can reasonably be expected to result in personal injury, death, or property or environmental damage. NVIDIA accepts no liability for inclusion and/or use of NVIDIA products in such equipment or applications and therefore such inclusion and/or use is at customer’s own risk.
NVIDIA makes no representation or warranty that products based on this document will be suitable for any specified use. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to evaluate and determine the applicability of any information contained in this document, ensure the product is suitable and fit for the application planned by customer, and perform the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this document. NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.
No license, either expressed or implied, is granted under any NVIDIA patent right, copyright, or other NVIDIA intellectual property right under this document. Information published by NVIDIA regarding third-party products or services does not constitute a license from NVIDIA to use such products or services or a warranty or endorsement thereof. Use of such information may require a license from a third party under the patents or other intellectual property rights of the third party, or a license from NVIDIA under the patents or other intellectual property rights of NVIDIA.
Reproduction of information in this document is permissible only if approved in advance by NVIDIA in writing, reproduced without alteration and in full compliance with all applicable export laws and regulations, and accompanied by all associated conditions, limitations, and notices.
THIS DOCUMENT AND ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL NVIDIA BE LIABLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF ANY USE OF THIS DOCUMENT, EVEN IF NVIDIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the products described herein shall be limited in accordance with the Terms of Sale for the product.

Trademarks
NVIDIA, the NVIDIA logo, and Mellanox are trademarks and/or registered trademarks of NVIDIA Corporation and/or Mellanox Technologies Ltd. in the U.S. and in other countries. Other company and product names may be trademarks of the respective companies with which they are associated.

Copyright
© 2022 NVIDIA Corporation & affiliates. All Rights Reserved.