VXLAN

NVIDIA Onyx User Manual v3.10.4006

Data centers are increasingly consolidated and outsourced to shorten application deployment time and reduce operational costs, while applications constantly increase the demand for compute, storage, and network resources. To scale these resources, physical resources are abstracted from their logical representation, in what is referred to as server, storage, and network virtualization. Virtualization can be implemented at various layers of computer systems or networks.

Multi-tenant data centers take advantage of server virtualization to provide a new kind of hosting: the virtual hosted data center. In a multi-tenant data center, individual tenants may belong to different companies or different departments. To a tenant, virtual data centers are similar to their physical counterparts, consisting of end-stations attached to a network, complete with services such as load balancers and firewalls. To tenant systems, a virtual network looks like a normal network, except that the only end-stations connected to the virtual network are those belonging to a tenant’s specific virtual network.

How a virtual network is implemented does not generally matter to the tenant; what matters is that the service provided (Layer 2 (L2) or Layer 3 (L3)) has the right semantics, performance, etc. It could be implemented via a pure routed network, a pure bridged network, or a combination of bridged and routed networks.

VXLAN (Virtual eXtensible Local Area Network) addresses the above requirements of the L2 and L3 data center network infrastructure in the presence of virtual networks in a multi-tenant environment. It runs over the existing networking infrastructure and provides a means to “stretch” an L2 network. Each overlay bridge is called a VXLAN segment. Only machines within the same VXLAN segment can communicate with each other. Each VXLAN segment is identified by a 24-bit segment ID called the “VXLAN Network Identifier (VNI)”. A network endpoint that converts between the virtual and the physical network is called a VXLAN Tunnel End-Point (VTEP).
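To make the 24-bit VNI concrete, the following minimal Python sketch packs and unpacks the 8-byte VXLAN header as laid out in RFC 7348. The function names are illustrative only and are not part of Onyx; VNI 7777 is just an example value.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag defined in RFC 7348


def pack_vxlan_header(vni):
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags in the top byte, followed by 24 reserved bits.
    # Word 1: VNI in the top 24 bits, followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header):
    """Return the VNI carried in an 8-byte VXLAN header."""
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word1 >> 8


header = pack_vxlan_header(7777)
print(header.hex(), unpack_vni(header))
```

Because the field is 24 bits wide, a single underlay can carry up to 2^24 (about 16 million) segments, compared with the 4094 usable IDs of a 12-bit VLAN tag.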

In virtual environments, logical switches are typically required to forward traffic between different virtual machines (VMs) on the same physical host, between virtual machines and physical machines, and between networks. Virtual switch environments use the OVSDB management protocol for configuration and state discovery of the virtual networks. The OVSDB protocol allows programmatic access to the virtual switch configuration database.
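As a rough illustration of what "programmatic access" means, OVSDB (RFC 7047) is a JSON-RPC protocol. The sketch below only serializes two request messages and does not open a connection. The helper name is hypothetical, and the database name `hardware_vtep` is an assumption about the schema a controller would use, not something stated in this manual.

```python
import json


def ovsdb_request(method, params, request_id):
    """Serialize a JSON-RPC request in the form used by OVSDB (RFC 7047)."""
    return json.dumps({"method": method, "params": params, "id": request_id})


# "list_dbs" asks the server which databases it hosts.
list_dbs = ovsdb_request("list_dbs", [], 0)

# "get_schema" takes a database name; "hardware_vtep" is assumed here.
get_schema = ovsdb_request("get_schema", ["hardware_vtep"], 1)

print(list_dbs)
print(get_schema)
```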

To enable VXLAN:

  1. Configure jumbo frames for NVE ports. Run:

    switch (config)# interface ethernet 1/1-1/4 mtu 9216 force

  2. Configure jumbo frames for underlay-facing ports. Run:

    switch (config)# interface ethernet 1/17 mtu 9216 force

  3. Create VLAN for all VXLAN traffic. Run:

    switch (config)# vlan 3

  4. Configure the underlay-facing interface with the VXLAN VLAN. Run:

    switch (config)# interface ethernet 1/17 switchport access vlan 3

  5. Enable IP routing. Run:

    switch (config)# ip routing vrf default

  6. Configure interface on the VXLAN VLAN and configure an IP address for it. Run:

    switch (config)# interface vlan 3
    switch (config interface vlan 3)# ip address 33.33.33.254 255.255.255.0
    switch (config interface vlan 3)# interface vlan 3 mtu 9216

  7. Enable NVE protocol. Run:

    switch (config)# protocol nve

  8. Configure interface NVE. Run:

    switch (config)# interface nve 1

  9. Create loopback interface to terminate the VXLAN tunnel. The IP address of the interface will be a VTEP endpoint address, and needs to be reachable in the underlay network. Run:

    switch (config)# interface loopback 1
    switch (config interface loopback 1)# ip address 1.2.3.4 255.255.255.255
    switch (config)# interface nve 1 vxlan source interface loopback 1

  10. Configure routing to other VTEP devices. Run:

    switch (config)# ip route vrf default 1.2.3.5 /32 33.33.33.253
    switch (config)# ip route vrf default 1.2.3.6 /32 33.33.33.252

  11. Configure overlay-facing ports for NVE mode. Run:

    switch (config)# interface ethernet 1/1 nve mode only force
    switch (config)# interface ethernet 1/2 nve mode only force
    switch (config)# interface ethernet 1/3 nve mode only force
    switch (config)# interface ethernet 1/4 nve mode only force
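The jumbo-frame steps above exist because VXLAN encapsulation adds headers in front of every tenant frame. The following sketch works through that arithmetic. It assumes an IPv4 underlay without IP options and an untagged inner frame. The helper names are illustrative only.

```python
OUTER_ETHERNET = 14  # outer MAC header, FCS not counted
OUTER_VLAN_TAG = 4   # optional 802.1Q tag on the underlay link
OUTER_IPV4 = 20      # IPv4 header without options (assumption)
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14  # the tenant frame's own MAC header travels in the tunnel


def encapsulation_overhead(outer_vlan=False):
    """Bytes added in front of the original L2 frame on the wire."""
    overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
    return overhead + (OUTER_VLAN_TAG if outer_vlan else 0)


def required_underlay_ip_mtu(tenant_ip_mtu):
    """Smallest underlay IP MTU that carries a tenant IP packet unfragmented."""
    return tenant_ip_mtu + INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4


print(encapsulation_overhead())            # untagged underlay link
print(encapsulation_overhead(True))        # underlay link with a VLAN tag
print(required_underlay_ip_mtu(1500))      # standard tenant MTU
print(required_underlay_ip_mtu(9000))      # jumbo tenant MTU
```

Under these assumptions, even a 9000-byte tenant MTU needs only 9050 bytes of underlay IP MTU, which fits within the 9216 configured in the steps above.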

For deployments with a controller, set up OVSDB:

  1. Start OVSDB server. Run:

    switch (config)# ovs ovsdb server

  2. Configure the OVSDB manager to an IP address of a controller. Run:

    switch (config)# ovs ovsdb manager remote ssl ip address 10.130.250.5

For controller-less deployments, configure the bridging from the CLI directly:

  1. Create bridges. Run:

    switch (config)# interface nve 1 nve bridge 7777
    switch (config)# interface ethernet 1/1 nve vlan 10 bridge 7777

  2. Configure source-node replication. Run:

    switch (config)# no interface nve 1 nve fdb flood load-balance

  3. Configure flood addresses for BUM traffic. Run:

    switch (config)# interface nve 1 nve fdb flood bridge 7777 address 1.2.3.5
    switch (config)# interface nve 1 nve fdb flood bridge 7777 address 1.2.3.6

  4. Configure FDB remote learning. Run:

    switch (config)# interface nve 1 nve fdb learning remote
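The controller-less steps above combine a static flood list with remote FDB learning. The toy model below sketches how the two could interact for one bridge. It is a conceptual illustration, not a description of the Onyx data path, and the class name and MAC address are hypothetical.

```python
class VtepFdb:
    """Toy forwarding database for one VXLAN bridge (VNI)."""

    def __init__(self, flood_addresses):
        self.flood_addresses = list(flood_addresses)
        self.remote_macs = {}  # inner MAC -> remote VTEP IP

    def learn_remote(self, src_mac, remote_vtep):
        """Remote learning: remember which VTEP a source MAC was seen behind."""
        self.remote_macs[src_mac] = remote_vtep

    def destinations(self, dst_mac):
        """Return the VTEP IPs that should receive a frame for dst_mac."""
        if dst_mac in self.remote_macs:
            return [self.remote_macs[dst_mac]]
        # Broadcast, unknown unicast, or multicast: one copy per flood address.
        return list(self.flood_addresses)


# Flood addresses taken from the bridge 7777 configuration above.
fdb = VtepFdb(["1.2.3.5", "1.2.3.6"])
before = fdb.destinations("00:11:22:33:44:55")
fdb.learn_remote("00:11:22:33:44:55", "1.2.3.5")
after = fdb.destinations("00:11:22:33:44:55")
print(before, after)
```

Before the MAC is learned, the frame is replicated to every flood address. Once learned, it is sent only to the VTEP behind which the MAC was seen.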

Hardware Topology

  • 2 ESXi servers pre-configured with VXLAN networking using VMware NSX

  • 3 NSX Controllers available for VXLAN unicast type logical switches

  • 1 NVIDIA switch connected to the ESXi servers and to a physical database server

  • Out-of-band network for management and a VLAN network to carry VXLAN traffic

[Figure: VXLAN hardware topology]


Switch Configuration

  1. Configure jumbo frames on the interfaces facing the ESXi servers and the database server. Run:

    switch (config)# interface ethernet 1/1-1/3 mtu 9216 force

  2. Create VLAN 3 to carry VXLAN traffic (if it does not exist yet). Run:

    switch (config)# vlan 3
    switch (config vlan 3)# exit
    switch (config)#

  3. Enable IP routing. Run:

    switch (config)# ip routing vrf default

  4. Create an interface on VLAN 3 and assign an IP address to it.
    The IP address must be the default gateway of the VXLAN netstack created by NSX after enabling VXLAN traffic on the hosts.
    To check the default gateway in vSphere web client select an ESXi host and go to: Configure -> TCP/IP configuration.

    [Figure: VXLAN switch configuration]

    switch (config)# interface vlan 3
    switch (config interface vlan 3)# ip address 33.33.33.254 255.255.255.0
    switch (config interface vlan 3)# interface vlan 3 mtu 9216

  5. Create a loopback interface to communicate with VTEPs on the ESXi servers by routing through “interface vlan 3”. This interface will be the VTEP IP assigned to the switch. Run:

    switch (config)# interface loopback 1
    switch (config interface loopback 1)# ip address 1.2.3.4 255.255.255.255

  6. Enable NVE protocol. Run:

    switch (config)# protocol nve

  7. Configure interface NVE. Run:

    switch (config)# interface nve 1

  8. Configure the source of the NVE interface to be the loopback created above. Run:

    switch (config)# interface nve 1 vxlan source interface loopback 1

  9. Start the OVSDB server and connect it to the NSX Controllers. Run:

    switch (config)# ovs ovsdb server
    switch (config)# ovs ovsdb manager remote ssl ip address 10.130.200.100
    switch (config)# ovs ovsdb manager remote ssl ip address 10.144.200.101
    switch (config)# ovs ovsdb manager remote ssl ip address 10.144.200.102

  10. Configure the port facing the Database server as an NVE port. Run:

    switch (config)# interface ethernet 1/3 nve mode only force

  11. Get the switch certificate for later configuration in the NSX Manager. Run:

    switch (config)# show crypto certificate name system-self-signed public-pem

    Copy the certificate starting with the line:

    -----BEGIN CERTIFICATE-----

    Until the line:

    -----END CERTIFICATE-----

    Make sure to include both of those lines.
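If the CLI output is captured to a file or a script variable, the certificate block can be cut out programmatically. The sketch below is one way to do that. The sample output and its Base64 body are placeholders, not real switch output, and the function name is illustrative.

```python
import re

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def extract_certificate(cli_output):
    """Return the first PEM certificate, including its BEGIN and END lines."""
    match = PEM_BLOCK.search(cli_output)
    if match is None:
        raise ValueError("no PEM certificate found")
    # Drop any indentation the terminal capture may have added.
    return "\n".join(line.strip() for line in match.group(0).splitlines())


# Placeholder capture; the Base64 line is not a real certificate.
sample_output = """\
switch (config)# show crypto certificate name system-self-signed public-pem
-----BEGIN CERTIFICATE-----
PLACEHOLDERBASE64==
-----END CERTIFICATE-----
switch (config)#
"""

certificate = extract_certificate(sample_output)
print(certificate)
```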

NSX Manager Configuration

Adding Hosts to Replication Cluster

  1. In NSX Manager, go to “Service Definitions” → “Hardware Devices”.

    [Figure: VXLAN switch configuration (2)]

  2. Under “Replication Cluster”, click Edit.

  3. Add both of the ESXi servers to the replication cluster.

All hosts added to the replication cluster can replicate BUM (Broadcast, Unknown unicast and Multicast) traffic to other ESXi servers.

When the switch needs to send BUM traffic to a virtual machine, it selects one of the hosts in the replication cluster and sends the traffic to it. That host then replicates the traffic to all other ESXi hosts.

It is recommended to add at least 2 ESXi servers to the replication cluster for redundancy.
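The replication flow described above can be sketched as follows. The manual does not say how the switch chooses a host, so the CRC32-based choice here is purely an assumption for illustration, as are the host names and function names.

```python
import zlib


def pick_replicator(cluster, flow_key):
    """Pick one replication-cluster host (CRC32 hashing is an assumption)."""
    if not cluster:
        raise ValueError("replication cluster is empty")
    return cluster[zlib.crc32(flow_key.encode()) % len(cluster)]


def bum_delivery(cluster, all_hosts, flow_key):
    """Return the chosen replicator and the hosts it must copy the frame to."""
    replicator = pick_replicator(cluster, flow_key)
    copies = [host for host in all_hosts if host != replicator]
    return replicator, copies


hosts = ["esxi-1", "esxi-2"]  # hypothetical host names
replicator, copies = bum_delivery(hosts, hosts, "vni-7777")
print(replicator, copies)

# With one host removed from the cluster, the remaining host still serves.
fallback, _ = bum_delivery(["esxi-2"], hosts, "vni-7777")
print(fallback)
```

The second call shows why at least two cluster members are recommended: if one is unavailable, another can still take over replication.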

Adding the Switch to NSX

  1. Under Hardware Devices click the + sign to add a new hardware device.

  2. Fill in a name for the new hardware device.

  3. Paste the switch certificate obtained earlier.

  4. Click OK.

    [Figure: Adding the switch to NSX, step 4]

  5. Wait until the new switch shows “UP” under the connectivity column. You may need to refresh the vSphere client a few times.

    [Figure: Adding the switch to NSX, step 5]

Mapping a Logical Switch to a Physical Switch Port

  1. In NSX Manager go to “Logical Switches”.

  2. Right click the logical switch you wish to map to the physical switch port and select “Manage Hardware Bindings”.

    [Figure: Mapping a logical switch to a physical switch port, step 2]

  3. Click the “+” sign to add a new mapping instance.

  4. Click Select under the port column and select port “eth3”. This corresponds to “ethernet 1/3”, which was configured earlier as an NVE port on the switch.

  5. Under the VLAN column, set the VLAN that will map this logical switch to this specific switch port. Multiple logical switches can be mapped to the same port on different VLANs (for example, to connect a firewall appliance to logical switches). For “access” configuration (no VLAN is required on the host connected to the physical switch port), use VLAN 1.

  6. Click OK.

    [Figure: Mapping a logical switch to a physical switch port, step 6]
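The binding rule in the steps above (one port can serve several logical switches, as long as each uses a different VLAN) can be modeled as a map keyed by port and VLAN. This is a conceptual sketch only. The logical switch names and VLAN IDs are hypothetical, and the conflict check reflects an assumption that a given port and VLAN pair maps to a single logical switch.

```python
def add_binding(bindings, logical_switch, port, vlan):
    """Record a (port, VLAN) -> logical switch binding, rejecting conflicts."""
    key = (port, vlan)
    owner = bindings.get(key)
    if owner is not None and owner != logical_switch:
        raise ValueError(f"{port} VLAN {vlan} is already bound to {owner}")
    bindings[key] = logical_switch
    return bindings


bindings = {}
add_binding(bindings, "ls-web", "eth3", 10)
add_binding(bindings, "ls-db", "eth3", 20)  # same port, different VLAN

try:
    add_binding(bindings, "ls-app", "eth3", 10)  # VLAN 10 is already taken
    conflict = False
except ValueError:
    conflict = True

print(bindings, conflict)
```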

For more information about this feature and its potential applications, please refer to the following community posts:

RoCEv2 Using PFC and ECN

The following figure and flow demonstrate how to configure RoCEv2 using PFC and ECN. RoCEv2 QoS is preserved by DSCP.

[Figure: RoCEv2 using PFC and ECN]

Warning

DSCP is automatically copied from the original packet into the outer IP header of the VXLAN packet in Onyx.

  • Configure the switch buffer to support lossless traffic.

    traffic pool roce type lossless
    traffic pool roce memory percent 50.00
    traffic pool roce map switch-priority 3

  • Enable ECN on traffic class 3 and set strict-priority scheduling (ETS strict) on traffic class 6.

    interface ethernet 1/15 traffic-class 3 congestion-control ecn minimum-absolute 150 maximum-absolute 1500
    interface ethernet 1/16 traffic-class 3 congestion-control ecn minimum-absolute 150 maximum-absolute 1500
    interface mlag-port-channel 7-8 traffic-class 3 congestion-control ecn minimum-absolute 150 maximum-absolute 1500
    interface port-channel 1 traffic-class 3 congestion-control ecn minimum-absolute 150 maximum-absolute 1500
    interface ethernet 1/15 traffic-class 6 dcb ets strict
    interface ethernet 1/16 traffic-class 6 dcb ets strict
    interface mlag-port-channel 7-8 traffic-class 6 dcb ets strict
    interface port-channel 1 traffic-class 6 dcb ets strict

  • Set QoS trust to DSCP.

    interface ethernet 1/15-1/16 qos trust L3
    interface mlag-port-channel 7-8 qos trust L3
    interface port-channel 1 qos trust L3
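DSCP and ECN share one byte of the IP header (the IPv4 ToS or IPv6 Traffic Class field), which is why trusting L3 and marking ECN fit together. The sketch below shows the bit layout and how a congestion mark leaves the DSCP value untouched. DSCP 26 is an arbitrary example, and the function names are illustrative rather than anything Onyx exposes.

```python
ECN_NOT_ECT = 0b00  # sender does not support ECN
ECN_CE = 0b11       # Congestion Experienced


def split_tos(tos):
    """Split the ToS / Traffic Class byte into (DSCP, ECN)."""
    return tos >> 2, tos & 0b11


def mark_congestion(tos):
    """Set Congestion Experienced on an ECN-capable packet, keeping DSCP."""
    dscp, ecn = split_tos(tos)
    if ecn == ECN_NOT_ECT:
        return tos  # not ECN-capable, so the byte is left unchanged
    return (dscp << 2) | ECN_CE


tos = (26 << 2) | 0b10  # DSCP 26 with ECT(0); example values only
marked = mark_congestion(tos)
print(split_tos(tos), split_tos(marked))
```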

RoCEv1 Using PFC

The following figure and flow demonstrate how to configure RoCEv1 using PFC. RoCEv1 QoS is based on the PCP field sent by the server.

[Figure: RoCEv1 using PFC]

  • Configure the switch buffer to support lossless traffic.

    traffic pool roce type lossless
    traffic pool roce memory percent 50.00
    traffic pool roce map switch-priority 3

  • Set Uplinks and IPL trust to DSCP.

    interface ethernet 1/15-1/16 qos trust L3
    interface port-channel 1 qos trust L3

  • Set Downlinks trust to PCP.

    interface mlag-port-channel 7-8 qos trust L2

  • Set Downlinks rewrite to DSCP. This will allow translation from PCP to DSCP in VXLAN.

    interface mlag-port-channel 7-8 qos rewrite dscp

  • Set Uplinks and IPL rewrite to PCP. This will allow translation from DSCP to PCP.

    interface ethernet 1/15-1/16 qos rewrite pcp
    interface port-channel 1 qos rewrite pcp
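The PCP-to-DSCP translation above can be pictured with a small sketch. The PCP is the top three bits of the 802.1Q tag control field. The class-selector style mapping used here (PCP n to DSCP n × 8) is an assumption for illustration, since this section does not list the mapping tables the switch actually applies.

```python
def pcp_from_tci(tci):
    """Priority Code Point: top 3 bits of the 16-bit 802.1Q TCI."""
    return (tci >> 13) & 0b111


def pcp_to_dscp(pcp):
    """Assumed class-selector mapping: PCP n -> DSCP n * 8."""
    return pcp << 3


def dscp_to_pcp(dscp):
    """Inverse of the assumed mapping: keep the top 3 DSCP bits."""
    return dscp >> 3


tci = (3 << 13) | 10  # PCP 3 on VLAN 10; example values only
pcp = pcp_from_tci(tci)
dscp = pcp_to_dscp(pcp)
print(pcp, dscp, dscp_to_pcp(dscp))
```

With this mapping, priority 3 (the switch priority mapped to the lossless pool above) survives a round trip through the VXLAN tunnel: PCP 3 becomes DSCP 24 on the downlink and is rewritten back to PCP 3 on the uplink.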

© Copyright 2023, NVIDIA. Last updated on Mar 5, 2024.