Validate Ethernet Cluster Networking#

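Once all of the switches are up, verify that the hardware is healthy and that each switch is configured correctly. Work through the checks in the sections below, from platform health up to end-to-end reachability.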

Platform Health#

  • System

  • Internal Components

  • Software Processes

  • Transceiver/Cable

Cabling Quality (L1)#

  • Tx/Rx Signal Level

  • Peer Connectivity

  • Tx/Rx Signal Integrity

Fabric Underlay Routing#

Fabric Overlay Routing (EVPN)#

  1. To show the global EVPN multihoming settings on the switch:

nv show evpn multihoming
_images/ns-fab-underlay-routing-01.png

Show Ethernet Segment Information#

  1. To show the Ethernet segments across all VNIs:

nv show evpn multihoming esi
_images/ns-fab-underlay-routing-02.png
  1. To show information about a specific ESI:

nv show evpn multihoming esi <ESI ID>
_images/ns-fab-underlay-routing-03.png
  1. To show the Ethernet segments for a specific VNI:

nv show evpn vni <vni> multihoming esi
_images/ns-fab-underlay-routing-04.png
  1. To show the Ethernet segments across all VNIs learned through type-1 and type-4 routes, run the NVUE command:

nv show evpn multihoming bgp-info esi
_images/ns-fab-underlay-routing-05.png
  1. To verify ESI to VRF information:

sudo vtysh -c 'show bgp l2vpn evpn es-vrf'
_images/ns-fab-underlay-routing-06.png
  1. To verify the learned EVPN type-1 (EAD) routes (to list type-4 routes, replace ead with es):

sudo vtysh -c 'show bgp l2vpn evpn route type ead'
_images/ns-fab-underlay-routing-07.png
  1. To verify NVE interface (VTEP) state and configuration:

nv show nve vxlan
_images/ns-fab-underlay-routing-08.png
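
When the same checks have to be repeated on many switches, it can help to capture every output in one pass. The following is a minimal sketch, not an official tool: it runs the argument-free show commands from the steps above and writes each result to a numbered file. The function names and the `OUT_DIR` and `DRY_RUN` variables are hypothetical names introduced for this example, and the commands that need an ESI or VNI argument are left out because their values are site-specific.

```shell
#!/bin/sh
# Sketch: collect the EVPN verification outputs listed in this section.
# OUT_DIR and DRY_RUN are hypothetical knobs, not NVUE features.

# Print the commands to run, one per line (copied from the steps above).
evpn_check_commands() {
  cat <<'EOF'
nv show evpn multihoming
nv show evpn multihoming esi
nv show evpn multihoming bgp-info esi
sudo vtysh -c 'show bgp l2vpn evpn es-vrf'
sudo vtysh -c 'show bgp l2vpn evpn route type ead'
nv show nve vxlan
EOF
}

# Run each command and save its output, or only list the commands
# when DRY_RUN=1.
run_evpn_checks() {
  out_dir="${OUT_DIR:-./evpn-checks}"
  n=0
  evpn_check_commands | while IFS= read -r cmd; do
    n=$((n + 1))
    idx="$(printf '%02d' "$n")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf '%s: %s\n' "$idx" "$cmd"
      continue
    fi
    mkdir -p "$out_dir"
    # sh -c keeps the quoted vtysh argument intact; stdin is detached so
    # a command cannot consume the remaining command list.
    sh -c "$cmd" >"$out_dir/$idx.txt" 2>&1 </dev/null \
      || echo "FAILED: $cmd" >&2
  done
}
```

Calling `DRY_RUN=1 run_evpn_checks` only prints the numbered command list, which is a safe way to review what would run before executing it on a switch.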

Customer Network Reachability#

  1. Customer Edge to Cluster Border

    1. BGP IPv4 AF Neighbors

    2. BGP Prefix Exchange

    3. Next-Hop Reachability

    4. Packet Loss

  2. Customer (external) to Cluster (internal)

    1. End-to-End Reachability (if available)

      1. Depends on an end host being connected to the fabric (for example, DGX)

    2. Packet Loss
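
The packet loss items above can be checked with a small wrapper around `ping`. This is a sketch under stated assumptions: it expects the summary line printed by the Linux iputils `ping` ("N packets transmitted, M received, X% packet loss"), and the function names, the default probe count, and the loss threshold are hypothetical choices for illustration rather than recommended values.

```shell
#!/bin/sh
# Sketch: measure packet loss toward a target and compare it to a limit.

# Extract the packet-loss percentage from ping output read on stdin.
ping_loss_pct() {
  sed -n 's/.* \([0-9.][0-9.]*\)% packet loss.*/\1/p'
}

# Succeed when the measured loss is at or below the allowed maximum.
loss_within() {
  awk -v loss="$1" -v max="$2" 'BEGIN { exit !(loss + 0 <= max + 0) }'
}

# Ping one target and print PASS or FAIL.
# Arguments: target address, probe count (default 20), max loss % (default 0).
check_target() {
  target="$1"
  count="${2:-20}"
  max_loss="${3:-0}"
  loss="$(ping -c "$count" -q "$target" 2>/dev/null | ping_loss_pct)"
  if [ -n "$loss" ] && loss_within "$loss" "$max_loss"; then
    echo "PASS $target loss=${loss}%"
  else
    echo "FAIL $target loss=${loss:-unknown}%"
    return 1
  fi
}
```

For example, `check_target <next-hop-address> 50 0` would send 50 probes to a next hop and fail on any loss; the address is left as a placeholder because it depends on the deployment.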

Negative Testing#

  • Customer Edge Failover

    • Disable Switch(es) / Link(s) / BGP Neighbor(s)

    • End-to-End Reachability

    • Packet Loss / Recovery Time

  • Cluster Fabric Failover

    • Disable Switch(es) / Link(s) / BGP Neighbor(s)

    • End-to-End Reachability

    • Packet Loss / Recovery Time
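
For the packet loss and recovery time items, one common approach is to keep a fixed-interval probe stream running while a switch, link, or BGP neighbor is disabled, then derive both figures from the probe counts. The sketch below assumes evenly spaced probes and a single contiguous outage, so each lost probe stands for roughly one probe interval; the function names and the sample numbers in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: derive loss and approximate recovery time from probe counts.

# Packet loss as a percentage with two decimals.
# Arguments: probes sent, probes received.
loss_pct() {
  [ "$1" -gt 0 ] || return 1
  awk -v s="$1" -v r="$2" 'BEGIN { printf "%.2f\n", (s - r) * 100 / s }'
}

# Approximate outage duration in milliseconds.
# Arguments: probes sent, probes received, probe interval in ms.
outage_ms() {
  echo $(( ($1 - $2) * $3 ))
}
```

With 1000 probes sent at a 10 ms interval and 985 replies received, `loss_pct 1000 985` prints `1.50` and `outage_ms 1000 985 10` prints `150`, that is, about 150 ms of interruption during the failover.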