Transceiver and Cable Self-qualification with Cumulus Linux
NVIDIA provides this document as a suggested procedure for qualifying a transceiver module or cable that does not appear on the Cumulus Linux Hardware Compatibility List (HCL) as a recommended pluggable. NVIDIA does not actively prevent any non-recommended pluggable from functioning and does not restrict the use of self-qualified pluggables. However, NVIDIA recommends customers use the pluggables listed on the HCL. Customers who wish to use a non-recommended pluggable can follow the suggested procedure outlined in this document. For concerns about pluggables that are not listed, contact your NVIDIA sales team.
This procedure is valid for qualifying all types of transceivers and cables in a device undergoing testing. Customers can choose the same types or a combination of different transceivers for this test.
The following diagram illustrates an example where the top and bottom ports connect with a cable as a loopback. The example includes cabling and configuration for the testing of both 40G QSFP and 10/1G SFP. NVIDIA recommends that you test different speed components independently. The example includes both to simplify the presentation.
TG-1 and TG-2 are either network traffic generators (IXIA, Spirent, or equivalent) or two servers running a Linux OS with the free iperf3 traffic generator tool, or an equivalent, installed.
Cumulus Linux Configuration
Edit the /etc/network/interfaces file to include the following bridge configuration:
auto trg_1
iface trg_1 inet manual
    bridge_ageing 150
    bridge_stp off
    bridge_ports swp41 swp52
    up ip link set trg_1 up

auto trg_2
iface trg_2 inet manual
    bridge_ageing 150
    bridge_stp off
    bridge_ports swp42 swp43
    up ip link set trg_2 up

auto l_1
iface l_1 inet manual
    bridge_ageing 150
    bridge_stp off
    bridge_ports swp51 swp50
    up ip link set l_1 up

auto l_2
iface l_2 inet manual
    bridge_ageing 150
    bridge_stp off
    bridge_ports swp49 swp48
    up ip link set l_2 up

auto l_3
iface l_3 inet manual
    bridge_ageing 150
    bridge_stp off
    bridge_ports swp47 swp46
    up ip link set l_3 up

auto l_4
iface l_4 inet manual
    bridge_ageing 150
    bridge_stp off
    bridge_ports swp45 swp44
    up ip link set l_4 up
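Every stanza above follows the same two-port pattern, so a longer snake can be generated rather than typed by hand. The following is a minimal shell sketch under that assumption. The function name bridge_stanza is hypothetical and not part of Cumulus Linux, and the output should be reviewed before it is added to /etc/network/interfaces.

```shell
#!/bin/sh
# bridge_stanza NAME PORT_A PORT_B
# Print one two-port bridge stanza in the same form as the example
# configuration (ageing 150, STP off, bridge brought up explicitly).
# NOTE: bridge_stanza is an illustrative helper, not a Cumulus Linux tool.
bridge_stanza() {
    printf 'auto %s\niface %s inet manual\n    bridge_ageing 150\n    bridge_stp off\n    bridge_ports %s %s\n    up ip link set %s up\n\n' \
        "$1" "$1" "$2" "$3" "$1"
}

# Example: regenerate the first two loopback bridges from the example.
#   bridge_stanza l_1 swp51 swp50
#   bridge_stanza l_2 swp49 swp48
```

Generating the stanzas from a port list keeps the bridge names and member ports consistent when the number of ports under test changes.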
You should expect to see the following bridge configuration after you reboot the switch you are testing. Rebooting instead of reloading the configuration ensures that the system detects and properly configures all optics when it starts up.
cumulus@switch~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
l_1             8000.443839002076       no              swp50
                                                        swp51
l_2             8000.443839002071       no              swp48
                                                        swp49
l_3             8000.44383900206f       no              swp46
                                                        swp47
l_4             8000.44383900206e       no              swp44
                                                        swp45
trg_1           8000.44383900206c       no              swp41
                                                        swp52
trg_2           8000.44383900206d       no              swp42
                                                        swp43
Verifying Link Status
Run the following Cumulus Linux command to verify that all loopback links are up:
cumulus@switch~$ sudo ethtool swp50
Settings for swp50:
        Supported ports: [ FIBRE ]
        Supported link modes:   10000baseT/Full
                                40000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
                                10000baseT/Full
                                40000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Speed: 40000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: off
        Current message level: 0x00000000 (0)
        Link detected: yes
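Checking each port by hand becomes tedious on a long snake. The following is a minimal shell sketch, not part of the NVIDIA procedure, that reads ethtool output on standard input and succeeds only when the link is detected. The function name link_is_up and the port list in the usage comment are illustrative assumptions.

```shell
#!/bin/sh
# link_is_up: read `ethtool <port>` output on stdin and exit 0 only if it
# contains the line "Link detected: yes".
# NOTE: link_is_up is an illustrative helper, not a Cumulus Linux command.
link_is_up() {
    grep -q 'Link detected: yes'
}

# Example use on the switch (adjust the port list to your cabling):
#   for p in swp50 swp51; do
#       if sudo ethtool "$p" | link_is_up; then
#           echo "$p up"
#       else
#           echo "$p DOWN"
#       fi
#   done
```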
Activating Data Traffic from Linux Servers
On each server, assign an IPv4 address to the interface connected to the traffic ingress or egress port. Both addresses must be in the same IP subnet. For example:
TG1$ sudo ifconfig eth1 192.0.2.1/24 up
TG2$ sudo ifconfig eth1 192.0.2.2/24 up
You can try ping -f (flood) between these interfaces.
To generate iperf traffic, use options like the following:
TG1$ sudo iperf -s -B 192.0.2.1 -p 9000
TG2$ sudo iperf -c 192.0.2.1 -i 3 -t 600 -p 9000 -d
- -B binds the server to the address of an interface
- -p is the TCP port number
- -c is the address of the iperf server (the traffic destination)
- -i is the interval, in seconds, between reports printed to the screen
- -t is the duration of the test in seconds
- -d runs bidirectional traffic
Verify that the iperf traffic reaches the destination and that the bandwidth matches the expected rate (subject to the transceiver's supported speed and the server CPU). Connect the two servers back-to-back first to capture baseline server performance characteristics.
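One way to compare a measured rate with the back-to-back baseline is a simple tolerance check. The following is a minimal shell sketch. The function name within_pct is hypothetical, and the 5 percent tolerance in the usage comment is an illustrative assumption, not an NVIDIA threshold. Both rates must be in the same unit (for example, Gbits/sec as reported by iperf).

```shell
#!/bin/sh
# within_pct MEASURED BASELINE PCT
# Exit 0 if MEASURED is no more than PCT percent below BASELINE, else exit 1.
# NOTE: within_pct and any tolerance passed to it are illustrative choices.
within_pct() {
    awk -v m="$1" -v b="$2" -v p="$3" 'BEGIN {
        # A baseline of 0 (or less) can never be matched.
        exit !(b > 0 && m >= b * (1 - p / 100))
    }'
}

# Example: the snake test measured 9.30, the back-to-back baseline was 9.41.
#   within_pct 9.30 9.41 5 && echo "matches baseline"
```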
Transceiver Module Information (EEPROM & DOM)
Use the following Cumulus Linux command to check each transceiver’s EEPROM and Digital Optical Monitoring (DOM) information:
cumulus@switch~$ sudo ethtool -m swp<id>
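To record the vendor details of each module under test, the relevant fields can be pulled out of the module dump. The following is a minimal shell sketch that assumes the "label : value" layout ethtool typically uses for module information, with labels such as "Vendor name" and "Vendor PN". The exact labels vary with the ethtool version and module type, and the function name eeprom_field is hypothetical.

```shell
#!/bin/sh
# eeprom_field NAME
# Read `ethtool -m <port>` output on stdin and print the value of the first
# field whose label equals NAME (for example "Vendor PN"). Prints nothing if
# the field is absent.
# NOTE: eeprom_field is an illustrative helper; field labels are assumed to
# follow the "label : value" layout and may differ on your system.
eeprom_field() {
    awk -v name="$1" '
        {
            i = index($0, ":")          # split on the first colon only,
            if (i == 0) next            # so values like 00:02:c9 stay whole
            label = substr($0, 1, i - 1)
            value = substr($0, i + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", label)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (label == name) { print value; exit }
        }
    '
}

# Example use on the switch:
#   sudo ethtool -m swp50 | eeprom_field "Vendor PN"
```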
Cumulus Linux-based Error Counters Check
The following command shows the error and drop counters recorded during and after the test:
cumulus@switch~$ sudo ethtool -S swp<id> | grep -i error
     HwIfInDot3LengthErrors: 0
     HwIfInErrors: 0
     SoftInErrors: 0
     SoftInFrameErrors: 0
     HwIfOutErrors: 0
     SoftOutErrors: 0
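When many ports are under test, it helps to print only the counters that are not zero. The following is a minimal shell sketch that filters ethtool -S output read from standard input. The function name nonzero_counters is hypothetical, and matching counter names on "error" or "drop" is an assumption about how the counters of interest are named.

```shell
#!/bin/sh
# nonzero_counters: read `ethtool -S <port>` output on stdin and print every
# counter whose name contains "error" or "drop" (case-insensitive) and whose
# value is not 0. Exits 0 when all such counters are zero, 1 otherwise.
# NOTE: nonzero_counters is an illustrative helper, not a Cumulus Linux tool.
nonzero_counters() {
    awk -F': *' '
        tolower($1) ~ /error|drop/ && $2 + 0 != 0 { print; bad = 1 }
        END { exit bad }
    '
}

# Example use on the switch:
#   sudo ethtool -S swp50 | nonzero_counters || echo "swp50 has errors"
```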
A successful qualification meets every item in the following checklist and test plan.
Source and destination IP of TG1 & TG2
Bidirectional traffic through the Cumulus Linux snake test matches the transfer rate of the two traffic generator endpoints when they are connected back-to-back.
All error counters return 0.
To detect drops or errors, collect output from
Returns EEPROM information, such as vendor and equipment type.
DOM information is optional for successful self-qualification of transceivers. In some cases ODM vendors elect to restrict information programmed for DOM.
This test confirms that a
Reboot the switch and repeat the same tests and checkpoints.
All check and test iterations are successful.
During the qualification cycle for some transceivers, NVIDIA observed that marginal and disqualified transceivers failed in 10-25% of switch reboots.
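Because a marginal transceiver may pass most reboots, it is useful to record the outcome of every iteration and compute the failure rate at the end. The following is a minimal shell sketch that assumes a hypothetical results log with one word per line, "pass" or "fail", written after each reboot cycle. The log format and the function name failure_rate are illustrative and not part of the NVIDIA procedure.

```shell
#!/bin/sh
# failure_rate: read one result per line ("pass" or "fail") on stdin and
# print the percentage of failures with one decimal place. Lines with any
# other first word are ignored. Exits 1 if no results were found.
# NOTE: failure_rate and the log format are illustrative assumptions.
failure_rate() {
    awk '
        $1 == "pass" || $1 == "fail" { total++ }
        $1 == "fail" { failed++ }
        END {
            if (total == 0) { print "no results"; exit 1 }
            printf "%.1f\n", 100 * failed / total
        }
    '
}

# Example: one failure in four reboot iterations.
#   printf 'pass\nfail\npass\npass\n' | failure_rate    # prints 25.0
```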