NVIDIA® Cumulus Linux is the first full-featured Debian Buster-based, Linux operating system for the networking industry.
This user guide provides in-depth documentation on the Cumulus Linux installation process, system configuration and management, network solutions, and monitoring and troubleshooting recommendations. In addition, the quick start guide provides an end-to-end setup process to get you started.
Cumulus Linux 5.8 includes the NVIDIA NetQ agent and CLI. You can use NetQ to monitor and manage your data center network infrastructure and operational health. Refer to the NVIDIA NetQ documentation for details.
For a list of the new features in this release, see What's New. For bug fixes and known issues present in this release, refer to the Cumulus Linux 5.8 Release Notes.
Try It Pre-built Demos
The Cumulus Linux documentation includes pre-built Try It demos for certain Cumulus Linux features. The Try It demos run a simulation in NVIDIA Air; a cloud hosted platform that works exactly like a real world production deployment. Use the Try It demos to examine switch configuration for a feature. For more information, see Try It Pre-built Demos.
Open Source Contributions
To implement various Cumulus Linux features, NVIDIA has forked various software projects, like CFEngine Netdev and some Puppet Labs packages. Some of the forked code resides in the NVIDIA Networking GitHub repository and some is available as part of the Cumulus Linux repository as Debian source packages.
NVIDIA has also developed and released new applications as open source. The list of open source projects is on the Cumulus Linux packages page.
Download the User Guide
Use one of the following methods to download the Cumulus Linux user guide and view it offline:
Host the documentation on a local host using hugo.
For a fully functional copy of the user guide, download a zip file of an HTML documentation build for offline use. Download the desired version, extract it locally, then open cumulus-linux-58.html in your web browser.
To view this user guide as a single page to print to a PDF with limited functionality, click here.
Click on the link one time and use the web browser print-to-PDF option to save the PDF locally.
What's New
This document supports the Cumulus Linux 5.8 release, and lists new platforms, features, and enhancements.
Cumulus Linux 5.8.0 contains several new features and improvements, and provides bug fixes.
Cumulus Linux 5.8.0 provides a critical bug fix to correct FRR log rotation settings for Cumulus Linux 5.6 and 5.7. Refer to issue ID 3766994 in the Cumulus Linux 5.8 Release Notes.
Minimized data retrieval for the NVUE nv show router nexthop rib and nv show vrf <vrf> router rib ipv4 route commands
▼
Improved tab completion for NVUE routing commands
nv show vrf <vrf> router bgp nexthop ipv4 ip-address
nv show vrf <vrf> router bgp address-family l2vpn-evpn loc-rib rd <rd> route-type <type> route
nv show evpn vni <vni>
nv show evpn access-vlan-info vlan
nv show evpn vni <vni> multihoming bgp-info esi
nv show vrf <vrf> router rib ipv4 route
nv show vrf <vrf> router rib ipv6 route
nv show vrf <vrf> router rib ipv4 route <route> protocol connected entry-index
nv show vrf <vrf> router bgp address-family l2vpn-evpn loc-rib rd <rd> route-type macip route
nv show vrf <vrf> router bgp address-family l2vpn-evpn loc-rib rd <rd> route-type prefix route <route> path
nv show vrf <vrf> router bgp neighbor <neighbor> address-family ipv4-unicast advertised-routes <routes> path
▼
New NVUE Commands
For descriptions and examples of all NVUE commands, refer to the NVUE Command Reference for Cumulus Linux.
nv show bridge domain <bridge> svi-force-up
nv show system global svi-force-up
nv show service ptp <instance-id> force-version
nv set interface <interface> dot1x host-mode
nv set bridge domain <bridge> svi-force-up enable
nv set system global svi-force-up enable
nv set service ptp <instance-id> force-version
nv unset interface <interface> dot1x host-mode
nv unset bridge domain <bridge> svi-force-up enable
nv unset system global svi-force-up enable
nv unset service ptp <instance-id> force-version
The repository key stored in Cumulus Linux 5.5.0 and earlier has expired. Before performing a package upgrade to Cumulus Linux 5.8.0 from Cumulus Linux 5.5.0 and earlier, you must install the new key. See this knowledge base article.
Cumulus Linux 5.8 includes the NVUE object model. After you upgrade to Cumulus Linux 5.8, running NVUE configuration commands might override configuration for features that are now configurable with NVUE and removes configuration you added manually to files or with automation tools like Ansible, Chef, or Puppet. To keep your configuration, you can do one of the following:
Use Linux and FRR (vtysh) commands instead of NVUE for all switch configuration.
Cumulus Linux 3.7, 4.3, and 4.4 continue to support NCLU. For more information, contact your NVIDIA Spectrum platform sales representative.
Quick Start Guide
This quick start guide provides an end-to-end setup process for installing and running Cumulus Linux.
Prerequisites
This guide assumes you have intermediate-level Linux knowledge. You need to be familiar with basic text editing, Unix file permissions, and process monitoring. Cumulus Linux includes a variety of preinstalled text editors, such as vi and nano.
You must have access to a Linux or UNIX shell. If you are running Windows, use a Linux environment like Cygwin as your command line tool for interacting with Cumulus Linux.
Get Started
Cumulus Linux is on the switch by default. To upgrade to a different Cumulus Linux release or reinstall Cumulus Linux, refer to Installation Management. To show the current Cumulus Linux release on the switch, run the NVUE nv show system command.
When starting Cumulus Linux for the first time, the management port makes a DHCPv4 request. To determine the IP address of the switch, you can cross reference the MAC address of the switch with your DHCP server. The MAC address is typically located on the side of the switch or on the box in which the unit ships.
To get started:
Log in to Cumulus Linux on the switch and change the default credentials.
Configure Cumulus Linux. This quick start guide provides instructions on changing the hostname of the switch, setting the date and time, and configuring switch ports and a loopback interface.
You can choose to configure Cumulus Linux either with NVUE commands or Linux commands (with vtysh or by manually editing configuration files). Do not run both NVUE configuration commands (such as nv set, nv unset, nv action, nv config) and Linux commands to configure the switch. NVUE commands replace the configuration in files such as /etc/network/interfaces and /etc/frr/frr.conf, and remove any configuration you add manually or with automation tools like Ansible, Chef, or Puppet.
If you choose to configure Cumulus Linux with NVUE, you can configure features that do not yet support the NVUE object model by creating NVUE Snippets.
Login Credentials
The default installation includes two accounts:
The system account (root) has full system privileges. Cumulus Linux locks the root account password, which prohibits login.
The user account (cumulus) has sudo privileges. The cumulus account uses the default password cumulus. When you log in for the first time with the cumulus account, Cumulus Linux prompts you to change the default password. After you provide a new password, the SSH session disconnects and you have to reconnect with the new password.
In this quick start guide, you use the cumulus account to configure Cumulus Linux.
All accounts except root can use remote SSH login; you can use sudo to grant a non-root account root-level access. Commands that change the system configuration require this elevated level of access.
NVIDIA recommends you perform management and configuration over the network, either in band or out of band. A serial console is fully supported.
Typically, switches ship from the manufacturer with a mating DB9 serial cable. Switches with ONIE are always set to a 115200 baud rate.
Wired Ethernet Management
A Cumulus Linux switch always provides at least one dedicated Ethernet management port called eth0. This interface is specifically for out-of-band management use. The management interface uses DHCPv4 for addressing by default.
To set a static IP address and gateway address for eth0:
cumulus@switch:~$ nv set interface eth0 ip address 192.0.2.42/24
cumulus@switch:~$ nv set interface eth0 ip gateway 192.0.2.1
cumulus@switch:~$ nv config apply
The command prompt in the terminal does not reflect the new hostname until you either log out of the switch or start a new shell.
Configure the Time Zone
The default time zone on the switch is UTC (Coordinated Universal Time). Change the time zone on your switch to be the time zone for your location.
To update the time zone:
Run the nv set system timezone <timezone> command. To see all the available time zones, run nv set system timezone and press the Tab key. The following example sets the time zone to US/Eastern:
cumulus@switch:~$ nv set system timezone US/Eastern
cumulus@switch:~$ nv config apply
In a terminal, run the following command:
cumulus@switch:~$ sudo dpkg-reconfigure tzdata
Follow the on screen menu options to select the geographic area and region.
Programs that are already running (including log files) and logged in users, do not see time zone changes. To set the time zone for all services and daemons, reboot the switch.
Verify the System Time
Verify that the date and time on the switch are correct. If the date and time are incorrect, the switch does not synchronize with automation tools, such as Puppet, and returns errors after you restart switchd.
To show the current date and time, run the nv show system date-time command:
cumulus@switch:~$ nv show system date-time
operational
------------------------- -----------------------------
local-time Wed 2023-11-22 11:22:54 EST
universal-time Wed 2023-11-22 16:22:54 UTC
rtc-time Wed 2023-11-22 16:22:54
time-zone America/New_York (EST, -0500)
system-clock-synchronized no
ntp-service inactive
rtc-in-local-tz no
unix-time 1700670174.4371066
To set the software clock according to the configured time zone, run the nv action change system date-time <YYYY-MM-DD> <HH:MM:SS> command; for example:
cumulus@switch:~$ nv action change system date-time 2023-12-04 2:33:30
System Date-time changed successfully
Local Time is now Mon 2023-12-04 02:33:30 UTC
Action succeeded
To show the current date and time on the switch, run the date command:
cumulus@switch:~$ date
Wed 11 Oct 2023 12:18:33 PM UTC
To set the software clock according to the configured time zone, run the sudo date -s command:
cumulus@switch:~$ sudo date -s "Tue Jan 26 00:37:13 2021"
NTP starts at boot by default on the switch and the NTP configuration includes default servers. To customize NTP, see NTP.
PTP is off by default on the switch. To configure PTP, see PTP.
Configure Breakout Ports with Splitter Cables
If you are using 4x10G DAC or AOC cables, or you want to break out (split) switch ports, configure the breakout ports; see Switch Port Attributes.
Test Cable Connectivity
By default, Cumulus Linux disables all data plane ports (every Ethernet port except the management interface, eth0). To test cable connectivity, administratively enable physical ports.
To enable a port administratively, run the nv set interface <interface> command:
cumulus@switch:~$ nv set interface swp1
cumulus@switch:~$ nv config apply
To enable all physical ports administratively on a switch that has ports numbered from swp1 to swp52:
cumulus@switch:~$ nv set interface swp1-52
cumulus@switch:~$ nv config apply
To view link status, run the nv show interface command.
To enable a port administratively:
cumulus@switch:~$ sudo ip link set swp1 up
To enable all physical ports administratively, run the following bash script:
cumulus@switch:~$ sudo su -
cumulus@switch:~$ for i in /sys/class/net/*; do iface=`basename $i`; if [[ $iface == swp* ]]; then ip link set $iface up fi done
To view link status, run the ip link show command.
Configure Layer 2 Ports
Cumulus Linux does not put all ports into a bridge by default. To create a bridge and configure one or more front panel ports as members of the bridge:
The following configuration example places the front panel port swp1 into the default bridge called br_default.
The following configuration example places the front panel port swp1 into the default bridge called br_default:
...
auto br_default
iface br_default
bridge-ports swp1
...
To put a range of ports into a bridge, use the glob keyword. For example, to add swp1 through swp10, swp12, and swp14 through swp20 to the bridge called br_default:
You can configure a front panel port or bridge interface as a layer 3 port.
The following configuration example configures the front panel port swp1 as a layer 3 access port:
cumulus@switch:~$ nv set interface swp1 ip address 10.0.0.0/31
cumulus@switch:~$ nv config apply
To add an IP address to a bridge interface, you must put it into a VLAN interface. If you want to use a VLAN other than the native one, set the bridge PVID:
cumulus@switch:~$ nv set interface swp1-2 bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default vlan 10
cumulus@switch:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@switch:~$ nv set bridge domain br_default untagged 1
cumulus@switch:~$ nv config apply
The following configuration example configures the front panel port swp1 as a layer 3 access port:
auto swp1
iface swp1
address 10.0.0.0/31
To add an IP address to a bridge interface, include the address under the iface stanza in the /etc/network/interfaces file. If you want to use a VLAN other than the native one, set the bridge PVID:
If there are no errors, run the following command:
cumulus@switch:~$ sudo ifup -a
Configure a Loopback Interface
Cumulus Linux has a preconfigured loopback interface. When the switch boots up, the loopback interface, called lo, is up and assigned an IP address of 127.0.0.1.
The loopback interface lo must always exist on the switch and must always be up. To check the status of the loopback interface, run the NVUE nv show interface lo command or the Linux ip addr show lo command.
To add an IP address to a loopback interface, configure the lo interface:
cumulus@switch:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@switch:~$ nv config apply
Add the IP address directly under the iface lo inet loopback definition in the /etc network/interfaces file:
auto lo
iface lo inet loopback
address 10.10.10.1
If you configure an IP address without a subnet mask, it becomes a /32 IP address. For example, 10.10.10.1 is 10.10.10.1/32.
If you run NVUE commands to configure the switch, run the nv config save command before you reboot. The command saves the applied configuration to the startup configuration so that the changes persist after the reboot.
cumulus@switch:~$ nv config save
Show Platform and System Settings
To show the hostname of the switch, the time zone, and the version of Cumulus Linux running on the switch, run the NVUE nv show system command.
To show switch platform information, such as the ASIC model, CPU, hard disk drive size, RAM size, and port layout, run the NVUE nv show platform hardware command.
Next Steps
You are now ready to configure the switch according to your needs. This guide provides separate sections that describe how to configure system, layer 1, layer 2, layer 3, and network virtualization settings. Each section includes example configurations and pre-built demos.
For a deep dive into the NVUE object model that provides a CLI to simplify configuration, see NVUE.
Installation Management
This section describes how to manage, install, and upgrade Cumulus Linux on your switch.
Managing Cumulus Linux Disk Images
The Cumulus Linux operating system resides on a switch as a disk image. This section discusses how to manage the image.
Reprovisioning the system deletes all system data from the switch.
To stage an ONIE installer from the network (where ONIE automatically locates the installer), run the onie-select -i command. You must reboot the switch to start the install process.
cumulus@switch:~$ sudo onie-select -i
WARNING:
WARNING: Operating System install requested.
WARNING: This will wipe out all system data.
WARNING:
Are you sure (y/N)? y
Enabling install at next reboot...done.
Reboot required to take effect.
To cancel a pending reinstall operation, run the onie-select -c command:
cumulus@switch:~$ sudo onie-select -c
Cancelling pending install at next reboot...done.
To stage an installer located in a specific location, run the onie-install -i <location> command. You can specify a local, absolute or relative path, an HTTP or HTTPS server, SCP or FTP server. You can also stage a Zero Touch Provisioning (ZTP) script along with the installer.
You typically use the onie-install command with the -a option to activate installation. If you do not specify the -a option, you must reboot the switch to start the installation process.
The following example stages the installer located at http://203.0.113.10/image-installer together with the ZTP script located at http://203.0.113.10/ztp-script and activates installation and ZTP:
You can also specify these options together in the same command. For example:
cumulus@switch:~$ sudo onie-install -i http://203.0.113.10/image-installer -z http://203.0.113.10/ztp-script -a
To see more onie-install options, run man onie-install.
Migrate from Cumulus Linux to ONIE (Uninstall All Images and Remove the Configuration)
To remove all installed images and configurations, and return the switch to its factory defaults, run the onie-select -k command.
The onie-select -k command takes a long time to run as it overwrites the entire NOS section of the flash. Only use this command if you want to erase all NOS data and take the switch out of service.
cumulus@switch:~$ sudo onie-select -k
WARNING:
WARNING: Operating System uninstall requested.
WARNING: This will wipe out all system data.
WARNING:
Are you sure (y/N)? y
Enabling uninstall at next reboot...done.
Reboot required to take effect.
You must reboot the switch to start the uninstallation process.
To cancel a pending uninstall operation, run the onie-select -c command:
cumulus@switch:~$ sudo onie-select -c
Cancelling pending uninstall at next reboot...done.
Boot Into Rescue Mode
If your system becomes unresponsive, you can correct certain issues by booting into ONIE rescue mode, which uses unmounted file systems. You can use various Cumulus Linux utilities to try and resolve a problem.
To reboot the system into ONIE rescue mode, run the onie-select -r command:
cumulus@switch:~$ sudo onie-select -r
WARNING:
WARNING: Rescue boot requested.
WARNING:
Are you sure (y/N)? y
Enabling rescue at next reboot...done.
Reboot required to take effect.
You must reboot the system to boot into rescue mode.
To cancel a pending rescue boot operation, run the onie-select -c command:
cumulus@switch:~$ sudo onie-select -c
Cancelling pending rescue at next reboot...done.
Inspect the Image File
The Cumulus Linux image file is executable. From a running switch, you can display, extract, and verify the contents of the image file.
To display the contents of the Cumulus Linux image file, pass the info option to the image file. For example, to display the contents of an image file called onie-installer located in the /var/lib/cumulus/installer directory:
To extract the contents of the image file, use with the extract <path> option. For example, to extract an image file called onie-installer located in the /var/lib/cumulus/installer directory to the mypath directory:
cumulus@switch:~$ sudo /var/lib/cumulus/installer/onie-installer extract mypath
total 181860
-rw-r--r-- 1 4000 4000 308 May 16 19:04 control
drwxr-xr-x 5 4000 4000 4096 Apr 26 21:28 embedded-installer
-rw-r--r-- 1 4000 4000 13273936 May 16 19:04 initrd
-rw-r--r-- 1 4000 4000 4239088 May 16 19:04 kernel
-rw-r--r-- 1 4000 4000 168701528 May 16 19:04 sysroot.tar
To verify the contents of the image file, use with the verify option. For example, to verify the contents of an image file called onie-installer located in the /var/lib/cumulus/installer directory:
cumulus@switch:~$ sudo /var/lib/cumulus/installer/onie-installer verify
Verifying image checksum ...OK.
Preparing image archive ... OK.
./cumulus-linux-bcm-amd64.bin.1: 161: ./cumulus-linux-bcm-amd64.bin.1: onie-sysinfo: not found
Verifying image compatibility ...OK.
Verifying system ram ...OK.
The default password for the cumulus user account is cumulus. The first time you log into Cumulus Linux, you must change this default password. Be sure to update any automation scripts before installing a new image. Cumulus Linux provides command line options to change the default password automatically during the installation process. Refer to ONIE Installation Options.
You can install a new Cumulus Linux image using ONIE, an open source project (equivalent to PXE on servers) that enables the installation of network operating systems (NOS) on bare metal switches.
Before you install Cumulus Linux, the switch can be in two different states:
The switch does not contain an image (the switch is only running ONIE).
Cumulus Linux is already on the switch but you want to use ONIE to reinstall Cumulus Linux or upgrade to a newer version.
The sections below describe some of the different ways you can install the Cumulus Linux image. Steps show how to install directly from ONIE (if no image is on the switch) and from Cumulus Linux (if the image is already on the switch). For additional methods to find and install the Cumulus Linux image, see the ONIE Design Specification.
Installing the Cumulus Linux image is destructive; configuration files on the switch are not saved; copy them to a different server before installing.
In the following procedures:
You can name your Cumulus Linux image using any of the
ONIE naming schemes mentioned here.
Run the sudo onie-install -h command to show the ONIE installer options.
Install Using a DHCP/Web Server With DHCP Options
To install Cumulus Linux using a DHCP or web server withDHCP options, set up a DHCP/web server on your laptop and connect the eth0 management port of the switch to your laptop. After you connect the cable, the installation proceeds as follows:
The switch boots up and requests an IP address (DHCP request).
The DHCP server acknowledges and responds with DHCP option 114 and the location of the installation image.
ONIE downloads the Cumulus Linux image, installs, and reboots.
You are now running Cumulus Linux.
The most common way is to send DHCP option 114 with the entire URL to the web server (this can be the same system). However, there are other ways you can use DHCP even if you do not have full control over DHCP. See the ONIE user guide for information on partial installer URLs and advanced DHCP options; both articles list more supported DHCP options.
Here is an example DHCP configuration with an ISC DHCP server:
Place the Cumulus Linux image in a directory on the web server.
From the Cumulus Linux command prompt, run the onie-install command, then reboot the switch.
cumulus@switch:~$ sudo onie-install -a -i http://10.0.1.251/path/to/cumulus-install-x86_64.bin
Install Using a Web Server With no DHCP
Follow the steps below if you can log into the switch on a serial console (ONIE), or you can log in on the console or with ssh (Install from Cumulus Linux) but no DHCP server is available.
You need a console connection to access the switch; you cannot perform this procedure remotely.
ONIE is in discovery mode. You must disable discovery mode with the following command:
onie# onie-discovery-stop
On older ONIE versions, if the onie-discovery-stop command is not supported, run:
onie# /etc/init.d/discover.sh stop
Assign a static address to eth0 with the ip addr add command:
ONIE:/ #ip addr add 10.0.1.252/24 dev eth0
Place the Cumulus Linux image in a directory on your web server.
Run the installer manually (because there are no DHCP options):
The onie-install command lets you stage a Cumulus Linux image and other files, such as a ZTP script or an NVUE startup.yaml file, then run the installation on the switch when you are ready.
You can provide the following file paths with the onie-install command:
The local file path (absolute or relative path)
http://server/path/
https://server/path/
scp://user@server/path/
ftp://server/path/ (anonymous only)
Use these options to stage additional files with the Cumulus Linux image:
To activate the staged installation, use the -a option, then reboot the switch:
cumulus@cumulus:~$ sudo onie-install -a
WARNING: This will wipe out all system data
WARNING: Make sure to back up your data
Are you sure (N/y)? y
Activating staged installer...done.
Reboot required to take effect.
You can combine the -i, -z, -t and -a options all at once. In addition, you can use the -f (force) option together with the -a option to suppress the yes and no prompts:
Follow the steps below to install the Cumulus Linux image using a USB drive.
Installing Cumulus Linux using a USB drive is fine for a single switch here and there but is not scalable. DHCP can scale to hundreds of switch installs with zero manual input unlike USB installs.
From a computer, prepare your USB drive by formatting it using one of the supported formats: FAT32, vFAT or EXT2.
▼
Optional: Prepare a USB Drive inside Cumulus Linux
a. Insert your USB drive into the USB port on the switch running Cumulus Linux and log in to the switch. Examine output from cat /proc/partitions and sudo fdisk -l [device] to determine the location of your USB drive. For example, sudo fdisk -l /dev/sdb.
These instructions assume your USB drive is the /dev/sdb device, which is typical if you insert the USB drive after the machine is already booted. However, if you insert the USB drive during the boot process, it is possible that your USB drive is the /dev/sda device. Make sure to modify the commands below to use the proper device for your USB drive.
b. Create a new partition table on the USB drive. If the parted utility is not on the system, install it with sudo -E apt-get install parted.
sudo parted /dev/sdb mklabel msdos
c. Create a new partition on the USB drive:
sudo parted /dev/sdb -a optimal mkpart primary 0% 100%
d. Format the partition to your filesystem of choice using one of the examples below:
When using a MAC or Windows computer to rename the installation file, the file extension can still be present. Make sure you remove the file extension so that ONIE can detect the file.
Insert the USB drive into the switch, then prepare the switch for installation:
If the switch is offline, connect to the console and power on the switch.
If the switch is already online in ONIE, use the reboot command.
SSH sessions to the switch get dropped after this step. To complete the remaining instructions, connect to the console of the switch. Cumulus Linux switches display their boot process to the console; you need to monitor the console specifically to complete the next step.
Monitor the console and select the ONIE option from the first GRUB screen shown below.
Cumulus Linux on x86 uses GRUB chainloading to present a second GRUB menu specific to the ONIE partition. No action is necessary in this menu to select the default option ONIE: Install OS.
The switch recognizes the USB drive and mounts it automatically. Cumulus Linux installation begins.
After installation completes, the switch automatically reboots into the newly installed instance of Cumulus Linux.
ONIE Installation Options
You can run several installer command line options from ONIE to perform basic switch configuration automatically after installation completes and Cumulus Linux boots for the first time. These options enable you to:
Set a unique password for the cumulus user
Provide an initial network configuration
Execute a ZTP script to perform necessary configuration
The onie-nos-install command does not allow you to specify command line parameters. You must access the switch from the console and transfer a disk image to the switch. You must then make the disk image executable and install the image directly from the ONIE command line with the options you want to use.
The following example commands transfer a disk image to the switch, make the image executable, and install the image with the --password option to change the default cumulus user password:
You can run more than one option in the same command.
Set the cumulus User Password
The default cumulus user account password is cumulus. When you log into Cumulus Linux for the first time, you must provide a new password for the cumulus account, then log back into the system.
To automate this process, you can specify a new password from the command line of the installer with the --password '<clear text-password>' option. For example, to change the default cumulus user password to MyP4$$word:
To provide a hashed password instead of a clear text password, use the --hashed-password '<hash>' option. An encrypted hash maintains a secure management network.
Generate a sha-512 password hash with the following openssl command. The example command generates a sha-512 password hash for the password MyP4$$word.
If you specify both the --password and --hashed-password options, the --hashed-password option takes precedence and the switch ignores the --password option.
Provide Initial Network Configuration
To provide initial network configuration automatically when Cumulus Linux boots for the first time after installation, use the --interfaces-file <filename> option. For example, to copy the contents of a file called network.intf into the /etc/network/interfaces file and run the ifreload -a command:
To run a ZTP script that contains commands to execute after Cumulus Linux boots for the first time after installation, use the --ztp <filename> option. For example, to run a ZTP script called initial-conf.ztp:
The ZTP script must contain the CUMULUS-AUTOPROVISIONING string near the beginning of the file and must reside on the ONIE filesystem. Refer to Zero Touch Provisioning - ZTP.
If you use the --ztp option together with any of the other command line options, the ZTP script takes precedence and the switch ignores other command line options.
Change the Default BIOS Password
To provide a layer of security and to prevent unauthorized access to the switch, NVIDIA recommends you change the default BIOS password. The default BIOS password is admin.
To change the default BIOS password:
During system boot, press Ctrl+B through the serial console while the BIOS version prints.
From the Security menu, select Administrator Password.
Follow the prompts.
Edit the Cumulus Linux Image (Advanced)
The Cumulus Linux disk image file contains a BASH script that includes a set of variables. You can set these variables to be able to install a fully configured system with a single image file.
▼
To edit the image
Example Image File
The Cumulus Linux disk image file is a self-extracting executable. The executable part of the file is a BASH script at the beginning of the file. Towards the beginning of this BASH script are a set of variables with empty strings:
Defines the clear text password. This variable is equivalent to the ONIE installer command line option --password.
CL_INSTALLER_HASHED_PASSWORD
Defines the hashed password. This variable is equivalent to the ONIE installer command line option --hashed-password. If you set both the CL_INSTALLER_PASSWORD and CL_INSTALLER_HASHED_PASSWORD variable, the CL_INSTALLER_HASHED_PASSWORD takes precedence.
CL_INSTALLER_INTERFACES_FILENAME
Defines the name of the file on the ONIE filesystem you want to use as the /etc/network/interfaces file. This variable is equivalent to the ONIE installer command line option --interfaces-file.
CL_INSTALLER_INTERFACES_CONTENT
Describes the network interfaces available on your system and how to activate them. Setting this variable defines the contents of the /etc/network/interfaces file. There is no equivalent ONIE installer command line option. If you set both the CL_INSTALLER_INTERFACES_FILENAME and CL_INSTALLER_INTERFACES_CONTENT variables, the CL_INSTALLER_INTERFACES_FILENAME takes precedence.
CL_INSTALLER_ZTP_FILENAME
Defines the name of the ZTP file on the ONIE filesystem you want to execute at first boot after installation. This variable is equivalent to the ONIE installer command line option --ztp
Edit the Image File
Because the Cumulus Linux image file is a binary file, you cannot use standard text editors to edit the file directly. Instead, you must split the file into two parts, edit the first part, then put the two parts back together.
Copy the first 20 lines to an empty file:
head -20 cumulus-linux-4.4.0-mlx-amd64.bin > cumulus-linux-4.4.0-mlx-amd64.bin.1
Remove the first 20 lines of the image, then copy the remaining lines into another empty file:
sed -e '1,20d' cumulus-linux-4.4.0-mlx-amd64.bin > cumulus-linux-4.4.0-mlx-amd64.bin.2
The original file is now split, with the first 20 lines in cumulus-linux-4.4.0-mlx-amd64.bin.1 and the remaining lines in cumulus-linux-4.4.0-mlx-amd64.bin.2.
Use a text editor to change the variables in cumulus-linux-4.4.0-mlx-amd64.bin.1.
Calculate the new checksum and update the CL_INSTALLER_PAYLOAD_SHA256 variable. sed -e '1,/^exit_marker$/d' "cumulus-linux-4.4.0-mlx-amd64.bin.final" | sha256sum | awk '{ print $1 }'
This following example shows a modified image file:
...
CL_INSTALLER_PAYLOAD_SHA256='d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac332e42f'
CL_INSTALLER_PASSWORD='MyP4$$word'
CL_INSTALLER_HASHED_PASSWORD=''
CL_INSTALLER_LICENSE='customer@datacenter.com|4C3YMCACDiK0D/EnrxlXpj71FBBNAg4Yrq+brza4ZtJFCInvalid'
CL_INSTALLER_INTERFACES_FILENAME=''
CL_INSTALLER_INTERFACES_CONTENT='# This file describes the network interfaces available on your system and how to activate them.
source /etc/network/interfaces.d/*.intf
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet dhcp
vrf mgmt
auto bridge
iface bridge
bridge-ports swp1 swp2
bridge-pvid 1
bridge-vids 10 11
bridge-vlan-aware yes
auto mgmt
iface mgmt
address 127.0.0.1/8
address ::1/128
vrf-table auto
'
CL_INSTALLER_ZTP_FILENAME=''
...
You can install this edited image file in the usual way, by using the ONIE install waterfall or the onie-nos-install command.
If you install the modified installation image and specify installer command line parameters, the command line parameters take precedence over the variables modified in the image.
Secure Boot
Secure Boot validates each binary image loaded during system boot with key signatures that correspond to a stored trusted key in firmware.
Secure Boot is only on the NVIDIA SN3700C-S switch and switches with the Spectrum-4 ASIC.
Secure Boot settings are in the BIOS Security menu. To access BIOS, press Ctrl+B through the serial console during system boot while the BIOS version prints:
To access the BIOS menu, use admin which is the default BIOS password:
NVIDIA recommends changing the default BIOS password; navigate to Security and select Administrator Password.
To validate or change the Secure Boot mode, navigate to Security and select Secure Boot:
In the Secure Boot menu, you can enable and disable Secure Boot mode. To install an unsigned version of Cumulus Linux or access ONIE without a prompt for a username and password, set Secure Boot to disabled:
To access ONIE when Secure Boot is enabled, authentication is necessary. The default username and password are both root:
ONIE: Rescue Mode ...
Platform : x86_64-mlnx_x86-r0
Version : 2021.02-5.3.0006-rc3-115200
Build Date: 2021-05-20T14:27+03:00
Info: Mounting kernel filesystems... done.
Info: Mounting ONIE-BOOT on /mnt/onie-boot ...
[ 17.011057] ext4 filesystem being mounted at /mnt/onie-boot supports timestamps until 2038 (0x7fffffff)
Info: Mounting EFI System on /boot/efi ...
Info: BIOS mode: UEFI
Info: Using eth0 MAC address: b8:ce:f6:3c:62:06
Info: eth0: Checking link... up.
Info: Trying DHCPv4 on interface: eth0
ONIE: Using DHCPv4 addr: eth0: 10.20.84.226 / 255.255.255.0
Starting: klogd... done.
Starting: dropbear ssh daemon... done.
Starting: telnetd... done.
discover: Rescue mode detected. Installer disabled.
Please press Enter to activate this console. To check the install status inspect /var/log/onie.log.
Try this: tail -f /var/log/onie.log
** Rescue Mode Enabled **
login: root
Password: root
ONIE:~ #
To validate the Secure Boot status of a system from Cumulus Linux, run the mokutil --sb-state command.
On a switch with the Spectrum-4 ASIC, if the ASIC firmware fails to boot, you see a message alerting you to contact NVIDIA Customer Support for further options.
The default password for the cumulus user account is cumulus. The first time you log into Cumulus Linux, you must change this default password. Be sure to update any automation scripts before you upgrade. You can use ONIE command line options to change the default password automatically during the Cumulus Linux image installation process. Refer to ONIE Installation Options.
This topic describes how to upgrade Cumulus Linux on your switch.
Consider deploying, provisioning, configuring, and upgrading switches using automation, even with small networks or test labs. During the upgrade process, you can upgrade dozens of devices in a repeatable manner. Using tools like Ansible, Chef, or Puppet for configuration management greatly increases the speed and accuracy of the next major upgrade; these tools also enable you to quickly swap failed switch hardware.
Understanding the location of configuration data is important for successful upgrades, migrations, and backup. As with other Linux distributions, the /etc directory is the primary location for all configuration data in Cumulus Linux. The following list contains the files you need to back up and migrate to a new release. Make sure you examine any changed files. Make the following files and directories part of a backup strategy.
File Name and Location
Description
Cumulus Linux Documentation
Debian Documentation
/etc/frr/
Routing application (responsible for BGP and OSPF)
If you are using the root user account, consider including /root/.
If you have custom user accounts, consider including /home/<username>/.
File Name and Location
Description
/etc/mlx/
Per-platform hardware configuration directory, created on first boot. Do not copy.
/etc/default/clagd
Created and managed by ifupdown2. Do not copy.
/etc/default/grub
Grub init table. Do not modify manually.
/etc/default/hwclock
Platform hardware-specific file. Created during first boot. Do not copy.
/etc/init
Platform initialization files. Do not copy.
/etc/init.d/
Platform initialization files. Do not copy.
/etc/fstab
Static information on filesystem. Do not copy.
/etc/image-release
System version data. Do not copy.
/etc/os-release
System version data. Do not copy.
/etc/lsb-release
System version data. Do not copy.
/etc/lvm/archive
Filesystem files. Do not copy.
/etc/lvm/backup
Filesystem files. Do not copy.
/etc/modules
Created during first boot. Do not copy.
/etc/modules-load.d/
Created during first boot. Do not copy.
/etc/sensors.d
Platform-specific sensor data. Created during first boot. Do not copy.
/root/.ansible
Ansible tmp files. Do not copy.
/home/cumulus/.ansible
Ansible tmp files. Do not copy.
The following commands verify which files have changed compared to the previous Cumulus Linux install. Be sure to back up any changed files.
Run the sudo dpkg --verify command to show a list of changed files.
Run the egrep -v '^$|^#|=""$' /etc/default/isc-dhcp-* command to see if any of the generated /etc/default/isc-* files have changed.
Back Up and Restore Configuration with NVUE
You can back up and restore the configuration file with NVUE only if you used NVUE commands to configure the switch you want to upgrade.
To back up and restore the configuration file:
Save the configuration to the /etc/nvue.d/startup.yaml file with the nv config save command:
cumulus@switch:~$ nv config save
saved
Copy the /etc/nvue.d/startup.yaml file off the switch to a different location.
After upgrade is complete, restore the configuration. Copy the /etc/nvue.d/startup.yaml file to the switch, then run the nv config apply startup command:
If NVUE introduces new syntax for the feature that a snippet configures, you must remove the snippet before upgrading.
Create a cl-support File
Before and after you upgrade the switch, run the cl-support script to create a cl-support archive file. The file is a compressed archive of useful information for troubleshooting. If you experience any issues during upgrade, you can send this archive file to the Cumulus Linux support team to investigate.
Create the cl-support archive file with the cl-support command:
cumulus@switch:~$ sudo cl-support
Copy the cl-support file off the switch to a different location.
After upgrade is complete, run the cl-support command again to create a new archive file:
cumulus@switch:~$ sudo cl-support
Upgrade Cumulus Linux
You can upgrade Cumulus Linux in one of two ways:
Install a Cumulus Linux image of the new release, using ONIE.
Upgrade only the changed packages using the sudo -E apt-get update and sudo -E apt-get upgrade command.
Cumulus Linux also provides ISSU to upgrade an active switch with minimal disruption to the network. See In-Service-System-Upgrade-ISSU.
To upgrade to Cumulus Linux 5.8.0 from Cumulus Linux 4.x or 3.x, you must install a disk image of the new release using ONIE. You cannot upgrade packages with the apt-get upgrade command.
Upgrading an MLAG pair requires additional steps. If you are using MLAG to dual connect two Cumulus Linux switches in your environment, follow the steps in Upgrade Switches in an MLAG Pair below to ensure a smooth upgrade.
Install a Cumulus Linux Image or Upgrade Packages?
The decision to upgrade Cumulus Linux by either installing a Cumulus Linux image or upgrading packages depends on your environment and your preferences. Here are some recommendations for each upgrade method.
Install a Cumulus Linux image if you are performing a rolling upgrade in a production environment and if you are using up-to-date and comprehensive automation scripts. This upgrade method enables you to choose the exact release to which you want to upgrade and is the only method available to upgrade your switch to a new release train (for example, from 4.4.3 to 5.8.0).
Be aware of the following when installing the Cumulus Linux image:
Installing a Cumulus Linux image is destructive; any configuration files on the switch are not saved; copy them to a different server before you start the Cumulus Linux image install.
You must move configuration data to the new OS using ZTP or automation while the OS is first booted, or soon afterwards using out-of-band management.
Merge conflicts with configuration file changes in the new release sometimes go undetected.
If configuration files do not restore correctly, you cannot ssh to the switch from in-band management. Use out-of-band connectivity (eth0 or console).
You must reinstall and reconfigure third-party applications after upgrade.
Run package upgrade if you are upgrading from Cumulus Linux 5.0.0 to a later 5.x release, or if you use third-party applications (package upgrade does not replace or remove third-party applications, unlike the Cumulus Linux image install).
Be aware of the following when upgrading packages:
You cannot upgrade the switch to a new release train. For example, you cannot upgrade the switch from 4.x to 5.x.
You can only use package upgrade to upgrade a switch with an image install to a maximum of two releases; for example, you can package upgrade a switch running the Cumulus Linux 5.6 image to 5.7 or 5.8 (5.6 plus two releases).
The sudo -E apt-get upgrade command might restart or stop services as part of the upgrade process.
The sudo -E apt-get upgrade command might disrupt core services by changing core service dependency packages.
After you upgrade, account UIDs and GIDs created by packages might be different on different switches, depending on the configuration and package installation history.
Cumulus Linux does not support the sudo -E apt-get dist-upgrade command. Be sure to use sudo -E apt-get upgrade when upgrading packages.
Cumulus Linux Image Install (ONIE)
ONIE is an open source project (equivalent to PXE on servers) that enables the installation of network operating systems (NOS) on a bare metal switch.
To upgrade the switch:
Back up the configurations off the switch.
Download the Cumulus Linux image.
Install the Cumulus Linux image with the onie-install -a -i <image-location> command, which boots the switch into ONIE. The following example command installs the image from a web server, then reboots the switch. There are additional ways to install the Cumulus Linux image, such as using FTP, a local file, or a USB drive. For more information, see Installing a New Cumulus Linux Image.
cumulus@switch:~$ sudo onie-install -a -i http://10.0.1.251/cumulus-linux-5.8.0-mlx-amd64.bin && sudo reboot
Restore the configuration files to the new release (NVIDIA does not recommend restoring files with automation).
Verify correct operation with the old configurations on the new release.
Reinstall third party applications and associated configurations.
Package Upgrade
NVUE deprecated the port split command options (2x10G, 2x25G, 2x40G, 2x50G, 2x100G, 2x200G, 4x10G, 4x25G, 4x50G, 4x100G, 8x50G) available in Cumulus Linux 5.3 and earlier. If you use NVUE to configure port breakout speeds in Cumulus 5.3 or earlier, NVUE automatically updates the configuration during upgrade to Cumulus Linux 5.5 and later to use the new format (2x, 4x, 8x).
Cumulus Linux continues to support the old port split format in the /etc/cumulus/ports.conf file; however NVIDIA recommends that you use the new format.
Cumulus Linux completely embraces the Linux and Debian upgrade workflow, where you use an installer to install a base image, then perform any upgrades within that release train with sudo -E apt-get update and sudo -E apt-get upgrade commands. Any packages that have changed after the base install get upgraded in place from the repository. All switch configuration files remain untouched, or in rare cases merged (using the Debian merge function) during the package upgrade.
When you use package upgrade to upgrade your switch, configuration data stays in place during the upgrade. If the new release updates a previously changed configuration file, the upgrade process prompts you to either specify the version you want to use or evaluate the differences.
When you upgrade a switch from Cumulus Linux 5.5 or earlier to 5.8.0 with package upgrade, expired GPG keys prevent you from upgrading. To work around this issue, install the new keys with the following commands, then upgrade the switch.
Upgrade all the packages to the latest distribution.
cumulus@switch:~$ sudo -E apt-get upgrade
If you do not need to reboot the switch after the upgrade completes, the upgrade ends, restarts all upgraded services, and logs messages in the /var/log/syslog file similar to the ones shown below. In the examples below, the process only upgrades the frr package.
Policy: Service frr.service action stop postponed
Policy: Service frr.service action start postponed
Policy: Restarting services: frr.service
Policy: Finished restarting services
Policy: Removed /usr/sbin/policy-rc.d
Policy: Upgrade is finished
If the upgrade process encounters changed configuration files that have new versions in the release to which you are upgrading, you see a message similar to this:
Configuration file '/etc/frr/daemons'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
*** daemons (Y/I/N/O/D/Z) [default=N] ?
To see the differences between the currently installed version and the new version, type D.
To keep the currently installed version, type N. The new package version installs with the suffix .dpkg-dist (for example, /etc/frr/daemons.dpkg-dist). When the upgrade completes and before you reboot, merge your changes with the changes from the newly installed file.
To install the new version, type I. Your currently installed version has the suffix .dpkg-old.
Cumulus Linux includes /etc/apt/sources.list in the cumulus-archive-keyring package. During upgrade, you must select if you want the new version from the package or the existing file.
When the upgrade is complete, you can search for the files with the sudo find / -mount -type f -name '*.dpkg-*' command.
If you see errors for expired GPG keys that prevent you from upgrading packages, follow the steps in Upgrading Expired GPG Keys.
Reboot the switch if the upgrade messages indicate that you need to perform a system restart.
cumulus@switch:~$ sudo -E apt-get upgrade
... upgrade messages here ...
*** Caution: Service restart prior to reboot could cause unpredictable behavior
*** System reboot required ***
cumulus@switch:~$ sudo reboot
Verify correct operation with the old configurations on the new version.
Upgrade Notes
Package upgrade always updates to the latest available release in the Cumulus Linux repository. For example, if you are currently running Cumulus Linux 5.0.0 and run the sudo -E apt-get upgrade command on that switch, the packages upgrade to the latest releases in the latest 5.x release.
Because Cumulus Linux is a collection of different Debian Linux packages, be aware of the following:
The /etc/os-release and /etc/lsb-release files update to the currently installed Cumulus Linux release when you upgrade the switch using either package upgrade or Cumulus Linux image install. For example, if you run sudo -E apt-get upgrade and the latest Cumulus Linux release on the repository is 5.8.0, these two files display the release as 5.8.0 after the upgrade.
The /etc/image-release file updates only when you run a Cumulus Linux image install. Therefore, if you run a Cumulus Linux image install of Cumulus Linux 5.5.0, followed by a package upgrade to 5.5.1 using sudo -E apt-get upgrade, the /etc/image-release file continues to display Cumulus Linux 5.5.0, which is the originally installed base image.
Upgrade Switches in an MLAG Pair
If you are using MLAG to dual connect two switches in your environment, follow the steps below to upgrade the switches.
You must upgrade both switches in the MLAG pair to the same release of Cumulus Linux.
Only during the upgrade process does Cumulus Linux supports different software versions between MLAG peer switches. After you upgrade the first MLAG switch in the pair, run the clagctl showtimers command to monitor the init-delay timer. When the timer expires, make the upgraded MLAG switch the primary, then upgrade the peer to the same version of Cumulus Linux.
NVIDIA has not tested running different versions of Cumulus Linux on MLAG peer switches outside of the upgrade time period; you might see unexpected results.
Verify the switch is in the secondary role:
cumulus@switch:~$ nv show mlag
Shut down the core uplink layer 3 interfaces. The following example shuts down swp1:
cumulus@switch:~$ nv set interface swp1 link state down
cumulus@switch:~$ nv config apply
Shut down the peer link:
cumulus@switch:~$ nv set interface peerlink link state down
cumulus@switch:~$ nv config apply
To boot the switch into ONIE, run the onie-install -a -i <image-location> command. The following example command installs the image from a web server. There are additional ways to install the Cumulus Linux image, such as using FTP, a local file, or a USB drive. For more information, see Installing a New Cumulus Linux Image.
cumulus@switch:~$ sudo onie-install -a -i http://10.0.1.251/downloads/cumulus-linux-5.8.0-mlx-amd64.bin
To upgrade the switch with package upgrade instead of booting into ONIE, run the sudo -E apt-get update and sudo -E apt-get upgrade commands; see Package Upgrade.
Save the changes to the NVUE configuration from steps 2-3 and reboot the switch:
cumulus@switch:~$ nv config save
cumulus@switch:~$ nv action reboot system
If you installed a new image on the switch, restore the configuration files to the new release. If you performed an upgrade with apt, bring the uplink and peer link interfaces you shut down in steps 2-3 up:
cumulus@switch:~$ nv set interface swp1 link state up
cumulus@switch:~$ nv set interface peerlink link state up
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv config save
Verify STP convergence across both switches with the Linux mstpctl showall command. NVUE does not provide an equivalent command.
cumulus@switch:~$ mstpctl showall
Verify core uplinks and peer links are UP:
cumulus@switch:~$ nv show interface
Verify MLAG convergence:
cumulus@switch:~$ nv show mlag
Make this secondary switch the primary:
cumulus@switch:~$ nv set mlag priority 2084
Verify the other switch is now in the secondary role.
Repeat steps 2-9 on the new secondary switch.
Remove the priority 2048 and restore the priority back to 32768 on the current primary switch:
cumulus@switch:~$ nv set mlag priority 32768
Verify the switch is in the secondary role:
cumulus@switch:~$ clagctl status
Shut down the core uplink layer 3 interfaces:
cumulus@switch:~$ sudo ip link set <switch-port> down
Shut down the peer link:
cumulus@switch:~$ sudo ip link set peerlink down
To boot the switch into ONIE, run the onie-install -a -i <image-location> command. The following example command installs the image from a web server. There are additional ways to install the Cumulus Linux image, such as using FTP, a local file, or a USB drive. For more information, see Installing a New Cumulus Linux Image.
cumulus@switch:~$ sudo onie-install -a -i http://10.0.1.251/downloads/cumulus-linux-5.8.0-mlx-amd64.bin
To upgrade the switch with package upgrade instead of booting into ONIE, run the sudo -E apt-get update and sudo -E apt-get upgrade commands; see Package Upgrade.
Reboot the switch:
cumulus@switch:~$ sudo reboot
If you installed a new image on the switch, restore the configuration files to the new release.
Verify STP convergence across both switches:
cumulus@switch:~$ mstpctl showall
Verify that core uplinks and peer links are UP:
cumulus@switch:~$ ip addr show
Verify MLAG convergence:
cumulus@switch:~$ clagctl status
Make this secondary switch the primary:
cumulus@switch:~$ clagctl priority 2048
Verify the other switch is now in the secondary role.
Repeat steps 2-9 on the new secondary switch.
Remove the priority 2048 and restore the priority back to 32768 on the current primary switch:
cumulus@switch:~$ clagctl priority 32768
Roll Back a Cumulus Linux Installation
Even the most well planned and tested upgrades can result in unforeseen problems and sometimes the best solution is to roll back to the previous state. These main strategies require detailed planning and execution:
Flatten and rebuild. If the OS becomes unusable, you can use orchestration tools to reinstall the previous OS release from scratch and then rebuild the configuration automatically.
Restore to a previous state using a backup configuration captured before the upgrade.
The method you employ is specific to your deployment strategy. Providing detailed steps for each scenario is outside the scope of this document.
Third Party Packages
If you install any third party applications on a Cumulus Linux switch, configuration data is typically installed in the /etc directory, but it is not guaranteed. It is your responsibility to understand the behavior and configuration file information of any third party packages installed on the switch.
After you upgrade using a full Cumulus Linux image install, you need to reinstall any third party packages or any Cumulus Linux add-on packages.
To manage additional applications in the form of packages and to install the latest updates, use the Advanced Packaging Tool (apt).
Updating, upgrading, and installing packages with apt causes disruptions to network services:
Upgrading a package can cause services to restart or stop.
Installing a package sometimes disrupts core services by changing core service dependency packages. In some cases, installing new packages also upgrades additional existing packages due to dependencies.
If services stop, you need to reboot the switch to restart the services.
Update the Package Cache
To work correctly, apt relies on a local cache listing of the available packages. You must populate the cache initially, then periodically update it with sudo -E apt-get update:
Use the -E option with sudo whenever you run any apt-get command. This option preserves your environment variables (such as HTTP proxies) before you install new packages or upgrade your distribution.
List Available Packages
After the cache populates, use the apt-cache command to search the cache and find the packages of interest or to get information about an available package.
Here are examples of the search and show sub-commands:
cumulus@switch:~$ apt-cache search tcp
collectd-core - statistics collection and monitoring daemon (core system)
fakeroot - tool for simulating superuser privileges
iperf - Internet Protocol bandwidth measuring tool
iptraf-ng - Next Generation Interactive Colorful IP LAN Monitor
libfakeroot - tool for simulating superuser privileges - shared libraries
libfstrm0 - Frame Streams (fstrm) library
libibverbs1 - Library for direct userspace use of RDMA (InfiniBand/iWARP)
libnginx-mod-stream - Stream module for Nginx
libqt4-network - Qt 4 network module
librtr-dev - Small extensible RPKI-RTR-Client C library - development files
librtr0 - Small extensible RPKI-RTR-Client C library
libwiretap8 - network packet capture library -- shared library
libwrap0 - Wietse Venema's TCP wrappers library
libwrap0-dev - Wietse Venema's TCP wrappers library, development files
netbase - Basic TCP/IP networking system
nmap-common - Architecture independent files for nmap
nuttcp - network performance measurement tool
openssh-client - secure shell (SSH) client, for secure access to remote machines
openssh-server - secure shell (SSH) server, for secure access from remote machines
openssh-sftp-server - secure shell (SSH) sftp server module, for SFTP access from remote machines
python-dpkt - Python 2 packet creation / parsing module for basic TCP/IP protocols
rsyslog - reliable system and kernel logging daemon
socat - multipurpose relay for bidirectional data transfer
tcpdump - command-line network traffic analyzer
cumulus@switch:~$ apt-cache show tcpdump
Package: tcpdump
Version: 4.9.3-1~deb10u1
Installed-Size: 1109
Maintainer: Romain Francoise <rfrancoise@debian.org>
Architecture: amd64
Replaces: apparmor-profiles-extra (<< 1.12~)
Depends: libc6 (>= 2.14), libpcap0.8 (>= 1.5.1), libssl1.1 (>= 1.1.0)
Suggests: apparmor (>= 2.3)
Breaks: apparmor-profiles-extra (<< 1.12~)
Size: 400060
SHA256: 3a63be16f96004bdf8848056f2621fbd863fadc0baf44bdcbc5d75dd98331fd3
SHA1: 2ab9f0d2673f49da466f5164ecec8836350aed42
MD5sum: 603baaf914de63f62a9f8055709257f3
Description: command-line network traffic analyzer
This program allows you to dump the traffic on a network. tcpdump
is able to examine IPv4, ICMPv4, IPv6, ICMPv6, UDP, TCP, SNMP, AFS
BGP, RIP, PIM, DVMRP, IGMP, SMB, OSPF, NFS and many other packet
types.
.
It can be used to print out the headers of packets on a network
interface, filter packets that match a certain expression. You can
use this tool to track down network problems, to detect attacks
or to monitor network activities.
Description-md5: f01841bfda357d116d7ff7b7a47e8782
Homepage: http://www.tcpdump.org/
Multi-Arch: foreign
Section: net
Priority: optional
Filename: pool/upstream/t/tcpdump/tcpdump_4.9.3-1~deb10u1_amd64.deb
The search commands look for the search terms not only in the package name but in other parts of the package information; the search matches on more packages than you expect.
List Packages Installed on the System
The apt-cache command shows information about all the packages available in the repository. To see which packages are actually installed on your system with the version, run the following command.
cumulus@switch:~$ nv show platform software installed
Installed Package description package version
----------------- ------------------- ---------- --------------------
acpi displays information on ACPI devices acpi 1.7-1.1
acpi-support-base scripts for handling base ACPI events such as the power button acpi-support-base 0.142-8
acpid Advanced Configuration and Power Interface event daemon acpid 1:2.0.31-1
...
cumulus@switch:~$ dpkg -l
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===================-=========================-============-=================================
ii acpi 1.7-1.1 amd64 displays information on ACPI devices
ii acpi-support-base 0.142-8 all scripts for handling base ACPI events such as th
ii acpid 1:2.0.31-1 amd64 Advanced Configuration and Power Interface event
ii adduser 3.118 all add and remove users and groups
ii apt 1.8.2 amd64 commandline package manager
ii arping 2.19-6 amd64 sends IP and/or ARP pings (to the MAC address)
ii arptables 0.0.4+snapshot20181021-4 amd64 ARP table administration
...
Show the Version of a Package
To show the version of a specific package installed on the system:
The following example command shows which version of the vrf package is on the system:
cumulus@switch:~$ nv show platform software installed vrf
running applied pending description
----------- ------------------- ------- ------- -----------
description Linux tools for VRF Description
package vrf Package
version 1.0-cl5.7.0u9 Version
The following example command shows which version of the vrf package is on the system:
cumulus@switch:~$ dpkg -l vrf
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==========-============-============-=================================
ii vrf 1.0-cl5.7.0u9 amd64 Linux tools for VRF
Upgrade Packages
To upgrade all the packages installed on the system to their latest versions, run the following commands:
The system lists the packages for upgrade and prompts you to continue.
The above commands upgrade all installed versions with their latest versions but do not install any new packages.
Add New Packages
To add a new package, first ensure the package is not already on the system:
cumulus@switch:~$ dpkg -l | grep <name of package>
If the package is already on the system, you can update the package from the Cumulus Linux repository as part of the package upgrade process, which upgrades all packages on the system. See Upgrade Packages above.
If the package is not already on the system, add it by running sudo -E apt-get install <name of package>. This retrieves the package from the Cumulus Linux repository and installs it on your system together with any other dependent packages. The following example adds the tcpreplay package to the system:
cumulus@switch:~$ sudo -E apt-get update
cumulus@switch:~$ sudo -E apt-get install tcpreplay
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
tcpreplay
0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
Need to get 436 kB of archives.
After this operation, 1008 kB of additional disk space will be used
...
You can install several packages at the same time:
In some cases, installing a new package also upgrades additional existing packages due to dependencies. To view these additional packages before you install, run the apt-get install --dry-run command.
Add Packages From Another Repository
As shipped, Cumulus Linux searches the Cumulus Linux repository for available packages. You can add additional repositories to search by adding them to the list of sources that apt-get consults. See man sources.list for more information.
NVIDIA adds features or makes bug fixes to certain packages; do not replace these packages with versions from other repositories.
If you want to install packages that are not in the Cumulus Linux repository, the procedure is the same as above, but with one additional step.
NVIDIA does not test and Cumulus Linux Technical Support does not support packages that are not part of the Cumulus Linux repository.
Installing packages outside of the Cumulus Linux repository requires the use of sudo -E apt-get; however, depending on the package, you can use easy-install and other commands.
To install a new package, complete the following steps:
Run the dpkg command to ensure that the package is not already
installed on the system:
cumulus@switch:~$ dpkg -l | grep <name of package>
If the package is already on the system, ensure it is the version you need. If it is an older version, update the package from the Cumulus Linux repository:
If the package is not on the system, the package source location is not in the /etc/apt/sources.list file. Edit and add the appropriate source to the file. For example, add the following if you want a package from the Debian repository that is not in the Cumulus Linux repository:
deb http://http.us.debian.org/debian buster main
deb http://security.debian.org/ buster/updates main
Otherwise, /etc/apt/sources.list lists the repository but comments it out. To uncomment the repository, remove the # at the start of the line, then save the file.
Run sudo -E apt-get update, then install the package and upgrade:
Cumulus Linux contains a local archive embedded in the Cumulus Linux image. This archive, cumulus-local-apt-archive, contains the packages you need to install ifplugd, LDAP, RADIUS or TACACS+ without a network connection.
The archive contains the following packages:
audisp-tacplus
ifplugd
libdaemon0
libnss-ldapd
libnss-mapuser
libnss-tacplus
libpam-ldapd
libpam-radius-auth
libpam-tacplus
libtac2
libtacplus-map1
nslcd
Add these packages with apt-get update && apt-get install, as described above.
man pages for apt-get, dpkg, sources.list, apt_preferences
Zero Touch Provisioning - ZTP
Use ZTP to deploy network devices in large-scale environments. On first boot, Cumulus Linux runs ZTP, which executes the provisioning automation that deploys the device for its intended role in the network.
The provisioning framework allows you to execute a one-time, user-provided script. You can develop this script using a variety of automation tools and scripting languages. You can also use it to add the switch to a configuration management (CM) platform such as Puppet, Chef, CFEngine or a custom, proprietary tool.
While developing and testing the provisioning logic, you can use the ztp command in Cumulus Linux to run your provisioning script manually on a device.
ZTP in Cumulus Linux can run automatically in one of the following ways, in this order:
Through a local file
Using a USB drive inserted into the switch (ZTP-USB)
Through DHCP
Use a Local File
ZTP only looks one time for a ZTP script on the local file system when the switch boots. ZTP searches for an install script that matches an ONIE-style waterfall in /var/lib/cumulus/ztp, looking for the most specific name first, and ending at the most generic:
You can also trigger the ZTP process manually by running the ztp --run <URL> command, where the URL is the path to the ZTP script.
Use a USB Drive
NVIDIA tests this feature only with thumb drives, not an external large USB hard drive.
If the ztp process does not discover a local script, it tries one time to locate an inserted but unmounted USB drive. If it discovers one, it begins the ZTP process.
Cumulus Linux supports the use of a FAT32, FAT16, or VFAT-formatted USB drive as an installation source for ZTP scripts. You must plug in the USB drive before you power up the switch.
At minimum, the script must:
Install the Cumulus Linux operating system.
Copy over a basic configuration to the switch.
Restart the switch or the relevant services to get switchd up and running with that configuration.
Follow these steps to perform ZTP using a USB drive:
Copy the installation image to the USB drive.
The ztp process searches the root filesystem of the newly mounted drive for filenames matching an ONIE-style waterfall (see the patterns and examples above), looking for the most specific name first, and ending at the most generic.
ZTP parses the contents of the script to ensure it contains the CUMULUS-AUTOPROVISIONING flag (see example scripts).
The USB drive mounts to a temporary directory under /tmp (for example, /tmp/tmpigGgjf/). To reference files on the USB drive, use the environment variable ZTP_USB_MOUNTPOINT to refer to the USB root partition.
ZTP Over DHCP
If the ztp process does not discover a local ONIE script or applicable USB drive, it checks DHCP every ten seconds for up to five minutes for the presence of a ZTP URL specified in /var/run/ztp.dhcp. The URL can be any of HTTP, HTTPS, FTP, or TFTP.
For ZTP using DHCP, provisioning initially takes place over the management network and initiates through a DHCP hook. A DHCP option specifies a configuration script. The ZTP process requests this script from the Web server and the script executes locally.
The ZTP process over DHCP follows these steps:
The first time you boot Cumulus Linux, eth0 makes a DHCP request. By default, Cumulus Linux sends DHCP option 60 (the vendor class identifier) with the value cumulus-linux x86_64 to identify itself to the DHCP server.
The DHCP server offers a lease to the switch.
If option 239 is in the response, the ZTP process starts.
The ZTP process requests the contents of the script from the URL, sending additional HTTP headers containing details about the switch.
ZTP parses the contents of the script to ensure it contains the CUMULUS-AUTOPROVISIONING flag (see example scripts).
If provisioning is necessary, the script executes locally on the switch with root privileges.
ZTP examines the return code of the script. If the return code is 0, ZTP marks the provisioning state as complete in the autoprovisioning configuration file.
Trigger ZTP Over DHCP
If you have not yet provisioned the switch, you can trigger the ZTP process over DHCP when eth0 uses DHCP and one of the following events occur:
The switch boots.
You plug a cable into or unplug a cable from the eth0 port.
You disconnect, then reconnect the switch power cord.
You can also run the ztp --run <URL> command, where the URL is the path to the ZTP script.
Configure the DHCP Server
During the DHCP process over eth0, Cumulus Linux requests DHCP option 239. This option specifies the custom provisioning script.
For example, the /etc/dhcp/dhcpd.conf file for an ISC DHCP server looks like:
Do not use an underscore (_) in the hostname; underscores are not permitted in hostnames.
DHCP on Front Panel Ports
ZTP runs DHCP on all the front panel switch ports and on any active interface. ZTP assesses the list of active ports on every retry cycle. When it receives the DHCP lease and option 239 is present in the response, ZTP starts to execute the script.
Inspect HTTP Headers
The following HTTP headers in the request to the web server retrieve the provisioning script:
Header Value Example
------ ----- -------
User-Agent CumulusLinux-AutoProvision/0.4
CUMULUS-ARCH CPU architecture x86_64
CUMULUS-BUILD 5.1.0
CUMULUS-MANUFACTURER odm
CUMULUS-PRODUCTNAME switch_model
CUMULUS-SERIAL XYZ123004
CUMULUS-BASE-MAC 44:38:39:FF:40:94
CUMULUS-MGMT-MAC 44:38:39:FF:00:00
CUMULUS-VERSION 5.1.0
CUMULUS-PROV-COUNT 0
CUMULUS-PROV-MAX 32
Write ZTP Scripts
You must include the following line in any of the supported scripts that you expect to run using the autoprovisioning framework.
# CUMULUS-AUTOPROVISIONING
The script must contain the CUMULUS-AUTOPROVISIONING flag. You can include this flag in a comment or remark; you do not need to echo or write the flag to stdout.
You can write the script in any language that Cumulus Linux supports, such as:
Perl
Python
Ruby
Shell
The script must return an exit code of 0 upon success to mark the process as complete in the autoprovisioning configuration file.
The following script installs Cumulus Linux from a USB drive and applies a configuration:
#!/bin/bash
function error() {
echo -e "\e[0;33mERROR: The ZTP script failed while running the command $BASH_COMMAND at line $BASH_LINENO.\e[0m" >&2
exit 1
}
# Log all output from this script
exec >> /var/log/autoprovision 2>&1
date "+%FT%T ztp starting script $0"
trap error ERR
#Add Debian Repositories
echo "deb http://http.us.debian.org/debian buster main" >> /etc/apt/sources.list
echo "deb http://security.debian.org/ buster/updates main" >> /etc/apt/sources.list
#Update Package Cache
apt-get update -y
#Load interface config from usb
cp ${ZTP_USB_MOUNTPOINT}/interfaces /etc/network/interfaces
#Load port config from usb
# (if breakout cables are used for certain interfaces)
cp ${ZTP_USB_MOUNTPOINT}/ports.conf /etc/cumulus/ports.conf
#Reload interfaces to apply loaded config
ifreload -a
# CUMULUS-AUTOPROVISIONING
exit 0
Continue Provisioning
Typically ZTP exits after executing the script locally and does not continue. To continue with provisioning so that you do not have to intervene manually or embed an Ansible callback into the script, you can add the CUMULUS-AUTOPROVISION-CASCADE directive.
Best Practices
ZTP scripts come in different forms and frequently perform the same tasks. As BASH is the most common language for ZTP scripts, use the following BASH snippets to perform common tasks with robust error checking.
Set the Default Cumulus User Password
The default cumulus user account password is cumulus. When you log into Cumulus Linux for the first time, you must provide a new password for the cumulus account, then log back into the system.
Add the following function to your ZTP script to change the default cumulus user account password to a clear-text password. The example changes the password cumulus to MyP4$$word.
function set_password(){
# Unexpire the cumulus account
passwd -x 99999 cumulus
# Set the password
echo 'cumulus:MyP4$$word' | chpasswd
}
set_password
If you have an insecure management network, set the password with an encrypted hash instead of a clear-text password.
First, generate a sha-512 password hash with the following python commands. The example commands generate a sha-512 password hash for the password MyP4$$word.
Then, add the following function to the ZTP script to change the default cumulus user account password:
function set_password(){
# Unexpire the cumulus account
passwd -x 99999 cumulus
# Set the password
usermod -p '$6$hs7OPmnrfvLNKfoZ$iB3hy5N6Vv6koqDmxixpTO6lej6VaoKGvs5E8p5zNo4tPec0KKqyQnrFMII3jGxVEYWntG9e7Z7DORdylG5aR/' cumulus
}
set_password
Test DNS Name Resolution
DNS names are frequently used in ZTP scripts. The ping_until_reachable function tests that each DNS name resolves into a reachable IP address. Call this function with each DNS target used in your script before you use the DNS name elsewhere in your script.
The following example shows how to call the ping_until_reachable function in the context of a larger task.
function ping_until_reachable(){
last_code=1
max_tries=30
tries=0
while [ "0" != "$last_code" ] && [ "$tries" -lt "$max_tries" ]; do
tries=$((tries+1))
echo "$(date) INFO: ( Attempt $tries of $max_tries ) Pinging $1 Target Until Reachable."
ping $1 -c2 &> /dev/null
last_code=$?
sleep 1
done
if [ "$tries" -eq "$max_tries" ] && [ "$last_code" -ne "0" ]; then
echo "$(date) ERROR: Reached maximum number of attempts to ping the target $1 ."
exit 1
fi
}
Check the Cumulus Linux Release
The following script segment demonstrates how to check which Cumulus Linux release is running and upgrades the node if the release is not the target release. If the release is the target release, normal ZTP tasks execute. This script calls the ping_until_reachable script (described above) to make sure the server holding the image server and the ZTP script is reachable.
If you apply a management VRF in your script, either apply it last or reboot instead. If you do not apply a management VRF last, you need to prepend any commands that require eth0 to communicate out with /usr/bin/ip vrf exec mgmt; for example, /usr/bin/ip vrf exec mgmt apt-get update -y.
Perform Ansible Provisioning Callbacks
After initially configuring a node with ZTP, use Provisioning Callbacks to inform Ansible Tower or AWX that the node is ready for more detailed provisioning. The following example demonstrates how to use a provisioning callback:
Make sure to disable the DHCP hostname override setting in your script.
function set_hostname(){
# Remove DHCP Setting of Hostname
sed s/'SETHOSTNAME="yes"'/'SETHOSTNAME="no"'/g -i /etc/dhcp/dhclient-exit-hooks.d/dhcp-sethostname
hostnamectl set-hostname $1
}
Test ZTP Scripts
Use these commands to test and debug your ZTP scripts.
You can use verbose mode to debug your script and see where your script fails. Include the -v option when you run ZTP:
cumulus@switch:~$ sudo ztp -v -r http://192.0.2.1/demo.sh
Attempting to provision via ZTP Manual from http://192.0.2.1/demo.sh
Broadcast message from root@dell-s6010-01 (ttyS0) (Tue May 10 22:44:17 2016):
ZTP: Attempting to provision via ZTP Manual from http://192.0.2.1/demo.sh
ZTP Manual: URL response code 200
ZTP Manual: Found Marker CUMULUS-AUTOPROVISIONING
ZTP Manual: Executing http://192.0.2.1/demo.sh
error: ZTP Manual: Payload returned code 1
error: Script returned failure
To see results of the most recent ZTP execution, you can run the ztp -s command.
cumulus@switch:~$ ztp -s
ZTP INFO:
State enabled
Version 1.0
Result Script Failure
Date Mon 20 May 2019 09:31:27 PM UTC
Method ZTP DHCP
URL http://192.0.2.1/demo.sh
If ZTP runs when the switch boots and not manually, you can run the systemctl -l status ztp.service then journalctl -l -u ztp.service to see if any failures occur:
cumulus@switch:~$ sudo systemctl -l status ztp.service
● ztp.service - Cumulus Linux ZTP
Loaded: loaded (/lib/systemd/system/ztp.service; enabled)
Active: failed (Result: exit-code) since Wed 2016-05-11 16:38:45 UTC; 1min 47s ago
Docs: man:ztp(8)
Process: 400 ExecStart=/usr/sbin/ztp -b (code=exited, status=1/FAILURE)
Main PID: 400 (code=exited, status=1/FAILURE)
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP USB: Device not found
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Looking for ZTP Script provided by DHCP
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: Attempting to provision via ZTP DHCP from http://192.0.2.1/demo.sh
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: URL response code 200
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Found Marker CUMULUS-AUTOPROVISIONING
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Executing http://192.0.2.1/demo.sh
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Payload returned code 1
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: Script returned failure
May 11 16:38:45 dell-s6010-01 systemd[1]: ztp.service: main process exited, code=exited, status=1/FAILURE
May 11 16:38:45 dell-s6010-01 systemd[1]: Unit ztp.service entered failed state.
cumulus@switch:~$
cumulus@switch:~$ sudo journalctl -l -u ztp.service --no-pager
-- Logs begin at Wed 2016-05-11 16:37:42 UTC, end at Wed 2016-05-11 16:40:39 UTC. --
May 11 16:37:45 cumulus ztp[400]: ztp [400]: /var/lib/cumulus/ztp: Sate Directory does not exist. Creating it...
May 11 16:37:45 cumulus ztp[400]: ztp [400]: /var/run/ztp.lock: Lock File does not exist. Creating it...
May 11 16:37:45 cumulus ztp[400]: ztp [400]: /var/lib/cumulus/ztp/ztp_state.log: State File does not exist. Creating it...
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Looking for ZTP local Script
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell_s6010_s1220-rUNKNOWN
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell_s6010_s1220
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP USB: Looking for unmounted USB devices
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP USB: Parsing partitions
May 11 16:37:45 cumulus ztp[400]: ztp [400]: ZTP USB: Device not found
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Looking for ZTP Script provided by DHCP
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: Attempting to provision via ZTP DHCP from http://192.0.2.1/demo.sh
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: URL response code 200
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Found Marker CUMULUS-AUTOPROVISIONING
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Executing http://192.0.2.1/demo.sh
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: ZTP DHCP: Payload returned code 1
May 11 16:38:45 dell-s6010-01 ztp[400]: ztp [400]: Script returned failure
May 11 16:38:45 dell-s6010-01 systemd[1]: ztp.service: main process exited, code=exited, status=1/FAILURE
May 11 16:38:45 dell-s6010-01 systemd[1]: Unit ztp.service entered failed state.
Instead of running journalctl, you can see the log history by running:
cumulus@switch:~$ cat /var/log/syslog | grep ztp
2016-05-11T16:37:45.132583+00:00 cumulus ztp [400]: /var/lib/cumulus/ztp: State Directory does not exist. Creating it...
2016-05-11T16:37:45.134081+00:00 cumulus ztp [400]: /var/run/ztp.lock: Lock File does not exist. Creating it...
2016-05-11T16:37:45.135360+00:00 cumulus ztp [400]: /var/lib/cumulus/ztp/ztp_state.log: State File does not exist. Creating it...
2016-05-11T16:37:45.185598+00:00 cumulus ztp [400]: ZTP LOCAL: Looking for ZTP local Script
2016-05-11T16:37:45.485084+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell_s6010_s1220-rUNKNOWN
2016-05-11T16:37:45.486394+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell_s6010_s1220
2016-05-11T16:37:45.488385+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64-dell
2016-05-11T16:37:45.489665+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp-x86_64
2016-05-11T16:37:45.490854+00:00 cumulus ztp [400]: ZTP LOCAL: Waterfall search for /var/lib/cumulus/ztp/cumulus-ztp
2016-05-11T16:37:45.492296+00:00 cumulus ztp [400]: ZTP USB: Looking for unmounted USB devices
2016-05-11T16:37:45.493525+00:00 cumulus ztp [400]: ZTP USB: Parsing partitions
2016-05-11T16:37:45.636422+00:00 cumulus ztp [400]: ZTP USB: Device not found
2016-05-11T16:38:43.372857+00:00 cumulus ztp [1805]: Found ZTP DHCP Request
2016-05-11T16:38:45.696562+00:00 cumulus ztp [400]: ZTP DHCP: Looking for ZTP Script provided by DHCP
2016-05-11T16:38:45.698598+00:00 cumulus ztp [400]: Attempting to provision via ZTP DHCP from http://192.0.2.1/demo.sh
2016-05-11T16:38:45.816275+00:00 cumulus ztp [400]: ZTP DHCP: URL response code 200
2016-05-11T16:38:45.817446+00:00 cumulus ztp [400]: ZTP DHCP: Found Marker CUMULUS-AUTOPROVISIONING
2016-05-11T16:38:45.818402+00:00 cumulus ztp [400]: ZTP DHCP: Executing http://192.0.2.1/demo.sh
2016-05-11T16:38:45.834240+00:00 cumulus ztp [400]: ZTP DHCP: Payload returned code 1
2016-05-11T16:38:45.835488+00:00 cumulus ztp [400]: Script returned failure
2016-05-11T16:38:45.876334+00:00 cumulus systemd[1]: ztp.service: main process exited, code=exited, status=1/FAILURE
2016-05-11T16:38:45.879410+00:00 cumulus systemd[1]: Unit ztp.service entered failed state.
If you see that the issue is a script failure, you can modify the script and then run ZTP manually using ztp -v -r <URL/path to that script>, as above.
cumulus@switch:~$ sudo ztp -v -r http://192.0.2.1/demo.sh
Attempting to provision via ZTP Manual from http://192.0.2.1/demo.sh
Broadcast message from root@dell-s6010-01 (ttyS0) (Tue May 10 22:44:17 2019):
ZTP: Attempting to provision via ZTP Manual from http://192.0.2.1/demo.sh
ZTP Manual: URL response code 200
ZTP Manual: Found Marker CUMULUS-AUTOPROVISIONING
ZTP Manual: Executing http://192.0.2.1/demo.sh
error: ZTP Manual: Payload returned code 1
error: Script returned failure
cumulus@switch:~$ sudo ztp -s
State enabled
Version 1.0
Result Script Failure
Date Mon 20 May 2019 09:31:27 PM UTC
Method ZTP Manual
URL http://192.0.2.1/demo.sh
Use the following command to check syslog for information about ZTP:
Errors in syslog for ZTP like those shown above often occur if you create or edit the script on a Windows machine. Check to make sure that the \r\n characters are not present in the end-of-line encodings.
Use the cat -v ztp.sh command to view the contents of the script and search for any hidden characters.
root@oob-mgmt-server:/var/www/html# cat -v ./ztp_oob_windows.sh
#!/bin/bash^M
^M
###################^M
# ZTP Script^M
###################^M
^M
/usr/cumulus/bin/cl-license -i http://192.168.0.254/license.txt^M
^M
# Clean method of performing a Reboot^M
nohup bash -c 'sleep 2; shutdown now -r "Rebooting to Complete ZTP"' &^M
^M
exit 0^M
^M
# The line below is required to be a valid ZTP script^M
#CUMULUS-AUTOPROVISIONING^M
root@oob-mgmt-server:/var/www/html#
The ^M characters in the output of your ZTP script, as shown above, indicate the presence of Windows end-of-line encodings that you need to remove.
Use the translate (tr) command on any Linux system to remove the '\r' characters from the file.
root@oob-mgmt-server:/var/www/html# tr -d '\r' < ztp_oob_windows.sh > ztp_oob_unix.sh
root@oob-mgmt-server:/var/www/html# cat -v ./ztp_oob_unix.sh
#!/bin/bash
###################
# ZTP Script
###################
/usr/cumulus/bin/cl-license -i http://192.168.0.254/license.txt
# Clean method of performing a Reboot
nohup bash -c 'sleep 2; shutdown now -r "Rebooting to Complete ZTP"' &
exit 0
# The line below is required to be a valid ZTP script
#CUMULUS-AUTOPROVISIONING
root@oob-mgmt-server:/var/www/html#
Manually Use the ztp Command
To enable ZTP, use the -e option:
cumulus@switch:~$ sudo ztp -e
When you enable ZTP, it tries to run the next time the switch boots. However, if ZTP already ran on a previous boot up or if there is a manual configuration, ZTP exits without trying to look for a script.
ZTP checks for these manual configurations when the switch boots:
Password changes
Users and groups changes
Packages changes
Interfaces changes
When the switch boots for the first time, ZTP records the state of important files that can update after you configure the switch. After a reboot, ZTP compares the recorded state to the current state of these files. If they do not match, ZTP considers the switch as already provisioned and exits. ZTP only deletes these files after a reset.
To reset ZTP to its original state, use the -R option. This removes the ztp directory and ZTP runs the next time the switch reboots.
cumulus@switch:~$ sudo ztp -R
To disable ZTP, use the -d option:
cumulus@switch:~$ sudo ztp -d
To force provisioning to occur and ignore the status listed in the configuration file, use the -r option:
cumulus@switch:~$ sudo ztp -r cumulus-ztp.sh
To see the current ZTP state, use the -s option:
cumulus@switch:~$ sudo ztp -s
ZTP INFO:
State disabled
Version 1.0
Result success
Date Mon May 20 21:51:04 2019 UTC
Method Switch manually configured
URL None
Considerations
While you are writing a provisioning script, you sometimes need to reboot the switch.
You can use the Cumulus Linux onie-select -i command to reprovision the switch and install a network operating system again using ONIE.
System Configuration
This section describes how to configure the following system settings:
NVUE is an object-oriented, schema driven model of a complete Cumulus Linux system (hardware and software) providing a robust API that allows for multiple interfaces to both view (show) and configure (set and unset) any element within a system running the NVUE software.
For a description of the NVUE object model, go to NVUE Object Model.
For an overview of the NVUE CLI commands, go to NVUE CLI.
For information on how to access and use the NVUE API, go to NVUE API.
For information on how to use NVUE snippets, go to NVUE Snippets.
NVUE Object Model
The NVUE object model definition uses the OpenAPI specification (OAS). Similar to YANG (RFC 6020 and RFC 7950), OAS is a data definition, manipulation, and modeling language (DML) that lets you build model-driven interfaces for both humans and machines. Although the computer networking and telecommunications industry commonly uses YANG (standardized by IETF) as a DML, the adoption of OpenAPI is broader, spanning cloud to compute to storage to IoT and even social media. The OpenAPI Initiative (OAI) consortium leads OpenAPI standardization, a chartered project under the Linux Foundation.
The OAS schema forms the management plane model with which you configure, monitor, and manage the Cumulus Linux switch. The v3.0.2 version of OAS defines the NVUE data model.
Like other systems that use OpenAPI, the NVUE OAS schema defines the endpoints (paths) exposed as RESTful APIs. With these REST APIs, you can perform various create, retrieve, update, delete, and eXecute (CRUDX) operations. The OAS schema also describes the API inputs and outputs (data models).
You can use the NVUE object model in these two ways:
Through the NVUE REST API, where you run the GET, PATCH, DELETE, and other REST APIs on the NVUE object model endpoints to configure, monitor, and manage the switch. Because of the large user community and maturity of OAS, you can use several popular tools and libraries to create client-side bindings to use the NVUE REST API.
Through the NVUE CLI, where you configure, monitor and manage the Cumulus Linux network elements. The CLI commands translate to their equivalent REST APIs, which Cumulus Linux then runs on the NVUE object model.
The CLI and the REST API are equivalent in functionality; you can run all management operations from the REST API or the CLI. The NVUE object model drives both the REST API and the CLI management operations. All operations are consistent; for example, the CLI nv show commands reflect any PATCH operation (create) you run through the REST API.
NVUE follows a declarative model, removing context-specific commands and settings. It is structured as a big tree that represents the entire state of a Cumulus Linux instance. At the base of the tree are high level branches representing objects, such as router and interface. Under each of these branches are further branches. As you navigate through the tree, you gain a more specific context. At the leaves of the tree are actual attributes, represented as key-value pairs. The path through the tree is similar to a filesystem path.
Cumulus Linux installs NVUE by default and enables the NVUE service nvued.
NVUE CLI
The NVUE CLI has a flat structure instead of a modal structure. Therefore, you can run all commands from the primary prompt instead of only in a specific mode.
You can choose to configure Cumulus Linux either with NVUE commands or Linux commands (with vtysh or by manually editing configuration files). Do not run both NVUE configuration commands (such as nv set, nv unset, nv action, and nv config) and Linux commands to configure the switch. NVUE commands replace the configuration in files such as /etc/network/interfaces and /etc/frr/frr.conf, and remove any configuration you add manually or with automation tools like Ansible, Chef, or Puppet.
If you choose to configure Cumulus Linux with NVUE, you can configure features that do not yet support the NVUE object model by creating snippets. See NVUE Snippets.
Command Syntax
NVUE commands all begin with nv and fall into one of three syntax categories:
Configuration (nv set and nv unset)
Monitoring (nv show)
Configuration management (nv config)
Action commands (nv action)
Command Completion
As you enter commands, you can get help with the valid keywords or options using the tab key. For example, using tab completion with nv set displays the possible options for the command and returns you to the command prompt to complete the command.
cumulus@switch:~$ nv set <<press tab>>
acl evpn mlag platform router system
bridge interface nve qos service vrf
cumulus@switch:~$ nv set
Command Question Mark
You can type a question mark (?) after a command to display required information quickly and concisely. When you type ?, NVUE specifies the value type, range, and options with a brief description of each; for example:
cumulus@switch:~$ nv set interface swp1 link state ?
[Enter]
down The interface is not ready
up The interface is ready
cumulus@switch:~$ nv set interface swp1 link mtu ?
<arg> (integer:552 - 9216)
cumulus@switch:~$ nv set interface swp1 link speed ?
<arg> (string | enum:10M, 100M, 1G, 10G, 25G, 40G, 50G, 100G,
200G, 400G, 800G, auto)
NVUE also indicates if you need to provide specific values for the command:
NVUE supports command abbreviation, where you can type a certain number of characters instead of a whole command to speed up CLI interaction. For example, instead of typing nv show interface, you can type nv sh int.
If the command you type is ambiguous, NVUE shows the reason for the ambiguity so that you can correct the shortcut. For example:
cumulus@switch:~$ nv s i
Ambiguous Command:
set interface
show interface
Command Help
As you enter commands, you can get help with command syntax by entering -h or --help at various points within a command entry. For example, to examine the options available for nv set interface, enter nv set interface -h or nv set interface --help.
cumulus@switch:~$ nv set interface -h
usage:
nv [options] set interface <interface-id>
Description:
interface Update all interfaces. Provide single interface or multiple interfaces using ranging (e.g. swp1-2,5-6 -> swp1,swp2,swp5,swp6).
Identifiers:
<interface-id> Interface (interface-name)
General Options:
-h, --help Show help.
Command List
You can list all the NVUE commands by running nv list-commands. See List All NVUE Commands below.
Command History
At the command prompt, press the Up Arrow and Down Arrow keys to move back and forth through the list of commands you entered previously. When you find the command you want to use, you can run the command by pressing Enter. You can also modify the command before you run it.
Command Categories
The NVUE CLI has a flat structure; however, the commands are in three functional categories:
Configuration
Monitoring
Configuration Management
Action
Configuration Commands
The NVUE configuration commands modify switch configuration. You can set and unset configuration options.
The nv set and nv unset commands are in the following categories. Each command group includes subcommands. Use command completion (press the tab key) to list the subcommands.
Command Group
Description
nv set acl nv unset acl
Configures ACLs.
nv set bridge nv unset bridge
Configures a bridge domain. This is where you configure bridge attributes, such as the bridge type (VLAN-aware), the STP state and priority, and VLANs.
nv set evpn nv unset evpn
Configures EVPN. This is where you enable and disable the EVPN control plane, and set EVPN route advertise, multihoming, and duplicate address detection options.
nv set interface <interface-id> nv unset interface <interface-id>
Configures the switch interfaces. Use this command to configure bond and bridge interfaces, interface IP addresses and descriptions, VLAN IDs, and links (MTU, FEC, speed, duplex, and so on).
nv set mlag nv unset mlag
Configures MLAG. This is where you configure the backup IP address or interface, MLAG system MAC address, peer IP address, MLAG priority, and the delay before bonds come up.
nv set nve nv unset nve
Configures network virtualization (VXLAN) settings. This is where you configure the UDP port for VXLAN frames, control dynamic MAC learning over VXLAN tunnels, enable and disable ARP and ND suppression, and configure how Cumulus Linux handles BUM traffic in the overlay.
nv set platform nv unset platform
Configures Pulse per Second; the simplest form of synchronization for the physical hardware clock.
nv set qos nv unset qos
Configures QoS RoCE.
nv set router nv unset router
Configures router policies (prefix list rules and route maps), sets global BGP options (enable and disable, ASN and router ID, BGP graceful restart and shutdown), global OSPF options (enable and disable, router ID, and OSPF timers) PIM, IGMP, PBR, VRR, and VRRP.
nv set service nv unset service
Configures DHCP relays and servers, NTP, PTP, LLDP, SNMP servers, DNS, and syslog.
nv set system nv unset system
Configures system settings, such as the hostname of the switch, pre and post login messages, reboot options (warm, cold, fast), the time zone and global system settings, such as the anycast ID, the system MAC address, and the anycast MAC address. This is also where you configure SPAN and ERSPAN sessions and set how configuration apply operations work (which files to ignore and which files to overwrite; see Configure NVUE to Ignore Linux Files).
nv set vrf <vrf-id> nv unset vrf <vrf-id>
Configures VRFs. This is where you configure VRF-level configuration for PTP, BGP, OSPF, and EVPN.
Monitoring Commands
The NVUE monitoring commands show various parts of the network configuration. For example, you can show the complete network configuration or only interface configuration. The monitoring commands are in the following categories. Each command group includes subcommands. Use command completion (press the tab key) to list the subcommands.
Command Group
Description
nv show acl
Shows ACL configuration.
nv show action
Shows information about the action commands that reset counters and remove conflicts.
nv show bridge
Shows bridge domain configuration.
nv show evpn
Shows EVPN configuration.
nv show interface
Shows interface configuration and counters.
nv show mlag
Shows MLAG configuration.
nv show nve
Shows network virtualization configuration, such as VXLAN-specfic MLAG configuration and VXLAN flooding.
nv show platform
Shows platform configuration, such as hardware and software components.
nv show qos
Shows QoS RoCE configuration.
nv show router
Shows router configuration, such as router policies, global BGP and OSPF configuration, PBR, PIM, IGMP, VRR, and VRRP configuration.
nv show service
Shows DHCP relays and server, NTP, PTP, LLDP, and syslog configuration.
nv show system
Shows global system settings, such as the reserved routing table range for PBR and the reserved VLAN range for layer 3 VNIs. You can also see system login messages and switch reboot history.
nv show vrf
Shows VRF configuration.
The following example shows the nv show router commands after pressing the tab key, then shows the output of the nv show router bgp command.
cumulus@leaf01:mgmt:~$ nv show router <<tab>>
adaptive-routing igmp ospf pim ptm vrrp
bgp nexthop pbr policy vrr
cumulus@leaf01:mgmt:~$ nv show router bgp
operational applied pending
------------------------------ ----------- ------- ----------- ----------------------------------------------------------------------
applied pending
------------------------------ ----------- -----------
enable on on
autonomous-system 65101 65101
router-id 10.10.10.1 10.10.10.1
policy-update-timer 5 5
graceful-shutdown off off
wait-for-install off off
graceful-restart
mode helper-only helper-only
restart-time 120 120
path-selection-deferral-time 360 360
stale-routes-time 360 360
convergence-wait
time 0 0
establish-wait-time 0 0
queue-limit
input 10000 10000
output 10000 10000
If there are no pending or applied configuration changes, the nv show command only shows the running configuration (under operational).
Additional options are available for certain nv show commands. For example, you can choose the configuration you want to show (pending, applied, startup, or operational). You can also turn on colored output, and paginate specific output.
Option
Description
--applied
Shows configuration applied with the nv config apply command. For example, nv show --applied.
--brief-help
Shows help about the nv show command. For example, nv show interface swp1 --brief-help
--color
Turns colored output on or off. For example, nv show interface swp1 --color on
--filter
Filters show command output on column data. For example, the nv show interface --filter mtu=1500 shows only the interfaces with MTU set to 1500.To filter on multiple column outputs, enclose the entire filter in double quotes; for example, nv show interface --filter "type=bridge&mtu=9216" shows data for bridges with MTU 9216.You can use wildcards; for example, nv show interface swp1 --filter "ip.address=1*" shows all IP addresses that start with 1 for swp1.You can filter on all revisions (operational, applied, and pending); for example, nv show interface --filter "ip.address=1*" --rev=applied shows all IP addresses that start with 1 for swp1 in the applied revision.
--hostname
Shows system configuration for the switch with the specified hostname. For example, nv show --hostname leaf01.
--operational
Shows the running configuration (the actual system state). For example, nv show interface swp1 --operational shows the running configuration for swp1. The running and applied configuration should be the same. If different, inspect the logs.
--output
Shows command output in table (auto), json, or yaml format. For example: nv show interface bond1 --output auto nv show interface bond1 --output json nv show interface bond1 --output yaml
--paginate
Paginates the output. For example, nv show interface bond1 --paginate on.
--pending
Shows the last applied configuration and any pending set or unset configuration that you have not yet applied. For example, nv show interface bond1 --pending.
--rev <revision>
Shows a detached pending configuration. See the nv config detach configuration management command below. For example, nv show --rev 1. You can also show only applied or only operational information in the nv show output. For example, to show only the applied settings for swp1 configuration, run the nv show interface swp1 --rev=applied command. To show only the operational settings for swp1 configuration, run the nv show interface swp1 --rev=operational command.
--startup
Shows configuration saved with the nv config save command. This is the configuration after the switch boots. For example: nv show interface --startup.
--tab
Show information in tab format. For example, nv show interface swp1 --tab.
--view
Shows different views. A view is a subset of information provided by certain nv show commands. To see the views available for an nv show command, run the command with --view and press TAB.
The following example shows pending BGP graceful restart configuration:
The following example shows the views available for the nv show interface command:
cumulus@switch:~$ nv show interface --view <<TAB>>
acl-statistics detail lldp mlag-cc port-security synce-counters
brief dot1x-counters lldp-detail neighbor qos-profile
counters dot1x-summary mac pluggables small
Monitoring Commands and FRR Daemons
If you run an NVUE show command but the corresponding FRR routing daemons are not running on the switch, you see an error message; for example:
If OSPF is not running when you run nv show vrf <vrf-id> ospf commands, NVUE returns Error: The requested item does not exist because the OSPF deamon is not running in FRR.
If PIM and IGMP are not running when you run the nv show interface <interface> ip igmp -o json command, NVUE returns Error: The requested item does not exist because the PIM daemon is not running in FRR.
If PIM is running but IGMP is not running when you the nv show interface <interface> ip igmp group -o json command, NVUE does not return an error message but shows an empty { } response.
Net Show commands
In addition to the nv show commands, Cumulus Linux continues to provide a subset of the NCLU net show commands. Use these commands to get additional views of various parts of your network configuration.
cumulus@leaf01:mgmt:~$ net show <<TAB>>
bfd : Bidirectional forwarding detection
bgp : Border Gateway Protocol
bridge : a layer2 bridge
clag : Multi-Chassis Link Aggregation
commit : apply the commit buffer to the system
configuration : settings, configuration state, etc
counters : net show counters
debugs : Debugs
dhcp-snoop : DHCP snooping for IPv4
dhcp-snoop6 : DHCP snooping for IPv6
dot1x : Configure, Enable, Delete or Show IEEE 802.1X EAPOL
evpn : Ethernet VPN
hostname : local hostname
igmp : Internet Group Management Protocol
interface : An interface, such as swp1, swp2, etc.
ip : Internet Protocol version 4/6
ipv6 : Internet Protocol version 6
lldp : Link Layer Discovery Protocol
mpls : Multiprotocol Label Switching
mroute : Static unicast routes in MRIB for multicast RPF lookup
msdp : Multicast Source Discovery Protocol
neighbor : A BGP, OSPF, PIM, etc neighbor
ospf : Open Shortest Path First (OSPFv2)
ospf6 : Open Shortest Path First (OSPFv3)
package : A Cumulus Linux package name
pbr : Policy Based Routing
pim : Protocol Independent Multicast
port-mirror : port-mirror
port-security : Port security
ptp : Precision Time Protocol
roce : Enable RoCE on all interfaces, default mode is lossless
rollback : revert to a previous configuration state
route : EVPN route information
route-map : Route-map
snmp-server : Configure the SNMP server
system : System
time : Time
version : Version number
vrf : Virtual routing and forwarding
vrrp : Virtual Router Redundancy Protocol
Configuration Management Commands
The NVUE configuration management commands manage and apply configurations.
Command
Description
nv config apply
Applies the pending configuration (nv config apply) or a specific revision (nv config apply 2) to become the applied configuration. To see the list of revisions you can apply, run nv config apply <<Tab>>. You can also use these prompt options:
--y or --assume-yes to automatically reply yes to all prompts.
--assume-no to automatically reply no to all prompts.
Cumulus Linux applies but does not save the configuration; the configuration does not persist after a reboot.
You can also use these apply options: --confirm applies the configuration change but you must confirm the applied configuration. If you do not confirm within ten minutes, the configuration rolls back automatically. You can change the default time with the apply --confirm <time> command. For example, apply --confirm 60 requires you to confirm within one hour. --confirm-status shows the amount of time left before the automatic rollback.To save the pending configuration to the startup configuration automatically when you run nv config apply so that you do not have to run the nv config save command, enable auto save.
nv config detach
Detaches the configuration from the current pending configuration and uses an integer to identify it; for example, 4. To list all the current detached pending configurations, run nv config diff <<press tab>.
nv config diff <revision> <revision>
Shows differences between configurations, such as the pending configuration and the applied configuration, or the detached configuration and the pending configuration.
nv config find <string>
Finds a portion of the applied configuration according to the search string you provide. For example to find swp1 in the applied configuration, run nv config find swp1.
nv config history
Enables you to keep track of the configuration changes on the switch and shows a table with the configuration revision ID, the date and time of the change, the user account that made the change, and the type of change (such as CLI or REST API). The nv config history <revision> command shows the apply history for a specific revision.
nv config patch <nvue-file>
Updates the pending configuration with the specified YAML configuration file.
nv config replace <nvue-file>
Replaces the pending configuration with the specified YAML configuration file.
nv config revision
Shows all the configuration revisions on the switch.
nv config save
Overwrites the startup configuration with the applied configuration by writing to the /etc/nvue.d/startup.yaml file. The configuration persists after a reboot.
nv config show
Shows the currently applied configuration in yaml format. This command also shows NVUE version information.
nv config show -o commands
Shows the currently applied configuration commands.
nv config diff -o commands
Shows differences between two configuration revisions.
You can use the NVUE configuration management commands to back up and restore configuration when you upgrade Cumulus Linux on the switch. Refer to Upgrading Cumulus Linux.
Action Commands
The NVUE action commands clear counters, and provide system reboot and TACACS user disconnect options.
Deauthenticates the 802.1X supplicant on the specified interface. If you do not want to notify the supplicant that they are being deauthenticated, you can add the silent option; for example, nv action deauthenticate interface swp1 dot1x authorized-sessions 00:55:00:00:00:09 silent.
nv action delete system security
Provides commands to delete CA and entity certificates.
nv action disable system maintenance mode nv action disable system maintenance ports
Disables system maintenance mode Brings up the ports.
nv action disconnect system aaa user
Provides commands to disconnect users logged into the switch.
nv action enable system maintenance mode nv action enable system maintenance ports
Enables system maintenance mode. Brings all the ports down for maintenance.
nv action import system security ca-certificate nv action import system security certificate
Provides commands to import CA and entity certificates.
nv action reboot system
Reboots the switch in the configured restart mode (fast, cold, or warm). You must specify the no-confirm option with this command.
nv action rename
Renames the system configuration.
nv action upload
Uploads system configuration to the switch.
List All NVUE Commands
To show the full list of NVUE commands, run nv list-commands. For example:
cumulus@switch:~$ nv list-commands
nv show platform
nv show platform hardware
nv show platform hardware component
nv show platform hardware component <component-id>
nv show platform software
nv show platform software installed
nv show platform software installed <installed-id>
nv show platform capabilities
nv show platform environment
...
You can show the list of commands for a command grouping. For example, to show the list of interface commands:
cumulus@switch:~$ nv list-commands interface
nv show interface
nv show interface <interface-id>
nv show interface <interface-id> ip
nv show interface <interface-id> ip address
nv show interface <interface-id> ip address <ip-prefix-id>
nv show interface <interface-id> ip gateway
nv show interface <interface-id> ip gateway <ip-address-id>
...
Use the tab key to get help for the command lists you want to see. For example, to show the list of command options available for swp1, run the nv list-commands interface swp1 command and press the tab key:
cumulus@switch:~$ nv list-commands interface swp1 <<press tab>>
acl counters ip neighbor ptp storm-control tunnel
bond dot1x link pluggable qos synce
bridge evpn lldp port-security router telemetry
To view the NVUE command reference for Cumulus Linux, which describes all the NVUE CLI commands and provides examples, go to the NVUE Command Reference.
NVUE Configuration File
When you save network configuration, NVUE writes the configuration to the /etc/nvue.d/startup.yaml file.
You can edit or replace the contents of the /etc/nvue.d/startup.yaml file. NVUE applies the configuration in the /etc/nvue.d/startup.yaml file during system boot only if the nvue-startup.service is running. If this service is not running, the switch reboots with the same configuration that is running before the reboot.
When you apply a configuration with nv config apply, NVUE also writes to underlying Linux files such as /etc/network/interfaces and /etc/frr/frr.conf. You can view these configuration files; however, do not manually edit them while using NVUE. If you need to configure certain network settings manually or use automation such as Ansible to configure the switch, see Configure NVUE to Ignore Linux Files below.
Configuration Files that NVUE Manages
NVUE manages the following configuration files:
File
Description
/etc/network/interfaces
Configures the network interfaces available on your system.
/etc/frr/frr.conf
Configures FRRouting.
/etc/cumulus/switchd.conf
Configures switchd options.
/etc/cumulus/switchd.d/ptp.conf
Configures PTP timestamping.
/etc/frr/daemons
Configures FRRouting services.
/etc/hosts
Configures the hostname of the switch.
/etc/default/isc-dhcp-relay-default
Configures DHCP relay options.
/etc/dhcp/dhcpd.conf
Configures DHCP server options.
/etc/hostname
Configures the hostname of the switch.
/etc/cumulus/datapath/qos/qos_features.conf
Configures QoS settings, such as traffic marking, shaping and flow control.
/etc/mlx/datapath/qos/qos_infra.conf
Configures QoS platform specific configurations, such as buffer allocations and Alpha values.
/etc/cumulus/switchd.d/qos.conf
Configures QoS settings.
/etc/cumulus/ports.conf
Configures port breakouts.
/etc/ntp.conf
Configures NTP settings.
/etc/ptp4l.conf
Configures PTP settings.
/etc/snmp/snmpd.conf
Configures SNMP settings.
Search for a Specific Configuration
To search for a specific portion of the NVUE configuration, run the nv config find <search string> command. The search shows all items above and below the search string. For example, to search the entire NVUE object model configuration for any mention of bond1:
cumulus@switch:~$ nv config find bond1
- set:
interface:
bond1:
bond:
lacp-bypass: on
member:
swp1: {}
mlag:
enable: on
id: 1
bridge:
domain:
br_default:
access: 10
link:
mtu: 9000
type: bond
Configure NVUE to Ignore Linux Files
You can configure NVUE to ignore certain underlying Linux files when applying configuration changes. For example, if you push certain configuration to the switch using Ansible and Jinja2 file templates or you want to use custom configuration for a particular service such as PTP, you can ensure that NVUE never writes to those configuration files.
The following example configures NVUE to ignore the Linux /etc/ptp4l.conf file when applying configuration changes and saves the configuration so it persists after a reboot.
cumulus@switch:~$ nv set system config apply ignore /etc/ptp4l.conf
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv config save
Configure Auto Save
By default, when you run the nv config apply command to apply a configuration setting, NVUE applies the pending configuration to become the applied configuration but does not update the startup configuration file (/etc/nvue.d/startup.yaml). To save the applied configuration to the startup configuration so that the changes persist after the reboot, you must run the nv config save command. The auto save option lets you save the pending configuration to the startup configuration automatically when you run nv config apply so that you do not have to run the nv config save command.
To enable auto save:
cumulus@switch:~$ nv set system config auto-save enable on
cumulus@switch:~$ nv config apply
To disable auto save, run the nv set system config auto-save enable off command.
Add Configuration Apply Messages
When you run the nv config apply command, you can add a message that describes the configuration updates you make. You can see the message when you run the nv config history command.
To add a configuration apply message, run the nv config apply -m <message> command. If the message includes more than one word, enclose the message in quotes.
cumulus@switch:~$ nv config apply -m "this is my message"
Reset NVUE Configuration to Default Values
To reset the NVUE configuration on the switch back to the default values, run the following command:
cumulus@switch:~$ nv config apply empty
Example Configuration Commands
This section provides examples of how to configure a Cumulus Linux switch using NVUE commands.
Configure the System Hostname
The example below shows the NVUE commands required to change the hostname for the switch to leaf01:
cumulus@switch:~$ nv set system hostname leaf01
cumulus@switch:~$ nv config apply
Configure the System DNS Server
The example below shows the NVUE commands required to define the DNS server for the switch:
cumulus@switch:~$ nv set service dns mgmt server 192.168.200.1
cumulus@switch:~$ nv config apply
Configure an Interface
The example below shows the NVUE commands required to bring up swp1.
cumulus@switch:~$ nv set interface swp1
cumulus@switch:~$ nv config apply
Configure a Bond
The example below shows the NVUE commands required to configure the front panel port interfaces swp1 thru swp4 to be slaves in bond0.
cumulus@switch:~$ nv set interface bond0 bond member swp1-4
cumulus@switch:~$ nv config apply
Configure a Bridge
The example below shows the NVUE commands required to create a VLAN-aware bridge that contains two switch ports (swp1 and swp2) and includes 3 VLANs; tagged VLANs 10 and 20 and an untagged (native) VLAN of 1.
With NVUE, there is a default bridge called br_default, which has no ports assigned to it. The example below configures this default bridge.
cumulus@switch:~$ nv set interface swp1-2 bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default vlan 10,20
cumulus@switch:~$ nv set bridge domain br_default untagged 1
cumulus@switch:~$ nv config apply
Configure MLAG
The example below shows the NVUE commands required to configure MLAG on leaf01. The commands:
Place swp1 into bond1 and swp2 into bond2.
Configure the MLAG ID to 1 for bond1 and to 2 for bond2.
Add bond1 and bond2 to the default bridge (br_default).
Create the inter-chassis bond (swp49 and swp50) and the peer link (peerlink)
Set the peer link IP address to linklocal, the MLAG system MAC address to 44:38:39:BE:EF:AA, and the backup interface to 10.10.10.2.
cumulus@leaf01:~$ nv set interface bond1 bond member swp1
cumulus@leaf01:~$ nv set interface bond2 bond member swp2
cumulus@leaf01:~$ nv set interface bond1 bond mlag id 1
cumulus@leaf01:~$ nv set interface bond2 bond mlag id 2
cumulus@switch:~$ nv set interface bond1-2 bridge domain br_default
cumulus@leaf01:~$ nv set interface peerlink bond member swp49-50
cumulus@leaf01:~$ nv set mlag mac-address 44:38:39:BE:EF:AA
cumulus@leaf01:~$ nv set mlag backup 10.10.10.2
cumulus@leaf01:~$ nv set mlag peer-ip linklocal
cumulus@leaf01:~$ nv config apply
Configure BGP Unnumbered
The example below shows the NVUE commands required to configure BGP unnumbered on leaf01. The commands:
Assign the ASN for this BGP node to 65101.
Set the router ID to 10.10.10.1.
Distribute routing information to the peer on swp51.
Originate prefixes 10.10.10.1/32 from this BGP node.
cumulus@leaf01:~$ nv set router bgp autonomous-system 65101
cumulus@leaf01:~$ nv set router bgp router-id 10.10.10.1
cumulus@leaf01:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@leaf01:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32
cumulus@leaf01:~$ nv config apply
Example Monitoring Commands
This section provides monitoring command examples.
Show Installed Software
The following example command lists the software installed on the switch:
cumulus@switch:~$ nv show platform software
Installed Software
=====================
Installed software description package version
--------------------------- --------------------------- -------------------------- -----------------------------
acpi displays information on ACPI acpi 1.7-1.1
devices
acpi-support-base scripts for handling base acpi-support-base 0.142-8
ACPI events such as the
power button
acpid Advanced Configuration and acpid 1:2.0.31-1
Power Interface event daemon
adduser add and remove users and adduser 3.118
groups
apt commandline package manager apt 1.8.2.3
...
Show Interface Configuration
The following example command shows the running, applied, and pending (if available) swp1 interface configuration.
cumulus@leaf01:~$ nv show interface swp1
operational applied
------------------------ ----------------- ----------
type swp swp
[acl]
bridge
[domain] br_default br_default
evpn
multihoming
uplink off
ptp
enable off
router
adaptive-routing
enable off
ospf
enable off
ospf6
enable off
pbr
[map]
pim
...
Example Configuration Management Commands
This section provides examples of how to use the configuration management commands to apply, save, and detach configurations.
Apply and Save a Configuration
The following example command configures the front panel port interfaces swp1 through swp4 to be slaves in bond0. The configuration is only in a pending configuration state. The configuration is not applied. NVUE has not yet made any changes to the running configuration.
cumulus@switch:~$ nv set interface bond0 bond member swp1-4
To apply the pending configuration to the running configuration, run the nv config apply command. The configuration does not persist after a reboot.
cumulus@switch:~$ nv config apply
To save the applied configuration to the startup configuration, run the nv config save command. This command overwrites the startup configuration with the applied configuration by writing to the /etc/nvue.d/startup.yaml file. The configuration persists after a reboot.
cumulus@switch:~$ nv config save
Detach a Pending Configuration
The following example configures the IP address of the loopback interface, then detaches the configuration from the current pending configuration. Cumulus Linux saves the detached configuration to a file with a numerical value to distinguish it from other pending configurations.
cumulus@switch:~$ nv set interface lo ip address 10.10.10.1/32
cumulus@switch:~$ nv config detach
View Differences Between Configurations
To view differences between configurations, run the nv config diff command.
To view differences between two detached pending configurations, run the nv config diff «tab» command to list all the current detached pending configurations, then run the nv config diff command with the pending configurations you want to diff.
The following example replaces the pending configuration with the contents of the YAML configuration file called nv-02/13/2021.yaml located in the /deps directory:
The following example patches the pending configuration (runs the set or unset commands from the configuration in the nv-02/13/2021.yaml file located in the /deps directory):
A patch contains a single request to the NVUE service. Ordering of parameters within a patch is not guaranteed; NVUE does not support both unset and set commands for the same object in a single patch.
Date and Time
This section discusses how to set the time zone and the date and time on the switch software clock, configure NTP, PTP, PPS, and SyncE.
Setting the time zone, and the date and time on the software clock requires root privileges; use sudo.
Show the Current Time Zone, Date, and Time
To show the current time zone, date, and time on the switch:
cumulus@switch:~$ nv show system date-time
operational
------------------------- -----------------------------
local-time Wed 2023-11-22 11:22:54 EST
universal-time Wed 2023-11-22 16:22:54 UTC
rtc-time Wed 2023-11-22 16:22:54
time-zone America/New_York (EST, -0500)
system-clock-synchronized no
ntp-service inactive
rtc-in-local-tz no
unix-time 1700670174.4371066
cumulus@switch:~$ date
Wed 11 Oct 2023 12:18:33 PM UTC
To show the time zone only, run the date +%Z command:
cumulus@switch:~$ date +%Z
UTC
Set the Time Zone
You can use one of these methods to set the time zone on the switch:
Run NVUE commands.
Use the guided wizard.
Edit the /etc/timezone file.
Run the nv set system timezone <timezone> command. To see all the available time zones, run nv set system timezone and press the Tab key. The following example sets the time zone to US/Eastern:
cumulus@switch:~$ nv set system timezone US/Eastern
cumulus@switch:~$ nv config apply
In a terminal, run the following command:
cumulus@switch:~$ sudo dpkg-reconfigure tzdata
Follow the on screen menu options to select the geographic area and region.
The switch contains a battery backed hardware clock that maintains the time while the switch powers off and between reboots. When the switch is running, the Cumulus Linux operating system maintains its own software clock.
During boot up, the switch copies the time from the hardware clock to the operating system software clock. The software clock takes care of all the timekeeping. During system shutdown, the switch copies the software clock back to the battery backed hardware clock.
If you need to reconfigure the current time zone, refer to the instructions above.
To set the software clock according to the configured time zone:
Run the nv action change system date-time <clock-date> <clock-time> command. Specify <clock-date> in YYYY-MM-DD format and <clock-time> in HH:MM:SS format.
cumulus@switch:~$ nv action change system date-time 2023-12-04 2:33:30
System Date-time changed successfully
Local Time is now Mon 2023-12-04 02:33:30 UTC
Action succeeded
cumulus@switch:~$ sudo date -s "Tue Jan 26 00:37:13 2021"
You can write the current value of the software clock to the hardware clock using the hwclock command:
cumulus@switch:~$ sudo hwclock -w
See man hwclock(8) for more information.
NVUE Snippets
NVUE supports both traditional snippets and flexible snippets:
Use traditional snippets to add configuration to the /etc/network/interfaces, /etc/frr/frr.conf, /etc/frr/daemons, /etc/cumulus/switchd.conf, /etc/cumulus/datapath/traffic.conf or /etc/ssh/sshd_config files.
Use flexible snippets to manage any other text file on the system.
A snippet configures a single parameter associated with a specific configuration file.
You can only set or unset a snippet; you cannot modify, partially update, or change a snippet.
Setting the snippet value replaces any existing snippet value.
Cumulus Linux supports only one snippet for a configuration file.
Only certain configuration files support a snippet.
NVUE does not parse or validate the snippet content and does not validate the resulting file after you apply the snippet.
PATCH is only the method of applying snippets and does not refer to any snippet capabilities.
As NVUE supports more features and introduces new syntax, snippets and flexible snippets become invalid. Before you upgrade Cumulus Linux to a new release, review the What's New for new NVUE syntax and remove the snippet if NVUE introduces new syntax for the feature that the snippet configures.
Traditional Snippets
Use traditional snippets if you configure Cumulus Linux with NVUE commands, then want to configure a feature that does not yet support the NVUE object model. You create a snippet in yaml format, then add the configuration to the file with the nv config patch command.
The nv config patch command requires you to use the fully qualified path name to the snippet .yaml file; for example you cannot use ./ with the nv config patch command.
/etc/frr/frr.conf Snippets
Example 1: Top Level Configuration
NVUE does not support configuring BGP to peer across the default route. The following example configures BGP to peer across the default route from the default VRF:
Create a .yaml file with the following traditional snippet:
Run the nv config apply command to apply the configuration:
cumulus@switch:~$ nv config apply
Verify that the configuration exists at the end of the /etc/frr/frr.conf file:
cumulus@switch:~$ sudo cat /etc/frr/frr.conf
...
! end of router ospf block
!---- CUE snippets ----
ip nht resolve-via-default
Example 2: Nested Configuration
NVUE does not support configuring EVPN route targets using auto derived values from RFC 8365. The following example configures BGP to enable RFC 8365 derived router targets:
Create a .yaml file with the following traditional snippet:
The traditional snippets for FRR write content to the /etc/frr/frr.conf file. When you apply the configuration and snippet with the nv config apply command, the FRR service goes through and reads in the /etc/frr/frr.conf file.
Example 3: EVPN Multihoming FRR Debugging
NVUE does not support configuring FRR debugging for EVPN multihoming. The following example configures FRR debugging:
Create a .yaml file and add the following traditional snippet:
The traditional snippets for FRR write content to the /etc/frr/frr.conf file. When you apply the configuration and snippet with the nv config apply command, the FRR service goes through and reads in the /etc/frr/frr.conf file.
/etc/network/interfaces Snippets
MLAG Timers Example
NVUE supports configuring only one of the MLAG service timeouts (initDelay). The following example configures the MLAG peer timeout to 400 seconds:
Create a .yaml file and add the following traditional snippet:
NVUE does not support configuring traditional bridges. The following example configures a traditional bridge called br0 with the IP address 11.0.0.10/24. swp1, swp2 are members of the bridge.
Create a .yaml file and add the following traditional snippet:
Run the nv config apply command to apply the configuration:
cumulus@switch:~$ nv config apply
Verify that the configuration exists at the end of the /etc/network/interfaces file:
cumulus@switch:~$ sudo cat /etc/network/interfaces
...
auto br0
iface br0
address 11.0.0.10/24
bridge-ports swp1 swp2
bridge-vlan-aware no
VLAN-aware RSTP Timers Example
NVUE does not support configuring RSTP timers on VLAN-aware bridges. The following example configures non-default RSTP timers for the NVUE default bridge br_default:
Create a .yaml file and add the following traditional snippet:
To add Cumulus Linux SNMP agent configuration not yet available with NVUE commands, create an snmpd.conf snippet.
The following example creates a file called snmpd.conf_snippet.yaml, and sets the read only community string and the listening address to run in the mgmt VRF.
SNMP snippets do not take effect unless you first enable SNMP with the NVUE nv set service snmp-server enable on and nv set service snmp-server listening-address commands (or with the equivalent REST API methods).
Create a .yaml file and add the following traditional snippet:
To add SSH service configuration not yet available with NVUE commands, create an sshd_config snippet.
The following example creates a file called sshd_config_snippet.yaml to allow root login and enable X11 forwarding for all users except user anoncvs. The snippet also disables TCP forwarding for the anoncvs user and runs the cvs server command when anoncvs logs in.
Create a .yaml file and add the following traditional snippet:
cumulus@switch:~$ sudo nano sshd_config_snippet.yaml
- set:
system:
config:
snippet:
sshd_config: |
PermitRootLogin yes
X11Forwarding yes
Match User anoncvs
X11Forwarding no
AllowTcpForwarding no
ForceCommand cvs server
Run the following command to patch the configuration:
Run the nv config apply command to apply the configuration:
cumulus@switch:~$ nv config apply
Verify that the configuration exists at the end of the /etc/ssh/sshd_config file:
cumulus@switch:~$ sudo cat /etc/ssh/sshd_config
...
!---- NVUE snippets ----
PermitRootLogin yes
X11Forwarding yes
Match User anoncvs
X11Forwarding no
AllowTcpForwarding no
ForceCommand cvs server
Flexible Snippets
Flexible snippets are an extension of traditional snippets that let you manage any text file on the system.
You can create new files or modify existing files that NVUE does not manage.
You can add configuration to files that NVUE manages.
The account you use through the CLI or the REST API to configure and manage flexible snippets must be in the sudo group, which includes the NVUE system-admin role, or you must be the root user.
Files NVUE Manages
You can use flexible snippets to add configuration to the following files that NVUE manages:
Filename
Description
/etc/cumulus/csmgrd
Configuration file for csmgrctl commands.
/etc/default/isc-dhcp-relay-<VRF>
Configuration file for DHCP relay. Changes to this file require a dhcrelay@<VRF>.service restart.
/etc/resolv.conf
Configuration file for DNS resolution.
/etc/hosts
Configuration file for the hostname of the switch.
/etc/default/isc-dhcp-server-<VRF>
Configuration file for DHCP servers. Changes to this file require a dhcpd@<VRF>.service restart.
/etc/default/isc-dhcp-server6-<VRF>
Configuration file for DHCP servers for IPv6. Changes to this file require a dhcpd6@<VRF>.service restart
/etc/dhcp/dhcpd-<VRF>.conf
Configuration file for the dhcpd service. Changes to this file require a dhcpd@<VRF>.service restart
/etc/dhcp/dhcpd6-<VRF>.conf
Configuration file for the dhcpd service for IPv6. Changes to this file require a dhcpd6@<VRF>.service restart
/etc/ntp.conf
Configuration file for NTP servers. Changes to this file require an ntp service restart.
/etc/default/isc-dhcp-relay6-<VRF>
Configuration file for DHCP relay for IPv6. Changes to this file require a dhcrelay6@<VRF>.service restart.
/etc/snmp/snmpd.conf
Configuration file for SNMP. Changes to this file require an snmpd restart.
/etc/cumulus/datapath/traffic.conf
Configuration file for forwarding table profiles. Changes to this file require a switchd restart.
/etc/cumulus/switchd.conf
Configuration file for switchd. Changes to this file require a switchd restart.
Flexible snippets do not support:
Binary files.
Symbolic links.
More than 1MB of content.
More than one flexible snippet in the same destination file.
Use caution when creating flexible snippets:
If you configure flexible snippets incorrectly, they might impact switch functionality. For example, even though flexible snippet validation allows you to only add textual content, Cumulus Linux does not prevent you from creating a flexible snippet that adds to sensitive text files, such as /boot/grub.cfg and /etc/fstab or add corrupt contents. Such snippets might render the switch unusable or create a potential security vulnerability (the NVUE service (nvued) runs with superuser privileges).
Do not manually update configuration files to which you add flexible snippets.
Any sensitive data in plain text (such as passwords) appears in the NVUE-managed configuration files as plain text.
Create a Flexible Snippet
To create a flexible snippet:
Create a file in yaml format and add each flexible snippet you want to apply in the format shown below. NVUE appends the flexible snippet at the end of an existing file. If the file does not exist, NVUE creates the file, then adds the content.
cumulus@leaf01:mgmt:~$ sudo nano <filename>.yaml>
- set:
system:
config:
snippet:
<snippet-name>:
file: "<filename>"
permissions: "<umask-permissions>"
content: |
# This is my content
services:
<name>:
service: <service-name>
action: <action>
You can only set the umast permissions to a new file that you create. Adding the permissions: line is optional. The default umask persmissions are 644.
You can add a service with an action, such as start, restart, or stop. Adding the services: lines is optional; however, if you add the service: line, you must specify at least one service.
Run the following command to patch the configuration:
Run the nv config apply command to apply the configuration:
cumulus@switch:~$ nv config apply
Verify the patched configuration.
The nv config patch command requires you to use the fully qualified path name to the snippet .yaml file; for example you cannot use ./ with the nv config patch command.
Flexible Snippet Examples
The following example flexible snippet called crontab-flex-snippet appends the single line @daily /opt/utils/run-backup.sh to the existing /etc/crontab file, then restarts the cron service.
The following example flexible snippet called apt-flex-snippet creates a new file /etc/apt/sources.list.d/microsoft-prod.list with 0644 permissions and adds multi-line text:
cumulus@leaf01:mgmt:~$ sudo nano apt-flex-snippet.yaml
- set:
system:
config:
snippet:
apt-flexible-snippet:
file: "/etc/apt/sources.list.d/microsoft-prod.list"
content: |
# Adding Microsoft SQL Server Sources
deb [arch=amd64] https://packages.microsoft.com/debian/10/prod buster main
permissions: "0644"
The following flexible snippet called lldp_config_snipppet disables LLDP on swp1 and swp2 using the configure system interface pattern-blacklist command:
After you patch and apply the configuration above, the snippet creates a new file in the /etc/lldp.d directory, then restarts the lldpd service to stop LLDP transmitting and receiving on swp1 and swp2. Other interfaces continue to participate in LLDP.
If you try to apply a flexible snippet to a file that NVUE does not allow, you see an error message similar to the following:
cumulus@leaf01:mgmt:~$ nv config apply
Invalid config [rev_id: 8]
Flexible snippets are not allowed to be configured on the file '/etc/cumulus/ports.conf’.
Flexible snippets are not allowed to be configured on the file '/etc/cumulus/ports_width.conf’.
If you try to apply a flexible snippet to a file that supports traditional snippets, you see an error message similar to the following:
cumulus@leaf01:mgmt:~$ nv config apply
Invalid config [rev_id: 1]
Flexible snippet cannot be used to modify the file '/etc/ssh/sshd_config'. Traditional snippets (for e.g., 'sshd_config') are supported on this file. Consult NVIDIA NVUE documentation for further information on snippets.
You can also create a flexible snippet with the REST API. See NVUE API.
Remove a Snippet
To remove a traditional or flexible snippet, edit the snippet’s .yaml file to change set to unset, then patch and apply the configuration. Alternatively, you can use the REST API DELETE and PATCH methods.
The ntpd daemon running on the switch implements the NTP protocol. It synchronizes the system time with time servers in the /etc/ntp.conf file. The ntpd daemon starts at boot by default.
If you intend to run this service within a VRF, including the management VRF, follow these steps to configure the service.
Configure NTP Servers
The default NTP configuration includes the following servers, which are in the /etc/ntp.conf file:
server 0.cumulusnetworks.pool.ntp.org iburst
server 1.cumulusnetworks.pool.ntp.org iburst
server 2.cumulusnetworks.pool.ntp.org iburst
server 3.cumulusnetworks.pool.ntp.org iburst
To add the NTP servers you want to use, run the following commands. Include the iburst option to increase the sync speed.
The NVUE command requires a VRF. The following command adds the NTP servers in the default VRF.
cumulus@switch:~$ nv set service ntp default server 4.cumulusnetworks.pool.ntp.org iburst on
cumulus@switch:~$ nv config apply
Edit the /etc/ntp.conf file to add or update NTP server information:
cumulus@switch:~$ sudo nano /etc/ntp.conf
# pool.ntp.org maps to about 1000 low-stratum NTP servers. Your server will
# pick a different set every time it starts up. Please consider joining the
# pool: <http://www.pool.ntp.org/join.html>
server 0.cumulusnetworks.pool.ntp.org iburst
server 1.cumulusnetworks.pool.ntp.org iburst
server 2.cumulusnetworks.pool.ntp.org iburst
server 3.cumulusnetworks.pool.ntp.org iburst
server 4.cumulusnetworks.pool.ntp.org iburst
To set the initial date and time with NTP before starting the ntpd daemon, run the ntpd -q command. Be aware that ntpd -q can hang if the time servers are not reachable.
cumulus@switch:~$ nv show service ntp default server
cumulus@switch:~$ ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
+ec2-34-225-6-20 129.6.15.30 2 u 73 1024 377 70.414 -2.414 4.110
+lax1.m-d.net 132.163.96.1 2 u 69 1024 377 11.676 0.155 2.736
*69.195.159.158 199.102.46.72 2 u 133 1024 377 48.047 -0.457 1.856
-2.time.dbsinet. 198.60.22.240 2 u 1057 1024 377 63.973 2.182 2.692
The following example commands remove some of the default NTP servers:
cumulus@switch:~$ nv unset service ntp default server 0.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp default server 1.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp default server 2.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp default server 3.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv config apply
Edit the /etc/ntp.conf file to delete NTP servers.
cumulus@switch:~$ sudo nano /etc/ntp.conf
...
# pool.ntp.org maps to about 1000 low-stratum NTP servers. Your server will
# pick a different set every time it starts up. Please consider joining the
# pool: <http://www.pool.ntp.org/join.html>
server 4.cumulusnetworks.pool.ntp.org iburst
...
Specify the NTP Source Interface
By default, the source interface that NTP uses is eth0. The following example command configures the NTP source interface to be swp10.
cumulus@switch:~$ nv set service ntp default listen swp10
cumulus@switch:~$ nv config apply
Edit the /etc/ntp.conf file and modify the entry under the Specify interfaces comment.
You can use DHCP to specify your NTP servers. Ensure that the DHCP-generated configuration file /run/ntp.conf.dhcp exists. The /etc/dhcp/dhclient-exit-hooks.d/ntp script generates this file, which is a copy of the default /etc/ntp.conf file with a modified server list from the DHCP server. If this file does not exist and you plan on using DHCP in the future, you can copy your current /etc/ntp.conf file to the location of the DHCP file.
To use DHCP to specify your NTP servers, run the sudo -E systemctl edit ntp.service command and add the ExecStart= line:
The sudo -E systemctl edit ntp.service command always updates the base ntp.service even if you use ntp@mgmt.service. The ntp@mgmt.service is re-generated automatically.
To validate that your configuration, run these commands:
If the state is not Active, or the alternate configuration file does not appear in the ntp command line, it is likely that you made a configuration mistake. Correct the mistake and rerun the commands above to verify.
Configure NTP with Authorization Keys
For added security, you can configure NTP to use authorization keys.
Configure the NTP Server
Create a .keys file, such as /etc/ntp.keys. Specify a key identifier (a number between 1 and 65535), an encryption method (M for MD5), and the password. The following provides an example:
#
# PLEASE DO NOT USE THE DEFAULT VALUES HERE.
#
#65535 M akey
#1 M pass
1 M CumulusLinux!
In the /etc/ntp.conf file, add a pointer to the /etc/ntp.keys file you created above and specify the key identifier. For example:
Restart NTP with the sudo systemctl restart ntp command.
Configure the NTP Client
The NTP client is the Cumulus Linux switch.
Create the same .keys file you created on the NTP server (/etc/ntp.keys). For example:
cumulus@switch:~$ sudo nano /etc/ntp.keys
#
# DO NOT USE THE DEFAULT VALUES HERE.
#
#65535 M akey
#1 M pass
1 M CumulusLinux!
Edit the /etc/ntp.conf file to specify the server you want to use, the key identifier, and a pointer to the /etc/ntp.keys file you created in step 1. For example:
cumulus@switch:~$ sudo nano /etc/ntp.conf
...
# You do need to talk to an NTP server or two (or three).
#pool ntp.your-provider.example
# OR
#server ntp.your-provider.example
# pool.ntp.org maps to about 1000 low-stratum NTP servers. Your server will
# pick a different set every time it starts up. Please consider joining the
# pool: <http://www.pool.ntp.org/join.html>
#server 0.cumulusnetworks.pool.ntp.org iburst
#server 1.cumulusnetworks.pool.ntp.org iburst
#server 2.cumulusnetworks.pool.ntp.org iburst
#server 3.cumulusnetworks.pool.ntp.org iburst
server 10.50.23.121 key 1
#keys
keys /etc/ntp.keys
trustedkey 1
controlkey 1
requestkey 1
...
Restart NTP in the active VRF (default or management). For example:
Wait a few minutes, then run the ntpq -c as command to verify the configuration:
cumulus@switch:~$ ntpq -c as
ind assid status conf reach auth condition last_event cnt
===========================================================
1 40828 f014 yes yes ok reject reachable 1
After a successful authorization, you see the following command output:
cumulus@switch:~$ ntpq -c as
ind assid status conf reach auth condition last_event cnt
===========================================================
1 40828 f61a yes yes ok sys.peer sys_peer 1
Considerations
NTP in Cumulus Linux uses the /usr/share/zoneinfo/leap-seconds.list file, which expires periodically and results in generated log messages about the expiration. When the file expires, update it from https://www.ietf.org/timezones/data/leap-seconds.list or upgrade the tzdata package to the newest version.
When you upgrade to Cumulus Linux 5.6 or later, the switch overwrites any manual configuration you performed by editing files in Cumulus Linux 5.5 or earlier, such as configuring the listening address, port, TLS, or certificate.
In addition to the CLI, NVUE supports a REST API. Instead of accessing Cumulus Linux using SSH, you can interact with the switch using an HTTP client, such as cURL or a web browser.
The nvued service provides access to the NVUE REST API. Cumulus Linux exposes the HTTP endpoint internally, which makes the NVUE REST API accessible locally within the Cumulus Linux switch. The NVUE CLI also communicates with the nvued service using internal APIs. To provide external access to the NVUE REST API, Cumulus Linux uses an HTTP reverse proxy server, and supports HTTPS and TLS connections from external REST API clients.
The following illustration shows the NVUE REST API architecture and illustrates how Cumulus Linux forwards the requests internally.
Supported HTTP Methods
The NVUE REST API supports the following methods:
The GET method displays configuration and operational data, and is equivalent to the nv show commands.
The POST method creates and submits operations. You typically use this method for nv action commands and for the nv config command to create revisions.
The PATCH method replaces or unsets a configuration. You use this method for the nv set and nv config apply commands. You can either perform:
A targeted configuration patch to make a configuration change, where you run a specific NVUE REST API targeted at a particular OpenAPI end-point URI. Based on the NVUE schema definition, you need to direct the PATCH REST API request at a particular endpoint (for example, /nvue_v1/vrf/<vrf-id>/router/bgp) and provide the payload that conforms to the schema. With a targeted configuration patch, you can control individual resources.
A root patch, where you run the NVUE PATCH API on the root node of the schema so that a single PATCH operation can change one, some, or the entire configuration in a single payload. The payload of the PATCH method must be aware of the entire NVUE object model schema because you make the configuration changes relative to the root node /nvue_v1. You typically perform a root patch to push all configurations to the switch in bulk; for example, if you use an SDN controller or a network management system to push the entire switch configuration every time you need to make a change, regardless of how small or large. A root patch can also make configuration changes with fewer round trips to the switch.
The input payload in a PATCH request can have either a set or unset json object for the same resource, but not both. The order in which the API executes the set and unset objects is not deterministic and not supported.
The DELETE method deletes a configuration and is equivalent to the nv unset commands.
Secure the API
The NVUE REST API supports HTTP basic authentication, and the same underlying authentication methods for username and password that the NVUE CLI supports. User accounts work the same on both the API and the CLI.
Certificates
Cumulus Linux includes a self-signed certificate and private key to use on the server so that it works out of the box. The switch generates the self-signed certificate and private key when it boots for the first time. The X.509 certificate with the public key is in /etc/ssl/certs/cumulus.pem and the corresponding private key is in /etc/ssl/private/cumulus.key.
Cumulus Linux lets you manage CA certificates (such as DigiCert or Verisign) and entity (end-point) certificates. Both a CA certificate and an entity certificate can contain a chain of certificates.
You can import certificates onto the switch (fetch certificates from an external source), set which certificate you want to use for the NVUE REST API, and show information about a certificate, such as the serial number, and the date and time during which the certificate is valid.
Import a Certificate
You can import a maximum of 25 entity certificates and a maximum of 50 CA certificates.
The certificate you import contains sensitive private key information. NVIDIA recommends that you use a secure transport such as SFTP, SCP, or HTTPS.
To import an entity certificate, run an nv action import system security certificate <cert-id> command.
To import a CA certificate, run an nv action import system security ca-certificate <cert-id> command.
If the certificate is passphrase protected, you need to include the passphrase.
You must provide a certificate ID (<cert-id>) to uniquely identify the certificate you import.
The following example imports a CA certificate with a public key and calls the certificate tls-cert-1. The certificate is passphrase protected with mypassphrase. The public key is a Base64 ASCII encoded PEM string.
cumulus@switch:~$ nv action import system security ca-certificate tls-cert-1 passphrase mypassphrase data "<public-key>"
The following example imports an entity certificate bundle and calls the certificate tls-cert-1. The certificate bundle is passphrase protected with mypassphrase.
A certificate bundle must be in .PFX or .P12 format.
The following example imports an entity certificate with the public key URI scp://user@pass:1.2.3.4 and private key URI scp://user@pass:1.2.3.4, and calls the certificate tls-cert-1. The certificate is not passphrase protected.
A CA certificate must be in .pem, .p7a, or .p7c format.
You can configure the NVUE REST API to use a specific certificate.
The following example configures the API to use the certificate tls-cert-1:
cumulus@switch:~$ nv set system api certificate tls-cert-1
cumulus@switch:~$ nv config apply
The following example configures the API to use the self-signed certificate:
cumulus@switch:~$ nv set system api certificate self-signed
cumulus@switch:~$ nv config apply
To unset the certificate to use with the NVUE REST API:
cumulus@switch:~$ nv unset system api certificate tls-cert-1
Delete Certificates
To delete an entity certificate and the key data stored on the switch, run the nv action delete system security certificate <cert-id> command.
To delete a CA certificate and the key data stored on the switch, run the nv action delete system security ca-certificate <cert-id> command.
The following command deletes the certificate tls-cert-1:
cumulus@switch:~$ nv action delete system security certificate tls-cert-1
Show Certificate Information
To show all the entity certificates on the switch, run the nv show system security certificate command.
To show all the CA certificates on the switch, run the nv show system security ca-certificate command.
The following example shows all the entity certificates on the switch:
cumulus@switch:~$ nv show system security certificate
To show the applications that are using a specific entity certificate, run the nv show system security certificate <cert-id> installed command.
To show the applications that are using a specific CA certificate, run the nv show system security ca-certificate <cert-id> installed command.
The following example shows the applications that are using a specific entity certificate.
cumulus@switch:~$ nv show system security certificate tls-cert-1 installed
To show detailed information about a specific entity certificate, run the nv show system security certificate <cert-id> dump command.
To show detailed information about a specific CA certificate, run the nv show system security ca-certificate <cert-id> dump command.
The following example shows detailed information about the CA certificate tls-cert-1:
cumulus@switch:~$ nv show system security ca-certificate tls-cert-1 dump
API-only User
To create an API-only user without SSH permissions, use Linux group permissions. You can create the API-only user in the ZTP script.
# Create the dedicated automation user
adduser --disabled-password --gecos "Automation User,,,," --shell /usr/bin/nologin automation
# Set the password
echo 'automation:password!' | chpasswd
# Add the user to nvapply group to make NVUE config changes
adduser automation nvapply
This example shows how to create ACLs to allow users from the management subnet and the local switch to communicate with the switch using REST APIs, and restrict all other access.
cumulus@switch:~$ nv set acl API-PROTECT type ipv4
cumulus@switch:~$ nv set acl API-PROTECT rule 10 action permit
cumulus@switch:~$ nv set acl API-PROTECT rule 10 match ip .protocol tcp .dest-port 8765 .source-ip 192.168.200.0/24
cumulus@switch:~$ nv set acl API-PROTECT rule 10 remark "Allow the Management Subnet to talk to API"
cumulus@switch:~$ nv set acl API-PROTECT rule 20 action permit
cumulus@switch:~$ nv set acl API-PROTECT rule 20 match ip .protocol tcp .dest-port 8765 .source-ip 127.0.0.1
cumulus@switch:~$ nv set acl API-PROTECT rule 20 remark "Allow the local switch to talk to the API"
cumulus@switch:~$ nv set acl API-PROTECT rule 30 action deny
cumulus@switch:~$ nv set acl API-PROTECT rule 30 match ip .protocol tcp .dest-port 8765
cumulus@switch:~$ nv set acl API-PROTECT rule 30 remark "Block everyone else from talking to the API"
cumulus@switch:~$ nv set system control-plane acl API-PROTECT inbound
Supported Objects
The NVUE object model supports most features on the Cumulus Linux switch. The following list shows the supported objects. The NVUE API supports more objects within each of these objects. You can find a full listing of the supported API endpoints
here.
High-level Objects
Description
acl
Access control lists.
bridge
Bridge domain configuration.
evpn
EVPN configuration.
interface
Interface configuration.
mlag
MLAG configuration.
nve
Network virtualization configuration, such as VXLAN-specfic MLAG configuration and VXLAN flooding.
platform
Platform configuration, such as hardware and software components.
qos
QoS RoCE configuration.
router
Router configuration, such as router policies, global BGP and OSPF configuration, PBR, PIM, IGMP, VRR, and VRRP configuration.
service
DHCP relays and server, NTP, PTP, LLDP, and syslog configuration.
system
Global system settings, such as the reserved routing table range for PBR and the reserved VLAN range for layer 3 VNIs, system login messages and switch reboot history.
vrf
VRF configuration.
Use the API
The NVUE CLI and the REST API are equivalent in functionality; you can run all management operations from the REST API or from the CLI. The NVUE object model drives both the REST API and the CLI management operations. All operations are consistent; for example, the CLI nv show commands reflect any PATCH operation (create and update) you run through the REST API.
NVUE follows a declarative model, removing context-specific commands and settings. The structure of NVUE is like a big tree that represents the entire state of a Cumulus Linux instance. At the base of the tree are high level branches representing objects, such as router and interface. Under each of these branches are more branches. As you navigate through the tree, you gain a more specific context. At the leaves of the tree are actual attributes, represented as key-value pairs. The path through the tree is similar to a filesystem path.
Cumulus Linux enables the NVUE REST API by default. To disable the NVUE REST API, run the nv set system api state disabled command.
To use the NVUE REST API in Cumulus Linux 5.6 and later, you must change the password for the cumulus user; otherwise you see 403 responses when you run commands.
API Port and Listening Address
This section shows how to:
Set the NVUE REST API port. If you do not set a port, Cumulus Linux uses the default port 8765.
Specify the NVUE REST API listening address; you can specify an IPv4 address, IPv6 address, or localhost. If you do not specify a listening address, NGINX listens on all addresses for the target port.
The following example sets the port to 8888:
cumulus@switch:~$ nv set system api port 8888
cumulus@switch:~$ nv config apply
You can listen on multiple interfaces by specifying different listening addresses:
cumulus@switch:~$ nv set system api listening-address 10.10.10.1
cumulus@switch:~$ nv set system api listening-address 10.10.20.1
cumulus@switch:~$ nv config apply
The following example configures the listening address on eth0, which has IP address 172.0.24.0 and uses the management VRF by default:
cumulus@switch:~$ nv set system api listening-address 172.0.24.0
cumulus@switch:~$ nv config apply
The following example configures VRF BLUE on swp1, which has IP address 10.10.20.1, then sets the API listening address to the IP address for swp1 (configured for VRF BLUE).
cumulus@switch:~$ nv set interface swp1 ip address 10.10.10.1/24
cumulus@switch:~$ nv set interface swp1 ip vrf BLUE
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv set system api listening-address 10.10.10.1
cumulus@switch:~$ nv config apply
You can listen on multiple interfaces by specifying different listening addresses. The following example sets localhost, interface address 10.10.10.1, and 10.10.20.1 as listen-addresses.
To show REST API port configuration, state (enabled or disabled), certificate, listening address, and connection information:
Run the nv show system api command:
cumulus@switch:~$ nv show system api
operational applied
-------------- ----------- -------
port 8888 8888
state enabled enabled
certificate self-signed self-signed
[listening-address] localhost localhost
connections
accepted 31
active 1
handled 33
reading 0
requests 28
waiting 0
writing 1
To show connection information only, run the nv show system api connections command:
cumulus@switch:~$ nv show system api connections
operational applied
-------- ----------- -------
accepted 31
active 1
handled 33
reading 0
requests 28
waiting 0
writing 1
To show the configured listening address, run the nv show system api listening-address command:
cumulus@switch:~$ nv show system api listening-address
---------
localhost
To show all the certificates installed on the switch, run the nv show system security certificate command. To show information about a specific certificate, such as the serial number and how long the certificate is valid, run the nv show system security certificate <certificate> command:
The following examples show the primary API uses cases.
View a Configuration
Use the following example to obtain the current applied configuration on the switch. Change the rev argument to view any revision. Possible options for the rev argument include startup, pending, operational, and applied.
cumulus@switch:~$ nv show system
operational applied
-------- ------------------- -------
hostname switch01 cumulus
build Cumulus Linux 5.4.0
uptime 0:12:59
timezone Etc/UTC
cumulus@switch:~$ nv show bridge domain br_default vlan 10
operational applied pending description
--------------- ----------- ------- ------- ------------------------------------------------------
[vni] 10 10 10 L2 VNI
multicast
snooping
querier
source-ip 0.0.0.0 0.0.0.0 0.0.0.0 Source IP to use when sending IGMP/MLD queries.
ptp
enable off off off Turn the feature 'on' or 'off'. The default is 'off'.
#!/usr/bin/env python3
import requests
from requests.auth import HTTPBasicAuth
import json
import time
auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}
DUMMY_SLEEP = 5 # In seconds
POLL_APPLIED = 1 # in seconds
RETRIES = 10
def print_request(r: requests.Request):
print("=======Request=======")
print("URL:", r.url)
print("Headers:", r.headers)
print("Body:", r.body)
def print_response(r: requests.Response):
print("=======Response=======")
print("Headers:", r.headers)
print("Body:", json.dumps(r.json(), indent=2))
def create_nvue_changest():
r = requests.post(url=nvue_end_point + "/revision",
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
response = r.json()
changeset = response.popitem()[0]
return changeset
def apply_nvue_changeset(changeset):
apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
safe="")
r = requests.patch(url=url,
auth=auth,
verify=False,
data=json.dumps(apply_payload),
headers=mime_header)
print_request(r.request)
print_response(r)
def is_config_applied(changeset) -> bool:
# Check if the configuration was indeed applied
global RETRIES
global POLL_APPLIED
retries = RETRIES
while retries > 0:
r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
auth=auth,
verify=False)
response = r.json()
print(response)
if response["state"] == "applied":
return True
retries -= 1
time.sleep(POLL_APPLIED)
return False
def apply_new_config(path,payload):
# Create a new revision ID
changeset = create_nvue_changest()
print("Using NVUE Changeset: '{}'".format(changeset))
# Delete existing configuration
query_string = {"rev": changeset}
r = requests.delete(url=nvue_end_point + path,
auth=auth,
verify=False,
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Patch the new configuration
query_string = {"rev": changeset}
r = requests.patch(url=nvue_end_point + path,
auth=auth,
verify=False,
data=json.dumps(payload),
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Apply the changes to the new revision changeset
apply_nvue_changeset(changeset)
# Check if the changeset was applied
is_config_applied(changeset)
def nvue_get(path):
r = requests.get(url=nvue_end_point + path,
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
if __name__ == "__main__":
payload = {
"99.99.99.99/32": {}
}
apply_new_config("/interface/lo/ip/address",payload)
time.sleep(DUMMY_SLEEP)
nvue_get("/interface/lo/ip/address")
cumulus@switch:~$ nv show interface lo ip address
-------------
99.99.99.99/32
127.0.0.1/8
::1/128
View Differences Between Configurations
To view differences between configurations, run the API GET /nvue_v1/<resource>?rev=<rev-A>&diff=<rev-B> method with the configurations you want to diff. This method is equivalent to the NVUE nv config diff <rev-A> <rev-B> command.
To see the difference between the startup revision and the applied revision:
cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure -X GET /nvue_v1/interface?rev=startup&diff=applied
To see the difference between revision 1 and revision 2:
cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure -X GET /nvue_v1/<resource>?rev=1&diff=2
You can change the order of the revisions; for example, GET /nvue_v1/<resource>?rev=2&diff=1.
Troubleshoot Configuration Changes
When a configuration change fails, you see an error in the change request.
Configuration Fails Because of a Dependency
If you stage a configuration but it fails because of a dependency, the failure shows the reason. In the following example, the change fails because the BGP router ID is not set.
cumulus@switch:~$ curl -u 'cumulus:cumulus' --insecure https://127.0.0.1:8765/nvue_v1/revision/6
{
"state": "invalid",
"transition": {
"issue": {
"0": {
"code": "config_invalid",
"data": {
"location": "router.bgp.enable",
"reason": "BGP requires router-id to be set globally or in the VRF.\n"
},
"message": "Config invalid at router.bgp.enable: BGP requires router-id to be set globally or in the VRF.\n",
"severity": "error"
}
},
"progress": "Invalid config"
}
}
To resolve this issue, observe the failures or errors, then inspect the configuration that you are trying to apply. After you resolve the errors, retry the API. If you prefer to overlook the errors and force an apply, add "auto-prompt":{"ays": "ays_yes"} to the configuration apply.
To save an applied configuration change to the startup configuration file (/etc/nvue.d/startup.yaml) so that the changes persist after a reboot, use a PATCH to the applied revision with the save state.
When you unset a change, you must still use the PATCH action. The value indicates removal of the entry. The data is {"vlan100":null} with the PATCH action.
Use the API for Active Monitoring
The example below fetches the counters for interface swp1.
#!/usr/bin/env python3
import requests
from requests.auth import HTTPBasicAuth
import json
import time
auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}
if __name__ == "__main__":
r = requests.get(url=nvue_end_point + "/interface/swp1/link/stats",
auth=auth,
verify=False)
print("=======Interface swp1 Statistics=======")
print(json.dumps(r.json(), indent=2))
cumulus@switch:~$ nv show interface swp1 link stats
operational applied pending description
------------------- ----------- ------- ------- ----------------------------------------------------------------------
carrier-transitions 6 Number of times the interface state has transitioned between up and...
in-bytes 280.15 MB total number of bytes received on the interface
in-drops 0 number of received packets dropped
in-errors 0 number of received packets with errors
in-pkts 2321659 total number of packets received on the interface
out-bytes 349.10 MB total number of bytes transmitted out of the interface
out-drops 0 The number of outbound packets that were chosen to be discarded eve...
out-errors 0 The number of outbound packets that could not be transmitted becaus...
out-pkts 3536508 total number of packets transmitted out of the interface
Retrieve View Types
NVUE provides views for certain show commands. A view is a subset of information.
To see the views available for a show command, run the command with --view and press TAB:
cumulus@switch:~$ nv show interface --view <<TAB>>
acl-statistics detail lldp mlag-cc port-security synce-counters
brief dot1x-counters lldp-detail neighbor qos-profile
counters dot1x-summary mac pluggables small
To retrieve view types through the REST API, you use the curl -u 'cumulus:CumulusLinux!' -k -X GET http://path?view=<brief> syntax. For example, the equivalent REST API method for the NVUE nv show vrf <vrf-id> router rib ipv4 route --view=brief command is:
cumulus@switch:~$ curl -u 'cumulus:CumulusLinux!' -k -X GET https://127.0.0.1:8765/nvue_v1/vrf/BLUE/router/rib/ipv4/route?view=brief
Convert CLI Changes to Use the API
You can take a configuration change from the CLI and use the API to configure the same set of changes.
Make your configuration changes on the system with the NVUE CLI.
cumulus@switch:~$ nv set system hostname switch01
cumulus@switch:~$ nv set interface lo ip address 99.99.99.99/32
cumulus@switch:~$ nv set interface eth0 ip address 192.168.200.6/24
cumulus@switch:~$ nv set interface bond0 bond member swp1-4
#!/usr/bin/env python3
import requests
from requests.auth import HTTPBasicAuth
import json
import time
auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}
DUMMY_SLEEP = 5 # In seconds
POLL_APPLIED = 1 # in seconds
RETRIES = 10
def print_request(r: requests.Request):
print("=======Request=======")
print("URL:", r.url)
print("Headers:", r.headers)
print("Body:", r.body)
def print_response(r: requests.Response):
print("=======Response=======")
print("Headers:", r.headers)
print("Body:", json.dumps(r.json(), indent=2))
def create_nvue_changest():
r = requests.post(url=nvue_end_point + "/revision",
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
response = r.json()
changeset = response.popitem()[0]
return changeset
def apply_nvue_changeset(changeset):
# apply_payload = {"state": "apply"}
apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
safe="")
r = requests.patch(url=url,
auth=auth,
verify=False,
data=json.dumps(apply_payload),
headers=mime_header)
print_request(r.request)
print_response(r)
def is_config_applied(changeset) -> bool:
# Check if the configuration was indeed applied
global RETRIES
global POLL_APPLIED
retries = RETRIES
while retries > 0:
r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
auth=auth,
verify=False)
response = r.json()
print(response)
if response["state"] == "applied":
return True
retries -= 1
time.sleep(POLL_APPLIED)
return False
def apply_new_config(path,payload):
# Create a new revision ID
changeset = create_nvue_changest()
print("Using NVUE Changeset: '{}'".format(changeset))
# Delete existing configuration
query_string = {"rev": changeset}
r = requests.delete(url=nvue_end_point + path,
auth=auth,
verify=False,
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Patch the new configuration
query_string = {"rev": changeset}
r = requests.patch(url=nvue_end_point + path,
auth=auth,
verify=False,
data=json.dumps(payload),
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Apply the changes to the new revision changeset
apply_nvue_changeset(changeset)
# Check if the changeset was applied
is_config_applied(changeset)
def nvue_get(path):
r = requests.get(url=nvue_end_point + path,
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
if __name__ == "__main__":
payload = {
"interface": {
"bond0": {
"bond": {
"member": {
"swp1": {},
"swp2": {},
"swp3": {},
"swp4": {}
}
},
"type": "bond"
},
"lo": {
"ip": {
"address": {
"99.99.99.99/32": {}
}
}
}
},
"system": {
"hostname": "switch01"
}
}
apply_new_config("/",payload)
time.sleep(DUMMY_SLEEP)
nvue_get("/interface/bond0")
nvue_get("/interface/lo")
nvue_get("/system")
API Examples
The following section provides practical API examples.
Configure the System
To set the system hostname, pre-login or post-login message, and time zone on the switch, send a targeted API request to /nvue_v1/system.
cumulus@switch:~$ curl -u 'cumulus:cumulus' -d '{"system": {"hostname":"switch01","timezone":"America/Los_Angeles","message":{"pre-login":"Welcome to NVIDIA Cumulus Linux","post-login":"You have successfully logged in to switch01"}}}' -k -X PATCH https://127.0.0.1:8765/nvue_v1/?rev=4
#!/usr/bin/env python3
import requests
from requests.auth import HTTPBasicAuth
import json
import time
auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}
DUMMY_SLEEP = 5 # In seconds
POLL_APPLIED = 1 # in seconds
RETRIES = 10
def print_request(r: requests.Request):
print("=======Request=======")
print("URL:", r.url)
print("Headers:", r.headers)
print("Body:", r.body)
def print_response(r: requests.Response):
print("=======Response=======")
print("Headers:", r.headers)
print("Body:", json.dumps(r.json(), indent=2))
def create_nvue_changest():
r = requests.post(url=nvue_end_point + "/revision",
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
response = r.json()
changeset = response.popitem()[0]
return changeset
def apply_nvue_changeset(changeset):
# apply_payload = {"state": "apply"}
apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
safe="")
r = requests.patch(url=url,
auth=auth,
verify=False,
data=json.dumps(apply_payload),
headers=mime_header)
print_request(r.request)
print_response(r)
def is_config_applied(changeset) -> bool:
# Check if the configuration was indeed applied
global RETRIES
global POLL_APPLIED
retries = RETRIES
while retries > 0:
r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
auth=auth,
verify=False)
response = r.json()
print(response)
if response["state"] == "applied":
return True
retries -= 1
time.sleep(POLL_APPLIED)
return False
def apply_new_config(path,payload):
# Create a new revision ID
changeset = create_nvue_changest()
print("Using NVUE Changeset: '{}'".format(changeset))
# Delete existing configuration
query_string = {"rev": changeset}
r = requests.delete(url=nvue_end_point + path,
auth=auth,
verify=False,
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Patch the new configuration
query_string = {"rev": changeset}
r = requests.patch(url=nvue_end_point + path,
auth=auth,
verify=False,
data=json.dumps(payload),
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Apply the changes to the new revision changeset
apply_nvue_changeset(changeset)
# Check if the changeset was applied
is_config_applied(changeset)
def nvue_get(path):
r = requests.get(url=nvue_end_point + path,
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
if __name__ == "__main__":
payload = {
"system":
{
"hostname":"switch01",
"timezone":"America/Los_Angeles",
"message":
{
"pre-login":"Welcome to NVIDIA Cumulus Linux",
"post-login:"You have successfully logged in to switch01"
}
}
}
apply_new_config("/",payload) # Root patch
time.sleep(DUMMY_SLEEP)
nvue_get("/system")
cumulus@switch:~$ nv set system hostname switch01
cumulus@switch:~$ nv set system timezone America/Los_Angeles
cumulus@switch:~$ nv set system message pre-login "Welcome to NVIDIA Cumulus Linux"
cumulus@switch:~$ nv set system message post-login "You have successfully logged into switch01"
Configure Services
To set up NTP, DNS, and SNMP on the switch, send a targeted API request to /nvue_v1/service.
#!/usr/bin/env python3
import requests
from requests.auth import HTTPBasicAuth
import json
import time
auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}
DUMMY_SLEEP = 5 # In seconds
POLL_APPLIED = 1 # in seconds
RETRIES = 10
def print_request(r: requests.Request):
print("=======Request=======")
print("URL:", r.url)
print("Headers:", r.headers)
print("Body:", r.body)
def print_response(r: requests.Response):
print("=======Response=======")
print("Headers:", r.headers)
print("Body:", json.dumps(r.json(), indent=2))
def create_nvue_changest():
r = requests.post(url=nvue_end_point + "/revision",
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
response = r.json()
changeset = response.popitem()[0]
return changeset
def apply_nvue_changeset(changeset):
# apply_payload = {"state": "apply"}
apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
safe="")
r = requests.patch(url=url,
auth=auth,
verify=False,
data=json.dumps(apply_payload),
headers=mime_header)
print_request(r.request)
print_response(r)
def is_config_applied(changeset) -> bool:
# Check if the configuration was indeed applied
global RETRIES
global POLL_APPLIED
retries = RETRIES
while retries > 0:
r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
auth=auth,
verify=False)
response = r.json()
print(response)
if response["state"] == "applied":
return True
retries -= 1
time.sleep(POLL_APPLIED)
return False
def apply_new_config(path,payload):
# Create a new revision ID
changeset = create_nvue_changest()
print("Using NVUE Changeset: '{}'".format(changeset))
# Delete existing configuration
query_string = {"rev": changeset}
r = requests.delete(url=nvue_end_point + path,
auth=auth,
verify=False,
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Patch the new configuration
query_string = {"rev": changeset}
r = requests.patch(url=nvue_end_point + path,
auth=auth,
verify=False,
data=json.dumps(payload),
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Apply the changes to the new revision changeset
apply_nvue_changeset(changeset)
# Check if the changeset was applied
is_config_applied(changeset)
def nvue_get(path):
r = requests.get(url=nvue_end_point + path,
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
if __name__ == "__main__":
payload = {
"service":
{
"ntp":
{
"default":
{
"server:
{
"4.cumulusnetworks.pool.ntp.org":
{
"iburst":"on"
}
}
}
},
"dns":
{
"mgmt":
{
"server:
{
"192.168.1.100":{}
}
}
},
"syslog":
{
"mgmt":
{
"server:
{
"192.168.1.120":
{
"port":8000
}
}
}
}
}
}
apply_new_config("/",payload) # Root patch
time.sleep(DUMMY_SLEEP)
nvue_get("/service/ntp")
nvue_get("/service/dns")
nvue_get("/service/syslog")
cumulus@switch:~$ nv set service ntp default server 4.cumulusnetworks.pool.ntp.org iburst on
cumulus@switch:~$ nv set service dns mgmt server 192.168.1.100
cumulus@switch:~$ nv set service syslog mgmt server 192.168.1.120 port 8000
Configure Users
The following example creates a new user, then deletes the user.
#!/usr/bin/env python3
import requests
from requests.auth import HTTPBasicAuth
import json
import time
auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}
DUMMY_SLEEP = 5 # In seconds
POLL_APPLIED = 1 # in seconds
RETRIES = 10
def print_request(r: requests.Request):
print("=======Request=======")
print("URL:", r.url)
print("Headers:", r.headers)
print("Body:", r.body)
def print_response(r: requests.Response):
print("=======Response=======")
print("Headers:", r.headers)
print("Body:", json.dumps(r.json(), indent=2))
def create_nvue_changest():
r = requests.post(url=nvue_end_point + "/revision",
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
response = r.json()
changeset = response.popitem()[0]
return changeset
def apply_nvue_changeset(changeset):
# apply_payload = {"state": "apply"}
apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
safe="")
r = requests.patch(url=url,
auth=auth,
verify=False,
data=json.dumps(apply_payload),
headers=mime_header)
print_request(r.request)
print_response(r)
def is_config_applied(changeset) -> bool:
# Check if the configuration was indeed applied
global RETRIES
global POLL_APPLIED
retries = RETRIES
while retries > 0:
r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
auth=auth,
verify=False)
response = r.json()
print(response)
if response["state"] == "applied":
return True
retries -= 1
time.sleep(POLL_APPLIED)
return False
def apply_new_config(path,payload):
# Create a new revision ID
changeset = create_nvue_changest()
print("Using NVUE Changeset: '{}'".format(changeset))
# Delete existing configuration
query_string = {"rev": changeset}
r = requests.delete(url=nvue_end_point + path,
auth=auth,
verify=False,
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Patch the new configuration
query_string = {"rev": changeset}
r = requests.patch(url=nvue_end_point + path,
auth=auth,
verify=False,
data=json.dumps(payload),
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Apply the changes to the new revision changeset
apply_nvue_changeset(changeset)
# Check if the changeset was applied
is_config_applied(changeset)
def delete_config(path):
# Create an NVUE changeset
changeset = create_nvue_changest()
print("Using NVUE Changeset: '{}'".format(changeset))
# Equivalent to JSON `null`
payload = None
# Stage the change
query_string = {"rev": changeset}
r = requests.delete(url=nvue_end_point + path,
auth=auth,
verify=False,
data=json.dumps(payload),
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Apply the staged changeset
apply_nvue_changeset(changeset)
# Check if the changeset was applied
is_config_applied(changeset)
def nvue_get(path):
r = requests.get(url=nvue_end_point + path,
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
if __name__ == "__main__":
# Need to create a hashed password - The supported password
# hashes are documented here:
# https://docs.nvidia.com/networking-ethernet-software/cumulus-linux-55/System-Configuration/Authentication-Authorization-and-Accounting/User-Accounts/#hashed-passwords # noqa
# Here in this example, we use SHA-512
import crypt
hashed_password = crypt.crypt("hello$world#2023", salt=crypt.METHOD_SHA512)
payload = {
"system": {
"aaa": {
"user": {
"test1": {
"hashed-password": hashed_password,
"role": "nvue-monitor",
"enable": "on",
"full-name": "Test User",
}
}
}
}
}
apply_new_config("/",payload) # Root patch
time.sleep(DUMMY_SLEEP)
nvue_get("/system/user/aaa")
"""Delete an existing user account using the AAA API."""
delete_config("/system/aaa/user/test1")
time.sleep(DUMMY_SLEEP)
nvue_get("/system/user/aaa")
This example creates a new user test1.
cumulus@switch:~$ nv set system aaa user test1
cumulus@switch:~$ nv set system aaa user test1 full-name "Test User"
cumulus@switch:~$ nv set system aaa user test1 password "abcd@test"
cumulus@switch:~$ nv set system aaa user test1 role nvue-monitor
cumulus@switch:~$ nv set system aaa user test1 enable on
#!/usr/bin/env python3
import requests
from requests.auth import HTTPBasicAuth
import json
import time
auth = HTTPBasicAuth(username="cumulus", password="password")
nvue_end_point = "https://127.0.0.1:8765/nvue_v1"
mime_header = {"Content-Type": "application/json"}
DUMMY_SLEEP = 5 # In seconds
POLL_APPLIED = 1 # in seconds
RETRIES = 10
def print_request(r: requests.Request):
print("=======Request=======")
print("URL:", r.url)
print("Headers:", r.headers)
print("Body:", r.body)
def print_response(r: requests.Response):
print("=======Response=======")
print("Headers:", r.headers)
print("Body:", json.dumps(r.json(), indent=2))
def create_nvue_changest():
r = requests.post(url=nvue_end_point + "/revision",
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
response = r.json()
changeset = response.popitem()[0]
return changeset
def apply_nvue_changeset(changeset):
# apply_payload = {"state": "apply"}
apply_payload = {"state": "apply", "auto-prompt": {"ays": "ays_yes"}}
url = nvue_end_point + "/revision/" + requests.utils.quote(changeset,
safe="")
r = requests.patch(url=url,
auth=auth,
verify=False,
data=json.dumps(apply_payload),
headers=mime_header)
print_request(r.request)
print_response(r)
def is_config_applied(changeset) -> bool:
# Check if the configuration was indeed applied
global RETRIES
global POLL_APPLIED
retries = RETRIES
while retries > 0:
r = requests.get(url=nvue_end_point + "/revision/" + requests.utils.quote(changeset, safe=""),
auth=auth,
verify=False)
response = r.json()
print(response)
if response["state"] == "applied":
return True
retries -= 1
time.sleep(POLL_APPLIED)
return False
def apply_new_config(path,payload):
# Create a new revision ID
changeset = create_nvue_changest()
print("Using NVUE Changeset: '{}'".format(changeset))
# Delete existing configuration
query_string = {"rev": changeset}
r = requests.delete(url=nvue_end_point + path,
auth=auth,
verify=False,
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Patch the new configuration
query_string = {"rev": changeset}
r = requests.patch(url=nvue_end_point + path,
auth=auth,
verify=False,
data=json.dumps(payload),
params=query_string,
headers=mime_header)
print_request(r.request)
print_response(r)
# Apply the changes to the new revision changeset
apply_nvue_changeset(changeset)
# Check if the changeset was applied
is_config_applied(changeset)
def nvue_get(path):
r = requests.get(url=nvue_end_point + path,
auth=auth,
verify=False)
print_request(r.request)
print_response(r)
if __name__ == "__main__":
rt_payload = {
"bgp":
{
"autonomous-system": 65101,
"router-id":"10.10.10.1"
}
}
apply_new_config("/router",rt_payload)
vrf_payload = {
"bgp":
{
"neighbor":
{
"swp51":
{
"remote-as":"external"
}
},
"address-family":
{
"ipv4-unicast":
{
"network":
{
"10.10.10.1/32":{}
}
}
}
}
}
apply_new_config("/vrf/default/router",vrf_payload)
time.sleep(DUMMY_SLEEP)
nvue_get("/router")
nvue_get("/vrf/default/router")
cumulus@switch:~$ nv set router bgp autonomous-system 65101
cumulus@switch:~$ nv set router bgp router-id 10.10.10.1
cumulus@switch:~$ nv set vrf default router bgp neighbor swp51 remote-as external
cumulus@switch:~$ nv set vrf default router bgp address-family ipv4-unicast network 10.10.10.1/32
Action Operations
The NVUE action operations are ephemeral operations that do not modify the state of the configuration; they reset counters for interfaces, BGP, QoS buffers and pools, and remove conflicts from protodown MLAG bonds.
In the following python example, the full_config_example() method sets the system pre-login message, enables BGP globally, and changes a few other configuration settings in a single bulk operation. The API end-point goes to the root node /nvue_v1. The bridge_config_example() method performs a targeted API request to /nvue_v1/bridge/domain/<domain-id> to set the vlan-vni-offset attribute.
To try out the NVUE REST API, use the NVUE API Lab available on NVIDIA Air. The lab provides a basic example to help you get started. You can also try out the other examples in this document.
Unlike the NVUE CLI, the NVUE API does not support configuring a plain text password for a user account; you must configure a hashed password for a user account with the NVUE API.
If you need to make multiple updates on the switch, NVIDIA recommends you use a root patch, which can make configuration changes with fewer round trips to the switch. Running many specific NVUE PATCH APIs to set or unset objects requires many round trips to the switch to set up the HTTP connection, transfer payload and responses, manage network utilization, and so on.
Cumulus Linux supports IEEE 1588-2008 Precision Timing Protocol (PTPv2), which defines the algorithm and method for synchronizing clocks of various devices across packet-based networks, including Ethernet switches and IP routers.
PTP is capable of sub-microsecond accuracy. The clocks are in a master-slave hierarchy, where the slaves synchronize to their masters, which can be slaves to their own masters. The best master clock (BMC) algorithm, which runs on every clock, creates and updates the hierarchy automatically. The grandmaster clock is the top-level master. To provide a high-degree of accuracy, a Global Positioning System (GPS) time source typically synchronizes the grandmaster clock.
In the following example:
Boundary clock 2 receives time from Master 1 (the grandmaster) on a PTP slave port, sets its clock and passes the time down from the PTP master port to Boundary clock 1.
Boundary clock 1 receives the time on a PTP slave port, sets its clock and passes the time down the hierarchy through the PTP master ports to the hosts that receive the time.
Cumulus Linux and PTP
PTP in Cumulus Linux uses the linuxptp package that includes the following programs:
ptp4l provides the PTP protocol and state machines
phc2sys provides PTP Hardware Clock and System Clock synchronization
timemaster provides System Clock and PTP synchronization
Cumulus Linux supports:
PTP boundary clock mode only (the switch provides timing to downstream servers; it is a slave to a higher-level clock and a master to downstream clocks).
UDPv4, UDPv6, and 802.3 encapsulation.
Only a single PTP domain per network.
PTP on layer 3 interfaces, layer 3 bonds, trunk ports, and switch ports belonging to a VLAN.
Multicast, unicast, and mixed message mode.
End-to-End delay mechanism only. Cumulus Linux does not support Peer-to-Peer.
One-step and two-step clock timestamp mode.
Hardware timestamping for PTP packets. This allows PTP to avoid inaccuracies caused by message transfer delays and improves the accuracy of time synchronization.
You cannot run both PTP and NTP on the switch.
PTP supports the default VRF only.
1G links might have a lower accuracy for PTP due to hardware limitations. If your application needs high accuracy from PTP, use higher link speeds.
Basic Configuration
Basic PTP configuration requires you:
Disable NTP and remove default NTP configuration.
Enable PTP on the switch.
Configure PTP on at least one interface; this can be a layer 3 routed port, switch port, or trunk port. You do not need to specify which is a master interface and which is a slave interface; the PTP Best Master Clock Algorithm (BMCA) determines the master and slave.
If you configure PTP with Linux commands, you must also enable PTP timestamping; see step 1 of the Linux procedure below. NVUE enables timestamping when you enable PTP on the switch.
The basic configuration shown below uses the default PTP settings:
The clock mode is Boundary. This is the only clock mode that Cumulus Linux supports.
The delay mechanism is End-to-End (E2E), where the slave measures the delay between itself and the master. The master and slave send delay request and delay response messages between each other to measure the delay.
The clock timestamp mode is two-step.
To configure other settings, such as the PTP profile, domain, priority, and DSCP, the PTP interface transport mode and timers, and PTP monitoring, see the Optional Configuration sections below.
Disable NTP
Remove the default NTP configuration on the switch:
cumulus@switch:~$ nv unset service ntp mgmt server 0.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp mgmt server 1.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp mgmt server 2.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv unset service ntp mgmt server 3.cumulusnetworks.pool.ntp.org
cumulus@switch:~$ nv config apply
Stop and disable the NTP service in the management VRF:
Edit the /etc/ntpsec/ntp.conf file to comment out the default NTP configuration:
cumulus@switch:~$ sudo nano /etc/ntpsec/ntp.conf
# server 0.cumulusnetworks.pool.ntp.org iburst
# server 1.cumulusnetworks.pool.ntp.org iburst
# server 2.cumulusnetworks.pool.ntp.org iburst
# server 3.cumulusnetworks.pool.ntp.org iburst
Stop and disable the NTP service in the management VRF:
The NVUE nv set service ptp commands require an instance number (1 in the example command below) for management purposes.
When you enable the PTP service with the nv set service ptp <instance> enable on command, NVUE restarts the switchd service, which causes all network ports to reset in addition to resetting the switch hardware configuration.
cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set interface swp1 ip address 10.0.0.9/32
cumulus@switch:~$ nv set interface swp2 ip address 10.0.0.10/32
cumulus@switch:~$ nv set interface swp1 ptp enable on
cumulus@switch:~$ nv set interface swp2 ptp enable on
cumulus@switch:~$ nv config apply
The configuration writes to the /etc/ptp4l.conf file.
cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default type vlan-aware
cumulus@switch:~$ nv set bridge domain br_default vlan 10-30
cumulus@switch:~$ nv set bridge domain br_default vlan 10 ptp enable on
cumulus@switch:~$ nv set interface vlan10 type svi
cumulus@switch:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@switch:~$ nv set interface vlan10 ptp enable on
cumulus@switch:~$ nv set interface swp1 bridge domain br_default
cumulus@switch:~$ nv set interface swp1 bridge domain br_default vlan 10
cumulus@switch:~$ nv set interface swp1 ptp enable on
cumulus@switch:~$ nv config apply
You can configure only one address; either IPv4 or IPv6.
For IPv6, set the trunk port transport mode to ipv6.
The configuration writes to the /etc/ptp4l.conf file.
cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set bridge domain br_default
cumulus@switch:~$ nv set bridge domain br_default type vlan-aware
cumulus@switch:~$ nv set bridge domain br_default vlan 10-30
cumulus@switch:~$ nv set bridge domain br_default vlan 10 ptp enable on
cumulus@switch:~$ nv set interface vlan10 type svi
cumulus@switch:~$ nv set interface vlan10 ip address 10.1.10.2/24
cumulus@switch:~$ nv set interface swp2 bridge domain br_default
cumulus@switch:~$ nv set interface swp2 bridge domain br_default access 10
cumulus@switch:~$ nv set interface swp2 ptp enable on
cumulus@switch:~$ nv config apply
You can configure only one address; either IPv4 or IPv6.
For IPv6, set the trunk port transport mode to ipv6.
The configuration writes to the /etc/ptp4l.conf file.
Configure NVUE to stop managing PTP configuration files:
cumulus@switch:~$ nv set system config apply ignore /etc/linuxptp/phc2sys.conf
cumulus@switch:~$ nv set system config apply ignore /etc/ptp4l.conf
cumulus@switch:~$ nv set system config apply ignore /etc/cumulus/switchd.d/ptp.conf
cumulus@switch:~$ nv config apply
Edit the /etc/cumulus/switchd.d/ptp.conf file to set the ptp.timestamping parameter to TRUE:
Restarting the switchd service causes all network ports to reset in addition to resetting the switch hardware configuration.
Edit the Default interface options section of the /etc/ptp4l.conf file to configure the interfaces on the switch that you want to use for PTP.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
#
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
[swp2]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
For a trunk VLAN, add the VLAN configuration to the switch port stanza: set l2_mode to trunk, vlan_intf to the VLAN interface, and src_ip to the IP address of the VLAN interface:
For a switch port VLAN, add the VLAN configuration to the switch port stanza: set l2_mode to access, vlan_intf to the VLAN interface, and src_ip to the IP address of the VLAN interface:
Cumulus Linux provides several ways to modify the default basic global configuration. You can:
Use profiles.
Modify the parameters directly with NVUE commands.
Modify the Linux /etc/ptp4l.conf file.
When a predefined profile is set, NVUE does not allow you to configure global parameters. Do not edit the Linux /etc/ptp4l.conf file to modify the global parameters when a predefined profile is in use. For information about profiles, see PTP Profiles.
Clock Domains
PTP domains allow different independent timing systems to be present in the same network without confusing each other. A PTP domain is a network or a portion of a network within which all the clocks synchronize. Every PTP message contains a domain number. A PTP instance works in only one domain and ignores messages that contain a different domain number. Cumulus Linux supports only one domain in the system.
You can specify multiple PTP clock domains. PTP isolates each domain from other domains so that each domain is a different PTP network. You can specify a number between 0 and 127.
The following example commands configure domain 3 when a profile is not set:
cumulus@switch:~$ nv set service ptp 1 domain 3
cumulus@switch:~$ nv config apply
Edit the Default Data Set section of the /etc/ptp4l.conf file to change the domainNumber setting, then restart the ptp4l service.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly 0
priority1 128
priority2 128
domainNumber 3
...
The Cumulus Linux switch provides the following clock timestamp modes:
One-step, where PTP adds the precise time that the Sync packet egresses the port to the packet. There is no need for a follow up packet.
Two-step, where PTP notes the precise time when the Sync packet egresses the port and sends it in a separate follow up message.
One-step mode significantly reduces the number of PTP messages. Two-step mode is the default configuration.
Cumulus Linux supports one-step mode on Spectrum-2 and later.
The following example commands configure one-step mode when a profile is not set:
cumulus@switch:~$ nv set service ptp 1 two-step off
cumulus@switch:~$ nv config apply
To revert the clock timestamp mode to the default setting (two-step mode), run the nv set service ptp 1 two-step on command.
To set the clock timestamp mode for a custom profile based on IEEE1588, ITU 8275-1 or ITU 8275-2, run the nv set service ptp <instance-id> profile <profile-id> two-step command. For example, to set one-step mode for the custom profile called CUSTOM1, run the nv set service ptp 1 profile CUSTOM1 two-step off command.
Edit the Default Data Set section of the /etc/ptp4l.conf file to change the twoStepFlag setting to 0, then restart the ptp4l service.
To revert the clock timestamp mode to the default setting (two-step mode), change the twoStepFlag setting to 1.
PTP Priority
The BMC selects the PTP master according to the criteria in the following order:
Priority 1
Clock class
Clock accuracy
Clock variance
Priority 2
Port ID
Use the PTP priority to select the best master clock. You can set priority 1 and 2:
Priority 1 overrides the clock class and quality selection criteria to select the best master clock.
Priority 2 identifies primary and backup clocks among identical redundant Grandmasters.
The range for both priority1 and priority2 is between 0 and 255. The default priority is 128. For the boundary clock, use a number above 128. The lower priority applies first.
The following example commands set priority 1 and priority 2 to 200 when a profile is not set:
cumulus@switch:~$ nv set service ptp 1 priority1 200
cumulus@switch:~$ nv set service ptp 1 priority2 200
cumulus@switch:~$ nv config apply
Edit the Default Data Set section of the /etc/ptp4l.conf file to change the priority1 and, or priority2 setting, then restart the ptp4l service.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly 0
priority1 200
priority2 200
domainNumber 3
...
Use the local priority when you create a custom profile based on a Telecom profile (ITU 8275-1 or ITU 8275-2). Modify the local priority in a custom profile to set the local priority of the local clock. You can set a value between 0 and 255. The default priority is 128.
The following example command configures the local priority to 10 for the custom profile called CUSTOM1, based on ITU 8275-2:
cumulus@switch:~$ nv set service ptp 1 profile CUSTOM1 local-priority 10
cumulus@switch:~$ nv config apply
Edit the G.8275.defaultDS.localPriority option in the /etc/ptp4l.conf file. After you save the /etc/ptp4l.conf file, restart the ptp4l service.
ITU-T specifies the following key elements to measure, test, and classify the accuracy of a clock:
Noise generation—jitter and wander noise in the output of a clock in reference to a PRTC.
Noise tolerance—how much noise the clock can tolerate before it switches to another stable source.
Noise transfer—smoothe out the input noise so that noise does not accumulate and increase over a network of clocks.
Transient response—the response from the clock to a transient.
Hold over—the time interval during which the clock maintains its output after losing the input reference signal.
Cumulus Linux PTP has an option to use a servo specifically designed to handle the ITU-T Noise Transfer specification. When you use this option, the PHC is disciplined by Noise Transfer Servo, which smoothes the jitter and wander noise from the Master clock.
To use Noise Transfer Servo, you need to enable SyncE on the switch and on PTP interfaces.
Cumulus Linux supports Noise Transfer Servo on Spectrum ASICs that support SyncE.
NVIDIA recommends you do not change the default Noise Transfer Servo configuration parameters.
NVIDIA recommends you use Noise Transfer Servo with PTP Telecom profiles. If you use other profiles or choose not to use a profile, make sure to set the sync interval to -3 or better.
To enable Noise Transfer Servo:
The following example enables PTP, sets the profile to default-itu-8275-1, enables SyncE, enables PTP on swp3, and enables Noise Transfer Servo.
cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set service ptp 1 current-profile default-itu-8275-1
cumulus@switch:~$ nv set system synce enable on
cumulus@switch:~$ nv set interface swp3 ptp enable on
cumulus@switch:~$ nv set service ptp 1 servo noise-transfer
cumulus@switch:~$ nv config apply
Edit the /etc/ptp4l.conf and the /etc/firefly_servo/servo.conf files; see examples below.
To show Noise Transfer Servo configuration settings, run the nv show service ptp <instance-id> servo command:
cumulus@switch:~$ nv show service ptp 1 servo
operational applied
----- ----------- --------------
servo noise-transfer
Optional Global Configuration
Optional global PTP configuration includes configuring the DiffServ code point (DSCP). You can configure the DSCP value for all PTP IPv4 packets originated locally. You can set a value between 0 and 63.
cumulus@switch:~$ nv set service ptp 1 ip-dscp 22
cumulus@switch:~$ nv config apply
Edit the Default Data Set section of the /etc/ptp4l.conf file to change the dscp_event setting for PTP messages that trigger a timestamp read from the clock and the dscp_general setting for PTP messages that carry commands, responses, information, or timestamps.
After you save the /etc/ptp4l.conf file, restart the ptp4l service.
Cumulus Linux provides several ways to modify the default basic interface configuration. You can:
Use profiles
Modify the parameters directly with NVUE commands
Modify the Linux /etc/ptp4l.conf configuration file.
When a profile is in use, avoid configuring the following interface configuration parameters with NVUE or in the Linux configuration file so that the interface retains its profile settings.
Transport Mode
By default, Cumulus Linux encapsulates PTP messages in UDP IPV4 frames. To encapsulate PTP messages on an interface in UDP IPV6 frames:
cumulus@switch:~$ nv set interface swp1 ptp transport ipv6
cumulus@switch:~$ nv config apply
Edit the Default interface options section of the /etc/ptp4l.conf file to change the network_transport setting for the interface, then restart the ptp4l service.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
network_transport RAWUDPv6
[swp2]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
network_transport RAWUDPv6
...
Cumulus Linux supports the following PTP message modes:
Multicast, where the ports subscribe to two multicast addresses, one for event messages with timestamps and the other for general messages without timestamps. The Sync message that the master sends is a multicast message; all slave ports receive this message because the slaves need the time from the master. The slave ports in turn generate a Delay Request to the master. This is a multicast message that the intended master for the message and other slave ports receive. Similarly, all slave ports in addition to the intended slave port receive the master’s Delay Response. The slave ports receiving the unintended Delay Requests and Responses need to drop the packets. This can affect network bandwidth if there are hundreds of slave ports.
Mixed, where Sync and Announce messages are multicast messages but Delay Request and Response messages are unicast. This avoids the issue seen in multicast message mode where every slave port sees Delay Requests and Responses from every other slave port.
Unicast, where you configure the port as a unicast client or server. See Unicast Mode.
Multicast mode is the default setting; when you enable PTP on an interface, the message mode is multicast.
To change the message mode to mixed on swp1:
cumulus@switch:~$ nv set interface swp1 ptp mixed-multicast-unicast on
cumulus@switch:~$ nv config apply
To change the message mode back to the default setting of multicast on swp1:
cumulus@switch:~$ nv set interface swp1 ptp mixed-multicast-unicast off
cumulus@switch:~$ nv config apply
Edit the Default interface options section of the /etc/ptp4l.conf file to add the hybrid_e2e 1 line under the interface, then restart the ptp4l service.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
hybrid_e2e 1
...
To change the message mode back to the default setting of multicast, remove the hybrid_e2e line under the interface, then restart the ptp4l service.
PTP Interface Timers
You can set the following timers for PTP messages.
Timer
Description
announce-interval
The average interval between successive Announce messages. Specify the value as a power of two in seconds.
announce-timeout
The number of announce intervals that have to occur without receiving an Announce message before a timeout occurs. Make sure that this value is longer than the announce-interval in your network.
delay-req-interval
The minimum average time interval allowed between successive Delay Required messages.
sync-interval
The interval between PTP synchronization messages on an interface. Specify the value as a power of two in seconds.
To set the timers with NVUE, run the nv set interface <interface> ptp timers <timer> <value> command.
To set the timers with Linux commands, edit the /etc/ptp4l.conf file and set the timers in the Default interface options section.
The following example sets the announce interval between successive Announce messages on swp1 to -1.
Edit the Default interface options section of the /etc/ptp4l.conf file:
To set the announce interval between successive Announce messages on swp1 to -1, add logAnnounceInterval -1 under the interface stanza.
To set the mean sync-interval for multicast messages on swp1 to -5, add logSyncInterval -5 under the interface stanza.
After you edit the /etc/ptp4l.conf file, restart the ptp4l service.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
logAnnounceInterval -1
logSyncInterval -5
udp_ttl 20
masterOnly 1
delay_mechanism E2E
...
Set the local priority on an interface for a profile that uses ITU 8275-1 or ITU 8275-2. You can set a value between 0 and 255. The default priority is 128.
The following example sets the local priority on swp1 to 10.
By default, PTP ports are in auto mode, where the BMC algorithm determines the state of the port.
You can configure Forced Master mode on a PTP port so that it is always in a master state and the BMC algorithm does not run for this port. This port ignores any Announce messages it receives.
cumulus@switch:~$ nv set interface swp1 ptp forced-master on
cumulus@switch:~$ nv config apply
Edit the Default interface options section of the /etc/ptp4l.conf file to change the masterOnly setting for the interface, then restart the ptp4l service.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
udp_ttl 1
masterOnly 1
delay_mechanism E2E
...
Edit the Default interface options section of the /etc/ptp4l.conf file to change the udp_ttl setting for the interface, then restart the ptp4l service.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
udp_ttl 20
masterOnly 1
delay_mechanism E2E
...
Cumulus Linux supports unicast mode so that a unicast client can perform Unicast Discover and Negotiation with servers. Unlike the default multicast mode, where both the server(master) and client(slave) start sending out announce requests and discover each other, in unicast mode, the client starts by sending out requests for unicast transmission. The client sends this to every server address in its Unicast Master Table. The server responds with an accept or deny to the request.
Global Unicast Configuration
Unicast clients need a unicast master table for unicast negotiation; you must configure at least one unicast master table on the switch.
To configure unicast globally:
Set the unicast table ID; a unique ID that identifies the unicast master table.
Set the unicast master address. You can set more than one unicast master address, which can be an IPv4, IPv6, or MAC address.
Optional: Set the unicast master query interval, which is the mean interval between requests for Announce messages. Specify this value as a power of two in seconds. You can specify a value between -3 and 4. The default value is -0 (2 power).
cumulus@switch:~$ nv set service ptp 1 unicast-master 1 address 10.10.10.1
cumulus@switch:~$ nv set service ptp 1 unicast-master 1 query-interval 4
cumulus@switch:~$ nv set interface swp1 ptp unicast-master-table-id 1
cumulus@switch:~$ nv config apply
Add the following lines at the end of the # Default interface options section of the /etc/ptp4l.conf file:
For interface unicast configuration, in addition to enabling PTP on an interface, you also need to configure the PTP interface to be either a unicast client or a unicast server.
When configuring multiple PTP interfaces on the switch to be unicast clients, you must configure a unicast table ID on every interface set as a unicast client. Each client must have a different table ID.
To configure a PTP interface to be the unicast client:
To show the unicast master table configuration on the switch, run the nv show service ptp <instance-id> unicast-master <table-id> command.
Optional Unicast Interface Configuration
You can set the unicast request duration for unicast clients, which is the service time in seconds requested by the unicast client during unicast negotiation. The default value is 300 seconds.
PTP profiles are a standardized set of configurations and rules intended to meet the requirements of a specific application. Profiles define required, allowed, and restricted PTP options, network restrictions, and performance requirements.
Cumulus Linux supports the following predefined profiles:
IEEE 1588
ITU 8275-1
ITU 8275-2
Application
Enterprise
Mobile Networks
Mobile Networks
Transport
Layer 2 and Layer 3
Layer 2
Layer 3
Encapsulation
802.3, UDPv4, or UDPv6
802.3
UDPv4 or UDPv6
Transmission
Unicast and Multicast
Multicast
Unicast
Supported Clock Types
Boundary Clock
Boundary Clock
Boundary Clock
You cannot modify the predefined profiles. If you want to set a parameter to a different value in a predefined profile, you need to create a custom profile. You can modify a custom profile within the range applicable to the profile type.
You cannot set the current profile to a profile not yet created.
You cannot set global PTP parameters in a profile currently in use.
PTP profiles do not support VLANs or bonds.
If you set a predefined or custom profile, do not change any global PTP settings, such as the DSCP or the clock domain.
For better performance in a high scale network with PTP on multiple interfaces, configure a higher system policer rate with the nv set system control-plane policer lldp-ptp burst <value> and nv set system control-plane policer lldp-ptp rate <value> commands. The switch uses the LLDP policer for PTP protocol packets. The default value for the LLDP policer is 2500. When you use the ITU 8275.1 profile with higher sync rates, use higher policer values.
Set a Predefined Profile
To set a predefined profile:
To set the ITU 8275.1 profile, run the nv set service ptp <instance-id> current-profile default-itu-8275-1 command.
To set the ITU 8275.2 profile, run the nv set service ptp <instance-id> current-profile default-itu-8275-2 command.
The following example sets the profile to ITU 8275.1
cumulus@switch:~$ nv set service ptp 1 current-profile default-itu-8275-1
cumulus@switch:~$ nv config apply
To set the IEEE 1588 profile:
cumulus@switch:~$ nv set service ptp 1 current-profile default-1588
cumulus@switch:~$ nv config apply
To set the predefined ITU 8275.1 profile, edit the /etc/ptp4l.conf file and set the parameters shown below, then restart the ptp4l service:
Set the profile type on which to base the new profile (itu-g-8275-1itu-g-8275-2, or ieee-1588).
Update any of the profile settings you want to change (announce-interval, delay-req-interval, priority1, sync-interval, announce-timeout, domain, priority2, transport, delay-mechanism, local-priority).
Set the custom profile to be the current profile.
The following example commands create a custom profile called CUSTOM1 based on the predefined profile ITU 8275-1. The commands set the domain to 28 and the announce-timeout to 3, then set CUSTOM1 to be the current profile:
cumulus@switch:~$ nv set service ptp 1 profile CUSTOM1
cumulus@switch:~$ nv set service ptp 1 profile CUSTOM1 profile-type itu-g-8275-1
cumulus@switch:~$ nv set service ptp 1 profile CUSTOM1 domain 28
cumulus@switch:~$ nv set service ptp 1 profile CUSTOM1 announce-timeout 3
cumulus@switch:~$ nv set service ptp 1 current-profile CUSTOM1
cumulus@switch:~$ nv config apply
The following example /etc/ptp4l.conf file creates a custom profile based on the predefined profile ITU 8275-1 and sets the domain to 28 and the announce-timeout to 3.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
[global]
#
# Default Data Set
#
slaveOnly 0
priority1 128
priority2 128
domainNumber 28
twoStepFlag 1
dscp_event 46
dscp_general 46
network_transport L2
dataset_comparison G.8275.x
G.8275.defaultDS.localPriority 128
ptp_dst_mac 01:80:C2:00:00:0E
#
# Port Data Set
#
logAnnounceInterval 5
logSyncInterval -4
logMinDelayReqInterval -4
announceReceiptTimeout 3
delay_mechanism E2E
offset_from_master_min_threshold -50
offset_from_master_max_threshold 50
mean_path_delay_threshold 200
tsmonitor_num_ts 100
tsmonitor_num_log_sets 3
tsmonitor_num_log_entries 4
tsmonitor_log_wait_seconds 1
#
# Run time options
#
logging_level 6
path_trace_enabled 0
use_syslog 1
verbose 0
summary_interval 0
#
# servo parameters
#
pi_proportional_const 0.000000
pi_integral_const 0.000000
pi_proportional_scale 0.700000
pi_proportional_exponent -0.300000
pi_proportional_norm_max 0.700000
pi_integral_scale 0.300000
pi_integral_exponent 0.400000
pi_integral_norm_max 0.300000
step_threshold 0.000002
first_step_threshold 0.000020
max_frequency 900000000
sanity_freq_limit 0
#
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
[swp2]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
To show the current PTP profile setting, run the nv show service ptp <ptp-instance> command:
cumulus@switch:~$ nv show service ptp 1
operational applied description
--------------------------- ----------- ------------------ --------------------------------------------------------------------
enable on on Turn the feature 'on' or 'off'. The default is 'off'.
current-profile default-itu-8275-1 Current PTP profile index
domain 24 0 Domain number of the current syntonization
ip-dscp 46 46 Sets the Diffserv code point for all PTP packets originated locally.
priority1 128 128 Priority1 attribute of the local clock
priority2 128 128 Priority2 attribute of the local clock
...
To show the settings for a profile, run the nv show service ptp <instance> profile <profile-name> command:
The acceptable master table option is a security feature that prevents a rogue player from pretending to be the grandmaster clock to take over the PTP network. To use this feature, you configure the clock IDs of known grandmaster clocks in the acceptable master table and set the acceptable master table option on a PTP port. The BMC algorithm checks if the grandmaster clock received in the Announce message is in this table before proceeding with the master selection. Cumulus Linux disables this option by default on PTP ports.
The following example command adds the grandmaster clock ID 24:8a:07:ff:fe:f4:16:06 to the acceptable master table and enables the PTP acceptable master table option for swp1:
cumulus@switch:~$ nv set service ptp 1 acceptable-master 24:8a:07:ff:fe:f4:16:06
cumulus@switch:~$ nv config apply
You can also configure an alternate priority 1 value for the Grandmaster:
cumulus@switch:~$ nv set service ptp 1 acceptable-master 24:8a:07:ff:fe:f4:16:06 alt-priority 2
To enable the PTP acceptable master table option for swp1:
cumulus@switch:~$ nv set interface swp1 ptp acceptable-master on
cumulus@switch:~$ nv config apply
Edit the Default interface options section of the /etc/ptp4l.conf file to add acceptable_master_clockIdentity 248a07.fffe.f41606.
To enable the PTP acceptable master table option for swp1, add acceptable_master on under [swp1].
...
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
udp_ttl 20
masterOnly 1
delay_mechanism E2E
acceptable_master on
...
Cumulus Linux provides the following optional PTP monitoring configuration.
Configure Clock TimeStamp and Path Delay Thresholds
Cumulus Linux monitors clock timestamp and path delay against thresholds, and generates counters when PTP reaches the set thresholds. You can see the counters in the NVUE nv show command output and in log messages.
You can configure the following monitor settings:
Command
Description
nv set service ptp <instance> monitor min-offset-threshold
Sets the minimum difference allowed between the master and slave time. You can set a value between -1000000000 and 0 nanoseconds. The default value is -50 nanoseconds.
nv set service ptp <instance> monitor max-offset-threshold
Sets the maximum difference allowed between the master and slave time. You can set a value between 0 and 1000000000 nanoseconds. The default value is 50 nanoseconds.
nv set service ptp <instance> monitor path-delay-threshold
Sets the mean time that PTP packets take to travel between the master and slave. You can set a value between 0 and 1000000000 nanoseconds. The default value is 200 nanoseconds.
nv set service ptp <instance> monitor max-timestamp-entries
Sets the maximum number of timestamp entries allowed. Cumulus Linux updates the timestamps continuously. You can specify a value between 100 and 200. The default value is 100 entries.
The following example sets the minimum offset threshold to -1000, the maximum offset threshold to 1000, and the path delay threshold to 300:
cumulus@switch:~$ nv set service ptp 1 monitor min-offset-threshold -1000
cumulus@switch:~$ nv set service ptp 1 monitor max-offset-threshold 1000
cumulus@switch:~$ nv set service ptp 1 monitor path-delay-threshold 300
cumulus@switch:~$ nv config apply
You can configure the following monitor settings manually in the /etc/ptp4l.conf file. Be sure to run the sudo systemctl restart ptp4l.service to apply the settings.
Parameter
Description
offset_from_master_min_threshold
Sets the minimum difference allowed between the master and slave time. You can set a value between -1000000000 and 0 nanoseconds. The default value is -50 nanoseconds.
offset_from_master_max_threshold
Sets the maximum difference allowed between the master and slave time. You can set a value between 0 and 1000000000 nanoseconds. The default value is 50 nanoseconds.
mean_path_delay_threshold
Sets the mean time that PTP packets take to travel between the master and slave. You can set a value between 0 and 1000000000 nanoseconds. The default value is 200 nanoseconds.
The following example sets the minimum offset threshold to -1000, the maximum offset threshold to 1000, and the path delay threshold to 300:
A log set contains the log entries for clock timestamp and path delay violations at different times. You can set the number of entries to log and the interval between successive violation logs.
Command
Description
nv set service ptp 1 monitor max-violation-log-sets
Sets the maximum number of log sets allowed. You can specify a value between 2 and 4. The default value is 3.
nv set service ptp 1 monitor max-violation-log-entries
Sets the maximum number of log entries allowed in a log set. You can specify a value between 4 and 8. The default value is 4.
nv set service ptp 1 monitor violation-log-interval
Sets the number of seconds to wait before logging back-to-back violations. You can specify a value between 0 and 60. The default value is 1.
The following example sets the maximum number of log sets allowed to 4, the maximum number of log entries allowed to 6, and the violation log interval to 10:
cumulus@switch:~$ nv set service ptp 1 monitor max-violation-log-sets 4
cumulus@switch:~$ nv set service ptp 1 monitor max-violation-log-entries 6
cumulus@switch:~$ nv set service ptp 1 monitor violation-log-interval 10
cumulus@switch:~$ nv config apply
You can configure the following monitor settings manually in the /etc/ptp4l.conf file. Be sure to run the sudo systemctl restart ptp4l.service to apply the settings.
Parameter
Description
tsmonitor_num_log_sets
Sets the maximum number of log sets allowed. You can specify a value between 2 and 4. The default value is 3.
tsmonitor_num_log_entries
Sets the maximum number of log entries allowed in a log set. You can specify a value between 4 and 8. The default value is 4.
tsmonitor_log_wait_seconds
Sets the number of seconds to wait before logging back-to-back violations. You can specify a value between 0 and 60. The default value is 1.
The following example sets the maximum number of log sets allowed to 4, the maximum number of log entries allowed to 6, and the violation log interval to 10:
To delete PTP configuration, delete the PTP master and slave interfaces. The following example commands delete the PTP interfaces swp1, swp2, and swp3.
Edit the /etc/ptp4l.conf file to remove the interfaces from the Default interface options section, then restart the ptp4l service.
cumulus@switch:~$ sudo nano /etc/ptp4l.conf
...
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
You can drill down with the following nv show service ptp <instance> commands:
nv show service ptp <instance> acceptable-master shows acceptable master configuration.
nv show service ptp <instance> clock-quality shows the clock quality status.
nv show service ptp <instance> current shows the local states learned during PTP message exchange.
nv show service ptp <instance> domain shows the domain configuration.
nv show service ptp <instance> ip-dscp shows PTP DSCP configuration.
nv show service ptp <instance> monitor shows PTP monitor configuration.
nv show service ptp <instance> profile shows PTP profile configuration.
nv show service ptp <instance> parent shows the local states learned during PTP message exchange.
nv show service ptp <instance> priority1 shows PTP priority1 configuration.
nv show service ptp <instance> priority2 shows PTP priority2 configuration.
nv show service ptp <instance> status shows the status of all PTP interfaces.
nv show service ptp <instance> time-properties shows the clock time attributes.
nv show service ptp <instance> unicast-master shows the unicast master configuration.
Show PTP Interface Configuration
To check configuration for a PTP interface, run the nv show interface <interface> ptp command.
cumulus@switch:~$ nv show interface swp1 ptp
operational applied description
------------------------- ----------- ---------- ----------------------------------------------------------------------
enable on Turn the feature 'on' or 'off'. The default is 'off'.
acceptable-master off Determines if acceptable master check is enabled for this interface.
delay-mechanism end-to-end end-to-end Mode in which PTP message is transmitted.
forced-master off off Configures PTP interfaces to forced master state.
instance 1 PTP instance number.
mixed-multicast-unicast off Enables Multicast for Announce, Sync and Followup and Unicast for D...
transport ipv4 ipv4 Transport method for the PTP messages.
ttl 1 1 Maximum number of hops the PTP messages can make before it gets dro...
unicast-request-duration 300 The service time in seconds to be requested during discovery.
timers
announce-interval 0 0 Mean time interval between successive Announce messages. It's spec...
announce-timeout 3 3 The number of announceIntervals that have to pass without receipt o...
delay-req-interval -3 -3 The minimum permitted mean time interval between successive Delay R...
sync-interval -3 -3 The mean SyncInterval for multicast messages. It's specified as a...
peer-mean-path-delay 0 An estimate of the current one-way propagation delay on the link wh...
port-state master State of the port
protocol-version 2 The PTP version in use on the port
Show PTP Counters
To show all PTP counters, run the nv show service ptp <instance> counters command:
cumulus@switch:~$ nv show service ptp 1 counters
Packet Type Received Transmitted
--------------------- ------------ ------------
Port swp4
Announce 0 10370
Sync 0 20731
Follow-up 0 20731
Delay Request 0 0
Delay Response 0 0
Peer Delay Request 0 0
Peer Delay Response 0 0
Management 0 0
Signaling 0 0
To show PTP counters for an interface, run the nv show interface <interface> counters ptp command.
To clear PTP counters for an interface, run the nv action clear interface <interface> counters ptp command:
To show the status of all PTP interfaces, run the nv show service ptp <instance> status command.
The command output shows the PTP enabled ports, the PTP port mode (unicast or multicast), the state of the port based on BMCA, the unicast state, and identifies the server address to which the client connects.
cumulus@switch:~$ nv show service ptp 1 status
Port Mode State Ustate Server
----- ----- ------- ------------------------------- -------
swp9 Ucast SLAVE Sync and Delay Granted (H_SYDY) 9.9.9.2
swp10 Ucast PASSIVE Initial State (WAIT)
swp11 Ucast PASSIVE Initial State (WAIT)
swp12 Ucast PASSIVE Initial State (WAIT)
Show the List of NVUE PTP Commands
To see a full list of NVUE show commands for PTP, run the nv list-commands service ptp command.
To show a full list of show commands for a PTP interface, run the nv list-commands | grep 'nv show interface <interface-id> ptp' command.
cumulus@switch:~$ nv list-commands service ptp
nv show service ptp
nv show service ptp <instance-id>
nv show service ptp <instance-id> status
nv show service ptp <instance-id> domain
nv show service ptp <instance-id> priority1
nv show service ptp <instance-id> priority2
nv show service ptp <instance-id> ip-dscp
nv show service ptp <instance-id> acceptable-master
...
cumulus@switch:~$ nv list-commands | grep 'nv show interface <interface-id> ptp'
...
nv show interface <interface-id> ptp
nv show interface <interface-id> ptp timers
nv show interface <interface-id> ptp shaper
...
Example Configuration
In the following example, the boundary clock on the switch receives time from Master 1 (the grandmaster) on PTP slave port swp1, sets its clock and passes the time down through PTP master ports swp2, swp3, and swp4 to the hosts that receive the time.
The following example configuration assumes that you have already configured the layer 3 routed interfaces (swp1, swp2, swp3, and swp4) you want to use for PTP.
cumulus@switch:~$ nv set service ptp 1 enable on
cumulus@switch:~$ nv set service ptp 1 priority2 254
cumulus@switch:~$ nv set service ptp 1 priority1 254
cumulus@switch:~$ nv set service ptp 1 domain 3
cumulus@switch:~$ nv set interface swp1 ptp enable on
cumulus@switch:~$ nv set interface swp2 ptp enable on
cumulus@switch:~$ nv set interface swp3 ptp enable on
cumulus@switch:~$ nv set interface swp4 ptp enable on
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo cat /etc/nvue.d/startup.yaml
- set:
interface:
lo:
ip:
address:
10.10.10.1/32: {}
type: loopback
swp1:
ptp:
enable: on
type: swp
swp2:
ptp:
enable: on
type: swp
swp3:
ptp:
enable: on
type: swp
swp4:
ptp:
enable: on
type: swp
service:
ptp:
'1':
domain: 3
enable: on
priority1: 254
priority2: 254
cumulus@switch:~$ sudo cat /etc/ptp4l.conf
...
[global]
#
# Default Data Set
#
slaveOnly 0
priority1 254
priority2 254
domainNumber 3
twoStepFlag 1
dscp_event 46
dscp_general 46
offset_from_master_min_threshold -50
offset_from_master_max_threshold 50
mean_path_delay_threshold 200
tsmonitor_num_ts 100
tsmonitor_num_log_sets 2
tsmonitor_num_log_entries 4
tsmonitor_log_wait_seconds 1
#
# Run time options
#
logging_level 6
path_trace_enabled 0
use_syslog 1
verbose 0
summary_interval 0
#
# servo parameters
#
pi_proportional_const 0.000000
pi_integral_const 0.000000
pi_proportional_scale 0.700000
pi_proportional_exponent -0.300000
pi_proportional_norm_max 0.700000
pi_integral_scale 0.300000
pi_integral_exponent 0.400000
pi_integral_norm_max 0.300000
step_threshold 0.000002
first_step_threshold 0.000020
max_frequency 900000000
sanity_freq_limit 0
#
# Default interface options
#
time_stamping hardware
# Interfaces in which ptp should be enabled
# these interfaces should be routed ports
# if an interface does not have an ip address
# the ptp4l will not work as expected.
[swp1]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
network_transport RAWUDPv4
[swp2]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
network_transport RAWUDPv4
[swp3]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
network_transport RAWUDPv4
[swp4]
udp_ttl 1
masterOnly 0
delay_mechanism E2E
network_transport RAWUDPv4
Considerations
PTP Version
Cumulus Linux uses a linuxptp package that is PTP v2.1 compliant, and sets the major PTP version to 2 and the minor PTP version to 1 by default in the configuration. If your PTP configuration does not work correctly when the minor version is set, you can change the minor version to 0.
cumulus@switch:~$ nv set service ptp 1 force-version 2.0
cumulus@switch:~$ nv config apply
To set the minor PTP version back to the default, run the nv unset service ptp 1 force-version command.
Edit the /etc/ptp4l.conf file to add ptp_minor_version 0 to the Global section, then restart the ptp4l service.
To set the minor PTP version back to the default value (1), remove ptp_minor_version 0 from the Global section of the /etc/ptp4l.conf file, then restart the ptp4l service.
To show that the PTP minor version is now 0, run the nv show service ptp <instance> force-version command:
cumulus@switch:~$ nv show service ptp 1 force-version
applied
------------- -------
force-version 2.0
PTP Traffic Shaping
To improve performance on the NVIDIA Spectrum 1 switch for PTP-enabled ports with speeds lower than 100G, you can enable a pre-defined traffic shaping profile. For example, if you see that the PTP timing offset varies widely and does not stabilize, enable PTP shaping on all PTP enabled ports to reduce the bandwidth on the ports slightly and improve timing stabilization.
Switches with Spectrum-2 and later do not support PTP shaping.
Bonds do not support PTP shaping.
You cannot configure QoS traffic shaping and PTP traffic shaping on the same ports.
You must configure a strict priority for PTP traffic; for example:
cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 0-5,7 mode dwrr
cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 0-5,7 bw-percent 12
cumulus@switch:~$ nv set qos egress-scheduler default-global traffic-class 6 mode strict
For each PTP-enabled port on which you want to set traffic shaping, run the nv set interface <interface> ptp shaper enable on command.
cumulus@switch:~$ nv set interface swp1 ptp shaper enable on
cumulus@switch:~$ nv set interface swp2 ptp shaper enable on
cumulus@switch:~$ nv config apply
To see the PTP shaping setting for an interface, run the nv show interface <interface> ptp shaper command:
cumulus@switch:~$ nv show interface swp1 ptp shaper
operational applied
------ ----------- -------
enable on
In the /etc/cumulus/switchd.d/ptp_shaper.conf file, set the following parameters for the interfaces to which you want to apply traffic shaping and enable the traffic shaper. You must reload switchd for the changes to take effect.
PTP frames are affected by STP filtering; events, such as an STP topology change (where ports temporarily go into the blocking state), can cause interruptions to PTP communications.
If you configure PTP on bridge ports, NVIDIA recommends that the bridge ports are spanning tree edge ports or in a bridge domain where spanning tree is disabled.
Pulse Per Second - PPS
PPS is the simplest form of synchronization. The PPS source provides a signal precisely every second. The switch is capable of using an external PPS signal to synchronize its PHC (for PPS In) and can also generate the PPS signal that other devices can use to synchronize their clocks (for PPS Out).
In PPS In mode, the switch can use an external PPS signal to adjust the phase of its PHC. The PPS signal does not provide ToD, so Cumulus Linux uses PTP for ToD; you must configure a PTP slave port on the switch for PPS In.
In PPS Out mode, the switch can output the PPS signal. The switch can use this signal to check the accuracy of its PHC frequency and other devices can use this signal to synchronize their PHC.
Cumulus Linux supports PPS for the NVIDIA SN3750-SX switch only.
cumulus@switch:~$ nv set platform pulse-per-second in state enabled
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv set platform pulse-per-second out state enabled
cumulus@switch:~$ nv config apply
Edit the Default interface options section of the /etc/ptp4l.conf file to configure the PTP slave port on the switch. PPS In requires PTP slave port. See Precision Time Protocol - PTP for information about PTP.
Edit the /etc/linuxptp/pps_out.conf file to set the following parameters.
cumulus@switch:~$ sudo nano /etc/linuxptp/pps_out.conf
# Configuration file used for the pps_out.service
# It is shell formatted and the file is source'd by the service
# Set the PTP device to source our PPS from.
# If not specified, the service will find the first device with a clock name "sx_ptp".
PTP_DEV=/dev/ptp1
# Set the pin index on the PPS device to send on.
# On the NVIDIA systems, only pin 1 (0-based) is supported
OUT_PIN=1
# Set the file where to cache the last started values.
# This is used primarily in the "stop" operation to know what to clean up.
CACHE_FILE=/var/run/pps_out
# Set the out pulse charateristics for frequency and width
PULSE_FREQ=1000000000
PULSE_WIDTH=500000000
PULSE_PHASE=0
Sets the channel index for PPS In. You can set a value of 1 or 0. The default value is 0.
logging-level
Sets the logging level for PPS In. You can specify emergency, alert, critical, error, warning, notice, info, or debug. The default logging level is info.
pin-index
Sets the pin index for PPS In. You can set a value of 1 or 0. The default value is 0.
signal-polarity
Sets the polarity of the PPS In signal. You can specify rising-edge, falling-edge, or both. The default setting is rising-edge.
signal-width
Sets the pulse width of the PPS In signal. You can set a value between 1000000 and 999000000. The default value is 500000000.
timestamp-correction
Sets the value, in nanoseconds, to add to each PPS In timestamp. You can set a value between -1000000000 and 1000000000. The default value is 0.
PPS Out Setting
Description
channel-index
Sets the channel index for PPS Out. You can set a value of 1 or 0. The default value is 0.
frequency-adjustment
Sets the frequency adjustment of the PPS Out signal. You can set a value between 1000000000 and 2147483647. The default value is 1000000000.
pin-index
Sets the pin index for PPS Out. Cumulus Linux supports only pin 1.
signal-width
Sets the pulse width of the PPS Out signal. You can set a value between 1000000 and 999000000. The default value is 500000000.
The NVUE CLI includes the phase adjustment setting for PPS Out. Cumulus Linux 5.8 does not support this setting.
The following example configures PPS In and sets:
The channel index to 1.
The pin index to 1.
The signal width to 999000000.
The timestamp correction to 1000000000.
The logging level to warning.
The polarity of the PPS In signal to falling-edge.
cumulus@switch:~$ nv set platform pulse-per-second in channel-index 1
cumulus@switch:~$ nv set platform pulse-per-second in pin-index 1
cumulus@switch:~$ nv set platform pulse-per-second in signal-width 999000000
cumulus@switch:~$ nv set platform pulse-per-second in timestamp-correction 1000000000
cumulus@switch:~$ nv set platform pulse-per-second in logging-level warning
cumulus@switch:~$ nv set platform pulse-per-second in signal-polarity falling-edge
cumulus@switch:~$ nv config apply
The following example configures PPS Out and sets:
The channel index to 1.
The signal width to 999000000.
The frequency-adjustment of the PPS Out signal to 2147483647.
cumulus@switch:~$ nv set platform pulse-per-second out channel-index 1
cumulus@switch:~$ nv set platform pulse-per-second out signal-width 999000000
cumulus@switch:~$ nv set platform pulse-per-second out frequency-adjustment 2147483647
cumulus@switch:~$ nv config apply
To configure PPS In, edit the /etc/linuxptp/ts2phc.conf file, then restart the PPS In service with the sudo systemctl restart ts2phc.service command.
The following example configures PPS In and sets:
The channel index to 1
The pin index to 1
The signal width to 999000000.
The timestamp correction to 1000000000.
The logging level to 4 (warning).
The polarity of the PPS In signal to falling edge (falling).
To configure PPS Out, edit the /etc/linuxptp/pps_out.conf.conf file, then restart the PPS Out service with the sudo systemctl restart pps_out.service command.
The following example configures PPS Out and sets:
The channel index to 1.
The signal width to 999000000.
The frequency-adjustment of the PPS Out signal to 2147483647.
cumulus@switch:~$ sudo nano /etc/linuxptp/pps_out.conf.conf
# Configuration file used for the pps_out.service
# It is shell formatted and the file is source'd by the service
#
# Set the PTP device to source our PPS from.
# If not specified, the service will find the first device with a clock name "sx_ptp".
PTP_DEV=/dev/ptp1
#
# Set the pin index on the PPS device to send on.
# On the NVIDIA systems, only pin 1 (0-based) is supported
OUT_PIN=1
#
OUT_CHANNEL=1
#
# Set the file where to cache the last started values.
# This is used primarily in the "stop" operation to know what to clean up.
CACHE_FILE=/var/run/pps_out
#
# Set the out pulse charateristics for frequency and width
PULSE_FREQ=2147483647
PULSE_WIDTH=999000000
PULSE_PHASE=1000000000
Show PPS Configuration Settings
To show a summary of the PPS In and PPS out configuration settings, run the nv show platform pulse-per-second command:
cumulus@switch:~$ nv show platform pulse-per-second
applied
---------------------- -----------
in
state enabled
pin-index 0
channel-index 0
signal-width 500000000
signal-polarity rising-edge
timestamp-correction 0
logging-level info
out
state disabled
pin-index 1
channel-index 0
frequency-adjustment 1000000000
phase-adjustment 0
signal-width 500000000
To show only PPS In configuration settings, run the nv show platform pulse-per-second in command:
cumulus@switch:~$ nv show platform pulse-per-second in
applied
-------------------- -----------
state enabled
pin-index 0
channel-index 0
signal-width 500000000
signal-polarity rising-edge
timestamp-correction 0
logging-level info
To show only PPS Out configuration settings, run the nv show platform pulse-per-second out command:
cumulus@switch:~$ nv show platform pulse-per-second out
applied
-------------------- ----------
state disabled
pin-index 1
channel-index 0
frequency-adjustment 1000000000
phase-adjustment 0
signal-width 500000000
Synchronous Ethernet - SyncE
SyncE is an ITU-T standard for transmitting clock signals over the Ethernet physical layer to synchronize clocks across the network by propagating frequency using the transmission rate of symbols in the network. A dedicated channel, ESMC manages this synchronization, as specified by the ITU-T Rec. G.8264 standard.
The Cumulus Linux switch includes a SyncE controller and a SyncE daemon.
The SyncE controller reads performance counters to calculate the differences between transmit and receive ethernet symbols on the physical layer to fine tune the clock frequency.
The SyncE daemon (synced):
Manages transmitting and receiving SSMs on all SyncE enabled ports using the Ethernet Synchronization Messaging Channel (ESMC).
Manages the synchronization hierarchy and runs the master selection algorithm to choose the best reference clock from the QL in the SSM.
Cumulus Linux constructs the SyncE clock identity as follows:
The higher order 3 bytes are from the higher order 3 bytes (OUI) of the base MAC address.
The middle 2 bytes are 0xff 0xfe.
The lower order 3 bytes are from the lower order 3 bytes of the base MAC address.
Cumulus Linux supports SyncE for the NVIDIA SN3750-SX switch only.
SyncE on 1G interfaces only supports 1000BASE-SX transceivers, 1000BASE-LX transceivers, and ADVA 5401 GrandMaster transceivers.
Basic Configuration
Basic SyncE configuration requires you:
Enable SyncE on the switch.
Configure SyncE on at least one interface so that the interface is a timing source that passes to the selection algorithm.
Set the SyncE bundle ID to prevent loops if more than one link comes from the same clock source. You can set a value between 0 and 256. A value of 0 indicates no bundle.
The basic configuration shown below uses the default SyncE settings:
The amount of time SyncE waits after the interface comes up before using the interface for synchronization is set to 5 minutes.
cumulus@switch:~$ nv set system synce enable on
cumulus@switch:~$ nv set interface swp2 synce enable on
cumulus@switch:~$ nv set interface swp2 synce bundle-id 10
cumulus@switch:~$ nv config apply
Edit the /etc/synced/synced.conf file to configure the interface, then enable and start the SyncE service. Adding an interface section in the /etc/synced/synced.conf file enables SyncE on that interface.
The following example enables SyncE on swp2.
cumulus@switch:~$ sudo nano /etc/synced/synced.conf
....
# NVUE SyncE state is enable on
[global]
twtr_seconds=300
priority=1
[swp2]
bundle=10
The wait to restore time is the number of seconds SyncE waits for each port to be up before opening the Ethernet Synchronization Message Channel (ESMC) for messages. You can set a value between 0 and 720 (12) minutes. The default value is 300 seconds (5 minutes).
The following command example sets the wait to restore time to 180 seconds (3 minutes):
cumulus@switch:~$ nv set system synce wait-to-restore-time 180
cumulus@switch:~$ nv config apply
Edit the /etc/synced/synced.conf file to change the twtr_seconds setting, then restart the SyncE service.
You can set the priority for the clock source. The lowest priority is 1 and the highest priority is 256. If two clock sources have the same priority, the switch uses the lowest clock source.
The following example command sets the priority to 256:
cumulus@switch:~$ nv set system synce provider-default-priority 256
cumulus@switch:~$ nv config apply
Edit the /etc/synced/synced.conf file to change the priority setting, then restart the SyncE service.
The clock selection algorithm uses the frequency source priority on an interface to choose between two sources that have the same QL. You can specify a value between 1 (the highest priority) and 256 (the lowest priority). The default value is 1.
The following command example sets the priority on swp2 to 10, on swp2 to 20, and on swp3 to 10:
cumulus@switch:~$ nv set interface swp1 synce provider-priority 10
cumulus@switch:~$ nv set interface swp2 synce provider-priority 20
cumulus@switch:~$ nv set interface swp3 synce provider-priority 10
cumulus@switch:~$ nv config apply
Edit the /etc/synced.conf file to change the priority setting for the interface, then restart the SyncE service.
To show global SyncE configuration, run the NVUE nv show system synce command or the Linux syncectl show status command.
To show SyncE configuration for a specific interface, run the NVUE nv show interface <interface-id> synce command or the Linux syncectl show interface status <interface> command.
cumulus@switch:~$ nv show system synce
operational applied
------------------------- ----------------------------------------------------------------- -------
enable On on
log-level notice
provider-default-priority 10 10
wait-to-restore-time 40 40
clock-identity 0x849e00fffe00ca00
local-clock-quality eec1
network-type 1
summary Group #0: TRACKING holdover acquired on swp1. freq_diff: 77 (ppb)
To show SyncE statistics for a specific interface, run the NVUE nv show interface <interface-id> counters synce command or the Linux syncectl show interface counters <interface command:
To clear counters for a specific SyncE interface, run the NVUE nv action clear interface <interface> counters synce command or the Linux syncectl clear interface counters <interface> command.
This section describes how to set up user accounts and ssh for remote access, and configure LDAP authentication, TACACS+, and RADIUS AAA.
SSH for Remote Access
Cumulus Linux uses the OpenSSH package to provide access to the system using the Secure Shell (SSH) protocol.
Configure SSH
You can configure SSH to provide login access to the root user and to specific user accounts, limit SSH to listen on a specific VRF, and configure timeouts and session options.
Root User Settings
By default, the root account cannot use SSH to log in.
You can configure the root account to use SSH to log into the switch with:
A password
A public key or any allowed mechanism that is not a password and not keyboard interactive. This is the default setting.
A set of commands defined in the authorized_keys file.
To allow the root account to SSH into the switch with a password:
cumulus@switch:~$ nv set system ssh-server permit-root-login enabled
cumulus@switch:~$ nv config apply
Run the nv set system ssh-server permit-root-login disabled command to disable SSH login for the root account with a password.
To allow the root account to SSH into the switch and authenticate with a public key or any allowed mechanism that is not a password and not keyboard interactive:
cumulus@switch:~$ nv set system ssh-server permit-root-login prohibit-password
cumulus@switch:~$ nv config apply
To allow the root account to SSH into the switch and only run a set of commands defined in the authorized_keys file:
cumulus@switch:~$ nv set system ssh-server permit-root-login forced-commands-only
cumulus@switch:~$ nv config apply
To allow the root account to SSH into the switch using a password, edit the /etc/ssh/sshd_config file and set the PermitRootLogin option to yes:
Set the PermitRootLogin command to no to disable SSH login with a password.
To allow the root account to SSH into the switch and authenticate with a public key or any allowed mechanism that is not a password and not keyboard interactive:
As a privileged user (such as the cumulus user), either echo the public key contents and redirect the contents to the authorized key file or copy the public key file to the switch, then copy it to the root account (with privilege escalation).
To echo the public key contents and redirect the contents to the authorized key file:
cumulus@switch:~$ echo "<SSH public key contents>" | sudo tee -a /root/.ssh/authorized_keys
cumulus@switch:~$ sudo chmod 0644 /root/.ssh/authorized_keys
To copy the public key file to the switch, then copy it to the root account:
To allow certain users to establish an SSH session:
cumulus@switch:~$ nv set system ssh-server allow-users user1
cumulus@switch:~$ nv config apply
To deny certain users to establish an SSH session:
cumulus@switch:~$ nv set system ssh-server deny-users user4
cumulus@switch:~$ nv config apply
To allow certain users to establish an SSH session, edit the /etc/ssh/sshd_config file and add the AllowUsers parameter:
cumulus@switch:~$ sudo cat /etc/ssh/sshd_config
...
...
# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server
AllowUsers = user1
To deny certain users to establish an SSH session, edit the /etc/ssh/sshd_config file and add the DenyUsers parameter:
cumulus@switch:~$ sudo cat /etc/ssh/sshd_config
...
# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server
AllowUsers = user1
DenyUsers = user4
SSH and VRFs
The SSH service runs in the default VRF on the switch but listens on all interfaces in all VRFs. You can limit SSH to listen on specific VRFs.
You cannot run SSH in the default VRF and other VRFs at the same time.
The following example configures SSH to listen only on the management VRF:
cumulus@switch:~$ nv set system ssh-server vrf mgmt
cumulus@switch:~$ nv config apply
The following example configures SSH to listen on the management VRF and VRF RED:
cumulus@switch:~$ nv set system ssh-server vrf mgmt
cumulus@switch:~$ nv set system ssh-server vrf RED
cumulus@switch:~$ nv config apply
Bind the SSH service to the VRF. The following example configures SSH to listen only on the management VRF:
To configure SSH to listen to only one IP address or a subnet in a VRF, you need to bind the service to that VRF (as above), then set the ListenAddress parameter in the /etc/ssh/sshd_config file to the IP address or subnet in that VRF.
You can configure the following SSH timeout and session options:
The number of login attempts allowed before rejecting the SSH session. You can specify a value between 3 and 100. The default value is 3 login attempts.
The number of seconds allowed before login times out. You can specify a value between 1 and 600. The default value is 120 seconds.
The TCP port numbers that listen for incoming SSH sessions. You can specify a value between 1 and 65535.
The number of minutes a session can be inactive before the SSH server terminates the connection. The default value is 0 minutes.
The maximum number of SSH sessions allowed per TCP connection. You can specify a value between 1 and 100. The default value is 10.
Unauthenticated SSH sessions:
The maximum number of unauthenticated SSH sessions allowed. You can specify a value between 1 and 10000. The default value is 100.
The number of unauthenticated SSH sessions allowed before throttling starts. You can specify a value between 1 and 10000. The default value is 10.
The starting percentage of connections to reject above the throttle start count before reaching the session count limit. You can specify a value between 1 and 100. The default value is 30.
The following example configures the number of login attempts allowed before rejecting the SSH session to 10 and the number of seconds allowed before login times out to 200:
cumulus@switch:~$ nv set system ssh-server authentication-retries 10
cumulus@switch:~$ nv set system ssh-server login-timeout 200
cumulus@switch:~$ nv config apply
Edit the /etc/ssh/sshd_config file and change the MaxAuthTries parameter in the Authentication section to 10 and the LoginGraceTime parameter to 200:
The following example configures the TCP port that listens for incoming SSH sessions to 443:
cumulus@switch:~$ nv set system ssh-server port 443
cumulus@switch:~$ nv config apply
Edit the /etc/ssh/sshd_config file and add the Port parameter:
cumulus@switch:~$ sudo nano /etc/ssh/sshd_config
...
Port 443
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::
...
The following example configures the amount of time a session can be inactive before the SSH server terminates the connection to 5 minutes (300 seconds) and the maximum number of SSH sessions allowed per TCP connection to 5:
cumulus@switch:~$ nv set system ssh-server inactive-timeout 5
cumulus@switch:~$ nv set system ssh-server max-sessions-per-connection 5
cumulus@switch:~$ nv config apply
Edit Authentication section of the /etc/ssh/sshd_config file.
To configure the amount of time (in seconds) a session can be inactive before the SSH server terminates the connection, change the ClientAliveInterval parameter.
To configure the maximum number of SSH sessions allowed per TCP connection, change the MaxSessions parameter.
The number of unauthenticated SSH sessions allowed before throttling starts to 5.
The starting percentage of connections to reject above the throttle start count before reaching the session count limit to 22.
The maximum number of unauthenticated SSH sessions allowed to 20.
cumulus@switch:~$ nv set system ssh-server max-unauthenticated throttle-start 5
cumulus@switch:~$ nv set system ssh-server max-unauthenticated throttle-percent 22
cumulus@switch:~$ nv set system ssh-server max-unauthenticated session-count 20
cumulus@switch:~$ nv config apply
Edit the /etc/ssh/sshd_config file and change the MaxStartups parameter.
The following example configures:
The number of unauthenticated SSH sessions allowed before throttling starts to 5.
The starting percentage of connections to reject above the throttle start count before reaching the session count limit to 22.
The maximum number of unauthenticated SSH sessions allowed to 20.
This section describes how to generate an SSH key pair on one system and install the key as an authorized key on another system.
Generate an SSH Key Pair
To generate an SSH key pair, run the ssh-keygen command and follow the prompts.
To configure the system without a password, do not enter a passphrase when prompted in the following step.
cumulus@host01:~$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/cumulus/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/cumulus/.ssh/id_rsa.
Your public key has been saved in /home/cumulus/.ssh/id_rsa.pub.
The key fingerprint is:
5a:b4:16:a0:f9:14:6b:51:f6:f6:c0:76:1a:35:2b:bb cumulus@leaf04
The key's randomart image is:
+---[RSA 2048]----+
| +.o o |
| o * o . o |
| o + o O o |
| + . = O |
| . S o . |
| + . |
| . E |
| |
| |
+-----------------+
Install an Authorized SSH Key
To install an authorized SSH key, you take the contents of an SSH public key and add it to the SSH authorized key file (~/.ssh/authorized_keys) of the user.
A public key is a text file with three space separated fields:
<type> <key string> <comment>
Field
Description
<type>
The algorithm you want to use to hash the key. The algorithm can be ecdsa-sha2-nistp256, ecdsa-sha2-nistp384, ecdsa-sha2-nistp521, ssh-dss, ssh-ed25519, or ssh-rsa (the default value).
<key string>
A base64 format string for the key.
<comment>
A single word string. By default, this is the name of the system that generated the key. NVUE uses the <comment> field as the key name.
The procedure to install an authorized SSH key is different based on whether the user is an NVUE managed user or a non-NVUE managed user.
The following example adds an authorized key named prod_key to the user admin2. The content of the public key file is ssh-rsa 1234 prod_key.
cumulus@leaf01:~$ nv set system aaa user admin2 ssh authorized-key prod_key key XABDB3NzaC1yc2EAAAADAQABAAABgQCvjs/RFPhxLQMkckONg+1RE1PTIO2JQhzFN9TRg7ox7o0tfZ+IzSB99lr2dmmVe8FRWgxVjc...
cumulus@leaf01:~$ nv set system aaa user admin2 ssh authorized-key prod_key type ssh-rsa
cumulus@leaf01:~$ nv config apply
The following example adds an authorized key file from the account cumulus on a host to the cumulus account on the switch:
To copy a previously generated public key to the desired location, run the ssh-copy-id command and follow the prompts:
cumulus@host01:~$ ssh-copy-id -i /home/cumulus/.ssh/id_rsa.pub cumulus@leaf02
The authenticity of host 'leaf02 (192.168.0.11)' can't be established.
ECDSA key fingerprint is b1:ce:b7:6a:20:f4:06:3a:09:3c:d9:42:de:99:66:6e.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
cumulus@leaf01's password:
Number of key(s) added: 1
The ssh-copy-id command does not work if the username on the remote switch is different from the username on the local switch. To work around this issue, use the scp command instead:
cumulus@host01:~$ scp .ssh/id_rsa.pub cumulus@leaf02:.ssh/authorized_keys
Enter passphrase for key '/home/cumulus/.ssh/id_rsa':
id_rsa.pub
Connect to the remote switch to confirm that the authentication keys are in place:
cumulus@leaf01:~$ ssh cumulus@leaf02
Welcome to Cumulus VX (TM)
Cumulus VX (TM) is a community supported virtual appliance designed for
experiencing, testing and prototyping the latest technology.
For any questions or technical support, visit our community site at:
http://community.cumulusnetworks.com
The registered trademark Linux (R) is used pursuant to a sublicense from LMI,
the exclusive licensee of Linus Torvalds, owner of the mark on a world-wide basis.
Last login: Thu Sep 29 16:56:54 2016
Troubleshooting
To show all the current SSH server configuration settings, run the NVUE nv show system ssh-server command:
cumulus@switch:~$ nv show system ssh-server
applied
--------------------------- -----------------
authentication-retries 6
inactive-timeout 0
login-timeout 120
max-sessions-per-connection 10
permit-root-login prohibit-password
state enabled
max-unauthenticated
session-count 100
throttle-percent 30
throttle-start 10
To show the current number of active SSH sessions, run the NVUE nv show system ssh-server active-sessions command or the Linux w command:
cumulus@switch:~$ nv show system ssh-server active-sessions
Peer Address:Port Local Address:Port State
------------------- ---------------------- -----
192.168.200.1:46528 192.168.200.11%mgmt:22 ESTAB
cumulus@switch:~$ w
11:10:46 up 19:19, 4 users, load average: 0.08, 0.05, 0.05
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
cumulus ttyS0 - Wed15 19:19m 0.03s 0.02s -bash
cumulus pts/0 192.168.200.1 07:27 3:43m 0.03s 0.03s -bash
cumulus pts/1 192.168.200.1 10:01 1:09m 0.02s 0.02s -bash
cumulus pts/2 192.168.200.1 11:10 1.00s 0.03s 0.00s w
To show which users can establish an SSH session, run the nv show system ssh-server allow-users command. To show which users cannot establish an SSH session, run the nv show system ssh-server deny-users command. You can also show information for a specific user with the nv show system ssh-server allow-users <user> command and the nv show system ssh-server deny-users <user> command.
To show the TCP port numbers that listen for incoming SSH sessions, run the nv show system ssh-server port command. You can also show information for a specific port with the nv show system ssh-server port <port> command.
To show the SSH timer and session information, run the nv show system ssh-server max-unauthenticated command:
cumulus@switch:~$ nv show system ssh-server max-unauthenticated
applied
---------------- -------
session-count 20
throttle-percent 22
throttle-start 5
User Accounts
By default, Cumulus Linux has two user accounts: cumulus and root.
The cumulus account:
Uses the default password cumulus. You must change the default password when you log into Cumulus Linux for the first time.
Is a user account in the sudo group with sudo privileges.
Can log in to the system through all the usual channels, such as console and SSH.
Includes permissions to run NVUE nv show, nv set, nv unset, and nv apply commands.
The root account:
Has the default password disabled by default and prevents you from using SSH, telnet, FTP, and so on, to log in to the switch.
Has the standard Linux root user access to everything on the switch.
Add a New User Account
You can add additional user accounts as needed.
You control local user account access to NVUE commands by changing the group membership (role) for a user. Like the cumulus account, these accounts must be in the sudo group or include the NVUE system-admin role to execute privileged commands.
You can set a plain text password or a hashed password for the local user account. To access the switch without a password, you need to boot into single user mode.
You can provide a full name for the local user account (optional).
Default Roles
Cumulus Linux provides the following default roles:
Role
Permissions
system-admin
Allows the user to use sudo to run commands as the privileged user, run nv show commands, run nv set and nv unset commands to stage configuration changes, and run nv apply commands to apply configuration changes.
nvue-admin
Allows the user to run nv show commands, run nv set and nv unset commands to stage configuration changes, and run nv apply commands to apply configuration changes.
nvue-monitor
Allows the user to run nv show commands only.
Role
Permissions
sudo
Allows the user to use sudo to run commands as the privileged user.
nvshow
Allows the user to run nv show commands only.
nvset
Allows the user to run nv show commands, and run nv set and nv unset commands to stage configuration changes.
nvapply
Allows the user to run nv show commands, run nv set and nv unset commands to stage configuration changes, and run nv apply commands to apply configuration changes.
To add a new user account and assign the user a default role:
The following example:
Creates a new user account called admin2 and sets the role to system-admin.
Sets a plain text password. NVUE hashes the plain text password and stores the value as a hashed password. To set a hashed password, see Hashed Passwords, below.
Adds the full name FIRST LAST. If the full name includes more than one name, either separate the names with a hyphen (FIRST-LAST) or enclose the full name in quotes ("FIRST LAST").
cumulus@switch:~$ nv set system aaa user admin2 role system-admin
cumulus@switch:~$ nv set system aaa user admin2 password
Enter new password:
Confirm password:
cumulus@switch:~$ nv set system aaa user admin2 full-name "FIRST LAST"
cumulus@switch:~$ nv config apply
You can also run the nv set system aaa user <user> password <plain-text-password> command to specify the plain text password inline. This command bypasses the Enter new password and Confirm password prompts but displays the plain text password as you type it.
If you are an NVUE-managed user, you can update your own password with the Linux passwd command.
The following example:
Creates a new user account called admin2, creates a home directory for the user, and adds the full name First Last.
Securely sets the password for the user with passwd.
Sets the group membership (role) to sudo and nvapply (permissions to use sudo, nv show, nv set, and nv apply).
When you use Linux commands to add a new user, you must create a home directory for the user with the -m option. NVUE commands create a home directory automatically.
Only the following user accounts can create, modify, and delete other system-admin accounts:
NVUE-managed users with the system-admin role.
The root user.
Non NVUE-managed users that are in the sudo group.
Instead of a plain text password, you can provide a hashed password for a local user.
You must specify the hashed password in Linux crypt format; the password must be a minimum of 15 to 20 characters long and must include special characters, digits, lower case alphabetic letters, and more. Typically, the password format is set to $id$salt$hashed, where $id is the hashing algorithm. In GNU or Linux:
$1$ is MD5
$2a$ is Blowfish
$2y$ is Blowfish
$5$ is SHA-256
$6$ is SHA-512
To generate a hashed password on the switch, you can either run a python3 command or install and use the mkpasswd utility:
Run the following command on the switch or Linux host. When prompted, enter the plain text password you want to hash:
To generate a hashed password for SHA-512, SHA256, or MD5 encryption, run the following command. When prompted, enter the plain text password you want to hash:
Hashed password strings contain characters, such as $, that have a special meaning in the Linux shell; you must enclose the hashed password in single quotes (').
Delete a User Account
To delete a user account:
Run the nv unset system aaa user <user> command. The following example deletes the user account called admin2.
cumulus@switch:~$ nv unset system aaa user admin2
cumulus@switch:~$ nv config apply
Run the sudo userdel <user> command. The following example deletes the user account called admin2.
cumulus@switch:~$ sudo userdel admin2
Show User Accounts
To show the user accounts configured on the system, run the NVUE nv show system aaa command or the linux sudo cat /etc/passwd command.
cumulus@switch:~$ nv show system aaa user
Username Full-name Role enable Summary
---------------- ---------------------------------- ------- ------ -------
_apt Unknown system
_lldpd Unknown system
backup backup Unknown system
bin bin Unknown system
cumulus cumulus,,, Unknown on
daemon daemon Unknown system
dnsmasq dnsmasq,,, Unknown system
frr Frr routing suite,,, Unknown system
games games Unknown system
gnats Gnats Bug-Reporting System (admin) Unknown system
irc ircd Unknown system
list Mailing List Manager Unknown system
lp lp Unknown system
mail mail Unknown system
man man Unknown system
messagebus Unknown system
news news Unknown system
nobody nobody Unknown off
ntp Unknown system
nvue NVIDIA User Experience Unknown system
proxy proxy Unknown system
root root Unknown system
snmp Unknown system
sshd Unknown system
sync sync Unknown system
sys sys Unknown system
systemd-coredump systemd Core Dumper Unknown system
systemd-network systemd Network Management,,, Unknown system
systemd-resolve systemd Resolver,,, Unknown system
systemd-timesync systemd Time Synchronization,,, Unknown system
user1 OSPF on
user2 IFMgr on
uucp uucp Unknown system
uuidd Unknown system
To show information about a specific user account, run the NVUE nv show system aaa user <user> command:
cumulus@switch:~$ nv show system aaa user cumulus
operational applied
------------------ ----------- -------
role Unknown
full-name cumulus,,,
hashed-password *
ssh
[authorized-key]
enable on
Enable the root User
The root user does not have a password and cannot log into a switch using SSH. This default account behavior is consistent with Debian.
Enable Console Access
To log into the switch using root from the console, you must set the password for the root account:
cumulus@switch:~$ sudo passwd root
Enter new password:
...
Enable SSH Access
To log into the switch using root with SSH, either:
In addition to the default roles that Cumulus Linux provides, you can create your own roles to restrict authorization, giving you more granular control over what a user can manage on the switch. For example, you can assign a user the role of network manager and provide the user privileges for interface management, service management and system management. When the user logs in and executes an NVUE command, NVUE checks the user privileges and authorizes the user to run that command.
Custom role-based access control consists of the following elements:
Element
Description
Role
A virtual identifier for multiple classes (groups). You can assign only one role for a user. For example, for a user that can manage interfaces, you can create a role called IFMgr.
Class
A class is similar in concept to a Linux group. Creating and managing classes is the simplest way to configure multiple users simultaneously, especially when configuring permissions.A class consists of:
Command paths, which Cumulus Linux bases on the objects in the NVUE declarative model and, which are the same as URI paths; for example; you can use the /vrf/ command path to allow or deny a user access to all VRFs, or /system/nat to allow or deny a user access to NAT configuration. Use the tab key to see available command paths (nv set system aaa class <class-name> command-path / <<press tab>>).
Permissions for the command paths: (ro) to run show commands, (rw) to run set, unset, and apply commands, (act) to run action commands, or (all) to run all commands. The default permission setting is all.
Action
The action for the class: allow or deny.
You can assign a maximum of 64 classes to a role.
You can configure a maximum of 128 command paths for a class.
When you configure a command path, you allow or deny a specific schema path and its children. For example the command path /qos/ allows or denies access to QoS commands, whereas the command path /qos/egress-scheduler allows or denies access to QoS egress scheduler commands.
The following example describes the permissions for a role (role1) that consists of three classes: class1, class2, class3
class1 has the allow class action and the following command path permissions:
Command Path
Permissions
/interface/
all
/interface/*/acl/
ro
/interface/*/ptp/
ro
class2 has the allow class action and the following command path permissions:
Command Path
Permissions
/system/
ro
/vrf/
rw
class3 has the deny class action and the following command path permissions:
Command Path
Permissions
/interface/*/evpn/
rw
/interface/*/qos/
rw
The following table shows the permissions for a user assigned the role role1. In the table, R is read only (RO), W is write, and X is action (ACT).
Path
Allow
Deny
Permissions
/acl/
RWX
Implicit deny
/qos/
RWX
Implicit deny
All unspecified paths are implicit deny
/interface/
RWX
The permissions specified
/interface/* (* matches all interfaces)
RWX
Inherited from parent
/interface/*/bond/
RWX
Inherited from parent
/interface/*/ip/
RWX
Inherited from parent
All unspecified children of /interface/ inherit parent permissions
RWX
/interface/*/acl/
R
WX
The permissions specified
/interface/*/ptp/
R
WX
The permissions specified
/interface/*/evpn/
RWX
The permissions specified
/interface/*/qos/
RWX
The permissions specified
/system/
R
WX
The permissions specified
/system/aaa/
R
WX
Inherited from parent
/system/api/
R
WX
Inherited from parent
All unspecified children of /system/ inherit parent permissions
R
/vrf/
RW
X
The permissions specified
All unspecified children of /vrf/ inherit parent permissions
RW
X
Assign a Custom Role to a User Account
To assign a custom role to a user account:
Create a role and classes for the role.
Assign the action (allow or deny) for each class.
Add command paths and permissions for each class.
Assign a role to a user.
You assign a custom role to an existing user account. For information about creating user accounts, see User Accounts commands.
When you create a class, then run nv config apply, NVUE removes LDAP configuration from the /etc/nsswitch.conf file. If you are using LDAP, run the nv set system config apply ignore /etc/nsswitch.conf command before you run nv config apply to retain the LDAP configuration.
The following example creates the three classes described above for role role1.
class1 has permissions to manage all interfaces except for ACL and PTP interfaces, which only have show permissions:
cumulus@leaf01:mgmt:~$ nv set system aaa role ROLE1 class class1
cumulus@leaf01:mgmt:~$ nv set system aaa class class1 action allow
cumulus@leaf01:mgmt:~$ nv set system aaa class class1 command-path /interface/ permission all
cumulus@leaf01:mgmt:~$ nv set system aaa class class1 command-path /interface/*/acl/ permission ro
cumulus@leaf01:mgmt:~$ nv set system aaa class class1 command-path /interface/*/ptp/ permission ro
cumulus@leaf01:mgmt:~$ nv config apply
class2 has permissions to only show system commands and to set, unset, and apply VRF commands:
cumulus@leaf01:mgmt:~$ nv set system aaa role ROLE1 class class2
cumulus@leaf01:mgmt:~$ nv set system aaa class class2 action allow
cumulus@leaf01:mgmt:~$ nv set system aaa class class2 command-path /system/ permission ro
cumulus@leaf01:mgmt:~$ nv set system aaa class class2 command-path /vrf/ permission rw
cumulus@leaf01:mgmt:~$ nv config apply
class3 prevents setting, unsetting, and applying interface commands for EVPN and QOS:
cumulus@leaf01:mgmt:~$ nv set system aaa role ROLE1 class class3
cumulus@leaf01:mgmt:~$ nv set system aaa class class3 action deny
cumulus@leaf01:mgmt:~$ nv set system aaa class class3 command-path /interface/*/evpn/ permission rw
cumulus@leaf01:mgmt:~$ nv set system aaa class class3 command-path /interface/*/qos/ permission rw
cumulus@leaf01:mgmt:~$ nv config apply
The following command assigns user admin2 the role role1:
cumulus@leaf01:mgmt:~$ nv set system aaa user admin2 role role1
cumulus@leaf01:mgmt:~$ nv config apply
Delete Custom Roles
To delete a custom role and all its classes, you must first unassign the role from the user, then delete the role:
cumulus@switch:~$ nv unset system aaa user admin2 role role1
cumulus@switch:~$ nv unset system aaa role role1
cumulus@switch:~$ nv config apply
To delete a class from a role, run the nv unset system aaa role <role> class <class> command:
cumulus@switch:~$ nv unset system aaa role role1 class class2
cumulus@switch:~$ nv config apply
Show Custom Role Information
To show the user accounts configured on the system, run the NVUE nv show system aaa user command or the Linux sudo cat /etc/passwd command.
cumulus@switch:~$ nv show system aaa user
Username Full-name Role enable Summary
---------------- ---------------------------------- ------- ------ -------
_apt Unknown system
_lldpd Unknown system
backup backup Unknown system
bin bin Unknown system
cumulus cumulus,,, Unknown on
daemon daemon Unknown system
dnsmasq dnsmasq,,, Unknown system
frr Frr routing suite,,, Unknown system
games games Unknown system
gnats Gnats Bug-Reporting System (admin) Unknown system
irc ircd Unknown system
list Mailing List Manager Unknown system
lp lp Unknown system
mail mail Unknown system
man man Unknown system
messagebus Unknown system
news news Unknown system
nobody nobody Unknown off
ntp Unknown system
nvue NVIDIA User Experience Unknown system
proxy proxy Unknown system
root root Unknown system
snmp Unknown system
sshd Unknown system
sync sync Unknown system
sys sys Unknown system
systemd-coredump systemd Core Dumper Unknown system
systemd-network systemd Network Management,,, Unknown system
systemd-resolve systemd Resolver,,, Unknown system
systemd-timesync systemd Time Synchronization,,, Unknown system
admin2 role1 on
uucp uucp Unknown system
uuidd Unknown system
www-data www-data Unknown system
To show information about a specific user account including the role assigned to the user, run the NVUE nv show system aaa user <user> command:
cumulus@switch:~$ nv show system aaa user admin2
operational applied
--------- ----------- -------
role role1 role1
full-name
enable on on
To show all the roles configured on the switch, run the NVUE nv show system aaa role command:
cumulus@switch:~$ nv show system aaa role
Role Class
------------ -------
nvue-admin nvapply
nvue-monitor nvshow
role1 class1
class2
class3
system-admin nvapply
sudo
To show the classes applied to specific role, run the nv show system aaa role <role> command:
cumulus@switch:~$ nv show system aaa role role1
applied
------- -------
[class] class1
[class] class2
[class] class3
To show all the classes configured on the switch, run the nv show system aaa class command:
cumulus@switch:~$ nv show system aaa class
Class Name Command Path Permission Action
---------- ------------------ ---------- ------
class1 /interface/ all allow
/interface/*/acl/ ro
/interface/*/ptp/ ro
class2 /system/ ro allow
/vrf/ rw
class3 /interface/*/evpn/ rw deny
/interface/*/qos/ rw
nvapply / all allow
nvshow / ro allow
sudo / all allow
To show the configuration and state of the command paths for a class, run the nv show system aaa class <class> command:
cumulus@switch:~$ nv show system aaa class class3
applied
-------------- ------------------
action deny
[command-path] /interface/*/evpn/
[command-path] /interface/*/qos/
Using sudo to Delegate Privileges
By default, Cumulus Linux has two user accounts: root and cumulus. The cumulus account is a normal user and is in the group sudo.
You can add more user accounts as needed. Like the cumulus account, these accounts must use sudo to execute privileged commands.
sudo Basics
sudo allows you to execute a command as superuser or another user as specified by the security policy.
The default security policy is sudoers, which you configure in the /etc/sudoers file. Use /etc/sudoers.d/ to add to the default sudoers policy.
Use visudo only to edit the sudoers file; do not use another editor like vi or emacs.
When creating a new file in /etc/sudoers.d, use visudo -f. This option performs sanity checks before writing the file to avoid errors that prevent sudo from working.
Errors in the sudoers file can result in losing the ability to elevate privileges to root. You can fix this issue only by power cycling the switch and booting into single user mode. Before modifying sudoers, enable the root user by setting a password for the root user.
By default, users in the sudo group can use sudo to execute privileged commands. To add users to the sudo group, use the useradd(8) or usermod(8) command. To see which users belong to the sudo group, see /etc/group (man group(5)).
You can run any command as sudo, including su. You must enter a password.
The example below shows how to use sudo as a non-privileged user cumulus to bring up an interface:
cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master br0 state DOWN mode DEFAULT qlen 500
link/ether 44:38:39:00:27:9f brd ff:ff:ff:ff:ff:ff
cumulus@switch:~$ ip link set dev swp1 up
RTNETLINK answers: Operation not permitted
cumulus@switch:~$ sudo ip link set dev swp1 up
Password:
umulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP mode DEFAULT qlen 500
link/ether 44:38:39:00:27:9f brd ff:ff:ff:ff:ff:ff
sudoers Examples
The following examples show how you grant as few privileges as necessary to a user or group of users to allow them to perform the required task. Each example uses the system group noc; groups include the prefix %.
When an unprivileged user runs a command, the command must include the sudo prefix.
Cumulus Linux uses Pluggable Authentication Modules (PAM) and Name Service Switch (NSS) for user authentication. NSS enables PAM to use LDAP to provide user authentication, group mapping, and information for other services on the system.
NSS specifies the order of the information sources that resolve names for each service. Using NSS with authentication and authorization provides the order and location for user lookup and group mapping on the system.
PAM handles the interaction between the user and the system, providing login handling, session setup, authentication of users, and authorization of user actions.
To configure LDAP authentication on Linux, you can use libnss-ldap, libnss-ldapd, or libnss-sss. This chapter describes libnss-ldapd only. From internal testing, this library worked best with Cumulus Linux and is the easiest to configure, automate, and troubleshoot.
Install libnss-ldapd
The libldap-2.4-2 and libldap-common LDAP packages are already installed on the Cumulus Linux image; however you need to install these additional packages to use LDAP authentication:
libnss-ldapd
libpam-ldapd
ldap-utils
To install the additional packages, run the following command:
You can also install these packages even if the switch does not connect to the internet, as they are in the cumulus-local-apt-archive repository that is embedded in the Cumulus Linux image.
Follow the interactive prompts to specify the LDAP URI, search base distinguished name (DN), and services that must have LDAP lookups enabled. You need to select at least the passwd, group, and shadow services (press space to select a service). When done, select OK. This creates a basic LDAP configuration using anonymous bind and initiates user search under the base DN specified.
After the dialog closes, the install process prints information similar to the following:
/etc/nsswitch.conf: enable LDAP lookups for group
/etc/nsswitch.conf: enable LDAP lookups for passwd
/etc/nsswitch.conf: enable LDAP lookups for shadow
After the installation is complete, the name service caching daemon (nslcd) runs. This service handles all the LDAP protocol interactions and caches information that returns from the LDAP server. nslcd appends ldap to the /etc/nsswitch.conf file, as well as the secondary information source for passwd, group, and shadow. nslcd references the local files (/etc/passwd, /etc/groups and /etc/shadow) first, as specified by the compat source.
Keep compat as the first source in NSS for passwd, group, and shadow. This prevents you from getting locked out of the system.
Entering incorrect information during the installation process produces configuration errors. You can correct the information after installation by editing certain configuration files.
Edit the /etc/nslcd.conf file to update the LDAP URI and search base DN (see Update the nslcd.conf File, below).
Edit the /etc/nssswitch.conf file to update the service selections.
Be sure to restart nvued.service after editing the files.
▼
Alternative Installation Method Using debconf-utils
Instead of running the installer and following the interactive prompts, as described above, you can pre-seed the installer parameters using debconf-utils.
Run apt-get install debconf-utils and create the pre-seeded parameters using debconf-set-selections. Provide the appropriate answers.
Run debconf-show <pkg> to check the settings. Here is an example of how to pre-seed answers to the installer questions using debconf-set-selections:
root# debconf-set-selections <<'zzzEndOfFilezzz'
# LDAP database user. Leave blank will be populated later!
nslcd nslcd/ldap-binddn string
# LDAP user password. Leave blank!
nslcd nslcd/ldap-bindpw password
# LDAP server search base:
nslcd nslcd/ldap-base string ou=support,dc=rtp,dc=example,dc=test
# LDAP server URI. Using ldap over ssl.
nslcd nslcd/ldap-uris string ldaps://myadserver.rtp.example.test
# New to 0.9. restart cron, exim and others libraries without asking
nslcd libraries/restart-without-asking: boolean true
# LDAP authentication to use:
# Choices: none, simple, SASL
# Using simple because its easy to configure. Security comes by using LDAP over SSL
# keep /etc/nslcd.conf 'rw' to root for basic security of bindDN password
nslcd nslcd/ldap-auth-type select simple
# Don't set starttls to true
nslcd nslcd/ldap-starttls boolean false
# Check server's SSL certificate:
# Choices: never, allow, try, demand
nslcd nslcd/ldap-reqcert select never
# Choices: Ccreds credential caching - password saving, Unix authentication, LDAP Authentication , Create home directory on first time login, Ccreds credential caching - password checking
# This is where "mkhomedir" pam config is activated that allows automatic creation of home directory
libpam-runtime libpam-runtime/profiles multiselect ccreds-save, unix, ldap, mkhomedir , ccreds-check
# for internal use; can be preseeded
man-db man-db/auto-update boolean true
# Name services to configure:
# Choices: aliases, ethers, group, hosts, netgroup, networks, passwd, protocols, rpc, services, shadow
libnss-ldapd libnss-ldapd/nsswitch multiselect group, passwd, shadow
libnss-ldapd libnss-ldapd/clean_nsswitch boolean false
## define platform specific libnss-ldapd debconf questions/answers.
## For demo used amd64.
libnss-ldapd:amd64 libnss-ldapd/nsswitch multiselect group, passwd, shadow
libnss-ldapd:amd64 libnss-ldapd/clean_nsswitch boolean false
# libnss-ldapd:powerpc libnss-ldapd/nsswitch multiselect group, passwd, shadow
# libnss-ldapd:powerpc libnss-ldapd/clean_nsswitch boolean false
Update the nslcd.conf File
After installation, update the main configuration file (/etc/nslcd.conf) to accommodate the expected LDAP server settings.
This section documents some of the more important options that relate to security and queries. For details on all the available configuration options, read the nslcd.conf man page.
After first editing the /etc/nslcd.conf file and/or enabling LDAP in the /etc/nsswitch.conf file, you must restart netd with the sudo systemctl restart netd command. If you disable LDAP, you need to restart the netd service.
Connection
The LDAP client starts a session by connecting to the LDAP server on TCP and UDP port 389 or on port 636 for LDAPS. Depending on the configuration, this connection establishes without authentication (anonymous bind); otherwise, the client must provide a bind user and password. The variables you use to define the connection to the LDAP server are the URI and bind credentials.
The URI is mandatory and specifies the LDAP server location using the FQDN or IP address. The URI also designates whether to use ldap:// for clear text transport, or ldaps:// for SSL/TLS encrypted transport. You can also specify an alternate port in the URI. In production environments, use the LDAPS protocol so that all communications are secure.
After the connection to the server is complete, the BIND operation authenticates the session. The BIND credentials are optional; if you do not specify the credentials, the switch assumes an anonymous bind. Configure authenticated (Simple) BIND by specifying the user (binddn) and password (bindpw) in the configuration. Another option is to use SASL (Simple Authentication and Security Layer) BIND, which provides authentication services using other mechanisms, like Kerberos. Contact your LDAP server administrator for this information as it depends on the configuration of the LDAP server and the credentials for the client device.
# The location at which the LDAP server(s) should be reachable.
uri ldaps://ldap.example.com
# The DN to bind with for normal lookups.
binddn cn=CLswitch,ou=infra,dc=example,dc=com
bindpw CuMuLuS
Search Function
When an LDAP client requests information about a resource, it must connect and bind to the server. Then, it performs one or more resource queries depending on the lookup. All search queries to the LDAP server use the configured search base, filter, and the desired entry (uid=myuser). If the LDAP directory is large, this search takes a long time. Define a more specific search base for the common maps (passwd and group).
# The search base that will be used for all queries.
base dc=example,dc=com
# Mapped search bases to speed up common queries.
base passwd ou=people,dc=example,dc=com
base group ou=groups,dc=example,dc=com
Search Filters
To limit the search scope when authenticating users, use search filters to specify criteria when searching for objects within the directory. The default filters applied are:
filter passwd (objectClass=posixAccount)
filter group (objectClass=posixGroup)
Attribute Mapping
The map configuration allows you to override the attributes pushed from LDAP. To override an attribute for a given map, specify the attribute name and the new value. This is useful to ensure that the shell is bash and the home directory is /home/cumulus:
In LDAP, the map refers to one of the supported maps specified in the manpage for nslcd.conf (such as passwd or group).
Create Home Directory on Login
If you want to use unique home directories, run the sudo pam-auth-update command and select Create home directory on login in the PAM configuration dialog (press the space bar to select the option). Select OK, then press Enter to save the update and close the dialog.
cumulus@switch:~$ sudo pam-auth-update
The home directory for any user that logs in (using LDAP or not) populates with the standard dotfiles from /etc/skel.
When nslcd starts, an error message similar to the following (where 5816 is the nslcd PID) sometimes appears:
nslcd[5816]: unable to dlopen /usr/lib/x86_64-linux-gnu/sasl2/libsasldb.so: libdb-5.3.so: cannot open
shared object file: No such file or directory
You can ignore this message. The libdb package and resulting log messages from nslcd do not cause any issues when you use LDAP as a client for login and authentication.
Example Configuration
Here is an example configuration using Cumulus Linux.
# /etc/nslcd.conf
# nslcd configuration file. See nslcd.conf(5)
# for details.
# The user and group nslcd should run as.
uid nslcd
gid nslcd
# The location at which the LDAP server(s) should be reachable.
uri ldaps://myadserver.rtp.example.test
# The search base that will be used for all queries.
base ou=support,dc=rtp,dc=example,dc=test
# The LDAP protocol version to use.
#ldap_version 3
# The DN to bind with for normal lookups.
# defconf-set-selections doesn't seem to set this. so have to manually set this.
binddn CN=cumulus admin,CN=Users,DC=rtp,DC=example,DC=test
bindpw 1Q2w3e4r!
# The DN used for password modifications by root.
#rootpwmoddn cn=admin,dc=example,dc=com
# SSL options
#ssl off (default)
# Not good does not prevent man in the middle attacks
#tls_reqcert demand(default)
tls_cacertfile /etc/ssl/certs/rtp-example-ca.crt
# The search scope.
#scope sub
# Add nested group support
# Supported in nslcd 0.9 and higher.
# default wheezy install of nslcd supports on 0.8. wheezy-backports has 0.9
nss_nested_groups yes
# Mappings for Active Directory
# (replace the SIDs in the objectSid mappings with the value for your domain)
# "dsquery * -filter (samaccountname=testuser1) -attr ObjectSID" where cn == 'testuser1'
pagesize 1000
referrals off
idle_timelimit 1000
# Do not allow uids lower than 100 to login (aka Administrator)
# not needed as pam already has this support
# nss_min_uid 1000
# This filter says to get all users who are part of the cumuluslnxadm group. Supports nested groups.
# Example, mary is part of the snrnetworkadm group which is part of cumuluslnxadm group
# Ref: http://msdn.microsoft.com/en-us/library/aa746475%28VS.85%29.aspx (LDAP_MATCHING_RULE_IN_CHAIN)
filter passwd (&(Objectclass=user)(!(objectClass=computer))(memberOf:1.2.840.113556.1.4.1941:=cn=cumuluslnxadm,ou=groups,ou=support,dc=rtp,dc=example,dc=test))
map passwd uid sAMAccountName
map passwd uidNumber objectSid:S-1-5-21-1391733952-3059161487-1245441232
map passwd gidNumber objectSid:S-1-5-21-1391733952-3059161487-1245441232
map passwd homeDirectory "/home/$sAMAccountName"
map passwd gecos displayName
map passwd loginShell "/bin/bash"
# Filter for any AD group or user in the baseDN. the reason for filtering for the
# user to make sure group listing for user files don't say '<user> <gid>'. instead will say '<user> <user>'
# So for cosmetic reasons..nothing more.
filter group (&(|(objectClass=group)(Objectclass=user))(!(objectClass=computer)))
map group gidNumber objectSid:S-1-5-21-1391733952-3059161487-1245441232
map group cn sAMAccountName
Configure LDAP Authorization
Linux uses the sudo command to allow non-administrator users (such as the default cumulus user account) to perform privileged operations. To control the users that can use sudo, define a series of rules in the /etc/sudoers file and files in the /etc/sudoers.d/ directory. The rules apply to groups but you can also define specific users. You can add sudo rules using the group names from LDAP. For example, if a group of users are in the group netadmin, you can add a rule to give those users sudo privileges. Refer to the sudoers manual (man sudoers) for a complete usage description. The following shows an example in the /etc/sudoers file:
# The basic structure of a user specification is "who where = (as_whom) what ".
%sudo ALL=(ALL:ALL) ALL
%netadmin ALL=(ALL:ALL) ALL
Active Directory Configuration
Active Directory (AD) is a fully featured LDAP-based NIS server create by Microsoft. It offers unique features that classic OpenLDAP servers do not have. AD can be more complicated to configure on the client and each version works a little differently with Linux-based LDAP clients. Some more advanced configuration examples, from testing LDAP clients on Cumulus Linux with Active Directory (AD/LDAP), are available in the knowledge base.
LDAP Verification Tools
The LDAP client daemon retrieves and caches password and group information from LDAP. To verify the LDAP interaction, use these command-line tools to trigger an LDAP query from the device.
Identify a User with the id Command
The id command performs a username lookup by following the lookup information sources in NSS for the passwd service. This returns the user ID, group ID and the group list retrieved from the information source. In the following example, the user cumulus is locally defined in /etc/passwd, and myuser is on LDAP. The NSS configuration has the passwd map configured with the sources compat ldap:
cumulus@switch:~$ id cumulus
uid=1000(cumulus) gid=1000(cumulus) groups=1000(cumulus),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev)
cumulus@switch:~$ id myuser
uid=1230(myuser) gid=3000(Development) groups=3000(Development),500(Employees),27(sudo)
getent
The getent command retrieves all records found with NSS for a given map. It can also retrieve a specific entry under that map. You can perform tests with the passwd, group, shadow, or any other map in the /etc/nsswitch.conf file. The output from this command formats according to the map requested. For the passwd service, the structure of the output is the same as the entries in /etc/passwd. The group map outputs the same structure as /etc/group.
In this example, looking up a specific user in the passwd map, the user cumulus is locally defined in /etc/passwd, and myuser is only in LDAP.
In the next example, looking up a specific group in the group service, the group cumulus is locally defined in /etc/groups, and netadmin is on LDAP.
cumulus@switch:~$ getent group cumulus
cumulus:x:1000:
cumulus@switch:~$ getent group netadmin
netadmin:*:502:larry,moe,curly,shemp
Running the command getent passwd or getent group without a specific request returns all local and LDAP entries for the passwd and group maps.
LDAP search
The ldapsearch command performs LDAP operations directly on the LDAP server. This does not interact with NSS. This command displays the information that the LDAP daemon process receives back from the server. The command has several options. The simplest option uses anonymous bind to the host and specifies the search DN and the attribute to look up.
When setting up LDAP authentication for the first time, turn off the nslcd service using the systemctl stop nslcd.service command (or the systemctl stop nslcd@mgmt.service if you are running the service in a management VRF) and run it in debug mode. Debug mode works whether you are using LDAP over SSL (port 636) or an unencrypted LDAP connection (port 389).
The FQDN of the LDAP server URI does not match the FQDN in the CA-signed server certificate.
nslcd cannot read the SSL certificate and reports a Permission denied error in the debug during server connection negotiation. Check the permission on each directory in the path of the root SSL certificate. Ensure that it is readable by the nslcd user.
NSCD
If the nscd cache daemon is also enabled and you make some changes to the user from LDAP, you can clear the cache using the following commands:
nscd --invalidate = passwd
nscd --invalidate = group
The nscd package works with nslcd to cache name entries returned from the LDAP server. This sometimes causes authentication failures. To work around these issues, disable nscd, restart the nslcd service, then retry authentication:
If you are running the nslcd service in a management VRF, you need to run the systemctl restart nslcd@mgmt.service command instead of the systemctl restart nslcd.service command. For example:
Cumulus Linux implements TACACS+ client AAA in a transparent way with minimal configuration. The client implements the TACACS+ protocol as described in this IETF document. There is no need to create accounts or directories on the switch. Accounting records go to all configured TACACS+ servers by default. Using per-command authorization requires additional setup on the switch.
TACACS+ in Cumulus Linux:
Uses PAM authentication and includes login, ssh, sudo and su.
Allows users with privilege level 15 to run any command with sudo.
Allows users with privilege level 15 to run NVUE nv set, nv unset, and nv apply commands in addition to nv show commands. TACACS+ users with a lower privilege level can only execute nv show commands.
Supports up to seven TACACS+ servers. Be sure to configure your TACACS+ servers in addition to the TACACS+ client. Refer to your TACACS+ server documentation.
TACACS+ Client Packages
NVUE automatically installs the TACACS+ packages; you do not have to install the packages if you use NVUE commands to configure TACACS+.
If you use Linux commands to configure TACACS+, you must install the TACACS+ packages. You can install the TACACS+ packages even if the switch is not connected to the internet; the packages are in the cumulus-local-apt-archive repository in the Cumulus Linux image.
To install all required packages, run these commands:
Configure the following required settings on the switch (the TACACS+ client).
Set the IP address or hostname of at least one TACACS+ server.
Set the secret (key) shared between the TACACS+ server and client.
Set the VRF you want to use to communicate with the TACACS+ server. This is typically the management VRF (mgmt), which is the default VRF on the switch.
If you use NVUE commands to configure TACACS+, you must also set the priority for the authentication order for local and TACACS+ users, and enable TACACS+.
After you configure any TACACS+ settings with NVUE and you run nv config apply, you must restart the NVUE service with the sudo systemctl restart nvued.service command.
NVUE commands require you to specify the priority for each TACACS+ server. You must set a priority even if you only specify one server.
The following example commands set:
The TACACS+ server priority to 5.
The IP address of the server to 192.168.0.30.
The secret to mytacac$key.
If you include special characters in the password (such as $), you must enclose the password in single quotes (').
The VRF to mgmt.
The authentication order so that TACACS+ authentication has priority over local (the lower number has priority).
TACACS+ to enabled.
cumulus@switch:~$ nv set system aaa tacacs server 5 host 192.168.0.30
cumulus@switch:~$ nv set system aaa tacacs server 5 secret 'mytacac$key'
cumulus@switch:~$ nv set system aaa tacacs vrf mgmt
cumulus@switch:~$ nv set system aaa authentication-order 5 tacacs
cumulus@switch:~$ nv set system aaa authentication-order 10 local
cumulus@switch:~$ nv set system aaa tacacs enable on
cumulus@switch:~$ nv config apply
If you want the server to use IPv6, you must add the nv set system aaa tacacs server <priority> prefer-ip-version 6 command:
cumulus@switch:~$ nv set system aaa tacacs server 5 host server5
cumulus@switch:~$ nv set system aaa tacacs server 5 prefer-ip-version 6
...
If you configure more than one TACACS+ server, you need to set the priority for each server. If the switch cannot establish a connection with the server that has the highest priority, it tries to establish a connection with the next highest priority server. The server with the lower number has the higher prioritity. In the example below, server 192.168.0.30 with a priority value of 5 has a higher priority than server 192.168.1.30, which has a priority value of 10.
cumulus@switch:~$ nv set system aaa tacacs server 5 host 192.168.0.30
cumulus@switch:~$ nv set system aaa tacacs server 5 secret 'mytacac$key'
cumulus@switch:~$ nv set system aaa tacacs server 10 host 192.168.1.30
cumulus@switch:~$ nv set system aaa tacacs server 10 secret 'mytacac$key2'
cumulus@switch:~$ nv config apply
Edit the /etc/tacplus_servers file to add at least one server and one shared secret (key). You can specify the server and secret parameters in any order anywhere in the file. Whitespace (spaces or tabs) are not allowed. For example, if your TACACS+ server IP address is 192.168.0.30 and your shared secret is tacacskey, add these parameters to the /etc/tacplus_servers file:
Cumulus Linux supports a maximum of seven TACACS+ servers. To specify multiple servers, add one per line to the /etc/tacplus_servers file. Connections establish in the order in the file.
# If the management network is in a vrf, set this variable to the vrf name.
# This would usually be "mgmt"
# When this variable is set, the connection to the TACACS+ accounting servers
# will be made through the named vrf.
vrf=mgmt
Restart auditd:
cumulus@switch:~$ sudo systemctl restart auditd
Optional TACACS+ Configuration
You can configure the following optional TACACS+ settings:
The port to use for communication between the TACACS+ server and client. By default, Cumulus Linux uses IP port 49.
The TACACS timeout value, which is the number of seconds to wait for a response from the TACACS+ server before trying the next TACACS+ server. You can specify a value between 0 and 60. The default is 5 seconds.
The source IP address to use when communicating with the TACACS+ server so that the server can identify the client switch. You must specify an IPv4 address, which must be valid for the interface you use. This source IP address is typically the loopback address on the switch.
The TACACS+ authentication type. You can specify PAP to send clear text between the user and the server, CHAP to establish a PPP connection between the user and the server, or login. The default is PAP.
The users you do not want to send to the TACACS+ server for authentication; for example, local user accounts that exist on the switch, such as the cumulus user.
A separate home directory for each TACACS+ user when the TACACS+ user first logs in. By default, the switch uses the home directory in the mapping accounts in /etc/passwd. If the home directory does not exist, the mkhomedir_helper program creates it. This option does not apply to accounts with restricted shells (users mapped to a TACACS privilege level that has enforced per-command authorization).
The following example commands set the timeout to 10 seconds and the TACACS+ server port to 32:
cumulus@switch:~$ nv set system aaa tacacs timeout 10
cumulus@switch:~$ nv set system aaa tacacs server 5 port 32
cumulus@switch:~$ nv config apply
The following example commands set the source IP address to 10.10.10.1 and the authentication type to CHAP:
cumulus@switch:~$ nv set system aaa tacacs source-ip 10.10.10.1
cumulus@switch:~$ nv set system aaa tacacs authentication mode chap
cumulus@switch:~$ nv config apply
The following example commands exclude the user USER1 from going to the TACACS+ server for authentication and enables Cumulus Linux to create a separate home directory for each TACACS+ user when the TACACS+ user first logs in:
cumulus@switch:~$ nv set system aaa tacacs exclude-user USER1
cumulus@switch:~$ nv set system aaa tacacs authentication per-user-homedir on
cumulus@switch:~$ nv config apply
To set the server port (use the format server:port), source IP address, authentication type, and enable Cumulus Linux to create a separate home directory for each TACACS+ user, edit the /etc/tacplus_servers file, then restart auditd.
To set the timeout and the usernames to exclude from TACACS+ authentication, edit the /etc/tacplus_nss.conf file (you do not need to restart auditd).
The following example sets the server port to 32, the authentication type to CHAP, the source IP address to 10.10.10.1, and enables Cumulus Linux to create a separate home directory for each TACACS+ user when the TACACS+ user first logs in:
cumulus@switch:~$ sudo nano /etc/tacplus_servers
...
secret=mytacac$key
server=192.168.0.30:32
...
# Sets the IPv4 address used as the source IP address when communicating with
# the TACACS+ server. IPv6 addresses are not supported, nor are hostnames.
# The address must work when passsed to the bind() system call, that is, it must
# be valid for the interface being used.
source_ip=10.10.10.1
...
# If user_homedir=1, then tacacs users will be set to have a home directory
# based on their login name, rather than the mapped tacacsN home directory.
# mkhomedir_helper is used to create the directory if it does not exist (similar
# to use of pam_mkhomedir.so). This flag is ignored for users with restricted
# shells, e.g., users mapped to a tacacs privilege level that has enforced
# per-command authorization (see the tacplus-restrict man page).
user_homedir=1
...
login=chap
cumulus@switch:~$ sudo systemctl restart auditd
The following example sets the timeout to 10 seconds and excludes the user USER1 from going to the TACACS+ server for authentication:
cumulus@switch:~$ sudo nano /etc/tacplus_nss.conf
...
# The connection timeout for an NSS library should be short, since it is
# invoked for many programs and daemons, and a failure is usually not
# catastrophic. Not set or set to a negative value disables use of poll().
# This follows the include of tacplus_servers, so it can override any
# timeout value set in that file.
# It's important to have this set in this file, even if the same value
# as in tacplus_servers, since tacplus_servers should not be readable
# by users other than root.
timeout=10
...
# This is a comma separated list of usernames that are never sent to
# a tacacs server, they cause an early not found return.
#
# "*" is not a wild card. While it's not a legal username, it turns out
# that during pathname completion, bash can do an NSS lookup on "*"
# To avoid server round trip delays, or worse, unreachable server delays
# on filename completion, we include "*" in the exclusion list.
exclude_users=root,daemon,nobody,cron,radius_user,radius_priv_user,sshd,cumulus,quagga,frr,snmp,www-data,ntp,man,_lldpd,USER1,*
Cumulus Linux supports the following additional Linux parameters in the etc/tacplus_nss.conf file. Currently, there are no equivalent NUVE commands.
Linux Parameter
Description
include
Configures a supplemental configuration file to avoid duplicating configuration information. You can include up to eight additional configuration files. For example: include=/myfile/myname.
min_uid
Configures the minimum user ID that the NSS plugin can look up. 0 specifies that the plugin never looks up uid 0 (root). Do not specify a value greater than the local TACACS+ user IDs (0 through 15).
TACACS+ Accounting
When you install the TACACS+ packages and configure the basic TACACS+ settings (set the server and shared secret), accounting is on and there is no additional configuration required.
TACACS+ accounting uses the audisp module, with an additional plugin for auditd and audisp. The plugin maps the auid in the accounting record to a TACACS login, which it bases on the auid and sessionid. The audisp module requires libnss_tacplus and uses the libtacplus_map.so library interfaces as part of the modified libpam_tacplus package.
Communication with the TACACS+ servers occurs with the libsimple-tacact1 library, through dlopen(). A maximum of 240 bytes of command name and arguments send in the accounting record, due to the TACACS+ field length limitation of 255 bytes.
All sudo commands run by TACACS+ users generate accounting records against the original TACACS+ login name.
All Linux and NVUE commands result in an accounting record, including login commands and sub-processes of other commands. This can generate a lot of accounting records.
By default, Cumulus Linux sends accounting records to all servers. You can change this setting to send accounting records to the server that is first to respond:
cumulus@switch:~$ nv set system aaa tacacs accounting send-records first-response
cumulus@switch:~$ nv config apply
To reset to the default configuration (send accounting records to all servers), run the nv set system aaa tacacs accounting send-records all command.
Edit the /etc/audisp/audisp-tac_plus.conf file and change the acct_all parameter to 0:
To reset to the default configuration (send accounting records to all servers), change the value of acct_all to 1 (acct_all=1).
To disable TACACS+ accounting:
cumulus@switch:~$ nv set system aaa tacacs accounting enable off
cumulus@switch:~$ nv config apply
Edit the /etc/audisp/plugins.d/audisp-tacplus.conf file and change the active parameter to no:
cumulus@switch:~$ sudo nano /etc/audisp/plugins.d/audisp-tacplus.conf
...
# default to enabling tacacs accounting; change to no to disable
active = no
Restart auditd:
cumulus@switch:~$ sudo systemctl restart auditd
Local Fallback Authentication
You can configure the switch to allow local fallback authentication for a user when the TACACS servers are unreachable, do not include the user for authentication, or have the user in the exclude user list.
To allow local fallback authentication for a user, add a local privileged user account on the switch with the same username as a TACACS user. A local user is always active even when the TACACS service is not running.
NVUE does not provide commands to configure local fallback authentication.
To configure local fallback authentication:
Edit the /etc/nsswitch.conf file to remove the keyword tacplus from the line starting with passwd. (You need to add the keyword back in step 3.)
The following example shows the /etc/nsswitch.conf file with no tacplus keyword in the line starting with passwd.
cumulus@switch:~$ sudo nano /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.
passwd: files
group: tacplus files
shadow: files
gshadow: files
...
To enable the local privileged user to run sudo and NVUE commands, run the adduser commands shown below. In the example commands, the TACACS account name is tacadmin.
The first adduser command prompts for information and a password. You can skip most of the requested information by pressing ENTER.
Edit the /etc/nsswitch.conf file to add the keyword tacplus back to the line starting with passwd (the keyword you removed in the first step).
cumulus@switch:~$ sudo nano /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.
passwd: tacplus files
group: tacplus files
shadow: files
gshadow: files
...
Restart the nvued service with the following command:
cumulus@switch:~$ sudo systemctl restart nvued
TACACS+ Per-command Authorization
TACACS+ per-command authorization lets you configure the commands that TACACS+ users at different privilege levels can run.
To reach the TACACS+ server through the default VRF, you must specify the egress interface you use in the default VRF. Either run the NVUE nv set system aaa tacacs vrf <interface> command (for example, nv set system aaa tacacs vrf swp51) or set the vrf=<interface> option in the /etc/tacplus_servers file (for example, vrf=swp51).
The following command allows TACACS+ users at privilege level 0 to run the nv and ip commands (if authorized by the TACACS+ server):
cumulus@switch:~$ nv set system aaa tacacs authorization 0 command ip
cumulus@switch:~$ nv set system aaa tacacs authorization 0 command nv
cumulus@switch:~$ nv config apply
To show the per-command authorization settings, run the nv show system aaa tacacs authorization command:
cumulus@switch:~$ nv show system aaa tacacs authorization
Privilege Level role command
--------------- ------------ -------
0 nvue-monitor ip
nv
tacuser0@switch:~$ sudo tacplus-restrict -i -u tacacs0 -a ip nv
The tacplus-auth command handles authorization for each command. To make this an enforced authorization, change the TACACS+ log in to use a restricted shell, with a very limited executable search path. Otherwise, the user can bypass the authorization. The tacplus-restrict utility simplifies setting up the restricted environment.
The following table provides the tacplus-restrict command options:
Option
Description
-i
Initializes the environment. You only need to issue this option one time per username.
-a
You can invoke the utility with the -a option as often as you like. For each command in the -a list, the utility creates a symbolic link from tacplus-auth to the relative portion of the command name in the local bin subdirectory. You also need to enable these commands on the TACACS+ server (refer to your TACACS+ server documentation). It is common for the server to allow some options to a command, but not others.
-f
Re-initializes the environment. If you need to restart, run the -f option with -i to force re-initialization; otherwise, the utility ignores repeated use of -i. During initialization: - The user shell changes to /bin/rbash. - The utility saves any existing dot files.
After running this command, examine the tacacs0 directory::
cumulus@switch:~$ sudo ls -lR ~tacacs0
total 12
lrwxrwxrwx 1 root root 22 Nov 21 22:07 ip -> /usr/sbin/tacplus-auth
lrwxrwxrwx 1 root root 22 Nov 21 22:07 nv -> /usr/sbin/tacplus-auth
Except for shell built-ins, privilege level 0 TACACS users can only run the ip and nv commands.
If you add commands with the -a option by mistake, you can remove them. The example below removes the nv command:
cumulus@switch:~$ sudo rm ~tacacs0/bin/nv
To remove all commands:
cumulus@switch:~$ sudo rm ~tacacs0/bin/*
Remove the TACACS+ Client Packages
To remove all the TACACS+ client packages, use the following commands:
Run the following commands to show TACACS+ configuration:
To show all TACACS+ configuration (NVUE hides server secret keys), run the nv show aaa tacacs command.
To show TACACS+ authentication configuration , run the nv show system aaa tacacs authentication command.
To show TACACS+ accounting configuration , run the nv show system aaa tacacs accounting command.
To show TACACS+ server configuration, run the nv show system aaa tacacs server command.
To show TACACS+ server priority configuration, run the nv show system aaa tacacs server <priority-id> command.
To show the list of users excluded from TACACS+ server authentication, run the nv show system aaa tacacs exclude-user command.
The following example command shows all TACACS+ configuration:
cumulus@switch:~$ nv show system aaa tacacs
applied
------------------ -------
enable off
debug-level 0
timeout 5
vrf mgmt
accounting
enable off
authentication
mode pap
per-user-homedir off
[server] 5
[server] 10
The following command shows the list of users excluded from TACACS+ server authentication:
cumulus@switch:~$ nv show system aaa tacacs exclude-user
applied
-------- -------
username USER1
Basic Server Connectivity or NSS Issues
You can use the getent command to determine if you configured TACACS+ correctly and if the local password is in the configuration files. In the example commands below, the cumulus user represents the local user, while cumulusTAC represents the TACACS user.
To look up the username within all NSS methods:
cumulus@switch:~$ sudo getent passwd cumulusTAC
cumulusTAC:x:1016:1001:TACACS+ mapped user at privilege level 15,,,:/home/tacacs15:/bin/bash
To look up the user within the local database only:
To look up the user within the TACACS+ database only:
cumulus@switch:~$ sudo getent -s tacplus passwd cumulusTAC
cumulusTAC:x:1016:1001:TACACS+ mapped user at privilege level 15,,,:/home/tacacs15:/bin/bash
If TACACS+ is not working correctly, you can use debugging. Add the debug=1 parameter to the /etc/tacplus_servers and /etc/tacplus_nss.conf files; see the Linux Commands under Optional TACACS+ Configuration above. You can also add debug=1 to individual pam_tacplus lines in /etc/pam.d/common*.
All log messages are in /var/log/syslog.
Incorrect Shared Key
The TACACS client on the switch and the TACACS server must have the same shared secret key. If this key is incorrect, the following message prints to syslog:
2017-09-05T19:57:00.356520+00:00 leaf01 sshd[3176]: nss_tacplus: TACACS+ server 192.168.0.254:49 read failed with protocol error (incorrect shared secret?) user cumulus
Debug Issues with Per-command Authorization
To debug TACACS user command authorization, have the TACACS+ user enter the following command at a shell prompt, then try the command again:
tacuser0@switch:~$ export TACACSAUTHDEBUG=1
When you enable debugging, the command authorization conversation with the TACACS+ server shows additional information.
To disable debugging:
tacuser0@switch:~$ export -n TACACSAUTHDEBUG
Debug Issues with Accounting Records
If you add or delete TACACS+ servers from the configuration files, make sure you notify the audisp plugin with this command:
If accounting records do not send, add debug=1 to the /etc/audisp/audisp-tac_plus.conf file, then run the command above to notify the plugin. Ask the TACACS+ user to run a command and examine the end of /var/log/syslog for messages from the plugin. You can also check the auditing log file /var/log/audit audit.log to be sure the auditing records exist. If the auditing records do not exist, restart the audit daemon with:
Cumulus Linux uses the following packages for TACACS.
Package
Description
audisp-tacplus
Uses auditing data from auditd to send accounting records to the TACACS+ server and starts as part of auditd.
libtac2
Provides basic TACACS+ server utility and communication routines.
libnss-tacplus
Provides an interface between libc username lookups, the mapping functions, and the TACACS+ server.
tacplus-auth
Includes the tacplus-restrict setup utility, which enables you to perform per-command TACACS+ authorization. Per-command authorization is not the default.
libpam-tacplus
Provides a modified version of the standard Debian package.
libtacplus-map1
Provides mapping between local and TACACS+ users on the server. The package:- Sets the immutable sessionid and auditing UID to ensure that you can track the original user through multiple processes and privilege changes.- Sets the auditing loginuid as immutable.- Creates and maintains a status database in /run/tacacs_client_map to manage and lookup mappings.
libsimple-tacacct1
Provides an interface for programs to send accounting records to the TACACS+ server. audisp-tacplus uses this package.
libtac2-bin
Provides the tacc testing program and TACACS+ man page.
TACACS+ Client Configuration Files
The following table describes the TACACS+ client configuration files that Cumulus Linux uses.
Filename
Description
/etc/tacplus_servers
The primary file that requires configuration after installation. All packages with include=/etc/tacplus_servers parameters use this file. Typically, this file contains the shared secrets; make sure that the Linux file mode is 600.
/etc/nsswitch.conf
When the libnss_tacplus package installs, this file configures tacplus lookups through libnss_tacplus. If you replace this file by automation, you need to add tacplus as the first lookup method for the passwd database line.
/etc/tacplus_nss.conf
Sets the basic parameters for libnss_tacplus. The file includes a debug variable for debugging NSS lookups separately from other client packages.
/usr/share/pam-configs/tacplus
The configuration file for pam-auth-update to generate the files in the next row. The file uses these configurations at login, by su, and by ssh.
/etc/pam.d/common-*
The /etc/pam.d/common-* files update for tacplus authentication. The files update with pam-auth-update when you install or remove libpam-tacplus.
/etc/sudoers.d/tacplus
Allows TACACS+ privilege level 15 users to run commands with sudo. The file includes an example (commented out) of how to enable privilege level 15 TACACS users to use sudo without a password and provides an example of how to enable all TACACS users to run specific commands with sudo. Only edit this file with the visudo -f /etc/sudoers.d/tacplus command.
/etc/audisp/plugins.d/audisp-tacplus.conf
The audisp plugin configuration file. You do not need to modify this file.
/etc/audisp/audisp-tac_plus.conf
The TACACS+ server configuration file for accounting. You do not need to modify this file. You can use this configuration file when you only want to debug TACACS+ accounting issues, not all TACACS+ users.
/etc/audit/rules.d/audisp-tacplus.rules
The auditd rules for TACACS+ accounting. The augenrules command uses all rule files to generate the rules file.
/etc/audit/audit.rules
The audit rules file that generate when you install auditd.
Considerations
Multiple TACACS+ Users
If two or more TACACS+ users log in simultaneously with the same privilege level, while the accounting records are correct, a lookup on either name matches both users, while a UID lookup only returns the user that logs in first.
As a result, any processes that either user runs apply to both and all files either user creates apply to the first name matched. This is similar to adding two local users to the password file with the same UID and GID and is an inherent limitation of using the UID for the base user from the password file.
The current algorithm returns the first name matching the UID from the mapping file; either the first or the second user that logs in.
To work around this issue, you can use the switch audit log or the TACACS server accounting logs to determine which processes and files each user creates.
For commands that do not execute other commands (for example, changes to configurations in an editor or actions with tools like clagctl and vtysh), there is no additional accounting.
Per-command authorization is at the most basic level (Cumulus Linux uses standard Linux user permissions for the local TACACS users and only privilege level 15 users can run sudo commands by default).
The Linux auditd system does not always generate audit events for processes when terminated with a signal (with the kill system call or internal errors such as SIGSEGV). As a result, processes that exit on a signal that you do not handle, generate a STOP accounting record.
Issues with the deluser Command
TACACS+ and other non-local users that run the deluser command with the --remove-home option see the following error:
tacuser0@switch: deluser --remove-home USERNAME
userdel: cannot remove entry 'USERNAME' from /etc/passwd
/usr/sbin/deluser: `/usr/sbin/userdel USERNAME' returned error code 1. Exiting
The command does remove the home directory. The user can still log in on that account but does not have a valid home directory. This is a known upstream issue with the deluser command for all non-local users.
Only use the --remove-home option with the user_homedir=1 configuration command.
Both TACACS+ and RADIUS AAA Clients
When you install both the TACACS+ and the RADIUS AAA client, Cumulus Linux does not attempt RADIUS login. As a workaround, do not install both the TACACS+ and the RADIUS AAA client on the same switch.
TACACS+ and PAM
PAM modules and an updated version of the libpam-tacplus package configure authentication initially. When you install the package, the pam-auth-update command updates the PAM configuration in /etc/pam.d. If you make changes to your PAM configuration, you need to integrate these changes. If you also use LDAP with the libpam-ldap package, you need to edit the PAM configuration with the LDAP and TACACS ordering you prefer. The libpam-tacplus package ignore rules and the values in success=2 require adjustments to ignore LDAP rules.
The TACACS+ privilege attribute priv_lvl determines the privilege level for the user that the TACACS+ server returns during the user authorization exchange. The client accepts the attribute in either the mandatory or optional forms and also accepts priv-lvl as the attribute name. The attribute value must be a numeric string in the range 0 to 15, with 15 the most privileged level.
By default, TACACS+ users at privilege levels other than 15 cannot run sudo commands and can only run commands with standard Linux user permissions.
You can edit the /etc/pam.d/common-* files manually. However, if you run pam-auth-update again after making the changes, the update fails. Only configure /usr/share/pam-configs/tacplus, then run pam-auth-update.
NSS Plugin
With pam_tacplus, TACACS+ authenticated users can log in without a local account on the system using the NSS plugin that comes with the tacplus_nss package. The plugin uses the mapped tacplus information if the user is not in the local password file, provides the getpwnam() and getpwuid()entry points, and uses the TACACS+ authentication functions.
The plugin asks the TACACS+ server if it knows the user, and then for relevant attributes to determine the privilege level of the user. When you install the libnss_tacplus package, nsswitch.conf changes to set tacplus as the first lookup method for passwd. If you change the order, lookups return the local accounts, such as tacacs0
If TACACS+ server does not find the user, it uses the libtacplus.so exported functions to do a mapped lookup. The privilege level appends to tacacs and the lookup searches for the name in the local password file. For example, privilege level 15 searches for the tacacs15 user. If the TACACS+ server finds the user, it adds information for the user in the password structure.
If the TACACS+ server does not find the user, it decrements the privilege level and checks again until it reaches privilege level 0 (user tacacs0). This allows you to use only the two local users tacacs0 and tacacs15, for minimal configuration.
TACACS+ Client Sequencing
Cumulus Linux requires the following information at the beginning of the AAA sequence:
Whether the user is a valid TACACS+ user
The user privilege level
For non-local users (users not in the local password file) you need to send a TACACS+ authorization request as the first communication with the TACACS+ server, before authentication and before the user logging in requests a password.
You need to configure certain TACACS+ servers to allow authorization requests before authentication. Contact your TACACS+ server vendor for information.
Multiple Servers with Different User Accounts
If you configure multiple TACACS+ servers that have different user accounts:
TACACS+ authentication allows for fall through; if the first reachable server does not authenticate the user, the client tries the second server, and so on.
TACACS authorization does not fall through. If the first reachable server returns an unauthorized result, the client does not try the next server.
RADIUS AAA
Cumulus Linux provides add-on packages to enable RADIUS users to log into the switch transparently with minimal configuration. There is no need to create accounts or directories on the switch. Authentication uses PAM and includes login, ssh, sudo and su.
Install the RADIUS Packages
NVUE automatically installs the RADIUS AAA packages; you do not have to install the packages if you use NVUE commands to configure RADIUS AAA.
If you use Linux commands to configure RADIUS AAA, you must install the RADIUS libnss-mapuser and libpam-radius-auth packages before you start configuration. The packages are in the cumulus-local-apt-archive repository, which is embedded in the Cumulus Linux image. You can install the packages even when the switch is not connected to the internet.
After installation completes, either reboot the switch or run the sudo systemctl restart nvued command.
The nvshow group includes the radius_user account, and the nvset and nvapply groups. The sudo groups include the radius_priv_user account. This enables all RADIUS logins to run NVUE nv show commands and all privileged RADIUS users to also run nv set, nv unset, and nv apply commands, and to use sudo.
Required RADIUS Client Configuration
After you install the required RADIUS packages, configure the following required settings on the switch (the RADIUS client):
Set the IP address or hostname of at least one RADIUS server. You can specify a port for the server (optional). The default port number is 1812.
Set the secret key shared between the RADIUS server and client. If you include special characters in the key (such as $), you must enclose the key in single quotes (').
If you use NVUE commands to configure RADIUS, you must also:
Set the priority at which Cumulus Linux contacts a RADIUS server for load balancing. You can set a value between 1 and 100. The lower value is the higher priority.
Set the priority for the authentication order for local and RADIUS users. You can set a value between 1 and 100. The lower value is the higher priority.
Enable RADIUS.
After you configure any RADIUS settings with NVUE and you run nv config apply, you must restart the NVUE service with the sudo systemctl restart nvued.service command.
The following example commands set:
The IP address of the RADIUS server to 192.168.0.254 and the port to 42.
The secret to 'myradius$key'.
The priority at which Cumulus Linux contacts the RADIUS server to 10.
The authentication order to 10 so that RADIUS authentication has priority over local.
The RADIUS option to enable.
cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 port 42
cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 secret 'myradius$key'
cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 priority 10
cumulus@switch:~$ nv set system aaa authentication-order 10 radius
cumulus@switch:~$ nv set system aaa authentication-order 20 local
cumulus@switch:~$ nv set system aaa radius enable on
cumulus@switch:~$ nv config apply
Edit the /etc/pam_radius_auth.conf file to specify the hostname or IP address of at least one RADIUS server, and the shared secret you want to use to authenticate and encrypt communication with each server.
You must be able to resolve the hostname of the switch to an IP address. If you cannot find the hostname in DNS, you can add the hostname to the /etc/hosts file manually. Be aware that adding the hostname to the /etc/hosts file manually can cause problems because DHCP assigns the IP address, which can change at any time.
Cumulus Linux verifies multiple server configuration lines in the order listed. Other than memory, there is no limit to the number of RADIUS servers you can use.
The server port number is optional. The system looks up the port in the /etc/services file. However, you can override the ports in the /etc/pam_radius_auth.conf file.
Optional RADIUS Configuration
You can configure the following global RADIUS settings and server specific settings.
Setting
Description
vrf
The VRF you want to use to communicate with the RADIUS servers. This is typically the management VRF (mgmt), which is the default VRF on the switch. You cannot specify more than one VRF.
privilege-level
The minimum privilege level that determines if users can configure the switch with NVUE commands and sudo, or have read-only rights. The default privilege level is 15, which provides full administrator access. This is a global option only; you cannot set the minimum privilege level for specific RADIUS servers.
retransmit
The maximum number of retransmission attempts allowed for requests when a RADIUS authentication request times out. This is a global option only; you cannot set the number of retransmission attempts for specific RADIUS servers.
timeout
The timeout value when a server is slow or latencies are high. You can set a value between 1 and 60. The default timeout is 3 seconds. If you configure multiple RADIUS servers, you can set a global timeout for all servers.
source-ipv4source-ipv6
A specific interface to reach all RADIUS servers. To configure the source IP address for a specific RADIUS server, use the source-ip option.
debug
The debug option for troubleshooting. The debugging messages write to /var/log/syslog. When the RADIUS client is working correctly, you can disable the debug option. You enable the debug option globally for all the servers.
The following example configures global RADIUS settings:
cumulus@switch:~$ nv set system aaa radius vrf mgmt
cumulus@switch:~$ nv set system aaa radius privilege-level 10
cumulus@switch:~$ nv set system aaa radius retransmit 8
cumulus@switch:~$ nv set system aaa radius timeout 10
cumulus@switch:~$ nv set system aaa radius source-ipv4 192.168.1.10
cumulus@switch:~$ nv set system aaa radius debug enable
cumulus@switch:~$ nv config apply
The following example configures RADIUS settings for a specific RADIUS server:
cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 source-ip 192.168.1.10
cumulus@switch:~$ nv set system aaa radius server 192.168.0.254 timeout 10
cumulus@switch:~$ nv config apply
Setting
Description
vrf
The VRF you want to use to communicate with the RADIUS servers. This is typically the management VRF (mgmt), which is the default VRF on the switch. You cannot specify more than one VRF.
privilege-level
Determines the privilege level for the user on the switch.
timeout
The timeout value when a server is slow or latencies are high. You can set a value between 1 and 60. The default timeout is 3 seconds. If you configure multiple RADIUS servers, you can set a global timeout for all servers.
src_ip
A specific IPv4 or IPv6 interface to reach the RADIUS server. If you configure multiple RADIUS servers, you can configure a specific interface to reach all RADIUS servers.
debug
The debug option for troubleshooting. The debugging messages write to /var/log/syslog. When the RADIUS client is working correctly, you can disable the debug option. If you configure multiple RADIUS servers, you can enable the debug option globally for all the servers.
Edit the /etc/pam_radius_auth.conf file.
...
# Set the minimum privilege level in VSA attribute shell:privilege-level=VALUE
# default is 15, range is 0-15.
privilege-level 10
#
# Uncomment to enable debugging, can be used instead of altering pam files
debug
#
# Account for privileged radius user mapping. If you change it here, you need
# to change /etc/nss_mapuser.conf as well
mapped_priv_user radius_priv_user
# server[:port] shared_secret timeout (secs) src_ip
192.168.0.254:42 myradius$key 10 192.168.1.10
vrf-name mgmt
Enable Login without Local Accounts
NVUE does not provide commands to enable login without local accounts.
LDAP is not commonly used with switches and adding accounts locally is cumbersome, Cumulus Linux includes a mapping capability with the libnss-mapuser package.
Mapping uses two NSS (Name Service Switch) plugins, one for account name, and one for UID lookup. The installation process configures these accounts automatically in the /etc/nsswitch.conf file and removes them when you delete the package. See the nss_mapuser (8) man page for the full description of this plugin.
A username is mapped at login to a fixed account specified in the configuration file, with the fields of the fixed account used as a template for the user that is logging in.
For example, if you look up the name dave and the fixed account in the configuration file is radius\_user, and that entry in /etc/passwd is:
then the matching line that returns when you run getent passwd dave is:
cumulus@switch:~$ getent passwd dave
dave:x:1017:1002:dave mapped user:/home/dave:/bin/bash
The login process creates the home directory /home/dave if it does not already exist and populates it with the standard skeleton files by the mkhomedir_helper command.
The configuration file /etc/nss_mapuser.conf configures the plugins. The file includes the mapped account name, which is radius_user by default. You can change the mapped account name by editing the file. The nss_mapuser (5) man page describes the configuration file.
A flat file mapping derives from the session number assigned during login, which persists across su and sudo. Cumulus Linux removes the mapping at logout.
Local Fallback Authentication
NVUE does not provide commands to configure local fallback authentication.
If a site wants to allow local fallback authentication for a user when none of the RADIUS servers are reachable, you can add a privileged user account as a local account on the switch. The local account must have the same unique identifier as the privileged user and the shell must be the same.
To configure local fallback authentication:
Add a local privileged user account. For example, if the radius_priv_user account in the /etc/passwd file is radius_priv_user:x:1002:1001::/home/radius_priv_user:/sbin/radius_shell, run the following command to add a local privileged user account named johnadmin:
Cumulus Linux does not remove the RADIUS fixed account from the /etc/passwd or /etc/group file or the home directories. They remain in case of modifications to the account or files in the home directories.
To remove the home directories of the RADIUS users, obtain the list by running the following command:
cumulus@switch:~$ sudo ls -l /home | grep radius
For all users listed, except the radius_user, run the following command to remove the home directories:
USERNAME is the account name (the home directory relative portion). This command gives the following warning because the user is not listed in the /etc/passwd file.
userdel: cannot remove entry 'USERNAME' from /etc/passwd
/usr/sbin/deluser: `/usr/sbin/userdel USERNAME' returned error code 1. Exiting.
After you remove all the RADIUS users, run the command to remove the fixed account. If there are changes to the account in the /etc/nss_mapuser.conf file, use that account name instead of radius_user.
If two or more RADIUS users log in simultaneously, a UID lookup only returns the user that logs in first. Any process that either user runs applies to both, and all files that either user creates apply to the first name matched. This process is similar to adding two local users to the password file with the same UID and GID, and is an inherent limitation of using the UID for the fixed user from the password file. The current algorithm returns the first name matching the UID from the mapping file, which is either the first or second user that logs in.
When you install both the TACACS+ and the RADIUS AAA client, Cumulus Linux does not attempt the RADIUS login. As a workaround, do not install both the TACACS+ and the RADIUS AAA client on the same switch.
When the RADIUS server is reachable outside of the management VRF, such as the default VRF, you might see the following error message when you try to run sudo:
2008-10-31T07:06:36.191359+00:00 SW01 sudo: pam_radius_auth(sudo:auth): Bind for server 10.1.1.25 failed: Cannot assign requested address
2008-10-31T07:06:36.192307+00:00 sw01 sudo: pam_radius_auth(sudo:auth): No valid server found in configuration file /etc/pam_radius_auth.conf
The error occurs because sudo tries to authenticate to the RADIUS server through the management VRF. Before you run sudo, you must set the shell to the correct VRF:
Netfilter is the packet filtering framework in Cumulus Linux and other Linux distributions. You can use several different tools to configure ACLs in Cumulus Linux:
iptables, ip6tables, and ebtables are Linux userspace tools you use to administer filtering rules for IPv4 packets, IPv6 packets, and Ethernet frames (layer 2 using MAC addresses).
cl-acltool is a Cumulus Linux-specific userspace tool you use to administer filtering rules and configure default ACLs. cl-acltool operates on various configuration files and uses iptables, ip6tables, and ebtables to install rules into the kernel. In addition, cl-acltool programs rules in hardware for switch port interfaces, which iptables, ip6tables and ebtables cannot do on their own.
NVUE is a Cumulus Linux-specific userspace tool you can use to configure custom ACLs.
Traffic Rules
Chains
Netfilter describes the way that the Linux kernel classifies and controls packets to, from, and across the switch. Netfilter does not require a separate software daemon to run; it is part of the Linux kernel. Netfilter asserts policies at layer 2, 3 and 4 of the OSI model by inspecting packet and frame headers according to a list of rules. The iptables, ip6tables, and ebtables userspace applications provide syntax you use to define rules.
The rules inspect or operate on packets at several points (chains) in the life of the packet through the system:
PREROUTING touches packets before the switch routes them.
INPUT touches packets after the switch determines that the packets are for the local system but before the control plane software receives them.
FORWARD touches transit traffic as it moves through the switch.
OUTPUT touches packets from the control plane software before they leave the switch.
POSTROUTING touches packets immediately before they leave the switch but after a routing decision.
Tables
When you build rules to affect the flow of traffic, tables can access the individual chains. Linux provides three tables by default:
Filter classifies traffic or filters traffic
NAT applies Network Address Translation rules
Mangle alters packets as they move through the switch
Each table has a set of default chains that modify or inspect packets at different points of the path through the switch. Chains contain the individual rules to influence traffic.
Rules
Rules classify the traffic you want to control. You apply rules to chains, which attach to tables.
Rules have several different components:
Table: The first argument is the table.
Chain: The second argument is the chain. Each table supports several different chains. See Tables above.
Matches: The third argument is the match. You can specify multiple matches in a single rule. However, the more matches you use in a rule, the more memory the rule consumes.
Jump: The jump specifies the target of the rule; what action to take if the packet matches the rule. If you omit this option in a rule, matching the rule has no effect on the packet, but the counters on the rule increment.
Targets: The target is a user-defined chain (other than the one this rule is in), one of the special built-in targets that decides the fate of the packet immediately (like DROP), or an extended target. See Supported Rule Types below for different target examples.
How Rules Parse and Apply
The switch reads all the rules from each chain from iptables, ip6tables, and ebtables and enters them in order into either the filter table or the mangle table. The switch reads the rules from the kernel in the following order:
IPv6 (ip6tables)
IPv4 (iptables)
ebtables
When you combine and put rules into one table, the order determines the relative priority of the rules; iptables and ip6tables have the highest precedence and ebtables has the lowest.
The Linux packet forwarding construct is an overlay for how the silicon underneath processes packets. Be aware of the following:
The switch silicon reorders rules when switchd writes to the ASIC, whereas traditional iptables execute the list of rules in order.
All rules, except for POLICE and SETCLASS rules, are terminating; after a rule matches, the action occurs and no more rules process.
When processing traffic, rules affecting the FORWARD chain that specify an ingress interface process before rules that match on an egress interface. As a workaround, rules that only affect the egress interface can have an ingress interface wildcard (only swp+ and bond+) that matches any interface you apply so that you can maintain order of operations with other input interface rules. For example, with the following rules:
-A FORWARD -i swp1 -j ACCEPT
-A FORWARD -o swp1 -j ACCEPT <-- This rule processes LAST (because of egress interface matching)
-A FORWARD -i swp2 -j DROP
If you modify the rules like this, they process in order:
-A FORWARD -i swp1 -j ACCEPT
-A FORWARD -i swp+ -o $PORTA -j ACCEPT <-- These rules are performed in order (because of wildcard match on the ingress interface)
-A FORWARD -i swp2 -j DROP
When using rules that do a mangle and a filter lookup for a packet, Cumulus Linux processes them in parallel and combines the action.
If there is no ingress interface or egress interface match, Cumulus Linux installs FORWARD chain rules in ingress by default.
When using the OUTPUT chain, you must assign rules to the source. For example, if you assign a rule to the switch port in the direction of traffic but the source is a bridge (VLAN), the rule does not affect the traffic and you must apply it to the bridge.
If you need to apply a rule to all transit traffic, use the FORWARD chain, not the OUTPUT chain.
The switch puts ebtable rules into either the IPv4 or IPv6 memory space depending on whether the rule uses IPv4 or IPv6 to make a decision. The switch only puts layer 2 rules that match the MAC address into the IPv4 memory space.
Rule Placement in Memory
INPUT and ingress (FORWARD -i) rules occupy the same memory space. A rule counts as ingress if you set the -i option. If you set both input and output options (-i and -o), the switch considers the rule as ingress and occupies that memory space. For example:
If you remove the -o option and the interface, it is a valid rule.
Nonatomic Update Mode and Atomic Update Mode
Cumulus Linux enables atomic update mode by default. However, this mode limits the number of ACL rules that you can configure.
To increase the number of configurable ACL rules, configure the switch to operate in nonatomic mode.
Instead of reserving 50% of your TCAM space for atomic updates, nonatomic mode runs incremental updates that use available free space to write the new TCAM rules, then swap over to the new rules. Cumulus Linux deletes the old rules and frees up the original TCAM space. If there is insufficient free space to complete this task, the regular nonatomic (non-incremental) update runs, which interrupts traffic.
Nonatomic updates offer better scaling because all TCAM resources actively impact traffic. With atomic updates, half of the hardware resources are on standby and do not actively impact traffic.
Incremental nonatomic updates are table based, so they do not interrupt network traffic when you install new rules. The rules map to the following tables and update in this order:
mirror (ingress only)
ipv4-mac (can be both ingress and egress)
ipv6 (ingress only)
The incremental nonatomic update operation follows this order:
Updates are incremental, one table at a time without stopping traffic.
Cumulus Linux checks if the rules in a table are different from installation time; if a table does not have any changes, it does not reinstall the rules.
If there are changes in a table, the new rules populate in new groups or slices in hardware, then that table switches over to the new groups or slices.
Finally, old resources for that table free up. This process repeats for each of the tables listed above.
If there are insufficient resources to hold both the new rule set and old rule set, Cumulus Linux tries regular nonatomic mode, which interrupts network traffic.
If the regular nonatomic update fails, Cumulus Linux reverts back to the previous rules.
To set nonatomic mode:
cumulus@switch:~$ nv set system acl mode non-atomic
cumulus@switch:~$ nv config apply
Edit the /etc/cumulus/switchd.conf file to add acl.non_atomic_update_mode = TRUE, then reload switchd for the changes to take effect:
Reloading switchd does not interrupt network services.
During regular non-incremental nonatomic updates, traffic stops, then continues after all the new configuration is in the hardware.
iptables, ip6tables, and ebtables
Do not use iptables, ip6tables, ebtables directly; installed rules only apply to the Linux kernel and Cumulus Linux does not hardware accelerate. When you run cl-acltool -i, Cumulus Linux resets all rules and deletes anything that is not in /etc/cumulus/acl/policy.conf.
For example, the following rule appears to work:
cumulus@switch:~$ sudo iptables -A INPUT -p icmp --icmp-type echo-request -j DROP
The cl-acltool -L command shows the rule:
cumulus@switch:~$ sudo cl-acltool -L ip
-------------------------------
Listing rules of type iptables:
-------------------------------
TABLE filter :
Chain INPUT (policy ACCEPT 72 packets, 5236 bytes)
pkts bytes target prot opt in out source destination
0 0 DROP icmp -- any any anywhere anywhere icmp echo-request
However, Cumulus Linux does not synchronize the rule to hardware. Running cl-acltool -i or reboot removes the rule without replacing it. To ensure that Cumulus Linux hardware accelerates all rules that can be in hardware, add them to /etc/cumulus/acl/policy.conf and install them with the cl-acltool -i command.
Estimate the Number of Rules
To estimate the number of rules you can create from an ACL entry, first determine if the ACL entry is ingress or egress. Then, determine if the entry is an IPv4-mac or IPv6 type rule. This determines the slice to which the rule belongs. Use the following to determine how many entries the switch uses for each type.
By default, each entry occupies one double wide entry, except if the entry is one of the following:
An entry with multiple comma-separated input interfaces splits into one rule for each input interface. For example, this entry splits into two rules:
-A FORWARD -i swp1s0,swp1s1 -p icmp -j ACCEPT
An entry with multiple comma-separated output interfaces splits into one rule for each output interface. This entry splits into two rules:
-A FORWARD -i swp+ -o swp1s0,swp1s1 -p icmp -j ACCEPT
An entry with both input and output comma-separated interfaces splits into one rule for each combination of input and output interface. This entry splits into four rules:
-A FORWARD -i swp1s0,swp1s1 -o swp1s2,swp1s3 -p icmp -j ACCEPT
An entry with multiple layer 4 port ranges splits into one rule for each range. For example, this entry splits into two rules:
You can match on VLAN IDs on layer 2 interfaces for ingress rules. The following example matches on a VLAN and DSCP class, and sets the internal class of the packet. For extended matching on IP fields, combine this rule with ingress iptable rules.
[ebtables]
-A FORWARD -p 802_1Q --vlan-id 100 -j mark --mark-set 102
[iptables]
-A FORWARD -i swp31 -m mark --mark 102 -m dscp --dscp-class CS1 -j SETCLASS --class 2
Cumulus Linux reserves mark values between 0 and 100; for example, if you use --mark-set 10, you see an error. Use mark values between 101 and 4196.
You cannot mark multiple VLANs with the same value.
If you enable EVPN-MH and configure VLAN match rules in ebtables with a mark target, the ebtables rule might overwrite the mark set by traffic class rules you configure for EVPN-MH on ingress. Egress EVPN MH traffic class rules that match the ingress traffic class mark might not get hit. To work around this issue, add ebtable rules to ACCEPT the packets already marked by EVPN-MH traffic class rules on ingress.
Install and Manage ACL Rules with NVUE
Instead of crafting a rule by hand, then installing it with cl-acltool, you can use NVUE commands. Cumulus Linux converts the commands to the /etc/cumulus/acl/policy.d/50_nvue.rules file. The rules you create with NVUE are independent of the default files /etc/cumulus/acl/policy.d/00control_plane.rules and 99control_plane_catch_all.rules.
Cumulus Linux 5.0 and later uses the -t mangle -A PREROUTING chain for ingress rules and the -t mangle -A POSTROUTING chain for egress rules instead of the - A FORWARD chain used in previous releases.
To create this rule with NVUE, follow the steps below. NVUE adds all options in the rule automatically.
Set the rule type, the matching protocol, source IP address and port, destination IP address and port, and the action. You must provide a name for the rule (EXAMPLE1 in the commands below):
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip source-ip 10.0.14.2/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp source-port ANY
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip dest-ip 10.0.15.8/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port ANY
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action permit
Apply the rule to an inbound or outbound interface with the nv set interface <interface> acl command.
For rules affecting the -t mangle -A PREROUTING chain (-A FORWARD in previous releases), apply the rule to an inbound or outbound interface: For example:
To see the installed rule, either examine the /etc/cumulus/acl/policy.d/50_nvue.rules file or run the NVUE nv show acl <rule-name> rule <ID> command:
cumulus@switch:~$ sudo cat /etc/cumulus/acl/policy.d/50_nvue.rules
[iptables]
## ACL EXAMPLE1 in dir inbound on interface swp1 ##
-t mangle -A PREROUTING -i swp1 -s 10.0.14.2/32 -d 10.0.15.8/32 -p tcp -j ACCEPT
...
cumulus@switch:~$ nv show acl EXAMPLE1 rule 10
operational applied
------------------- ------------ ------------
match
ip
source-ip 10.0.14.2/32 10.0.14.2/32
dest-ip 10.0.15.8/32 10.0.15.8/32
protocol tcp tcp
tcp
[source-port] ANY ANY
[dest-port] ANY ANY
To remove this rule, run the nv unset acl <acl-name> and nv unset interface <interface> acl <acl-name> commands. These commands delete the rule from the /etc/cumulus/acl/policy.d/50_nvue.rules file.
To show ACL statistics per interface, such as the total number of bytes that match the ACL rule, run the nv show interface <interface-id> acl <acl-id> statistics or nv show interface <interface-id> acl <acl-id> statistics <rule-id> command.
To see the list of all NVUE ACL commands, run the nv list-commands acl command.
Install and Manage ACL Rules with cl-acltool
You can manage Cumulus Linux ACLs with cl-acltool. Rules write first to the iptables chains, as described above, and then synchronize to hardware through switchd.
To examine the current state of chains and list all installed rules, run:
cumulus@switch:~$ sudo cl-acltool -L all
-------------------------------
Listing rules of type iptables:
-------------------------------
TABLE filter :
Chain INPUT (policy ACCEPT 432K packets, 31M bytes)
pkts bytes target prot opt in out source destination
0 0 DROP all -- swp+ any 240.0.0.0/5 anywhere
0 0 DROP all -- swp+ any 127.0.0.0/8 anywhere
0 0 DROP all -- swp+ any base-address.mcast.net/4 anywhere
0 0 DROP all -- swp+ any 255.255.255.255 anywhere
0 0 ACCEPT all -- swp+ any anywhere anywhere
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 457K packets, 35M bytes)
pkts bytes target prot opt in out source destination
...
To list installed rules using native iptables, ip6tables and ebtables, use the -L option with the respective commands:
If the install fails, ACL rules in the kernel and hardware roll back to the previous state. You also see errors from programming rules in the kernel or ASIC.
Install Packet Filtering (ACL) Rules
cl-acltool takes access control list (ACL) rule input in files. Each ACL policy file includes iptables, ip6tables and ebtables categories under the tags [iptables], [ip6tables] and [ebtables]. You must assign each rule in an ACL policy to one of the rule categories.
See man cl-acltool(5) for ACL rule details. For iptables rule syntax, see man iptables(8). For ip6tables rule syntax, see man ip6tables(8). For ebtables rule syntax, see man ebtables(8).
See man cl-acltool(5) and man cl-acltool(8) for more details on using cl-acltool.
By default:
ACL policy files are in /etc/cumulus/acl/policy.d/.
All *.rules files in /etc/cumulus/acl/policy.d/ directory are also in /etc/cumulus/acl/policy.conf.
All files in the policy.conf file install when the switch boots up.
The policy.conf file expects rule files to have a .rules suffix as part of the file name.
Here is an example ACL policy file:
[iptables]
-A INPUT -i swp1 -p tcp --dport 80 -j ACCEPT
-A FORWARD -i swp1 -p tcp --dport 80 -j ACCEPT
[ip6tables]
-A INPUT -i swp1 -p tcp --dport 80 -j ACCEPT
-A FORWARD -i swp1 -p tcp --dport 80 -j ACCEPT
[ebtables]
-A INPUT -p IPv4 -j ACCEPT
-A FORWARD -p IPv4 -j ACCEPT
You can use wildcards or variables to specify chain and interface lists.
You can only use swp+ and bond+ as wildcard names.
swp+ rules apply as an aggregate, not per port. If you want to apply per port policing, specify a specific port instead of the wildcard.
You can write ACL rules for the system into multiple files under the default /etc/cumulus/acl/policy.d/ directory. The ordering of rules during installation follows the sort order of the files according to their file names.
Use multiple files to stack rules. The example below shows two rule files that separate rules for management and datapath traffic:
cumulus@switch:~$ ls /etc/cumulus/acl/policy.d/
00sample_mgmt.rules 01sample_datapath.rules
cumulus@switch:~$ cat /etc/cumulus/acl/policy.d/00sample_mgmt.rules
INGRESS_INTF = swp+
INGRESS_CHAIN = INPUT
[iptables]
# protect the switch management
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 10.0.14.2 -d 10.0.15.8 -p tcp -j ACCEPT
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 10.0.11.2 -d 10.0.12.8 -p tcp -j ACCEPT
-A $INGRESS_CHAIN -i $INGRESS_INTF -d 10.0.16.8 -p udp -j DROP
cumulus@switch:~$ cat /etc/cumulus/acl/policy.d/01sample_datapath.rules
INGRESS_INTF = swp+
INGRESS_CHAIN = INPUT, FORWARD
[iptables]
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 192.0.2.5 -p icmp -j ACCEPT
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 192.0.2.6 -d 192.0.2.4 -j DROP
-A $INGRESS_CHAIN -i $INGRESS_INTF -s 192.0.2.2 -d 192.0.2.8 -j DROP
Apply all rules and policies included in /etc/cumulus/acl/policy.conf:
cumulus@switch:~$ sudo cl-acltool -i
Specify the Policy Files to Install
By default, Cumulus Linux installs any .rules file you configure in /etc/cumulus/acl/policy.d/. To add other policy files to an ACL, you need to include them in /etc/cumulus/acl/policy.conf. For example, for Cumulus Linux to install a rule in a policy file called 01_new.datapathacl, add include /etc/cumulus/acl/policy.d/01_new.rules to policy.conf:
cumulus@switch:~$ sudo nano /etc/cumulus/acl/policy.conf
#
# This file is a master file for acl policy file inclusion
#
# Note: This is not a file where you list acl rules.
#
# This file can contain:
# - include lines with acl policy files
# example:
# include <filepath>
#
# see manpage cl-acltool(5) and cl-acltool(8) for how to write policy files
#
include /etc/cumulus/acl/policy.d/01_new.datapathacl
Hardware Limitations for ACL Rules
The maximum number of rules that the switch hardware can store depends on:
The combination of IPv4 and IPv6 rules; Cumulus Linux does not support the maximum number of rules for both IPv4 and IPv6 simultaneously.
The number of default rules that Cumulus Linux provides.
Whether the rules apply on ingress or egress.
Whether the rules are in atomic or nonatomic mode; Cumulus Linux uses nonatomic mode rules when you enable nonatomic updates.
Other resources that share the same table space, such as multicast route entries and internal VLAN counters.
If you exceed the maximum number of rules or run out of related memory resources for the ACL table, cl-acltool -i generates one of the following errors:
error: hw sync failed (sync_acl hardware installation failed) Rolling back .. failed.
error: hw sync failed (Bulk counter init failed with No More Resources). Rolling back ..
NVIDIA Spectrum switches use a TCAM or ATCAM to quickly look up various tables that include ACLs, multicast routes, and certain internal VLAN counters. Depending on the size of the network ACLs, multicast routes, and VLAN counters, you might need to adjust some parameters to fit your network requirements into the tables.
TCAM Profiles on Spectrum 1
The NVIDIA Spectrum 1 ASIC (model numbers 2xx0) has one common TCAM space for both ingress and egress ACLs, which the switch also uses for multicast route entries.
Cumulus Linux controls the ACL and multicast route entry scale on NVIDIA Spectrum 1 switches with different TCAM profiles in combination with the ACL atomic and nonatomic update setting.
Profile
Atomic Mode IPv4 Rules
Atomic Mode IPv6 Rules
Nonatomic Mode IPv4 Rules
Nonatomic Mode IPv6 Rules
Multicast Route Entries
default
500
250
1000
500
1000
ipmc-heavy
750
500
1500
1000
8500
acl-heavy
1750
1000
3500
2000
450
ipmc-max
1000
500
2000
1000
13000
ip-acl-heavy
6000
0
12000
0
0
Even though the table above specifies the ip-acl-heavy profile supports no IPv6 rules, Cumulus Linux does not prevent you from configuring IPv6 rules. However, there is no guarantee that IPv6 rules work under the ip-acl-heavy profile.
The ip-acl-heavy profile shows an updated number of supported atomic mode and nonatomic mode IPv4 rules. The previously published numbers were 7500 for atomic mode and 15000 for nonatomic mode IPv4 rules.
To configure the profile you want to use, set the tcam_resource.profile parameter in the /etc/mlx/datapath/tcam_profile.conf file, then restart switchd:
Spectrum 1 TCAM resource profiles that control ACLs and multicast route scale are different from forwarding resource profiles that control MAC table, IPv4, and IPv6 entry scale.
ATCAM on Spectrum-2 and Later
Switches with Spectrum-2 and later use a newer KVD scheme and an ATCAM design that is more flexible and allows a higher ACL scale than Spectrum 1. There is no TCAM resource profile on Spectrum-2 and later.
The following table shows the tested ACL rule limits. Because the KVD and ATCAM space is shared with forwarding table entries, multicast route entries, and VLAN flow counters, these ACL limits might vary based on your use of other tables.
These limits are valid when using any Spectrum-2 and later forwarding profile, except for the l2-heavy-3 and v6-lpm-heavy1 profiles, which reduce the ACL scale significantly.
For Spectrum-2 and later, all profiles support the same number of rules.
If you see error messages similar to No More Resources .. Rolling back when you try to apply ACLs, refer to Troubleshooting ACL Rule Installation Failures for information on troublshooting and managing netfilter resources.
Supported Rule Types
The iptables/ip6tables/ebtables construct tries to layer the Linux implementation on top of the underlying hardware but they are not always directly compatible. Here are the supported rules for chains in iptables, ip6tables and ebtables.
To learn more about any of the options shown in the tables below, run iptables -h [name of option]. The same help syntax works for options for ip6tables and ebtables.
root@leaf1# ebtables -h tricolorpolice
...
tricolorpolice option:
--set-color-mode STRING setting the mode in blind or aware
--set-cir INT setting committed information rate in kbits per second
--set-cbs INT setting committed burst size in kbyte
--set-pir INT setting peak information rate in kbits per second
--set-ebs INT setting excess burst size in kbyte
--set-conform-action-dscp INT setting dscp value if the action is accept for conforming packets
--set-exceed-action-dscp INT setting dscp value if the action is accept for exceeding packets
--set-violate-action STRING setting the action (accept/drop) for violating packets
--set-violate-action-dscp INT setting dscp value if the action is accept for violating packets
Supported chains for the filter table:
INPUT FORWARD OUTPUT
Rules with input/output Ethernet interfaces do not apply Inverse matches
Standard Targets
ACCEPT, DROP
RETURN, QUEUE, STOP, Fall Thru, Jump
Extended Targets
LOG (IPv4/IPv6); UID is not supported for LOG TCP SEQ, TCP options or IP options ULOG SETQOS DSCP Unique to Cumulus Linux: SPAN ERSPAN (IPv4/IPv6) POLICE TRICOLORPOLICE SETCLASS
ebtables Rule Support
Rule Element
Supported
Unsupported
Matches
ether type input interface/wildcard output interface/wildcard Src/Dst MAC IP: src, dest, tos, proto, sport, dport IPv6: tclass, icmp6: type, icmp6: code range, src/dst addr, sport, dport 802.1p (CoS) VLAN
Rules that have no matches and accept all packets in a chain are currently ignored.
Chain default rules (that are ACCEPT) are also ignored.
Considerations
Splitting rules across the ingress TCAM and the egress TCAM causes the ingress IPv6 part of the rule to match packets going to all destinations, which can interfere with the regular expected linear rule match in a sequence. For example:
A higher rule can prevent a lower rule from matching:
Rule 1 matches all icmp6 packets from to all out interfaces in the ingress TCAM.
This prevents rule 2 from matching, which is more specific but with a different out interface. Make sure to put more specific matches above more general matches even if the output interfaces are different.
When you have two rules with the same output interface, the lower rule might match depending on the presence of the previous rules.
Rule 1: -A FORWARD -o vlan100 -p icmp6 -j ACCEPT
Rule 2: -A FORWARD -o vlan101 -s 00::01 -j DROP
Rule 3: -A FORWARD -o vlan101 -p icmp6 -j ACCEPT
Rule 3 still matches for an icmp6 packet with sip 00:01 going out of vlan101. Rule 1 interferes with the normal function of rule 2 and/or rule 3.
When you have two adjacent rules with the same match and different output interfaces, such as:
Rule 1: -A FORWARD -o vlan100 -p icmp6 -j ACCEPT
Rule 2: -A FORWARD -o vlan101 -p icmp6 -j DROP
Rule 2 never matches on ingress. Both rules share the same mark.
Common Examples
Data Plane Policers
You can configure quality of service for traffic on the data plane. By using QoS policers, you can rate limit traffic so incoming packets get dropped if they exceed specified thresholds.
Counters on POLICE ACL rules in iptables do not show dropped packets due to those rules.
The following example rate limits the incoming traffic on swp1 to 400 packets per second with a burst of 200 packets per second:
cumulus@switch:~$ nv set acl example1 type ipv4
cumulus@switch:~$ nv set acl example1 rule 10 action police
cumulus@switch:~$ nv set acl example1 rule 10 action police mode packet
cumulus@switch:~$ nv set acl example1 rule 10 action police burst 200
cumulus@switch:~$ nv set acl example1 rule 10 action police rate 400
cumulus@switch:~$ nv set interface swp1 acl example1 inbound
cumulus@switch:~$ nv config apply
Use the POLICE target with iptables. POLICE takes these arguments:
--set-rate value specifies the maximum rate in kilobytes (KB) or packets.
--set-burst value specifies the number of packets or kilobytes (KB) allowed to arrive sequentially.
--set-mode string sets the mode in KB (kilobytes) or pkt (packets) for rate and burst size.
For example, to rate limit the incoming traffic on swp1 to 400 packets per second with a burst of 200 packets per second and set this rule in your appropriate .rules file:
You can configure quality of service for traffic on the control plane and rate limit traffic so incoming packets drop if they exceed certain thresholds in the following ways:
Run NVUE commands.
Edit the /etc/cumulus/control-plane/policers.conf file.
Cumulus Linux 5.0 and later no longer uses INPUT chain rules to configure control plane policers.
To configure control plane policers:
Set the burst rate for the trap group with the nv set system control-plane policer <trap-group> burst <value> command. The burst rate is the number of packets or kilobytes (KB) allowed to arrive sequentially.
Set the forwarding rate for the trap group with the nv set system control-plane policer <trap-group> rate <value> command. The forwarding rate is the maximum rate in kilobytes (KB) or packets.
The trap group can be: arp, bfd, pim-ospf-rip, bgp, clag, icmp-def, dhcp-ptp, igmp, ssh, icmp6-neigh, icmp6-def-mld, lacp, lldp, rpvst, eapol, ip2me, acl-log, nat, stp, l3-local, span-cpu, catch-all, or NONE.
The following example changes the PIM trap group forwarding rate and burst rate to 400 packets per second, and the IGMP trap group forwarding rate to 400 packets per second and burst rate to 200 packets per second:
cumulus@switch:~$ nv set system control-plane policer pim-ospf-rip rate 400
cumulus@switch:~$ nv set system control-plane policer pim-ospf-rip burst 400
cumulus@switch:~$ nv set system control-plane policer pim-ospf-rip state on
cumulus@switch:~$ nv set system control-plane policer igmp rate 400
cumulus@switch:~$ nv set system control-plane policer igmp burst 200
cumulus@switch:~$ nv config apply
To rate limit traffic using the /etc/cumulus/control-plane/policers.conf file, you:
Enable an individual policer for a trap group (set enable to TRUE).
Set the policer rate in packets per second. The forwarding rate is the maximum rate in kilobytes (KB) or packets.
Set the policer burst rate in packets per second. The burst rate is the number of packets or kilobytes (KB) allowed to arrive sequentially.
After you edit the /etc/cumulus/control-plane/policers.conf file, you must reload the file with the /usr/lib/cumulus/switchdctl --load /etc/cumulus/control-plane/policers.conf command.
When enable is FALSE for a trap group, the trap group and catch-all trap group have a shared policer. When enable is TRUE, Cumulus Linux creates an individual policer for the trap group.
The following example changes the PIM trap group forwarding rate and burst rate to 400 packets per second, and the IGMP trap group forwarding rate to 400 packets per second and burst rate to 200 packets per second:
To show the control plane police configuration and statistics, run the NVUE nv show system control-plane policer --view=statistics command.
Cumulus Linux provides default control plane policer values. You can adjust these values to accommodate higher scale requirements for specific protocols as needed.
You can configure control plane ACLs to apply a single rule for all packets forwarded to the CPU regardless of the source interface or destination interface on the switch. Control plane ACLs allow you to regulate traffic forwarded to applications on the switch with more granularity than traps and to configure ACLs to block SSH from specific addresses or subnets.
Cumulus Linux applies inbound control plane ACLs in the INPUT chain and outbound control plane ACLs in the OUTPUT chain.
Cumulus Linux does not support a deny all control plane rule. This type of rule blocks traffic for interprocess communication and impacts overall system functionality.
The following example command applies the input control plane ACL called ACL1.
cumulus@switch:~$ nv set system control-plane acl ACL1 inbound
cumulus@switch:~$ nv config apply
The following example command applies the output control plane ACL called ACL2.
cumulus@switch:~$ nv set system control-plane acl ACL2 outbound
cumulus@switch:~$ nv config apply
To show statistics for all control-plane ACLs, run the nv show system control-plane acl command:
cumulus@switch:~$ nv show system control-plane acl
ACL Name Rule ID In Packets In Bytes Out Packets Out Bytes
--------- ------- ---------- -------- ----------- ---------
acl1 1 0 0 0 0
65535 0 0 0 0
acl2 1 0 0 0 0
65535 0 0 0 0
To show statistics for a specific control-plane ACL, run the nv show system control-plane acl <acl_name> statistics command:
cumulus@switch:~$ nv show system control-plane acl ACL1 statistics
Rule In Packet In Byte Out Packet Out Byte Summary
---- --------- ------- ---------- -------- ---------------------------
1 0 0 Bytes 0 0 Bytes match.ip.dest-ip: 9.1.2.3
2 0 0 Bytes 0 0 Bytes match.ip.source-ip: 7.8.2.3
Set DSCP on Transit Traffic
The examples here use the mangle table to modify the packet as it transits the switch. DSCP is in decimal notation in the examples below.
[iptables]
#Set SSH as high priority traffic.
-t mangle -A PREROUTING -i swp+ -p tcp -m multiport --dports 22 -j SETQOS --set-dscp 46
#Set everything coming in swp1 as AF13
-t mangle -A PREROUTING -i swp1 -j SETQOS --set-dscp 14
#Set Packets destined for 10.0.100.27 as best effort
-t mangle -A PREROUTING -i swp+ -d 10.0.100.27/32 -j SETQOS --set-dscp 0
#Example using a range of ports for TCP traffic
-t mangle -A PREROUTING -i swp+ -s 10.0.0.17/32 -d 10.0.100.27/32 -p tcp -m multiport --sports 10000:20000 -m multiport --dports 10000:20000 -j SETQOS --set-dscp 34
Apply the rule:
cumulus@switch:~$ sudo cl-acltool -i
To set SSH as high priority traffic:
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port 22
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action set dscp 46
cumulus@switch:~$ nv set interface swp1-48 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply
To set everything coming in swp1 as AF13:
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action set dscp 14
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply
To set Packets destined for 10.0.100.27 as best effort:
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip dest-ip 10.0.100.27/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action set dscp 0
cumulus@switch:~$ nv set interface swp1-48 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply
To use a range of ports for TCP traffic:
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip source-ip 10.0.0.17/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp source-port 10000:20000
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip dest-ip 10.0.100.27/32
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port 10000:20000
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action set dscp 34
cumulus@switch:~$ nv set interface swp1-48 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply
To specify all ports on the switch in NVUE (swp+ in an iptables rule), you must set the range of interfaces on the switch as in the examples above (nv set interface swp1-48). This command creates as many rules in the /etc/cumulus/acl/policy.d/50_nvue.rules file as the number of interfaces in the range you specify.
Filter Specific TCP Flags
The example rule below drops ingress IPv4 TCP packets when you set the SYN bit and reset the RST, ACK, and FIN bits. The rule applies inbound on interface swp1. After configuring this rule, you cannot establish new TCP sessions that originate from ingress port swp1. You can establish TCP sessions that originate from any other port.
-t mangle -A PREROUTING -i swp1 -p tcp --tcp-flags ACK,SYN,FIN,RST SYN -j DROP
Apply the rule:
cumulus@switch:~$ sudo cl-acltool -i
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp flags syn
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp mask rst
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp mask syn
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp mask fin
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 match ip tcp mask ack
cumulus@switch:~$ nv set acl EXAMPLE1 rule 20 action deny
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply
Control Who Can SSH into the Switch
Run the following commands to control who can SSH into the switch.
In the following example, 10.10.10.1/32 is the interface IP address (or loopback IP address) of the switch and 10.255.4.0/24 can SSH into the switch.
-A INPUT -i swp+ -s 10.255.4.0/24 -d 10.10.10.1/32 -j ACCEPT
-A INPUT -i swp+ -d 10.10.10.1/32 -j DROP
Apply the rule:
cumulus@switch:~$ sudo cl-acltool -i
cumulus@switch:~$ nv set acl example2 type ipv4
cumulus@switch:~$ nv set acl example2 rule 10 match ip source-ip 10.255.4.0/24
cumulus@switch:~$ nv set acl example2 rule 10 match ip dest-ip 10.10.10.1/32
cumulus@switch:~$ nv set acl example2 rule 10 action permit
cumulus@switch:~$ nv set acl example2 rule 20 match ip source-ip ANY
cumulus@switch:~$ nv set acl example2 rule 20 match ip dest-ip 10.10.10.1/32
cumulus@switch:~$ nv set acl example2 rule 20 action deny
cumulus@switch:~$ nv set system control-plane acl example2 inbound
cumulus@switch:~$ nv config apply
Block Traffic towards the eth0 Interface
To block traffic towards the eth0 interface, apply an ACL on the system control plane instead of on the eth0 interface. The following example creates an ACL called DENY-IN that blocks traffic from ingressing eth0 with source IP address 192.168.200.10:
cumulus@switch:~$ nv set acl DENY-IN rule 10 action deny
cumulus@switch:~$ nv set acl DENY-IN rule 10 match ip source-ip 192.168.200.10
cumulus@switch:~$ nv set acl DENY-IN type ipv4
cumulus@switch:~$ nv set system control-plane acl DENY-IN inbound
cumulus@switch:~$ nv config apply
Match on ECN Bits in the TCP IP Header
ECN allows end-to-end notification of network congestion without dropping packets. You can add ECN rules to match on the ECE, CWR, and ECT flags in the TCP IPv4 header.
By default, ECN rules match a packet with the bit set. You can reverse the match by using an explanation point (!).
Match on the ECE Bit
After an endpoint receives a packet with the CE bit set by a router, it sets the ECE bit in the returning ACK packet to notify the other endpoint that it needs to slow down.
To match on the ECE bit:
Create a rules file in the /etc/cumulus/acl/policy.d directory and add the following rule under [iptables]:
cumulus@switch:~$ nv set acl example2 type ipv4
cumulus@switch:~$ nv set acl example2 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl example2 rule 10 match ip ecn flags tcp-cwr
cumulus@switch:~$ nv set acl example2 rule 10 action permit
cumulus@switch:~$ nv set interface swp1 acl example2 inbound
cumulus@switch:~$ nv config apply
Match on the ECT Bit
The ECT codepoints negotiate if the connection is ECN capable by setting one of the two bits to 1. Routers also use the ECT bit to indicate that they are experiencing congestion by setting both the ECT codepoints to 1.
To match on the ECT bit:
Create a rules file in the /etc/cumulus/acl/policy.d directory and add the following rule under [iptables]:
cumulus@switch:~$ nv set acl example2 type ipv4
cumulus@switch:~$ nv set acl example2 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl example2 rule 10 match ip ecn ip-ect 1
cumulus@switch:~$ nv set acl example2 rule 10 action permit
cumulus@switch:~$ nv set interface swp1 acl example2 inbound
cumulus@switch:~$ nv config apply
Example Configuration
The following example demonstrates how Cumulus Linux applies several different rules.
Egress Rule
The following rule blocks any TCP traffic with destination port 200 going through leaf01 to server01 (rule 1 in the diagram above).
[iptables]
-t mangle -A POSTROUTING -o swp1 -p tcp -m multiport --dports 200 -j DROP
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port 200
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 outbound
cumulus@switch:~$ nv config apply
Ingress Rule
The following rule blocks any UDP traffic with source port 200 going from server01 through leaf01 (rule 2 in the diagram above).
[iptables]
-t mangle -A PREROUTING -i swp1 -p udp -m multiport --sports 200 -j DROP
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol udp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip udp source-port 200
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp1 acl EXAMPLE1 inbound
cumulus@switch:~$ nv config apply
Input Rule
The following rule blocks any UDP traffic with source port 200 and destination port 50 going from server02 to the leaf02 control plane (rule 3 in the diagram above).
[iptables]
-A INPUT -i swp2 -p udp -m multiport --dports 50 -j DROP
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol udp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip udp dest-port 50
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp2 acl EXAMPLE1 inbound control-plane
cumulus@switch:~$ nv config apply
Output Rule
The following rule blocks any TCP traffic with source port 123 and destination port 123 going from leaf02 to server02 (rule 4 in the diagram above).
[iptables]
-A OUTPUT -o swp2 -p tcp -m multiport --sports 123 -m multiport --dports 123 -j DROP
cumulus@switch:~$ nv set acl EXAMPLE1 type ipv4
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip protocol tcp
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp source-port 123
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 match ip tcp dest-port 123
cumulus@switch:~$ nv set acl EXAMPLE1 rule 10 action deny
cumulus@switch:~$ nv set interface swp2 acl EXAMPLE1 outbound control-plane
cumulus@switch:~$ nv config apply
Layer 2 Rules (ebtables)
The following rule blocks any traffic with source MAC address 00:00:00:00:00:12 and destination MAC address 08:9e:01:ce:e2:04 going from any switch port egress or ingress.
[ebtables]
-A FORWARD -s 00:00:00:00:00:12 -d 08:9e:01:ce:e2:04 -j DROP
cumulus@switch:~$ nv set acl EXAMPLE type mac
cumulus@switch:~$ nv set acl EXAMPLE rule 10 match mac source-mac 00:00:00:00:00:12
cumulus@switch:~$ nv set acl EXAMPLE rule 10 match mac dest-mac 08:9e:01:ce:e2:04
cumulus@switch:~$ nv set acl EXAMPLE rule 10 action deny
cumulus@switch:~$ nv set interface swp1-48 acl EXAMPLE inbound
cumulus@switch:~$ nv config apply
Considerations
Not All Rules Supported
Cumulus Linux does not support all iptables, ip6tables, or ebtables rules. Refer to Supported Rules for specific rule support.
ACL Log Policer Limits Traffic
To protect the CPU from overloading, Cumulus Linux limits traffic copied to the CPU to 1 packet per second by an ACL Log Policer.
Bridge Traffic Limitations
Bridge traffic that matches LOG ACTION rules do not log to syslog; the kernel and hardware identify packets using different information.
You Cannot Forward Log Actions
You cannot forward logged packets. The hardware cannot both forward a packet and send the packet to the control plane (or kernel) for logging. A log action must also have a drop action.
SPAN Sessions that Reference an Outgoing Interface
Because Cumulus Linux is a Linux operating system, you can use the iptables commands. However, consider using cl-acltool instead for the following reasons:
Without using cl-acltool, rules do not install into hardware.
Running cl-acltool -i (the installation command) resets all rules and deletes anything that is not in the /etc/cumulus/acl/policy.conf file.
For example, running the following command works:
cumulus@switch:~$ sudo iptables -A INPUT -p icmp --icmp-type echo-request -j DROP
The rules appear when you run cl-acltool -L:
cumulus@switch:~$ sudo cl-acltool -L ip
-------------------------------
Listing rules of type iptables:
-------------------------------
TABLE filter :
Chain INPUT (policy ACCEPT 72 packets, 5236 bytes)
pkts bytes target prot opt in out source destination
0 0 DROP icmp -- any any anywhere anywhere icmp echo-request
However, running cl-acltool -i or reboot removes them. To ensure that Cumulus Linux can hardware accelerate all rules that can be in hardware, place them in the /etc/cumulus/acl/policy.conf file, then run cl-acltool -i.
Where to Assign Rules
If you assign a switch port to a bond, you must assign any egress rules to the bond.
When using the OUTPUT chain, you must assign rules to the source. For example, if you assign a rule to the switch port in the direction of traffic but the source is a bridge (VLAN), the rule does not affect the traffic and you must apply the rule to the bridge.
If you need to apply a rule to all transit traffic, use the FORWARD chain, not the OUTPUT chain.
Troubleshooting ACL Rule Installation Failures
On Spectrum-2 and later, in addition to ACLs, items stored in KVD and ATCAM include internal counters for VLANs and interfaces in a bridge. If the network includes more than 1000 VLAN interfaces, the counters might occupy a significant amount of space and reduce the amount of available space for ACLs.
If netfilter ACL space is exhausted, you might see error messages similar to the following when you try to apply ACLs:
cumulus@switch:$ sudo cl-acltool -i -p 00control_plane.rules
Using user provided rule file 00control_plane.rules
Reading rule file 00control_plane.rules ...
Processing rules in file 00control_plane.rules ...
error: hw sync failed (sync_acl hardware installation failed)
Installing acl policy... Rolling back ..
failed.
error: hw sync failed (Bulk counter init failed with No More Resources). Rolling back ..
You might also see messages similar to the following in the /var/log/syslog file:
On Spectrum-2 and later, you might see resource errors when you try to configure more than 1000 VLAN interfaces because certain VLAN counters share space with ACL memory in the ATCAM.
To free up resources, you can:
Reduce the number of specified VLANs or VLAN interfaces to the number you really need in the network.
Free up VLAN flow counter space; edit the /etc/mlx/datapath/stats.conf file to uncomment and set the hal.mlx.stats.vlan.enable option to FALSE, then reload switchd:
cumulus@switch:$ sudo nano /etc/mlx/datapath/stats.conf
# Once the stat controls are enabled/disabled,
# run 'systemctl reload switchd' for changes to take effect
hal.mlx.stats.vlan.enable = FALSE
The flow counters are internal counters for debugging; you do not see the counters in nv show interface <interface> counters or cl-netstat commands.
To see how much space the flow counters consume, examine the Flow Counters line in the cl-resource-query output.
ACLs Do not Match when the Output Port on the ACL is a Subinterface
The ACL does not match on packets when you configure a subinterface as the output port. The ACL matches on packets only if the primary port is as an output port. If a subinterface is an output or egress port, the packets match correctly.
For example:
-A FORWARD -o swp49s1.100 -j ACCEPT
Egress ACL Matching on Bonds
Cumulus Linux does not support ACL rules that match on an outbound bond interface. For example, you cannot create the following rule:
[iptables]
-A FORWARD -o <bond_intf> -j DROP
To work around this issue, duplicate the ACL rule on each physical port of the bond. For example:
[iptables]
-A FORWARD -o <bond-member-port-1> -j DROP
-A FORWARD -o <bond-member-port-2> -j DROP
SSH Traffic to the Management VRF
To allow SSH traffic to the management VRF, use -i mgmt, not -i eth0. For example:
In INPUT chain rules, the -i swp+ match works only if the destination of the packet is towards a layer 3 swp interface; the match does not work if the packet terminates at an SVI interface (for example, vlan10). To allow traffic towards specific SVIs, use rules without any interface match or rules with individual -i <SVI> matches.
Services (also known as daemons) and processes are at the heart of how a Linux system functions. Most of the time, a service takes care of itself; you just enable and start it, then let it run. However, because a Cumulus Linux switch is a Linux system, you can dig deeper if you like. Services can start multiple processes as they run. Services are important to monitor on a Cumulus Linux switch.
You manage services in Cumulus Linux to identify all active or stopped services and the boot time state of a specific service, disable or enable a specific service, and identify active listener ports.
systemd and the systemctl Command
You manage services that use systemd with the systemctl command.
Command Options
Description
status
Returns the status of the specified service.
start
Starts the service.
stop
Stops the service.
restart
Stops, then starts the service, all the while maintaining state. If there are dependent services or services that mark the restarted service as Required, the other services also restart. For example, running systemctl restart frr.service restarts any of the routing protocol services that you enable and that are running, such as bgpd or ospfd.
reload
Reloads the configuration for the service.
enable
Enables the service to start when the system boots, but does not start it unless you use the systemctl start SERVICENAME.service command or reboot the switch.
disable
Disables the service, but does not stop it unless you use the systemctl stop SERVICENAME.service command or reboot the switch. You can start or stop a disabled service.
reenable
Disables, then enables a service. Run this command so that any new Wants or WantedBy lines create the symlinks necessary for ordering. This does not affect on other services.
You do not need to interact with the services directly using these commands. If a critical service crashes or encounters an error, systemd restarts it automatically. systemd is the caretaker of services in modern Linux systems and responsible for starting all the necessary services at boot time.
The following example restarts the networking service:
Add the service name after the systemctl argument.
Ensure a Service Starts after Multiple Restarts
By default, systemd tries to restart a particular service only a certain number of times within a given interval before the service fails to start. The settings StartLimitInterval (which defaults to 10 seconds) and StartBurstLimit (which defaults to 5 attempts) are in the service script; however, certain services override these defaults, sometimes with much longer times. For example, switchd.service sets StartLimitInterval=10m and StartBurstLimit=3; therefore, if you restart switchd more than three times in ten minutes, it does not start.
When the restart fails for this reason, you see a message similar to the following:
Job for switchd.service failed. See 'systemctl status switchd.service' and 'journalctl -xn' for details.
systemctl status switchd.service shows output similar to:
Active: failed (Result: start-limit) since Thu 2016-04-07 21:55:14 UTC; 15s ago
To clear this error, run systemctl reset-failed switchd.service. If you know you are going to restart frequently (multiple times within the StartLimitInterval), you can run the same command before you issue the restart request. This also applies to stop followed by start.
Keep systemd Services from Hanging after Starting
If you start, restart, or reload a systemd service that you can start from another systemd service, you must use the --no-block option with systemctl.
Identify Active Listener Ports for IPv4 and IPv6
You can identify the active listener ports under both IPv4 and IPv6 using the netstat command:
To see active or stopped services, run the cl-service-summary command:
cumulus@switch:~$ cl-service-summary
Service cron enabled active
Service ssh enabled active
Service rsyslog enabled active
Service asic-monitor enabled inactive
Service clagd disabled active
Service cumulus-poe inactive
Service lldpd enabled active
Service mstpd enabled active
Service neighmgrd enabled active
Service netd enabled active
Service netq-agent disabled inactive
Service ntp disabled inactive
Service portwd enabled inactive
Service ptmd enabled active
Service pwmd enabled active
Service smond enabled active
Service switchd enabled active
Service sysmonitor enabled active
Service vxrd inactive
Service vxsnd inactive
Service rdnbrd disabled inactive
Service frr enabled active
Service ntp@mgmt disabled inactive
Service ntp@ntp disabled inactive
...
You can also run the systemctl list-unit-files --type service command to list all services on the switch and to see their status:
The switchd service enables the switch to communicate with Cumulus Linux and all the applications running on Cumulus Linux.
Configure switchd Settings
You can control certain options associated with the switchd process. For example, you can set polling intervals, optimize ACL hardware resources for better utilization, configure log message levels, set the internal VLAN range, and configure VXLAN encapsulation and decapsulation.
To configure switchd options, you either run NVUE commands or manually edit the /etc/cumulus/switchd.conf file.
NVUE currently only supports a subset of the switchd configuration available in the /etc/cumulus/switchd.conf file.
You can run NVUE commands to set the following switchd options:
The statistic polling interval for physical interfaces and for logical interfaces.
For physical interfaces, you can specify a value between 1 and 10. The default setting is 2 seconds
For logical interfaces, you can specify a value between 1 and 30. The default setting is 5 seconds.
A low setting, such as 1, might affect system performance.
The log level to debug the data plane programming related code. You can specify debug, info, notice, warning, or error. The default setting is info. NVIDIA recommends that you do not set the log level to debug in a production environment.
The DSCP action and value for encapsulation. You can set the DSCP action to copy (to copy the value from the IP header of the packet), set (to specify a specific value), or derive (to obtain the value from the switch priority). The default action is derive. Only specify a value if the action is set.
The DSCP action for decapsulation in VXLAN outer headers. You can specify copy (to copy the value from the IP header of the packet), preserve (to keep the inner DSCP value), or derive (to obtain the value from the switch priority). The default action is derive.
The preference between a route and neighbor with the same IP address and mask. You can specify route, neighbor, or route-and-neighbor. The default setting is route.
The ACL mode (atomic or non-atomic). The default setting is atomic.
The reserved VLAN range. The default setting is 3725-3999.
Certain switchd settings require a switchd restart or reload. Before applying the settings, NVUE indicates if it requires a switchd restart or reload and prompts you for confirmation.
When the switchd service restarts, in addition to resetting the switch hardware configuration, all network ports reset.
When the switchd service reloads, there is no interruption to network services.
The following command example sets both the statistic polling interval for logical interfaces and physical interfaces to 6 seconds:
cumulus@switch:~$ nv set system counter polling-interval logical-interface 6
cumulus@switch:~$ nv set system counter polling-interval physical-interface 6
cumulus@switch:~$ nv config apply
The following command example sets the log level for debugging the data plane programming related code to warning:
cumulus@switch:~$ nv set system forwarding programming log-level warning
cumulus@switch:~$ nv config apply
The following command example sets the DSCP action for encapsulation in VXLAN outer headers to set and the value to af12:
cumulus@switch:~$ nv set nve vxlan encapsulation dscp action set
cumulus@switch:~$ nv set nve vxlan encapsulation dscp value af12
cumulus@switch:~$ nv config apply
The following command example sets the DSCP action for decapsulation in VXLAN outer headers to preserve:
The following command example sets the route or neighbor preference to both route and neighbor:
cumulus@switch:~$ nv set system forwarding host-route-preference route-and-neighbour
cumulus@switch:~$ nv config apply
The following command example sets the ACL mode to non-atomic:
cumulus@switch:~$ nv set system acl mode non-atomic
cumulus@switch:~$ nv config apply
The following command example sets the reserved VLAN range between 4064 and 4094:
cumulus@switch:~$ nv set system global reserved vlan internal range 4064-4094
cumulus@switch:~$ nv config apply
To configure the switchd parameters, edit the /etc/cumulus/switchd.conf file. Change the setting and uncomment the line if needed. The switchd.conf file contains comments with a description for each setting.
The following example shows the first few lines of the /etc/cumulus/switchd.conf file.
The following table describes the /etc/cumulus/switchd.conf file parameters and indicates if you need to restart switchd with the sudo systemctl restart switchd.service command or reload switchd with the sudo systemctl reload switchd.service command for changes to take effect when you update the setting.
Restarting the switchd service causes all network ports to reset in addition to resetting the switch hardware configuration.
Parameter
Description
switchd reload or restart
stats.poll_interval
The statistics polling interval in milliseconds.The default setting is 2000.
restart
buf_util.poll_interval
The buffer utilization polling interval in milliseconds. 0 disables buffer utilization polling.The default setting is 0.
restart
buf_util.measure_interval
The buffer utilization measurement interval in minutes.The default setting is 0.
restart
acl.optimize_hw
Optimizes ACL hardware resources for better utilization.The default setting is FALSE.
restart
acl.flow_based_mirroring
Enables flow-based mirroring.The default setting is TRUE.
restart
acl.non_atomic_update_mode
Enables non atomic ACL updatesThe default setting is FALSE.
reload
arp.next_hops
Sends ARPs for next hops.The default setting is TRUE.
restart
route.table
The kernel routing table ID. The range is between 1 and 2^31.The default is 254.
restart
route.host_max_percent
The maximum neighbor table occupancy in hardware (a percentage of the hardware table size).The default setting is 100.
restart
coalescing.reducer
The coalescing reduction factor for accumulating changes to reduce CPU load.The default setting is 1.
restart
coalescing.timeout
The coalescing time limit in seconds.The default setting is 10.
restart
ignore_non_swps
Ignore routes that point to non-swp interfaces.The default setting is TRUE.
restart
disable_internal_parity_restart
Disables restart after a parity error.The default setting is TRUE.
restart
disable_internal_hw_err_restart
Disables restart after an unrecoverable hardware error.The default setting is FALSE.
restart
nat.static_enable
Enables static NAT. The default setting is TRUE.
restart
nat.dynamic_enable
Enables dynamic NAT. The default setting is TRUE.
restart
nat.age_poll_interval
The NAT age polling interval in minutes. The minimum is 1 minute and the maximum is 24 hours. You can configure this setting only when nat.dynamic_enable is set to TRUE. The default setting is 5.
restart
nat.table_size
The NAT table size limit in number of entries. You can configure this setting only when nat.dynamic_enable is set to TRUE. The default setting is 1024.
restart
nat.config_table_size
The NAT configuration table size limit in number of entries. You can configure this setting only when nat.dynamic_enable is set to TRUE. The default setting is 64.
restart
logging
Configures logging in the format BACKEND=LEVEL. Separate multiple BACKEND=LEVEL pairs with a space. The BACKEND value can be stderr, file:filename, syslog, program:executable. The LEVEL value can be CRIT, ERR, WARN, INFO, DEBUG.The default value is syslog=INFO
restart
interface.<interface>.storm_control.broadcast
Enables broadcast storm control and sets the number of packets per second (pps).The default setting is 400.
reload
interface.<interface>.storm_control.multicast
Enables multicast storm control and sets the number of packets per second (pps).The default setting is 3000.
Enables unicast storm control and sets the number of packets per second (pps).The default setting is 2000.
reload
stats.vlan.aggregate
Enables hardware statistics for VLANs and specifies the type of statistics needed. You can specify NONE, BRIEF, or DETAIL.The default setting is BRIEF.
restart
stats.vxlan.aggregate
Enables hardware statistics for VXLANs and specifies the type of statistics needed. You can specify NONE, BRIEF, or DETAIL.The default setting is DETAIL.
restart
stats.vxlan.member
Enables hardware statistics for VXLAN members and specifies the type of statistics needed. You can specify NONE, BRIEF, or DETAIL.The default setting is BRIEF.
restart
stats.vlan.show_internal_vlans
Show internal VLANs.The default setting is FALSE.
restart
stats.vdev_hw_poll_interval
The polling interval in seconds for virtual device hardware statisitcs.The default setting is 5.
restart
resv_vlan_range
The internal VLAN range.The default setting is 3725-3999.
restart
netlink.buf_size
The netlink socket buffer size in MB.The default setting is 136314880.
restart
route.delete_dead_routes
Delete routes on interfaces when the carrier is down.The default setting is TRUE.
restart
vxlan.default_ttl
The default TTL to use in VXLAN headers.The default setting is 64.
restart
bridge.broadcast_frame_to_cpu
Enables bridge broadcast frames to the CPU even if the SVI is not enabled.The default setting is FALSE.
restart
bridge.unreg_mcast_init
Initialize the prune module for IGMP snooping unregistered layer 2 multicast flood control.The default setting is FALSE.
restart
bridge.unreg_v4_mcast_prune
Enables unregistered layer 2 multicast prune to mrouter ports (IPv4).The default setting is FALSE (flood unregistered layer 2 multicast traffic).
restart
bridge.unreg_v6_mcast_prune
Enables unregistered layer 2 multicast prune to mrouter ports (IPv6).The default setting is FALSE (flood unregistered layer 2 multicast traffic).
restart
netlink libnl logger
The default setting is [0-5].
restart
netlink.nl_logger
The default setting is 0.
restart
vxlan.def_encap_dscp_action
Sets the default VXLAN router DSCP action during encapsulation. You can specify copy if the inner packet is IP, set to set a specific value, or derive to derive the value from the switch priority.The default setting is derive.
reload
vxlan.def_encap_dscp_value
Sets the default VXLAN encapsulation DSCP value if the action is set.
reload
vxlan.def_decap_dscp_action
Sets the default VXLAN router DSCP action during decapsulation. You can specify copy if the inner packet is IP, preserve to preserve the inner DSCP value, or derive to derive the value from the switch priority.The default setting is derive.
reload
ipmulticast.unknown_ipmc_to_cpu
Enables sending unknown IPMC to the CPU.The default setting is FALSE.
restart
vrf_route_leak_enable_dynamic
Enables dynamic VRF route leaking.The default setting is FALSE.
restart
sync_queue_depth_val
The event queue depth.The default setting is 50000.
restart
route.route_preferred_over_neigh
Sets the preference between a route and neighbor with the same IP address and mask. You can specify TRUE to prefer the route over the neighbor, FALSE to prefer the neighbor over the route, or BOTH to install both the route and neighbor.The default setting is TRUE.
reload
evpn.multihoming.enable
Enables EVPN multihoming.The default setting is TRUE.
restart
evpn.multihoming.shared_l2_groups
Enables sharing for layer 2 next hop groups.The default setting is FALSE.
restart
evpn.multihoming.shared_l3_groups
Enables sharing for layer 3 next hop groups.The default setting is FALSE.
restart
evpn.multihoming.fast_local_protect
Enables fast reroute for egress link protection. The default setting is FALSE.
restart
evpn.multihoming.bum_sph_filter
Sets split-horizon filtering for EVPN multihoming. You can specify TRUE to filter only BUM traffic from the Ethernet segment (ES) peer or FALSE to filter all traffic from the ES peer.The default setting is TRUE.
restart
link_flap_window
The duration in seconds during which a link must flap the number of times set in the link_flap_threshold before Cumulus Linux sets the link to protodown and specifies linkflap as the reason.The default setting is 10. A value of 0 disables link flap protection.
restart
link_flap_threshold
The number of times the link must flap within the link flap window before Cumulus Linux sets the link to protodown and specifies linkflap as the reason.The default setting is 5. A value of 0 disables link flap protection.
restart
res_usage_warn_threshold
Sets the percentage over which forwarding resources (routes, hosts, MAC addresses) must go before Cumulus Linux generates a warning. You can set a value between 50 and 95.The default setting is 90.
restart
res_warn_msg_int
The time interval in seconds between resource warning messages. Warning messages generate only one time in the specified interval per resource type even if the threshold falls below or goes over the value set in res_usage_warn_threshold multiple times during this interval. You can set a value between 60 and 3600.The default setting is 300.
restart
Show switchd Settings
You can run the following NVUE commands to show the current switchd configuration settings.
Command
Description
nv show system counter polling-interval
Shows the polling interval for physical and logical interface counters in seconds.
nv show system forwarding programming
Shows the log level for data plane programming logs.
nv show nve vxlan encapsulation dscp
Shows the DSCP action and value (if the action is set) for the outer header in VXLAN encapsulation.
nv show nve vxlan decapsulation dscp
Shows the DSCP action for the outer header in VXLAN decapsulation.
nv show system acl
Shows the ACL mode (atomic or non-atomic).
nv show system global reserved vlan internal
Shows the reserved VLAN range.
The following example command shows that the polling interval setting for logical interface counters is 6 seconds:
cumulus@switch:~$ nv show system counter polling-interval
applied description
----------------- ------- -----------------------------------------------------
logical-interface 0:00:06 Config polling-interval for logical interface(in sec)
The following example command shows that the log level setting for data plane programming logs is warning:
cumulus@switch:~$ nv show system forwarding programming
applied description
--------- ------- -------------------
log-level warning configure Log-level
The following example command shows that the DSCP action setting for the outer header in VXLAN encapsulation is set and the value is af12.
cumulus@switch:~$ nv show nve vxlan encapsulation dscp
operational applied description
------ ----------- ------- --------------------------------------------------
action set set DSCP encapsulation action
value af12 af12 Configured DSCP value to put in outer Vxlan packet
The following command example shows that ACL mode is atomic:
cumulus@switch:~$ nv show system acl
applied description
---- ------- -----------------------------------------
mode atomic configure Atomic or Non-Atomic ACL update
The following command example shows that the reserved VLAN range is between 4064 and 4094:
cumulus@switch:~$ nv show system global reserved vlan internal
operational applied description
----- ----------- --------- -------------------
range 4064-4094 4064-4094 Reserved Vlan range
In addition to restarting switchd when you change certain /etc/cumulus/switchd.conf file parameters manually, you also need to restart switchd whenever you modify a switchd hardware configuration file (any *.conf file that requires making a change to the switching hardware, such as /etc/cumulus/datapath/traffic.conf). You do not have to restart the switchd service when you update a network interface configuration (for example, when you edit the /etc/network/interfaces file).
Configuring a Global Proxy
You configure global HTTP and HTTPS proxies in the /etc/profile.d/ directory of Cumulus Linux. Set the http_proxy and https_proxy variables to configure the switch with the address of the proxy server you want to use to get URLs on the command line. This is useful for programs such as apt, apt-get, curl and wget, which can all use this proxy.
In a terminal, create a new file in the /etc/profile.d/ directory.
Create a file in the /etc/apt/apt.conf.d directory and add the following lines to the file to get the HTTP and HTTPS proxies. The example below uses http_proxy as the file name:
Use ISSU to upgrade and troubleshoot an active switch with minimal disruption to the network.
ISSU includes the following modes:
Restart
Upgrade
Maintenance mode
Maintenance ports
In earlier Cumulus Linux releases, ISSU was Smart System Manager.
The NVIDIA SN5600 (Spectrum-4) switch does not support ISSU.
Restart Mode
You can configure the switch to restart in one of the following modes.
cold restarts the system and resets all the hardware devices on the switch (including the switching ASIC).
fast restarts the system more efficiently with minimal impact to traffic by reloading the kernel and software stack without a hard reset of the hardware. During a fast restart, the system decouples from the network to the extent possible using existing protocol extensions before recovering to the operational mode of the system. The restart process maintains the forwarding entries of the switching ASIC and the data plane is not affected. Traffic outage is much lower in this mode as there is a momentary interruption after reboot, while the system reinitializes.
warm restarts the system with no interruption to traffic for existing route entries. Warm mode restarts the system without a hardware reset of the switch ASIC. While this process does not affect the data plane, the control plane is absent during restart and is unable to process routing updates. However, if no alternate paths exist, the switch continues forwarding with the existing entries with no interruptions.
When you restart the switch in warm mode, BGP only performs a graceful restart if the BGP graceful restart option is set to full. To set BGP graceful restart to full, run the nv set router bgp graceful-restart mode full command, then apply the configuration with nv config apply. For more information about BGP graceful restart, refer to Optional BGP Configuration.
In an eBGP multihop configuration with warm restart mode, you must set the BGP graceful restart timer to 180 seconds or more.
Cumulus Linux supports:
Fast mode for all protocols.
Warm mode for 802.1X, layer 2 forwarding, layer 3 forwarding with BGP, and static routing. Warm mode for VXLAN routing with EVPN is available for beta and open to customer feedback. Cumulus Linux does not support warm boot with EVPN MLAG or EVPN multihoming.
NVIDIA recommends you use NVUE commands to configure restart mode and reboot the system. If you prefer to use csmgrctl commands, you must stop NVUE from managing the /etc/cumulus/csmgrd.conf file before you set restart mode:
Run the following NVUE commands:
cumulus@switch:~$ nv set system config apply ignore /etc/cumulus/csmgrd.conf
cumulus@switch:~$ nv config apply
Edit the /etc/cumulus/csmgrd.conf file and set the csmgrctl_override option to true:
The following command configures the switch to restart in cold mode:
cumulus@switch:~$ nv set system reboot mode cold
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo csmgrctl -c
The following command configures the switch to restart in fast mode:
cumulus@switch:~$ nv set system reboot mode fast
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo csmgrctl -f
The following command configures the switch to restart in warm mode.
cumulus@switch:~$ nv set system reboot mode warm
cumulus@switch:~$ nv config apply
cumulus@switch:~$ sudo csmgrctl -w
To reboot the switch in the restart mode you configure above with NVUE:
cumulus@switch:~$ nv action reboot system no-confirm
You must specify no-confirm at the end of the command.
To show system reboot information, such as the reboot date and time, reason, and reset mode (fast, cold, warm), run the NVUE nv show system reboot command:
cumulus@switch:~$ nv show system reboot
operational applied pending
--------- -------------------------------- ------- -------
reason
gentime 2023-04-26T15:11:23.140569+00:00
reason Unknown
user system/root
Upgrade Mode
Upgrade mode updates all the components and services on the switch to the latest Cumulus Linux minor release without impacting traffic. After upgrade is complete, you must restart the switch with either a warm, cold, or fast restart.
If the switch is in warm restart mode, restarting the switch after an upgrade does not result in traffic loss (this is a hitless upgrade).
Upgrade mode includes the following options:
all runs apt-get upgrade to upgrade all the system components to the latest release without affecting traffic flow. You must restart the system after the upgrade completes with one of the restart modes.
dry-run provides information on the components you want to upgrade.
The following command upgrades all the system components:
The NVUE command is not supported.
cumulus@switch:~$ sudo csmgrctl -u
The following command provides information on the components you want to upgrade:
The NVUE command is not supported.
cumulus@switch:~$ sudo csmgrctl -d
Maintenance Mode
Maintenance mode globally manages the BGP and MLAG control plane.
When you enable maintenance mode, BGP and MLAG shut down gracefully.
When you disable maintenance mode, BGP and MLAG are enabled based on the individual parameter settings.
To enable maintenance mode:
cumulus@switch:~$ nv action enable system maintenance mode
Action executing ...
System maintenance mode has been enabled successfully
Current System Mode: Maintenance, cold
Maintenance mode since Thu Jun 13 23:59:47 2024 (Duration: 00:00:00)
Ports shutdown for Maintenance
frr : Maintenance, cold, down, up time: 29:06:27
switchd : Maintenance, cold, down, up time: 29:06:31
System Services : Maintenance, cold, down, up time: 29:07:00
Action succeeded
cumulus@switch:~$ sudo csmgrctl -m1
To disable maintenance mode:
cumulus@switch:~$ nv action disable system maintenance mode
Action executing ...
System maintenance mode has been disabled successfully
Current System Mode: cold
frr : cold, up, up time: 12:57:48 (1 restart)
switchd : cold, up, up time: 13:12:13
System Services : cold, up, up time: 13:12:32
Action succeeded
cumulus@switch:~$ sudo csmgrctl -m0
Before you disable maintenance mode, be sure to bring the ports back up.
To show maintenance mode status either run the NVUE nv show system maintenance command or the Linux sudo csmgrctl -s command:
cumulus@switch:~$ nv show system maintenance
operational
----- -----------
mode enabled
ports disabled
cumulus@switch:~$ sudo csmgrctl -s
Current System Mode: cold
frr : cold, up, up time: 00:14:51 (2 restarts)
clagd : cold, up, up time: 00:14:47
switchd : cold, up, up time: 01:09:48
System Services : cold, up, up time: 01:10:07
Maintenance Ports
Maintenance ports globally disables or enables all configured ports.
When you enable maintenance ports, swp interfaces follow individual admin states.
When you disable maintenance ports, swp interfaces are globally admin down, overriding the admin state in the configuration.
To enable maintenance ports:
cumulus@switch:~$ nv action enable system maintenance ports
Action executing ...
System maintenance ports has been enabled successfully
Current System Mode: cold
frr : cold, up, up time: 28:54:36
switchd : cold, up, up time: 28:54:40
System Services : cold, up, up time: 28:55:09
Action succeeded
cumulus@switch:~$ sudo csmgrctl -p0
To disable maintenance ports:
cumulus@switch:~$ nv action disable system maintenance ports
Action executing ...
System maintenance ports has been disabled successfully
Current System Mode: cold
Ports shutdown for Maintenance
frr : cold, up, up time: 28:55:49
switchd : cold, up, up time: 28:55:53
System Services : cold, up, up time: 28:56:22
Action succeeded
cumulus@switch:~$ sudo csmgrctl -p1
To see the status of maintenance ports, run the NVUE nv show system maintenance command:
cumulus@switch:~$ nv show system maintenance
operational
----- -----------
mode enabled
ports disabled
System Power
In certain situations, you might need to power off the switch instead of rebooting. To power off the switch, run the cl-poweroff command, which shuts down the switch.
cumulus@switch:~$ sudo cl-poweroff
Alternatively, you can run the Linux poweroff command, which gracefully shuts down the switch (the switch LEDs stay on). On certain switches, such as the NVIDIA SN2201, SN2010, SN2100, SN2100B, SN3420, SN3700, SN3700C, SN4410, SN4600C, SN4600, SN4700, SN5400, or SN5600, the switch reboots instead of powering off.
cumulus@switch:~$ sudo poweroff
Layer 1 and Switch Ports
This section discusses the following layer 1 and switch port configuration:
To configure and bring an interface up administratively, edit the /etc/network/interfaces file to add the interface stanza, then run the ifreload -a command:
cumulus@switch:~$ sudo nano /etc/network/interfaces
auto lo
iface lo inet loopback
address 10.10.10.1/32
auto mgmt
iface mgmt
address 127.0.0.1/8
address ::1/128
vrf-table auto
auto eth0
iface eth0 inet dhcp
ip-forward off
ip6-forward off
vrf mgmt
auto swp1
iface swp1
...
To bring an interface down administratively after you configure it, add link-down yes to the interface stanza in the /etc/network/interfaces file, then run ifreload -a:
auto swp1
iface swp1
link-down yes
If you configure an interface in the /etc/network/interfaces file, you can bring it down administratively with the ifdown swp1 command, then bring the interface back up with the ifup swp1 command. These changes do not persist after a reboot. After a reboot, the configuration present in /etc/network/interfaces takes effect.
-By default, the ifupdown and ifup command is quiet. Use the verbose option (-v) to show commands as they execute when you bring an interface down or up.
To remove an interface from the configuration entirely, remove the interface stanza from the /etc/network/interfaces file, then run the ifreload -a command.
For additional information on interface administrative state and physical state, refer to this knowledge base article.
Loopback Interface
Cumulus Linux has a preconfigured loopback interface. When the switch boots up, the loopback interface called lo is up and assigned an IP address of 127.0.0.1.
The loopback interface lo must always exist on the switch and must always be up.
To configure an IP address for the loopback interface:
cumulus@switch:~$ nv set interface lo ip address 10.10.10.1
cumulus@switch:~$ nv config apply
Edit the /etc/network/interfaces file to add an address line:
auto lo
iface lo inet loopback
address 10.10.10.1
If the IP address has no subnet mask, it automatically becomes a /32 IP address. For example, 10.10.10.1 is 10.10.10.1/32.
You can configure multiple IP addresses for the loopback interface.
Subinterfaces
On Linux, an interface is a network device that can be either physical, (for example, swp1) or virtual (for example, vlan100). A VLAN subinterface is a VLAN device on an interface, and the VLAN ID appends to the parent interface using dot (.) VLAN notation. For example, a VLAN with ID 100 that is a subinterface of swp1 is swp1.100. The dot VLAN notation for a VLAN device name is a standard way to specify a VLAN device on Linux.
A VLAN subinterface only receives traffic tagged for that VLAN; therefore, swp1.100 only receives packets that have a VLAN 100 tag on switch port swp1. Any packets that transmit from swp1.100 have a VLAN 100 tag.
The following example configures a routed subinterface on swp1 in VLAN 100:
cumulus@switch:~$ nv set interface swp1.100 ip address 192.168.100.1/24
cumulus@switch:~$ nv config apply
Edit the /etc/network/interfaces file, then run ifreload -a:
If you are using a VLAN subinterface, do not add that VLAN under the bridge stanza.
You cannot use NVUE commands to create a routed subinterface for VLAN 1.
Interface IP Addresses
You can specify both IPv4 and IPv6 addresses for the same interface.
For IPv6 addresses:
You can create or modify the IP address for an interface using either :: or 0:0:0 notation. For example, both 2620:149:43:c109:0:0:0:5 and 2001:DB8::1/126 are valid.
Cumulus Linux assigns the IPv6 address with all zeroes in the interface identifier (2001:DB8::/126) for each subnet; connected hosts cannot use this address.
The following example commands configure three IP addresses for swp1; two IPv4 addresses and one IPv6 address.
cumulus@switch:~$ nv set interface swp1 ip address 10.0.0.1/30
cumulus@switch:~$ nv set interface swp1 ip address 10.0.0.2/30
cumulus@switch:~$ nv set interface swp1 ip address 2001:DB8::1/126
cumulus@switch:~$ nv config apply
In the /etc/network/interfaces file, list all IP addresses under the iface section.
auto swp1
iface swp1
address 10.0.0.1/30
address 10.0.0.2/30
address 2001:DB8::1/126
The address method and address family are not mandatory; they default to inet/inet6 and static. However, you must specify inet/inet6 when you are creating DHCP or loopback interfaces.
auto lo
iface lo inet loopback
To make non-persistent changes to interfaces at runtime, use ip addr add:
cumulus@switch:~$ sudo ip addr add 10.0.0.1/30 dev swp1
cumulus@switch:~$ sudo ip addr add 2001:DB8::1/126 dev swp1
To remove an addresses from an interface, use ip addr del:
cumulus@switch:~$ sudo ip addr del 10.0.0.1/30 dev swp1
cumulus@switch:~$ sudo ip addr del 2001:DB8::1/126 dev swp1
Interface Descriptions
You can add a description (alias) to an interface.
In the /etc/network/interfaces file, add a description using the alias keyword:
cumulus@switch:~# sudo nano /etc/network/interfaces
auto swp1
iface swp1
alias swp1 hypervisor_port_1
Interface Commands
You can specify user commands for an interface that run at pre-up, up, post-up, pre-down, down, and post-down.
You can add any valid command in the sequence to bring an interface up or down; however, limit the scope to network-related commands associated with the particular interface. For example, it does not make sense to install a Debian package on ifup of swp1, even though it is technically possible. See man interfaces for more details.
The following examples adds a command to an interface to enable proxy ARP:
NVUE does not provide commands to configure this feature.
If your post-up command also starts, restarts, or reloads any systemd service, you must use the --no-block option with systemctl. Otherwise, that service or even the switch itself might hang after starting or restarting. For example, to restart the dhcrelay service after bringing up a VLAN, the /etc network/interfaces configuration looks like this:
auto bridge.100
iface bridge.100
post-up systemctl --no-block restart dhcrelay.service
Source Interface File Snippets
Sourcing interface files helps organize and manage the /etc/network/interfaces file. For example:
cumulus@switch:~$ sudo cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet dhcp
source /etc/network/interfaces.d/bond0
Use the glob keyword to specify bridge ports and bond slaves:
auto br0
iface br0
bridge-ports glob swp1-6.100
auto br1
iface br1
bridge-ports glob swp7-9.100 swp11.100 glob swp15-18.100
Fast Linkup
Cumulus Linux supports fast linkup on interfaces on NVIDIA Spectrum1 switches. Fast linkup enables you to bring up ports with cards that require links to come up fast, such as certain 100G optical network interface cards.
You must configure both sides of the connection with the same speed and FEC settings.
cumulus@switch:~$ nv set interface swp1 link fast-linkup on
cumulus@switch:~$ nv config apply
Edit the /etc/cumulus/switchd.conf file and add the interface.<interface>.enable_media_depended_linkup_flow=TRUE and interface.<interface>.enable_port_short_tuning=TRUE settings for the interfaces on which you want to enable fast linkup. The following example enables fast linkup on swp1:
Reload switchd with the sudo systemctl reload switchd.service command.
Link Flap Protection
Cumulus Linux enables link flap detection by default. Link flap detection triggers when there are five link flaps within ten seconds, at which point the interface goes into a protodown state and shows linkflap as the reason. The switchd service also shows a log message similar to the following:
2023-02-10T17:53:21.264621+00:00 cumulus switchd[10109]: sync_port.c:2263 ERR swp2 link flapped more than 3 times in the last 60 seconds, setting protodown
To show interfaces with the protodown flag, run the NVUE nv show interface command or the Linux ip link command. To check a specific interface, run the nv show interface <interface> link command.
cumulus@switch:~$ nv show interface
Interface State Speed MTU Type Remote Host Remote Port Summary
--------- ----- ----- ----- -------- --------------- ----------- ----------------------------------------
eth0 up 1G 1500 eth oob-mgmt-switch swp10 IP Address: 192.168.200.11/24
IP Address: fe80::4638:39ff:fe22:17a/64
lo up 65536 loopback IP Address: 127.0.0.1/8
IP Address: ::1/128
mgmt up 65575 vrf IP Address: 127.0.0.1/8
IP Address: ::1/128
swp1 up 1500 swp
swp2 protodown 9178 swp
swp3 up 1500 swp
swp4 up 1500 swp
...
cumulus@switch:~$ ip link
...
37: swp2: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9178 qdisc pfifo_fast master bond131 state DOWN mode DEFAULT group default qlen 1000
link/ether 1c:34:da:ba:bb:2a brd ff:ff:ff:ff:ff:ff protodown on protodown_reason <linkflap>
...
Clear the Interface Protodown State and Reason
To clear the protodown state and the reason:
cumulus@switch:~$ nv action clear interface swp1 link protodown link-flap
After a few seconds the port state returns to up. Run the nv show <interface> link state command to verify that the interface is no longer in a protodown state and that the reason clears:
cumulus@switch:~$ nv show swp1 link state
operational applied
----------- -------
up up
To clear all the interfaces from a protodown state, run the nv action clear system link protodown link-flap.
The ifdown and ifup commands do not clear the protodown state. You must clear the protodown state and the reason manually using the sudo ip link set <interface> protodown_reason linkflap off and sudo ip link set <interface> protodown off commands.
cumulus@switch:~$ sudo ip link set swp2 protodown_reason linkflap off
cumulus@switch:~$ sudo ip link set swp2 protodown off
After a few seconds, the port state returns to UP. To verify that the interface is no longer in a protodown state and that the reason clears, run the ip link show <interface> command:
cumulus@switch:~$ ip link show swp2
37: swp2: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9178 qdisc pfifo_fast master bond131 state UP mode DEFAULT group default qlen 1000
link/ether 1c:34:da:ba:bb:2a brd ff:ff:ff:ff:ff:ff
Change Link Flap Protection Settings
You can change the following link flap protection settings:
The duration in seconds during which a link must flap the number of times set in the link flap threshold before link flap protection triggers. You can specify a value between 0 (off) and 60. The default setting is 10.
The number of times the link can flap within the link flap window before link flap protection triggers. You can specify a value between 0 (off) and 30. The default setting is 5.
The following example configures the link flap duration to 30 and the number of times the link must flap to 8.
cumulus@switch:~$ nv set system link flap-protection interval 30
cumulus@switch:~$ nv set system link flap-protection threshold 8
cumulus@switch:~$ nv config apply
Edit the /etc/cumulus/switchd.conf file to change the link_flap_window and link_flap_threshold settings.
Unlike the traditional ifupdown system, ifupdown2 does not run scripts installed in /etc/network/*/ automatically to configure network interfaces.
To enable or disable ifupdown2 scripting, edit the addon_scripts_support line in the /etc/network/ifupdown2/ifupdown2.conf file. 1 enables scripting and 2 disables scripting. For example:
cumulus@switch:~$ sudo nano /etc/network/ifupdown2/ifupdown2.conf
# Support executing of ifupdown style scripts.
# Note that by default python addon modules override scripts with the same name
addon_scripts_support=1
ifupdown2 sets the following environment variables when executing commands:
$IFACE represents the physical name of the interface; for example, br0 or vxlan42. The name comes from the /etc/network/interfaces file.
$LOGICAL represents the logical name (configuration name) of the interface.
$METHOD represents the address method; for example, loopback, DHCP, DHCP6, manual, static, and so on.
$ADDRFAM represents the address families associated with the interface in a comma-separated list; for example, "inet,inet6".
Troubleshooting
To see the link and administrative state of an interface:
cumulus@switch:~$ nv show interface swp1 link state
In the following example, swp1 is administratively UP and the physical link is UP (LOWER_UP).
cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
To show the assigned IP address on an interface:
cumulus@switch:~$ nv show interface swp1 ip address
cumulus@switch:~$ ip addr show swp1
3: swp1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
inet 192.0.2.1/30 scope global swp1
inet 192.0.2.2/30 scope global swp1
inet6 2001:DB8::1/126 scope global tentative
valid_lft forever preferred_lft forever
To show the description (alias) for an interface:
cumulus@switch$ nv show interface swp1
cumulus@switch$ ip link show swp1
3: swp1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT qlen 500
link/ether aa:aa:aa:aa:aa:bc brd ff:ff:ff:ff:ff:ff
alias hypervisor_port_1
Considerations
Even though ifupdown2 supports the inclusion of multiple iface stanzas for the same interface, use a single iface stanza for each interface. If you must specify more than one iface stanza; for example, if the configuration for a single interface comes from many places, like a template or a sourced file, make sure the stanzas do not specify the same interface attributes. Otherwise, you see unexpected behavior.
In the following example, swp1 is in two files: /etc/network/interfaces and /etc/network/interfaces.d/speed_settings. ifupdown2 parses this configuration because the same attributes are not in multiple iface stanzas.
cumulus@switch:~$ sudo cat /etc/network/interfaces
source /etc/network/interfaces.d/speed_settings
auto swp1
iface swp1
address 10.0.14.2/24
cumulus@switch:~$ cat /etc/network/interfaces.d/speed_settings
auto swp1
iface swp1
link-speed 1000
link-duplex full
ifupdown2 and sysctl
For sysctl commands in the pre-up, up, post-up, pre-down, down, and post-down lines that use the
$IFACE variable, if the interface name contains a dot (.), ifupdown2 does not change the name to work with sysctl. For example, the interface name bridge.1 does not convert to bridge/1.
ifupdown2 and the gateway Parameter
The default route that the gateway parameter creates in ifupdown2 does not install in FRR, therefore does not redistribute into other routing protocols. Define a static default route instead, which installs in FRR and redistributes, if needed.
The following shows an example of the /etc/network/interfaces file when you use a static route instead of a gateway parameter:
auto swp2
iface swp2
address 172.16.3.3/24
up ip route add default via 172.16.3.2
Interface Name Limitations
Interface names can be a maximum of 15 characters. You cannot use a number for the first character and you cannot include a dash (-) in the name. In addition, you cannot use any name that matches with the regular expression .{0,13}\-v.*.
If you encounter issues, remove the interface name from the /etc/network/interfaces file, then restart the networking.service.
ifupdown2 does not honor the configured IP address scope setting in the /etc/network/interfaces file and treats all addresses as global. It does not report an error. Consider this example configuration:
auto swp2
iface swp2
address 35.21.30.5/30
address 3101:21:20::31/80
scope link
When you run ifreload -a on this configuration, ifupdown2 considers all IP addresses as global.
cumulus@switch:~$ ip addr show swp2
5: swp2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 74:e6:e2:f5:62:82 brd ff:ff:ff:ff:ff:ff
inet 35.21.30.5/30 scope global swp2
valid_lft forever preferred_lft forever
inet6 3101:21:20::31/80 scope global
valid_lft forever preferred_lft forever
inet6 fe80::76e6:e2ff:fef5:6282/64 scope link
valid_lft forever preferred_lft forever
To work around this issue, configure the IP address scope:
The NVUE command is not supported.
In the /etc/network/interfaces file, configure the IP address scope using post-up ip address add <address> dev <interface> scope <scope>. For example:
auto swp6
iface swp6
post-up ip address add 71.21.21.20/32 dev swp6 scope site
Then run the ifreload -a command on this configuration.
The following configuration shows the correct scope:
cumulus@switch:~$ ip addr show swp6
9: swp6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 74:e6:e2:f5:62:86 brd ff:ff:ff:ff:ff:ff
inet 71.21.21.20/32 scope site swp6
valid_lft forever preferred_lft forever
inet6 fe80::76e6:e2ff:fef5:6286/64 scope link
valid_lft forever preferred_lft forever
For NVIDIA Spectrum ASICs, the firmware configures FEC, link speed, duplex mode and auto-negotiation automatically, following a predefined list of parameter settings until the link comes up. You can disable FEC if necessary, which forces the firmware to not try any FEC options.
MTU
Interface MTU applies to traffic traversing the management port, front panel or switch ports, bridge, VLAN subinterfaces, and bonds (both physical and logical interfaces). MTU is the only interface setting that you must set manually.
In Cumulus Linux, ifupdown2 assigns 9216 as the default MTU setting. The initial MTU value set by the driver is 9238. After you configure the interface, the default MTU setting is 9216.
To change the MTU setting, run the following commands. The example command sets the MTU to 1500 for the swp1 interface.
cumulus@switch:~$ nv set interface swp1 link mtu 1500
cumulus@switch:~$ nv config apply
Edit the /etc/network/interfaces file, then run the ifreload -a command.
cumulus@switch:~$ sudo nano /etc/network/interfaces
auto swp1
iface swp1
mtu 1500
cumulus@switch:~$ sudo ifreload -a
Runtime Configuration (Advanced)
Run the ip link set command. The following example command sets the swp1 interface MTU to 1500.
cumulus@switch:~$ sudo ip link set dev swp1 mtu 1500
A runtime configuration is non-persistent; the configuration you create does not persist after you reboot the switch.
Set a Global Policy
To set a global MTU policy, create a policy document (called mtu.json). For example:
The policies and attributes in any file in /etc/network/ifupdown2/policy.d/ override the default policies and attributes in /var/lib/ifupdown2/policy.d/.
Bridge MTU
The MTU setting is the lowest MTU of any interface that is a member of the bridge (every interface specified in bridge-ports in the bridge configuration of the /etc/network/interfaces file). You are not required to specify an MTU on the bridge. Consider this bridge configuration:
For a bridge to have an MTU of 9000, set the MTU for each of the member interfaces (bond1 to bond 4, and peer5) to 9000 at minimum.
When configuring MTU for a bond, configure the MTU value directly under the bond interface; the member links or slave interfaces inherit the configured value. If you need a different MTU on the bond, set it on the bond interface, as this ensures the slave interfaces pick it up. You do not have to specify an MTU on the slave interfaces.
VLAN interfaces inherit their MTU settings from their physical devices or their lower interface; for example, swp1.100 inherits its MTU setting from swp1. Therefore, specifying an MTU on swp1 ensures that swp1.100 inherits the MTU setting for swp1.
If you are working with VXLANs, the MTU for a virtual network interface (VNI must be 50 bytes smaller than the MTU of the physical interfaces on the switch, as various headers and other data require those 50 bytes. Also, consider setting the MTU much higher than 1500.
To show the MTU setting for an interface:
cumulus@switch:~$ nv show interface swp1
...
link
auto-negotiate off on
duplex full full
speed 1G auto
fec auto
mtu 9216 9216
cumulus@switch:~$ ip link show dev swp1
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc pfifo_fast state UP mode DEFAULT qlen 500
link/ether 44:38:39:00:03:c1 brd ff:ff:ff:ff:ff:ff
Drop Packets that Exceed the Egress Layer 3 MTU
The switch forwards all packets that are within the MTU value set for the egress layer 3 interface. However, when packets are larger in size than the MTU value, the switch fragments the packets that do not have the DF bit set and drops the packets that do have the DF bit set.
Run the following command to drop all IP packets that are larger in size than the MTU value for the egress layer 3 interface instead of fragmenting packets:
cumulus@switch:~$ nv set system control-plane trap l3-mtu-err state off
cumulus@switch:~$ nv config apply
FEC is an encoding and decoding layer that enables the switch to detect and correct bit errors introduced over the cable between two interfaces. The target IEEE BER on high speed Ethernet links is 10-12. Because 25G transmission speeds can introduce a higher than acceptable BER on a link, FEC is often required to correct errors to achieve the target BER at 25G, 4x25G, 100G, and higher link speeds. The type and grade of a cable or module and the medium of transmission determine which FEC setting is necessary.
For the link to come up, the two interfaces on each end must use the same FEC setting.
FEC requires small latency overhead. For most applications, this small amount of latency is preferable to error packet retransmission latency.
The two FEC types are:
Reed Solomon (RS), IEEE 802.3 Clause 108 (CL108) on individual 25G channels and Clause 91 on 100G (4channels). This is the highest FEC algorithm, providing the best bit-error correction.
Base-R (BaseR), Fire Code (FC), IEEE 802.3 Clause 74 (CL74). Base-R provides less protection from bit errors than RS FEC but adds less latency.
Cumulus Linux includes additional FEC options:
Auto FEC instructs the hardware to select the best FEC. For copper DAC, the remote end can negotiate FEC. However, optical modules do not have auto-negotiation capability; if the device chooses a preferred mode, it might not match the remote end. This is the current default on the NVIDIA Spectrum switch.
No FEC (no error correction).
While Auto FEC is the default setting on the NVIDIA Spectrum switch, do not explicitly configure the fec auto option on the switch as this leads to a link flap whenever you run net commit or ifreload -a.
For 25G DAC, 4x25G Breakouts DAC and 100G DAC cables, the IEEE 802.3by specification creates 3 classes:
CA-25G-L (Long cable) - Requires RS FEC - Achievable cable length of at least 5m. dB loss less or equal to 22.48. Expected BER of 10-5 or better without RS FEC enabled.
CA-25G-S (Short cable) - Requires Base-R FEC - Achievable cable length of at least 3m. dB loss less or equal to 16.48. Expected BER of 10-8 or better without Base-R FEC enabled.
CA-25G-N (No FEC) - Does not require FEC - Achievable cable length of at least 3m. dB loss less or equal to 12.98. Expected BER 10-12 or better with no FEC enabled.
The IEEE classification specifies various dB loss measurements and minimum achievable cable length. You can build longer and shorter cables if they comply to the dB loss and BER requirements.
If a cable has a CA-25G-S classification and FEC is not on, the BER might be unacceptable in a production network. It is important to set the FEC according to the cable class (or better) to have acceptable bit error rates. See
Determining Cable Class below.
You can check bit errors using cl-netstat (RX_ERR column) or ethtool -S (HwIfInErrors counter) after a large amount of traffic passes through the link. A non-zero value indicates bit errors.
Expect error packets to be zero or extremely low compared to good packets. If a cable has an unacceptable rate of errors with FEC enabled, replace the cable.
For 25G, 4x25G Breakout, and 100G Fiber modules and AOCs, there is no classification of 25G cable types for dB loss, BER or length. Use FEC if the BER is low enough.
Cable Class of 100G and 25G DACs
You can determine the cable class for 100G and 25G DACs from the Extended Specification Compliance Code field (SFP28: 0Ah, byte 35, QSFP28: Page 0, byte 192) in the cable EEPROM programming.
For 100G DACs, most manufacturers use the 0x0Bh 100GBASE-CR4 or 25GBASE-CR CA-L value (the 100G DAC specification predates the IEEE 802.3by 25G DAC specification). Use RS FEC for 100G DAC; shorter or better cables might not need this setting.
A manufacturer’s EEPROM setting might not match the dB loss on a cable or the actual bit error rates that a particular cable introduces. Use the designation as a guide, but set FEC according to the bit error rate tolerance in the design criteria for the network. For most applications, the highest mutual FEC ability of both end devices is the best choice.
You can determine for which grade the manufacturer has designated the cable as follows.
In each example below, the Compliance field comes from the method described above; the ethool -m output does not show it.
3meter cable that does not require FEC
(CA-N)
Cost: More expensive
Cable size: 26AWG (Note that AWG does not necessarily correspond to overall dB loss or BER performance)
Compliance Code: 25GBASE-CR CA-N
3meter cable that requires Base-R FEC
(CA-S)
Cost: Less expensive
Cable size: 26AWG
Compliance Code: 25GBASE-CR CA-S
When in doubt, consult the manufacturer directly to determine the cable classification.
Spectrum ASIC FEC Behavior
The firmware in a Spectrum ASIC applies FEC configuration to 25G and 100G cables based on the cable type and whether the peer switch also has a Spectrum ASIC.
When the link is between two switches with Spectrum ASICs:
For 25G optical modules, the Spectrum ASIC firmware chooses Base-R/FC-FEC.
For 25G DAC cables with attenuation less or equal to 16db, the firmware chooses Base-R/FC-FEC.
For 25G DAC cables with attenuation higher than 16db, the firmware chooses RS-FEC.
For 100G cables/modules, the firmware chooses RS-FEC.
Cable Type
FEC Mode
25G optical cables
Base-R/FC-FEC
25G 1,2 meters: CA-N, loss <13db
Base-R/FC-FEC
25G 2.5,3 meters: CA-S, loss <16db
Base-R/FC-FEC
25G 2.5,3,4,5 meters: CA-L, loss > 16db
RS-FEC
100G DAC or optical
RS-FEC
When linking to a non-Spectrum peer, the firmware lets the peer decide. The Spectrum ASIC supports RS-FEC (for both 100G and 25G), Base-R/FC-FEC (25G only), or no-FEC (for both 100G and 25G).
Cable Type
FEC Mode
25G optical cables
Let peer decide
25G 1,2 meters: CA-N, loss <13db
Let peer decide
25G 2.5,3 meters: CA-S, loss <16db
Let peer decide
25G 2.5,3,4,5 meters: CA-L, loss > 16db
Let peer decide
100G
Let peer decide: RS-FEC or No FEC
How Does Cumulus Linux use FEC?
A Spectrum switch enables FEC automatically when it powers up. The port firmware tests and determines the correct FEC mode to bring the link up with the neighbor. It is possible to get a link up to a switch without enabling FEC on the remote device as the switch eventually finds a working combination to the neighbor without FEC.
The following sections describe how to show the current FEC mode, and how to enable and disable FEC.
Show the Current FEC Mode
To show the FEC mode on a switch port, run the NVUE nv show interface <interface> link command.
cumulus@switch:~$ nv show interface swp1 link
operational applied pending description
---------------- ------------ ------- ------- ----------------------------------------------------------------------
auto-negotiate off on on Link speed and characteristic auto negotiation
breakout 1x 1x sub-divide or disable ports (only valid on plug interfaces)
duplex full full full Link duplex
fec auto auto Link forward error correction mechanism
...
Enable or Disable FEC
To enable Reed Solomon (RS) FEC on a link:
cumulus@switch:~$ nv set interface swp1 link fec rs
cumulus@switch:~$ nv config apply
Edit the /etc/network/interfaces file, then run the ifreload -a command. The following example enables RS FEC for the swp1 interface (link-fec rs):
cumulus@switch:~$ sudo nano /etc/network/interfaces
auto swp1
iface swp1
link-autoneg off
link-speed 100000
link-fec rs
cumulus@switch:~$ sudo ifreload -a
Runtime Configuration (Advanced)
Run the ethtool --set-fec <interface> encoding RS command. For example:
A runtime configuration is non-persistent. The configuration you create does not persist after you reboot the switch.
To enable Base-R/FireCode FEC on a link:
cumulus@switch:~$ nv set interface swp1 link fec baser
cumulus@switch:~$ nv config apply
Edit the /etc/network/interfaces file, then run the ifreload -a command. The following example enables Base-R FEC for the swp1 interface (link-fec baser):
cumulus@switch:~$ sudo nano /etc/network/interfaces
auto swp1
iface swp1
link-autoneg off
link-speed 100000
link-fec baser
cumulus@switch:~$ sudo ifreload -a
Runtime Configuration (Advanced)
Run the ethtool --set-fec <interface> encoding baser command. For example:
A runtime configuration is non-persistent. The configuration you create does not persist after you reboot the switch.
To enable FEC with Auto-negotiation:
You can use FEC with auto-negotiation on DACs only.
cumulus@switch:~$ nv set interface swp1 link auto-negotiate on
cumulus@switch:~$ nv config apply
Edit the /etc/network/interfaces file to set auto-negotiation to on, then run the ifreload -a command:
cumulus@switch:~$ sudo nano /etc/network/interfaces
auto swp1
iface swp1
link-autoneg on
cumulus@switch:~$ sudo ifreload -a
Runtime Configuration (Advanced)
You can use ethtool to enable FEC with auto-negotiation. For example:
ethtool -s swp1 speed 10000 duplex full autoneg on
A runtime configuration is non-persistent. The configuration you create does not persist after you reboot the switch.
To show the FEC and auto-negotiation settings for an interface, either run the NVUE nv show interface <interface> link command or the Linux sudo ethtool swp1 | egrep 'FEC|auto' command:
cumulus@switch:~$ nv set interface swp1 link fec off
cumulus@switch:~$ nv config apply
To configure FEC to the default value, run the nv unset interface swp1 link fec command.
Edit the /etc/network/interfaces file, then run the ifreload -a command. The following example disables Base-R FEC for the swp1 interface (link-fec baser):
cumulus@switch:~$ sudo nano /etc/network/interfaces
auto swp1
iface swp1
link-fec off
cumulus@switch:~$ sudo ifreload -a
Runtime Configuration (Advanced)
Run the ethtool --set-fec <interface> encoding off command. For example:
cumulus@switch:~$ sudo ethtool --set-fec swp1 encoding off
A runtime configuration is non-persistent. The configuration you create does not persist after you reboot the switch.
DR1 and DR4 Modules
100GBASE-DR1 modules, such as NVIDIA MMS1V70-CM, include internal RS FEC processing, which the software does not control. When using these optics, you must either set the FEC setting to off or leave it unset for the link to function.
400GBASE-DR4 modules, such as NVIDIA MMS1V00-WM, require RS FEC. The switch automatically enables FEC if it is set to off.
You typically use these optics to interconnect 4x SN2700 uplinks to a single SN4700 breakout downlink. The following configuration shows an explicit FEC example. You can leave the FEC settings unset for autodetection.
SN4700 (400GBASE-DR4 in swp1):
cumulus@SN4700:mgmt:~$ nv set interface swp1 link breakout 4x lanes-per-port 2
cumulus@SN4700:mgmt:~$ nv set interface swp1s0 link fec rs
cumulus@SN4700:mgmt:~$ nv set interface swp1s0 link speed 100G
cumulus@SN4700:mgmt:~$ nv set interface swp1s1 link fec rs
cumulus@SN4700:mgmt:~$ nv set interface swp1s1 link speed 100G
cumulus@SN4700:mgmt:~$ nv set interface swp1s2 link fec rs
cumulus@SN4700:mgmt:~$ nv set interface swp1s2 link speed 100G
cumulus@SN4700:mgmt:~$ nv set interface swp1s3 link fec rs
cumulus@SN4700:mgmt:~$ nv set interface swp1s3 link speed 100G
cumulus@SN4700:mgmt:~$ nv config apply
SN2700 (100GBASE-DR1 in swp11-14):
cumulus@SN2700:mgmt:~$ nv set interface swp11 link fec off
cumulus@SN2700:mgmt:~$ nv set interface swp11 link speed 100G
cumulus@SN2700:mgmt:~$ nv set interface swp12 link fec off
cumulus@SN2700:mgmt:~$ nv set interface swp12 link speed 100G
cumulus@SN2700:mgmt:~$ nv set interface swp13 link fec off
cumulus@SN2700:mgmt:~$ nv set interface swp13 link speed 100G
cumulus@SN2700:mgmt:~$ nv set interface swp14 link fec off
cumulus@SN2700:mgmt:~$ nv set interface swp14 link speed 100G
cumulus@SN4700:mgmt:~$ nv config apply
The FEC operational view of this configuration appears incorrect because FEC is operationally enabled only on the SN4700 400G breakout side. This is because the 100G DR1 module side handles FEC internally, which is not visible to Cumulus Linux.
cumulus@SN2700:mgmt:~$ nv show int swp11 link
operational applied
--------------------- ----------------- -------
auto-negotiate on on
duplex full full
speed 100G auto
fec off off
mtu 9216 9216
fast-linkup off
[breakout]
state up up
...
cumulus@SN4700:mgmt:~$ nv show int swp1s1 link
operational applied
--------------------- ----------------- -------
auto-negotiate on on
duplex full full
speed 100G auto
fec rs off
mtu 9216 9216
fast-linkup off
[breakout]
state up up
...
Default Policies for Interface Settings
Instead of configuring settings for each individual interface, you can specify a policy for all interfaces on a switch or tailor custom settings for each interface. Create a file in /etc/network/ifupdown2/policy.d/ and populate the settings accordingly. The following example shows a file called address.json.
Setting the default MTU also applies to the management interface. Be sure to add the iface_defaults to override the MTU for eth0, to remain at 9216.
Breakout Ports
Cumulus Linux supports the following ports breakout options:
18x SFP28 25G and 4x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.
All 4x QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.
18x 1G - 18x SFP28 set to 1G
16x 1G - 4x QSFP28 configured as 4x breakouts and set to 1G
Max 1G ports: 34
18x 10G - 18x SFP28 set to 10G
16x 10G - 4x QSFP28 configured as 4x breakouts and set to 10G
Maximum 10G ports: 34
18x 25G - 18x SFP28 (native speed)
16x 25G - 4x QSFP28 breakouts to 4x and set to 25G
Maximum 25G ports: 34
4x 40G - 4x QSFP28 set to 40G
Maximum 40G ports: 4
8x 50G - 4x QSFP28 break out into 2x and set to 50G
Maximum 50G ports: 8
4x 100G - 4x QSFP28 (native speed)
Maximum 100G ports: 4
16x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.
All QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.
64x 1G - 16x QSFP28 break out into 4x and set to 1G
Max 1G ports: 64
64x 10G - 16x QSFP28 break out into 4x and set to 10G
Maximum 10G ports: 64
64x 25G - 16x QSFP28 break out into 4x and set to 25G
Maximum 25G ports: 64
16x 40G - 4x QSFP28 set to 40G
Maximum 40G ports: 16
32x 50G - 16x QSFP28 break out into 2x and set to 50G
Maximum 50G ports: 32
16x 100G - 16x QSFP28 (native speed)
Maximum 100G ports: 16
48x 1GBase-T ports (RJ45 up to 100m CAT5E/6) and 4x QSFP28 100G interfaces (only support NRZ encoding). You can set all speeds down to 1G.
All 4x QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.
48x 1GBase-T - 48x Base-T set to 1G. You can set them to also to 10/100Mb.
16x 1G - 4x QSFP28 configured as 4x breakouts and set to 1G
Maximum 10/100MBase-T ports: 48
Maximum 1GBase-T ports: 48
Maximum 1G ports: 16
16x 10G - 4x QSFP28 configured as 4x breakouts and set to 10G
Maximum 10G ports: 16
16x 25G - 4x QSFP28 breakouts to 4x and set to 25G
Maximum 25G ports: 16
4x 40G - 4x QSFP28 set to 40G
Maximum 40G ports: 4
8x 50G - 4x QSFP28 break out into 2x
Maximum 50G ports: 8
4x 100G - 4x QSFP28 (native speed)
Maximum 100G ports: 4
48x SFP28 25G and 8x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.
The top 4x QSFP28 ports can break out into 4x SFP28. You cannot use the lower 4x QSFP28 disabled ports.
All 8x QSFP28 ports can break out into 2x QSFP28 without disabling ports.
48x 1G - 48x SFP28 set to 10G
16x 1G - 4x QSFP28 break out into 4x and set to 1G
Max 1G ports: 64
48x 10G - 48x SFP28 set to 10G
16x 10G - 4x QSFP28 break out into 4x and set to 10G
Maximum 10G ports: 64
48x 25G - 48x SFP28 (native speed)
16x 25G - Top 4x QSFP28 break out into 4x (bottom 4x QSFP28 disabled)
Maximum 25G ports: 64
8x 40G - 8x QSFP28 set to 40G
Maximum 40G ports: 8
16x 50G - 8x QSFP28 break out into 2x
Maximum 50G ports: 16
8x 100G - 8x QSFP28 (native speed)
Maximum 100G ports: 8
32x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.
The top 16x QSFP28 ports can break out into 4x SFP28. You cannot use the lower 4x QSFP28 disabled ports.
All 32x QSFP28 ports can break out into 2x QSFP28 without disabling ports.
64x 1G - Top 16x QSFP28 break out into 4x and set to 1G (bottom 16XQSFP28 disabled)
Max 1G ports: 64
64x 10G - Top 16x QSFP28 break out into 4x and set to 10G (bottom 16x QSFP28 disabled)
Maximum 10G ports: 64
64x 25G - Top 16x QSFP28 break out into 4x (bottom 16x QSFP28 disabled)
Maximum 25G ports: 64
32x 40G - 32x QSFP28 set to 40G
Maximum 40G ports: 32
64x 50G - 64x QSFP28 break out into 2x
Maximum 50G ports: 64
32x 100G - 32x QSFP28 (native speed)
Maximum 100G ports: 32
48x SFP28 25G and 12x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.
All 12x QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.
48x 1G - 48XSFP28 set to 1G
48x 1G - 12XQSFP28 break out into 4x and set to 1G
Max 1G ports: 96
48x 10G - 48x SFP28 set to 10G
48x 10G - 12x QSFP28 break out into 4x and set to 10G
Maximum 10G ports: 96
48x 25G - 48x SFP28 (native speed)
48x 25G - 12x QSFP28 break out into 4x
Maximum 25G ports: 96
12x 40G - 12x QSFP28 set to 40G
Maximum 40G ports: 12
24x 50G - 12x QSFP28 break out into 2x
Maximum 50G ports: 24
12x 100G - 12x QSFP28 (native speed)
Maximum 100G ports: 12
32x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.
All 32x QSFP28 ports can break out into 4x SFP28 or 2x QSFP28.
128x1G - 32XQSFP28 break out into 4x and set to 1G
Max 1G ports: 128
128x 10G - 32x QSFP28 break out into 4x and set to 10G
Maximum 10G ports: 128
128x25G - 32x QSFP28 break out into 4x
Maximum 25G ports: 128
32x 40G - 32x QSFP28 set to 40G
Maximum 40G ports: 32
64x 50G - 32x QSFP28 break out into 2x
Maximum 50G ports: 64
32x 100G - 32x QSFP28 (native speed)
Maximum 100G ports: 32
32x QSFP56 200G interfaces support both PAM4 and NRZ encodings. You can set all speeds down to 1G.
For lower speed interface configurations, PAM4 is automatically converted to NRZ encoding.
All 32x QSFP56 ports can break out into 4xSFP56 or 2x QSFP56.
128x 1G - 32XQSFP56 break out into 4x and set to 1G
Max 1G ports: 128
128x 10G - 32x QSFP56 break out into 4x and set to 10G
Maximum 10G ports: 128
128x 25G - 32x QSFP56 break out into 4x and set to 25G
Maximum 25G ports: 128
32x 40G - 32x QSFP56 set to 40G
Maximum 40G ports: 32
128x 50G - 32x QSFP56 break out into 4x
Maximum 50G ports: 128
64x100G - 32x QSFP56 break out into 2x
Maximum 100G ports: 64
32x 200G - 32x QSFP56 (native speed)
Maximum 200G ports: 32
SN4410 24xQSFP28-DD interfaces [ports 1-24] support both PAM4 and NRZ encoding with all speeds from 200G down to 1G.
The 8xQSFP-DD (400GbE) interfaces [ports 25-32] support both PAM4 and NRZ encodings with all speeds from 400G down to 1G.
For lower speeds, PAM4 is automatically converted to NRZ encoding.
You can split ports #1 to #32 into:
2x ports with PAM 4 and NRZ encoding with no limitations.
4x ports with PAM 4 and NRZ encoding with no limitations.
8x ports with PAM 4 and NRZ encoding but this forces blocking of an adjacent port (the total available number of MAC addresses is 128)
96x 1G - 24XQSFP28-DD break out into 4x and set to 1G
32x 1G - Top 4XQSFP-DD break out into 8x and set to 1G (bottom 4XQSFP-DD blocked*)
Max 1G ports: 128
96x 10G - 24xQSFP28-DD break out into 4x and set to 10G
32x 10G - 4 top QSFP-DD break out into 8x and set to 10G (bottom 4xQSFP-DD blocked*)
Maximum 10G ports: 128
*Other QSFP-DD breakout combinations are available up to maximum of 128x ports.
96x 25G - 24xQSFP28-DD break out into 4x
32x 25G - 4 top QSFP-DD break out into 8x and set to 25G (bottom 4xQSFP-DD blocked*)
Maximum 25G ports: 128
*Other QSFP-DD breakout combinations are available up to maximum of 128x ports.
48x 40G - 24xQSFP28-DD breakout into 2x and set to 40G
16x 40G – 8xQSFP-DD breakout into 2x and set to 40G
Maximum 40G ports: 64
96x 50G - 24xQSFP28-DD/QSFP56 break out into 4x
32x 50G - 8xQSFP-DD break out into 4x
Maximum 50G ports: 128
96x 100G - 24xQSFP28-DD/QSFP56 break out into 4x
32x 100G - 8xQSFP-DD break out into 4x
Maximum 100G ports: 128
48x 200G - 24xQSFP28-DD/QSFP56 break out into 2x
16x 200G - 8xQSFP-DD break out into 2x
Maximum 200G ports: 64
8x400G - 8xQSFP-DD (native speed)
Maximum 400G ports: 8
64x QSFP28 100G interfaces only support NRZ encoding. You can set all speeds down to 1G.
Only 32x QSFP28 ports can break out into 4x SFP28. You must disable the adjacent QSFP28 port. Only the first and third or second and forth rows can break out into 4xSFP28.
All 64x QSFP28 ports can break out into 2x QSFP28 without disabling ports.
128x 1G - 32XQSFP28 break out into 4x and set to 1G
Max 1G ports: 128
128x 10G - 32x QSFP28 break out into 4x and set to 10G
Maximum 10G ports: 128
128x 25G - 32x QSFP28 break out into 4x
Maximum 25G ports: 128
64x 40G - 64x QSFP28 set to 40G
Maximum 40G ports: 64
128x 50G - 64x QSFP28 break out into 2x
Maximum 50G ports: 128
64x 100G - 64x QSFP28 (native speed)
Maximum 100G ports: 64
SN4600 64xQSFP56 (200GbE) interfaces support both PAM4 and NRZ encodings with all speeds down to 1G.
For lower speeds, PAM4 is automatically converted to NRZ encoding.
Only 32xQSFP56 ports can break out into 4xSFP56 (4x50GbE). But, in this case, the adjacent QSFP56 port are blocked (only the first and third or second and fourth rows can break out into 4xSFP56).
All 64xQSFP56 ports can break out into 2xQSFP56 (2x100GbE) without blocking ports.
128x 1G - 32XQSFP56 break out into 4x and set to 1G
Max 1G ports: 128
128x10G - 64xQSFP56 break out into 4x and set to 10G
Maximum 10G ports: 128
128x25G - 64xQSFP56 break out into 4x and set to 25G
Maximum 25G ports: 128
64x40G - 64xQSFP56 set to 40G
Maximum 40G ports: 64
128x50G - 32xQSFP56 break out into 4x
Maximum 50G ports: 128
128x 100G - 64xQSFP56 break out into 2x
64x 100G - 64xQSFP28 set to 100G
Maximum 100G ports: 128
64x200G - 64xQSFP56 (native speed)
Maximum 200G ports: 64
SN4700 32x QSFP-DD 400GbE interfaces support both PAM4 and NRZ encodings. You can set all speeds down to 1G.
For lower speed interface configurations, PAM4 is automatically converted to NRZ encoding.
Only the top 16x QSFP-DD ports can break out into 8x SFP56. You must disable the adjacent QSFP-DD port.
All 32x QSFP-DD ports can break out into 2x QSFP56 at 2x200G or 4x QSFP56 at 4x 100G without disabling ports.
128x 1G - Top 16XQSFP-DD break out into 8x and set to 1G
Maximum 1G ports: 128
128x 10G - 16x QSFP-DD break out into 8x and set to 10G
Maximum 10G ports: 128
*Cumulus Linux supports other QSFP-DD breakout combinations up to maximum of 128x ports.
128x 25G - 16x QSFP-DD break out into 8x and set to 25G
Maximum 25G ports: 128
*Cumulus Linux supports other QSFP-DD breakout combinations up to maximum of 128x ports.
32x 40G - 32x QSFP-DD set to 40G
Maximum 40G ports: 32
128x 50G - 16x QSFP-DD break out into 8x
Maximum 50G ports: 128
*Cumulus Linux supports other QSFP-DD breakout combinations up to maximum of 128x ports.
128x 100G - 32x QSFP-DD break out into 4x
Maximum 100G ports: 128
64x 200G - 64x QSFP-DD break out into 2x
Maximum 200G ports: 64
32x 400G - 32x QSFP-DD (native speed)
Maximum 400G ports: 32
SN5600 64xOSFP (800GbE) interfaces support both PAM4 and NRZ encodings with all speeds down to 10G.
For lower speeds, PAM4 is automatically converted to NRZ encoding.
Bonus port #65 supports 1G, 10G, and 25G but does not support breakouts.
Maximum 1G ports: 1 (bonus port)
256x 10G
Maximum 10G ports: 257 (256 + 1 bonus port)
256x 25G
Maximum 25G ports: 257 (256 + 1 bonus port)
128x 40G
Maximum 40G ports: 128
256x 50G - 32x OSFP break out into 8x - You must disable the adjacent OSFP port.
Maximum 50G ports: 256
256x 100G - 32x OSFP break out into 8x - You must disable the adjacent OSFP port.
Maximum 100G ports: 256
256x 200G - 64x OSFP break out into 4x
Maximum 200G ports: 256
128x 400G - 64x OSFP break out into 2x
Maximum 400G ports: 128
64x 800G
Maximum 800G ports: 64
You can use a single SFP (10/25/50G) transceiver in a QSFP (100/200/400G) port with QSFP-to-SFP Adapter (QSA). Set the port speed to the SFP speed with the nv set interface <interface> link speed <speed> command. Do not configure this port as a breakout port.
If you break out a port, then reload the switchd service on a switch running in nonatomic ACL mode, temporary disruption to traffic occurs while the ACLs reinstall.
Cumulus Linux does not support port ganging.
Configure a Breakout Port
You can break out (split) a port using the following options:
1x does not split the port. This is the default port setting.
2x splits the port into two interfaces.
4x splits the port into four interfaces.
8x splits the port into eight interfaces.
If you split a 100G port into four interfaces and auto-negotiation is on (the default setting), Cumulus Linux advertises the speed for each interface up to the maximum speed possible for a 100G port (100/4=25G). You can overide this configuration and set specific speeds for the split ports if necessary.
Cumulus Linux 5.4 and later uses a new format for port splitting; instead of 1=100G or 1=4x10G, you specify 1=1x or 1=4x. The new format does not support specifying a speed for breakout ports in the /etc/cumulus/ports.conf file. To set a speed, either set the link-speed parameter for each split port in the /etc/network/interfaces file or run the NVUE nv set interface <interface> link speed <speed> command.
The following example breaks out a 100G port on swp1 into four interfaces. Cumulus Linux advertises the speed for each interface up to a maximum of 25G:
cumulus@switch:~$ nv set interface swp1 link breakout 4x
cumulus@switch:~$ nv set interface swp1s0-3 link state up
cumulus@switch:~$ nv config apply
The following example splits the port into four interfaces and forces the link speed to be 10G. Cumulus disables auto-negotiation when you force set the speed.
cumulus@switch:~$ nv set interface swp1 link breakout 4x
cumulus@switch:~$ nv set interface swp1s0-3 link state up
cumulus@switch:~$ nv set interface swp1s0-3 link speed 10G
Certain switches, such as the SN2700, SN4600, and SN4600c, require that you disable the subsequent even-numbered port when you configure a breakout port for 4x or 8x. NVUE automatically disables the subsequent even-numbered port on any switch with this requirement.
To split a port into multiple interfaces, edit the /etc/cumulus/ports.conf file. The following example command breaks out swp1 into four interfaces.
When you configure a breakout port to 4x or 8x on certain switches such as the SN2700, SN4600, and SN4600c, you must set the subsequent even-numbered port to disabled in the /etc/cumulus/ports.conf file. The SN3700, SN3700c, SN2201, SN2010, and SN2100 switch does not have this requirement.
Reload switchd with the sudo systemctl reload switchd.service command. The reload does not interrupt network services.
To configure specific speeds for the split ports, edit the /etc/network/interfaces file, then run the ifreload -a command. The following example configures the speed for each swp1 breakout port (swp1s0, swp1s1, swp1s2, and swp1s3) to 10G with auto-negotiation off.
cumulus@switch:~$ sudo cat /etc/network/interfaces
...
auto swp1s0
iface swp1s0
link-speed 10000
link-duplex full
link-autoneg off
auto swp1s1
iface swp1s1
link-speed 10000
link-duplex full
link-autoneg off
auto swp1s2
iface swp1s2
link-speed 10000
link-duplex full
link-autoneg off
auto swp1s3
iface swp1s3
link-speed 10000
link-duplex full
link-autoneg off
...
cumulus@switch:~$ sudo ifreload -a
The SN4700 and SN4410 switch does not support auto-negotiation on QSFP-DD 400G transceiver modules. You need to force set the speed.
Set the Number of Lanes per Split Port
By default, to calculate the split port width, Cumulus Linux uses the formula split port width = full port width / breakout. For example, a port split into two interfaces (2x breakout) => 8 lanes width / 2x breakout = 4 lanes per split port.
If you need to use a different port width than the default, you can set the number of lanes per port.
QSFP56-DD transceiver ports split into four interfaces (4x) default to one lane per interface for backwards compatibility. You can change the lane setting to two lanes per interface.
The following example command splits swp1 into two interfaces (2x) and sets the number of lanes per split port to 2.
cumulus@switch:~$ nv set interface swp1 link breakout 2x lanes-per-port 2
cumulus@switch:~$ nv config apply
Edit the /etc/cumulus/ports_width.conf file and add the numer of lanes per split port you want to use, then reload switchd:
Remove the breakout interface configuration from the /etc/network/interfaces file, then run the ifreload -a command.
Configure Port Lanes
You can override the default behavior for supported speeds and platforms and specify the number of lanes for a port. For example, for the NVIDIA SN4700 switch, the default port speed is 50G (2 lanes, NRZ signaling mode) and 100G (4 lanes, NRZ signaling mode). You can override this setting to 50G (1 lane, PAM4 signaling mode) and 100G (2 lanes, PAM4 signaling mode).
This setting does not apply when auto-negotiation is on because Cumulus Linux advertises all supported speed options, including PAM4 and NRZ during auto-negotiation.
cumulus@switch:~$ nv set interface swp1 link speed 50G
cumulus@switch:~$ nv set interface swp1 link lanes 1
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv set interface swp2 link speed 100G
cumulus@switch:~$ nv set interface swp2 link lanes 2
cumulus@switch:~$ nv config apply
Edit the /etc/network/interfaces file, then run the ifreload -a command.
Cumulus Linux includes a ports.conf validator that switchd runs automatically before the switch starts up to confirm that the file syntax is correct. You can run the validator manually to verify the syntax of the file whenever you make changes. The validator is useful if you want to copy a new ports.conf file to the switch with automation tools, then validate that it has the correct syntax.
To run the validator manually, run the /usr/cumulus/bin/validate-ports -f <file> command. For example:
This section shows basic commands for troubleshooting switch ports. For a more comprehensive troubleshooting guide, see Troubleshoot Layer 1.
Interface Settings
To see all settings for an interface, run the nv show interface <interface> command:
cumulus@switch:~$ nv show interface swp1 operational applied
------------------------ ----------------- -------
type swp swp
[acl]
evpn
multihoming
uplink off
ptp
enable off
router
adaptive-routing
enable off
ospf
enable off
ospf6
enable off
pbr
[map]
pim
enable off
synce
enable off
ip
igmp
enable off
ipv4
forward on
ipv6
enable on
forward on
neighbor-discovery
enable on
[dnssl]
home-agent
enable off
[prefix]
[rdnss]
router-advertisement
enable off
vrrp
enable off
vrf default
[gateway]
link
auto-negotiate off on
duplex full full
speed 1G auto
fec auto
mtu 9000 9216
fast-linkup off
[breakout]
state up up
stats
carrier-transitions 4
in-bytes 600 Bytes
in-drops 5
in-errors 0
in-pkts 10
out-bytes 2.11 MB
out-drops 0
out-errors 0
out-pkts 33143
mac 48:b0:2d:39:3f:83
ifindex 3
You can add the --view option to show different views: acl-statistics, brief, detail, lldp, mac, mlag-cc, pluggables, qos-profile, and small. For example, the nv show interface --view=small command lists the interfaces on the switch. The nv show interface --view=brief command shows information about each interface on the switch, such as the interface type, speed, remote host and port. The nv show interface --view=mac command shows the MAC address of each interface.
The description column only shows in the output when you use the --view=detail option.
The following example shows the MAC address of each interface on the switch:
cumulus@switch:~$ nv show interface --view=mac
Interface State Speed MTU MAC Type
---------- ----- ----- ----- ----------------- --------
BLUE up 65575 2a:f9:b5:3c:74:b8 vrf
RED up 65575 8e:91:ed:ed:d5:76 vrf
bond1 up 1G 9000 48:b0:2d:39:3f:83 bond
bond2 up 1G 9000 48:b0:2d:b3:5e:18 bond
bond3 up 1G 9000 48:b0:2d:c2:9d:47 bond
br_default up 9216 44:38:39:22:01:7a bridge
br_l3vni up 9216 44:38:39:22:01:7a bridge
eth0 up 1G 1500 44:38:39:22:01:7a eth
lo up 65536 00:00:00:00:00:00 loopback
mgmt up 65575 8a:58:d0:25:47:7d vrf
swp1 up 1G 9000 48:b0:2d:39:3f:83 swp
swp2 up 1G 9000 48:b0:2d:b3:5e:18 swp
swp3 up 1G 9000 48:b0:2d:c2:9d:47 swp
swp4 down 1500 48:b0:2d:c2:7e:cd swp
swp5 down 1500 48:b0:2d:6e:bc:c1 swp
swp6 down 1500 48:b0:2d:2d:89:16 swp
...
You can filter the nv show interface command output on specific columns. For example, the nv show interface --filter mtu=1500 shows only the interfaces with MTU set to 1500.
To filter on multiple column outputs, enclose the filter types in parentheses; for example, nv show interface --filter "type=bridge&mtu=9216" shows data for bridges with MTU 9216.
You can filter on all revisions (operational, applied, and pending); for example, nv show interface --filter mtu=1500 --rev=applied shows only the interfaces with MTU set to 1500 in the applied revision.
The following example shows information for all bridges configured on the switch with MTU 9216:
cumulus@switch:~$ nv show interface --filter "type=bridge&mtu=9216"
Interface State Speed MTU Type Remote Host Remote Port Summary
---------- ----- ----- ---- ------ ----------- ----------- ---------------------------------------
br_default up 9216 bridge IP Address: fe80::4638:39ff:fe22:17a/64
br_l3vni up 9216 bridge IP Address: fe80::4638:39ff:fe22:17a/64
Statistics
To show interface statistics, run the NVUE nv show interface <interface> counters command or the Linux sudo ethtool -S <interface> command.
To verify SFP settings, run the NVUE nv show interface <interface> pluggable command or the ethtool -m command. The following example shows the vendor, type and power output for swp1.
cumulus@switch:~$ sudo ethtool -m swp1 | egrep 'Vendor|type|power\s+:'
Transceiver type : 10G Ethernet: 10G Base-LR
Vendor name : FINISAR CORP.
Vendor OUI : 00:90:65
Vendor PN : FTLX2071D327
Vendor rev : A
Vendor SN : UY30DTX
Laser output power : 0.5230 mW / -2.81 dBm
Receiver signal average optical power : 0.7285 mW / -1.38 dBm
Considerations
Auto-negotiation and FEC
If auto-negotiation is off on 100G and 25G interfaces, you must set FEC to OFF, RS, or BaseR to match the neighbor. The FEC default setting of auto does not link up when auto-negotiation is off.
Auto-negotiation and Link Speed
If auto-negotiation is on and you set the link speed for a port, Cumulus Linux disables auto-negotiation and uses the port speed setting you configure.
Auto-negotiation with the Spectrum-4 Switch
When you connect an NVIDIA Spectrum-4 switch to another NVIDIA Spectrum-4 switch with PAM4 modulation, you must enable auto-negotiation.
1000BASE-T SFP Modules Supported Only on Certain 25G Platforms
The following 25G switches support 1000BASE-T SFP modules:
NVIDIA SN2410
NVIDIA SN2010
100G or faster switches do not support 1000BASE-T SFP modules.
NVIDIA SN2100 Switch and eth0 Link Speed
After rebooting the NVIDIA SN2100 switch, eth0 always has a speed of 100MB per second. If you bring the interface down and then back up again, the interface negotiates 1000MB. This only occurs the first time the interface comes up.
To work around this issue, add the following commands to the /etc/rc.local file to flap the interface automatically when the switch boots:
modprobe -r igb
sleep 20
modprobe igb
NVIDIA SN5600 Switch and Force Mode
When you configure force mode on NVIDIA SN5600 switch ports 10 through 50, the Rx precoding setting must be the same between local and peer ports to get the optimal Signal-Integrity of the link.
Delay in Reporting Interface as Operational Down
When you remove two transceivers simultaneously from a switch, both interfaces show the carrier down status immediately. However, it takes one second for the second interface to show the operational down status. In addition, the services on this interface also take an extra second to come down.
NVIDIA Spectrum-2 Switches and FEC Mode
The NVIDIA Spectrum-2 (25G) switch only supports RS FEC.
ifplugd is an Ethernet link-state monitoring daemon that executes scripts to configure an Ethernet device when you plug in or remove a cable. Follow the steps below to install and configure the ifplugd daemon.
Install ifplugd
You can install this package even if the switch does not connect to the internet. The package is in the cumulus-local-apt-archive repository on the Cumulus Linux image.
To install ifplugd:
Update the switch before installing the daemon:
cumulus@switch:~$ sudo -E apt-get update
Install the ifplugd package:
cumulus@switch:~$ sudo -E apt-get install ifplugd
Configure ifplugd
After you install ifplugd, you must edit two configuration files:
/etc/default/ifplugd
/etc/ifplugd/action.d/ifupdown
The example configuration below configures ifplugd to bring down all uplinks when the peer bond goes down in an MLAG environment.
Open /etc/default/ifplugd in a text editor and configure the file as appropriate. Add the peerbond name before you save the file.
Open the /etc/ifplugd/action.d/ifupdown file in a text editor. Configure the script, then save the file.
#!/bin/sh
set -e
case "$2" in
up)
clagrole=$(clagctl | grep "Our Priority" | awk '{print $8}')
if [ "$clagrole" = "secondary" ]
then
#List all the interfaces below to bring up when clag peerbond comes up.
for interface in swp1 bond1 bond3 bond4
do
echo "bringing up : $interface"
ip link set $interface up
done
fi
;;
down)
clagrole=$(clagctl | grep "Our Priority" | awk '{print $8}')
if [ "$clagrole" = "secondary" ]
then
#List all the interfaces below to bring down when clag peerbond goes down.
for interface in swp1 bond1 bond3 bond4
do
echo "bringing down : $interface"
ip link set $interface down
done
fi
;;
esac
Restart the ifplugd daemon to implement the changes:
The default shell for ifplugd is dash (/bin/sh) instead of bash, as it provides a faster and more nimble shell. However, dash contains fewer features than bash (for example, dash is unable to handle multiple uplinks).
802.1X Interfaces
The IEEE 802.1X protocol provides a way to authenticate a client (called a supplicant) over wired media. It also provides access for individual MAC addresses on a switch (called the authenticator) after an authentication server authenticates the MAC addresses. The authentication server is typically a RADIUS server.
A Cumulus Linux switch acts as an intermediary between the clients connected to the wired ports and the authentication server, which is reachable over the existing network. EAPOL operates on top of the data link layer; the switch uses EAPOL to communicate with supplicants connected to the switch ports.
Cumulus Linux implements 802.1x using a modified version of the Debian hostapd package to support auth-fail and dynamic VLANS with MBA and EAP authentication for 802.1x interfaces.
Cumulus Linux supports 802.1X on physical interfaces (such as swp1 or swp2s0) that are bridge access ports; the interfaces cannot be part of a bond.
Routed interfaces, bond interfaces, and bridged trunk ports do not support 802.1X.
To enable 802.1X on an access-port, it must be a member of the default NVUE bridge br_default.
eth0 does not support 802.1X.
Cumulus Linux tests 802.1X with only a few wpa_supplicant (Debian), Windows 10 and Windows 7 supplicants.
Cumulus Linux supports RADIUS authentication with FreeRADIUS and Cisco ACS.
802.1X supports simple login and password, and EAP-TLS (Debian).
802.1X supports RFC 5281 for EAP-TTLS, which provides more secure transport layer security.
Mako template-based configurations do not support 802.1X.
Configure the RADIUS Server
Before you can authenticate with 802.1x on your switch, you must configure a RADIUS server somewhere in your network. Popular examples of commercial software with RADIUS capability include Cisco ISE and Aruba ClearPass.
You can also use open source versions of software supporting RADIUS such as PacketFence and FreeRADIUS. This section discusses how to add FreeRADIUS to a Debian server on your network.
Do not use a Cumulus Linux switch as the RADIUS server.
You can configure up to three RADIUS servers (in case of failover).
After you install and configure FreeRADIUS, the FreeRADIUS server can serve Cumulus Linux running hostapd as a RADIUS client. For more information, see the FreeRADIUS documentation.
Configure 802.1X Interfaces
All the 802.1X interfaces share the same RADIUS server settings. Make sure you configure the RADIUS server before you configure the 802.1X interfaces. See Configure the RADIUS Server above.
You must configure 802.1X on a bridged interface. To configure a bridge, refer to Ethernet Bridging - VLANs.
NVUE enables BPDU guard when you enable 802.1X on an interface; the interface goes into a protodown state if it receives BPDU packets.
To configure an 802.1X interface:
Required: Provide the 802.1X RADIUS server IPv4 or IPv6 address. If you want to specify more than one server, provide the priority for each server (a value between 1 and 3). If you specify just one server, Cumulus Linux sets the priority to 1. You can also specify a VRF for outgoing RADIUS accounting and authorization packets. A VRF is optional.
Required: Provide the 802.1X RADIUS shared secret.
Required: Enable 802.1X on an interface.
Optional: Change the default 802.1X RADIUS accounting port. You can specify a value between 1000 and 65535. The default value is 1813.
Optional: Change the default 802.1X RADIUS authentication port. You can specify a value between 1000 and 65535. The default value is 1812.
Optional: Provide the reauthentication interval for EAP. You can set a value between 0 and 86640. The default value is 0 (disabled). This setting only applies to EAP-based authentication; it does not apply to MBA.
Optional: Set a fixed IP address for the RADIUS client to receive requests.
Changing the 802.1X interface settings does not reset existing authorized user ports. However, removing all 802.1X interfaces or changing the RADIUS server IP address, shared secret, authentication port, accounting port, or EAP reauthentication interval restarts hostapd, which forces existing, authorized users to reauthenticate.
The following example:
Sets the 802.1X RADIUS server IP address to 10.10.10.1 and the shared secret to mysecret.
Enables 802.1X on swp1 through swp3.
cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 shared-secret mysecret
cumulus@switch:~$ nv set interface swp1,swp2,swp3 dot1x eap enabled
cumulus@switch:~$ nv config apply
The following example:
Sets the 802.1X RADIUS server IP address to 10.10.10.1 and the VRF to BLUE.
Sets the 802.1X RADIUS shared secret to mysecret.
Sets the 802.1X RADIUS authentication port to 2813.
Sets the 802.1X RADIUS accounting port to 2812.
Sets the fixed IP address for the RADIUS client to receive requests to 10.10.10.6.
Sets the EAP reauthentication interval to 40.
Enables 802.1X on swp1, swp2, and swp3.
cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 vrf BLUE
cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 shared-secret mysecret
cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 authentication-port 2813
cumulus@switch:~$ nv set system dot1x radius server 10.10.10.1 accounting-port 2812
cumulus@switch:~$ nv set system dot1x radius client-src-ip 10.10.10.6
cumulus@switch:~$ nv set system dot1x reauthentication-interval 40
cumulus@switch:~$ nv set interface swp1,swp2,swp3 dot1x eap enabled
cumulus@switch:~$ nv config apply
When you enable or disable 802.1X on an interface, hostapd reloads; however, existing authorized sessions do not reset.
Edit the /etc/hostapd.conf file to configure 802.1X settings, then restart the hostapd service.
The following example:
Sets the 802.1X RADIUS server IP address to 10.10.10.1.
NVIDIA recommends you set the following configuration in the /etc/network/interfaces file for the 802.1X enabled interfaces:
...
auto swp1
iface swp1
bridge-access <vlan>
bridge-learning off
mstpctl-bpduguard yes
mstpctl-portadminedge yes
auto swp2
iface swp2
bridge-access <vlan>
bridge-learning off
mstpctl-bpduguard yes
mstpctl-portadminedge yes
auto swp3
iface swp3
bridge-access <vlan>
bridge-learning off
mstpctl-bpduguard yes
mstpctl-portadminedge yes
MAC-based Authentication
MAC-based authentication (MBA) enables bridged interfaces to allow devices to bypass authentication based on their MAC address. This is useful for devices that do not support EAP, such as printers or phones.
You must configure MBA on both the RADIUS server and the RADIUS client (the Cumulus Linux switch).
Changing the MBA settings does not reset existing authorized user ports. However, changing the MBA activation delay restarts hostapd, which forces existing, authorized users to reauthenticate.
To configure MBA:
Enable MBA in a bridged interface. The following example enables MBA on swp1:
cumulus@switch:~$ nv set interface swp1 dot1x mba enabled
cumulus@switch:~$ nv config apply
Edit the /etc/hostapd.conf file. The following example enables MBA on swp1.
If a non-authorized supplicant tries to communicate with the switch, you can route traffic from that device to a different VLAN and associate that VLAN with one of the switch ports to which the supplicant attaches. Cumulus Linux assigns the auth-fail VLAN by manipulating the PVID of the interface.
Changing the auth-fail VLAN settings does not reset existing authorized user ports. However, changing the auth-fail VLAN ID restarts hostapd, which forces existing, authorized users to reauthenticate.
The following example sets the auth-fail VLAN ID to 777 and enables auth-fail VLAN on swp1.
cumulus@switch:~$ nv set system dot1x auth-fail-vlan 777
cumulus@switch:~$ nv set interface swp1 dot1x auth-fail-vlan enabled
cumulus@switch:~$ nv config apply
If the authentication for swp1 fails, the interface moves to the auth-fail VLAN:
cumulus@switch:~$ nv show interface swp1 dot1x
Interface MAC Address Attribute Value
--------- ----------------- ---------------------------- -----------------
swp1 00:02:00:00:00:08 Status Flags [PARKED_VLAN]
Username vlan60
Authentication Type MD5
VLAN 777
Session Time (seconds) 24772
EAPOL Frames RX 9
EAPOL Frames TX 12
EAPOL Start Frames RX 1
EAPOL Logoff Frames RX 0
EAPOL Response ID Frames RX 4
EAPOL Response Frames RX 8
EAPOL Request ID Frames TX 4
EAPOL Request Frames TX 8
EAPOL Invalid Frames RX 0
EAPOL Length Error Frames Rx 0
EAPOL Frame Version 2
EAPOL Auth Last Frame Source 00:02:00:00:00:08
EAPOL Auth Backend Responses 8
RADIUS Auth Session ID C2FED91A39D8D605
Edit the /etc/hostapd.conf file to add the auth-fail VLAN ID and interface:
If the authentication for swp1 fails, the interface moves to the auth-fail VLAN.
Dynamic VLAN Assignments
A common requirement for campus networks is to assign dynamic VLANs to specific users in combination with IEEE 802.1x. After authenticating a supplicant, the user is assigned a VLAN based on the RADIUS configuration. Cumulus Linux assigns the dynamic VLAN by manipulating the PVID of the interface.
To enable dynamic VLAN assignment globally, where VLAN attributes from the RADIUS server apply to the bridge:
Run the nv set system dot1x dynamic-vlan optional or nv set system dot1x dynamic-vlan required command. If you run the nv set system dot1x dynamic-vlan required command, when VLAN attributes do not exist in the access response packet from the RADIUS server, the user is not authorized and has no connectivity. If the RADIUS server returns VLAN attributes but the user has an incorrect password, the user goes in the auth-fail VLAN (if you configure auth-fail VLAN).
cumulus@switch:~$ nv set system dot1x dynamic-vlan optional
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv set system dot1x dynamic-vlan required
cumulus@switch:~$ nv config apply
The following example shows a typical RADIUS configuration (shown for FreeRADIUS,) for a user with dynamic VLAN assignment:
# # VLAN 100 Client Configuration for Freeradius RADIUS Server.
# # This is not part of the CL configuration.
vlan10client Cleartext-Password := "client1password"
Service-Type = Framed-User,
Tunnel-Type = VLAN,
Tunnel-Medium-Type = "IEEE-802",
Tunnel-Private-Group-ID = 100
Verify the configuration (notice the [AUTHORIZED] status in the output):
cumulus@switch:~$ nv show interface dot1x-summary
Interface MAC Address Attribute Value
--------- ----------------- ---------------------------- --------------------------
swp1 00:02:00:00:00:08 Status Flags [DYNAMIC_VLAN][AUTHORIZED]
Username host1
Authentication Type MD5
VLAN 888
Session Time (seconds) 799
EAPOL Frames RX 3
EAPOL Frames TX 3
EAPOL Start Frames RX 1
EAPOL Logoff Frames RX 0
EAPOL Response ID Frames RX 1
EAPOL Response Frames RX 2
EAPOL Request ID Frames TX 1
EAPOL Request Frames TX 2
EAPOL Invalid Frames RX 0
EAPOL Length Error Frames Rx 0
EAPOL Frame Version 2
EAPOL Auth Last Frame Source 00:02:00:00:00:08
EAPOL Auth Backend Responses 2
RADIUS Auth Session ID 939B1A53B624FC56
Edit the /etc/hostapd.conf file to set the dynamic_vlan option.
Specify 1 for VLAN attributes to be optional.
Specify 2 to require VLAN attributes; if VLAN attributes do not exist in the access response packet returned from the RADIUS server, the user is not authorized and has no connectivity. If the RADIUS server returns VLAN attributes but the user has an incorrect password, the user goes in the auth-fail VLAN, if you have configured auth-fail VLAN.
The following example shows a typical RADIUS configuration (shown for FreeRADIUS, not typically configured or run on the Cumulus Linux device) for a user with a dynamic VLAN assignment:
# # VLAN 100 Client Configuration for Freeradius RADIUS Server.
# # This is not part of the CL configuration.
vlan10client Cleartext-Password := "client1password"
Service-Type = Framed-User,
Tunnel-Type = VLAN,
Tunnel-Medium-Type = "IEEE-802",
Tunnel-Private-Group-ID = 100
To disable dynamic VLAN assignment, where the Cumulus Linux ignores VLAN attributes sent from the RADIUS server and users authenticate based on existing credentials:
cumulus@switch:~$ nv set system dot1x dynamic-vlan disabled
cumulus@switch:~$ nv config apply
Edit the /etc/hostapd.conf file to set the eap_send_identity option to 0, then restart the hostapd service with the sudo systemctl restart hostapd command.
Enabling or disabling dynamic VLAN assignment restarts hostapd, which forces existing, authorized users to reauthenticate.
MAC Addresses per Port
You can specify the maximum number of authenticated MAC addresses allowed on an interface. You can specify any number between 0 and 255. The default value is 6.
The following example sets the maximum number of authenticated MAC addresses to 10.
cumulus@switch:~$ nv set system dot1x max-stations 10
cumulus@switch:~$ nv config apply
Edit the /etc/hostapd.conf file to add the max_num_sta= option. For example:
Cumulus Linux provides the following 802.1X host modes:
Multi host authenticated mode, where RADIUS must authorize each supplicant to send traffic through the 802.1X interface. This is the default mode.
Multi host mode, where the interface remains closed for all traffic until RADIUS authorizes the first supplicant. After authorization, any host can send and receive traffic through the 802.1X interface as long as the supplicant remains authorized.
Multi Host Mode and MBA
When you enable multi host mode on an 802.1X interface with MBA, the first authorized supplicant does not need to run an EAP client but authorizes according to its MAC address.
Multi Host Mode and Auth-fail VLAN
When you enable multi host mode on an 802.1X interface with auth-fail VLAN, when the first supplicant fails to authorize, Cumulus Linux changes the access VLAN on the interface to auth-fail-vlan. The port does not allow traffic from other MAC addresses.
Multi Host Mode and Port Security
Port security limits port access to a specific number of MAC addresses or specific MAC addresses so that the port does not forward ingress traffic from undefined source addresses.
If you enable port security and 802.1X multi host mode on an interface, the MAC address limit that port security enforces on the interface limits the number of traffic sources after authorization.
In multi host mode, Cumulus Linux adds the authorized supplicant MAC address as a static sticky MAC in the forwarding table. The MAC address limit that port security enforces does not account for the supplicant MAC. For example, when you set the port security MAC limit to 2 on an interface, the supplicant and two more hosts can send traffic through the interface.
If you enable 802.1X after the switch learns port security MAC addresses, Cumulus Linux deletes the dynamic MAC addresses installed with port security from the forwarding table. Because bridge learning on an interface is disabled with 802.1X configuration, port security applies only after RADIUS authorizes the first supplicant.
Configure the Host Mode
To configure the host mode on an 802.1X interface:
The following example sets multi host mode on swp1:
To change back to the default host mode, you can also run the nv unset interface <interface> dot1x host-mode command.
Edit the /etc/hostapd.conf file to set the multihost_interfaces option to the 802.1X interface on which you want to enable multi host mode, then restart the hostapd service.
The following example configures multi host mode on swp1:
To change host mode back to the default setting (multi host authenticated), remove the interface from the multihost_interfaces line in the /etc/hostapd.conf file, then restart the hostapd service.
When you change the mode on an 802.1X interface from multi host authentication (with multiple authorized supplicants) to multi host, Cumulus Linux brings down all existing sessions and closes down the port until one of the supplicants authenticates successfully.
When you change the mode on an 802.1X interface from multi host to multi host authentication, Cumulus Linux brings down existing sessions and disables bridge learning.
Show the Current Host Mode
To show the current host mode, run the nv show interface <interface> dot1x command:
To deauthenticate an 802.1X supplicant on an interface, run the nv action deauthenticate interface <interface> dot1x authorized-sessions <mac-address> command:
To show the authenticated sessions and statistics for a specific MAC address, run the nv show interface <interface-id> dot1x authenticated-sessions <mac-address> command:
You can perform more advanced troubleshooting with the following commands.
To increase the debug level in hostapd, copy over the hostapd service file, then add -d, -dd or -ddd to the ExecStart line in the hostapd.service file:
This section refers to frames for all internal QoS functionality. Unless explicitly stated, the actions are independent of layer 2 frames or layer 3 packets.
Cumulus Linux supports several different QoS features and standards including:
Cumulus Linux uses two configuration files for QoS:
/etc/cumulus/datapath/qos/qos_features.conf includes all standard QoS configuration, such as marking, shaping and flow control.
/etc/mlx/datapath/qos/qos_infra.conf includes all platform specific configurations, such as buffer allocations and Alpha values.
Cumulus Linux 5.0 and later does not use the traffic.conf and datapath.conf files but uses the qos_features.conf and qos_infra.conf files instead. Before upgrading Cumulus Linux, review your existing QoS configuration to determine the changes you need to make.
switchd and QoS
When you run Linux commands to configure QoS, you must apply QoS changes to the ASIC with the following command:
Unlike the restart command, the reload switchd.service command does not impact traffic forwarding except when the qos_infra.conf file changes, or when the switch pauses frames or controls priority flow, which require modifications to the ASIC buffer and might result in momentary packet loss.
NVUE reloads the switchd service automatically. You do not have to run the reload switchd.service command to apply changes when configuring QoS with NVUE commands.
Classification
When a frame or packet arrives on the switch, Cumulus Linux maps it to an internal COS (switch priority) value. This value never writes to the frame or packet but classifies and schedules traffic internally through the switch.
You can define which values are trusted: 802.1p, DSCP, or both.
The following table describes the default classifications for various frame and switch priority configurations:
Setting
VLAN Tagged?
IP or Non-IP
Result
PCP (802.1p)
Yes
IP
Accept incoming 802.1p marking.
PCP (802.1p)
Yes
Non-IP
Accept incoming 802.1p marking.
PCP (802.1p)
No
IP
Use the default priority setting.
PCP (802.1p)
No
Non-IP
Use the default priority setting.
DSCP
Yes
IP
Accept incoming DSCP IP header marking.
DSCP
Yes
Non-IP
Use the default priority setting.
DSCP
No
IP
Accept incoming DSCP IP header marking.
DSCP
No
Non-IP
Use the default priority setting.
PCP (802.1p) and DSCP
Yes
IP
Accept incoming DSCP IP header marking.
PCP (802.1p) and DSCP
Yes
Non-IP
Accept incoming 802.1p marking.
PCP (802.1p) and DSCP
No
IP
Accept incoming DSCP IP header marking.
PCP (802.1p) and DSCP
No
Non-IP
Use the default priority setting.
port
Either
Either
Ignore any existing markings and use the default priority setting.
If you use NVUE to configure QoS, you define which values are trusted with the nv set qos mapping <profile> trust l2 command (802.1p) or the nv set qos mapping <profile> trust l3 command (DSCP) .
If you use Linux commands to configure QoS, you define which values are trusted in the /etc/cumulus/datapath/qos/qos_features.conf file by configuring the traffic.packet_priority_source_set setting to 802.1p or dscp.
Trust 802.1p Marking
To trust 802.1p marking:
When 802.1p (l2) is trusted, Cumulus Linux classifies these ingress 802.1p values to switch priority values:
Switch Priority
802.1p (PCP)
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
The PCP number is the incoming 802.1p marking; for example PCP 0 maps to switch priority 0.
To change the default profile to map PCP 0 to switch priority 4:
If you configure the trust to be l2 but do not specify any PCP to switch priority mappings, Cumulus Linux uses the default values.
To show the ingress 802.1p mapping for the default profile, run the nv show qos mapping default-global pcp command. To show the PCP mapping for a specific switch priority in the default profile, run the nv show qos mapping default-global pcp <value> command. The following example shows that PCP 0 maps to switch priority 4:
You can map multiple ingress DSCP values to the same switch priority value. For example, to change the default profile to map ingress DSCP values 10, 21, and 36 to switch priority 0:
If you configure the trust to be l3 but do not specify any DSCP to switch priority mappings, Cumulus Linux uses the default values.
To show the DSCP mapping in the default profile, run the nv show qos mapping default-global dscp command. To show the DSCP mapping for a specific switch priority in the default profile, run the nv show qos mapping default-global dscp <value> command. The following example shows that DSCP 22 maps to switch priority 4:
The # in the configuration file is a comment. By default, the file comments out the traffic.cos_*.priority_source.dscp lines. You must uncomment them for them to take effect.
The traffic.cos_ number is the switch priority value; for example DSCP values 0 through 7 map to switch priority 0. To map ingress DSCP 22 to switch priority 4, configure the traffic.cos_4.priority_source.dscp setting.
traffic.cos_4.priority_source.dscp = [22]
You can map multiple ingress DSCP values to the same switch priority value. For example, to map ingress DSCP values 10, 21, and 36 to switch priority 0:
traffic.cos_0.priority_source.dscp = [10,21,36]
You can also choose not to use an switch priority value. This example does not use switch priority values 3 and 4:
To apply a custom DSCP profile to specific interfaces, see Port Groups.
Trust Port
You can assign all traffic to a switch priority regardless of the ingress marking.
The following commands assign all traffic to switch priority 3 regardless of the ingress marking.
cumulus@switch:~$ nv set qos mapping default-global trust port
cumulus@switch:~$ nv set qos mapping default-global port-default-sp 3
cumulus@switch:~$ nv config apply
To show the switch priority setting in the default profile for all traffic regardless of the ingress marking, run the nv show qos mapping default-global command:
cumulus@switch:~$ nv show qos mapping default-global
operational applied description
--------------- ----------- ------- ----------------------------
port-default-sp 3 3 Port Default Switch Priority
trust port port Port Trust configuration
In the /etc/cumulus/datapath/qos/qos_features.conf file, configure traffic.packet_priority_source_set = [port].
The traffic.port_default_priority setting defines the switch priority that all traffic uses.
To apply a custom profile to specific interfaces, see Port Groups.
Mark and Remark Traffic
You can mark or remark traffic in two ways:
Use ingress COS or DSCP to remark an existing 802.1p COS or DSCP value to a new value.
Use iptables to match packets and set 802.1p COS or DSCP values (policy-based marking).
802.1p or DSCP for Marking
To enable global remarking of 802.1p, DSCP or both 802.1p and DSCP values:
In the /etc/cumulus/datapath/qos/qos_features.conf file, modify the traffic.packet_priority_remark_set value to [802.1p], [dscp] or [802.1p,dscp]. For example, to enable the remarking of only 802.1p values:
traffic.packet_priority_remark_set = [802.1p]
You remark 802.1p or DSCP with the priority_remark.8021p or priority_remark.dscp setting. The switch priority (internal cos_) value determines the egress 802.1p or DSCP remarking. For example, to remark switch priority 0 to egress 802.1p 4:
traffic.cos_0.priority_remark.8021p = [4]
To remark switch priority 0 to egress DSCP 22:
traffic.cos_0.priority_remark.dscp = [22]
The # in the configuration file is a comment. The file comments out the traffic.cos_*.priority_remark.8021p and the traffic.cos_*.priority_remark.dscp lines by default. You must uncomment them to set the configuration.
You can remap multiple switch priority values to the same external 802.1p or DSCP value. For example, to map switch priority 1 and 2 to 802.1p 3:
To apply a custom profile to specific interfaces, see Port Groups.
Policy-based Marking
Cumulus Linux supports ACLs through ebtables, iptables or ip6tables for egress packet marking and remarking.
Cumulus Linux uses ebtables to mark layer 2, 802.1p COS values.
Cumulus Linux uses iptables to match IPv4 traffic and ip6tables to match IPv6 traffic for DSCP marking.
For more information on configuring and applying ACLs, refer to Netfilter - ACLs.
Mark Layer 2 COS
You must use ebtables to match and mark layer 2 bridged traffic. You can match traffic with any supported ebtables rule.
To set the new 802.1p COS value when traffic matches, use -A FORWARD -o <interface> -j setqos --set-cos <value>.
You can only set COS on a per-egress interface basis. Cumulus Linux does not support ebtables based matching on ingress.
The configured action always has the following conditions:
The rule is always part of the FORWARD chain.
The interface (<interface>) is a physical swp port.
The jump action is always setqos (lowercase).
The --set-cos value is a 802.1p COS value between 0 and 7.
For example, to set traffic leaving interface swp5 to 802.1p COS value 4:
-A FORWARD -o swp5 -j setqos --set-cos 4
Mark Layer 3 DSCP
You must use iptables (for IPv4 traffic) or ip6tables (for IPv6 traffic) to match and mark layer 3 traffic.
You can match traffic with any supported iptable or ip6tables rule.
To set the new COS or DSCP value when traffic matches, use -A FORWARD -o <interface> -j SETQOS [--set-dscp <value> | --set-cos <value> | --set-dscp-class <name>].
The configured action always has the following conditions:
The rule is always configured as part of the FORWARD chain.
The interface (<interface>) is a physical swp port.
The jump action is always SETQOS (uppercase).
You can configure COS markings with --set-cos and a value between 0 and 7 (inclusive).
You can use only one of --set-dscp or --set-dscp-class. --set-dscp supports decimal or hex DSCP values between 0 and 77.
--set-dscp-class supports standard DSCP naming, described in RFC3260, including ef, be, CS and AF classes.
You can specify either --set-dscp or --set-dscp-class, but not both.
For example, to set traffic leaving interface swp5 to DSCP value 32:
-A FORWARD -o swp5 -j SETQOS --set-dscp 32
To set traffic leaving interface swp11 to DSCP class value CS6:
-A FORWARD -o swp11 -j SETQOS --set-dscp-class cs6
Flow Control
Flow control influences data transmission to manage congestion along a network path.
Cumulus Linux supports the following flow control mechanisms:
Link pause (IEEE 802.3x), sends specialized ethernet frames to an adjacent layer 2 switch to stop or pauseall traffic on the link during times of congestion.
Priority Flow Control (PFC), which is an upgrade of link pause that IEEE 802.1bb defines, extends the pause frame concept to act on a per switch priority value basis instead of an entire link. A PFC pause frame indicates to the peer which specific switch priority value to pause, while other switch priority values or queues continue transmitting.
You can not configure link pause and PFC on the same port.
Flow Control Buffers
Before configuring link pause or PFC, configure the buffer pool memory allocated for lossless and lossy flows. The following example sets each to fifty percent:
cumulus@switch:~$ nv set qos traffic-pool default-lossless memory-percent 50
cumulus@switch:~$ nv set qos traffic-pool default-lossy memory-percent 50
cumulus@switch:~$ nv config apply
Cumulus Linux allocates 100% of the buffer memory to the default-lossy traffic pool by default. The total memory allocation across pools must not exceed 100%.
Edit the following lines in the /etc/mlx/datapath/qos/qos_infra.conf file:
Modify the existing ingress_service_pool.0.percent and egress_service_pool.0.percent buffer allocation. Change the existing ingress setting to ingress_service_pool.0.percent = 50. Change the existing egress setting to egress_service_pool.0.percent = 50.
Add the following lines to create a new service_pool, set flow_control to the service pool, and define buffer reservations:
Link pause is an older flow control mechanism that causes all traffic on a link between two switches, or between a host and switch, to stop transmitting during times of congestion. Link pause starts and stops depending on buffer congestion. You configure link pause on a per-direction, per-interface basis. You can receive pause frames to stop the switch from transmitting when requested, send pause frames to request neighboring devices to stop transmitting, or both.
NVIDIA recommends that you use Priority Flow Control (PFC) instead of link pause.
Before configuring link pause, you must first modify the switch buffer allocation. Refer to Flow Control Buffers.
Link pause buffer calculation is a complex topic that IEEE 802.1Q-2012 defines. This attempts to incorporate the delay between signaling congestion and the reception of the signal by the neighboring device. This calculation includes the delay that the PHY and MAC layers (interface delay) introduce as well as the distance between end points (cable length).
Incorrect cable length settings can cause wasted buffer space (triggering congestion too early) or packet drops (congestion occurs before flow control activates).
The following example configuration:
Creates a profile (port group) called my_pause_ports.
Enables sending pause frames and disables receiving pause frames.
Sets the cable length to 50 meters.
Sets link pause on swp1 through swp4, and swp6.
Cumulus Linux also includes frame transmission start and stop threshold, and port buffer settings. NVIDIA recommends that you do not change these settings but, instead, let Cumulus Linux configure the settings dynamically. Only change the threshold and buffer settings if you are an advanced user who understands the buffer configuration requirements for lossless traffic to work seamlessly.
cumulus@switch:~$ nv set qos link-pause my_pause_ports tx enable
cumulus@switch:~$ nv set qos link-pause my_pause_ports rx disable
cumulus@switch:~$ nv set qos link-pause my_pause_ports cable-length 50
cumulus@switch:~$ nv set interface swp1-swp4,swp6 qos link-pause profile my_pause_ports
cumulus@switch:~$ nv config apply
To show the link pause settings for a profile, run the nv show qos link-pause <profile> command
Uncomment and edit the link_pause section of the /etc/cumulus/datapath/qos/qos_features.conf file.
To process pause frames, you must enable link pause on the specific interfaces.
Priority Flow Control (PFC)
Priority flow control extends the capabilities of link pause by the frames for a specific 802.1p value instead of stopping all traffic on a link. If a switch supports PFC and receives a PFC pause frame for a given 802.1p value, the switch stops transmitting frames from that queue, but continues transmitting frames for other queues.
You use PFC with RDMA over Converged Ethernet - RoCE. The RoCE section provides information to specifically deploy PFC and ECN for RoCE environments.
Before configuring PFC, first modify the switch buffer allocation according to Flow Control Buffers.
PFC buffer calculation is a complex topic defined in IEEE 802.1Q-2012, which attempts to incorporate the delay between signaling congestion and receiving the signal by the neighboring device. This calculation includes the delay that the PHY and MAC layers (called the interface delay) introduce as well as the distance between end points (cable length). Incorrect cable length settings cause wasted buffer space (triggering congestion too early) or packet drops (congestion occurs before flow control activates).
To apply PFC settings on all ports, modify the default PFC profile (default-global).
The following example modifies the default profile and configures:
PFC on egress queue 0.
Enables sending pause frames and disables receiving pause frames.
The cable length to 50 meters.
Cumulus Linux also includes frame transmission start and stop threshold, and port buffer settings. NVIDIA recommends that you do not change these settings but, instead, let Cumulus Linux configure the settings dynamically. Only change the threshold and buffer settings if you are an advanced user who understands the buffer configuration requirements for lossless traffic to work seamlessly.
cumulus@switch:~$ nv set qos pfc default-global switch-priority 0
cumulus@switch:~$ nv set qos pfc default-global tx enable
cumulus@switch:~$ nv set qos pfc default-global rx disable
cumulus@switch:~$ nv set qos pfc default-global cable-length 50
cumulus@switch:~$ nv config apply
To show the PFC settings for the default profile, run the nv show qos pfc default-global command:
cumulus@switch:~$ nv show qos pfc default-global
operational applied description
----------------- ----------- ------- --------------------------------
cable-length 50 50 Cable Length (in meters)
port-buffer 25000 B 25000 B Port Buffer (in bytes)
rx disable disable PFC Rx State
tx enable enable PFC Tx State
xoff-threshold 10000 B 10000 B Xoff Threshold (in bytes)
xon-threshold 2000 B 2000 B Xon Threshold (in bytes)
[switch-priority] 0 0 Collection of switch priorities.
Edit the priority flow control section of the /etc/cumulus/datapath/qos/qos_features.conf file.
To apply a custom profile to specific interfaces, see Port Groups.
PFC Watchdog
PFC watchdog detects and mitigates pause storms on PFC-enabled ports.
In lossless Ethernet, the switch sends PFC PAUSE frames to instruct the link partner to pause sending packets on a traffic class. This back pressure might propagate across the network and, if it persists, can cause the network to stop forwarding traffic. PFC watchdog detects abnormal back pressure caused by receiving an excessive number of pause frames and disables PFC temporarily.
When a lossless queue receives a pause storm from its link partner and the queue is in a paused state for a certain period of time, PFC watchdog mitigates the pause storm. The watchdog stops processing received pause frames on every switch priority corresponding to the traffic class that detects the storm and discards new incoming packets to this egress queue.
The watchdog continues to count pause frames received on the port. If there are no pause frames received in any polling interval period, it restores the PFC configuration on the port and stops dropping packets.
PFC watchdog also detects and mitigates pause storms on link pause-enabled ports. The watchdog configuration for link pause-enabled ports is the same as the configuration for PFC-enabled ports. For a link pause-enabled port, the watchdog stops processing received pause frames on the egress port that detects the storm and discards new incoming packets to all egress queues on the port until congestion diminishes.
PFC watchdog only works for lossless traffic queues.
You can only configure PFC watchdog on a port with PFC (or link pause) configuration.
You can only enable PFC watchdog on a physical interface (swp).
You cannot enable the watchdog on a bond (for example, bond0) but you can enable the watchdog on a port that is a member of a bond (for example, swp1).
To enable PFC watchdog:
Enable PFC watchdog on the interfaces where you enable PFC:
cumulus@switch:~$ nv set interface swp1 qos pfc-watchdog
cumulus@switch:~$ nv set interface swp3 qos pfc-watchdog
cumulus@switch:~$ nv config apply
To disable PFC watchdog, run the nv unset interface <interface> qos pfc-watchdog command or the nv set interface <interface> qos pfc-watchdog state disable command.
Edit the PFC Watchdog Configuration section of the /etc/cumulus/datapath/qos/qos_features.conf file, then reload switchd.
...
# PFC Watchdog Configuration
# Add the port to the port_group_list where you want to enable PFC Watchdog
# It will enable PFC Watchdog on all the traffic-class corresponding to
# the lossless switch-priority configured on the port.
pfc_watchdog.port_group_list = [pfc_wd_port_group]
pfc_watchdog.pfc_wd_port_group.port_set = swp1,swp2
...
cumulus@switch:~$ sudo systemctl reload switchd
You can control the PFC watchdog polling interval and how many polling intervals the PFC watchdog must wait before it mitigates the storm condition. The default polling interval is 100 milliseconds. The default number of polling intervals is 3.
The following example sets the PFC watchdog polling interval to 200 milliseconds and the number of polling intervals to 5:
cumulus@switch:~$ nv set qos pfc-watchdog polling-interval 200
cumulus@switch:~$ nv set qos pfc-watchdog robustness 5
cumulus@switch:~$ nv config apply
Edit the /etc/cumulus/switchd.conf file to set the pfc_wd.poll_interval parameter and the pfc_wd.robustness parameter.
To show if PFC watchdog is on and to show the status for each traffic class, run the nv show interface <interface> qos pfc-watchdog command:
cumulus@switch:~$ nv show interface swp1 qos pfc-watchdog
operational applied
--------------- ----------- -------
state enabled enabled
PFC WD Status
===========================
traffic-class status deadlock-count
------------- -------- --------------
0 OK 0
1 OK 3
2 DEADLOCK 2
3 OK 0
4 OK 0
5 OK 0
6 OK 0
7 DEADLOCK 3
To show PFC watchdog data for a specific traffic class, run the nv show interface <interface> qos pfc-watchdog status <traffic-class> command.
To clear the PFC watchdog deadlock-count on an interface, run the nv action clear interface <interface> qos pfc-watchdog deadlock-count command.
Congestion Control (ECN)
Explicit Congestion Notification (ECN) is an end-to-end layer 3 congestion control protocol. Defined by RFC 3168, ECN relies on bits in the IPv4 header Traffic Class to signal congestion conditions. ECN requires one or both server endpoints to support ECN to be effective.
Instead of telling adjacent devices to stop transmitting during times of buffer congestion, ECN sets the ECN bits of the transit IPv4 or IPv6 header to indicate to end hosts that congestion might occur. As a result, the sending hosts reduce their sending rate until the transit switch no longer sets ECN bits.
ECN operates by having a transit switch that marks packets between two end hosts.
The transmitting host indicates it is ECN-capable by setting the ECN bits in the outgoing IP header to 01 or 10
If the buffer of a transit switch is greater than the configured minimum threshold of the buffer, the switch remarks the ECN bits to 11 indicating Congestion Encountered or CE.
The receiving host marks any reply packets, like a TCP-ACK, as CE (11).
The original transmitting host reduces its transmission rate.
When the switch buffer congestion falls below the configured minimum threshold of the buffer, the switch stops remarking ECN bits, setting them back to 01 or 10.
A receiving host reflects this new ECN marking in the next reply so that the transmitting host resumes sending at normal speeds.
The default profile (default-global) enables ECN by default on egress queue 0 for all ports with the following settings:
A minimum buffer threshold of 150000 bytes. Random ECN marking starts when buffer congestion crosses this threshold. The probability determines if ECN marking occurs.
A maximum buffer threshold of 1500000 bytes. Cumulus Linux marks all ECN-capable packets when buffer congestion crosses this threshold.
A probability of 100 percent that Cumulus Linux marks an ECN-capable packet when buffer congestion is between the minimum threshold and the maximum threshold.
Random Early Detection (RED) disabled. ECN prevents packet drops in the network due to congestion by signaling hosts to transmit less. However, if congestion continues after ECN marking, packets drop after the switch buffer is full. By default, Cumulus Linux tail-drops packets when the buffer is full. You can enable RED to drop packets that are in the queue randomly instead of always dropping the last arriving packet. This might improve overall performance of TCP based flows.
The following example commands change the default ECN profile that applies to all ports. The commands enable ECN on egress queue 4, 5, and 7, set the minimum buffer threshold to 40000 and the maximum buffer threshold to 200000, and enable RED.
cumulus@switch:~$ nv set qos congestion-control default-global traffic-class 4,5,7 min-threshold 40000
cumulus@switch:~$ nv set qos congestion-control default-global traffic-class 4,5,7 max-threshold 200000
cumulus@switch:~$ nv set qos congestion-control default-global traffic-class 4,5,7 red enable
cumulus@switch:~$ nv config apply
The following example disables ECN bit marking in the default profile for all ports.
To show the ECN settings for the default profile, run the nv show qos congestion-control default-global command:
cumulus@switch:~$ nv show qos congestion-control default-global
operational applied description
-- ----------- ------- -----------
ECN Configurations
=====================
traffic-class ECN RED Min Th Max Th Probability
------------- ------ ------ ------- -------- -----------
4 enable enable 40000 B 200000 B 100
5 enable enable 40000 B 200000 B 100
7 enable enable 40000 B 200000 B 100
To show the ECN settings in the default profile for a specific egress queue, run the nv show qos congestion-control default-global traffic-class <value> command:
cumulus@switch:~$ nv show qos congestion-control default-global traffic-class 4
operational applied description
------------- ----------- -------- -----------------------------------
ecn enable enable Early Congestion Notification State
max-threshold 200000 B 200000 B Maximum Threshold (in bytes)
min-threshold 40000 B 40000 B Minimum Threshold (in bytes)
probability 100 100 Probability
red enable enable Random Early Detection State
Edit the Explicit Congestion Notification section of the /etc/cumulus/datapath/qos/qos_features.conf file.
To disable ECN bit marking, set ecn_enable to false. The following example disables ECN bit marking in the default profile for all ports.
...
default_ecn_red_conf.ecn_enable = false
...
To apply a custom ECN profile to specific interfaces, see Port Groups.
Egress Queues
Cumulus Linux supports eight egress queues to provide different classes of service. By default switch priority values map directly to the matching egress queue. For example, switch priority value 0 maps to egress queue 0.
You can remap queues by changing the switch priority value to the corresponding queue value. You can map multiple switch priority values to a single egress queue.
You do not have to assign all egress queues.
The following command examples assign switch priority 2 to egress queue 7:
To show the egress queue mapping for a specific switch priority in the default profile, run the nv show qos egress-queue-mapping default-global switch-priority <value> command. The following example command shows that switch priority 2 maps to egress queue 7.
cumulus@switch:~$ nv show qos egress-queue-mapping default-global switch-priority 2
operational applied description
------------- ----------- ------- -------------
traffic-class 7 7 Traffic Class
You configure egress queues in the qos_infra.conf file.
Cumulus Linux supports 802.1Qaz, Enhanced Transmission Selection, which allows the switch to assign bandwidth to egress queues and then schedule the transmission of traffic from each queue. 802.1Qaz supports Priority Queuing.
Cumulus Linux provides a default egress scheduler that applies to all ports, where the bandwidth allocated to egress queues 0,2,4,6 is 12 percent and the bandwidth allocated to egress queues 1,3,5,7 is 13 percent. You can also apply a custom egress scheduler for specific ports; see Port Groups.
The following example modifies the default profile. The commands change the bandwidth allocation for egress queues 0, 1, 5, and 7 to strict, bandwidth allocation for egress queues 2 and 6 to 30 percent and bandwidth allocation for egress queues 3 and 4 to 20 percent.
The traffic-class value defines the egress queue where you want to assign bandwidth. For example, traffic-class 2 defines the bandwidth allocation for egress queue 2.
For each egress queue, you can either define the mode as dwrr or strict. In dwrr mode, you must define a bandwidth percent value between 1 and 100. If you do not specify a value for an egress queue, Cumulus Linux uses a DWRR value of 0 (no egress scheduling). The combined total of values you assign to bw_percent must be less than or equal to 100.
You configure the egress scheduling policy in the egress scheduling section of the /etc/cumulus/datapath/qos/qos_features.conf file.
The egr_queue_ value defines the egress queue where you want to assign bandwidth. For example, egr_queue_0 defines the bandwidth allocation for egress queue 0.
The bw_percent value defines the bandwidth allocation you want to assign to an egress queue. If you do not specify a value for an egress queue, there is no egress scheduling. If you specify a value of 0 for an egress queue, Cumulus Linux assigns strict priority mode to the egress queue and always processes it ahead of other queues. The combined total of values you assign to bw_percent must be less than or equal to 100.
strict mode does not define a maximum bandwidth allocation. This can lead to starvation of other queues.
To apply a custom egress scheduler for specific ports, see Port Groups.
Policing and Shaping
Traffic shaping and policing control the rate at which the switch sends or receives traffic on a network to prevent congestion.
Traffic shaping typically occurs at egress and traffic policing at ingress.
Shaping
Traffic shaping allows a switch to send traffic at an average bitrate lower than the physical interface. Traffic shaping prevents a receiving device from dropping bursty traffic if the device is either not capable of that rate of traffic or has a policer that limits what it accepts.
Traffic shaping works by holding packets in the buffer and releasing them at specific time intervals.
Cumulus Linux supports two levels of hierarchical traffic shaping: one at the egress queue level and one at the port level. This allows for minimum and maximum bandwidth guarantees for each egress queue and a defined port traffic shaping rate.
The following example configuration:
Sets the profile name (port group) to use with the traffic shaping settings to shaper1.
Sets the minimum bandwidth for egress queue 2 to 100 kbps. The default minimum bandwidth is 0 kbps.
Sets the maximum bandwidth for egress queue 2 to 500 kbps. The default minimum bandwidth is 2147483647 kbps.
Sets the maximum packet shaper rate for the port group to 200000. The default maximum packet shaper rate is 2147483647 kbps.
Applies the traffic shaping configuration to swp1, swp2, swp3, and swp5.
When the minimum bandwidth for an egress queue is 0, there is no bandwidth guarantee for this queue.
The maximum bandwidth for an egress queue must not exceed the maximum packet shaper rate for the port group.
The maximum packet shaper rate for the port group must not exceed the physical interface speed.
Cumulus Linux only shapes traffic for the traffic classes in a profile that include shaper configuration.
Traffic policing prevents an interface from receiving more traffic than intended. You use policing to enforce a maximum transmission rate on an interface. The switch drops any traffic above the policing level.
Cumulus Linux supports both a single-rate policer and a dual-rate policer (tricolor policer).
You configure traffic policing using ebtables, iptables, or ip6table rules.
For more information on configuring and applying ACLs, refer to Netfilter - ACLs.
Single-rate Policer
To configure a single-rate policer, use iptables JUMP action -j POLICE.
Cumulus Linux supports the following iptable flags with a single-rate policer.
iptables Flag
Description
--set-mode [pkt | KB]
Define the policer to count packets or kilobytes.
--set-rate [<kbytes> | <packets>]
The maximum rate of traffic in kilobytes or packets per second.
--set-burst <kilobytes>
The allowed burst size in kilobytes.
For example, to create a policer to allow 400 packets per second with 100 packet burst: -j POLICE --set-mode pkt --set-rate 400 --set-burst 100
Dual-rate Policer
To configure a dual-rate policer, use the iptables JUMP action -j TRICOLORPOLICE.
Cumulus Linux supports the following iptable flags with a dual-rate policer.
iptables Flag
Description
--set-color-mode [blind | aware]
The policing mode: single-rate (blind) or dual-rate (aware). The default is aware.
--set-cir <kbps>
The committed information rate (CIR) in kilobits per second.
--set-cbs <kbytes>
The committed burst size (CBS) in kilobytes.
--set-pir <kbps>
The peak information rate (PIR) in kilobits per second.
--set-ebs <kbytes>
The excess burst size (EBS) in kilobytes.
--set-conform-action-dscp <dscp value>
The numerical DSCP value to mark for traffic that conforms to the policer rate.
--set-exceed-action-dscp <dscp value>
The numerical DSCP value to mark for traffic that exceeds the policer rate.
--set-violate-action-dscp <dscp value>
The numerical DSCP value to mark for traffic that violates the policer rate.
--set-violate-action [accept | drop]
Cumulus Linux either accepts and remarks, or drops packets that violate the policer rate.
For example, to configure a dual-rate, three-color policer, with a 3 Mbps CIR, 500 KB CBS, 10 Mbps PIR, and 1 MB EBS and drops packets that violate the policer:
Cumulus Linux supports profiles (port groups) for all features including ECN and RED. Profiles apply similar QoS configurations to a set of ports.
Configurations with a profile override the global settings for the ingress ports in the port group.
Ports not in a profile use the global settings.
To apply a profile to all ports, use the global profile.
Trust and Marking
You can use port groups to assign different profiles to different ports. A profile is a label for a group of configuration settings.
The following example configures two profiles. customer1 applies to swp1, swp4, and swp6. customer2 applies to swp5 and swp7.
cumulus@switch:~$ nv set qos mapping customer1 trust l3
cumulus@switch:~$ nv set qos mapping customer1 dscp 0 switch-priority 1-7
cumulus@switch:~$ nv set interface swp1,swp4,swp6 qos mapping profile customer1
cumulus@switch:~$ nv set qos mapping customer2 trust l2
cumulus@switch:~$ nv set qos mapping customer2 pcp 1 switch-priority 4
cumulus@switch:~$ nv set interface swp5,swp7 qos mapping profile customer2
cumulus@switch:~$ nv config apply
The following example configures the profile customports, which assigns traffic on swp1, swp2, and swp3 to switch priority 4 regardless of the ingress marking.
cumulus@switch:~$ nv set qos mapping customports trust port
cumulus@switch:~$ nv set qos mapping customports port-default-sp 4
cumulus@switch:~$ nv set interface swp1,swp2,swp3 qos mapping profile customports
cumulus@switch:~$ nv config apply
You define profiles with the source.port_group_list configuration in the qos_features.conf file. A source.port_group_list is one or more names used for a group of settings.
The following example configures two profiles. customer1 applies to swp1, swp4, and swp6. customer2 applies to swp5 and swp7.
The names of the port groups (profiles) you want to use. The following example defines customer1 and customer2: source.port_group_list = [customer1,customer2]
source.customer1.packet_priority_source_set
The ingress marking trust. In the following example, ingress DSCP values are for group customer1: source.customer1.packet_priority_source_set = [dscp]
source.customer1.port_set
The set of ports on which to apply the ingress marking trust policy. In the following example, ports swp1, swp2, swp3, swp4, and swp6 are for customer1: source.customer1.port_set = swp1-swp4,swp6
source.customer1.port_default_priority
The default switch priority marking for unmarked or untrusted traffic. In the following example, Cumulus Linux marks unmarked traffic or layer 2 traffic for customer1 ports with switch priority 0: source.customer1.port_default_priority = 0
source.customer1.cos_0.priority_source
The ingress DSCP values to a switch priority value mapping for customer1. In the following example, the set of DSCP values from 0 through 7 map to switch priority 0: source.customer1.cos_0.priority_source.dscp = [0,1,2,3,4,5,6,7]
source.customer2.packet_priority_source_set
The ingress marking trust for customer2. In the following example, 802.1p is trusted: source.packet_priority_source_set = [802.1p]
source.customer2.port_set
The set of ports on which to apply the ingress marking trust policy. In the following example, swp5 and swp7 apply for customer2: source.customer2.port_set = swp5,swp7
source.customer2.port_default_priority
The default switch priority marking for unmarked or untrusted traffic. In the following example, Cumulus Linux marks unmarked tagged layer 2 traffic or unmarked VLAN tagged traffic for customer1 ports with switch priority 0: source.customer2.port_default_priority = 0
source.customer2.cos_0.priority_source
The switch priority value to an ingress 802.1p value mapping for customer2. The following example maps ingress 802.1p value 4 to switch priority 1: source.customer2.cos_1.priority_source.8021p = [4]
The following example configures the profile customports, which assigns traffic on swp1, swp2, and swp3 to switch priority 4 regardless of the ingress marking.
You can use profiles to remark 802.1p or DSCP on egress according to the switch priority (internal COS) value.
To change the marked value on a packet, the switch ASIC reads the enable or disable rewrite flag on the ingress port and refers to the mapping configuration on the egress port to change the marked value. To remark 802.1p or DSCP values, you have to enable the rewrite on the ingress port and configure the mapping on the egress port.
In the following example configuration, only packets that ingress on swp1 and egress on swp2 change the marked value of the packet. Packets that ingress on other ports and egress on swp2 do not change the marked value of the packet. The commands map switch priority 0 and 1 to egress DSCP 37.
cumulus@switch:~$ nv set qos remark remark_port_group1 rewrite l3
cumulus@switch:~$ nv set interface swp1 qos remark profile remark_port_group1
cumulus@switch:~$ nv set qos remark remark_port_group2 switch-priority 0 dscp 37
cumulus@switch:~$ nv set qos remark remark_port_group2 switch-priority 1 dscp 37
cumulus@switch:~$ nv set interface swp2 qos remark profile remark_port_group2
cumulus@switch:~$ nv config apply
You define these profiles with remark.port_group_list in the /etc/cumulus/datapath/qos/qos_features.conf file. The name is a label for configuration settings.
You can use port groups with egress scheduling weights to assign different profiles to different egress ports.
In the following example, the profile list2 applies to swp1, swp3, and swp18. list2 only assigns weights to queues 2, 5, and 6, and schedules the other queues on a best-effort basis when there is no congestion in queues 2, 5, or 6. list1 applies to swp2 and assigns weights to all queues.
You define port groups with egress_sched.port_group_list in the /etc/cumulus/datapath/qos/qos_features.conf file. An egress_sched.port_group_list includes the names for the group settings. The name is a label (profile) for the configuration settings.
The names of the port groups (labels) to use. The following example defines port groups list1 snd list2: egress_sched.port_group_list = [list1,list2]
egress_sched.list1.port_set
The interfaces on which you want to apply the port group. egress_sched.list1.port_set = swp2
egress_sched.list1.egr_queue_0.bw_percent
The percentage of bandwidth for egress queue 0. egress_sched.list1.egr_queue_0.bw_percent = 10
egress_sched.list1.egr_queue_1.bw_percent
The percentage of bandwidth for egress queue 1. egress_sched.list1.egr_queue_1.bw_percent = 20
egress_sched.list1.egr_queue_2.bw_percent
The percentage of bandwidth for egress queue 2. egress_sched.list1.egr_queue_2.bw_percent = 30
egress_sched.list1.egr_queue_3.bw_percent
The percentage of bandwidth for egress queue 3. egress_sched.list1.egr_queue_3.bw_percent = 10
egress_sched.list1.egr_queue_4.bw_percent
The percentage of bandwidth for egress queue 4. egress_sched.list1.egr_queue_4.bw_percent = 10
egress_sched.list1.egr_queue_5.bw_percent
The percentage of bandwidth for egress queue 5.
egress_sched.list1.egr_queue_5.bw_percent = 10
egress_sched.list1.egr_queue_6.bw_percent
The percentage of bandwidth for egress queue 6. egress_sched.list1.egr_queue_6.bw_percent = 10
egress_sched.list1.egr_queue_7.bw_percent
The percentage of bandwidth for egress queue 7. 0 indicates a strict priority queue: egress_sched.list1.egr_queue_7.bw_percent = 0
egress_sched.list2.port_set
The interfaces you want to apply to the port group. The following example applies swp1, swp3 and swp18 to port group list2: egress_sched.list2.port_set = [swp1,swp3,swp18]
egress_sched.list2.egr_queue_2.bw_percent
The percentage of bandwidth for egress queue 2. egress_sched.list2.egr_queue_2.bw_percent = 50
egress_sched.list2.egr_queue_5.bw_percent
The percentage of bandwidth for egress queue 5. egress_sched.list2.egr_queue_5.bw_percent = 50
egress_sched.list2.egr_queue_6.bw_percent
The percentage of bandwidth for egress queue 6. 0 indicates a strict priority queue: egress_sched.list2.egr_queue_6.bw_percent = 0
PFC
To set priority flow control on a group of ports, you create a profile to define the egress queues that support sending PFC pause frames and define the set of interfaces to which you want to apply PFC pause frame configuration. Cumulus Linux automatically enables PFC frame transmit and PFC frame receive, and derives all other PFC settings, such as the buffer limits that trigger PFC frames transmit to start and stop, the amount of reserved buffer space, and the cable length.
The following example applies a PFC profile called my_pfc_ports for egress queue 3 and 5 on swp1, swp2, swp3, swp4, and swp6.
The following example applies a PFC profile called my_pfc_ports2 for egress queue 0 on swp1. The commands disable PFC frame receive, and set the buffer limit that triggers PFC frame transmission to stop to 1500 bytes and to start to 1000 bytes. The commands also set the amount of reserved buffer space to 2000 bytes, and the cable length to 50 meters:
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 switch-priority 0
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 xoff-threshold 1500
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 xon-threshold 1000
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 tx enable
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 rx disable
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 port-buffer 2000
cumulus@switch:~$ nv set qos pfc my_pfc_ports2 cable-length 50
cumulus@switch:~$ nv set interface swp1 qos pfc profile my_pfc_ports2
cumulus@switch:~$ nv config apply
All PFC commands
Command
Description
nv set qos pfc <profile> port-buffer <value>
The amount of reserved buffer space (from the global shared buffer) for the interfaces defined in the port group list . The following example sets the amount of reserved buffer space to 25000 bytes: nv set qos pfc my_pfc_ports port-buffer 25000
nv set qos pfc <profile> xoff-threshold <value>
The amount of reserved buffer that the switch must consume before sending a PFC pause frame out of the set of interfaces in the port group list. The following example sends PFC pause frames after consuming 20000 bytes of reserved buffer: nv set qos pfc my_pfc_ports xoff-threshold 20000
nv set qos pfc <profile> xon-threshold <value>
The number of bytes below the xoff threshold that the buffer consumption must drop below before sending PFC pause frames stops. In the following example, the buffer congestion must reduce by 1000 bytes (to 8000 bytes) before PFC pause frames stop: nv set qos pfc my_pfc_ports xon-threshold 1000
nv set qos pfc <profile> rx enable nv set qos pfc <profile> rx disable
Enables or disables sending PFC pause frames. The default value is enable. The following example disables sending PFC pause frames: nv set qos pfc my_pfc_ports rx disable
nv set qos pfc <profile> tx enable nv set qos pfc <profile> tx disable
Enables or disables receiving PFC pause frames. You do not need to define the COS values for rx enable. The switch receives any COS value. The default value is enable. The following example disables receiving PFC pause frames: nv set qos pfc my_pfc_ports tx disable
nv set qos pfc <profile> cable-length <value>
The length, in meters, of the cable that attaches to the ports. Cumulus Linux uses this value internally to determine the latency between generating a PFC pause frame and receiving the PFC pause frame. The default is 10 meters. The following example sets the cable length to 5 meters: nv set qos pfc my_pfc_ports cable-length 5
Edit the priority flow control section of the /etc/cumulus/datapath/qos/qos_features.conf file.
The following example applies a PFC profile called my_pfc_ports for egress queue 3 and 5 on swp1, swp2, swp3, swp4, and swp6.
The following example applies a PFC profile called my_pfc_ports2 for egress queue 0 on swp1. The commands also disable PFC frame receive, and set the xoff-size to 1500 bytes, the xon-size to 1000 bytes, the headroom to 2000 bytes, and the cable length to 10 meters:
The amount of reserved buffer space (from the global shared buffer) for the interfaces defined in the port group list. The following example sets the amount of reserved buffer space to 25000 bytes: pfc.my_pfc_ports.port_buffer_bytes = 25000
pfc.my_pfc_ports.xoff_size
The amount of reserved buffer that the switch must consume before sending a PFC pause frame out the set of interfaces in the port group list. The following example sends PFC pause frames after consuming 10000 bytes of reserved buffer: pfc.my_pfc_ports.xoff_size = 10000
pfc.my_pfc_ports.xon_delta
The number of bytes below the xoff threshold that the buffer consumption must drop below before sending PFC pause frames stops. The following example the buffer congestion must reduce by 2000 bytes (to 8000 bytes) before PFC pause frames stop: pfc.my_pfc_ports.xon_delta = 2000
pfc.my_pfc_ports.rx_enable
Enables (true) or disables (false) sending PFC pause frames. The default value is true. The following example enables sending PFC pause frames: pfc.my_pfc_ports.tx_enable = true
pfc.my_pfc_ports.tx_enable
Enables (true) or disables (false) receiving PFC pause frames. You do not need to define the COS values for rx_enable. The switch receives any COS value. The default value is true. The following example enables receiving PFC pause frames: pfc.my_pfc_ports.rx_enable = true
pfc.my_pfc_ports.cable_length
The length, in meters, of the cable that attaches to the port in the port group list. Cumulus Linux uses this value internally to determine the latency between generating a PFC pause frame and receiving the PFC pause frame. The default is 10 meters In this example, the cable is 5 meters: pfc.my_pfc_ports.cable_length = 5
ECN
You can create ECN profiles and assign them to different ports.
The following example creates a custom ECN profile called my-red-profile for egress queue (traffic-class) 1 and 2. The commands set the minimum buffer threshold to 40000 bytes, maximum buffer threshold to 200000 bytes, and the probability to 10. The commands also enable RED and apply the ECN profile to swp1 and swp2.
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 min-threshold-bytes 40000
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 max-threshold-bytes 200000
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 probability 10
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 red enable
cumulus@switch:~$ nv set interface swp1,swp2 qos congestion-control my-red-profile
cumulus@switch:~$ nv config apply
You can configure different thresholds and probability values for different traffic classes in a custom profile:
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 min-threshold-bytes 40000
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 max-threshold-bytes 200000
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 probability 10
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 1,2 red enable
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 4 min-threshold-bytes 30000
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 4 max-threshold-bytes 150000
cumulus@switch:~$ nv set qos congestion-control my-red-profile traffic-class 4 probability 80
cumulus@switch:~$ nv set interface swp1,swp2 qos congestion-control my-red-profile
cumulus@switch:~$ nv config apply
You can disable ECN bit marking for an ECN profile. The following example disables ECN bit marking in the my-red-profile profile:
Edit the Explicit Congestion Notification section of the /etc/cumulus/datapath/qos/qos_features.conf file.
The following example creates a custom ECN profile called my-red-profile for egress queue 1 and 2, with a minimum buffer threshold of 40000 bytes, maximum buffer threshold of 200000 bytes, and a probability of 10. The commands also enable RED and apply the ECN profile to swp1 and swp2.
You can only have a single lossless pool configured on the switch at a time. Configure the roce-lossless pool when you are using RoCE, otherwise configure the default-lossless pool.
You can configure multiple lossy pools concurrently.
You configure a traffic pool by associating switch priorities and defining the buffer memory percentages allocated to the pools. The following example associates switch priority 2 and allocates a memory percentage of 30 for the mc-lossy pool:
cumulus@switch:~$ nv set qos traffic-pool default-lossy switch-priority 0,1,3,4,5,6,7
cumulus@switch:~$ nv set qos traffic-pool default-lossy memory-percent 70
cumulus@switch:~$ nv set qos traffic-pool mc-lossy switch-priority 2
cumulus@switch:~$ nv set qos traffic-pool mc-lossy memory-percent 30
cumulus@switch:~$ nv config apply
Configure the following settings in the /etc/mlx/datapath/qos/qos_infra.conf file:
For additional default-lossless and RoCE pool examples, see Flow Control Buffers and RoCE. You can view traffic-pool configuration with the nv show qos traffic-pool <pool name> command:
You can use NVUE commands to tune advanced buffer properties in addition to the supported traffic pool configurations. Advanced buffer configuration can override the base traffic-pool profiles configured on the system.
You can only configure advanced buffer settings for the default-global profile.
Buffer Regions
You can adjust advanced buffer settings with the following NVUE command:
nv set qos advance-buffer-config default-global <buffer> <priority-group | property> <value>
You can adjust settings for the following supported buffer regions and properties:
Buffers
Supported Property Values
ingress-lossy-buffer
Cumulus Linux supports the following properties for the bulk, control, and service[1-6] priority groups: name - The priority group alias name. reserved - The reserved buffer allocation in bytes. service-pool - Service pool mapping. shared-alpha - The dynamic shared buffer alpha allocation. shared-bytes - The static shared buffer allocation in bytes. switch-priority - Switch priority values.
egress-lossless-buffer
reserved - The reserved buffer allocation in bytes. service-pool - Service pool mapping. shared-alpha - The dynamic shared buffer alpha allocation. shared-bytes - The static shared buffer allocation in bytes.
ingress-lossless-buffer
service-pool - Service pool mapping. shared-alpha - The dynamic shared buffer alpha allocation. shared-bytes - The static shared buffer allocation in bytes.
egress-lossy-buffer
multicast-port - Multicast port reserved or shared-bytes allocation in bytes. multicast-switch-priority [0-7] - Set the reserved, service-pool,shared-alpha, or shared-bytes properties for each multicast switch priority. traffic-class [0-15] - Set the reserved, service-pool,shared-alpha, or shared-bytes properties for each traffic class.
Configure shared-bytes for buffer regions mapped to static service pools, and shared-alpha for buffer regions mapped to dynamic service pools.
The shared buffer alpha value determines the proportion of available shared memory allocated across buffer regions. Regions with higher alpha values receive a higher proportion of available shared buffer memory. The following example changes the ingress-lossless-buffer shared alpha value to alpha_2 when using RoCE lossless mode:
You can configure ingress and egress service pool profile properties with the following NVUE commands:
nv set qos advance-buffer-config default-global ingress-pool <pool-id> <property> <value>
nv set qos advance-buffer-config default-global egress-pool <pool-id> <property> <value>
You can adjust the following properties for each pool:
Property
Description
infinite
The pool infinite flag.
memory-percent
The pool memory percent allocation.
mode
The pool mode: static or dynamic.
reserved
The reserved buffer allocation in bytes.
shared-alpha
The dynamic shared buffer alpha allocation.
shared-bytes
The static shared buffer allocation in bytes.
A relationship exists between the default traffic pools and the advanced buffer configuration settings.
Use caution when configuring advanced buffer settings. NVUE presents a warning if you attempt to apply incompatible traffic pool and advanced buffer configurations. NVUE performs the following validation checks before applying advanced buffer configurations:
You must map all switch priorities (0-7) to a priority group. You can map more than one switch priority to the same priority group.
The sum of memory-percent values across all ingress pools must be less than or equal to 100 percent.
The sum of memory-percent values across all egress pools must be less than or equal to 100 percent.
Reference the table below to view the mappings between the default traffic pool and advanced buffer properties:
Default Traffic Pool
Default Traffic Pool Properties
Advanced Buffer Region or Service Pool
Advanced Buffer Properties
default-lossy
memory-percent
ingress-pool 0 egress-pool 0
memory-percent
default-lossy
switch-priority
ingress-lossy-buffer
priority-group bulk switch-priority
default-lossless
memory-percent
ingress-pool 1 egress-pool 1
memory-percent
roce-lossless
memory-percent
ingress-pool 1 egress-pool 1
memory-percent
mc-lossy
memory-percent
ingress-pool 2 egress-pool 2
memory-percent
mc-lossy
switch-priority
ingress-lossy-buffer
priority-group service2 switch-priority
For example, to assign 20 percent of memory to a new static service pool, you must allow 20 percent of memory to be available from the default traffic pools. The following commands reduce the default-lossy traffic pool to 80 percent memory, allowing you to assign the memory to ingress-pool 3:
Cumulus Linux provides a syntax checker for the qos_features.conf and qos_infra.conf files to check for errors, such missing parameters or invalid parameter labels and values.
The syntax checker runs automatically with every switchd reload.
You can run the syntax checker manually from the command line with the cl-consistency-check --datapath-syntax-check command. If errors exist, they write to stderr by default. If you run the command with -q, errors write to the /var/log/switchd.log file.
The cl-consistency-check --datapath-syntax-check command takes the following options:
Option
Description
-h
Displays this list of command options.
-q
Runs the command in quiet mode. Errors write to the /var/log/switchd.log file instead of stderr.
-qi
Runs the syntax checker against a specified qos_infra.conf file.
-qf
Runs the syntax checker against a specified qos_features.conf file.
By default the syntax checker assumes:
qos_infra.conf is in /etc/mlx/datapath/qos/qos_infra.conf
qos_features.conf is in /etc/cumulus/datapath/qos/qos_features.conf
You can run the syntax checker when switchd is either running or stopped.
Show Qos Counters
NVUE provides the following commands to show QoS statistics for an interface:
NVUE Command
Description
nv show interface <interface> counters qos
Shows all QoS statistics for a specific interface.
nv show interface <interface> counters qos egress-queue-stats
Shows QoS egress queue statistics for a specific interface.
nv show interface <interface> counters qos ingress-buffer-stats
Shows QoS ingress buffer statistics for a specific interface.
nv show interface <interface> counters qos pfc-stats
Shows QoS PFC statistics for a specific interface.
nv show interface <interface> counters qos port-stats
Shows QoS port statistics for a specific interface.
The following example shows all QoS statistics for swp1: