Cumulus Linux 5.4 Release Notes

5.4.0 Release Notes

Open Issues in 5.4.0

Issue ID    Description    Affects    Fixed
3375047
If you run the NVUE nv set service snmp-server readonly-community command to set an SNMP V2 trap community string that includes fewer than eight characters, the configuration fails. The SNMP V2 trap community string must include eight or more characters.
5.4.0
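The length constraint in issue 3375047 can be checked before you run the NVUE command. The following is a minimal sketch in Python; the function name and the constant are hypothetical helpers for illustration and are not part of NVUE.

```python
# Hypothetical pre-check, not part of NVUE. Issue 3375047 states that the
# community string must include eight or more characters.
MIN_COMMUNITY_LENGTH = 8


def community_is_long_enough(community: str) -> bool:
    """Return True if the community string has at least eight characters."""
    return len(community) >= MIN_COMMUNITY_LENGTH


if __name__ == "__main__":
    # Example strings are placeholders, not recommended values.
    for candidate in ("public", "public-ro-01"):
        print(candidate, community_is_long_enough(candidate))
```

A check like this only catches the length problem described above; it does not validate anything else about the SNMP configuration.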
3362113
If you restore an NVUE startup.yaml file after upgrade that includes breakout ports with QoS configured on the breakout interfaces, or you run the nv config patch command to update a configuration with a yaml file that includes breakout ports with QoS configured on the breakout interfaces, the NVUE configuration fails to apply and subsequent attempts to run nv config apply fail with the following message:
cumulus@switch:mgmt:/var/log# nv config apply
Invalid config [rev_id: 11]
qos config is not supported on following invalid interface: swp1s0. Supported on swp and bond interface types
To work around this issue, either run nv unset on the configured QoS settings and apply the breakout port configuration before you configure QoS, or remove the QoS configuration from the yaml file and patch it separately after applying the breakout configuration.
5.4.0
3361904
The NVUE PTP shaping commands are available in the NVUE command list; however, these commands are disabled and do not configure PTP shaping. PTP shaping is not supported in Cumulus Linux 5.4.
5.4.0
3358909
When you expand the root schema in the NVUE Swagger API, you see many errors in the Error panel at the top of the page.
5.3.1-5.4.0
3351941
Cumulus Linux 5.4 package upgrade (apt-upgrade) does not support warm restart to complete the upgrade; performing an unsupported upgrade can result in unexpected or undesirable behavior, such as a traffic outage.
5.4.0
3351938
Configuring TACACS NSS debugging can cause NVUE commands to break for all users who log in while debugging is enabled. To work around this issue, if your terminal session is unaffected and you configured debugging with NVUE, unset the debugging options or set the debug level to 0.
5.4.0
3350061
If you use TACACS+ authentication, modifying the TACACS+ configuration with NVUE might result in a timeout error when you run the nv config apply command. To work around the issue, restart the nvued service with the sudo systemctl restart nvued.service command, then apply the configuration again.
5.4.0
3350027
If you uninstall dynamic NAT rules and switchd restarts before all the dynamic NAT flows age out and are deleted, you might see errors in switchd.log due to dynamic flow deletion errors. These errors do not affect new dynamic NAT flows from new NAT rules.
5.4.0
3349533
On Spectrum-2 and Spectrum-3 switches with ports operating at 1G speed, there is loss of frames that have an odd or random frame size. In the frame size range of 75 to 1000 bytes, there is frame loss of less than approximately one percent for all odd or random frame sizes in the range. For frame sizes greater than 1000 bytes, no loss is observed.
5.4.0
3349207
The switch does not learn MAC addresses from DHCP packets. When a DHCP enabled host is plugged in for the first time, it tries to obtain an IP address through DHCP. The switch does not learn the MAC address of the host when it receives these DHCP packets; therefore, the host MAC address is not updated in the local forwarding database and it does not get advertised across EVPN. The switch learns the MAC address when it receives other packets, such as ARP or ND from the host. To work around this issue, either configure a temporary IP address on the host to initiate ARP/ND or enable IPv6, which sends ND after link local address creation.
5.2.0-5.4.0
3347538
When connecting NVIDIA-to-NVIDIA in PAM4, auto-negotiation should always be enabled.
5.4.0
3345054
The NVUE nv show interface qos command takes a significant time to show output or times out. To work around this issue, use specific QoS commands. For example, to show congestion control information, run the nv show interface qos congestion-control command.
5.4.0
3341214
If you use the NVUE REST API to configure a local user with a hashed password, the user cannot log in and the /etc/nvue.d/startup.yaml file shows the password as plain text.
5.4.0
3340890
When you run the NVUE nv show interface command, you see an error similar to the following:
Error: GET /nvue_v1/interface/swp45?rev=operational responded with 500 INTERNAL SERVER ERROR
5.3.0-5.4.0
3339278
Using the NVUE REST API with a TACACS+ user account does not work due to authentication failures. To work around this problem, replace /etc/pam.d/nvueapi with the following content:
@include common-auth
@include common-account
@include common-session-noninteractive
5.4.0
3336808
If you run the NVUE nv set interface description command without providing a description, the nv config apply command fails with an error:
root@cumulus:mgmt:~# nv set interface swp5 description ""
root@cumulus:mgmt:~# nv config apply
Unable to restart services (ifreload-nvue.service):
Job for ifreload-nvue.service failed because the control process exited with error code
During restart of ifreload-nvue.service:
error: /etc/network/interfaces: line28: iface swp5: invalid syntax 'alias'
>>> Full logs available in: /var/log/ifupdown2/network_config_ifupdown2_104_Jan-20-2023_15:00:10.761330 <<<
Failed to start ifreload wrapper service (for NVUE compatibility)
5.4.0
3331929
In a fairly high scale BGP EVPN route environment, running the NVUE nv show vrf default router bgp address-family l2vpn-evpn loc-rib command to obtain data leads to high resource usage. In some cases, there is an out of memory error, which leads to multiple daemon crashes.
5.4.0
3330654
When using TACACS+, if the /etc/nsswitch.conf file specifies passwd: files tacplus (files is listed before tacplus), the user name mapping might be incorrect; for example, the user name shown in the default prompt might be incorrect. When you use NVUE, this occurs when the priority for the authentication order of local is higher than tacacs.
3.7.0-3.7.16, 4.0.0-4.4.5, 5.0.0-5.4.0
3329518
When using TACACS+, if the /etc/nsswitch.conf file specifies passwd: files tacplus (files is listed before tacplus), a user that is present in both the local /etc/passwd file and the TACACS+ server cannot log into the switch.
5.4.0
3329494
Ethtool HwIfInDot3FrameErrors (Rx FCS Errors) might lead to an incorrect, very large HwIfInErrors count. To work around this issue, stop the source of the FCS errors, then reset the interface counters:
1. Run the sudo mst status command to find the device.
2. Run the sudo mlxlink -d <device> -p <port_number> -pc command to reset the interface counters. For example: sudo mlxlink -d /dev/mst/mt53104_pciconf0 -p 39 -pc.
5.3.1-5.4.0
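The counter reset in the workaround for issue 3329494 can be scripted once you know the device path from sudo mst status. The following is a minimal sketch in Python that only assembles the command line shown in the workaround; the function name is a hypothetical helper, and the device path and port number are taken from the example above.

```python
from typing import List


def reset_counters_command(device: str, port: int) -> List[str]:
    """Build the mlxlink argument list from the workaround for issue 3329494.

    Hypothetical helper: it only assembles the command; it does not run it.
    """
    return ["sudo", "mlxlink", "-d", device, "-p", str(port), "-pc"]


if __name__ == "__main__":
    # Device path and port number come from the example in the workaround.
    print(" ".join(reset_counters_command("/dev/mst/mt53104_pciconf0", 39)))
```

Returning a list instead of a single string lets you pass the result directly to subprocess.run without shell quoting concerns.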
3327477
Using su to change to a user specified through TACACS+ results in becoming the local tacacs0 through tacacs15 user instead of the named user to run sudo commands. When sudo asks for the password of the named user, it is unlikely to match that of the local tacacs0 through tacacs15 user.
3.7.0-3.7.16, 4.0.0-4.4.5, 5.0.0-5.4.0
3326659
If you have a large number of MAC addresses, they do not age out at the MAC ageing timeout value configured on the switch. It might take up to 30 seconds more for the MAC addresses to age out and be deleted from the hardware. To work around this issue, wait for the ageing timeout value plus 30 seconds to allow for the MAC addresses to age out and be deleted from the hardware.
5.4.0
3312240
On the Spectrum 1 switch, if PTP is configured on a bond, the ptp4l service fails to start and you see a message similar to the following:
systemd[1]: ptp4l.service: Failed with result 'exit-code'
systemd[1]: Failed to start Precision Time Protocol (PTP) service
5.4.0
3293560
If you run NVUE commands to break out a port into four interfaces, NVUE disables the subsequent port automatically. However, if you run NVUE commands to break out a port into eight interfaces, NVUE does not disable the subsequent port automatically; you have to run the NVUE command to disable the subsequent port.
5.4.0
3269691
When you restart the LLDP service, you see a broken pipe error and a log message in the lldpd.service logs. This error does not affect LLDP functionality.
5.4.0
3258232
If you use NVUE to configure multiple SNMP listener addresses at the same time, the SNMP service fails to start. To work around this issue, configure multiple SNMP listener addresses one at a time.
5.3.0-5.4.0
3241567
When you apply switch configuration for the first time on a freshly booted switch, you might see the error message Failed to start Hostname Service when you run the nv config apply command after setting the hostname with nv set system hostname. To work around this issue, run the nv config apply command a second time.
5.3.0-5.4.0
3234814
With double tagged QinQ interfaces, if the bridge corresponding to the QinQ interface flaps, you might see invalid learning notifications and errors similar to the following:
Can't set non-static MAC address for non-vPort 0x0001006B when VID is VFID.
5.3.0-5.4.0
3232091
The NVUE nv unset interface link lanes command does not restore the port lane setting to the default value. To work around this issue, run the nv set interface link lanes command.
5.4.0
3226506
The l1-show eth0 command does not show port information and is not supported in this release.
5.3.0-5.4.0
3221628
Cumulus Linux 5.2.0 and 5.2.1 VX images might include an incorrect entry at the end of /etc/apt/sources.list, which produces warnings when you run apt update. Remove this entry to avoid these warnings.
5.2.0-5.4.0
3172682
On rare occasions, when you query the system hostname through the hostnamectl application, you see a timeout. NVUE uses the hostnamectl application to determine the system hostname, which can result in an nv config apply command failure.
5.2.0-5.4.0
3172504
When you connect the NVIDIA SN4600C switch to a Spectrum 1 or Spectrum-3 switch with a 40GbE passive copper cable (Part Number: MC2210126-005) on edge ports 1-4 and 61-64, there is an Effective BER of 1E-12 in PHY.
5.2.0-5.4.0
3147782
You cannot use NVUE to configure an SNMP view to include a subtree beginning with a period. For example:
$ nv set service snmp-server viewname cumulusOnly included .1.3.6.1.4.1.40310
Error: GET /nvue_v1/service/snmp-server/viewname/cumulusOnly/included?pointers=%5B%22%2Fparameters%22%2C+%22%2Fpatch%2FrequestBody%2Fcontent%2Fapplication~1json%2Fschema%22%2C+%22%2Fpatch%2Fparameters%22%2C+%22%2Fpatch%2Fresponses%2F200%2Flinks%22%5D responded with 404 NOT FOUND
To work around this problem, reference the OID without the preceding period ( . ) in the command.
5.3.0-5.4.0
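If you generate NVUE commands from a list of OIDs, the workaround for issue 3147782 amounts to dropping the preceding period before you pass the OID to the command. The following is a minimal sketch in Python; the function name is a hypothetical helper and is not part of NVUE.

```python
def strip_leading_period(oid: str) -> str:
    """Return the OID without any preceding period.

    Hypothetical helper for the workaround in issue 3147782; OIDs that do
    not start with a period are returned unchanged.
    """
    return oid.lstrip(".")


if __name__ == "__main__":
    # The OID comes from the failing example in the issue description.
    print(strip_leading_period(".1.3.6.1.4.1.40310"))
```

The result can then be used in place of the original OID in the nv set service snmp-server viewname command.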
3145224
If you disable the NVUE service, the /etc/cumulus/datapath/nvue_traffic.conf file does not delete automatically, which prevents ECMP and LAG hash settings in the /etc/cumulus/datapath/traffic.conf file from taking effect. To work around this issue, delete the nvue_traffic.conf file with the sudo rm /etc/cumulus/datapath/nvue_traffic.conf command.
5.2.0-5.4.0
3145204
On the NVIDIA Spectrum-1 switch, the nv show system forwarding command shows GTP hashing output, which is not supported on this switch.
5.2.0-5.4.0
3142615
The BGP4-MIB.txt file is missing from Net-SNMP in Cumulus Linux 5.0.0 through 5.2.1 and 5.4.0.
5.0.0-5.4.0
3135952
PAM4 split cables (such as 2x100G, 4x100G, and 4x50G) do not work with a forced speed setting (when auto-negotiation is off) because the default speed enabled is for NRZ mode (such as 100G_4X). To work around this issue, set the appropriate lanes for forced speed (with auto-negotiation off) with the ethtool -s swpX speed <port_speed> autoneg off lanes <no_of_lanes> command. For example:
ethtool -s swp1 speed 100000 autoneg off lanes 2
5.2.0-5.4.0
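When several split ports need the forced speed workaround from issue 3135952, it can help to generate the ethtool commands instead of typing them. The following is a minimal sketch in Python that only assembles the command line shown in the workaround; the function name is a hypothetical helper, and the interface name, speed, and lane count are taken from the example above.

```python
from typing import List


def forced_speed_command(interface: str, speed_mbps: int, lanes: int) -> List[str]:
    """Build the ethtool argument list from the workaround for issue 3135952.

    Hypothetical helper: it only assembles the command; it does not run it.
    The speed is given in Mb/s, as in the example (100000 for 100G).
    """
    return [
        "ethtool", "-s", interface,
        "speed", str(speed_mbps),
        "autoneg", "off",
        "lanes", str(lanes),
    ]


if __name__ == "__main__":
    # Values come from the example in the workaround.
    print(" ".join(forced_speed_command("swp1", 100000, 2)))
```

Choosing the correct lane count for a given cable remains your decision; the sketch does not validate it.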
3122301
On the NVIDIA SN4700 switch, inserting and removing the PSU might cause loss of frames.
5.2.0-5.4.0
3115242
When you configure two VNIs in the same VLAN, ifupdown2 shows a vlan added to two or more VXLANS warning, which is only issued after the VNI is already added to the bridge. This leaves the new VNI in the PVID even if there is already an existing VNI configured in that PVID.
5.1.0-5.4.0
3103821
On the NVIDIA SN4700 switch, inserting and removing the PSU might cause loss of frames.
5.2.0-5.4.0
3084476
QoS traffic shaping does not restore the default configuration after you disable traffic shaping in the /etc/cumulus/datapath/qos/qos_features.conf file. To work around this issue, restart switchd.
4.4.3, 5.0.0-5.4.0    4.4.4-4.4.5
3084027
Under a high load, you might see ingress drop counters increase. The drops are classified as HwIfInDiscards in ethtool and shown as ingress_general in hardware.
4.3.0-4.4.5, 5.0.0-5.4.0
3071652
On rare occasions, after you reboot or restart switchd on a Spectrum 1 switch, any 25G connections with Direct Attach Copper (DAC) cables that connect from the switch to a non-NVIDIA device might flap continuously. To work around this issue, bring the affected link administratively down for a few seconds on the non-NVIDIA device, then bring the link back up.
4.4.4-4.4.5, 5.1.0-5.4.0
3061656
When the CPU load is high during a warm boot, bonds with a slow LACP rate fail to forward layer 2 traffic for up to 60 seconds (depending on the duration of the CPU load) and static bonds fail to forward layer 2 traffic for up to 5 seconds.
5.1.0-5.4.0
3055283
After you run Linux commands to enable a custom ECMP or LAG hash parameter, if you set the hash_config.enable or lag_hash_config.enable parameter to false, the custom parameters do not restore their default values. To work around this issue, change the custom ECMP or LAG hash parameters to their default values manually.
5.1.0-5.4.0
3053197
The cl-resource-query command output shows ECMP nextHop Table exhaustion (above 100 percent utilization) and the switchd.log file contains ECMP resource errors with routes and next hops failing to install.
4.2.1-4.4.5, 5.0.0-5.4.0
3053094
When the CPU load is high during a warm boot, bonds with a slow LACP rate fail to forward layer 2 traffic for up to 60 seconds (depending on the duration of the CPU load) and static bonds fail to forward layer 2 traffic for up to 5 seconds.
5.1.0-5.4.0
3045310
If GTP Hashing is set to true, after more than two warm boots, switchd fails and a cl-support file is generated.
5.1.0-5.4.0
3034435
In an MLAG EVPN deployment, when either clag peer reboots, FRR incorrectly programs the local host entries in the ARP table as remote.
4.4.4-4.4.5, 5.1.0-5.4.0
2972540
With RADIUS enabled for user shell authentication, there might be a delay in local user authentication for non-cumulus user accounts.
5.0.0-5.4.0
2964279
When a VNI flaps, an incorrect list of layer 2 VNIs is associated with a layer 3 VNI. The NCLU net show evpn vni detail command output shows duplicate layer 2 VNIs under a layer 3 VNI.
3.7.15, 4.4.2-4.4.5, 5.0.0-5.4.0    3.7.16
2951110
The net show time ntp servers command does not show any output with management VRF.
3.7.15-3.7.16, 4.1.1-4.4.5, 5.0.0-5.4.0
2904450
When you run the ethtool -m or the l1-show command, the 400G interface optical values do not show.
4.4.0-4.4.5, 5.0.0-5.4.0
2891255
CVE-2021-39925: Buffer overflow in the Bluetooth SDP dissector in Wireshark 3.4.0 to 3.4.9 and 3.2.0 to 3.2.17 allows denial of service via packet injection or crafted capture file
Vulnerable: <= 2.6.20-0+deb10u1
Fixed: 2.6.20-0+deb10u2
4.0.0-4.4.1, 5.0.0-5.4.0    4.4.2-4.4.5
2890681
CVE-2021-42771: relative path traversal in Babel, a set of tools for internationalising Python applications, could result in the execution of arbitrary code
Vulnerable: 2.6.0+dfsg.1-1
Fixed: 2.6.0+dfsg.1-1+deb10u1
4.0.0-4.4.1, 5.0.0-5.4.0    4.4.2-4.4.5
2867042
When connecting the SN4600V to an SNxxxx system, it must run in auto-negotiation mode (not force mode); otherwise, the wrong Tx configuration might be chosen.
5.0.0-5.4.0
2859015
In a static VXLAN configuration with a traditional VXLAN device, enabling bridge learning on the VNI leads to an incorrect warning and the setting is removed in the next commit. The warning is similar to the following:
warning: vni10: possible mis-configuration detected: l2-vni configured with bridge-learning ON while EVPN is also configured - these two parameters conflict with each other
5.0.0-5.4.0
2847919
Configuring a router with the REST API through the switch front panel ports (swps) is supported in the default VRF only.
To work around this issue, use the localhost IP address or the management IP address to configure the router using the REST API.
5.0.0-5.4.0
2847755
When you use NCLU to remove the configuration for a peer that is a member of a group but also has other peer-specific configuration, you must remove the peer-specific configuration before you delete the peer in a separate NCLU commit.
5.0.0-5.4.0
2823307
Cumulus Linux does not support a bond with more than 64 ports. Any configuration with more than 64 ports in a bond changes all ports to down when you apply the configuration.
5.0.0-5.4.0
2736108
When you change the VRRP advertisement interval on the master, the master advertisement interval field in the show vrrp command output does not show the updated value.
4.4.0-4.4.5, 5.0.0-5.4.0
2705056
SVIs do not inherit the pinned MAC address of the bridge.
4.3.0, 5.0.0-5.4.0    4.3.1-4.4.5
2701000
A default route learned from DHCP on eth0 in the management VRF might install in the default VRF if eth0 is disconnected and the original next hop is reachable in the default VRF. To work around this issue, delete the DHCP lease file for eth0 with the sudo rm /var/lib/dhcp/dhclient.eth0.leases command.
4.3.0, 5.0.0-5.4.0    4.3.1-4.4.5
2684925
The NVUE nv show vrf default router bgp peer command produces a 404 not found error.
4.4.0-4.4.5, 5.0.0-5.4.0

Fixed Issues in 5.4.0

Issue ID    Description    Affects
3351953
In rare circumstances, attempting to install a Cumulus Linux 5.3 image can fail during installation. The device stops at the (initramfs) prompt. To resume installation, enter the exit command at the (initramfs) prompt.
5.3.0-5.3.1
3351951
Currently, the default core dump size limit on Cumulus Linux is 256M but the SDK generates core dumps around 800M. To avoid incomplete core files, you can increase the core dump size limit.
4.2.1-5.3.1
3351936
Switch fans run at very high speed but the temperature is normal.
5.2.0-5.3.1
3344373
When the switch boots up, you might see logs similar to the following in the nvued log files because switchd is not up and running. This does not impact switch functionality.
2023-01-29T06:05:18.683152+00:00 cumulus nvued:  INFO: apply_config.py:2177 Apply Issues: (b'),(update-ports returned with error (code 254): ports validation node file is not accessibleswitchd validate_node is absent),(ports configuration(ports.conf/ports_width.conf) is invalid),(')
3339336
The ethtool -m command does not show Digital Optical Monitoring (DOM) for SFP transceivers. To work around this issue, run the l1-show or mlxlink command instead.
5.2.0-5.3.1
3332869
When a switch is operating as a PTP Grand Master, the phc2sys service might exit shortly after starting as the initial offset to correct is the delta from epoch, which is too large to correct.
3330705
When using TACACS+, a TACACS+ server name that returns more than one IP address, such as an IPv6 and IPv4 address, is counted many times against the limit of seven TACACS+ servers, which might cause some of the later listed servers to be ignored as over the limit. To work around this issue, you can set the prefer_ip_version configuration option (the default value is 4) to choose between an IPv4 or IPv6 address if both are present.
3.7.0-5.3.1
3330600
The SNMP monitor might fail to send the expected traps.
5.3.0-5.3.1
3329096
The traffic control rules that the EVPN multihoming configuration adds to an interface are deleted when the hsflowd service restarts. The hsflowd service deletes the EVPN multihoming traffic control filters after you stop hsflowd, then adds back the match-all filters with the psample action; however, hsflowd does not add back the EVPN multihoming traffic control rules.
4.4.0-5.3.1
3308248
DHCP packets do not forward over VXLAN interfaces in multicast replication environments. This issue does not affect VXLAN environments using head end replication (HER).
5.2.0-5.3.1
3303084
The memory consumption in ptmd can grow when the socket being used for a BFD session needs to be recreated. This is often seen when the route being used to forward the BFD packets is removed; for example, if the connected route is removed when an interface goes down, over which a single hop BFD session is formed.
5.3.0-5.3.1
3301988
Some EVPN multihoming show commands might cause BGP to crash if you use the json flag and attempt to reference the default VRF by name. For example, show bgp l2vpn evpn es-vrf json.
5.0.0-5.3.1
3301950
When upgrading from Cumulus Linux 5.0.0 through 5.2.1 to Cumulus Linux 5.3.0 or 5.3.1, the babeltrace and python3-babeltrace packages are not added automatically even though they are in the default image in Cumulus Linux 5.3.0 and later. You might need these packages to decode LTTNG traces with /usr/lib/frr/frr_babeltrace.py. If you need to use this script, run the sudo apt update && sudo apt install babeltrace python3-babeltrace command to install the packages.
5.3.0-5.3.1
3298616
NVUE gracefully detects and handles upgrades that include valid flexible snippets. For any invalid (incompatible) flexible snippets, you must delete the snippets before you apt upgrade Cumulus Linux; otherwise, the NVUE nv config apply command and the equivalent REST API do not run.
5.3.0-5.3.1
3293039
When you add the /etc/frr/frr.conf file to the ignore list for NVUE, any configuration change causes FRR to restart because a check is done to see if any running configuration has changed since the previously applied configuration in the vtysh shell.
5.3.0-5.3.1
3292773
NVUE requires the SNMPv2 community string to be a minimum of eight characters.
5.3.0-5.3.1
3289972
When the switch needs to forward a frame that has a source MAC address of 00:00:00:00:00:00, the dmesg log might report the message bridge: RTM_NEWNEIGH with invalid ether address in a loop every 30 seconds. The log message is harmless and frames with that MAC forward correctly.
5.3.0-5.3.1
3289646
The memory consumption in ptmd can grow when the socket being used for a BFD session needs to be recreated. This is often seen when the route being used to forward BFD packets is removed; for example, if the connected route is removed when an interface goes down, over which a single hop BFD session is formed.
5.2.0-5.3.1
3283598
After you restart the FRR service, show commands incorrectly reflect the VLAN associated with layer 3 VNIs as 0:
# net show evpn vni 123
VNI: 123
Type: L3
Tenant VRF: BLUE
Vlan: 0
5.3.0-5.3.1
3267328
On Spectrum 1 switches, when configuring ACLs in non-atomic mode, if there are too many IPv6 matches due to rules with both input-interface and output-interface matches on SVIs, the ACL install fails and switchd crashes.
5.2.0-5.3.1
3266050
Due to a race at the initial configuration, the SDK RDQ test may test RDQ configured for WJH and fail the test resulting in a fatal health event.
5.2.0-5.3.1
3262012
When an FRR routing service (such as bgpd) becomes unresponsive, watchfrr might fail to stop and restart the service. To work around this issue, restart FRR with the systemctl restart frr command.
4.4.0-5.3.1
3255899
The Linux utility that sends ARP packets is constrained to 512 interfaces on the system. In large scale deployments, the warm boot process fails repeatedly as it sends gratuitous ARP requests for each local address. This issue does not impact the functionality and can be ignored.
5.2.0-5.3.1
3244955
ACL configurations fail when the TCAM memory is exhausted because the CTCAM profile is configured with duplicate entries.
5.2.0-5.3.1
3241047
When you delete a route under the following conditions, switchd might crash:
- The minimum number of routes is set to a non-zero value
- KVD utilization is higher than sixty percent
- The number of routes currently configured is less than the minimum reserved value, and multiple KVD linear resources have just been freed and are waiting in the Garbage Collector queue.
5.2.0-5.3.1
3234085
When you configure or unconfigure a BGP peer and interface towards a host, memory corruption can cause BGP to crash.
5.0.1-5.3.1
3226525
When using TACACS+, if you configure per-command authorization with the tacplus-restrict command, NVUE configuration commands fail for any user with a privilege level lower than 15. This occurs because NVUE is not able to create a .local user directory.
5.2.0-5.3.1
3145222
The NVUE nv show system forwarding --output json command does not provide any output. To work around this issue, run the nv show system forwarding command.
5.2.0-5.3.1
3074390
You cannot apply NVUE configurations when TACACS is enabled for user authentication. To work around this issue, add the nvue account to the exclude_users line in /etc/tacplus_nss.conf:
exclude_users=root,daemon,nobody,cron,radius_user,radius_priv_user,sshd,cumulus,quagga,frr,nvue,snmp,www-data,ntp,man,_lldpd,*
5.0.1-5.3.1
3037824
The NVUE nv show interface link state command shows an empty table instead of showing the port link state.
5.0.0-5.3.1
3015393
The NVUE nv show interface command shows the operational state of the tunnel as down even though the tunnel is up, and encapsulation and decapsulation occur correctly.
5.1.0-5.3.1
2821929
FRR restarts even when the NVUE configuration overwrite mode is set.
5.0.0-5.3.1