Monitoring and Troubleshooting
This chapter introduces the basics for monitoring and troubleshooting Cumulus Linux.
The serial console is a useful tool for debugging issues, especially if you reboot the switch often or if you do not have a reliable network connection.
The default serial console baud rate is 115200, which is the baud rate ONIE uses.
Configure the Serial Console
On x86 switches, you configure the serial console baud rate by editing the grub configuration.
Incorrect configuration settings in
grub can cause the switch to be inaccessible via the console. Review
grub changes carefully before you implement them.
The valid values for the baud rate are:
To change the serial console baud rate:
Edit the /etc/default/grub file. The two relevant lines in
/etc/default/grub are as follows; replace the 115200 value with one of the valid values listed above in the
--speed variable in the first line and in the
console variable in the second line:
GRUB_SERIAL_COMMAND="serial --port=0x2f8 --speed=115200 --word=8 --parity=no --stop=1"
GRUB_CMDLINE_LINUX="console=ttyS1,115200n8 cl_platform=accton_as5712_54x"
After you save your changes to the grub configuration, type the following at the command prompt:
If you plan on accessing the switch BIOS over the serial console, you need to update the baud rate in the switch BIOS. For more information, see this knowledge base article.
Reboot the switch.
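The edit described above touches two values that must stay in step. As a minimal sketch, assuming you prefer to script the substitution rather than edit by hand, the following Python helper rewrites both values in the text of the grub file. The function name `set_console_baud` is hypothetical, and 9600 is used purely as an illustration; check any value against the valid baud rates for your platform before applying it.

```python
import re


def set_console_baud(grub_text: str, baud: int) -> str:
    """Return grub_text with both serial console baud rates set to baud."""
    # --speed=<rate> inside GRUB_SERIAL_COMMAND
    text = re.sub(r"(--speed=)\d+", rf"\g<1>{baud}", grub_text)
    # console=ttyS<port>,<rate> inside GRUB_CMDLINE_LINUX; the "n8" suffix is kept
    text = re.sub(r"(console=ttyS\d+,)\d+", rf"\g<1>{baud}", text)
    return text


sample = (
    'GRUB_SERIAL_COMMAND="serial --port=0x2f8 --speed=115200 --word=8 --parity=no --stop=1"\n'
    'GRUB_CMDLINE_LINUX="console=ttyS1,115200n8 cl_platform=accton_as5712_54x"\n'
)
print(set_console_baud(sample, 9600))
```

The sketch only transforms text; writing the result back to /etc/default/grub and regenerating the grub configuration remain separate, deliberate steps.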
Change the Console Log Level
By default, the console prints all log messages except debug messages. To tune console logging to be less verbose so that certain levels of messages are not printed, run the
dmesg -n <level> command, where the log levels are:
|0||Emergency messages (the system is about to crash or is unstable).|
|1||Serious conditions; you must take action immediately.|
|2||Critical conditions (serious hardware or software failures).|
|3||Error conditions (often used by drivers to indicate difficulties with the hardware).|
|4||Warning messages (nothing serious but might indicate problems).|
|5||Message notifications for many conditions, including security events.|
Only messages with a value lower than the level specified are printed to the console. For example, if you specify level 3, only level 2 (critical conditions), level 1 (serious conditions), and level 0 (emergency messages) are printed to the console:
cumulus@switch:~$ sudo dmesg -n 3
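The "lower than the level specified" rule is easy to misread as "lower than or equal". A minimal Python sketch of the rule follows; the function name `printed_levels` is hypothetical, and the short level names are the conventional kernel names for levels 0 through 7, assumed here rather than taken from the table above.

```python
# Conventional kernel log level names, indexed by numeric level (assumption).
LEVEL_NAMES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def printed_levels(console_level: int) -> list:
    """Return the (number, name) pairs still printed after `dmesg -n <console_level>`."""
    if not 1 <= console_level <= 8:
        raise ValueError("console level must be between 1 and 8")
    # Only levels strictly lower than the console level reach the console.
    return [(n, LEVEL_NAMES[n]) for n in range(console_level)]


print(printed_levels(3))
```

For level 3 the result contains levels 0, 1, and 2 only, which matches the example above.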
Alternatively, you can run
dmesg --console-level <level> command, where the log levels are
debug. For example, to print critical conditions, run the following command:
cumulus@switch:~$ sudo dmesg --console-level crit
The level set with the dmesg command applies until the next reboot.
For more details about the
dmesg command, run
Show General System Information
Two commands are helpful for getting general information about the switch and the version of Cumulus Linux you are running. These are helpful with system diagnostics and if you need to submit a support request.
For information about the version of Cumulus Linux running on the switch, run the
net show version command, which displays the contents of
cumulus@switch:~$ net show version
NCLU_VERSION=1.0-cl4u1
DISTRIB_ID="Cumulus Linux"
DISTRIB_RELEASE=4.4.0
DISTRIB_DESCRIPTION="Cumulus Linux 4.4.0"
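If you collect this output from many switches, for example to attach to a support request, it can help to turn it into a lookup table. The following is a minimal sketch, assuming the output has already been captured as a string; `parse_key_values` is a hypothetical helper, not part of NCLU.

```python
import shlex


def parse_key_values(text: str) -> dict:
    """Parse KEY=VALUE pairs, honoring double-quoted values."""
    result = {}
    # shlex.split handles the quoting and splits on any whitespace or newline.
    for token in shlex.split(text):
        key, sep, value = token.partition("=")
        if sep:
            result[key] = value
    return result


output = '''NCLU_VERSION=1.0-cl4u1
DISTRIB_ID="Cumulus Linux"
DISTRIB_RELEASE=4.4.0
DISTRIB_DESCRIPTION="Cumulus Linux 4.4.0"'''
print(parse_key_values(output)["DISTRIB_RELEASE"])
```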
For general information about the switch, run
net show system, which gathers information about the switch from a number of files in the system:
cumulus@switch:~$ net show system
Hostname......... mlx-3700
Build............ Cumulus Linux 4.3.0~1605304302.c2213761
Uptime........... 19 days, 9:35:29.710000
Model............ Mlnx MSN3700C
CPU.............. x86_64 Intel Pentium D D1508 2.20 GHz
Memory........... 8GB
Disk............. 28GB
ASIC............. Mellanox Spectrum-2 MTxxxxxx
Ports............ 32 x 100G-QSFP28
...
Diagnostics Using cl-support
You can use
cl-support to generate a single export file that contains various details and the configuration from a switch. This is useful for remote debugging and troubleshooting. For more information about
cl-support, read Understanding the cl-support Output File.
Run cl-support before you submit a support request, as this file helps in the investigation of issues.
cumulus@switch:~$ sudo cl-support -h
Usage: [-h (help)] [-cDjlMsv] [-d m1,m2,...] [-e m1,m2,...]
  [-p prefix] [-r reason] [-S dir] [-T Timeout_seconds] [-t tag]
  -h: Display this help message
  -c: Run only modules matching any core files, if no -e modules
  -D: Display debugging information
  -d: Disable (do not run) modules in this comma separated list
  -e: Enable (only run) modules in this comma separated list; "-e all" runs
      all modules and sub-modules, including all optional modules
...
Send Log Files to a syslog Server
You can configure the remote syslog server on the switch using the following configuration:
cumulus@switch:~$ net add syslog host ipv4 192.168.0.254 port udp 514
cumulus@switch:~$ net pending
cumulus@switch:~$ net commit
This creates a file called
/etc/rsyslog.d/11-remotesyslog.conf in the
rsyslog directory. The file has the following content:
cumulus@switch:~$ cat /etc/rsyslog.d/11-remotesyslog.conf
# This file was automatically generated by NCLU.
*.* @192.168.0.254:514 # UDP
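The generated rule has a simple shape: a selector (`*.*` for all facilities and severities), then `@` and the server address for UDP. In rsyslog's forwarding syntax a double `@@` selects TCP instead. As a rough sketch of that shape, the hypothetical helper below builds such a line; it is an illustration only and not how NCLU generates the file.

```python
def forward_rule(host: str, port: int = 514, protocol: str = "udp", selector: str = "*.*") -> str:
    """Build a legacy-style rsyslog forwarding rule for one remote server."""
    # A single @ means UDP; a double @@ means TCP.
    prefix = {"udp": "@", "tcp": "@@"}[protocol]
    return f"{selector} {prefix}{host}:{port}"


print(forward_rule("192.168.0.254"))
print(forward_rule("192.168.0.254", protocol="tcp"))
```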
Log Technical Details
Logging on Cumulus Linux is done with rsyslog.
rsyslog provides both local logging to the
syslog file as well as the ability to export logs to an external
syslog server. High-precision timestamps are enabled for all
rsyslog log files; for example:
2015-08-14T18:21:43.337804+00:00 cumulus switchd: switchd.c:1409 switchd version 1.0-cl2.5+5
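Because the timestamp is in ISO 8601 form with microseconds and a UTC offset, standard tooling can parse it directly. The following minimal sketch assumes the four-field layout of the sample line above (timestamp, hostname, program, message); `parse_log_line` is a hypothetical helper, and lines written by applications that bypass rsyslog might not follow this layout.

```python
from datetime import datetime


def parse_log_line(line: str):
    """Split a log line into (timestamp, host, program, message)."""
    timestamp, host, program, message = line.split(" ", 3)
    # fromisoformat keeps the microseconds and the +00:00 offset.
    return datetime.fromisoformat(timestamp), host, program.rstrip(":"), message


line = "2015-08-14T18:21:43.337804+00:00 cumulus switchd: switchd.c:1409 switchd version 1.0-cl2.5+5"
when, host, program, message = parse_log_line(line)
print(when.microsecond, host, program)
```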
There are applications in Cumulus Linux that can write directly to a log file without going through
rsyslog. These files are typically located in
All Cumulus Linux rules are stored in separate files in
/etc/rsyslog.d/, which are called at the end of the
GLOBAL DIRECTIVES section of
/etc/rsyslog.conf. As a result, the
RULES section at the end of
rsyslog.conf is ignored because the messages have to be processed by the rules in
/etc/rsyslog.d and then dropped by the last line in
Most logs within Cumulus Linux are sent through
rsyslog, which writes them to files in the
/var/log directory. There are default rules in the
/etc/rsyslog.d/ directory that define where the logs are written:
|10-rules.conf||Sets defaults for log messages, including log format and log rate limits.|
|22-linkstate.conf||Logs link state changes for all physical and logical network links to |
|45-frr.conf||Logs routing protocol messages to |
|99-syslog.conf||All remaining processes that use |
Log files that are rotated are compressed into an archive. Processes that do not use
rsyslog write to their own log files within the
/var/log directory. For more information on specific log files, see Troubleshooting Log Files.
Enable Remote syslog
By default, not all log messages are sent to a remote server. To send other log files (such as
switchd logs) to a
syslog server, follow these steps:
Create a file in
/etc/rsyslog.d/. Make sure the filename starts with a number lower than 99 so that it executes before log messages are dropped in, such as
25-switchd.conf. The example file below is called
/etc/rsyslog.d/11-remotesyslog.conf. Add content similar to the following:
## Logging switchd messages to remote syslog server @192.168.1.2:514
This configuration sends log messages to a remote
syslog server for the following processes:
syslog. It follows the same syntax as the
/var/log/syslog file, where @ indicates UDP, 192.168.1.2 is the IP address of the
syslog server, and 514 is the UDP port.
- For TCP-based syslog, use two @ characters before the IP address: @@192.168.1.2:514.
syslog over TCP places a burden on the switch to queue packets in the
syslog buffer. This might cause detrimental effects if the remote
syslog server becomes unavailable.
- The numbering of the files in
/etc/rsyslog.d/ dictates how the rules are installed into
rsyslog.d. Lower-numbered rules are processed first, and
rsyslog processing terminates with the
stop keyword. For example, the
rsyslog configuration for FRR is stored in the
45-frr.conf file with an explicit
stop at the bottom of the file. FRR messages are logged to the
/var/log/frr/frr.log file on the local disk only (these messages are not sent to a remote server using the default configuration). To log FRR messages remotely in addition to writing FRR messages to the local disk, rename the
11-remotesyslog.conf. FRR messages are first processed by the
11-remotesyslog.conf rule (transmit to remote server), then continue to be processed by the
45-frr.conf file (write to local disk in the
- Do not use the
imfile module with any file written by
cumulus@switch:~$ sudo systemctl restart rsyslog.service
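The interaction between file numbering and the stop keyword can be modeled in a few lines. The following Python sketch is a simplified model, not rsyslog itself: the `Rule` type, the `process` function, the action strings, and the assumption that the FRR rule matches a program named `bgpd` are all hypothetical and exist only to illustrate the ordering described above.

```python
from typing import Callable, List, NamedTuple


class Rule(NamedTuple):
    filename: str                    # file name under /etc/rsyslog.d/
    matches: Callable[[str], bool]   # does the rule apply to this program?
    action: str                      # what the rule does with the message
    stop: bool                       # True if the rule ends with "stop"


def process(program: str, rules: List[Rule]) -> List[str]:
    """Return the actions applied to a message from program, in order."""
    actions = []
    # Files are read in sorted name order, so lower numbers run first.
    for rule in sorted(rules, key=lambda r: r.filename):
        if rule.matches(program):
            actions.append(rule.action)
            if rule.stop:
                break  # later files never see this message
    return actions


rules = [
    Rule("99-syslog.conf", lambda p: True, "write to local syslog file", False),
    Rule("45-frr.conf", lambda p: p == "bgpd", "write to /var/log/frr/frr.log", True),
    Rule("11-remotesyslog.conf", lambda p: True, "send to remote server", False),
]
print(process("bgpd", rules))
print(process("switchd", rules))
```

In this model a message from `bgpd` is forwarded and then written to the FRR log, and it never reaches the rule numbered 99 because of the stop flag. If the forwarding rule were numbered higher than 45, FRR messages would not be forwarded at all.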
Write to syslog with Management VRF Enabled
You can write to syslog with management VRF enabled by applying the following configuration; this configuration is commented out in the
cumulus@switch:~$ cat /etc/rsyslog.d/11-remotesyslog.conf
## Copy all messages to the remote syslog server at 192.168.0.254 port 514
action(type="omfwd" Target="192.168.0.254" Device="mgmt" Port="514" Protocol="udp")
For each syslog server, configure a unique
action line. For example, to configure two syslog servers at 192.168.0.254 and 10.0.0.1:
cumulus@switch:~$ cat /etc/rsyslog.d/11-remotesyslog.conf
## Copy all messages to the remote syslog servers at 192.168.0.254 and 10.0.0.1 port 514
action(type="omfwd" Target="192.168.0.254" Device="mgmt" Port="514" Protocol="udp")
action(type="omfwd" Target="10.0.0.1" Device="mgmt" Port="514" Protocol="udp")
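When the list of servers is long or changes often, generating one action line per server avoids copy and paste mistakes such as a missing parenthesis. The following is a minimal sketch that reproduces the line format shown above; `omfwd_action` and `render_actions` are hypothetical helper names, and the defaults simply mirror the example values.

```python
def omfwd_action(target: str, device: str = "mgmt", port: int = 514, protocol: str = "udp") -> str:
    """Build one omfwd action line bound to the given VRF device."""
    return (
        f'action(type="omfwd" Target="{target}" Device="{device}" '
        f'Port="{port}" Protocol="{protocol}")'
    )


def render_actions(servers) -> str:
    """Build one action line per syslog server, one per output line."""
    return "\n".join(omfwd_action(server) for server in servers)


print(render_actions(["192.168.0.254", "10.0.0.1"]))
```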
Rate-limit syslog Messages
If you want to limit the number of
syslog messages that can be written to the
syslog file from individual processes, add the following configuration to the
/etc/rsyslog.conf file. Adjust the interval and burst values to rate-limit messages to the appropriate levels required by your environment. For more information, read the rsyslog documentation.
module(load="imuxsock" SysSock.RateLimit.Interval="2" SysSock.RateLimit.Burst="50")
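To build intuition for how the interval and burst values interact, the following Python sketch models a simple fixed-window limiter for a single process: within each interval, the first `burst` messages pass and the rest are dropped. This is a simplified model under that assumption, not rsyslog's implementation, and `rate_limit` is a hypothetical name.

```python
def rate_limit(timestamps, interval: float = 2.0, burst: int = 50):
    """Return the timestamps (in seconds) that pass the limiter."""
    accepted = []
    window_start = None
    count = 0
    for ts in timestamps:
        # Open a new window on the first message or once the interval has elapsed.
        if window_start is None or ts - window_start >= interval:
            window_start = ts
            count = 0
        count += 1
        if count <= burst:
            accepted.append(ts)
    return accepted


# 120 messages in 1.2 seconds from one process: only the first 50 pass.
flood = [i / 100 for i in range(120)]
print(len(rate_limit(flood)))
```

With the values from the configuration above, a process that logs steadily at fewer than 50 messages every 2 seconds is unaffected in this model, while a sudden flood is cut off for the rest of the window.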
The following test script shows an example of rate-limit output in Cumulus Linux:
Harmless syslog Error: Failed to reset devices.list
The following message is logged to
/var/log/syslog when you run
systemctl daemon-reload and during system boot:
systemd: Failed to reset devices.list on /system.slice: Invalid argument
This message is harmless and can be ignored. It is logged when
systemd attempts to change group attributes that are read only. The upstream version of
systemd has been modified to not log this message by default.
The systemctl daemon-reload command is often issued when Debian packages are installed, so you might see the message multiple times when upgrading packages.
Syslog Troubleshooting Tips
You can use the following commands to troubleshoot
Verify that rsyslog is Running
To verify that the
rsyslog service is running, use the
sudo systemctl status rsyslog.service command:
cumulus@leaf01:mgmt-vrf:~$ sudo systemctl status rsyslog.service
rsyslog.service - System Logging Service
   Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled)
   Active: active (running) since Sat 2017-12-09 00:48:58 UTC; 7min ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 11751 (rsyslogd)
   CGroup: /system.slice/rsyslog.service
           └─11751 /usr/sbin/rsyslogd -n

Dec 09 00:48:58 leaf01 systemd: Started System Logging Service.
Verify your rsyslog Configuration
After making manual changes to any files in the
/etc/rsyslog.d directory, use the
sudo rsyslogd -N1 command to identify any errors in the configuration files that might prevent the
rsyslog service from starting.
In the following example, a closing parenthesis is missing in the
11-remotesyslog.conf file, which is used to configure
syslog for management VRF:
cumulus@leaf01:mgmt-vrf:~$ cat /etc/rsyslog.d/11-remotesyslog.conf
action(type="omfwd" Target="192.168.0.254" Device="mgmt" Port="514" Protocol="udp"
cumulus@leaf01:mgmt-vrf:~$ sudo rsyslogd -N1
rsyslogd: version 8.4.2, config validation run (level 1), master config /etc/rsyslog.conf
rsyslogd: error during parsing file /etc/rsyslog.d/15-crit.conf, on or before line 3: invalid character '$' in object definition - is there an invalid escape sequence somewhere? [try http://www.rsyslog.com/e/2207 ]
rsyslogd: error during parsing file /etc/rsyslog.d/15-crit.conf, on or before line 3: syntax error on token 'crit_log' [try http://www.rsyslog.com/e/2207 ]
After correcting the invalid syntax, issuing the
sudo rsyslogd -N1 command produces the following output.
cumulus@leaf01:mgmt-vrf:~$ cat /etc/rsyslog.d/11-remotesyslog.conf
action(type="omfwd" Target="192.168.0.254" Device="mgmt" Port="514" Protocol="udp")
cumulus@leaf01:mgmt-vrf:~$ sudo rsyslogd -N1
rsyslogd: version 8.4.2, config validation run (level 1), master config /etc/rsyslog.conf
rsyslogd: End of config validation run. Bye.
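An unbalanced parenthesis like the one in the example above is easy to catch before the file ever reaches the switch. The following Python sketch is a rough pre-check only, and it does not replace rsyslogd -N1, which understands the full configuration syntax. `parens_balanced` is a hypothetical helper that ignores parentheses inside double-quoted values.

```python
def parens_balanced(line: str) -> bool:
    """Return True if parentheses outside double quotes are balanced."""
    depth = 0
    in_quotes = False
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    return False  # a closing parenthesis came first
    # An unterminated quoted value also counts as a failure.
    return depth == 0 and not in_quotes


broken = 'action(type="omfwd" Target="192.168.0.254" Device="mgmt" Port="514" Protocol="udp"'
fixed = broken + ")"
print(parens_balanced(broken), parens_balanced(fixed))
```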
If a syslog server is not accessible to validate that
syslog messages are being exported, you can use
In the following example, a syslog server has been configured at 192.168.0.254 for UDP syslogs on port 514:
cumulus@leaf01:mgmt-vrf:~$ sudo tcpdump -i eth0 host 192.168.0.254 and udp port 514
A simple way to generate
syslog messages is to use
sudo in another session, such as
sudo date. Using
sudo generates an
cumulus@leaf01:mgmt-vrf:~$ sudo tcpdump -i eth0 host 192.168.0.254 and udp port 514
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
00:57:15.356836 IP leaf01.lab.local.33875 > 192.168.0.254.syslog: SYSLOG authpriv.notice, length: 105
00:57:15.364346 IP leaf01.lab.local.33875 > 192.168.0.254.syslog: SYSLOG authpriv.info, length: 103
00:57:15.369476 IP leaf01.lab.local.33875 > 192.168.0.254.syslog: SYSLOG authpriv.info, length: 85
To see the contents of the
syslog file, use the
tcpdump -X option:
cumulus@leaf01:mgmt-vrf:~$ sudo tcpdump -i eth0 host 192.168.0.254 and udp port 514 -X -c 3
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
00:59:15.980048 IP leaf01.lab.local.33875 > 192.168.0.254.syslog: SYSLOG authpriv.notice, length: 105
0x0000:  4500 0085 33ee 4000 4011 8420 c0a8 000b  E...3.@.@.......
0x0010:  c0a8 00fe 8453 0202 0071 9d18 3c38 353e  .....S...q..<85>
0x0020:  4465 6320 2039 2030 303a 3539 3a31 3520  Dec..9.00:59:15.
0x0030:  6c65 6166 3031 2073 7564 6f3a 2020 6375  leaf01.sudo:..cu
0x0040:  6d75 6c75 7320 3a20 5454 593d 7074 732f  mulus.:.TTY=pts/
0x0050:  3120 3b20 5057 443d 2f68 6f6d 652f 6375  1.;.PWD=/home/cu
0x0060:  6d75 6c75 7320 3b20 5553 4552 3d72 6f6f  mulus.;.USER=roo
0x0070:  7420 3b20 434f 4d4d 414e 443d 2f62 696e  t.;.COMMAND=/bin
0x0080:  2f64 6174 65                             /date
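The `<85>` at the start of the payload in the dump above is the syslog priority value, which encodes the facility and severity as facility × 8 + severity. That is how tcpdump arrives at the authpriv.notice label. A minimal Python sketch of the decoding follows; `parse_pri` is a hypothetical helper, and the facility table is deliberately partial, so unlisted facilities are returned as their number.

```python
import re

SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]
FACILITIES = {
    0: "kern", 1: "user", 2: "mail", 3: "daemon", 4: "auth", 5: "syslog",
    6: "lpr", 7: "news", 8: "uucp", 9: "cron", 10: "authpriv", 11: "ftp",
}
FACILITIES.update({16 + n: f"local{n}" for n in range(8)})


def parse_pri(message: str):
    """Return (facility, severity) names decoded from a leading <PRI> field."""
    match = re.match(r"<(\d{1,3})>", message)
    if match is None:
        raise ValueError("message does not start with a <PRI> field")
    facility, severity = divmod(int(match.group(1)), 8)
    return FACILITIES.get(facility, str(facility)), SEVERITIES[severity]


payload = "<85>Dec  9 00:59:15 leaf01 sudo:  cumulus : TTY=pts/1 ; PWD=/home/cumulus ; USER=root ; COMMAND=/bin/date"
print(parse_pri(payload))
```

Here 85 splits into facility 10 and severity 5, which are authpriv and notice.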